Monday, May 2, 2011

Cloud and Big Data Meetup

May 1st, 2011, Microsoft Campus, Cloud Computing Camp!
A very rewarding experience indeed!

My personal interest was to understand how Cloud Computing, Hadoop and NoSQL help developers solve problems, purely from a programmer's viewpoint.

Food for thought

By the end of the day, it had answered some of the most important questions:
- How to choose a cloud service provider?
- How to leverage PaaS and IaaS, and what is the thin line between them?
- Why have companies like Netflix so selectively chosen the Amazon Cloud, and how are they building platforms on AWS such as Cassandra as a Service?
- How to adopt the open-source CloudFoundry and the Microsoft Azure PaaS?
- What can OpenStack offer individual developers and start-ups who want to enjoy the fruits of IaaS at little cost?

And of course there were very thoughtful reflections on how to choose the right CAP trade-off, a deep dive into MongoDB internals, and a talk on a wonderful array of tools that make ops' and devs' lives smooth while building a scalable system!!

Brush up on the basics and sky-rocket into the Clouds

It all started with a great keynote speech from the Cisco CTO -
A quick reminder of the basic properties of a cloud provider -
Shared Resources, Scalability, Self-Service, Multiple Data Centers, Measurable Resources, Design-for-Failure, Auto-Recovery and Centralized Monitoring.

These are well-known facts. But the most interesting point to note was the evolution of OpenStack.
- Open eliminates vendor lock-in: freedom to federate and move between clouds.

The intricate details of network virtualization and hypervisor integration didn't attract me much! I was more interested in the tools around open-source cloud computing.
http://www.slideshare.net/socializedsoftware/crash-course-in-open-source-cloud-computing

Convergence of network, compute and storage is going to be the driving force for handling the information explosion and the incredible growth of video transmission in the coming years.
http://www.slideshare.net/lewtucker/openstack-time-is-now-lew-tucker

Case Study - Simplify life

Netflix, one of the largest Amazon Cloud consumers, shared its pain points and its reasons for migrating to the Cloud.
- How the Cloud helps get rid of the 'indefinite wait-cycle' problem of the traditional data center.
- In a data center you keep waiting for 'Permission Grant', 'More Space Allocation', 'Re-organize Capacity', 'Endless meetings with IT' etc. ...

No more waiting! By 2012 the Cloud model will probably be a seamless part of every single software maker!!
In the case of Netflix, the Amazon API is the IT Dept!!
The best part is the development-to-deployment flow: build, package as a WAR, wrap in an RPM, bake an AMI, launch in the cloud! Huh!
How did the Cloud simplify life?
Quoting Adrian .... (Cloud Architecture)
" Central SQL -> Distributed Key/Value NoSQL
  Sticky In-Memory Session -> Shared Memcached Session (for Others cold cache like MySQL Native Memory / Terracotta Big-Table)
  Chatty Protocol -> Latency Tolerant Protocol
  Tangled Service Interface -> Layered Service Interface
  Components as Jars -> Components as Service
  Fat Complex Objects -> Lightweight Serializable Objects .."
Here goes the great story - http://www.slideshare.net/adrianco/migrating-to-public-cloud

In a different context, here is another interesting piece from Adrian Cockcroft on creating a NoSQL service on the Amazon Cloud - http://prezi.com/veagqhsz38u8/nosql-maslows-hierarchy-of-reads-and-writes/

PaaS - the Jewel Box

It was really exciting to learn that Windows Azure is effectively an OS on the Web that lets users run any Windows-compatible application, with access to services such as database blobs, NoSQL data storage, VPN, CDN, Service Bus, Access Control and Monitoring.

Anyone interested in trying out Azure can just log in at http://windowsazurepass.com/ with the passcode 'meetup'.

Well! Finally the much-awaited .. CloudFoundry!
VMware-SpringSource means so much to the Java community after the demise of Sun!!!
Now you get Groovy on Grails, Redis, Node.js and plenty of other services out of the box with the CloudFoundry PaaS!
The open source advocate Ezra gave a lightning talk about the internals of CloudFoundry:
"First, the user develops application foo.
vmc push foo -> talks to the Cloud Controller through REST.
The Cloud Controller takes a snapshot of the 'foo' app structure.
It converts this application into a runnable Droplet and puts it in a queue.
A Droplet Execution Agent (DEA) node can sit on Amazon / Rackspace, completely abstracted from the user.
The staging process finds which runtime to launch and loads all the infrastructure; for example, it will introspect a WAR file, find the required JDBC driver, load it, and so on.
The staging process sends messages to all DEA nodes to find which one is least loaded and can execute the request.
All components are connected through an ESB.
There is also a HealthManager tool that polls the status table, matches it against the real-world state and broadcasts messages! ..." .. That's a long story cut short!
Here goes the full story - http://blog.cloudfoundry.com/post/4754582920/cloud-foundry-open-paas-deep-dive
Well, if you are lucky enough to get your CloudFoundry passcode, you can start playing with it through the great STS IDE!
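
To make the "find the least-loaded DEA node" step from the talk a bit more concrete, here is a minimal Python sketch. It is not CloudFoundry's actual code: the DeaNode class, its fields and the place_droplet helper are hypothetical names invented for illustration, and the real system exchanges messages between components rather than calling a function.

```python
from dataclasses import dataclass


@dataclass
class DeaNode:
    """Hypothetical stand-in for a Droplet Execution Agent node."""
    name: str
    running_droplets: int
    capacity: int

    def load(self) -> float:
        # Fraction of the node's capacity currently in use.
        return self.running_droplets / self.capacity


def pick_least_loaded(nodes):
    """Return the node with the lowest load that still has spare capacity."""
    candidates = [n for n in nodes if n.running_droplets < n.capacity]
    if not candidates:
        raise RuntimeError("no DEA node has spare capacity")
    return min(candidates, key=lambda n: n.load())


def place_droplet(nodes, droplet_name):
    """Assign a droplet to the least-loaded node and return that node's name."""
    node = pick_least_loaded(nodes)
    node.running_droplets += 1  # the node now runs one more droplet
    return node.name
```

Calling place_droplet(nodes, "foo") on a list of such nodes returns the name of the node that would run the app, and full nodes are never chosen.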

Build your Dinosaur to crunch Big Data

Okay! Now that we had been surfing the clouds for so long .. it's time to understand the best practices for handling Big Data (which may eventually fly on the cloud)!

Paco Nathan of IMVU fame delivered a very valuable lecture !
http://www.slideshare.net/pacoid/hidden-gems-found-with-hadoop

The take-away message:
- Select data frameworks based on your data access patterns.
- Relational databases are not a good fit for queues, polling operations, social graphs, data analysis and so on!
I just can't resist highlighting the following from the slides!
How to apply CAP in various scenarios?
" Financial transactions - general ledger in RDBMS - CAx
Ad-hoc queries - hosted MySQL - CAx
Log rotation - Riak - xxP
Search index - Lucene, Solr - xAP
Static content archive - S3 - xAP
Customer facts - Redis, Membase - xAP
Distributed counters, sets - Redis - xAP
CRUD - key/value - CxP
Data preparation - Hadoop / Hive - CxP
Graph analysis - Hadoop + Redis + Graph - CxP
Data mart - Hadoop / Hive / HBase - CxP .."
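
As a small aid for reading the codes above (an "x" marks the property that is given up), the list can be written down as a lookup table. This is only a sketch in Python: the dictionary is transcribed from the slide list, while the helper names kept_properties and use_cases_keeping are my own and not from the talk.

```python
# Use case -> (suggested store, CAP code), transcribed from the list above.
CAP_BY_USE_CASE = {
    "financial transactions": ("general ledger in RDBMS", "CAx"),
    "ad-hoc queries": ("hosted MySQL", "CAx"),
    "log rotation": ("Riak", "xxP"),
    "search index": ("Lucene, Solr", "xAP"),
    "static content archive": ("S3", "xAP"),
    "customer facts": ("Redis, Membase", "xAP"),
    "distributed counters, sets": ("Redis", "xAP"),
    "crud": ("key/value", "CxP"),
    "data preparation": ("Hadoop / Hive", "CxP"),
    "graph analysis": ("Hadoop + Redis + Graph", "CxP"),
    "data mart": ("Hadoop / Hive / HBase", "CxP"),
}

PROPERTY_NAMES = {
    "C": "consistency",
    "A": "availability",
    "P": "partition tolerance",
}


def kept_properties(cap_code):
    """Expand a code such as 'CxP' into the properties it keeps."""
    return [PROPERTY_NAMES[letter] for letter in cap_code if letter != "x"]


def use_cases_keeping(*letters):
    """List the use cases whose CAP code keeps all the given letters."""
    return sorted(
        use_case
        for use_case, (_store, code) in CAP_BY_USE_CASE.items()
        if all(letter in code for letter in letters)
    )
```

For example, use_cases_keeping("C", "A") lists the scenarios that keep both consistency and availability.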
Along the same lines of picking the right methodology for data analysis based on the data access pattern, Apple shed some light on its in-house analytics flow:
"Sensors connect to Cassandra over REST to push data (click / navigation / other user events).
Aggregators read the data from the Cassandra key-value store, aggregate it in an offline batch and replace the data for those keys (compaction) inside the K-V store..."
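
The push-then-compact flow described above can be sketched in a few lines of Python. This is a toy model under loud assumptions: a plain dict stands in for the Cassandra key-value store, and the function names and the per-key layout ("raw" events plus "counts") are invented for illustration, not taken from the talk.

```python
from collections import Counter


def push_event(store, key, event_type):
    """Sensor side: append a raw event (e.g. 'click') under a key."""
    entry = store.setdefault(key, {"raw": [], "counts": {}})
    entry["raw"].append(event_type)


def aggregate_and_compact(store):
    """Aggregator side: fold raw events into counts and drop the raw data."""
    for entry in store.values():
        counts = Counter(entry["counts"])  # start from earlier aggregates
        counts.update(entry["raw"])        # add the events seen since then
        entry["counts"] = dict(counts)
        entry["raw"] = []                  # compaction: raw events replaced


store = {}  # in-memory stand-in for the key-value store
push_event(store, "user:42", "click")
push_event(store, "user:42", "navigate")
aggregate_and_compact(store)
```

Running the batch repeatedly keeps the per-key value small, because each pass replaces the accumulated raw events with their aggregate.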

Instrument, Measure, Manage Chaos! Celebrate!

After the thought-provoking sessions, it was time to ride through the Twitter roller coaster!
The Mantra
- 'We can be successful only if we can measure'
- Adopt this early and correctly, as fits your enterprise system architecture!
- Remember to minimize MTTD (Mean Time to Detect) and MTTR (Mean Time to Recover)!
- Instrument everything! Cache all decoupled data layers!
- Use the correct tools
  - For Ruby on Rails, use Passenger (Apache load-balancer fix) and Unicorn (server) in place of Mongrel, and use Google Perf Tools.
  - In general, use Puppet and Chef as configuration tools.
  - Explore these tools and use them as needed .. Whales, Rainbird, Ivy, Artifactory, AppDynamics, EpicNMS..
http://www.slideshare.net/mattray/scale-2011-deploying-openstack-with-chef
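
Since the mantra is about measuring, here is a minimal sketch of how MTTD and MTTR could be computed from incident timestamps. The incident record layout is hypothetical, and I am assuming MTTR is measured from detection to recovery; some teams measure it from the start of the failure instead.

```python
from datetime import datetime, timedelta


def mean_delta(deltas):
    """Average a non-empty list of timedelta values."""
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents):
    """Mean Time to Detect: failure start -> detection."""
    return mean_delta([i["detected"] - i["started"] for i in incidents])


def mttr(incidents):
    """Mean Time to Recover (assumed here): detection -> recovery."""
    return mean_delta([i["recovered"] - i["detected"] for i in incidents])


# Made-up incidents, purely for illustration.
incidents = [
    {
        "started": datetime(2011, 5, 1, 10, 0),
        "detected": datetime(2011, 5, 1, 10, 4),
        "recovered": datetime(2011, 5, 1, 10, 30),
    },
    {
        "started": datetime(2011, 5, 1, 12, 0),
        "detected": datetime(2011, 5, 1, 12, 2),
        "recovered": datetime(2011, 5, 1, 12, 12),
    },
]
```

With these two sample incidents, mttd(incidents) is 3 minutes and mttr(incidents) is 18 minutes.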

Enough of the appetizer ... the meal is served hot here http://www.slideshare.net/netik/john-adams-talk-cloudy

Crash Course - learn MongoDB

Next, what could be more enticing than plunging into the internals of MongoDB!
Alvin Richards is simply great -- http://www.scribd.com/doc/50019946/MongoAsia-Scaling !
Sharding can be ridiculously simple: just create a compound key {server:1, application:1, time:1} and add the shard with db.runCommand({addshard :  "shard1"}) ...  Huh !
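
To picture what a compound shard key like {server:1, application:1, time:1} buys you, here is a toy Python sketch of range-based routing. It is not how MongoDB is implemented internally (MongoDB tracks chunk ranges for you); the chunk boundaries, shard names and the route function are all made up for illustration.

```python
import bisect

# Sorted lower bounds of each chunk, as (server, application, time) tuples.
# The first bound acts as the minimum possible key.
LOWER_BOUNDS = [("", "", 0), ("db", "", 0), ("web", "", 0)]
# OWNERS[i] is the (hypothetical) shard holding the chunk starting at LOWER_BOUNDS[i].
OWNERS = ["shard1", "shard2", "shard3"]


def shard_key(doc):
    """Build the compound key in the same field order as the index."""
    return (doc["server"], doc["application"], doc["time"])


def route(doc):
    """Find the chunk whose range contains the document's shard key."""
    idx = bisect.bisect_right(LOWER_BOUNDS, shard_key(doc)) - 1
    return OWNERS[idx]


doc = {"server": "db07", "application": "billing", "time": 1304208000}
target = route(doc)  # documents from servers 'db...' land on one shard
```

Because tuples compare field by field, documents from the same server and application stay next to each other, ordered by time, which is what makes range queries on the key cheap.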

He reflected on right-balanced indexing, parallel execution of queries, range-based partitions, automatic sharding, consistent hashing, replication with asynchronous master/slave, automatic failover through consensus election and much more!
For a technical deep-dive into consistency models, look at - http://www.10gen.com/video/mongosv2010/consistency
For hands-on practice, refer to https://github.com/SpringSource/spring-data-document-examples and follow http://www.10gen.com/video/mongosv2010/spring
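
Consistent hashing was one of the topics in the list above, so here is a generic Python sketch of the technique for readers who have not met it. It illustrates the general idea only and makes no claim about how MongoDB partitions data; the HashRing class, the node names and the number of virtual nodes are my own choices.

```python
import bisect
import hashlib


class HashRing:
    """A minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        ring = []
        for node in nodes:
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _node in ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        return int(digest, 16)

    def node_for(self, key):
        """Walk clockwise from the key's hash to the next node point."""
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")
```

The useful property is that adding a node only moves keys onto the new node; every other key keeps its previous owner, so rebalancing stays cheap.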

Build a successful SaaS Business ! Sweet dreams !

Well! After the rewarding technical sessions, it was just perfect to wrap up the day with some insights into building a successful SaaS business!
http://www.slideshare.net/KenRutsky/bridging-to-saas-success-a-basic-blueprint

Big thanks to Sebastian and the other event organizers!

References : http://www.meetup.com/cloudcomputing/events/16701362/
