Monday, September 19, 2011

Cloud Computing - PIE in the sky


"Pie in the sky" is an American idiom coined by Swedish-born migrant Joe Hill in 1911. It simply means something good that is unlikely to happen or a way of referring to any prospect of future happiness which is unlikely ever to be realized.

Cloud Computing, in spite of all the attention and focus in the IT world, continues to have its skeptics.

 

As you begin to introduce Cloud Computing to your organization or your Clients, you take logical steps to educate them and answer their questions on benefits, viability and sustainability. You also start with something small as a proof of value to gain support. A host of factors will drive the adoption of Cloud Computing, including current economic conditions, an appetite for investment, a business case that demonstrates value and, most importantly, your ability to rapidly build awareness with your audience.


To achieve this I would focus on three key areas:
Performance. Integration. Economics. (P-I-E)


Performance: 
Cloud computing models will drive performance for your service level agreements. Big data technologies such as Hadoop and its MapReduce programming model can significantly improve performance characteristics, as we have seen in the implementations behind leading retail, information aggregation and social media sites. Measuring real-world Cloud performance is now beyond philosophical debate. There are tools to measure the performance of each component of your Cloud solution. By validating each component's individual contribution you can identify the weak links in your performance model and target them for engineering.
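To make the "weak link" idea concrete, here is a minimal sketch of timing each component separately and reporting its share of the total. The component names and the sleep-based stand-ins are hypothetical placeholders of mine, not a reference to any particular monitoring tool; in practice you would plug in real calls to your storage, network and compute layers.

```python
import time
from typing import Callable, Dict


def measure_components(components: Dict[str, Callable[[], None]]) -> Dict[str, float]:
    """Run each named component once and return its wall-clock time in seconds."""
    timings = {}
    for name, run in components.items():
        start = time.perf_counter()
        run()
        timings[name] = time.perf_counter() - start
    return timings


def contribution_shares(timings: Dict[str, float]) -> Dict[str, float]:
    """Express each component's time as a fraction of the end-to-end total."""
    total = sum(timings.values())
    if total == 0:
        return {name: 0.0 for name in timings}
    return {name: seconds / total for name, seconds in timings.items()}


def weakest_link(timings: Dict[str, float]) -> str:
    """Return the name of the component that took the longest."""
    return max(timings, key=timings.get)


if __name__ == "__main__":
    # Hypothetical stand-ins for real storage, network and compute calls.
    demo = {
        "storage": lambda: time.sleep(0.01),
        "network": lambda: time.sleep(0.03),
        "compute": lambda: time.sleep(0.02),
    }
    timings = measure_components(demo)
    print(contribution_shares(timings))
    print("Weakest link:", weakest_link(timings))
```

A single run per component is only a starting point. Repeated runs under realistic load would give a fairer picture before you decide where to spend engineering effort.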

Integration:
If you look at the continuum of information management, from information acquisition to information delivery with the intermediate translation, integration and packaging, there are components that can be owned and managed by appropriate Cloud components. At some point, data in these heterogeneous components has to be integrated with data from the organization that is not in the Cloud. Though integration-as-a-service providers address Cloud-to-Cloud integration, a good amount of planning is required to bring information blocks together from the perspective of security, governance, ownership and even performance.

Economics: 
There are significant economies of scale (labor, electricity, purchasing power with hardware vendors etc.) with Cloud Computing. Housing thousands of servers in a large data center reduces the total cost of ownership (TCO) per server. Multi-tenant occupancy works in favor of distributing, and hence reducing, individual owner costs. Efficient utilization of resources for varying loads and spikes keeps a lid on demand-side costs. Keep in mind that the economics is about more than the reduction or elimination of capital expenditure, or the costs to buy, install and configure infrastructure and services. It is also about the cost of security (or lapse thereof), control, compliance, auditing and the cost of transparency (or lack thereof).
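The multi-tenant argument is simple arithmetic, and a small sketch shows it. All figures below are made up for illustration, and the function names are my own. The point is only that a shared fixed cost shrinks per tenant as occupancy grows, while security, compliance and auditing still have to be added to the bill.

```python
def cost_per_tenant(shared_fixed_cost: float, tenants: int,
                    per_tenant_variable_cost: float = 0.0) -> float:
    """Split a shared fixed cost evenly across tenants and add each tenant's own variable cost."""
    if tenants < 1:
        raise ValueError("tenants must be at least 1")
    return shared_fixed_cost / tenants + per_tenant_variable_cost


def total_cost(infrastructure: float, security: float,
               compliance: float, auditing: float) -> float:
    """Economics is more than infrastructure: add the governance-related costs as well."""
    return infrastructure + security + compliance + auditing


if __name__ == "__main__":
    # Hypothetical monthly figures, chosen only to show the trend.
    for tenants in (1, 10, 100):
        infra = cost_per_tenant(shared_fixed_cost=100_000.0, tenants=tenants,
                                per_tenant_variable_cost=500.0)
        print(tenants, "tenants ->", infra, "per tenant for infrastructure")
```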

If you are a Cloud Computing practitioner, smile when you hear "oh, this cloud computing is such a pie in the sky". Do not be deterred by the skepticism. The motor car, radio, electricity, television, CD, computer, cell phone... the list is long, and they all had skeptics.

Take time to educate your organization and your Clients on the PIE with factual data. Bring awareness around technical complexities and adoption hurdles. Empower yourself and give the phrase "pie in the sky" a new and positive connotation. 

Now go and be a Rainmaker!

Saturday, September 26, 2009

Data Ownership in the Cloud

Data ownership in the Cloud is a widely debated topic. Data security and reliability are two key concerns. "What happens to my data once it gets into the Cloud" is getting a wide variety of answers from vendors, systems integrators and evangelists of the Cloud. This is making the consumer of Cloud computing nervous, contributing to slower adoption of the Cloud. Where do we begin? Should we focus on the data acquisition, the malicious threats, location of data, privacy, transport mechanisms or all of them and more?

This is a time to revisit (and even question) the conventional wisdom used to identify and establish data ownership.

Information management leaders are leveraging artifacts from their enterprise/global data strategy to assess elements of the information value chain that can benefit from the Cloud. The assumption is that organizations that want to manage information as a strategic asset have both the data strategy and clear visibility into the information value chain. Defining what "data ownership" means is critical to identifying stakeholders, aligning expectations and delivering value from data in the Cloud.

Along the information value chain, everyone "owns" either the tangible or intangible value of data depending on their role within or across elements of the value chain. These roles include data creators, data providers, data enrichers, data buyers, data consumers, data sponsors, data regulators and others. It is reasonable to assume that an individual or a group takes on one or more roles at any given point in time.

Clarity on data ownership is the first step to alleviate the worries about data in the Cloud and to unlock the potential of Cloud computing.



Saturday, August 22, 2009

The Ideal Cloud

A couple of weeks ago I wrote about Cloud Classification with a note to write about the Ideal Cloud.

Ideal for whom? Why? When? And so I cycle through the W's of the Zachman framework. There are more than ten definitions of Cloud Computing out there. Draw a scatter plot and you see the confidence and correlation of elements distributed amongst the Grid, SaaS, distributed application design, utility computing and virtualization models.

Below are my thoughts on characteristics of an Ideal Cloud. Comments are always welcome.

1) Business economics - Simple. No descriptions necessary. Finito.

2) Empowering the User - Giving the end user (line of business or individual) the power of independence through minimal or no administrative work, easy-to-use interfaces and improved accessibility. The focus is on improving business agility and taking away the "distractions" of managing computing resources (servers, applications).

3) Dynamic Computing - Infrastructure is ready and available when required based solely on utilization, independent of the provider (in-house or a service provider). Call it elasticity, flexibility, ease of provisioning/de-provisioning, on-demand et al.

4) Self Managing, Self Healing - Precise responsiveness to usage characteristics, seasonal peaks, complex access patterns, disruptions, service level agreements and event based reservations.

5) Pay-at-the-pump - Extend the concept of paying for gas, utilities and such. Pay for what you use. Get discounts based on usage, scale, loyalty and life time value.

6) Data ownership - Unambiguous ownership and stewardship of acquiring, integrating, packaging (partitioning, synchronization, security) and distributing data. Supplement this with a data asset integrity framework to address governance, compliance and regulatory requirements. It is all about the data in my blogs, the rest is just details :)
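Characteristic 5 above, pay-at-the-pump, is easy to sketch in code. This is a minimal illustration under my own assumptions: the discount tiers, prices and function name are hypothetical, the discount for the highest tier reached applies to the whole bill, and loyalty or lifetime value discounts are left out.

```python
from typing import List, Tuple


def usage_charge(units_used: float, unit_price: float,
                 discount_tiers: List[Tuple[float, float]]) -> float:
    """Charge only for what was used, less a volume discount.

    discount_tiers holds (minimum_units, discount_fraction) pairs. The
    discount of the highest tier reached is applied to the whole bill.
    """
    if units_used < 0:
        raise ValueError("units_used cannot be negative")
    discount = 0.0
    for minimum_units, fraction in sorted(discount_tiers):
        if units_used >= minimum_units:
            discount = fraction
    return units_used * unit_price * (1.0 - discount)


if __name__ == "__main__":
    # Hypothetical tiers: no discount at first, 5% from 1,000 units, 10% from 10,000.
    tiers = [(0, 0.0), (1_000, 0.05), (10_000, 0.10)]
    for used in (250, 2_500, 25_000):
        print(used, "units ->", usage_charge(used, unit_price=0.12, discount_tiers=tiers))
```

No usage means no charge, which is the whole point of paying at the pump.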

Cloud Computing must help you achieve on all three fronts: reduce costs, extend capital investments, improve business agility.

Aiming for anything less or have a different agenda? This Cloud will usher in storms. Invest in an umbrella.

Saturday, July 11, 2009

Cloud Computing - Classifying your Clouds

Classification is part of information science whether we use bibliographic or faceted notation. Since we are classifying Clouds, let us take the literal meaning from a weather perspective to make this a little more fun!

Clouds form when water vapor condenses onto microscopic dust particles (or other tiny particles) floating in the air. Think of the particles as your dispensable computing resources. The water vapor is your computing or transactional requirements. Cloud computing is simply dispensable resources brought together at a rapid pace to address your business requirement for large-scale computing, such as business intelligence, large volumes of transactions or the delivery of rich content to your consumers.

Include dimensions of cost, security, governance and transparency in provisioning services (including service level agreements) and you draw a complete picture of your Cloud.

We have four major categories of clouds in the troposphere, typified by Stratocumulus, Altostratus, Cirrus and Cumulonimbus. In plain English, they are the low clouds (under 6,000 ft), the middle clouds (6,000-20,000 ft), the high clouds (over 20,000 ft) and clouds with vertical development (ground up to 50,000 ft).

Drawing a parallel with the Cloud Computing vernacular:

1. Low Clouds are your Departmental Clouds

Computing resources used by a line of business. Marketing effectiveness, Ready to launch activities etc.

2. Middle Clouds are your Bridging Clouds

Mainframes acting as a Cloud to integrate with CRM providers such as Salesforce.com. Traditional data integration meeting the ontology-driven Semantic Web etc.

3. High Clouds are your Community Clouds

Multiple business entities coming together to address common computing challenges. Homeland security, Cancer research, Value at Risk frameworks etc.

4. Clouds with vertical development are the Ideal Clouds

Self-learning, intelligent Clouds that go through metamorphosis based on events, predictability and projections. I plan to write about them on this blog.

A combination of the above can be grouped into Public, Private or Hybrid clouds based on economy of scale, governance and quality of service.

Cloud Computing components include providers of "X" as a service (XaaS, X=software, platform, infrastructure, storage et al), managed service providers, chargeback utility computing and so on. These are members of the new Troposphere.

Any water cooler conversations around Cloud Computing, silver linings (pun intended) and the myriad possibilities can be classified as fog. Be cautious about jumping on a bandwagon, a.k.a. the Hype Cycle (a term coined by Gartner Group), without strategic planning. You may end up with a Mammatus cloud... mostly associated with severe storms.

How does Cloud Computing influence and impact the value of data, your strategic asset? I plan to write about it next.

Share with me your real-world experiences of preparing for and realizing the benefits of provisioning in the Cloud. In the world of Cloud Computing, are you a Rainmaker?

Monday, June 29, 2009

Collaboration and visualization using Google Fusion tables

Google introduced Fusion Tables as a beta service in early June '09.

I was driven to test FusionTables on the following premises:
a) The promise of managing large data sets in the Cloud
b) Validating the simplicity of collaborating on and interpreting two or more data areas through the available visualization features
c) Leveraging other Google features such as GoogleDocuments, GoogleTalk etc.
d) Readily available and free features from my Google account

I used two sources of data (one each for depth, width):
a) Product consumption/demographics data generated on GoogleDocuments (100,000 rows, 10 attributes)
b) SAP Material Master generated using Microsoft Excel (10,000 rows, 350 attributes)

Generated data density : 80-85%
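For anyone who wants to reproduce a similar test set, here is a minimal sketch of generating a table at a target density and checking the result. It assumes "density" means the share of non-empty cells, which is my reading and not a Fusion Tables term; the function names and cell values are placeholders, and the data I actually used was generated in GoogleDocuments and Excel, not with this code.

```python
import random
from typing import List, Optional


def generate_rows(n_rows: int, n_cols: int, density: float,
                  seed: int = 0) -> List[List[Optional[str]]]:
    """Build a table in which each cell is filled with probability `density`."""
    rng = random.Random(seed)  # fixed seed so the same table can be regenerated
    return [
        [f"v{r}_{c}" if rng.random() < density else None for c in range(n_cols)]
        for r in range(n_rows)
    ]


def measured_density(rows: List[List[Optional[str]]]) -> float:
    """Return the fraction of cells that are not empty."""
    total_cells = sum(len(row) for row in rows)
    filled_cells = sum(1 for row in rows for cell in row if cell is not None)
    return filled_cells / total_cells if total_cells else 0.0


if __name__ == "__main__":
    table = generate_rows(n_rows=10_000, n_cols=10, density=0.82)
    print("Measured density:", round(measured_density(table), 3))
```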

Invited 4 Google users as collaborators (2 viewers, 2 contributors)
- 2 in USA, 1 in India, 1 in Brussels, Belgium

Used GoogleTalk for IM/conference calls

Observations:
1. Visualization options were very good, covering map, intensity, standard (line/bar/scatter/pie), motion and timeline.
2. Merge options were simple, yet powerful. I expect Google Research to add more powerful data integration features.
3. Filter and Aggregate options were limited but good for a beta.
4. Simultaneously engaging the team on GoogleTalk was simple and productive for changes, feedback, interpretation of aggregates, data generation, sorting, filtering and so on.


Completed the entire exercise in a little over an hour.

FusionTables is a promising Google Labs service for managing tabular data sets in the Cloud.

Next, I plan to test a data management framework:
a) Introduce "poor quality" data into the larger data set
b) Assign 2 viewers as business data stewards
c) Use GoogleTimeline to plot the journey of data quality assessment and enrichment
d) Deliver data movement and progress status to 1 data owner using the Twitter service

All this time-boxed @ 1.5 hours (approx)

Email me if you are interested in participating.