Monday, June 29, 2009

Collaboration and visualization using Google Fusion Tables

Google introduced Fusion Tables as a beta service in early June 2009.

I was driven to test Fusion Tables on the following premises:
a) The promise of managing large data sets in the cloud
b) The simplicity of collaborating on and interpreting two or more data areas through the available visualization features
c) The ability to leverage other Google features such as Google Docs and Google Talk
d) Features that are readily available, free of charge, from my Google account

I used two sources of data, one to test depth (many rows) and one to test width (many attributes):
a) Product consumption/demographics data generated in Google Docs (100,000 rows, 10 attributes)
b) SAP Material Master data generated using Microsoft Excel (10,000 rows, 350 attributes)

Generated data density: 80-85%
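For readers who want to reproduce a similar test set, here is a minimal sketch of generating sparse tabular data. It assumes "density" means the share of non-empty cells, which is my reading rather than a stated definition, and the function names generate_table and measured_density are hypothetical. It uses a smaller table than the ones above so that it runs quickly.

```python
import random


def generate_table(n_rows, n_cols, density, seed=0):
    """Build n_rows x n_cols of synthetic integers.

    Each cell is filled with probability `density` and left as None
    otherwise, so the overall fill rate lands close to `density`.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        row = [
            rng.randint(1, 1000) if rng.random() < density else None
            for _ in range(n_cols)
        ]
        rows.append(row)
    return rows


def measured_density(rows):
    """Return the fraction of cells that are not None."""
    total = sum(len(row) for row in rows)
    filled = sum(1 for row in rows for value in row if value is not None)
    return filled / total if total else 0.0


# Assumed target: somewhere inside the 80-85% range.
table = generate_table(n_rows=1000, n_cols=10, density=0.82)
density = measured_density(table)
```

Raising n_rows to 100,000 would match the shape of the first data source; the rows can then be written out with the csv module for upload.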

Invited 4 Google users as collaborators (2 viewers, 2 contributors)
- 2 in the USA, 1 in India, 1 in Brussels, Belgium

Used Google Talk for IM and conference calls

Observations:
1. Visualization options were very good, covering map, intensity, standard (line/bar/scatter/pie), motion and timeline charts.
2. Merge options were simple yet powerful. I expect Google Research to add more powerful data integration features.
3. Filter and aggregate options were limited, but good for a beta.
4. Simultaneously engaging the team on Google Talk was simple and productive for discussing changes, feedback, interpretation of aggregates, data generation, sorting, filtering, etc.
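To make the merge, filter and aggregate operations concrete, here is a rough sketch of the same three steps in plain Python. It does not use any Fusion Tables API; the column names (material_id, material_group, units) and the sample rows are made up for illustration, and treating the merge as an inner join on a shared key is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the two uploaded tables.
consumption = [
    {"material_id": "M1", "region": "US", "units": 120},
    {"material_id": "M2", "region": "IN", "units": 75},
    {"material_id": "M1", "region": "BE", "units": 40},
    {"material_id": "M3", "region": "US", "units": 10},
]
materials = [
    {"material_id": "M1", "material_group": "Beverages"},
    {"material_id": "M2", "material_group": "Snacks"},
]


def merge(left, right, key):
    """Join two tables on a shared key column (inner join: unmatched rows are dropped)."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def aggregate_sum(rows, group_by, value):
    """Sum the `value` column for each distinct `group_by` value."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_by]] += row[value]
    return dict(totals)


merged = merge(consumption, materials, "material_id")
filtered = [row for row in merged if row["units"] >= 50]
totals = aggregate_sum(merged, "material_group", "units")
```

Here the M3 row drops out of the merge because it has no matching material record, which is the kind of gap the collaborators would want to spot and discuss.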


Completed the entire exercise in a little over an hour.

Fusion Tables is a promising Google Labs service for managing tabular data sets in the cloud.

Next, I plan to test a data management framework:
a) Introduce "poor quality" data into the larger data set
b) Assign the 2 viewers as business data stewards
c) Use GoogleTimeline to plot the journey of data quality assessment and enrichment
d) Deliver data movement and progress status updates to 1 data owner via Twitter

All of this will be time-boxed to approximately 1.5 hours.
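As a starting point for step a), here is one possible way to introduce "poor quality" data while keeping a record of what was changed, so the stewards' findings can be checked later. The defect kinds (blank, negative, text), the 10% rate and the function name inject_defects are assumptions for this sketch, not part of the plan above.

```python
import random


def inject_defects(rows, rate, seed=0):
    """Return a corrupted copy of `rows` plus a log of the defects introduced.

    Roughly `rate` of the rows receive exactly one defect in a randomly
    chosen column. The input rows are left untouched.
    """
    rng = random.Random(seed)
    dirty = [list(row) for row in rows]
    log = []
    for i, row in enumerate(dirty):
        if rng.random() >= rate:
            continue
        col = rng.randrange(len(row))
        kind = rng.choice(["blank", "negative", "text"])
        if kind == "blank":
            row[col] = None      # missing value
        elif kind == "negative":
            row[col] = -1        # out-of-range value
        else:
            row[col] = "N/A"     # wrong data type
        log.append((i, col, kind))
    return dirty, log


# Small clean table of positive integers for illustration.
clean = [[i, i * 2, i * 3] for i in range(1, 201)]
dirty, log = inject_defects(clean, rate=0.1)
```

The log gives a ground truth to compare against whatever the data stewards flag during the assessment.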

Email me if you are interested in participating.