Saturday 22 March 2014

Big Data

The IT buzzword “BIG DATA” has been making news for about a year. But what is this BIG DATA? Is it the next groundbreaking technology that will change everything, or is it just hype that will die down after some time? Is it really good, or is it bad?

Here’s a short explanation of BIG DATA.

To begin with, let’s understand what data alone means. Data are values belonging to a set of items. They can be qualitative or quantitative, and they can take the form of text, audio, video or any other kind of file.

Imagine all the information you alone generate each time you post to social media sites such as Facebook or Twitter, or enter registration details on an online shopping site such as Amazon. Now try to imagine your data combined with the data of all the people, corporations, and organizations in the world! From health care to social media, from business to industry, humans are now creating more data than ever before. Every 60 seconds, 98,000 tweets, 695,000 Facebook status updates, and 11 million instant messages are sent over the Internet. A shocking 90% of all the world’s data was created in the last two years. It is predicted that 40 zettabytes (43 trillion GB) of information will have been created by 2020.

Welcome to “BIG DATA”, an innovation that can no longer be avoided.

Big Data is the name given to the class of technologies that need to be used when your data volume grows so large that RDBMS (Relational Database Management System) technologies can no longer handle it. To help us understand "BIG DATA", let’s break it down into four dimensions: volume, velocity, variety, and veracity, as shown in the following infographic (Source: IBM).

From the infographic, it is clear that if your data volume can be handled efficiently by an RDBMS, you need not worry about Big Data.

Where is this huge amount of data coming from?

To understand the sheer size of Big Data, see the infographic below (Source: DOMO and Column Five Media).



How did it all start?

With the advent of cloud computing, which provided easy access to massive amounts of distributed computing power, came the realization that an RDBMS cannot be effectively parallelized. That’s when non-relational databases sprang up. Google and Facebook began looking for ways to handle Big Data, which requires large-scale distributed processing power. This led to Google publishing its paper on the “Map-Reduce” algorithm, which processes highly distributable problems across huge datasets using a large number of computers. Then came the Apache open-source “Hadoop” project, which created its own implementation of Map-Reduce. The largest Hadoop implementation is probably at Facebook.
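To make the Map-Reduce idea a little more concrete, here is a minimal single-machine sketch in Python of the classic word-count example. It is only an illustration and not how Google or Hadoop actually implement it: the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this post, and a real framework would run the map and reduce steps in parallel on many computers and handle the grouping step for you.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key (a real framework does this between phases)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all the values seen for one key."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different machine;
    # here we simply loop over them one after another.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data is everywhere"]
    print(word_count(docs))
```

The point to notice is that map_phase looks at only one document and reduce_phase looks at only one word, so neither step needs to see the whole dataset. That independence is what allows the work to be spread across a large number of computers.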

Why process so much data?

In-depth analysis of the data collected by healthcare organisations can help us make better decisions about treatment methods. Data on Twitter or Facebook about a product could be used to come up with innovative product designs. Yet, for most of us, Big Data is still a solution in search of a problem.

Can We Really Trust Big Data?


Fairy tales usually start with ‘Once upon a time ...’ and end with ‘... and they lived happily ever after’. But nobody explains ‘how’ the heroes live happily ever after. Big Data promises to bring transformation, but just as with fairy-tale endings, Big Data will not explain ‘how’ to transform. This is just the beginning. In a way, we have begun to realise that Big Data is affecting every one of us. But what will happen next? Is it really a solution to a problem? We need to wait and watch....
