The Reality Behind the Hadoop Threat

The news of the hour is that Hadoop is not going to replace the data warehouse any time soon. The other big news, however, is that Big Data and the Hadoop framework do pose a serious threat to the profit margins of data warehouse vendors, including IBM, Microsoft and Oracle.
According to a Big Data survey, over 60% of big data practitioners who have worked with Hadoop have moved workloads from a mainframe warehouse to Hadoop. Over 30% of users are expected to make the same shift in the coming six months. Another unexpected finding of the survey is that the majority of these workloads involve large-scale data transformation. Business intelligence reporting workloads, which are considered the most important in the enterprise data warehouse, are not far behind.

This trend is expected to continue because Hadoop vendors keep improving their SQL-on-Hadoop offerings. Technologies from vendors such as Actian, Hortonworks and Cloudera allow less sophisticated analysts and data scientists to query data in Hadoop using SQL-like tools. They also help developers build applications driven by Hadoop-based data.

Feedback from the Big Data survey and from Big Data practitioners shows that Hadoop’s encroachment on the data warehouse’s territory is partly driven by cost. One practitioner at a large national insurance company reported that his company has frozen its data warehouse spending by shifting workloads to Hadoop, and that it expects its data warehouse bill to decrease compared with the bills of the last five years.

Other important factors are also at work, notably Hadoop’s ability to store and process almost any kind of data, whether structured, semi-structured or unstructured.

YARN is the reason Hadoop is becoming more real-time and capable of running multiple applications. A few startups have built real-time fraud detection applications directly on Hadoop and Spark, and these are live and in use at many companies.

Most data warehouse vendors point to Hadoop’s immaturity to deter practitioners from building mission-critical applications directly on the open source framework. The reality is that Hadoop will surely replace the data warehouse, but only for some specific workloads and not for everything. That is why Hadoop vendors and data warehouse vendors are competing, and this competition is helping Hadoop mature.

