Besides Cloudera, there are a few other popular Hadoop distributions that are well established for commercial and development purposes.

EMC: Pivotal HD, EMC’s Hadoop distribution, integrates EMC’s massively parallel processing (MPP) database technology (originally known as Greenplum and now commonly called HAWQ) with Apache Hadoop. The result is a high-performance, scalable Hadoop distribution with SQL processing enabled for Hadoop. SQL-based queries and a range of business intelligence (BI) tools can be used to analyze data stored in HDFS.


Another prominent player in the Hadoop market, Hortonworks has a very large number of committers, developers and code contributors across the Apache Hadoop ecosystem components. (Committers are the gatekeepers of Apache projects and have the power to approve code changes.) Hortonworks is a spin-off from Yahoo!, which was the original corporate driver of the Hadoop project because it needed a massively large-scale platform to support and maintain its search engine business.

Of all the Hadoop distribution vendors, Hortonworks is one of the most committed to the open source movement, judging by the huge volume of development work it has contributed to the Hadoop community and by the fact that all its development efforts are (eventually) folded into the open source codebase. The Hortonworks business development model is based on its capability to leverage

It has become extremely popular thanks to its HDP distribution and now also provides paid services and support. However, it does not sell any proprietary software. Rather, the firm enthusiastically supports the idea of working within the open source community to design solutions that address enterprise feature requirements (for instance, very fast query processing with Hive). Hortonworks has developed a number of software development relationships with established companies in the data management domain, such as Teradata, Informatica, Microsoft and SAS. Although these organizations don’t have their own in-house Hadoop offerings, they collaborate with Hortonworks to offer Hadoop solutions integrated with their own product stacks.

Hortonworks provides the Hortonworks Data Platform (HDP), which consists of Hadoop together with a number of related tools and projects. Also, unlike Cloudera, Hortonworks publishes only HDP versions with production-level code for the open source community.


MapR offers a complete distribution for Apache Hadoop and related components that is independent of the Apache Software Foundation. Boasting no Java dependencies and no reliance on the Linux-based file system, MapR promotes itself as the only Hadoop distribution that offers full data protection, no single points of failure (SPOFs), and significant ease-of-use advantages. Three MapR editions are available: M3, M5, and M7. The M3 Edition is free and available for unlimited production use; M5 is an intermediate-level subscription software offering; and M7 is a complete (full-package) distribution for Apache Hadoop and HBase that includes Pig, Hive, Sqoop, and many more.

  Modified On Mar-14-2018 03:37:49 AM
