We have seen that the Hadoop ecosystem has several component parts, all of which exist as their own Apache projects. Because Hadoop has become extremely popular and widely used, it is still going through significant changes, and a given version of one of these open source community components may not be fully compatible with the rest of the components. This issue has led to considerable complexity for users trying to get started with Hadoop independently by downloading and compiling components directly from Apache.
For many users, Red Hat provides a smart model of how to make money successfully in the open source software industry. Red Hat took Linux (an open source OS), bundled it with all the required components, built a simple installer, and offered paid support to its users. In much the same way that Red Hat created a handy packaging for Linux, many companies have bundled Hadoop and some related components into their own Hadoop distributions.
Perhaps the best-known player in the field, Cloudera is able to claim Doug Cutting, Hadoop’s co-founder, as its chief architect. Most users also see Cloudera as the market leader in the Hadoop domain, since it released the very first commercial Hadoop distribution and is one of the most active contributors of code to the Hadoop ecosystem.
Both in software development and for commercial purposes, Cloudera's distribution is widely regarded as the most prominent and most used distribution of Hadoop. Cloudera Enterprise, the product at the heart of what Cloudera calls the “Enterprise Data Hub,” comprises the Cloudera Distribution for Hadoop (CDH), an open-source-based distribution of Hadoop and its related components, together with the proprietary Cloudera Manager. It also includes a technical support subscription for the key elements of CDH. Cloudera’s core business model has long been based on its ability to leverage its CDH distribution and provide paid services and support.
At the end of 2013, Cloudera officially announced that it would focus on adding proprietary value-added components on top of the open source Hadoop layer as a market differentiator. Cloudera has also made it a consistent practice to speed up the adoption of alpha- and beta-level open source code from the latest Hadoop releases. Its strategy is to take components it deems mature and retrofit them into the existing production-ready open source libraries included in its distribution. As a market leader, Cloudera is a true game changer for Hadoop and also supports the other players in the same market.