The End of Hadoop and Cloudera?

Hadoop was designed as a NoSQL and non-proprietary technology to persist large volumes of structured and unstructured data with minimal risk of loss using inexpensive hardware. I believe it has fulfilled this goal pretty well. So, where’s the problem?

I think the root cause of all “failed” Hadoop implementations lies in the “enterprise data warehouse” mentality, architecture, and methodology used to build enterprise data lakes. Trying to squeeze all phases of a data lifecycle (creation, accumulation, integration, augmentation, and packaging for BI and analytics) into one monolithic process, implemented in a technology that was not designed for such a use case, is doomed to failure.

Instead of acknowledging the methodological mistake, the implementers started blaming the technology, which was never designed for their use case in the first place. This blame resulted in an economic downturn for the main vendor, Cloudera, which apparently positioned the technology to match “enterprise data warehouse” expectations rather than what it is best suited for. This strategy resulted in a sharp drop in the stock price and the departure of the CEO in June 2019. However, the market is still betting on the technology and the company; they may not be ready to go yet.

Source: SoftwareReviews Big Data Data Quadrant, Accessed August 21, 2019.

Our Take

Hadoop may still be a good choice for accumulating structured and unstructured data and storing it “as is.” Its technology may still be too rudimentary for data augmentation and is absolutely a misfit for data packaging for BI and analytics. Hadoop is in the trough of disillusionment, and this is good. I hope that Hadoop failure stories will be regarded as lessons learned against the magic-technology-first approach to creating data management solutions. There is no “magic technology”; even AI is not magic. Every good solution requires a combination of business, information, and technology architectures. Data creation, accumulation and persistence, data integration and augmentation, and data packaging for BI and analytics are all distinctly different phases of data progression, and each requires its own architecture, governance, and technologies.

Want to Know More?

Create a Customized Big Data Architecture and Implementation Plan
Architect Your Big Data Environment
Main Hadoop Developers – Hortonworks and Cloudera – Under One Roof
SAS Hadoop on SoftwareReviews
