Databricks Lakehouse Combines the Best of Data Lake and Data Warehouse in a Single Platform
Databricks users can now use a partner network made up of Fivetran, Qlik, Infoworks, StreamSets, and Syncsort to load data into the lakehouse automatically. “Lakehouse” is a new term coined by Databricks for an architecture that combines the best aspects of data warehouses and data lakes. This can be a significant value-add for an organization looking to support both business intelligence (BI) and machine learning (ML) use cases on one platform.
Source: Databricks, 2020
Databricks’ lakehouse provides the following key features:
- ACID transaction support
- Schema enforcement and governance
- BI support
- Storage decoupled from compute
- Open data, integration, and tools standards
- Support for diverse data types ranging from unstructured to structured data
- Support for SQL, ML, and other frameworks
- End-to-end real-time streaming
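To make the first two features more concrete, here is a minimal sketch of what schema enforcement combined with all-or-nothing writes looks like in principle. It is a toy in-memory table written in plain Python, not Databricks' implementation or API; the names `ToyTable`, `SchemaError`, and the `order_id`/`amount` columns are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


class SchemaError(ValueError):
    """Raised when a row does not match the table schema (hypothetical name)."""


@dataclass
class ToyTable:
    """Toy in-memory table; an illustration only, not a real lakehouse API."""
    schema: dict                       # column name -> expected Python type
    rows: list = field(default_factory=list)

    def append(self, batch):
        # Validate the whole batch before writing anything, so that a single
        # bad row rejects the entire write (all-or-nothing behaviour).
        for row in batch:
            if set(row) != set(self.schema):
                raise SchemaError(
                    f"columns {sorted(row)} do not match schema {sorted(self.schema)}"
                )
            for column, expected_type in self.schema.items():
                if not isinstance(row[column], expected_type):
                    raise SchemaError(
                        f"column {column!r} expects {expected_type.__name__}, "
                        f"got {type(row[column]).__name__}"
                    )
        # Commit only after every row has passed validation.
        self.rows.extend(batch)


table = ToyTable(schema={"order_id": int, "amount": float})
table.append([{"order_id": 1, "amount": 9.5}])

try:
    # The second row has a string order_id, so the whole batch is rejected.
    table.append([
        {"order_id": 2, "amount": 3.0},
        {"order_id": "3", "amount": 1.0},
    ])
except SchemaError as err:
    print("rejected:", err)

print("rows stored:", len(table.rows))
```

The point of the sketch is the ordering: validation happens for the full batch before any row is committed, so readers of the table never see a partially written batch. A production platform would achieve this with a transaction log over files in object storage rather than an in-memory list.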
The idea of a data lakehouse, a single unified data platform, is not new. Offerings such as Azure Synapse, Snowflake, and Amazon Redshift also try to innovate on the traditional data storage and processing platform. However, many of them are not fully functional, and the technology offerings from some of these vendors are a mix of strengths and weaknesses. An organization must carefully evaluate its core mandatory requirements before adopting such a broad platform, as some critical functions needed in ETL or SQL may be missing from these technologies for some time to come.
Want to Know More?
Joining the ranks of giants such as Snap (Snapchat’s parent company), Microsoft, and Tesla, Immuta, the automated data governance company, has been named to Fast Company’s 2020 list of the World’s 50 Most Innovative Companies.
The EU plans to invest €6 billion to build a single European data space, reports EURACTIV. The envisioned space will house personal, business, and “high-quality industrial data” and create the infrastructure for data sharing and use across businesses and nations.
Microsoft claims its newly announced Azure Synapse Analytics service is four times faster than Amazon Redshift and 75 times faster than Google BigQuery. This announcement positions Microsoft as a leader in this market, but it is also likely to generate counterclaims from its competitors.
AWS Lake Formation makes it easier for users to set up and manage data lakes. But organizations will face challenges in determining how to derive value from their data lakes.
Tableau and AWS Expand Strategic Relationship to Bring Analytics in the Cloud Closer to Their Customers
Leading analytics player Tableau recently announced its new initiative – Modern Cloud Analytics (MCA) – which sees it partnering with Amazon Web Services Inc. (AWS) to make cloud-based analytics more achievable for their customers.
Cambridge Semantics enhanced its Anzo platform to enable data management and analytics over both structured and unstructured data, the firm announced in an August 22 press release.
Several discussion threads on LinkedIn and other social media have been dedicated to the status of Apache Hadoop and the merged Cloudera/Hortonworks. Many predict their demise is not far off. How substantiated are those predictions?
Snowflake has announced a new data exchange allowing businesses to generate new revenue.
The two major Hadoop developers – Hortonworks and Cloudera – merged into one company at the dawn of 2019.