
Cloudera Shares Customer Lessons on How to Scale Production Machine Learning

To make machine learning (ML) repeatable and scalable, you need to invest in serving infrastructure (the “last mile”), ML operations, and governance, says Alex Breshears, Senior Product Manager at Cloudera, in the MIT-Cloudera webinar “How to Scale Production Machine Learning in the Enterprise.”

In the webinar, Breshears shared key challenges and lessons learned from Cloudera customers who have built large-scale production ML systems.

The webinar also featured Tom Davenport, a distinguished professor and author of several books including Competing on Analytics and The AI Advantage: How to Put the Artificial Intelligence Revolution to Work.

Many organizations experimenting with AI and ML learn very quickly that ML models make up only a small fraction of real-world ML systems – the small black box in the middle of the diagram below, said Breshears, citing a diagram from a paper by Google researchers. Production ML requires a lot more.

[Figure: ML code is the small box at the center of a much larger production system. Courtesy: Sculley, D. et al., “Hidden Technical Debt in Machine Learning Systems,” NIPS 2015.]

Organizations intending to put ML into production and run it at scale need to invest in the following:

  • Serving infrastructure: How will the output of an ML model be served to its consumers or integrated with applications they are using? (The “last mile” delivery of ML predictions.)
  • Model operations and monitoring at scale: Models need to be packaged, deployed, monitored for performance and drift, and retrained periodically. With only a handful of models you can do that manually; with thousands in production, like Cisco Systems (the example Davenport gave), you’ll need ModelOps. Cisco has gone from a few models in production to 60,000 sales propensity models covering 160 million of its customers, and the only way to achieve that without hiring an army of data scientists was to build a “model factory.” (A minimal serving-and-drift-monitoring sketch follows this list.)
  • ML governance: You will also need to think about – and plan for – model security, a model catalogue, and the broader governance of models in production.
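
To make the serving and monitoring points concrete, here is a minimal sketch in Python – not Cloudera’s implementation, and with all names and thresholds chosen purely for illustration – of the “last mile” plus a crude drift check: a trained model is wrapped behind a predict() call, and incoming batches are compared against training-data statistics to decide when retraining should be flagged.

    # Illustrative sketch only: serving ("last mile") plus a simple drift check.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # "Training" side: fit a model and record reference feature statistics.
    X_train, y_train = make_classification(n_samples=1000, n_features=5, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    ref_mean, ref_std = X_train.mean(axis=0), X_train.std(axis=0)

    def predict(features: np.ndarray) -> np.ndarray:
        """Last-mile call an application or batch job would invoke."""
        return model.predict(features)

    def drift_score(batch: np.ndarray) -> float:
        """Mean absolute z-shift of the batch's feature means vs. training data."""
        return float(np.abs((batch.mean(axis=0) - ref_mean) / ref_std).mean())

    DRIFT_THRESHOLD = 0.5  # illustrative cut-off; tune per model and feature set

    live_batch = X_train[:200] + 1.0  # simulate a production batch that has shifted
    print("sample predictions:", predict(live_batch)[:5])
    score = drift_score(live_batch)
    print(f"drift score: {score:.2f}")
    if score > DRIFT_THRESHOLD:
        print("Drift detected - flag the model for retraining.")

In a real ModelOps pipeline the same idea is typically automated: packaging and deployment run through CI/CD, drift metrics feed a monitoring dashboard, and retraining jobs are triggered from alerts rather than print statements.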

Our Take

To achieve scale with ML and truly start reaping its benefits by embedding it everywhere, you will need to automate as many components in the ML development and deployment lifecycle as possible. While production ML projects are largely custom, a platform like Cloudera (and other tools – see “Want to Know More?”) can help you achieve that automation.
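
As a rough illustration of what that automation can look like – a toy version of the “model factory” pattern from the Cisco example, not any vendor’s API – the sketch below loops over customer segments, trains one model per segment, and records each model in a simple catalogue with basic governance metadata.

    # Illustrative "model factory" sketch: one model per segment, plus a catalogue.
    from datetime import datetime, timezone

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    model_catalogue = {}  # stand-in for a real model registry / governance catalogue

    for i, segment in enumerate(["smb", "mid_market", "enterprise"]):
        # In practice each segment would pull its own historical data slice.
        X, y = make_classification(n_samples=500, n_features=8, random_state=i)
        model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

        model_catalogue[segment] = {
            "model": model,
            "trained_at": datetime.now(timezone.utc).isoformat(),
            "train_accuracy": model.score(X, y),
            "owner": "sales-analytics",  # governance metadata: who is accountable
        }

    for name, entry in model_catalogue.items():
        print(f"{name}: accuracy={entry['train_accuracy']:.2f}, trained_at={entry['trained_at']}")

A production catalogue would live in a proper model registry with versioning, approvals, and access controls, but the loop-plus-metadata structure is the core of getting from a handful of models to thousands without hand-building each one.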


Want to Know More?

Get Started With AI: Fast-Track Your AI Explorations by Learning From Early Adopters

Databricks Raises $400 Million in Series F Funding Led by Andreessen Horowitz to Accelerate R&D

KenSci Wins Gartner and Microsoft Awards for its AI-Powered Predictive Healthcare Platform

Dessa Launches Atlas 2.0, a Foundations Suite of Tools for Building ML at Scale
