Recently, with the publication of our ML Platforms ebook, and our conference on the topic quickly approaching, I’ve been talking a lot about scaling machine learning, deep learning and AI in the enterprise.
Every so often, in a conversation about the topic, I’ll get a lot of vigorous head nodding, and then, out of the blue, a comment that’s squarely focused on compute issues, such as GPUs vs. CPUs, distributed training, clusters and the like. It’s always a bit disorienting when this happens, as if we were in two different conversations. I mean, I get it. It’s satisfying to think you can just throw hardware at a problem like enterprise ML/AI. But “scaling” ML/DL for business (vs. in research or academia) is a much broader challenge, and a much broader opportunity.
More Than Meets the Eye
Here are three perhaps under-appreciated aspects of scaling ML/AI in the typical enterprise context:
- Velocity. How long does it take your data scientists or ML engineers to get a model developed and into production? Yes, I know “it depends,” but when the business needs the nth model based on customers and purchases, have you built the infrastructure to deliver that without reinventing the wheel each time?
- Volume. In a given timeframe, how many experiments are your data science/MLE teams able to run? How many models are you able to get into production? How many are you able to manage once they’re in production? Do you have tooling in place to manage the entire ML lifecycle, from experiments to models in production?
- Impact. How do you get the most “bang” out of the bucks you’re spending on ML and data science? If you’re like, oh, EVERYONE I’ve ever spoken to, your org doesn’t have enough data scientists or MLEs to pursue all the solid opportunities you’re presented with. So how do you scale the impact of those you have? Are you making the investments required to ensure that they’re spending their time on high value parts of the ML/DL workflow and automating the rest?
When I talk to ML/AI, data science, and business leaders, these are the issues I hear they’re concerned about: How do they move more quickly? How do they keep up with the demands of the business? How do they build infrastructure and processes that will allow them to quickly get new teammates productive?
Want to learn more about addressing these challenges and scaling your organization’s machine learning, deep learning and data science efforts? Join us at TWIMLcon: AI Platforms for two full days of technical presentations, case studies, expert panels and live podcast interviews and leave with a plan for maximizing your team’s velocity, volume and impact. https://twimlcon.com