MLOps: Emerging Trends in Data, Code, and Infrastructure


Madrona partnered with AWS on a white paper to examine MLOps: what it is and what is changing. The AI Fund and Sequoia also contributed to the piece. Read on for Tim Porter and Jon Turow’s investment thesis around MLOps, which also highlights portfolio companies WhyLabs and OctoML.

To read the full white paper, which dives more into defining MLOps and how startups and AWS work together to enable business-critical deliverables and outcomes, click here for the PDF: MLOps: Emerging Trends in Data, Code, and Infrastructure

What is your investment thesis for operational ML systems?


At Madrona Venture Group, the single biggest theme we have been investing in over the last 5+ years is the evolution of nearly every application into an intelligent application. We define application intelligence as the process of using machine learning models embedded in applications that use both historical and real-time data to build a continuous learning system. These learning systems solve a business problem in a contextually relevant way, better than before, and they typically deliver rich information and insights that are either applied automatically or leveraged by end-users to make superior decisions.

To build and scale intelligent applications into production, organizations need operational ML systems: MLOps products and tools. As such, we have been investing not only in “finished” intelligent applications, like Highspot in sales enablement or SeekOut in recruiting, but also in “intelligent application enablers” that provide the products and platforms that allow organizations and engineers to successfully build, deploy and manage their ML models and intelligent apps. We began investing in this wave well before this category became known as MLOps. Previous investments in companies like Turi (acquired by Apple), Lattice Data (acquired by Apple) and Algorithmia (acquired by DataRobot) helped pave the way and inform our decisions for future investments.

There are a number of core tenets of our MLOps investment thesis. Certainly, the category has proliferated and now there are defined subcategories with multiple vendors in each. We believe that the hyperscale clouds, with AWS the clear leader, will continue to offer excellent services that will provide great alternatives for many organizations, particularly those that either want (a) quick and easy tools for rapid prototyping and deployment or (b) an end-to-end platform backed by a single vendor.

[Image: Managing Director Tim Porter]

However, we also think that potentially even more organizations – particularly those with the most mission-critical workloads, which often drive the most usage and spend – will want to take a composable approach: choosing best-of-breed products from a variety of vendors, selectively using open-source software, building on hyperscaler infrastructure, and combining these with their own code that best addresses their business and application needs. Sometimes this is because their data or customers are in multiple places and their application needs to run on multiple clouds and even on-prem. Almost always it’s because customers view certain products across the MLOps pipeline as easier to use, deeper in features or functionality, and perhaps even more cost-effective.

Madrona is excited to continue our investment thesis in MLOps. We feel we are still in the early innings of this massive wave of every application becoming an intelligent application and have only begun to see the potential positive impacts of machine learning. We will continue to look for companies with visionary, customer-focused founders with differentiated approaches for providing key functionality across the MLOps spectrum, who are also savvy about partnering with the hyperscale clouds to best serve customers.

Why is open source important/interesting as part of that thesis?

In general, we feel that having an open-source component has become essential to a successful MLOps offering. First, the customer persona is typically an ML engineer, software engineer and/or data scientist, and these users have come to expect open source as a means to trial and experiment, and then to maintain a degree of control and extensibility once they implement it in production. They also value the community that develops around open-source projects for validation, troubleshooting and best practices.

[Image: Partner Jon Turow]

The best open-source offerings provide value to a single engineer, making her job easier or solving a problem that she personally is facing. From there, open source can spread “bottom-up” through an organization. This often then creates an opportunity for a commercial company to provide the additional team and enterprise features and support that organizations need once broader open-source adoption has occurred, as well as the option of providing a “we’ll run it for you” managed service.

Management teams appreciate that open source is a hedge against vendor lock-in. Investors and customers alike recognize the power of continuing to ride the wave of community innovation, which can often be more rapid and powerful than any single vendor. As investors, we also look to early open source adoption metrics as a key leading indicator as to what projects are taking off and solving important customer problems. For all these reasons, we view a robust open-source offering as key to any successful company in MLOps, as is true for almost every area of software infrastructure today.

How are you viewing companies that build modular ML systems vs. all-in-one solutions?

We like to invest in companies that provide a modular ML system, with a focused offering that solves an important customer problem in a differentiated way and can work multi-cloud, meeting customers wherever they and their data reside. A trend in MLOps has been that companies that become successful in an MLOps subcategory seem to evolve or expand into offering end-to-end solutions. Sometimes we question, however, whether this is in response to customer pull or just broader startup imperialism and ambition. We have seen some of both – if a vendor is providing a strong solution in the “front end” of the ML pipeline (say, labeling, training or model creation), customers might also want deployment and management that can close the loop back to training. However, in many more cases, we see that customers want a modular system where they can knit together best-of-breed solutions that best fit their environments and business needs.

Who are you investing in?

A successful investment we’ve made in this space is WhyLabs, the leader in ML observability across both model performance and data quality. Our investment started with a belief in the ability and vision of the incredible founding team led by ex-Amazonian Alessya Visnjic. They founded WhyLabs with the goal of equipping every ML team with the tools necessary to operate ML applications responsibly. Once in production, an ML application is prone to failure due to issues like data drift and poor data quality. WhyLabs equips ML teams with an observability platform that is purpose-built to proactively identify and fix issues that arise across the ML pipeline. With WhyLabs, teams from Fortune 100 enterprises to AI-first start-ups are able to eliminate model failures and significantly reduce manual operations, shifting focus to building more and better models.
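To make the data drift mentioned above concrete, here is a minimal sketch of one way a team might flag it for a single numeric feature. It is illustrative only and is not WhyLabs’ method or the whylogs API: it uses the Population Stability Index (PSI), a common drift heuristic, and the function names, bin count, and the 0.2 alert threshold (a widely quoted rule of thumb) are all assumptions made for this example.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of values per bin; out-of-range values fall into the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        # Advance to the last bin whose left edge is <= v.
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a baseline (e.g. training) sample and a current (production) sample.

    Hypothetical helper: bins are equal-width over the baseline range, and
    empty bins are floored at `eps` to avoid log(0).
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant baseline
    edges = [lo + i * width for i in range(n_bins + 1)]
    expected = bin_fractions(baseline, edges)
    actual = bin_fractions(current, edges)
    psi = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


# Assumed rule-of-thumb threshold; tune per feature in practice.
DRIFT_THRESHOLD = 0.2

if __name__ == "__main__":
    training = [i / 100 for i in range(100)]   # feature values seen at training time
    production = [x + 0.5 for x in training]   # same feature, shifted in production
    score = population_stability_index(training, production)
    print("drift detected" if score > DRIFT_THRESHOLD else "no drift")
```

A check like this only covers one feature and one kind of shift; monitoring model health and data health together, as described in this section, requires tracking many such signals over time.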

Just as best-of-breed winners such as Datadog and New Relic emerged in previous years for application performance management, we believed that ML observability would produce multiple large winners that provide this functionality cross-platform and integrate with whichever existing pipelines and tools customers choose. We also deeply believed in WhyLabs’ commitment to open source. They established the open standard for data logging, whylogs, which lets a single ML builder set up basic monitoring and create value immediately, and then “graduate” to a self-serve SaaS, and ultimately to a much deeper product suite with all of the features needed for enterprise ML teams. We also believed in the company’s conviction that to provide true observability, builders need to be able to monitor both model health and data health from a single pane, to truly understand root causes and the interactions between the two sides of this coin.

Another successful investment that highlights our MLOps thesis has been OctoML. The company was founded out of the University of Washington by Prof. Luis Ceze and an amazing group of co-founders. It is based on an open-source project called Apache TVM that the OctoML founding team created. TVM takes models built on any leading framework and, by using ML-based optimizations, compiles, runs, and accelerates them on any hardware.

While OctoML continues to be a core contributor to TVM, the project has blossomed widely and grown to include a broad consortium of industry leaders, including AWS, Microsoft, Intel, AMD, Arm, Nvidia, Samsung, and many more. OctoML’s core mission is to bring DevOps agility to ML deployment, automating the process of turning models into highly optimized code and packaging them for the cloud or edge. This reduces ML operating expense, obviates the need for specialized teams or infrastructure, accelerates time to value for models, and offers flexibility in hardware choice to meet SLAs. Here again, a core tenet of the investment thesis started with open source and the community innovation it leverages. Further, it illustrates the core belief that intelligent applications will be deployed everywhere – across clouds and endpoint types – from beefy cloud GPUs to performance-constrained devices on the edge.

Do these modular systems become all-in-one in the long term?

In short, we do not view the MLOps world as ultimately converging into monolithic end-to-end platforms. There is a large demand, and a rich opportunity, for new companies to provide these critical cross-platform products and services. An interesting related question is whether MLOps continues to exist as its own category, or whether it simply converges into DevOps as nearly every application becomes a data-centric intelligent application. Madrona’s belief is that these two spaces will largely converge, but certain customer needs specific to MLOps will persist.

Related Insights

    Why Madrona is Investing in Web3 and Why Startups Should Care
    Snorkel’s Alex Ratner talks data-centric AI and ‘one of the most historic opportunities for growth in AI’
    The Progress of Low-Code No-Code and an Update to our Thesis
    Starburst’s Justin Borgman on entrepreneurship, open source, and enabling intelligent applications
