TLDR:
- Voice AI agents are coming. Advances in LLMs and voice infrastructure are now at the point where voice agents can communicate clearly, understand intent, and actually work.
- We believe voice agents can be most impactful in verticals where they can leverage deep domain data, there is a high volume of real-time interactions, and the cost of false positives is relatively low.
- We are seeing new voice AI agent companies emerge in areas like home services, insurance, loan servicing, auto dealerships, and many more!
Not long ago, voice assistants were thought of as annoying chatbots used mostly for checking the weather or playing a song (even using Siri today can be a pain). But things have evolved quickly. With the arrival of LLMs and sophisticated voice infrastructure, we are already seeing AI agents adopted across enterprises, from customer service to complex enterprise applications, and spreading through industries at a pace the last generation of voice technology could not support.
At Madrona, we have been excited about the AI voice agent thesis for a while now, underpinned by our early investment in Deepgram, which provides speech-to-text (STT) and, more recently, text-to-speech (TTS) models that developers use to build applications across a wide range of use cases. While multimodal is still critical in today’s ecosystem, a HubSpot survey showed that ~70% of consumers still prefer to contact customer service by phone. In fact, over 80% of enterprise data lives in voice today, and we have reached a tipping point where this data is being unlocked by different products across various applications, not just in batch but in sophisticated real-time formats. The arrival of LLMs, together with a customer base that has grown more comfortable with conversational tools (through Siri, Alexa, and Google Voice in the last generation), has led to a Cambrian explosion of voice data and what we believe is a monetizable market.
We believe there will be large winners in various layers of the voice stack. Over the past 6-12 months, we have seen a widespread shift toward domain-specific agents and customizable workflows, made possible by fine-tuning, synthetic data generation, and lower model costs. To be clear, there are players like Deepgram that build foundation models while moving up the stack to launch “full stack” applications; here, we discuss “verticalization” in the sense of industry-specific workflows built on top of voice infrastructure. Our hypothesis is that by leveraging domain-specific data, verticalized agents built on top of existing voice APIs can truly transform industry processes and drive sticky workflows.
Let’s dive deeper into what that really means and where we see the market heading.
Why now?
After years of incremental progress, the pieces needed to make voice agents effective (speech recognition, natural language understanding, speech synthesis) have all hit inflection points in quality. With performant GPUs, falling infrastructure costs, and the rise of synthetic data generation, voice AI now enables a host of low-latency, real-time applications. Real business usage has exploded: it’s happening now, and it’s happening quickly across industries.
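To make the three pieces concrete, here is a minimal sketch of how a single conversational turn might chain them together. Everything in it is hypothetical: the `VoicePipeline` class and the `fake_stt`, `fake_llm`, and `fake_tts` stand-ins are illustrative names, not any vendor's API, and a production agent would stream audio rather than process it in one blocking call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """One conversational turn: audio in, audio out (illustrative only)."""
    transcribe: Callable[[bytes], str]   # speech recognition (STT)
    respond: Callable[[str], str]        # language understanding, e.g. an LLM
    synthesize: Callable[[str], bytes]   # speech synthesis (TTS)

    def handle_turn(self, audio_in: bytes) -> bytes:
        text_in = self.transcribe(audio_in)
        text_out = self.respond(text_in)
        return self.synthesize(text_out)


# Toy stand-ins so the sketch runs without any external service.
# Here "audio" is simply UTF-8 text encoded as bytes.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    if "hours" in text.lower():
        return "We are open 24/7."
    return "Could you repeat that?"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_stt, fake_llm, fake_tts)
reply = pipeline.handle_turn(b"What are your hours?")
print(reply.decode("utf-8"))
```

Keeping each stage behind a plain callable is one way to swap providers independently; whether a given vendor exposes its models this way is an assumption of the sketch.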
We are seeing large-scale enterprises within highly regulated and slow-moving industries like healthcare and finance at the forefront of the adoption curve. For example, Microsoft recently released its clinical voice assistant, Dragon Copilot, which leverages decades of medical speech expertise to become the first AI assistant for clinical workflow. Hospitals are adopting “ambient” voice agents to automate clinical documentation and relieve burnt-out staff, automating up to 30% of nurses’ documentation and saving hospitals $12B annually. Another example is Bank of America’s Erica Assistant, which has been used over 2.5 billion times by 20 million customers as both a personal concierge and mission control system to drive financial decision making. Beyond incumbents, emerging startups are tackling workflows that have remained stagnant for years, with companies like Abridge powering a platform for clinical conversations within healthcare systems and Prepared911 improving operational efficiency within emergency response centers.
Who should benefit?
There is an argument that every industry can and should be leveraging voice AI in some form. At Madrona, we are particularly interested in “verticalized” domains that present the largest near-term opportunity for rapid disruption of outdated industry processes. These are the criteria we believe make an industry ripe for voice AI:
- Labor intensive: Labor costs are a major pain point, and operational efficiency directly impacts the bottom line.
- Preference for voice as a modality: Voice and phone-based workflows are more effective, typically where there are two-way conversations and non-binary outcomes.
- Necessity for real-time interactions: Applications require real-time rather than batch workflows; batch workflows are easily commoditized.
- High volume of repetitive interactions: Large teams handle the same questions, processes or updates on repeat.
- Data collection at scale: Heavy reliance on gathering massive amounts of data that can be leveraged to drive better outcomes.
- Time-sensitive or always-on communication requirements: Being available 24/7 can be a competitive advantage.
- Relatively low cost of false positives: The cost of imperfection does not entail major operational risk and is low compared to the upside of efficiency. An 80/20 rule suffices until models improve over time.
- Regulatory or compliance-heavy workflows: Precise documentation, audit trails, and consistent processes are legally or operationally critical.
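As a rough illustration, the criteria above can be treated as a simple checklist. The sketch below is an assumption on our part rather than a scoring model from the framework itself: the criterion keys, the `framework_fit` helper, and the equal weighting of all eight criteria are hypothetical choices made for the example (Python 3.9+).

```python
# Hypothetical keys, one per criterion in the list above.
CRITERIA = (
    "labor_intensive",
    "voice_preferred",
    "real_time",
    "high_volume_repetitive",
    "data_collection_at_scale",
    "always_on",
    "low_cost_of_false_positives",
    "compliance_heavy",
)


def framework_fit(answers: dict[str, bool]) -> tuple[int, list[str]]:
    """Return (number of criteria met, list of unmet criteria).

    Every criterion counts equally here, which is an assumption of this
    sketch; in practice some criteria may matter more than others.
    """
    missing = [c for c in CRITERIA if c not in answers]
    if missing:
        raise ValueError(f"missing answers for: {missing}")
    unmet = [c for c in CRITERIA if not answers[c]]
    return len(CRITERIA) - len(unmet), unmet


# A made-up vertical that meets every criterion except the last one.
answers = {c: True for c in CRITERIA}
answers["compliance_heavy"] = False

score, unmet = framework_fit(answers)
print(f"{score}/{len(CRITERIA)} criteria met; unmet: {unmet}")
```

A checklist like this is only a first screen: it shows which criteria a vertical fails, not how much each one matters.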
We believe that AI raises the bar for sectors that have historically been underserved by tech yet represent an enormous opportunity. By specializing in specific domains, these agents can not only understand the right language and jargon but also plug directly into the operational systems that power everyday workflows. This enables end-to-end automation of high-friction processes like customer outreach, data collection, and specialized task execution, all at scale and with the right compliance controls and procedures.
These agents are not just replacing human calls; they are also reshaping operational efficiency by integrating directly with core systems, learning from every interaction, and adapting to complex regulatory and procedural environments. Each incremental call an agent takes creates more leverage and efficiency for the hundreds that follow, a compounding effect that is unique to LLMs and nearly impossible to replicate with human teams.
Where are the opportunities today?
We are excited by several verticals that fit our framework, and we expect them to see rapid, near-term adoption of voice agents. To call out a few:
- Logistics & Freight: Fleetworks, HappyRobot, and LoadPartner are companies building AI workers for the supply chain through voice agents that autonomously handle carrier-facing calls. Without the constraints of human scale, their agents can confirm loads, book shipments, relay tracking updates, integrate with internal systems, and log data quickly and cost-effectively. While they are designed to mimic human interaction, they ultimately deliver a faster and more effective carrier experience.
- Insurance Claims: Liberate, FurtherAI, and Sonant build voice agents for insurance carriers to automate underwriting calls and claims intake. They provide human-like conversation, along with 24/7 support, multi-lingual voice support, and zero wait time, creating a seamless claims experience.
- Home Services: Lace AI, Sameday, and Rosie develop AI-driven customer service software for home service companies (HVAC, plumbing, roofing, etc.) that analyzes incoming calls to detect lost revenue opportunities. They monitor 100% of calls and analyze each interaction to ensure no lead or opportunity is missed, creating a new standard for how revenue is captured.
- Financial Services: AviaryAI provides AI-powered outbound voice agents for banks, credit unions, and insurance companies. Customers have used Aviary for use cases including reactivating dormant accounts, obtaining voice authorization for disclosures, and cross-selling ancillary products. Separately, Salient is an AI loan servicing platform tailored for the automotive finance industry, already processing 39M+ unique customer interactions and driving a roughly 60% reduction in handle times. Their multi-modal AI agents collect payments, process changes and extensions, manage payoffs, and update insurance information, all in real time.
The Next Generation of Vertical Agents
For most companies, the initial wedge for vertical voice agents is clear. Replacing even a small call team with a voice agent is compelling in terms of cost reduction alone. While a 15-20 person call center costs upwards of $1M annually, a voice agent costs less than a single full-time hire – and it comes pretrained, integrated into systems, and immune to burnout (more importantly, turnover).
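A back-of-the-envelope version of that comparison is sketched below. It rests on assumptions that are ours, not measured figures: the call center costs exactly $1M (the text says "upwards of"), every seat costs the same, and one voice agent priced at no more than a single seat replaces the whole team. The function names are hypothetical.

```python
CALL_CENTER_ANNUAL_COST = 1_000_000  # USD; treated as exact for illustration


def implied_cost_per_seat(team_size: int) -> float:
    """Average annual cost per person if the total is split evenly."""
    return CALL_CENTER_ANNUAL_COST / team_size


def min_savings_fraction(team_size: int) -> float:
    """Lower bound on savings if the agent costs at most one seat
    and fully replaces the team (both are assumptions)."""
    return 1 - 1 / team_size


for team_size in (15, 20):
    per_seat = implied_cost_per_seat(team_size)
    savings = min_savings_fraction(team_size)
    print(f"{team_size} people: ~${per_seat:,.0f} per seat, "
          f"savings of at least {savings:.1%}")
```

Under these assumptions the savings floor depends only on team size, not on the total budget, which is why even a small call team makes the cost case compelling.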
Beyond cost savings, we believe the broader implication of voice agents is that they unlock new growth engines for top-line revenue. Once deployed, voice agents immediately pick up and answer every call, 24/7 and in any language, ensuring that no opportunity is missed, no customer is left hanging, and every call becomes a touchpoint for conversion or retention.
Perhaps most compelling, voice agents don’t just answer calls; they document, structure, and execute workflows with a level of consistency and scale that human labor has never been able to match. Over time, as these agents continue to capture data and build contextual understanding, they are well positioned to become the system of record for every carrier negotiation, every claim, and every customer interaction. As agents become more powerful, the question becomes: what happens when both sides of an interaction are driven by agents, without a human in the loop? At that point, the system can effectively become the ultimate market maker, driving programmatic exchanges between parties. As in many markets that programmatically match demand and supply, this unlocks a massive opportunity for industries that have been slow to innovate and constrained by labor for decades.
We believe the voice layer will evolve from just UI into the data backbone and operating system for entire industries. This is not just a labor and productivity play, but a new system for how businesses will operate, interact, and create value. Companies building this layer today aren’t just riding the wave of AI digitalization; they are defining what the next generation of vertical voice agents could and should look like.