Typeface Founder Abhay Parasnis on Shaping Enterprise GenAI Strategy

Typeface Founder Abhay Parasnis on leaving ‘established’ companies to found a startup, finding the right partners, and shaping GenAI strategy

Today, Madrona Managing Director Soma Somasegar talks with Typeface Founder and CEO Abhay Parasnis. Typeface combines generative AI platforms with its own brand-personalized AI, so all businesses can create content that is multimodal and on-brand. Typeface just announced $100 million in new funding, and Madrona couldn’t have been happier to participate in the round.

Abhay and Soma have known each other for almost 20 years from their time at Microsoft. But in 2022, Abhay left Adobe, where he was CTO and CPO, because he saw an inflection point coming that he wanted to be a part of. This was, of course, before ChatGPT and the other popular GenAI tools we've all been playing with this year had even come out. Typeface has made waves quickly, attracting Fortune 500 customers and partnering with Salesforce and Google Cloud.

In this week's episode, Abhay shares his thoughts on balancing the passion, and even recklessness, of going after a dream with building a product you know will resonate with customers. He explains that people relationships are the real currency when launching a new company, and why advisers are just as important as what he calls the flag planters and road builders that founders need to seek out to help them on their startup journey.

These two industry veterans share all of this and so much more.

This transcript was automatically generated and edited for clarity.

Soma: Good afternoon. This is Soma, and I'm a managing director at Madrona. Today, I'm really, really excited to have this conversation with Abhay, somebody I've known for the last 20 years or so and had the fortune to work alongside at Microsoft for many years.

More recently, Abhay has started a new company in the generative AI space called Typeface. He's the founder and CEO of the company. This is a great opportunity, and I've been looking forward to having this conversation with him. Welcome, Abhay.

Abhay: Thanks, Soma. Great to be here.

Soma: Great to have you on our Founded and Funded podcast series. But before we dive into the great work that you are doing with Typeface: you had a pretty successful career at Microsoft, then went on to Oracle and did some great work on Oracle Cloud.

Then more recently, you were at Adobe as their chief technology and product officer. Great experiences across what I call amazing companies in the technology space. How did your experiences at these organizations shape and prepare you for launching your own company?

Abhay: I think, as you know, when you look at those journeys in the moment, there are certain lessons, and when you look back, there are different things you take away. If I had to synthesize across all those years, amazing companies, and different experiences, probably three or four things come to mind. First, certainly with Microsoft and Adobe, it always starts with the product.

Building that deep technology moat, that deep product moat: products that resonate with users. It's something you and I maybe take for granted, having spent years in these world-class companies. But having that long-term orientation around building products and experiences that customers actually care about, and building defensible IP and moats in them that can last decades, is foundational.

As you know from your own journey, Microsoft has products that are still very relevant two, three, four decades later. At Adobe, there are products like Photoshop and Acrobat that are 30 years old and still the gold standard. That's amazing in an industry that changes every six months. So having that long-term orientation, with product as the core moat, is probably the first lesson, I would say.

As we think about building a long-term business, the question is: what are the long-term moats that customers are going to value and care about? The converse of that, which may seem a little contradictory, is that this is an industry that doesn't really respect yesterday's success. Unless you reinvent yourself today and tomorrow, what you did yesterday doesn't mean much.

That's the second, contradictory lesson: you have to go through cycles of reinvention. Over my last two or three decades in these companies, sometimes we got it right, and sometimes we didn't do it in time. But I do think that notion of reinvention, and facing the disruptions in the industry that are bound to happen every five or 10 years, is probably the second lesson. Some we got right, and some we painfully got wrong, and I think that teaches you just as much.

Then maybe the last thing I'll say is maintaining a beginner's mind, where you're constantly willing to learn new things. One of the challenges as you work in these big companies is that you have amazing products, amazing successes, and amazing talent all around you, but sometimes that can make you very myopic. Maintaining a beginner's mind to keep looking around the corner and reimagining what the new world could look like is probably, in some ways, the third big lesson, I would say.

Soma: That's fantastic, Abhay. One thing you mentioned that resonated with me a lot is this notion of unlearning and relearning, of reinventing. The good news is you had the fortune to go from Microsoft to Oracle, then to a smaller startup called Kony, and then to Adobe, so you had a variety of experiences. Every time you go from one environment and one culture to another, there is some amount of reinvention.

You've gone through that multiple times, but all of these are what I call established companies in some way, shape, or form. Going from that to saying, "Hey, I'm going to be the first guy, and I'm going to start building everything from scratch," requires the beginner's mindset you talked about. How easy or hard was it for you to go through that transition, particularly as you left Adobe and decided to get on the Typeface journey?

Abhay: I would say this in two parts. There are things you think you know when you are about to make a decision like that, and then there are things you actually live through every day as you go through that journey. Hopefully, there is enough overlap between the two, but there are also things you can never anticipate. First, I would say for me, Adobe was an amazing journey, an amazing company, amazing products.

The reason for me to start this company was this extreme burning desire: there was an inflection point coming in the industry, and I'm sure we'll talk about it. It's interesting sitting here today, when GenAI is all the rage in the market. When I left Adobe and started this company, none of those things had happened: no ChatGPT, no Stable Diffusion, no DALL-E.

But there was a deep desire and conviction that there was a shift coming. I didn't want to have any regret of not having participated in that in a deep, meaningful way. That overriding desire, when you're really passionate about something, almost stops you from being very thoughtful about all the other dimensions you have to go through. I do think a little bit of that recklessness actually helps, if I can say that.

That said, when you do start as the first person, and you are the one figuring out how to do payroll and how to actually register a company (and every state has a different law), there are a lot of things you take for granted that these big companies and platforms offer. There is an amount of learning new things that you just don't anticipate.

I won't lie and say all of it is enjoyable; some of it I wish I didn't have to go through, but it's all actually good learning. I would say it's been a fun ride. What's been really gratifying is that the people relationships you accumulate and build over the years are, ultimately, the real currency. In fact, I remember calling you when I was starting the company.

There are an amazing number of people around the industry, such as yourself, who just helped me quite a bit without any formal involvement or interest of their own. I remember talking to you over breakfast before I had even registered the company. That's been the other incredible part of the entrepreneurship journey. In big companies, you have lots of people around you anyway.

But here, you really value the relationships and all the advice and perspective that others bring to you, as you go on that journey.

Soma: I think that's a fantastic point, Abhay. Relationships matter, and you never know when they'll come in handy. People say that as a founder and CEO, you sometimes have a very lonely job. It's really, really important to have the right support structure around you, in terms of people and relationships that can be helpful to you and your network.

I think that's a fantastic point. Let's switch focus a little and talk about Typeface. We know that Typeface is focused on delivering a valuable service to enterprise customers. Talk to us a little bit about the genesis of Typeface and the unique challenge it's looking to address for enterprise customers.

Abhay: Let me step back, even before we get to all the exciting things happening with GenAI and why this represents a unique moment in time for our industry and for Typeface. If I zoom out, one of the amazing things my role at Adobe gave me was perspective across a lot of customers in the world of data and the world of content and creativity. One of the observations that led to the genesis of Typeface: look at the last decade or so, with all the transition to the cloud.

The data architecture in a typical enterprise has gone through a pretty big transformation. With big data, ecosystems like Spark, and amazing companies like Snowflake and Databricks, amazing innovation has happened in that ecosystem. You guys obviously have participated in quite a few of those companies. The key insight, Soma, was that, correspondingly, the content stacks in most companies have not gone through that reinvention and reimagination in the last couple of decades.

Yes, the mobile shift has happened. Yes, platforms like TikTok, YouTube, Instagram, Netflix, and Amazon have happened, but they all have their own proprietary content systems. Unlike data, which went from those companies through open source into enterprise architectures, content had not had an inflection point driven by fundamental architecture change. The key question was: is there a step-function change coming?

That's where generative AI comes in: there was an architectural shift coming that allows you to reimagine the entire enterprise content lifecycle. Specifically, what Typeface.ai wanted to address is that most companies, when you look inside their content systems, will describe a content paradox. Either they produce extremely personalized, high-quality content by hiring professional creatives or agencies, with the marketing department leading.

That gets you very on-brand, personalized content, but it's neither fast nor cheap to produce. Or you can do extremely high-speed, high-velocity content creation using modern tools, but you don't get a lot of the personalization that you want. The unique thing we wanted to solve with Typeface is: can we finally bring the world of personalization and the world of content velocity into one unified stack?

That's really the origin of where we started. Fortunately, GenAI was the technology fuel, if you will, that allows us to reimagine that.

Soma: You mentioned this earlier, Abhay. Sometimes we are so caught up in things today that we forget what the world was like even 12 months ago. As you said, the world hadn't heard about ChatGPT, DALL-E, Stable Diffusion, or any of these other large language models. I remember when you and I first started talking about this, and you said, "Hey, there is the modern data stack, and everybody's talking about data."

What is the modern equivalent of that for content? I, at least, didn't realize that we were right on the cusp of large language models taking the world by storm. We were thinking about where technology was evolving, though we didn't necessarily realize at that point that large language models were going to take off like wildfire in this timeframe.

But being there at the right time with the right idea is always helpful, and I think it has given you a fantastic start thus far.

Abhay: Yeah. No, you're absolutely correct, as much as I would like to claim I had complete insight into exactly how this would unfold. If you had asked me back last May, which you did, I would've probably said this is still three to five years out, and it's going to take us a while to get there with AI systems.

By the way, it still may take enterprises that long to fully get there. But clearly, what happened with ChatGPT has accelerated this into the broader consciousness at a much faster rate than I would've thought. It is exciting, but I don't think we should shortchange the road ahead. It is a long road. There's a lot to do to make this a mission-critical fabric for companies.

Soma: Completely agreed, Abhay, completely agreed. You and I have been in the technology industry in some way, shape, or form for many decades now. We saw the advent of client-server computing at Microsoft. We saw the advent of the web, and we saw the mobile platform take off amazingly well. Then more recently, the cloud has taken the world by storm. Each of these platform shifts has become progressively, almost exponentially, larger.

When we looked at cloud computing, we felt like, "Hey, for the first time, this could be a multi-trillion dollar opportunity for those who decide to play." Fast-forward 10 or 12 years, and we are now at the cusp of what we call the AI revolution (some people call it generative AI, but it's broadly AI). In your opinion, what distinguishes generative AI from previous technological waves, such as the internet, mobile, or cloud?

Then, more importantly, do you see generative AI being a key differentiator and opportunity for enterprises, and for how the future of work plays out for people in a variety of ways?

Abhay: If I had to distill this into a couple of frameworks that I use right now to think about what's happening with generative AI: first of all, I think you will agree that the rate of change in this particular wave is unlike anything we have seen before. We were fortunate to live through big shifts, you, obviously, with the desktop shift at Microsoft, and me with the cloud and mobile shifts at Adobe and elsewhere. Those were amazingly profound.

But generative AI is shifting all layers of the stack simultaneously, at a rate we haven't seen. There is a foundational platform being built by big players like Microsoft, OpenAI, Google, and others. There is workflow-level innovation being driven by existing companies and, hopefully, new companies like Typeface.ai. Then there is an experience-level breakthrough right in front of our eyes, where natural language becomes the new experience.

But if I had to give you three reasons why this shift is different: first, the role of computers is going to change and evolve. They have been machines for computation and automation in our lives.

They do number crunching; they drive productivity. Now, with these AI models, computers become machines that can see, hear, sense, and understand the world around us. They go from being computational, number-crunching devices to true personal assistants in our personal and work lives. That change in the role of computers in our lives is going to be very, very profound. That's number one.

The second thing I would say is what I call escaping the glass: for the first time, we are going to get a much more natural way of interacting with these devices and computers. As you remember, when the iPhone came out, multitouch felt like such a profound change because it was direct manipulation versus indirect manipulation with a mouse and keyboard. Now, imagine escaping the glass entirely and being able to use natural language, in your voice or in however you express yourself.

If you can communicate with machines at that high fidelity, it's going to feel like as big a jump as multitouch was, if not bigger. To me, escaping the glass is the second change that generative AI is going to drive. The last one is what you asked about, what I call rewiring the enterprise. As profound as these GenAI systems will be in our personal lives, as ChatGPT shows, I think the really profound impact is going to be in how entire industries, economies, and companies get rewired.

To say it succinctly: if you look at most companies today and the role IT and cloud systems play, they are a bunch of siloed, computational apps and systems. We, as users and knowledge workers, extract information from one system, do the job of brokering and connecting across six systems, and synthesize insights out of them. I imagine a world with GenAI where enterprises become extremely fluid knowledge fabrics.

A natural language layer over the entire fabric of systems will let you tap into any application, any system. The marginal cost of getting insights, telling stories, and expressing yourself in compelling ways is going to go down so much that in five to 10 years we will look back at first-generation SaaS applications, and they will look worse than green screens look next to an iPhone.

Because GenAI is fundamentally going to change the semantics of how we communicate with these applications in the enterprise.

Soma: I love the thinking and the articulation, Abhay. Particularly when you think about enterprises having that knowledge-graph layer, for lack of a better phrase, and being able to tap into it using natural language. Just thinking about it, the opportunities are boundless. I'm sure what we are going to see and experience in the coming years is going to be fascinating. Abhay, before I forget, let me congratulate you.

Recently, you had a bunch of phenomenal announcements that all came together. On the one hand, you announced $100 million of new funding, which is fantastic for a company of your size, scale, and aspirations. On the other, you announced the launch of your product, so congratulations on that. But the thing that also caught my attention was the strategic partnerships you announced with industry leaders like Google Cloud on the one hand and Salesforce on the other.

Congratulations on all these things coming together. They look like great building blocks for what's potentially possible in the future. But the thing I want to ask you specifically is: how do you envision these strategic partnerships helping you accelerate your company's growth and success in the market?

Abhay: Yeah. First of all, thanks for the kind words. Before I answer your question: it's great to have you here, as Madrona and you personally have been involved in my journey with this company from day one.

It was exciting to have you guys also officially join that round. You have been incredibly helpful to us, even before you were investors.

Soma: Thank you, Abhay.

Abhay: It's a pleasure to be part of this journey. Thank you. Look, at the end of the day, the investment is a great milestone, but the bigger one is, as you said, these strategic partnerships. The way we think about this, Soma, is in a couple of dimensions. On the product front, with a big industry-level shift like GenAI, you are not going to be able to go it alone.

You really have to find ways to partner with other players in this industry that have strengths at different layers of the stack. We have a deep partnership with Microsoft and OpenAI, and with Google we announced a partnership around their AI models. One way we think about these partnerships is: can we stand on the shoulders of giants? They're doing amazing work at the platform layer of the GenAI stack.

We don't really want to be building that capability ourselves. Having deep collaboration and deep access to what they're building allows us to innovate faster at our application tier. That's number one. Number two, as I talk to a lot of users about GenAI use cases, they want GenAI capabilities like Typeface delivered in the flow of the work, where they already are. They don't want to go to some new application every time they want to use a new generative workflow.

That's the second part of these partnerships. With Google, for example, we announced that we are going to bring Typeface right inside your Google Workspace applications. Or if you're a Salesforce Marketing Cloud user, we want to bring Typeface content generation right inside your email marketing application, so you don't have to go elsewhere. Being in the flow of work is a key strategy for us as a company, and these partnerships accelerate that.

But lastly, probably the most exciting part from a business standpoint: there is an incredible opportunity, as you know, with GenAI. Every enterprise around the world is starting to ask who the partners with best-in-class solutions are. For us, a big part of these partnerships was: how can we rapidly scale Typeface.ai to the opportunity that exists in the market?

If these large ecosystems and companies like Salesforce, Google, and Microsoft can help us scale the company and get in front of a lot more customers a lot quicker, that's not just incredibly exciting for us; frankly, we think it'll accelerate the overall adoption of generative AI in the marketplace.

Soma: Absolutely, absolutely. One of the other things you announced recently is that you launched the product, and you've got a set of customers now using it day in and day out. I've always been fascinated by the early days of designing a product, when you want customer input.

You want to have early design partners working with you to say, “Hey, what is working? What is not working? What is good? What is not good?” Can you talk a little bit about that iterative process that you went through this past year to get the product to where it is today?

Abhay: I agree with you. That's been one of the most fascinating aspects of the entrepreneurial journey, because when you're at a big company with a big ecosystem, it's a little different. It's also different when you're on a bleeding edge like GenAI, where there is a lot happening not just with technology, but with how companies adopt new tools, their processes, and their culture. If I had to give you my introspection on the last year with customers, a few things stand out.

First, the level of excitement and interest from customers around generative AI is just off the charts. I know you know that, with all the investment and activity going on. But what's a little different, Soma, in my mind, is that this is not just blind interest in some cool demo or putting a cool app out there. There is a real value orientation, even in these early days.

For example, they all love the promise of GenAI systems being capable of generating amazing content, but they're asking, "Okay. Tell me how it's going to help my top line, whether customer acquisition or retention goals." I actually think that value orientation is a good thing in the long term for both customers and startups. Frankly, it's a little different from some other hype cycles, where you are looking for a use case and don't really know what exactly this thing is going to be.

From day one, when we engaged, customers pushed us very quickly toward, "Hey, here are the three use cases where we would like to see major ROI. Can Typeface and GenAI help there?" That's number one. Number two, while the interest has gone all the way up to C-level audiences in every company, a lot of the practitioners are already out there trying these tools on their own. We have all seen that. Our kids are using ChatGPT for their homework assignments.

The collective awareness of these techniques means that enterprises are more inclined to figure out how to really safely adopt this. That's been the second thing. But the third thing, which has been extremely instructive for us, and which we are actually positioning Typeface around, is that this kind of change is not just about technology. There is a whole set of process, culture, and organizational changes, rapid re-skilling, and safety and compliance issues around AI.

There is a full-360 dialogue that we are finding ourselves having with customers. In some ways, Soma, even as a startup, we are not just playing the role of a technology provider, which we obviously are. Customers are really looking for a thought partner to shape their generative AI strategy and its evolution within their own business. Maybe the third thing, which is still early, is developing a maturity model for generative AI.

How do you adopt it, and what are the stages of maturity a typical company goes through? It's been fascinating to work on that jointly with a lot of our customers.

Soma: That's a good set of things to hear about, Abhay. Thank you. As an investor, I get asked a lot, "Hey, what do you focus on when you decide to make an investment?" I always say, "For me, it starts with the team, it starts with the people." I truly believe that building a world-class team is absolutely crucial for any successful and durable company. You've put together your early team and grown rapidly this past year.

Can you talk a little bit about how you pulled together your team? What are some of the attributes or qualities you look for in your founding team and your early team members? Then, equally importantly, there's culture. Everybody talks about how important culture is. Are there specific things you had to do to get the right culture from day one, or how is that coming along for you?

Abhay: First, I would just start by acknowledging how fortunate and lucky I feel. This is one of those things where I can't sit here and tell you that I planned it exactly this way, that it was all choreographed. As you know, when it comes to inspiring, attracting, and getting people on a journey, especially world-class people, they only do it when they believe deeply in a shared mission, a shared conviction, and shared values and culture, as you talked about.

It's been incredibly gratifying, back to our earlier discussion of the relationships you build over the years. One of my litmus tests in my career, Soma, and I know you share this in your own, is: how many people, across different stages of your career, would be willing to blindly follow you down a dark alley without knowing where it leads? In a very fortunate way, a lot of the early team are people I have been fortunate enough to work with over the decades, whether at Microsoft, Adobe, Google, or LinkedIn.

So first, we've been incredibly lucky in how people joined early on. That said, there are some things we have been very intentional and thoughtful about, and we remain so. For a journey like this, you really want people who are deeply, deeply passionate about technology and building breakthrough products, because different people are wired for different kinds of journeys. This one is super exciting, but it's also full of ambiguity, so you want people who are going to thrive on ambiguity.

One bit of terminology I sometimes use internally, Soma, is that there are two kinds of people you want to bring on a journey, at least in software products. There are flag planters, who plant new flags around new ideas and new innovations. Then there are road builders: once you know where you are going, you need very systematic operational excellence. For us, in the first year, we certainly needed a lot more flag planters, because the space is so new and dynamic.

We wanted to make sure there were enough people who are scrappy, have an agile mindset, and will thrive on ambiguity, but who are really inspired by exploring ideas that nobody else has explored. That's been one of our core tenets. One of the interesting balancing acts for us has been finding people who are that scrappy and nimble, with those agile mindsets, willing to go on journeys like that, but who are at the same time seasoned in enterprise software, understand the world of the enterprise, and have the experience of working at large-scale companies like Microsoft, Adobe, and Google. We have been very fortunate to find that rare breed of talent.

You know quite a few of the team members, but we have folks like Vishal Sood, who is our head of product. He is an amazing leader with large-scale experience at big companies like Microsoft, but is as startup-wired as anyone. Finding those people has been extremely gratifying. Maybe the last thing I'll say: a lot of people think about team building only in terms of who the members of the team are, but I actually think it's equally important to think about who your advisers are.

Who are the people you surround yourself with? Again, I've been very fortunate. You were one of the first people I called on, and those people help quite a bit in your formative stages because they'll warn you about the blind spots you may not see, and they have pattern matching from many other companies and ventures. Finding enough sounding boards and people who are really invested in your success is as much a part of team building as the core team itself.

Soma: That's cool. That's great to hear, Abhay. This last year has been fascinating, as you've seen a number of what I call generative AI applications come into existence. I can't talk to a company anymore without them talking about generative AI in some way, shape, or form, whether they're an existing company or a new company. But as you very well know, Abhay, developing generative AI applications is a complex task.

You've got all kinds of different large language models to think about: which off-the-shelf models to use or not, and which ones to bet on for which use cases or scenarios, keeping costs and performance in mind. Furthermore, while applications like ChatGPT offer what I would call general-purpose solutions, enterprise customers often want customization and personalization.

So on the one hand, the world of generative AI is going through what I call a rapid cycle of innovation. What stood six months ago may or may not be standing today, and what is standing today may or may not be standing six months from now. The rate of innovation is very rapid. From a Typeface perspective, how do you stay ahead of these advancements, innovations, and changes?

How do you make sure that Typeface.ai is, A, on the leading edge of technology adoption, and, B, marrying that with what your enterprise customers need in terms of personalization and customization? How do you bring that all together? How has it been for you?

Abhay: That's a great question, and in fact, it's a constant journey of tweaking and learning, as we said at the beginning. But there are a few principles we have evolved over the last year or so. As you said, it's only been a year, so it's still early days. First, in this space, if you're trying to be a leader in your category, you have no choice but to be very close, I would even say dangerously close, to the bleeding edge.

There's so much happening every day, and you have to have lightning rods on your team who constantly stay close to where the bleeding edge is. Now, the trick in generative AI is that so much is happening across ecosystems — open source, proprietary platforms — that you cannot chase every single idea. The trick becomes: which of these are fundamental shifts you should pay attention to, and which are okay to just ignore?

In fact, saying no to some really good ideas becomes quite an important skill in this space, because there's just so much happening. You could easily get distracted by 10 new shiny objects every Monday morning, and you can't really build a business that way. I'm fortunate that we have people who are what I call our GenAI scouts. They are out there in the ecosystem, hanging out on Hugging Face and in all the community papers.

They bring back the signal versus the noise: "Hey, LangChain is worth paying attention to, but maybe this other thing is not right now." We do that, and I think we do it reasonably well, but we obviously need to keep at it because it's an every-single-day thing. The second thing we try to do is constantly remind ourselves and the team that our job is not just to exercise these cool new frameworks, technologies, and models for their own sake.

It's about product and experience centricity: what the enterprise customer is actually going to want. One example I'll share: when we started adopting generative AI models for some of the marketing use cases, it turned out we could use a lot of classical computer vision models to do other things customers wanted that had nothing to do with generative AI. But combined with generative AI, they become a lot more interesting.

You have to maintain that experience and product centricity so you don't get enamored with, "Now there is a 50-billion-parameter model, and now there's a 300-billion-parameter model." Does it matter? Does it matter to the customer and the use case? Then maybe the last thing I'll say, and this is especially important for the enterprise: not everything enterprise customers care about is the flashy, glamorous, sexy demo of a new GenAI feature. They care a lot about compliance, security, governance, and IP leakage.

We try to make sure that while we innovate on the GenAI side, we also innovate on bringing that into the "meat and potatoes," if you will, of their existing environment. That's been great. Maybe I'll share one last anecdote. One of the things that's been fascinating: the team just organized a hackathon. I know lots of startups do hackathons. But the team planned it without asking any of us, and 48 hours later they showcased six or seven projects that came out of it. I was blown away not just by the sheer pace of innovation they were able to achieve with GenAI, but by how many ideas they had for delivering it as value to enterprise customers. Harnessing that energy is probably the ultimate answer to your question of how the team innovates in this space.

Soma: I'm glad you guys are doing that, because hackathons, and we've done them at Microsoft and at other companies, give people a chance to show what is possible. The energy people bring to the table, and what they walk away with, is transformational.

So I'm glad you guys are doing that. Before we wrap up, I thought we'd close with a little fun thing. The next three or four questions, let's do in a rapid-fire format. I'll ask the question, and you don't need to think too much; just say whatever comes to your mind, boom. Okay?

Abhay: That’s dangerous.

Soma: But I got four questions here, so let me go through them one by one. The first one, besides Typeface, which company building an intelligent application are you most excited about today?

Abhay: Yeah. There's a lot going on, as you know, and I do try to stay current by using lots of applications. It's dangerous to call out just one. In my personal workflow, there are lots of companies and tools I'm excited about, but there's a company called Perplexity, which is building a very interesting hybrid of search and Q&A. I'm finding that very useful and insightful in my daily workflow. I'm also spending some time with new modalities, like what's next in video and 3D.

With a company like Common Sense Machines, I'm experimenting with what comes next in generative AI, being able to generate entire games, if you will. That's been exciting. And in the market, companies like Runway are doing phenomenal work reimagining video workflows. I'm very excited about the new modalities around video, audio, and 3D, and how they will change all the workflows.

Soma: That’s great. That’s great. Next question. In your opinion, what would be the greatest source of technological disruption over the next few years?

Abhay: I would say the notion of natural language as a way to manipulate software is going to change what we consider the role of software in our lives. In fact, software is probably going to start feeling more directly embedded into various industries and workflows, like biology and health.

Soma: If you look at the last 15 months, Abhay, since you started Typeface, what is the most important lesson you’ve learned, and how has it shaped your approach to entrepreneurship?

Abhay: That's a big one. I know you said one; I'll give you two that are close. First, I would say adaptability in the face of change. I know lots of people say it, but I'll share a 10-second anecdote. We had come out of stealth with lots of positive reviews. We had raised significant capital back in February, and everything was looking great. Customers were excited, and two weeks later, the Silicon Valley Bank crisis hit.

As a startup, you never know what's going to hit you from what angle. If you can master adaptability, it becomes your single biggest strength against the big guys: the speed with which you can adapt and move. That's something I've definitely come to appreciate in the last year. The second would be that individuals and teams are capable of incredible things when they're truly bought in and aligned. If you can get to that point, you can do amazing things.

Soma: Two great pearls of wisdom there, Abhay. For my final question, how do you personally use generative AI to enhance your productivity on a daily basis? Are there any specific tools or techniques that you find particularly useful in your day-to-day work?

Abhay: Yeah. You said productivity at work, but I'm going to broaden that a little to my hobbies. I don't spend as much time on it these days, but I love landscape photography. The photography workflow is undergoing significant change powered by AI tools, and I'm loving that because it makes me a lot more productive in a limited amount of time. That's one area. By the way, the Adobe tools and teams are doing amazing work there.

I'm a longtime user, so that's exciting. Part of my daily workflow is getting AI-enriched, if you will. Maybe one more thing: I have a 17-year-old son who's a junior, about to go to college, and we are spending a lot of time on research and college applications and all that. It used to be that Google Search and YouTube were the two places you would go. I'm increasingly starting in Q&A research-type tools like ChatGPT and Perplexity.

That's starting to occupy more and more of the starting point of my workflow for information assimilation, knowledge, and understanding. It's early days, but it's very exciting, because you start with a very different frame when you start with those tools.

Soma: I should tell you, recently I was going to give a speech, and I was really tired, and I said, "Hey, let me get AI to help me."

I wrote a couple of sentences about what the intent was. I was blown away by the caliber of output that I got back.

Abhay: I hope you used Typeface to do that. If not, it'll get you even further.

Soma: Absolutely. It is just amazing to see what is possible with generative AI. And hearing what people are doing with it day in and day out is fascinating; there is so much more for all of us to learn and experience.

Abhay, I do want to say a big thank you again, both for us being a part of your Typeface journey and, more importantly, for the last 45 minutes or so here, having this conversation with us as part of our Founded and Funded podcast series. Thank you so much.

Abhay: Thanks so much. It was great to be here.

Coral: Thank you for listening to this week's episode of Founded and Funded. If you're interested in learning more about Typeface, please visit Typeface.ai. Thank you again for listening, and tune in in a couple of weeks for our next episode of Founded and Funded, where we'll bring in James Phillips, a new VP at Google Cloud and former head of Power BI at Microsoft, and Mark Nelson, former CEO of Tableau. These two former competitors talk about product-led growth, data and analytics, and scaling in the face of stiff competition.

Airtable CEO Howie Liu on Product-Led Growth, Combining AI with No-Code UX

In this week's IA40 spotlight episode of Founded & Funded, Investor Sabrina Wu talks with Airtable Co-founder and CEO Howie Liu. Airtable is a low-code platform that enables teams to easily build workflows that modernize their business processes. The company launched in 2012 and has been on a product-led journey since then. Last year, Airtable ranked number three in the growth-stage section of the Intelligent Applications 40. And just in May, the company announced new embedded AI capabilities to make it possible for teams to integrate powerful AI into their data and workflows.

In this episode, learn about Howie’s transition from a first-time founder to a second-time founder, the lessons he took with him from that journey, and how he decided to go up against the dominant forces in the low-code productivity tools space when he was only a few years out of school. As Howie explains it, to be a founder, you really have to have the perfect balance of naivety and pragmatism, but you’ll have to listen to hear his explanation.

This transcript was automatically generated and edited for clarity.

Sabrina: Hi everybody — my name is Sabrina Wu, and I’m an investor at Madrona Venture Group. I’m very excited to be here today with Airtable CEO and Co-founder Howie Liu. This is a particularly exciting conversation for me because I am a huge fan of Airtable, and I would bet many people listening to the podcast today also are, and if not, people have some homework to do to go check it out. So it’s been a lot of fun for me watching the progress and the growth and seeing how many use cases have really emerged over the years, and recently with the launch of Airtable AI, which we’ll spend some time digging into today toward the end of the podcast. So Howie, congrats on the success, and welcome to the Founded & Funded podcast.

Howie: Thank you all. Thank you for having me, Sabrina.

Sabrina: Howie, I’d like to start by going way back. So you’re not actually a first-time founder. In 2010, you founded your first company, a company called Etacts, which was an intelligent CRM. Etacts later sold to Salesforce, I think about a year after the founding, and you spent about a year at Salesforce before leaving to found Airtable. It’d be great if you could share with the listeners the journey of deciding to become a second-time founder. What made you decide to take the jump again, and what was the original inspiration behind Airtable?

Howie: In many ways, I see Etacts as a warmup act to Airtable. As a first-time founder, I really didn't know what it was like to start a company. I had worked on some small web apps, but nothing really formal. Etacts was the first company I co-founded where we actually went out and raised some money, went through YC, hired some people, launched a product, got some real traction, and, near the end, even turned on monetization for one of our features. It got some small amounts of real revenue. But in many ways, we were trying to do all of those things for the first time. And not just my first time as a founder, but my first time as a product operator. This was my first really meaningful job out of college, so it's not like I had built great products before and knew how to scale them.

So in many ways, I had to learn as I went along, and I was able to apply a lot of those learnings the second time around with Airtable and take a more deliberate approach. I would characterize that first company, Etacts, as just trying to figure out what we were supposed to do at every part of the company. With Airtable, we wanted to start with a lot more conviction about the opportunity: effectively create a business plan and a roadmap, and have more forethought about what's going to happen if we build this, and how we're going to validate every step of the way.

And in fact, the time I spent at Salesforce after being acquired directly inspired a lot of how we thought about Airtable — both the massive opportunity of democratizing the process of building business apps and distilling it into elegant building blocks, which in a way Salesforce did, but for a very different side of the market: much more complex, heavyweight applications. You build those on Salesforce, and it's a really great platform for that. With Airtable, we saw an opportunity to disrupt that and democratize the building of apps. Getting to see from within Salesforce what it's like to build a great app platform and take it to market across many different industries and use cases was definitely a direct inspiration for Airtable.

Sabrina: One of the things I've found really fascinating about Airtable is that it makes things easy for all users, regardless of how technical they might be. I think you used the word no-code: being able to build applications in a really powerful way without having to write the code. But when you founded Airtable in 2012, the idea of taking on somebody like Microsoft, who had dominated the productivity software market for decades, must have been a scary concept, especially, as you noted, only a couple of years out of school. So I'm curious, how did you think about creating a new collaboration tool? Can you tell us about the journey of tackling this problem and some of the challenges you faced along the way?

Howie: First of all, I think philosophically, when I reflect on the journey of being a founder, you have to have this perfect balance of naivety and pragmatism. Naivety, so you actually think you can do something as bold as take on these massive giants, whether it's Salesforce or ServiceNow or Microsoft. Pragmatism, so you're not doing it in a completely unstrategic way: you're finding a place where, structurally or otherwise, there's a weakness or a gap you can exploit. For Airtable, when we thought about the productivity landscape, there was a lot of incremental innovation. If you look at G Suite and what they did with Google Docs and Google Sheets, it's really cool, but in my opinion it was incremental innovation on the offline versions of Word and Excel. They brought them online. There was a lot of technical magic that had to go into creating real-time collaborative versions of those products. There was a technique they came up with called Operational Transforms that lets you handle many people editing in real time online and resolve all of their merge conflicts in a really seamless way. And yet, from a product standpoint, it didn't fundamentally unlock completely different use cases for Excel or Word. It certainly enabled more collaboration and solved a lot of file-saving and file-sending headaches, but it was still basically the same product experience.
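
To make the operational-transform idea Howie mentions concrete, here is a minimal, purely illustrative Python sketch (a simplification for this transcript, not Google's actual implementation): when two users insert text concurrently, one operation's position is shifted past the other's so that both copies of the document converge on the same merged text.

    # Minimal illustration of operational transformation for concurrent inserts.
    # An op is (position, text). If another insert lands at or before ours,
    # our position shifts right by the length of the other insert.
    def transform_insert(op, other):
        pos, text = op
        other_pos, other_text = other
        if other_pos <= pos:
            return (pos + len(other_text), text)
        return op

    doc = "hello"
    op_a = (5, " world")  # user A appends at the end
    op_b = (0, "oh, ")    # user B inserts at the start, concurrently

    after_b = doc[:0] + "oh, " + doc[0:]           # apply B first: "oh, hello"
    pos, text = transform_insert(op_a, op_b)       # A's position 5 -> 9
    merged = after_b[:pos] + text + after_b[pos:]  # "oh, hello world"
    assert merged == "oh, hello world"

Real collaborative editors also transform deletes against inserts, break ties between equal positions, and handle long chains of concurrent operations, but the core trick is the position-shifting shown here.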

And the opportunity we saw for Airtable was that most people are actually using Excel in one of two very different ways. One is number crunching. If you think about the original origin of the spreadsheet, Lotus 1-2-3, or even before that, VisiCalc, it was a glorified number-crunching tool for accountants — a computerized version of something that used to happen very manually, offline: you would literally crunch numbers by hand or with calculators. And yet, as Excel became more and more mainstream, people ended up using it as their makeshift database. They would come in and build customer lists, inventory lists, or even event or wedding RSVP lists.

So in practice, there was this split in how spreadsheets were used. The side we wanted to take on was the one where people were using spreadsheets not as the number-crunching tool for which they were originally invented, but as a lightweight database and workflow tool. For those use cases, we knew we could do a much better job, because we didn't have to compete head-on. We didn't have to recreate all of the advanced number-crunching functionality of a spreadsheet. We could pick off all of these tabular workflow use cases and do a much better job by building a product that was, at its heart, more of a database and app platform disguised as, or masquerading as, a spreadsheet interface, because we knew that would be a really intuitive way for people to just start using our product.
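
To make that "database masquerading as a spreadsheet" distinction concrete, here is a small, hypothetical Python sketch (illustrative only, not Airtable's actual data model): in a spreadsheet, every cell is a free-form value in a grid, while in the database model each column has a type and a cell can be a real link to another record.

    from dataclasses import dataclass

    # Spreadsheet view: every cell is just a free-form string in a grid.
    grid = [["Guest", "Table", "RSVP"],
            ["Ada",   "5",     "yes"]]

    # Database view: columns have types, and a cell can link to another record.
    @dataclass
    class Table:
        number: int
        seats: int

    @dataclass
    class Guest:
        name: str
        rsvp: bool
        table: Table  # a real link to another record, not a copied string

    table5 = Guest(name="Ada", rsvp=True, table=Table(number=5, seats=8))
    # Updating the linked Table record propagates to every guest that
    # references it; the grid above would need each matching string
    # edited by hand.
    print(table5.table.seats)  # 8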

Sabrina: I think that's a really important point that, at its core, Airtable is a database, as you just alluded to. That's one of the reasons it supports so many applications and crosses a variety of different audiences as well. So I'm curious, how did customers and users surprise you in terms of the ways they leveraged the product over time? I'm sure there were some really interesting learnings. Did you reprioritize or pivot how you thought about building the product over time? How did you think about that?

Howie: I think the starting thesis you have for a product often creates a self-fulfilling prophecy. Because the product was very clearly designed for tabular use cases, we didn't actually support any number-crunching functionality initially. You could not crunch numbers in Airtable if you wanted to. Now we have some ways of doing that, like formulas, or you can create reports and so on. But when we started out, there was no way somebody could use Airtable as a traditional number-crunching spreadsheet. When we got our early alpha customers and discovered the use cases they were building, it was stuff like a nonprofit building program-management and donor-management workflows, being able to administer their operations across many different locations. And these are much more apt use cases that might otherwise be powered by something like a workflow tool.

And I think over time, as we discovered more of these use cases, we leaned into them and built more functionality to really enhance them. At some point, we built templates: once we discovered initial use cases bubbling up, we would go and templatize them. Especially early on, a lot of it was SMB-oriented or even consumer-oriented. So we would take all of the great usage we heard about from the community and make it easier for people who signed up later to build the same things. You start with this thesis, and if you're right, or there's any inkling of being right, you start to see some organic growth around it. And ultimately, we just leaned more and more into it.

Sabrina: As I mentioned earlier, I've had the privilege of using the product for probably close to five or six years now. One of the reasons I really love the product, among many, is the clean UI/UX: it's so intuitive and easy to use without any knowledge of coding, or even any knowledge of how to use Excel. And I think that is a testament to your understanding of how to solve a customer pain point. When did you realize Airtable had this incredible product-market fit? How did you gain the conviction that you were actually solving a really big, important problem?

Howie: I think there are almost two ways you can go about finding scalable product-market fit. The first way is maybe what Twitter did: you build something, you're not really sure where it's going to go, you start with something small, and you just get some traction. If it works, you fan the flames and build from there. Plenty of great companies have been built that way; they start with something that seems almost like a joke or a toy and becomes something really, really big. We started the other way, which is to identify a really large opportunity that, on first principles, just should be solved. That opportunity for us was this need for apps in every part of every company. Functional apps, little apps that currently weren't being built, because it's too expensive to build an app with code, or even to take a very heavyweight solution and customize it to your needs. So instead, those needs were getting met by makeshift spreadsheets and documents and people emailing things, without a very structured way of doing their work when there should be one. We did a lot of research on that space, and we came to pretty high conviction that this should be a thing that exists. In fact, there were glimmers of it in the past. Some of the earliest software products when computing first came to the fore were database products like Ashton-Tate's dBase. There were Lotus Notes, Microsoft Access, FileMaker Pro, and so on. And, of course, on the large-enterprise side, there were big platforms that solved this problem. But we felt quite confident that there was a gap in the market for us to go and fill.

We felt like if we could just unlock the near-term product usability, onboarding, and growth mechanics, there would be a big light at the end of the tunnel. It's not like we'd build this thing and have no TAM. So we broke it down into the different phases of finding product-market fit. Initially: can we design a prototype that is intuitive enough for somebody to immediately start using it and build a real workflow, a real app? Some of our early alpha tests were designed to do exactly that. It wasn't about getting as many users as possible; it was about making sure we could solve real workflow problems for a small number of invite-only alpha customers. And at every step from there (even when we launched on Hacker News and got maybe, I want to say, 10,000+ organic signups, then a trickle of additional signups on our waitlist, dozens per day from there on, or when we launched publicly and got even more signups and even more daily organic signups), there wasn't a single moment where it was like, okay, we've made it, this is a thing. It was more that at every phase, we were unlocking the next phase of growth. We figured out that, clearly, the initial product had value and some people were able to figure it out, but a lot of people got stuck because they didn't know how to build or what they were supposed to build on the platform. So we had to do a better job of onboarding, we had to build more templates, and we had to add more sharing functionality so that once people built something, it was easier to collaborate with others in it. Along the way, we built up enough of these unlocks to keep sustaining growth, initially purely from a bottom-up PLG standpoint, and later from an enterprise go-to-market standpoint as well.

Sabrina: I was going to ask about the go-to-market model because Airtable has this really interesting mix of the PLG bottoms-up motion, where a user like myself can go on, try out the product, and test it, and then, if enough people from my company come, you can do more of an enterprise sales motion. But having two motions can sometimes be challenging, or at least has its own set of challenges. So I’m curious, how did you navigate that, and was one easier to implement than the other? In today’s day and age, everyone talks about how the PLG motion is the way to go because you get this long lead of customers and you don’t have to do the top-down sale. What was your thinking around that? And what light can you shed for the listeners on that point?

Howie: I think so many of our decisions early on were made probably naively. But naively, we thought, “Hey, if we just build a really great product, people are, of course, just going to come and want to use it.” And there were a couple of prior-art examples of PLG companies at the time we started. Dropbox and Evernote were probably the most notable ones. Slack actually didn’t launch until after we had started going. We started in 2012; I think Slack probably launched in 2014/2015, just right before we launched. So there weren’t that many great B2B, team-scale, or even department- or company-scale applications that had proven out this PLG motion.

So it was definitely early days, and thus very naive for us to assume it would work. And yet somehow, it did. I think, in this case, it was because we had a low enough barrier to entry for somebody to just pick up the product and start using it. And that was both a usability thing, in that it didn’t require you to learn a complicated manual to be able to build on Airtable, and an immediacy thing: it was easy to get immediate value. So we really tried to front-load how you get an MVP of a use case in Airtable up and running and have it be demonstrably better than the prior art — let’s say, using a spreadsheet or not using anything at all. We really tried to front-load a lot of the very easy-to-use yet powerful features, like having rich text fields or dropdowns, or being able to visualize the content in a way that was not just limited to the spreadsheet grid.

And I think over time, the product funnel just continued to compound. So PLG took us a very long way. We got to tens of millions in revenue by the time we went out and raised our unicorn round. And this was, I think, the time when no-code/low-code was starting to become more legitimized as a category, and the PLG engine was more recognized as a plausible path to building growth. And yet one of the limitations of PLG is that, from a product standpoint, sometimes you get stuck in smaller-scale use cases. It depends on the product, but in some cases, the mechanics of PLG in any particular product are that you get great bottoms-up adoption, but sometimes you need a little bit more of a push to actually consolidate a bigger data set. In our case, that meant becoming a system of record for something really mission-critical, and also becoming the way that an entire larger, maybe department-level, process is built, as opposed to a team-level workflow. And sometimes those things do emerge organically. We were very lucky to see early PLG traction carry us forward into these bigger meteoric use cases within larger enterprises.

But what we also recognized is that we didn’t want to just rely on that organic momentum to bring us there. We wanted to start engaging more directly in enterprise-level sales conversations to get those higher-value meteoric use cases, because we knew that the real opportunity was not just to serve lots of little, smaller, fragmented use cases, but actually to scale up and raise the ceiling of what you can do in Airtable. So it starts small, but you can also grow into a true system of record — something that’s really, really powerful at a departmental or even company-wide scale. So we had to shift into a very intentional mode of execution, both from a product and a go-to-market standpoint, to move not just upmarket into larger companies, but really up use case, into bigger and more valuable use cases within the enterprise.

Sabrina: That leads to an interesting place to pivot the discussion a little bit, to talk about AI and ML, because you can do a lot with different data sources across the organization if you’re able to connect different pieces within product and marketing and sales and create this feedback loop. And I also don’t think you can have a conversation with a tech founder these days without talking about something related to Generative AI. So I know Airtable just announced Airtable AI. I’d love it if you could tell us a little bit about what some of those features are. What are some of the embedded AI capabilities? And maybe tie it into what you were talking about, building on that concept of connecting different data sources within the broader organization.

Howie: So our approach to AI is that we think, first of all, the modern models, especially LLMs, are capable of really profoundly useful knowledge work. We’ve gone from, maybe over a decade ago, AI being a very narrowly applied thing. If you had a large dataset, you could do predictive analytics. You could create a better recommendation engine. I think of the Netflix Prize from when I was in college as an example. We then entered the phase where you could do really powerful machine vision, and you could identify what’s in images. That was a big breakthrough. But I think now the big moment for LLMs is that they’re not just capable of outputting text in a certain stylized format or writing emails, etc. Sure, those are some of the use cases, but most people who have interacted with ChatGPT are just scratching the surface of how much deep reasoning and creative work these LLMs are already capable of.

If you imagine the product roadmap use case in Airtable, you’re coming up with feature ideas. Maybe those are informed by user research that you’ve done. So you can track both user research and a feature backlog in Airtable, maybe also the release marketing workflow. Every step of that process probably has multiple points into which you can inject AI. AI is not just for very superficial things, but actually really meaningful reasoning work. So I’ll give an example: in the user research tagging phase, you can take user research snippets or insights and have AI categorize each of those. We have an AI field where you can take any input from a record and then output something that’s basically prompted from the LLM. So in a way, it’s having a little LLM brain embedded in every single cell of Airtable as a primitive, taking whatever inputs you want from the localized data that you have in Airtable and then outputting it seamlessly in the context of that workflow. Another example might be: okay, now you have these insight summaries of user research for each feature. Now pull those together, along with the high-level goals of this feature, into a product requirements doc and actually generate the first draft of that. And it’s more than just stylistic formatting. It’s actually going and thinking like a PM would: what does this product feature need? You’re Uber, and you’re trying to create a new feature — like a loyalty program — what should that entail? And so our goal is really to integrate LLMs into the context of your data, your workflows, and into our interface, all with a no-code UX around it, so that it just becomes another primitive in the toolkit you have to build apps. And ultimately, I think our thesis is that these LLMs are really powerful, but the real value gets unlocked once you put them into the context of data and workflows. And that’s really what we’re all about.
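
To make the “LLM brain in every cell” idea concrete, here is a minimal sketch of how an AI-field primitive could work: a cell whose value comes from prompting a model with other fields on the same record. This is purely illustrative; the `complete()` helper, field names, and prompts are assumptions for the example, not Airtable’s actual implementation.

```python
def complete(prompt: str) -> str:
    """Stand-in for a hosted LLM call; replace with your provider of choice."""
    return f"[model output for a {len(prompt)}-character prompt]"

def ai_field(record: dict, template: str, input_fields: list[str]) -> str:
    """Compute one cell: gather the record's own fields, prompt the model."""
    context = "\n".join(f"{name}: {record[name]}" for name in input_fields)
    return complete(template.format(context=context))

record = {
    "snippet": "Drivers say they would ride more if trips earned points.",
    "feature": "Loyalty program",
}
# First AI field: tag the research snippet with a theme.
record["theme"] = ai_field(
    record, "Categorize this research snippet into one theme:\n{context}\nTheme:", ["snippet"]
)
# Second AI field: chain off the first to draft a PRD section.
record["prd_draft"] = ai_field(
    record,
    "Draft the requirements section of a PRD from this research:\n{context}\nDraft:",
    ["snippet", "feature", "theme"],
)
print(record["prd_draft"])
```

Note how the second field consumes the first field’s output, which mirrors the chaining of fields Howie describes later in the conversation.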

Sabrina: I think that the point around being able to integrate directly into your workflow is a really, really important one. No one wants to leave their workflow to go look for an answer, ask a question, find the output, or go to ChatGPT and paste it back in. So if you’re able to show that directly in your native Airtable workspace, then it becomes much easier. But one of the questions I have for you is how you think about the UI/UX when it comes to that. That’s one of the big questions: there’s so much going on, how do you give the right outputs and continue to gain user trust as people are using the product? These models can tend to hallucinate, for example, so maybe you get the wrong tagging, which may not be the end of the world in this workflow, but how do you think about some of those things to keep the user really engaged?

Howie: So first off, I think there’s been a lot of speculation that AI is going to obviate the need for traditional user experience design. Everything’s going to be replaced by a natural language interface as the input, and then you’re just going to magically get the output that you want. It’s going to do all the work for you and perfectly hand it to you on a plate. And to your point, LLMs are very, very capable, and you can get accuracy up through a number of means, whether it’s fine-tuning, giving it few-shot examples, or just plugging it in with the right prompt and the right context. And maybe there are some pre- and post-formatting tweaks. But I think, ultimately, we’re a long way off from having AI that’s so powerful that it can just do everything you want without human intervention. And I think the more powerful applications, at least in the foreseeable future for these LLMs, are going to be ones that make the output very visible and interactive, so that the tolerance for error is very high. When I think about Copilot in GitHub as an example, it’s a really great application of AI because the worst that happens if it generates bad code is that the human coder can just review that code and edit out the part they don’t want, or change it. You can even have it generate 10 different examples of code and use those to inspire your thinking. And I think that’s the best way for LLMs to be used, especially in our context, where Airtable is primarily an internal application builder platform. You’re typically not building external customer-facing use cases on Airtable.

So in the internal use cases, you don’t have to worry as much about some of these other issues, like: is the content output copyright safe? Is it appropriate? Is it going to hallucinate? Our goal is, in the near future, to deploy LLMs, or encourage LLMs to be used, in contexts where the output can be seen by a human very easily and edited. It’s almost like a very, very advanced auto-complete step: it can generate the first draft of something, but there’s still very much an expectation that the human comes in. And this is, by the way, where it helps that Airtable is a very visible product. Everything in Airtable is visible: the data is visible, the steps in a workflow are visible. You can compose an interface. You can create fields that chain off of each other, and you can see the output of one AI field before you pass it into a formula field or another AI field or trigger some action with it. The fact that all of it is very interactive for the human, I think, helps in the cases where the AI’s output is not perfect but can be usefully wrong, or at least a good starting point. So I think it’s a really, really good call-out, and it probably increases the importance of having really strong UX around the human feedback loop.

Sabrina: And then I’m just curious, can you share with us generally: we’ve talked a lot about large language models, and there’s obviously been an explosion of new models. It seems there are larger models trained on even more parameters every day, and there are now these open-source models coming out. How are you thinking about the technology, and how are you building the infrastructure so you can easily swap in different types of models based on the use case, even when we think about different data types? Some models are better for structured versus unstructured. How do you think about the tech stack, and what does that look like?

Howie: We want to be fairly interoperable with any model. Initially, it’s going to be LLMs, but in the near future, we’re going to do text-to-image and other models as well. And I think the idea is our strength strategically is that we have really good no-code UX to build apps. And with our existing customer base, we have good data and good distribution in the context of specific customers. So our goal is not to aggregate that data and train our own supermodel on that data. Our goal really is not to go and do anything particularly fancy or deep at the model layer, but really to be quite interoperable with any model.

Right now, we’re really focused on making the product experience very seamless with OpenAI’s models. I think GPT-4 is a really, really capable model that can do so many different things out of the box. And in many ways, that’s really important to us as a platform because Airtable is also uniquely horizontal. We have all kinds of use cases in almost every industry, function, and company size: we’ve had cattle farmers doing cattle tracking in Airtable and lawyers doing case mapping in Airtable, all the way up to some of the larger-scale enterprise processes we’ve talked about.
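
As a rough illustration of what “interoperable with any model” can look like in code, here is a minimal adapter-layer sketch. The class and function names are hypothetical assumptions, not Airtable’s architecture; the point is simply that product code depends on an interface, not a vendor.

```python
from abc import ABC, abstractmethod

class ModelAdapter(ABC):
    """One thin interface per provider, so models can be swapped per deployment."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class CannedAdapter(ModelAdapter):
    """Demo stand-in; a real adapter would wrap a hosted or self-hosted model."""
    def complete(self, prompt: str) -> str:
        return "demo output"

def run_ai_feature(adapter: ModelAdapter, prompt: str) -> str:
    # Product code calls the interface, never a specific vendor SDK directly.
    return adapter.complete(prompt)

print(run_ai_feature(CannedAdapter(), "Categorize this support ticket: 'late delivery'"))
```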

Sabrina: And I think you mentioned an interesting point there around data. One question or concern that we’ve heard from some of the enterprise customers I speak with is how they can leverage their data with these models. They want to train on it because, obviously, if you put in your own data, the model becomes smarter and can solve problems in more contextually aware ways. But with that comes this question around data privacy and security. I’m sure you’ve thought about this, so I’m curious how the enterprise customers you work with might be thinking about how they can leverage their data and make sure that the data that’s proprietary to them isn’t fed back into the model. If I’m using Airtable, I want to make sure that doesn’t happen to me. So how have you thought about that?

Howie: All of our offerings, by default, will not have data retention, so your data will not be used to train models. That’s going to be a really important default guarantee that we have, just so that you don’t have to worry about putting your most trusted and high-value data into your Airtable. That should be a given. Secondarily, there are still going to be a lot of different preferences among enterprise customers. I’ve spent a fair amount of time talking to CIOs and CXOs at different enterprises, and I think every company has a slightly different stance on this, and it’s quickly evolving. I mean, nine months ago, probably most enterprises didn’t even have a strategy around LLMs: what are the LLM providers we’re going to partner with or leverage? Do we need to train our own, or use one of the open-source pre-trained models and deploy it on our own infrastructure?

It feels like the beginning of the cloud revolution, where everybody’s trying to scramble and figure out what their cloud strategy is. I think the smoke’s going to clear a little bit in the next, call it, six to 12 months, and there will be some stabilization of different enterprises falling into a few different buckets of preference. Some are going to want in-house, private-cloud-deployed offerings, whether it’s something like the Microsoft managed offering or the AWS Bedrock offering, etc. Others are going to be fine using OpenAI’s own offering. And our goal is to be interoperable with as many of those different options as possible, including if an enterprise wants to host its own model. It’s our goal to figure out ways to talk to those models in a secure environment and give you the best of both worlds. So I think the landscape is very quickly evolving, and it’s premature to call where things are going to settle.

Sabrina: With the landscape quickly evolving, one thing where I think Airtable has an advantage is that you have a large reach, a large customer base, and the distribution, and that’s what a lot of early-stage startups are looking for. But with that being said, there’s also a lot of innovation happening at a really fast pace. So I’m curious, with all these new companies popping up that are built with large language model technology at the core, what keeps you up at night as it relates to AI/ML? How are you continuing to stay ahead of the curve, educating yourself, and making sure that you’re building Airtable and positioning yourself in the best way possible?

Howie: I think, on the one hand, rationally, I can say the LLMs are going to continue advancing at pretty incredible speed, even if not just by increasing the size of the data sets they’re trained on, since we’re exhausting the number of available public-domain tokens we can use to train them with. But even just improvements, for instance, to how you fine-tune them and improve performance in specific applications, I think, are continuing to advance at a rapid rate. We’re going to see multimodal become a very widely available option for most of these models. And amid it all, the rational thing we can say to comfort ourselves at Airtable is that as long as data, distribution, and the UX of how you present that model (how you integrate it into a useful use case) remain valuable, we’ll still have a role to play. And we need to make sure that we’re keeping up to date on the latest advances in models, what new models we need to support, etc.

Then there’s the paranoid version of me. Similar to the dichotomy of naivety and pragmatism, I think you need a little bit of rational certainty and also some paranoid uncertainty to always be on top of the game. The paranoid version of me says, “Well, at what point do the models become so disruptive that a completely new experience of building apps becomes possible?” And I want to say that in the near future, or even the medium-term future, the no-code UX is actually the ideal way to build apps with LLMs. If anything, you’re going to want more UX around the feedback loops and the affordances for how people build and then use these LLMs in practice. But we want to be very, very plugged in. I’m personally spending a lot of time in the ecosystem learning from some of the most interesting and disruptive startups in AI, spending time at really every layer of the stack, from app companies all the way to the LLM providers, just to make sure we’re staying a couple of steps ahead of the game. It’s really exciting because, in many ways, it feels like in the entire world of AI, nobody really has a good consensus view on where things are going to shake out. And, certainly, in terms of where value will accrue, it’s really not quite clear. In many ways, it’s both terrifying and very, very exciting, because anything could happen, and we can’t fully imagine what product experiences and business models will look like five years from now as a result of all these continued advances and the compounding of AI capabilities.

Sabrina: Totally agree. Even as VC investors, we always say we try to predict what the future will hold and make bets based on that, but it is incredibly difficult to predict these days. It makes it really fun, as you point out, to think about all the innovation happening at each layer of the stack. I think it’s a really fun time to be a builder. Just to wrap up here a bit: we ask all of our intelligent application IA40 winners a lightning round of questions, so I’m going to ask you a few here. The first is, aside from your own, what startup or company are you most excited about in the intelligent application space, and why?

Howie: I think there are a lot of really interesting AI app companies that are finding some very specific use case built around AI. One example is Galileo. It’s a way to design interfaces with AI, and eventually you can output either a Figma design or code. I don’t know where it’s going to go. I think the founders maybe are actually still figuring out, in the big open-ended world of possibilities, where they can take this. And I think that’s actually part of the excitement. There are so many different entry points where you can apply an LLM and then build up all of the more specific product functionality and go-to-market execution to turn that into a real business. There’s a lot to be figured out, but I think it’s really cool to see a lot of these specific app companies go and try to find one use case to take and specialize in.

Sabrina: Outside of enabling and applying artificial intelligence to solve real-world challenges, what do you believe will be the greatest source of technological disruption and innovation over the next five years?

Howie: I think it’s hard to even say, because AI itself is so big. In a way, it’s almost like: what are the different permutations of AI? AI can be applied in very top-of-the-stack ways: like, hey, let me take one of these LLMs and build a transformative consumer or enterprise product experience. But there’s also going to be a lot of innovation around taking the transformer model architecture and training it with new data, whether it’s biomedical data or self-driving-car data, etc. So I’m going to give a non-answer, which is that it’s going to be AI and every single permutation of AI: applying models to new use cases, applying existing models to more interesting consumer-level UX innovation. It’s all of the above.

Sabrina: Last question. What is the most important lesson, likely from something you wish you did better, that you have learned over your startup journey?

Howie: I think the importance of moving quickly is not to be understated. In a way, Airtable benefited from being very thoughtful and methodical with our product roadmap, really thinking through the TAM and de-risking it. At the same time, every day really counts, and the more you can start compounding your learnings, the better. That doesn’t mean always going into hyperscale mode right away. I think it was actually a good thing that we took three years to build the product and launch it. We were very intentional about our early-days product-market-fit finding before we turned on the gas of let’s scale this up.

All that being said, the more you can accelerate that rate of learning, the better. I see this in the AI space, where all these new startups are launching and very, very quickly gaining user feedback, learning what works and what doesn’t. Maybe not all of them will have durable advantages right away, but I think the faster they get out there into the market and learn, the better. Especially as the world accelerates in its pace of change, being able to learn very quickly and scale up that process, as opposed to just focusing on scaling revenue or growth in traditional terms, becomes one of the most important core competencies as the landscape evolves.

Sabrina: Awesome. Well, Howie, this has been a lot of fun. Really appreciate you joining us today on the Founded & Funded podcast. Thanks again.

Howie: Thank you, Sabrina. It was fun to chat with you.

Coral: Thank you for listening to this IA40 Spotlight episode of Founded & Funded. If you’re interested in learning more about Airtable, please visit www.airtable.com. If you’re interested in learning more about the IA40, please visit www.IA40.com. Thanks again for listening, and tune in in a couple of weeks for the next episode of Founded & Funded with Typeface Founder Abhay Parasnis.

Numbers Station Founders on Applying Foundation Models to Data Wrangling

Numbers Station Co-founders Chris Aberger and Ines Chami talk about applying the transformational power of foundation models to data prep and data wrangling.

This week, Madrona Managing Director Tim Porter talks to Numbers Station Co-founders Chris Aberger and Ines Chami. We announced our investment in Numbers Station’s $17.5M Series A in March and are very excited about the work they’re doing with foundation models, which is very different than what has been making headlines this year. It isn’t content or image generation – Numbers Station is bringing the transformational power of AI inside of those foundation models to the data-wrangling problems we’ve all felt! You can’t analyze data if the data is not prepared and transformed, which in the past has been a very manual process. With Numbers Station, the co-founders are hoping to reduce some of the bifurcation that exists between data engineers, data scientists, and data analysts, bridging the gaps in the analytics workflow! Chris and Ines talk about some of the challenges and solutions related to using foundation models in enterprise settings, the importance of having humans in the loop — and they share where the name Numbers Station came from. But, you’ll have to listen to learn that one!

This transcript was automatically generated and edited for clarity.

Tim: Well, it’s so great to sit down and be able to have a conversation here on Founded & Funded with Chris Aberger and Ines Chami from Numbers Station. How are you both doing today?

Chris: Doing great. Thanks for having us.

Tim: Why don’t we just start off and tell the audience: what is Numbers Station? What exactly is it that you’re doing and building?

Chris: So, Numbers Station, at a high level, is a company that’s focused on automating analytics on the modern data stack. And the really high-value proposition that we’re providing to customers and enterprises is the ability to accelerate time to insight for data-driven organizations. We are all built around, and started around, this new technology of foundation models. I know it’s kind of the hot thing now, but when we refer to foundation models, we’re referring to technology like GPT-3, GPT-4, and ChatGPT. Bringing the transformational power of the AI inside those foundation models to the modern data stack, and analytics workflows in particular, is what we’re doing here at Numbers Station.

Tim: We at Madrona were super excited to lead the financing in your last round, which we announced not too long ago. And those who’ve been listening to our podcast know that we’re all in on foundation models and GenAI, and we think Numbers Station is one of the most exciting teams and approaches that we’ve come across. So, we’re excited to dig in with both of you here. Maybe tell us a little bit about your backgrounds. How did you meet? How did you come up with the idea for this business?

Chris: Yeah, so, and I’ll let Ines jump in here in a minute because she’s the brains behind a lot of the technology we have at the company, we all met at the Stanford AI Lab, where we were all doing our Ph.D.s on a mix of AI and data systems. That’s where I met Ines, as well as Sen Wu, who’s another co-founder, and then our fourth and final co-founder is Chris Re, who was our adviser in the Stanford lab. We came together a couple of years ago now and started playing with these foundation models, and we made a somewhat depressing observation after hacking around with these models for a matter of weeks: we quickly saw that a lot of the work we did in our Ph.D.s could be replaced in a matter of weeks by using foundation models. Somewhat depressing from the standpoint of why did we spend half a decade of our lives building and publishing these legacy ML systems for AI and data. But also really exciting, because we saw this new technology trend of foundation models coming, and we were excited about taking it and applying it to various problems in analytics organizations. Ines, do you want to give a quick intro on your side and a lot of the work you did in your Ph.D.?

Ines: Yeah, absolutely. And thanks for having us, Tim. My background is, as Chris mentioned, in AI. I did my Ph.D. at Stanford with Chris Re. My research was focused on applying AI and machine learning to data problems like creating knowledge graphs, for instance, and finding missing links in data using embedding-based approaches. These were the more traditional methods we were using prior to foundation models. Toward the end of my Ph.D., I started applying techniques like foundation models and LLMs to these problems, and we realized, as Chris mentioned, that it made our lives much easier. That’s where we got really excited and started Numbers Station.

Chris: Ines is being modest, so I’ll just throw in a quick plug on some of the work that she did. She was actually one of the first people to show that, by using these foundation models like GPT, you could apply them and replace a lot of the legacy systems, some of which we built, as I alluded to earlier, on various data-wrangling and data-preparation problems. She authored the seminal paper that came out and proved a lot of these things were possible, along with some other team members who are here at Numbers Station. She has really been at the forefront of a lot of what you can do with these foundation models.

Tim: That’s awesome, and it has a bit of a getting-the-gang-back-together feeling, in how Madrona got involved and how I met both of you. Chris Re had a previous company called Lattice Data that we were fortunate to invest in, which is where I originally met Chris, and it ended up being bought by Apple. The Factory was the original investor and sort of incubator for the company, and Andy Jacks had been the CEO of Lattice Data. And then there’s Diego Oppenheimer, who introduced us all; he’s another board member, part of The Factory, and former CEO of Algorithmia, which was another investment. So, you know, many times we invest in brand-new founders that we had never met before and had no connections with. In this case, there was some nice surround sound. To build on your point, Diego first sent me a demo video and was like, “Hey, you’ve got to check this out.” And I thought what you were doing was pretty magical. Then I read your data-wrangling paper, Ines, and some of the other papers you wrote, and I was just struck by how you’re a team that brings together cutting-edge fundamental research with a business problem that we’ve seen to be red hot, a glaring pain point for many years, along with bringing to bear a differentiated, defensible technology in this space, which we’ll talk about. So that’s a little bit of the background from our end. But it’s so fun to be working together with both of you and the rest of the incredible team that you’ve begun to build.

So, Chris, you mentioned data analytics upfront. Maybe say more about that. Why did you choose data analytics? You came from the Stanford AI Lab, literally the crucible of the research around foundation models; I think it coined the term “foundation models.” Why did you pick this problem? And then tell us a little bit more specifically about the persona you’re going after initially with Numbers Station.

Chris: Yeah, so when we were looking at where we wanted to take this technology and apply it, there were a couple of different observations that we made about why we decided to go into the data analytics space. The first is something near and dear to our hearts. You can look at all of our backgrounds, Chris Re and myself in particular, and we all have a mix of databases plus cutting-edge AI and ML. Data analytics is this nice sweet spot, near and dear to our hearts, that we’ve all been working in for the better part of our careers. The second observation was that we looked at the data analytics space and a lot of the tools that are out there, and we still saw people who were in a ton of pain. We looked at what practitioners were doing today, and there were still so many hair-on-fire problems in terms of getting their data into the right format such that they could get usable insights out of it. And so this really excited us: a lot of tools have entered this space, but there’s still a lot of pain from the customers’ perspective in their day-to-day jobs.

We’re really excited about taking this transformational technology and applying it to those kinds of hair-on-fire problems that we saw with different customers. And the third point (this one’s changed a little bit since the space became so hot in, let’s say, the last three or four months): when we were starting the company, we looked at where most of the AI talent was flocking. Like, where are the Ineses of the world going? And a lot of them were going to image generation or content generation or image detection, things of that nature. So, for lack of a better word, the sexier applications, not “how do I normalize the data inside of your database?”

So we saw this talent mismatch, too, in that we could bring some of our expertise on the ML side and really apply it to an area that, in our opinion, has been underserved by ML experts and the community. We were really excited about bridging that talent gap as well. Those are all the reasons we decided to go after the data analytics space as a whole.

Tim: This down-and-dirty enterprise problem has been, as you said, hair-on-fire for many years, with lots of dollars spent on solving some of these issues. You hear repeatedly that so much of the time and effort of teams goes into the end-to-end challenge of data analytics. Maybe we can break it down a little bit. There are front-end issues around how you prep the data, wrangle the data, and analyze the data. You mentioned the seminal paper on using FMs for data wrangling. Then there’s how you ask questions about it and how you put it into production. Talk a little bit about how you apply Numbers Station’s product and technology across that pipeline.

Ines: Yeah, so that’s a great question. At Numbers Station, we started with data preparation, or data wrangling as we like to call it, because, for us, it’s basically step zero of any data analytics workflow. You can’t analyze data if the data is not prepared, transformed, and in a shape where you can visualize it. So that’s really where we’re spending our time today, and it’s the first type of workflow we want to automate with foundation models. But ultimately our vision is much bigger than that, and we want to go up-stack and automate more and more of the analytics workflow. The next step would be automating the generation of reports: asking questions in natural language and answering them, assuming the data has already been prepared and transformed. That’s something foundation models can do by generating SQL, or other types of code like Python. Even further up-stack, we can start generating visualizations, as well as automating some of the downstream actions. Let’s say I generate a report and figure out there’s an issue or an anomaly in my sales data: can we generate an alert and automate some of the downstream actions that come with it? The vision is really big, and there are a lot of places where we can apply this technology. For Numbers Station today, it’s really the first problem, data preparation, which is probably one of the hardest problems. If we can nail this, there are a lot of downstream use cases that can be unlocked once the data is clean and prepared.
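
To make the “ask a question in natural language, get SQL” step concrete, here is a minimal hedged sketch. The schema, prompt wording, and canned `complete()` output are illustrative assumptions for the example, not Numbers Station’s actual pipeline.

```python
def complete(prompt: str) -> str:
    # Canned demo output; in practice this would be a real LLM call.
    return "SELECT region, SUM(amount) AS total FROM sales GROUP BY region;"

# Hypothetical table definition given to the model as context.
SCHEMA = "sales(id INT, region TEXT, amount NUMERIC, closed_at DATE)"

def question_to_sql(question: str) -> str:
    """Turn a plain-English question into a SQL query over a known schema."""
    prompt = (
        f"Given the table {SCHEMA}, write one SQL query that answers:\n"
        f"{question}\nSQL:"
    )
    return complete(prompt)

print(question_to_sql("What were total sales by region?"))
```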

Chris: And just to riff off what Ines said: we looked at a lot of tools that were out there in the market, and some of them, from our perspective, kind of skipped steps and went straight to the end state, the bigger vision Ines just alluded to. We noticed that over time, a lot of those tools had to add in data preparation or data cleaning techniques in order to make their tools work. So the way we view this is, you know, building the bricks of the house first, working on data transformations in particular, these things that can build data preparation pipelines, and then building on top of that to enable our more ambitious vision over time.

Tim: Yeah, I have to say that the data prep challenges are what initially got me really excited as well. The broader vision will really come to bear over time. We just see people wasting so much time on this fuzzy front end of getting data ready to actually do the analytics or the machine learning on it. It’s been a forever problem. There have been other products that have tried to address this but just don’t fully answer it. And, you know, our thought, in seeing your early prototypes, was that foundation models provide a zero-to-one here where previous products fell short. Maybe say a little bit more, Chris or Ines: what’s different now with foundation models that allows you to solve some of these front-end data prep and wrangling problems in really magical ways?

Ines: Yeah, there’s an interesting shift in terms of the technology, and something that is enabled by foundation models is who can do this transformation and who can do this wrangling. We’ve seen a lot of tools in the self-service data preparation world, like Tableau Prep or Alteryx, that automate some of these workflows. But it’s all drag-and-drop UIs and click-based approaches. So, in terms of capabilities, it’s still pretty constrained, limited by whatever is presented in the user interface and whatever rule is encoded in the backend. With foundation models, you’re basically empowering users who may not know how to write SQL or Python, or may not know anything about machine learning, to do these things the same way an engineer would. That’s where it’s really interesting, and it’s the turning point, we think, in terms of the technology: we can enable more and more users to do this work. And that’s why we’re pretty excited, for Numbers Station in particular, to enable more users to do data wrangling.

Tim: You know, some of the things you’re talking about, writing Python, writing SQL: historically, there’s been a bit of a divide, or maybe a lot of divide, between the data analyst, who sort of works at her workbench, maybe using a tool like Tableau or Looker or others to take data from different sources and create dashboards and outputs that she shares with her team, et cetera, and the data engineers who are building ETL flows and dbt scripts. Do you think of Numbers Station more as a workbench product for the data analyst, or more as a production workflow product for the data engineer?

Chris: I would say it’s even more bifurcated than you just mentioned, because you left out one camp, which is data scientists, right? You’ve got that whole other team sitting over there that does a lot of ML tasks, often on the output of the two teams you just mentioned. So I think the world is pretty bifurcated right now. One of the exciting things about this technology is that it can cut down this bifurcation. There doesn’t need to be so much of a hard divide between all the teams you mentioned. I think each of them still serves a purpose, and it’ll take a little bit of time to fully meld together and have the intelligent AI system that can bridge the gap between all of them. But at Numbers Station, what we can do is bring some of that data science capability over to the data engineering teams and data analysts, and bring some of that data engineering capability up to the data analyst. Our high-level goal is to enable powerful systems such that it’s not just prototyping at the highest layer; it’s prototyping pipelines that can then be easily deployed into production, so that you have fewer of these handoffs between teams.

Tim: So, not too long ago, you opened your waitlist and started bringing customers onto the product, and people can go to numbersstation.ai and check it out. What have you seen from customers? Where have you seen early traction? Are there certain use cases? I mean, gosh, data analytics literally touches, you know, every business in the world. Where do you see early opportunity and results?

Chris: I think this is true with all products: when you build it, you have your preconceived notions, of course, going in, of where you think people will use the tool. Some of those have turned out to be true, but some of the really exciting things are customers hopping on the platform and using it in ways that we never even imagined and didn’t have in mind when we built it. A lot of the early things we see with customers coming onto the platform is what’s called schema mapping: onboarding customer data such that you have a consistent view and a consistent schema of that data that can easily be used downstream. Also a lot of problems that look like entity resolution. We call this record matching in our system, but it’s effectively doing fuzzy joins, where I don’t have a primary and foreign key yet but still want to get a unified view of my data inside of my system. And then it opens up even further from there in terms of different SQL transformations and AI transformations, which are classifications of transformations we have inside our system, where customers have used them for a variety of things related to their businesses. But to answer your question, really those first two points, a lot of onboarding problems and a lot of matching problems in particular, are where people are finding a lot of value from our system right now.

Ines: Yeah. And just to add on, a lot of the use cases are for organizations that onboard data from customers who work with different systems. We’ve seen, for instance, Salesforce data being extremely messy, with open text fields and sales assistants writing reasons and comments about their pipelines. In insurance as well, claim agents are inputting entries. Whenever there are multiple systems like this that don’t talk to each other (in marketing, for instance, HubSpot, etc.), it becomes really, really challenging for these organizations to put the data into a normalized and standardized schema to deliver their services. And that’s where using foundation models to automate some of that onboarding process provides a lot of value for these organizations.

Tim: When you’re describing some of the customer scenarios, maybe paint a picture for people. When I said this is magical, what does the end user actually type in, and what do they see happen? Maybe paint the picture for people at home about what the actual power of using something like this is on a day-to-day basis.

Ines: Yeah, absolutely. We can just take an example like entity resolution and look into the details. With entity resolution, essentially the idea is: given two tables that have records like customer names or product names, we want to find a join key. But there’s no join key in these two tables, so we want to derive one based on textual descriptions of the entities, or of the different rows. The way this used to be done is by having data engineers or data scientists write a bunch of rules, either in Python or SQL, that say match these two things if they have the same name, and if there are one or two characters that differ, it’s still a match. It becomes really, really complex, and people start adding a bunch of hard-coded logic. With a foundation model, we don’t need that. The really amazing thing is that, out of the box, it can tell us what it thinks is a match or not. It’s not going to be perfect, but it removes that barrier to entry of having a technical expert write some code and some rules. Then, ultimately, the user can go and look at the predictions from the foundation model, analyze them, and say yes or no, the model was correct, to further improve it and make it better over time.

But really, the person doing the work now is the person who understands the data, and that’s really where the value comes from: they understand the definition of the match, and they understand the business logic behind it. That’s a big shift in terms of how it used to be done before and how it can be done today with foundation models.
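
A minimal sketch of what LLM-based record matching can look like, replacing hand-written matching rules with a yes/no prompt. The prompt wording and the `complete()` stub are assumptions for illustration, not Numbers Station’s product code.

```python
def complete(prompt: str) -> str:
    # Canned demo output; in practice this would be a real LLM call.
    return "yes"

def is_match(record_a: str, record_b: str) -> bool:
    """Ask the model whether two free-text records refer to the same entity."""
    prompt = (
        "Do these two entries refer to the same real-world entity? "
        "Answer yes or no.\n"
        f"A: {record_a}\nB: {record_b}\nAnswer:"
    )
    return complete(prompt).strip().lower().startswith("yes")

print(is_match("Acme Corp., 123 Main St, Seattle", "ACME Corporation, Seattle WA"))
```

In a fuzzy-join setting, a predicate like this would replace the hard-coded character-distance rules Ines describes, with the user reviewing the model’s yes/no predictions rather than writing the matching logic.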

Tim: This was the magic for me, as someone who could, maybe on a good day, write a little bit of SQL: to be able to go in, upload a CSV, connect to my data warehouse, choose columns, type in what I want to happen, and watch in real time as the product takes care of it. That’s the magical zero to one we’ve been talking about.

So, okay, throwing out words like magic and foundation models: what are you actually using? Today, when you say foundation models, a lot of people think ChatGPT. Maybe talk a little bit, Ines, about what’s under the covers. Without giving away anything confidential here, what is the secret sauce?

Ines: So, for Numbers Station, we need our models to run at scale on very large data sets that can be millions or even billions of rows. And that’s just impossible to do with OpenAI models, or very large models generally. So part of our secret sauce is distilling these models into very, very small, tiny models that run at scale on the warehouse. At a high level, there are two steps in using a foundation model at Numbers Station. There’s a prototyping step, where we want to try many things. We want that magic capability, and we want things out of the box. For that aspect, we need very large models, models that have been pre-trained on large corpuses of data and have these out-of-the-box capabilities. And that piece is swappable: it can be OpenAI, it can be Anthropic models, it can be anything that’s out there, essentially. We’re really leaning into open-source models like the Eleuther models as well, partly because of the privacy and security issues: some customers really want their own private models that can be fine-tuned and pre-trained on their data. So that’s the large-model prototyping piece. Then, for the deployment piece, which is running at scale on millions of records, we’re also using open-source foundation models, but they’re much, much smaller: hundreds of millions of parameters, to be more concrete, compared to the hundreds of billions, or tens of billions, in the prototyping phase.

Chris: Yeah. One thing to add on here as well: our goal is not to reinvent the wheel, right? Our goal is not to train and compete with OpenAI and all these companies that are in this arms race to train the best foundation model. We want to pick up the best of what’s coming out and be able to swap that in per customer, and then have this fine-tuning and personalization to your data, where you have model weights that you can own for your organization. This is something we’ve always had in mind in architecting the system and the vision for the company. Our view was always that foundation models are going to continue to become more and more commoditized over time. That was more of a daring statement, I would say, when we started the company maybe, you know, two years ago. It’s less of a daring statement now. I don’t even know how many open-source foundation models were released in the past week. It seems like a safer statement at this point that this is going to continue to be more and more commoditized, and it’s really all about that personalization piece. How do I get it to work well for the task at hand? In our case, looking at data analytics tasks: how do I get it to personalize to your data and the things that are important for your organization? Those are the high-level viewpoints that have always been important in architecting this system.

Tim: You’ve both used some different terms that I think the audience, and I know even I, would appreciate some discussion around. You mentioned fine-tuning as an approach for personalizing; Ines, you mentioned distillation, or distilling. There’s another related concept around embeddings. Maybe just talk through the different ways that Numbers Station, or anyone in general, can personalize a foundation model, and how some of those approaches differ.

Ines: Yeah, it’s a great question, and I would even start by talking about how these models are trained: by using very large amounts of unlabeled data, a bunch of text, for example. That’s essentially the pre-training phase, which makes these models really good at general-purpose tasks. What fine-tuning is used for is to take these large pre-trained models and adapt them to specific tasks. There are different ways to fine-tune a model, but essentially we’re tweaking the weights to adapt them to a specific task. We can fine-tune using labeled data; we can fine-tune using weakly supervised data, so data that can be generated by rules or a heuristic approach. And we can also fine-tune using the labels that are generated by a much larger and better model, and that’s essentially what we call model distillation. It’s when we take a big model to teach a smaller model how to perform a specific task. At Numbers Station, we use a combination of all of these concepts to build small foundation models specifically for enterprise data tasks that can be deployed not only at scale in the data warehouse but also privately and securely, to avoid some of the issues that can appear with the very, very large models.
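
Here is a minimal sketch of the distillation loop Ines describes: a large teacher model generates labels, and those labels become fine-tuning data for a much smaller student. The `teacher()` and `finetune_student()` helpers are stand-ins for illustration, not Numbers Station’s actual training code.

```python
def teacher(prompt: str) -> str:
    """Stand-in for a large hosted model used only at training time."""
    return "match"

def finetune_student(examples: list[tuple[str, str]]) -> None:
    """Stand-in for fine-tuning a small open-source model on (input, label) pairs."""
    print(f"fine-tuning student on {len(examples)} distilled examples")

unlabeled = [
    "Acme Corp / ACME Corporation",
    "Jon Smith / Jane Smyth",
]

# Step 1: the teacher labels data the enterprise never had labels for.
distilled = [
    (pair, teacher(f"Same entity? {pair}\nAnswer match or no-match:"))
    for pair in unlabeled
]

# Step 2: the small student learns from those labels, then runs cheaply at
# warehouse scale without ever calling the teacher again.
finetune_student(distilled)
```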

The other aspect of the question was embeddings. Embeddings are a slightly different concept, in the sense that they don’t involve changing the weights of the model. Embeddings are essentially vector representations of data. If I have text or images, I can use a foundation model to translate that representation of pixels or words into a numerical vector representation. The reason this is useful is that computers and systems can work much more effectively with this vector representation. At Numbers Station, for instance, we use embeddings for search and retrieval. If I have a problem like entity resolution and I want to narrow down the scope of potential matches, I can search my database using embeddings to essentially identify the right match for my data.
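
A minimal sketch of embedding-based retrieval for narrowing down match candidates before any expensive model call. The `embed()` function below returns fake vectors (not semantically meaningful) just so the example runs; in practice it would be a real embedding model.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a real embedding model; fake per-run-deterministic vectors."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(8)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_candidates(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Narrow millions of possible matches down to a handful of nearest
    # neighbors; only these would go to the (expensive) matching model.
    q = embed(query)
    return sorted(corpus, key=lambda c: -cosine(q, embed(c)))[:k]

catalog = ["ACME Corporation", "Globex LLC", "Initech Inc."]
print(top_candidates("Acme Corp.", catalog))
```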

Tim: I think a lot of people have heard about fine-tuning and think about, you know, prompt engineering: trying different prompts or putting in some of your own data to get the generative answer that you want. You’re obviously at a different level of sophistication here. You mentioned the pre-training piece. So, for a customer today using Numbers Station out of the box, is there training that has to take place? Do they have to give you data? What’s the customer experience as you apply these technologies?

Ines: There’s no required pre-training. They can start using it out of the box. But as they use the platform, that log of interactions is something we can capture to make the model better and better over time. It’s not a hard requirement the minute they come onto the platform, so they get the out-of-the-box feel without necessarily having the cost of pre-training.

Chris: And that improvement is per customer, right? We don’t take the feedback we’re getting from one customer and use it to improve the model for another. It’s really personalized improvement, with the continual pre-training and fine-tuning that Ines alluded to.

Tim: Across these different technologies that you’re providing, what do you think provides the moat for your business? Maybe you could even extend it a little to other AI builders out there, who maybe haven’t come from the Stanford AI Lab: how can they establish their moat? And for investors who might be listening, as they look at companies, where’s the moat?

Chris: I really think about this in kind of a twofold manner. One is where we started: we came from the Stanford AI Lab, our background is in research, and we still have that research nature in the company, in terms of pushing the forefront of what’s possible with these foundation models and how you actually personalize them to customer use cases. A lot of that secret sauce and technical moat is in the fine-tuning, the continual pre-training, and, eventually, a private FM per organization. And when I say FM, I mean a foundation model that can be hosted inside an organization and personalized to its data. So a lot of our technical moat is along that end.

There’s another whole host of issues, which I would call last-mile problems, in terms of using these models and actually solving enterprise-scale problems. There it’s all about making sure that you plug and integrate into workflows as seamlessly as possible. For that, we’re laser-focused on these data analytics workflows and the modern data stack in particular, making sure we don’t lose sight of that and go after a broader, more ambitious vision to solve AGI. So it’s really twofold: it’s the ML techniques that we’ve pioneered and are using under the hood, where we’ll continue to push the boundaries of what’s possible, and it’s making the product as seamless and easy to use as possible for customers where they are today, on the modern data stack.

Tim: Any other thoughts on this Ines?

Ines: No, I one hundred percent agree with Chris. There are technical moats around how we personalize the models and make them better. And there’s the UI and experience moat: basically, embedding these models in existing workflows seamlessly and making people love working with these models. Some people may say, “Oh, it’s just a wrapper around OpenAI.” But actually, it’s a completely new interaction, with the feedback, etc. Capturing that interaction seamlessly is a challenge and an interesting moat as well.

Chris: Just to double-click on that point: I think UI/UX is a huge portion of it, a huge part of that moat, in that second bin I was talking about. But it goes even deeper than that, too. If you just think about it: I have a hundred million records, let’s say, inside of my Snowflake database that I want to run through a model. If you go and try a hundred-billion-parameter-plus model, it’s just not practical to run that today. The cost it takes to do that, as well as the time, is really impractical. So when I say solving those last-mile problems, I also mean: how do we train and deploy a very small and economical model that can still get really high quality on what we like to call enterprise-scale data? This is really flipping the switch from “Oh, I can hack up a prototype really fast or play with something in ChatGPT” to “I can actually run a production workflow.” And it goes back to that earlier point of workbench versus workflow, Tim, and how these worlds can nicely meld together.
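
Chris’s practicality point can be made concrete with a rough back-of-envelope. All of the numbers below are illustrative assumptions, not actual vendor pricing or Numbers Station’s costs.

```python
# Rough back-of-envelope for running a very large hosted model over a
# warehouse-sized table. Every number here is an illustrative assumption.
records = 100_000_000          # rows to process
tokens_per_record = 200        # assumed prompt + completion tokens per row
price_per_1k_tokens = 0.03     # assumed hosted-large-model price, USD

cost = records * tokens_per_record / 1_000 * price_per_1k_tokens
print(f"~${cost:,.0f} for a single pass")  # ~$600,000 at these assumptions
```

Even at these rough numbers, a single pass over the table is prohibitively expensive, which is exactly why distilling to a small model that runs in the warehouse matters.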

Tim: I’d say, as we talk to lots of different enterprise customers broadly about how they’re thinking about using foundation models, or how they’re using them today, the first question that always comes up is one we’ve talked about: how do we train on our data, how do we maintain privacy, etc. Maybe tied for first is a question we get a lot, and I’m curious how you’re handling it: the hallucination, or confidently wrong, problem. This manifests itself in obvious ways if you’re using ChatGPT and ask a factual question, and it confidently gives you an incorrect answer.

Here you can imagine things like: I use Numbers Station to parse a million-row CSV and de-duplicate it. How do I know it worked? I can’t go through all million rows. How do you ensure there’s nothing confidently wrong in something that might cause problems down the road in the business?

Ines: Yeah, that’s a very good question and something we get a lot from our customers, because most of them are data analysts, and they’re used to everything being deterministic, either SQL or rules. They’re not used to having a probability in the output space, and they’re sometimes not okay with it. The way we approach this is twofold. We can propose to generate the rule for them. So, behind the scenes, let’s say, as you said, they’re parsing a CSV. I don’t need to use AI to do that, right? I can use a rule to do that. So the model is basically just generating the rule for the user, and they don’t have to go through the process of writing that SQL or that Python for the parsing. Then, for use cases where it’s just impossible to do with a rule, that’s what we call an AI transform. Let’s say I want to do a classification task, or something that’s just really impossible to do with SQL. There, we need to educate the users and make them trust the platform, as well as show them when we’re confident and when we’re not. Part of that is also around the workflow: showing confidence scores, letting them inspect the predictions, monitoring the quality of the ML model, and tackling use cases where a 2% error rate is still okay. For instance, if I’m trying to build a dashboard and I want macro statistics about my data, it’s fine if I miss 2% of the predictions. That’s the balance we’re playing with, essentially: either generating some code or using the model to generate the predictions, but really making the user comfortable with this.
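
A minimal sketch of the two paths Ines describes: a generated deterministic rule where one suffices, and an AI transform gated by a confidence score otherwise. The threshold, labels, and `ai_classify()` stub are illustrative assumptions, not Numbers Station’s product logic.

```python
def ai_classify(text: str) -> tuple[str, float]:
    # Canned demo output; in practice, a small model returning (label, confidence).
    return ("complaint", 0.62)

REVIEW_THRESHOLD = 0.80  # assumed cutoff below which a human should look

def transform(row: str) -> tuple[str, float]:
    # Deterministic path: a rule the model generated once, then runs as code.
    if row.startswith("INV-"):
        return ("invoice", 1.0)  # rule output is certain by construction
    # AI path: keep the prediction only if the model is confident enough.
    label, conf = ai_classify(row)
    if conf < REVIEW_THRESHOLD:
        return ("needs_human_review", conf)
    return (label, conf)

for row in ["INV-20931", "the driver was late and rude"]:
    print(row, "->", transform(row))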

Chris: And just to add on to that. These models aren't perfect right now, right? I think, as you said, Tim, anyone who's played with these models knows that there are some limitations and flaws in using them. And a lot of the use cases that we've seen to date are where humans are manually going through and labeling something or annotating something. It's not that we're completely eliminating the human from the process; we're just speeding them up 10x to even 100x in some cases by having this AI system provide suggestions to them. So, it's not completely human out of the loop, yet. Of course, that's the vision of where we think the world will eventually go, but right now it's still very much human in the loop, just accelerating that journey for the analyst to get to the end result.

Tim: I'll take a hundred-x speedup. In that vein, maybe we change gears a little bit here. I'm curious: you've built this small, high-performing team already at Numbers Station. How do you use foundation models on a day-to-day basis? I was recently talking to another founder who said he told his dev team: on our next one-month sprint, I just want everyone to stop everything they're doing for the first week and go figure out how you can maximally use all the different tools, and then start working on your deliverables. We will finish our sprint faster and more effectively than if you had just started working on it for the whole month. Anything you've seen in terms of how you use these tools internally and the productivity increases compared to years past?

Ines: I can speak for myself. Obviously, code generation is a big thing; everyone uses it now. One thing that I found funny is wrangling our own data with Numbers Station. We have sales data coming in from our CRM with our pipeline, and we wanted to do some analysis and statistics on that, and we ended up using Numbers Station, which was very fun as a dogfooding of our own product: analyzing telemetry data as well, product usage, and people on the platform. So that's something that we've done. And obviously, for all the outreach and marketing, it's quite useful to have a foundation model write the emails and the templates. So, I'm not going to lie, I've used that in the past. I don't know, Chris, if you have more to add to this.

Chris: What I was doing right before this call was using a foundation model to help with some of my work. One of the problems that I always had in working with customers, a kind of ever-present problem, is that you talk to customers and you want to create a personalized demo for them, but they can't give you access to the data, right? Because it's proprietary, and they're not going to throw any data over the wall. So, what I've been starting to use foundation models a lot for is: okay, I understand their problem, now can I generate synthetic data that looks very close to their problem, and then show them a demo in our platform to really hit home the value proposition of what we're providing here at Numbers Station?
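A sketch of that synthetic-data workflow might look like the following; the prompt and the `complete` helper are hypothetical, since Chris doesn't describe his exact tooling:

```python
# Illustrative: ask an LLM for fictional rows that match a customer's schema,
# so a demo never touches proprietary data. `complete` is a placeholder for
# whatever LLM client you use.
import csv
import io

def synthetic_rows_prompt(schema: dict[str, str], n: int = 50) -> str:
    cols = ", ".join(f"{name} ({typ})" for name, typ in schema.items())
    return (f"Generate {n} rows of realistic but entirely fictional CSV data "
            f"with columns: {cols}. Output CSV only, with a header row.")

def parse_csv(text: str) -> list[dict]:
    return list(csv.DictReader(io.StringIO(text)))

# Usage sketch (hypothetical client call):
# raw = complete(synthetic_rows_prompt({"customer": "string", "mrr": "float"}))
# demo_rows = parse_csv(raw)
```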

Tim: We were talking about productivity gains from using foundation models broadly at our team meeting yesterday at Madrona, and one colleague who has run a lot of big software teams over the years said: hey, if we wanted to prototype something in the past, you'd put eight or 10 people on it, and it would take weeks, maybe months. Now it's the type of thing one developer, one engineer, could potentially do in weeks or days using code generation and some of the dev tools. And Numbers Station is bringing the type of superpower to the data analyst that some of these code-gen tools bring to the developer.

I've alluded to the great team you all have assembled in a short period of time. It is a super exciting area, and there are a lot of talented people who want to come work in this space, but I think you all have done an extra effective job of hiring quickly and hiring great culture fits. And Chris, we haven't talked about it, but you spent four or five years at SambaNova before this, where you built a big team of machine learning software folks.

How have you been so effective at hiring? And what do you think this hiring market is like right now, in this interesting time?

Chris: Lots of practice and lots of failures, I would say, is how we've gotten here in terms of hiring. At SambaNova, it was an ultra-competitive bull market at that time, and hiring ML talent was really tough. So, I had a lot of early failures in hiring engineers, eventually found my groove, and built a pretty decent-sized organization around me there. In terms of the market right now, with all these layoffs going on, there's a lot of noise in the hiring process. But there are a lot of really good, high-quality candidates out there looking for a job. So, it's really just judging: hey, do you actually want to work at a startup, or do you want to work at a big company? Those are two very different things, and there's nothing wrong with either, but getting to the root of that early on is usually a good thing to look at here. And right now, there's just a ton of high-quality talent, and it's a little bit less competitive, I'd say, to get that talent than it was three or four years ago, when we were at the height of a bull market.

Tim: So many topics, so little time. I would love to dig deep into so many of the areas we've only been able to touch on today, but I'll just end with this: What is a numbers station? How did you come up with the name?

Chris: Yeah, so they were towers that I think were used in one of the World Wars, or the Cold War, that could send encrypted messages to spies. So, it was really about securely broadcasting information. That's one of the things we do here at Numbers Station, broadcast information to various data organizations, and that's how we decided on the name.

Tim: Chris, Ines, thank you so much. Really enjoyed the discussion today and look forward to working together here in years to come.

Chris: Awesome. Thank you so much, Tim.

Coral: Thank you for listening to Founded & Funded. If you’re interested in learning more about Numbers Station, visit NumbersStation.ai. If you’re interested in learning more about foundation models, check out our recent blog post at madrona.com/foundation-models. Thanks again for listening and tune in in a couple of weeks for our next episode of Founded & Funded with Airtable CEO Howie Liu.

Panther Labs Founder Jack Naglieri on Cloud-Native SIEM and Self-Growth


This week on Founded & Funded, Madrona Partner Vivek Ramaswami talks to Jack Naglieri, Founder and CEO of 2022 IA40 winner Panther Labs. Jack founded Panther, a leading cloud-native security information and event management platform, because he had experienced first-hand the threat-detection challenges companies have at cloud scale. Growing frustrated with the compromises required by traditional SIEM platforms, Jack took his experiences from Yahoo and Airbnb and set out to build a solution that detects and responds to suspicious activity in real time.

In this IA40 spotlight episode, Jack shares where the inspiration to launch his own company came from (hint: it was a cold email he received). He also breaks down why he decided to take the leap and become an entrepreneur, and what it's like transitioning from a software engineer to a founder, and then to a successful founder. Jack also shares details about what it takes to land — and keep — your first customer, and offers some advice about why CEOs should be the only ones learning on the job. But you'll have to listen to get all the details.

This transcript was automatically generated and edited for clarity.

Vivek: Hi, my name is Vivek Ramaswami, and I'm a partner at Madrona. Today we're excited to have Jack Naglieri, founder and CEO of Panther Labs, a cybersecurity startup reinventing security operations and taking a modern approach to detection and response at scale.

Welcome, Jack. Thanks for joining.

Jack: Thanks for having me.

Vivek: Well, maybe just to get started. Would love if you could share a little bit of background on Panther Labs. What was the founding story? What got you excited about modernizing security operations? How did you get into all this?

Jack: Yeah, it's a very non-traditional founding story, actually. The gist of it is that an investor found me when I was a security engineer, reached out to me cold via email, and I just responded, decided to quit my job, and went to pursue it. That's the very short version. The longer version is, I was part of the team that open-sourced a project called StreamAlert, and I was the main architect. We built it as an alternative to traditional SIEMs, like Splunk, Sumo Logic, and Elastic. The reason we decided to build our own, which is typically the wrong thing to do, to be completely honest (I do not recommend this at all), is that we really wanted a few things. We wanted to be able to operate at a very high scale with a very small team. We wanted to use developer-oriented principles, like detection as code, which we leaned on very heavily in that platform: we wanted CI/CD, and we wanted the automation that comes with developer workflows. We wanted higher reliability, accessibility, and more control. And then we really wanted structured data. We wanted to put data into a data lake, and we wanted a more formally mature way to handle petabytes of data.

We have failed for so many years as security teams putting this into a tool like Splunk. We've just dug ourselves into this hole. The good news is that there are a ton of alternatives to using something like Splunk, right? You can use data lakes, you can use cloud data warehouses, and there are so many today. At the time, when I was a security engineer, Snowflake really wasn't a popular option yet. And even Athena, which was the data warehouse on top of S3, was still fairly new as well. So these were really early concepts, but the thing I learned at that time was the phrase "security's data problem." I always think I'm the first person who said it, because as soon as I started saying it publicly, Splunk started copying me, which I thought was funny. But it's true, right? You need really strong data principles in security to handle the scale, but also to get value out of your data. And that's more of what we're really leaning into today. So the work I did there got the attention of some investors, one in particular. Actually, two had emailed me. One, I just completely ignored; we talk about it, and we're cool now. But the other one ended up incubating the company. I hired some early engineers, and then I went out and raised money, and got a bunch of "Nos," and then eventually someone said, "Yeah, we'll do your seed round." We raised our A from Lightspeed and our B from Coatue, and yeah, it's been fun. It's probably the hardest thing I've ever done in my life. But it's been super rewarding, super challenging. I've learned a lot, I've grown a lot, and I continue to — it's never a dull moment.
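To ground the detection-as-code idea Jack mentions above: a detection becomes an ordinary code artifact that lives in version control, goes through review, and is unit-tested in CI before it ever runs in production. Here is a minimal, generic sketch (not StreamAlert's or Panther's exact rule format), using CloudTrail-style fields:

```python
# A detection rule as plain code: reviewable, versioned, and testable in CI.
def rule(event: dict) -> bool:
    """Fire when a root-account console login appears to lack MFA (illustrative logic)."""
    return (
        event.get("eventName") == "ConsoleLogin"
        and event.get("userIdentity", {}).get("type") == "Root"
        and event.get("additionalEventData", {}).get("MFAUsed") != "Yes"
    )

def test_rule():
    # Fixture events exercised by CI before the rule touches production logs.
    assert rule({"eventName": "ConsoleLogin",
                 "userIdentity": {"type": "Root"},
                 "additionalEventData": {"MFAUsed": "No"}})
    assert not rule({"eventName": "ConsoleLogin",
                     "userIdentity": {"type": "IAMUser"},
                     "additionalEventData": {"MFAUsed": "No"}})
```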

Vivek: That is what we hear a lot from founders: super challenging, super hard, but super rewarding, and they can't imagine doing anything else. It's always both, right?

Jack: I feel like life is kind of like that in general. If you want to learn about yourself, you have to challenge yourself. There was a phrase I heard recently: if you want to reach your limit, you have to train at your limit. You've got to do the work, you've got to figure it out, and you have to push way beyond your mental limits. Obviously, there's a balance in startups. You don't want to just run at your limit forever, because then your performance begins to degrade, so that rest balance is hard. I'm pretty bad at it, to be honest. I'm getting better. I should rephrase and say, in the past, I was pretty bad at it, but now I'm getting better.

Vivek: Well, I was going to ask if you've always been that way, because you were at Airbnb between 2016 and 2018, when the company was probably growing and scaling like crazy, and there were probably all sorts of challenges associated with that, security and otherwise. So what were some of the lessons that you learned from that experience, both personally and professionally?

Jack: Yeah, Airbnb was amazing. I just love the founders and I think that they’ve done a really great job of building a great culture and really instilling their roots of design into the company in every way. I have nothing but respect for Brian, Nate, and Joe. I took a lot of lessons away from Airbnb that really allowed me to begin to understand what it means to build a startup.

So I thought about this question, and I came up with three things. The first one is: don't fear the unknown, and don't worry if you get it wrong on the first try. When I joined as an engineer — that was actually the first security engineering job I ever had. Prior to that, I was just an analyst. Being a security analyst is very challenging for a lot of reasons, but it doesn't really set you up to have a great career, because all you're doing is looking at data all day. At a certain point, it becomes less effective to look at it manually, and you have to start automating. That's the type of work I was doing at Yahoo, because I realized that, at a certain point, I was just unable to do my job effectively. So I sat with the DevOps engineers, and I sat with the security engineers, and I just became a sponge. I was like, just teach me everything. That's one example of really not fearing the unknown. You know, you have to push yourself out of your comfort zone a little if you want to grow. That pattern continued at Airbnb, but even more so, because I was hired to build a lot of security tooling. And Airbnb was a completely different environment.

But you know, the core was the same: a bunch of cloud infrastructure, a bunch of systems to secure, let's go figure out how to do it. But this time, all in AWS. Yahoo was this massive on-prem shop, as you know — they were 20 years old at that time — so diving right into AWS, I didn't know anything about the cloud. I just kind of went in and started building, and I made mistakes, and then I corrected them. So the mantra of fail fast is really important, and with fail fast, you have to learn from it; otherwise, you're just failing continuously. So that was one.

The second one is to learn to thrive in chaos. Just because something isn't perfect doesn't mean it's not effective. I think as engineers, we have a tendency toward perfection, where we're like: okay, it needs to be this way, it needs to be nice and neat, my classes need to be perfect, I need comments, all these things, right? But the thing at a startup is that nothing needs to be perfect for it to be successful. When you join a startup, you have to keep in mind that things are naturally chaotic, because no one has been responsible for the sliver of work that you are now responsible for. You have to train people into thinking like that. You were brought in to make this thing good. It is bad. That is natural. That's how this works. We are giving attention to it, and we are bringing you here to make it great. So that was the second one.

And the last one is: don't be afraid of taking ownership, and effectively be the change that you really want to see. In startups, again, because there are a lot of things that have never been focused on before, it's really your job to be an owner. That's one of Panther Labs' company values: being an owner. Customer love, be an owner, take care of the team: those are our three. And ownership is so important because if you see something and it's important, you just take ownership: "Hey, this thing just had to get done, so I went and did it." That's exactly the type of mentality you need in a startup, because, again, things are very chaotic. You're trying to figure out a bunch of things all at once. It's very much building the airplane as you're falling off the cliff. And the type of people who are self-starters and growth-oriented are going to allow you to both visualize what the plane needs to look like and then make it happen, just do the work, and get to a very different state, and then you have new problems.

One of my favorite quotes from one of my investors is, “We only make new mistakes”. It’s the same mentality. Learn from where you’ve come from, use it as a really key source of input for your next move, and don’t make the same mistake again.

Vivek: For you, going from engineer to founder, first-time founder, from a place like that, what were the biggest challenges? What was the light-bulb moment for you to decide: okay, I've got the product idea, now I just have to go do this?

Jack: Oh, it was total ignorance. I've been asked the question before: knowing what you know now, would you have still done it? And the answer's yes, but oh my God, I had no idea what I was getting into. Airbnb was my first startup experience, and working in a startup as an engineer and running a startup are completely different universes. But you get some of the same elements of urgency. Urgency and ownership are similar; it's just that as a founder, it's a hundred times as hard. Not to say that being an engineer in a startup is not hard, but it's just very different.

As an engineer, I was just really excited to keep working on that problem. And engineering was one of those things I was just so continuously intrigued by. One of my biggest strengths is orchestrating things. I've always found myself to be really good at: if you have a bunch of objects in a space and you need to organize them in a certain manner, how do you put them together to have a good outcome? My mind has just really excelled at those types of things. An example of that is when I got into DevOps. DevOps is this idea of: can you deploy a configuration onto a hundred thousand machines that are all different? It's a very hard orchestration problem, but it's really fun, because your mind has to work in very interesting ways. You're like: well, what is the state of this machine when I go to it? What is the state after, and how do I make sure that's reliable? And there are all these edge cases. Building a company is very similar. It's a very orchestrative movement where you're saying: okay, we need to figure out what product to build. We need to hire the right people. We need to put them in their most powerful positions, where they're engaged and using their strengths and their gifts to push us all forward collectively.

And you have to coach them and guide them, make sure their heads are in the right place, and focus them. It's a very similar mental model. So when I decided to start the company, it was really just that I wanted to keep building, because I knew the work at Airbnb was really just the beginning. Going from being a software engineer to being a founder with zero business experience has been a crazy journey. And going from being a founder to becoming a successful founder is like going from being the water boy on the football team to being the coach, and doing that in a year. That, to me, is the level of growth that you have to go through to be successful in that role. And then you have to continue to be the best. You have to continue to learn from the best and do the things that people who are the best do. And it takes a lot of growth. It takes a lot of the right contextualized knowledge, and it takes the right people around you, coaching you.

I was actually talking to another founder this morning because we were working out together, and he was asking me about the scaling journey of being a sole founder. I basically said: you have to hire around the things that you're not competent in, and you have to really trust that those people are great at that and have gone through that journey before. That's really key. As a founder and CEO, I'm always told I should be the only one really learning on the job. Everyone else should be coming in using their experience to push the whole company forward, really knowing the process and the technique of scaling their one sliver of the company. And I tell that to my team all the time. I always orient them around: if you're going to bring someone in, they have to be better than you. That's what you look for. And the line I use all the time is from Ben Horowitz's book (I think he took it from Colin Powell or someone), which is: "Hire people for specific strength versus a lack of weakness."

Startups are a team sport, you know. It's not the founder that makes it great; it's everything else around it. That's been the transition. It's been a massive step function every year, and every year is different. I continue to learn so much about how to do this, and I'm always going to keep learning, because it is very rewarding when you get it right, but it's super challenging along the way, and it's very existential a lot of the time.

Vivek: I'm sure with that exponential growth every year, you're always looking back and saying: these were the challenges, these are the opportunities. For every founder jumping in for the first time, you're sort of learning as you go, and you're a different person and a different founder probably every six months.

Jack: Maybe even every three months right now.

Vivek: Well, thinking about the transition you made to being a founder, how did you think about a market like the one you were entering, which is the SIEM market? As you mentioned, there have been players like Splunk that have been around for a long time and are pretty pervasive, and are well-capitalized. How do you look at that and decide, Hey, you know what? I’m going to jump in, I think there’s a new opportunity here. Maybe just give us a sense of what that was like, taking that plunge in a landscape like that?

Jack: I'll be really honest: going from engineer to founder, I knew nothing about go-to-market. Just straight up, right? What did I know? I was a decent engineer, and I knew security really well. But when I started the company, my mind automatically went to: I know how I can build this thing better, and I know how I can continue to solve the problems that people who looked like me were having in other companies. Because I was effectively given two options. I could join another startup. I could have joined a company like Stripe, right? The ones that were kind of peers to Airbnb at the time, but had a lot of cloud infra and a lot of the same problems. I could join a company and keep building internally, and just do this over and over, or I could build a company and do it for those same types of people, but support multiple companies. I could build one thing and make it really great, instead of building a bunch of internal SIEMs all the time. That's really what my target was. My target was: I want to build a better version of this that allows us, as analysts, to use a UI instead of everything being command-line, because that's what StreamAlert was. It was basically a backend service. And we really struggled with getting our analysts going, since they were fairly new to Python and didn't know about Terraform and all these things. It's very engineering-oriented. They didn't know about deployments and DevOps, which was basically a required skill at that point.

So, what I knew was that I wanted to build it with a stronger foundation on the backend in terms of the programming language we were using. I wanted a compiled language over an interpreted language, because it's high-scale logging, and it will just perform better. And I wanted to have a UI. Those were the two things in my head that I was focused on. The hope was that we'd be able to support an even higher scale of logging, and then companies would be able to use us either alongside their current SIEM or as an augmentation, and then eventually as a replacement once we'd caught up on parity. That's where my head was, and I didn't really think anyone was doing anything like this. Now it's a bit different, where there are more cloud-native companies, but just because you're cloud-native doesn't mean you're good at scaling. It's not guaranteed. You still have to do a lot of work, and my team at Panther Labs has done a lot of really amazing stuff to get to that scale. Just for a sense, I think our biggest customer was doing a few petabytes of data per month, and that was a mind-blowing number; it just wasn't possible before. And that's the start of what you need for a SIEM. You need some way of getting to that scale, because everyone is continuously growing, and these big Fortune 500s just have so much data, they're probably freaked out: "I can't even begin to start looking at this." So let's solve that problem. Now the next problem to solve, which is also very much a data problem, is how we get as much security value out of that data as possible, which is very challenging even to define, because security teams are all looking for different things in all these different ways. So finding the intersection of all that, really hinging a product around it, and showing very repeatable value is very challenging. Because detection is one of those things that's so non-binary. You can look for the breach for many years and never find one, depending on what's going on. If you're a big Fortune 500, you're probably targeted a lot more. But if you're a growth-stage startup, you could go without ever seeing anything happen. But you know you need to do it. It's like car insurance. You know you need to buy it, and you know you need to drive safely, but an accident may never happen. You do it and you pay for it because it's important, and you need to cover your risk for other people.

So in a lot of ways, this type of security is similar to that, whereas other types of security are very defined. Take cloud security: your cloud is secure and meets your standards, or it doesn't. It's very binary. Same thing with application security: you wrote a vulnerability into your code, or you didn't. Of course, there are gray areas with all of these, but they're much smaller gray areas than in detection.

Because detection is like interpreting the law. It depends on who's reading the law, right? It's the same thing with analysts. I've worked with analysts who are incredible at what they do, and the way they work is just magical. They know the system so intimately that I would look at the same logs and be like: I didn't see it. I just don't know how you found it. That makes this really challenging to do, and that's a challenge we have now. So initially it was: can we build tech that's going to allow us to get some early customers and solve the pains they're having, the pains we were having at Airbnb and Amazon (my early team was from Amazon as well). So, they were really good at scale. They knew what scale means. Now the second layer of that is: how do we make the most out of this data and make it so widely applicable that it's actually solving a lot of these detection challenges teams are having?

Vivek: It's amazing, because you talk about having these Amazon folks, and who knows scale better than Amazon, right? Even just getting them on board: this is the perfect opportunity for them to show what scale really means and how you bring scale to a next generation of customers that can actually start to use this. So take us to getting that first customer. What was that like? What was the journey? How did you feel?

Jack: The first customer was interesting. At the time, Panther Labs was open source. We had open-sourced the platform because the thesis was: engineers want to run open-source tooling, and that's going to allow them to trust us. In security, as a new company, it's a little bit of a chicken-and-egg problem, because you want people to use you, but no one trusts you until other people use you. So how do you get around that? You can do open source, because engineers are tinkerers and they want to play with stuff. So we did that, and it allowed us to get our first few customers. But the story I would tell is really about one of our first big logos, and that was a really transformative process, because it wasn't so much about the open-source element. It was really about: are we able to hook them in, get them interested, and then show them that we can evolve very rapidly and do the things they want?

So, we were on sales calls with them all the time. At the time, it was me, one engineer, and my now-COO. We were sort of playing the role of SE/AE, right? We would sit on calls with them, and they would say: hey, we like these things, but there are these other two things that are just missing.

So we would go build it, and we would get maybe three-quarters of the way there. Then we'd ask: what do you think of this? We eventually did that enough times that we got them to sign, and then we got others to sign using that same technique. In a lot of ways, that's super similar to what you have to do after that point as well. Getting the customer is one big piece of work, but then keeping them happy and showing that you're evolving over time is another.

Vivek: Jack, let's talk about AI, because this is the topic that is on everyone's mind. Today, Panther Labs does not incorporate AI into the core of its platform. How do you think about that? Is that something you even think about? Is it something you're thinking about for the future? Do your customers even care? We would love to get your thoughts on that.

Jack: Yeah. AI is a very complicated thing in security because of what I was mentioning before: detection is such a gray area. In a lot of ways, it's not great for that use case, because you don't always know input versus output (like, was this truly bad or not), and you don't have enough data. And everyone is very different; every environment is completely different. So naturally, it becomes not a great use case. However, with a lot of the advances that have been made, we're certainly investigating where the best mechanism is for deploying machine learning, and training on things like queries is a great place for us. How would we translate natural language into a query? Our effective backend is SQL (it's a data warehouse), so there's some cool stuff we could do there. There's some cool stuff around observing behaviors for people who are continuously doing certain response actions. There are a lot of things we can investigate. But in security, there have always been these systems called UEBA, user and entity behavior analytics. They're notoriously terrible, and a lot of people just ignore them. SIEMs in general have a bad rap. I think most people just hate the SIEM. They hate the category, and there's a reason they hate it: SIEMs were slow, they weren't scalable, they were hard to use, they weren't accurate, and they made people's lives a living hell every day.

And it's because there are core problems in security that were never solved. A lot of those core problems end up being data architecture problems. If you solve those, then you're on the way to having very repeatable ways of actually getting great value from your SIEM. But until you solve them, it's very difficult. That's also the precursor to doing things like AI, because you can't really apply machine learning to unstructured data; it just doesn't work. You have to understand what the logs are. You have to know that this is a login event across all these different log types. Then you can feed that into the model and say: hey, from the beginning of time, this is when Jack has logged in historically. Model, what do you think about this log? Is this a typical IP address that he would log in from? Is this a typical whatever? And there's a lot of processing we can do on top of that as well to make it very valuable. But when we ship features that do things like this, I want them to be really good. I don't want to ship something just to check the box when it's not helpful. I want it to be valuable. So we're doing a lot of building and investigation right now around what that next layer of analysis is. And I'm excited to see how the team decides, or doesn't decide, to use something like an OpenAI API or something similar. I think for us it's just making sure we have the right use case for value and then leaning into it heavily. So I personally pay attention to it a lot. I think it's exciting, and everyone's trying to build as fast as possible. It's great Silicon Valley energy, and it's really cool being here in San Francisco, just watching it and seeing what's happening in the industry. But security has lagged behind on technology for a long time.
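A toy sketch of the precondition Jack is describing: heterogeneous logs first get normalized into one structured login event, and only then can a model (or even a trivial baseline like the one below) ask whether a login is typical for a given user. All field names here are illustrative assumptions:

```python
from collections import defaultdict

def normalize_login(raw: dict) -> dict:
    """Map source-specific fields onto one login-event schema (illustrative)."""
    return {"user": raw.get("userName") or raw.get("actor"),
            "src_ip": raw.get("sourceIPAddress") or raw.get("ip")}

# A trivial historical baseline: flag the first time a user logs in
# from an IP we have never seen for them before.
seen_ips: defaultdict = defaultdict(set)

def is_anomalous(event: dict) -> bool:
    user, ip = event["user"], event["src_ip"]
    novel = ip not in seen_ips[user]
    seen_ips[user].add(ip)
    return novel
```

Real systems would replace the baseline with an actual model, but the normalization step comes first either way; that is the data architecture problem.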

Vivek: And for good reason, in some ways, right? As you mentioned, just slapping in GPT or an OpenAI plugin when you're dealing with really sensitive and private data that your customers are entrusting you with. I imagine it's not like a chatbot, where you can just move quickly to incorporate AI. You have to be thoughtful about it, given the constraints your customers are operating within.

Jack: A hundred percent. Yeah, those privacy concerns. And then, honestly, it's just value; I want to be able to deliver value there. It's funny, because I remember when the Web3 craze was happening a few years ago, and now it's like: oh, well, that didn't work out, let's do AI. But AI has always been a very enticing technology for security. Web3, obviously, in my opinion, had nothing to do with security. I always joked about doing NFTs of alerts, like security alerts for the breaches you got hit by.

Vivek: Just figure out a way to combine Web3 with AI with security, and your next round will just materialize.

Jack: That’s right. GPT will generate a term sheet for me.

Vivek: I love that. Well, you had a great tweet recently analogizing GPT to autocorrect, and you basically said it's an aid to creativity, which I really loved, because I think there are a good number of people out there who are a little bit spooked about GPT and what generative AI is doing. So, what are some of the ways that AI is aiding creativity within Panther Labs or within your own life?

Jack: I use it a lot. I use it for the kinds of questions I would've otherwise needed to crawl the web for. A thing I do a lot is ask a question in Google, then look at five or six different pages and read a few articles, skipping the clickbaity ones. Especially for entrepreneurial-level things, some articles seem to be just so clickbaity and so useless, or very surface-level. Very specific questions, I think, are really great for something like GPT. So, the way I use it is I'll ask very specific questions.

For example, I was building a new team, and I was asking about ratios. In sales, for example, you have ratios of AEs, SEs, and SDRs, let's just say, right? There's a certain ratio that you should maintain. So I was asking questions like that, just trying to understand: how should I, at my stage, lay out my team to do this? I use it a lot for summarization as well. If I write something long: hey, can you summarize this down? I'll use it for: I'm trying to think of a word that explains this. And it gives me great suggestions. That's perfect.

I’m not so much a fan of using it for net new things all the time. I use it when I know I have a pretty good idea of how I want it to work. And then I want to get a new iteration of that. That to me is a perfect use case for it.

Oh, actually, a really cool thing I did recently, which is more personal. So, I keep a list of questions in Notion. I have Notion in my private life as well because why not? It’s great. I love it. I’m a huge Notion fan. And I’m really big on asking good questions to people and getting to know people beyond the surface-level stuff. Because I think when you establish that level of vulnerability, you reach a new level of trust.

So I have a list of questions related to that. Like: what makes you trust somebody? What was the most rewarding trip you ever took? Questions like that. And I've worked on them for many years. So when GPT-3 and 4 came out, I thought: what if I feed it the questions I know I really like and get some more questions back? So I did that, and I thought it was super cool. And you can use this for interview questions, right? I'm interviewing for X, Y, Z role; these are questions I like; generate 10 more. That works beautifully, and it's an aid to creativity because it's inspiring. Maybe you get six back that you like and four that you don't. That's fine; that's six others you didn't think of. And in a lot of ways, this is a massive shortcut to having a ton of people around you. Because think about it: when you are building a company, you want a lot of diverse minds around who don't all share the same perspective, and that's how you build great things. Otherwise, you're tunnel-visioned into one mentality. You're in this box. And tools like language models really help you expand your mind authentically and in a way that is constructive. The one thing I will say that I thought was hilarious (I think I mentioned I'm very big into self-growth and those types of things) is that I asked ChatGPT, "How do I find true love?" I was already in a relationship; I was just curious what ChatGPT thinks is the way you find true love. And it was so on point, I was blown away. It said: Finding true love can be a complicated process, but it can be done if you take the time to focus on yourself and become the best type of person that you'd want to be with. Figure out what your values are and what your goals are, and put yourself in situations and settings where you're more likely to meet someone who shares similar interests and values. And be open and honest in your relationships, and don't be afraid to communicate your feelings and needs.

That’s actually pretty solid. I just got such a kick out of that one answer.

What self-love means is that you have to know yourself first. You have to know what your intentions are. And that's such an important thing for business as well. You have to set your intention going into the year. You have to set your intention for the future you're building. You have to set your intention with everything. How do you want to show up? Who do you want to be? What's your identity? Once you understand those things about yourself, then you're like: cool, this is what I think would be ideal in a life partner. This is what would be ideal in someone to run this function in my company. This is what would be ideal for this event I want to throw; this is the outcome; this is what I want people to feel. Once you get to that level of psychology, for yourself and others, I think you're just more effective in everything.

And a lot of the stuff that we do is all connected. Being into fitness and health creates a drive and a consistency that applies in other parts of your life. And when you have all of those together, then you're effective, right? But I think it's flawed to think, "Oh, I'm just going to be great at this business thing," because a lot of the work that you do on yourself can make you better at business, and vice versa. Sorry — I could talk about self-growth stuff for hours.

Vivek: I wanted to wait until the end, but it's too good not to ask you about the body hacking (I don't know if that's the right term anymore), all the things you do around fitness combined with being very insightful into what's happening at the self level. Is that new for you as a founder? Have you always been that way? Has it changed being a founder? Would love to get your thoughts on that.

Jack: Everything I do is for the purpose of longevity, and it doesn't really matter what I'm doing. If I become a parent, that's a whole new level of endurance that I need to be ready for. But even just being a founder requires a level of endurance, mental endurance and actual physical endurance. They're very much hand in hand. And I've just learned so much about my diet and my sleep and my movement. I'm at this point now where I've learned so much about how my body reacts to certain stimuli that it's been a total game changer for me. It's allowed me to have better focus, and it's allowed me to learn how to run at the right pace. I've had a string of health problems my whole life. I was actually never athletic as a kid, and when you're not athletic as a kid, I think it teaches you to just not be athletic in general. But when I got to college, something really cool happened: I had an athletic roommate, and he brought me to the gym, and I just kept going.

And there's so much to being well-rounded. If you want longevity, you need to have your diet on point, because diet is so underrated in terms of how it affects your energy. I think it's probably one of the most important things, aside from sleep, obviously. If you sleep poorly, then nothing else is going to matter, and you should read "Why We Sleep," which is a great book. So, if you're struggling with sleep, start there. And then, aside from that, learn about diet.

I wear a WHOOP religiously, this thing on my wrist, and the WHOOP taught me how to sleep right. I'd be working till 11:00 at night, then I'd go to bed, wake up at 7:00, and feel horrible every day. I'd have headaches when I woke up, I would just pound coffee, and I pushed through it. Then I learned how to sleep properly, and now I go to sleep between 9:00 and 10:00, and I get up at 5:00 to 5:30. I do my workout in the morning and have some time to myself. I set my intention for the day. I think another underrated part of longevity is your mental game. And then there's learning what your body reacts to. For example, I stopped eating meat about two years ago, and I still eat fish. I'm a pescatarian, and I find that works for me. Some people only eat meat, and that works for them. But the way your body reacts to food can significantly affect your energy.

But all these things I'm doing are to help my longevity and to make sure that I stay strong and flexible and push, but also recover, take a step back, and rest a little bit. And I'm getting better at that last part — that balance of activity and rest. If I'm constantly burning myself out, then I'm not being effective as a leader, and I'm not setting the right example for my team. Again, I'm getting better at knowing when to take time off.

It can be hard as a sole founder and a CEO. We do all these things, or I do all these things to be the best CEO I can be and to deal with the infinite stimuli that come with running a company. And if you don’t do these things and you’re constantly tired and you’re sluggish, you’re not going to show up in the right ways.

Vivek: Jack, just hearing the last five minutes of what you were talking about, I realize I'm underperforming on probably 12 different things that are not even company-related, between my sleep and diet and all these things. So I have a lot to learn from you.

Jack, this was fantastic. Thank you so much for joining us. Congrats to you and the team on everything you've achieved at Panther Labs, and everything you're about to achieve. It's really exciting to see where the company goes and where this sector goes. This was really enjoyable, so thank you so much.

Jack: Thanks for having me on, it was really fun.

Coral: Thank you for listening to this IA40 Spotlight episode of Founded & Funded. If you're interested in learning more about Panther Labs, please visit www.panther.com. If you're interested in learning more about the IA40, please visit www.IA40.com. Thanks again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with Numbers Station Co-founders Chris Aberger and Ines Chami.

MotherDuck's Jordan Tigani and DuckDB's Hannes Mühleisen on Commercializing Open-source Projects

MotherDuck’s Jordan Tigani and DuckDB’s Hannes Mühleisen on partnerships and commercializing open-source projects

Welcome to Founded & Funded. My name is Coral Garnick Ducken, and I'm the digital editor here at Madrona. This week, Madrona Partner Jon Turow brings us a story about a great partnership that formed between two people — who had never even met — when they each found themselves on a mission to focus on what it is they do best. You'll hear from Hannes Mühleisen, creator of the DuckDB open-source project, and Jordan Tigani, the database leader who saw an opportunity to commercialize it by creating MotherDuck. They share the lightning-bolt moment that led to one of them flying halfway around the world to meet. How does this happen, and how do they set their partnership up to be the foundation of a really big business while still supporting the open-source community? Jon gets into all of this and so much more.

MotherDuck and DuckDB have become integral for students of the modern data stack, but this story of inspiration, partnership, and execution is something that builders everywhere can learn from. So, with that, I’ll hand it over to Jon to take it away.

This transcript was automatically generated and edited for clarity.

Jon: This is Jon Turow. I'm a partner at Madrona, and I'm just really excited to be here together with my good friends, Jordan Tigani and Hannes Mühleisen. Thanks so much for joining, guys.

Jordan: Great to have the chat with you, Jon.

Hannes: Yeah. great to be here. Thank you.

Jon: So, I want to get into the genesis of DuckDB and MotherDuck. Jordan, you’re the founder and CEO of MotherDuck. Can you tell us what MotherDuck is?

Jordan: Sure. MotherDuck is a serverless data analytics system based on DuckDB. You know, we're a small startup company. We first got our start, or even started thinking about it, in April of 2022, and we were funded by Madrona, among others, a few months afterward.

Jon: Hannes, can you talk about what is DuckDB? What was the genesis of it, and sort of your part of that story?

Hannes: Sure, I'm happy to. So what is DuckDB? DuckDB is a database management system — a SQL engine. It is special because it is an in-process database engine, which means it's running inside some other process. It is an open-source project. We have been working on this for the last five years or so, and it's the creation of myself together with Mark Raasveldt, who was my Ph.D. student at the time. From the words "Ph.D. student," you can already deduce that this was in some sort of academic environment. At the time, I was a senior scientist at the Dutch national research lab for mathematics and computer science, the CWI in Amsterdam, which is famous for being the place where Python was invented, among other things. There, I was in a group called Database Architectures, which has been working for many years on analytical data management engines. For example, they pioneered columnar data representation for database architectures, and they pioneered vectorized query execution. It's been quite influential, let's say. That's nothing to do with me; I joined after all these things happened. But I did notice that while there were all these great ideas and great concepts flying around, there wasn't really that much in terms of real-world impact. As a result, people were kind of using, let's say, not the state of the art, right? I found that a bit sad. So we started talking to practitioners, figuring out where the problems were. And it turned out that this management of setting up data management systems, of transferring data back and forth, was really a concern. It really didn't matter how fast the join algorithm was if your client protocol was horrible. That is, I think, one of the basic insights. People have written hundreds of research papers on join algorithms, but nobody had ever thought about the end-to-end here. So we decided that we were going to change that: we were going to actually look at the end-to-end, and we were going to bring the state of the art in research to a broad audience. And so we started implementing DuckDB back in 2018, I believe. It's a bit insane to say: okay, we are two people who are going to write a database management system. These are things that usually hundreds of people work on for 10 years. And we had been warned. But I think one of my character traits is to kind of leap without looking sometimes, and that's definitely an instance where I was leaping without looking. You could also say the company is a case of leaping without looking, but we can talk about that later.
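The in-process point is easy to see in code: there is no server to start and no connection string, because the engine runs inside your program. With the DuckDB Python package, for example:

```python
import duckdb  # pip install duckdb; the engine links into this process

# No server, no socket: the queries run inside the Python process itself.
con = duckdb.connect()  # in-memory database by default
con.execute("CREATE TABLE t AS SELECT range AS i FROM range(1000000)")
print(con.execute("SELECT count(*), sum(i) FROM t").fetchall())
```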

Jon: So Hannes, one of the things that you've shared with me over the time that we've known each other is that even in those early days, you started to get customer feedback about using DuckDB and how to get it to work. And without leading the witness too much, there was an example of getting this thing to run in an academic compute environment, with all the things that are locked down by IT, and the implications of that. Can you share that and how it impacted DuckDB?

Hannes: Yeah, absolutely. So part of the job of a database researcher is to try other people's stuff. Somebody writes a paper, maybe they ship some code, and in an ideal circumstance, you get to try it. It's very exciting. But it's usually very difficult. That's code that is not meant to be run anywhere else. And if you are, as you said, confronted with the absolutely locked-down environment of an ancient Fedora version, where you don't have root and the admin has a three-day turnaround, and you just want to try something and see if it's not worthless, it's just completely impossible. Over the years, that built two things. One is an uncanny ability to run things without Docker; so --prefix is my friend. And the other is, of course, a deep hatred of dependencies. I think we underestimate the real-world cost of dependencies. It's one of my, how should I say this, vendettas, especially given the recent rise of containerization, where it seems to be just fine to add dependencies, or you have Rust with its Cargo thing, where it's just fine to add dependencies. It's not fine. It's actually, I like to say, an invitation for somebody else to break your code. So that was another one of these deep convictions we developed in designing DuckDB: it can't have dependencies. That was totally born out of that environment. And the design of DuckDB as well: as you mentioned, we talked to people in the data science community and essentially listened to them. It's a very uncommon thing for a database researcher to do, oddly enough. They told us what they didn't like, and they were also super happy to iterate on our half-baked ideas and give us feedback. So that was really, really valuable in shaking down the early design parameters, if you want, of this thing.

Jon: If I move to the next part of the story, Hannes, here you have this thing that you’ve built that’s really useful, and yet you’re doing a job that you love, you’re a researcher. Did you think about turning DuckDB into a company yourself? How did you think about that? What was the exploration, and how did you land on DuckDB Labs?

Hannes: Yeah, that's interesting, because it was kind of a push-pull thing. First of all, in our research group, there had been a precedent of spinning off companies. There was, for example, VectorWise, which is obscure, but it was the first vectorized database engine that came out of our group, and it was a spinoff. The CWI is also a place that is generally supportive of spinning out companies. But there was also a lot of pull, right? We had DuckDB, we open-sourced it in 2019, and then people started using it, and then people started essentially asking us questions like: when can we give you money? It's an interesting situation to be in. You are in a research institute, somebody asks you, "Can we give you money?" and you have to say no, because there is just no process in this research institute for taking money. It's really weird. And I think it was about the same time that the VCs started badgering us, for lack of a better word. There was this endless stream of, "Hey, have you thought about starting a company?" So, we were a bit reluctant at first. I think it took us a couple of months of people asking whether they could give us money, VCs asking, "Can we give you money?", and us thinking, "Uh, not so sure. I don't know." There are many stories about what exactly tipped us into leaping. A story I like to tell is that kindergarten in Holland is just so expensive that I had no other choice but to start a company. Another is that it was absolutely clear that we needed to spin out in order to give DuckDB the room to grow, because there are only so many things you can do as an employee of a research institute. So that started this whole process, and we didn't know anything about starting a company, right? How do you do that? It's not something they teach you at computer science school. Obviously, lots of discussions followed. Lots of soul-searching, figuring out what's going to be the business model, what's going to be the process, who are we going to trust. I think that was an important first question: who are we going to trust?

Jon: And you decided that you wanted to focus on the technology itself, and that's kind of where this landed.

Hannes: Right. But that was a long process. It was very interesting, because we talked to VCs, and they were like, "Okay, so you're going to make a product, right?" And we were like: we have a piece of software, isn't that a product? And they'd say: no, no, no, no, no, you have to be a Snowflake. It's like: okay, but we don't want to be a Snowflake. Yeah. Well, hmm. Difficult, right? There were a lot of discussions that went exactly like that.

But this idea that you can just be a technology provider of sorts, that didn't resonate well, I think. And we were also wavering a bit, like: okay, they all want us to do this, should we really do this? But in the end, we talked to some people who have built database-as-a-service companies, very successful ones. They told us about their experience with it. They said: okay, this is what you are looking at if you do this. And it was clear that we didn't want to do that. We wanted to be more open, we wanted to be more flexible, and we wanted to not target one particular application area, because, in our mind, DuckDB has so many different possibilities that going after just one would be a bit restrictive. And because there were already commercial users who were willing to give us money, we could take a different approach: we could just say, "Hey, okay, we'll take their money and we'll run the company from that, like in the olden days." And that is still what we are doing. And, I would say, I'm quite happy with how this has worked.

There are some people we're thankful to who helped us in the beginning. There was a Dutch entrepreneur who basically turned up with his lawyer on day three of this adventure and said, "Here, this is my lawyer. You need to talk to this guy." And he's still our lawyer, right? There was one of your former colleagues, Anu Sharma, who was extremely helpful and supported us in the beginning without any agenda, if you want. There were a couple of people who were extremely supportive, and I'm probably forgetting some, but it's been a great experience to do this non-standard thing, because there were people out there who were super willing to help.

Jon: That's a fun introduction to the first thread. Jordan, can you maybe take us back to when you learned about this thing, DuckDB? What was the light bulb that went off in your head, and what did you do about it?

Jordan: Yeah, so I was chief product officer at SingleStore, and we were really focused on database performance, building the fastest database in the world. We were looking at some benchmarking reports that somebody had done, with a number of different databases, and I saw one that said DuckDB and was like, what is that? And why is it so fast? And where did it come from? So I did a little bit of poking, and I encountered some of the papers that Hannes and Mark had written. They really resonated with the experience I'd had over the last 12 years of working on these big data systems, one lesson being that most people don't actually have big data, and that scale-up is actually quite a reasonable way to build things. At SingleStore, we were working on distributed transaction features that were taking a long time to build. And in BigQuery, we worked on shuffle in order to do joins and high-cardinality aggregations, which were very, very complex, basically relied on specialized hardware, and had big teams of people working on them. And then in DuckDB, in order to do these joins, you just build a hash table, and then you share a pointer to the hash table, and it's like, wow, that's just so much easier. So there was the complexity side of things. There was the scale-up side of things: what you can do on a single machine is so much more than you used to be able to do. Then there was also the part that Hannes was talking about, which I think people haven't actually grokked yet as the special sauce for what makes DuckDB so awesome. Which is that in databases, everybody focuses on what happens from when the query starts until the query finishes. But there's a bunch of stuff that happens both before and after. Before: How do you get the query there? How do you set things up? And after: How do you get the data out? So often, you go through these incredibly antiquated ODBC/JDBC interfaces, or a REST interface, or the Postgres wire protocol or the MySQL wire protocol. And they're just not really great. I think DuckDB was one of the first things I had seen that really focused on the overall end-to-end.

To give an anecdote: in BigQuery, we outsourced our JDBC and ODBC drivers to a company called Simba. And there was a bug in the outsourced driver that added like a second and a half to every query. If your queries are taking minutes and you add a second and a half, that's not a big deal. But if you want to do some sort of BI dashboard, adding an extra second and a half is terrible. And in fact, there were some cases where it would add tens of seconds or even minutes, because if the data sizes were large, they would basically pull the whole table back through this very narrow aperture. And so, it was unusable for some BI tools.

And the thing is, we had no idea that this was even a problem, because all we focused on was: okay, we get the query, we run the query as fast as possible, and then we give you the results. The fact that DuckDB is actually focusing on these kinds of problems, I think, is why it's doing so well. Somebody tweeted, "Why is DuckDB so fast?" The reason DuckDB is fast is that all the stuff everybody else isn't paying attention to, they're actually paying attention to, and so it feels fast.
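To make that end-to-end point concrete, here is a minimal sketch using the standard `duckdb` Python package; the table and column names are invented for illustration. The engine runs inside the Python process and scans an ordinary DataFrame directly, so there is no driver, server, or wire protocol between the query and the data:

```python
import duckdb
import pandas as pd

# An ordinary in-memory DataFrame: no server, no connection string.
orders = pd.DataFrame({
    "customer": ["a", "b", "a", "c"],
    "amount": [10.0, 25.0, 5.0, 12.5],
})

# DuckDB runs in-process and can scan the DataFrame by name, so getting
# the query in and the results out involves no ODBC/JDBC hop at all.
result = duckdb.sql("""
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
""").df()  # results come straight back as a DataFrame

print(result)
```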

Jon: There are two things that really strike me. One is that you, Jordan, immediately imagined single-box execution, just like Hannes, but with a much bigger box. You realized that hosts in the cloud also count as single boxes, with so much RAM and so much compute. And I guess you're going to say that comes from the sort of family of origin where you'd been raised, at SingleStore and at Google. But the second thing is that you, Jordan, I think, are excited about spending your day on the complementary activities around team and company and business building with this advanced technology. Maybe you could just comment on that part.

Jordan: Sure. Yes, I did immediately think of cloud. I mean, I'd been on a team at Google that was supposed to build a data marketplace, and then we said, well, you don't want to just download the data. You want to actually compute over the data where it sits because it's large data. So we built BigQuery, which is essentially taking a service that already existed in Google, called Dremel, and building a product around it. And then SingleStore was in the process of a cloud transition, and I spent my last 18 months there taking an on-prem database and building a cloud service out of it. So I know the pain of it, I know how it works, and that's just how my brain works. The other thing is my career: starting as a software engineer for 15 years, as you move up the corporate ladder, you handle larger problems that have more complexity and more ambiguity. And then, as a manager, it's another big step beyond that, because people are more complex and more ambiguous, and getting them to do something is harder. You have to figure out what makes them tick, how to get things to work, how to get the right output. And as a manager with larger scope, you end up actually designing. It's similar to a design problem in software, except you're designing with your organization. Okay, we need these pieces and these pieces, and this is how communication works. It's almost like a distributed system. And then moving to product, because I switched from engineering to product management, you're designing in the product space. And the product space is this even broader palette of things you can do. Because it turns out what actually matters is how customers are going to interact with something. If you build a beautiful piece of technology that nobody wants, it's going to be really disappointing because nobody's going to end up using it.

And so, you're painting with this broader, more complex, and more ambiguous palette, and to me, that's been exciting. Nowadays, even though I love technology and I love to geek out about databases, I'm also realizing the thing that gets me excited is building products. That involves not just the tech, not just the architecture, but also the market, the pricing, the packaging, the customers, and all the other pieces that go along with it.

Jon: So, here we are in this moment where you spotted DuckDB, a light bulb goes off in your head, and there's a moment where you are so motivated that you get on a plane. Can you tell that story, Jordan?

Jordan: Well, I think I need to back up a little bit, because I was really excited about this, and serverless is something that I think is the right way to build cloud systems. And I felt like a serverless DuckDB should exist. There were so many nice things about it and so many things that other systems couldn't do, like being able to scale down to zero and pay for what you use, and being able to rebalance and move things around. It actually reminded me of BigQuery, but rotated 90 degrees. BigQuery was sort of very wide and thin. And with this, we could be very thin and deep, but do the same sorts of things and perhaps be even more flexible. So I'm like, all right, it's been a long time since I've coded, so maybe I'll just hack on this for a little while. I got about two days into it. And then I asked a friend and mentor of mine for an intro to Hannes and Mark because I knew that he had been working with DuckDB. The morning I talked to Hannes and Mark, it was like, huh, this seems like it could actually work. They're not doing exactly what I'm talking about, but they are kind of looking for somebody to come in and build something like this. So that could really work.

And then, in the afternoon, I talked to Tomasz, who was then at Redpoint. I got about 15 minutes in, and he's like, "I like this idea. I want to fund it. Come to my partner meeting." And I'm like, "What?" I was not thinking of starting a company. I was thinking of learning Rust. That was actually my goal: to learn Rust. The next day, I talked to another VC about something totally different, ran the idea by them, and they said, "I like this idea. You just had the partner meeting. How much do you want?" And I had no idea what to even say.

The next day, I talked to a neighbor who worked at Madrona, whom I'd been meaning to have coffee with, and I let slip that, "Hey, I've been thinking about this idea." And so she brings Jon along to the coffee meeting. That's how Jon and I met. And that's also, within about 48 hours, realizing, hey, there's an interesting idea here. The next day, I left on one of my first vacations since the start of COVID. I was in Portugal for a few days, and we're trying to do all this sightseeing, and I'm taking all these calls from other founders, from VCs. I felt a little bit bad for my wife because it wasn't as much fun of a vacation as it otherwise could have been. And then I realized, okay, if we're going to make this work, the most important thing is that I have a great relationship with Hannes and Mark, and I need to see them in person and look them in the eye. So I rerouted my trip and came back through Amsterdam. Hannes books four hours and I'm like, four hours — there's no way we're going to talk for four hours, and then like five hours later, we've been geeking out about databases — it was just a really fun conversation. We're like, oh, we've got to get to dinner, and we had dinner with our spouses. And that was the start of MotherDuck.

Jon: Was Jordan the first person who came to you with an idea to commercialize DuckDB?

Hannes: No, he was not the first person, I'm sorry to say. But how should I say this? He was credible. I think what really made you, Jordan, stand out from anything I'd heard up to that point, and to be quite honest anything I've heard since, is that you came from this background at SingleStore and BigQuery. In a way, it was a big shock to me that somebody with that kind of background would consider our scrappy single-node system for something serious. And I thought, okay, that was really crazy, because we were being ridiculed for not taking distributed systems seriously with DuckDB. People were like, no, this is pointless. But we always thought, okay, we're some oddballs in Holland, and no one cares. Then to see somebody like Jordan come and say, no, no, you're totally right, this is what we're going to do. That was shocking. And it was really clear that if we were going to work with somebody who does this, it was going to be Jordan. That was pretty clear from the beginning. Certainly after he changed his travels at the last second.

It's quite funny to hear the other side of the story from you, Jordan, because while I'm aware of the points where you were in contact with me, I had, of course, no idea of the background chatter with everyone else that was already going on. But when we first talked, I was like, yeah, no, we can totally do this. You came over, and I think we indeed had a good feeling about this. Then it went super quickly, of course. This was all in a matter of days. From us saying, yeah, we'll be on board, it felt like it was minutes before things started being set up.

Jordan: And I was worried I was going to freak you guys out because things had moved so fast. It was like, oh, this intense American: just because we said, yeah, it sounds like a good idea, all of a sudden it's, okay, boom, boom, boom, here's the money on the table. And I was kind of terrified because things were moving way faster than I had expected, and I was just riding the wave. I was very cognizant of trying not to freak you and Mark out too much.

Hannes: I think at that point, we had spoken to enough Americans. And I have to say, my wife is American, so I get a daily exercise in this. But it wasn't scary, I thought. I don't know. It seemed so logical and obvious that I wasn't scared at all, and I don't think Mark was either.

Jordan: That’s good to hear.

Jon: There's a certain sort of trust and friendship that's evident here, but I also want to pull out this connection that you mentioned to me once or twice, Hannes. The fact that Jordan has not just built a lot of cool stuff, but has built a lot of cool distributed systems that scale out versus scale up. Coming to you and saying, "Hey, scale-up is actually pretty cool." That kind of narrative violation, I think…

Hannes: That was a shock to me. Yes.

Jon: It seems like it gave credibility and also has been an animating theme for MotherDuck. Isn’t that right, Jordan?

Jordan: Yeah, I mean, the recognition that most people don't have huge amounts of data. Even working on BigQuery, people were not doing big queries. For the most part, they were doing little queries and focusing on getting data in and getting data out, and the user experience of the query and of using the system is actually more important than size. And then also, yeah, you can scale up. And the last piece is — you can always make it distributed. Even if we don't do it, somebody's going to come up with a distributed DuckDB. We have a bet internally about whether we're going to end up doing a distributed version of MotherDuck. My bet is no, we won't need it. Other people think we will, but we'll see. In BigQuery, we had BigQuery BI Engine, which was a scale-up, single-node system that sits on top of BigQuery storage. And because it had to run on the constrained machines that also have to run Google search, there were no big machines, so we ended up having to build a scale-out version of it. It took a year and three or four engineers. It can be done, but ideally you wait as long as possible, so you get as much innovation into the core engine as you can before you have to do that.

Hannes: The reason this biblical transformation from scale-out to scale-up was so surprising, and so transformative for us, was that our idea of why we wanted to scale up was based on a feeling, I want to say. And the feeling came from actually using things like Spark and sensing that something's wrong with the world if this is the best we can do. But it was only a feeling that scale-up was the way to go. We didn't have data on this. It was only recently, I think maybe 2020 or so, that Google published the tf.data paper that actually said this explicitly: something like the 95th percentile of all their machine learning job sizes is around a hundred gigabytes. But then, of course, Jordan, you had also seen this from the inside. I feel like if you haven't seen the big data, then probably no one has. So it took this from a feeling to something that was actually a thing for us. It was a great moment, I have to say.

Jon: If we go back to coffee, Jordan: the first time you and I met, there's a thought that went through my mind and a question I asked. The thought was that it's almost witty to say, let's go for whatever's the opposite of big data. That's almost funny, considering how much of our data technology has been designed to scale. And the first question I asked you was: well, if we change that constraint and look at the world as it is, and the workloads as they are, instead of how they could be if we waved a magic wand, what can we do that's possible that wasn't possible before? Maybe you can share either what you answered then, or what you would answer now if it's different.

Jordan: I wish I remembered what I answered then, but it was a rough couple of days. What I think I probably said is: when we started BigQuery, the mantra we used — and it came from the Turing Award winner and database researcher Jim Gray — was, "With big data, you want to move the compute to the data, not the data to the compute." When I described building this system where you didn't want to just download the data, it's because moving the data is so expensive and so hard. BigQuery was built around that premise. But once you recognize that data may not be that large after all, how would you design the system differently? Can you move the data to the end user and leverage the compute power that the end user has? George Fraser, the Fivetran CEO, just did a benchmarking report. I think it's crazy and amazing that the CEO of a multi-unicorn company is running database benchmarks and doing a good job of it. But anyway, he found that his 2-year-old Mac laptop, not even state of the art, was faster than a $35,000-a-year data warehouse. It used to be that "laptop" was synonymous with underpowered, and nowadays it's a huge amount of power. So why, when you run a query against one of these cloud data warehouses, do you wait three seconds for it to run, while everybody else is waiting for that same hardware and your incredibly powerful computer on your desk sits there idle? Why not let that computer on your desk participate in the query? A — it's less expensive because you've already paid for that laptop on your desk. B — it's a better user experience because you can get results back in milliseconds, not seconds. I think there is a new paradigm and a new architecture that can be used. And then there are further things. There's edge, there's mobile, there's leaving data where it is when it gets created, instead of having to worry about consolidating it. People complain: why do I have to consolidate all this data in the same place? It's so expensive to move it. If the compute can be anywhere, then there are so many interesting things you can do.
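A rough sketch of what that inversion can look like in practice, with the laptop pulling only the data it needs and doing the computation locally. The URL and column names are placeholders, not a real dataset, and this assumes DuckDB's `httpfs` extension:

```python
import duckdb

con = duckdb.connect()
# httpfs lets DuckDB read files over HTTP(S); hypothetical URL below.
con.execute("INSTALL httpfs; LOAD httpfs;")

# The query runs entirely on the local machine. For Parquet, DuckDB
# issues range requests, so only the columns and row groups the
# aggregate needs are fetched over the network.
rows = con.execute("""
    SELECT station, AVG(temp_c) AS avg_temp
    FROM read_parquet('https://example.com/weather/2023.parquet')
    GROUP BY station
    ORDER BY avg_temp DESC
""").fetchall()
print(rows)
```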

Jon: It's amazing that two things can be simultaneously true in this area. The marginal cost of a cycle of compute may be the lowest in a cloud because of the scale and the optimization. And yet, when we dial up the capacity of some distributed analytics system in the cloud, it's a lot like adding lanes to a highway, which produces more gridlock in about 24 months.

Jordan: The reason that adding lanes to a highway doesn't make things faster is that what's slow is getting things on and off the highway. Getting back to DuckDB: getting your query in and getting your data out are very often the most important things. What happens in the middle all converges to the same place over time.

Hannes: Yeah, it is really shocking to see what is possible on laptops. It's something we have kind of forgotten about. And, of course, the marginal cost of a CPU cycle in the cloud isn't what your cloud database is billing you for that cycle, right? There's a big difference between what the cycle costs and what you are paying for it. And that is maybe also part of the reason it's so much nicer to run things locally than to go through all that.

Jon: Guys, I want to leave this a little bit where we started. To produce DuckDB and MotherDuck and this really exciting opportunity for your customers, it took each of you doing what you love, and those things are complementary. I think our audience would be interested to hear, however many months in it is since that all happened, what you've loved or learned from the other person you're working with that has helped you become stronger since then.

Hannes: Well, my life has turned around pretty radically since we first spoke, Jordan. I've gone from being a mostly academic researcher to somebody who is running, I mean, not a giant team, but a team of competent people building DuckDB here at DuckDB Labs in Amsterdam. My position has changed into something much more like what Jordan described. And I'm not sure I'm entirely there yet, that this is going to be the thing I really love doing. But it has changed a lot, and it's been really interesting for me to see how Jordan goes about doing things, because he's also been building his company over the last 10 months, at a much greater speed than we do, of course. Since this is all so new to me, it's been extremely interesting and valuable just to watch that a bit.

Jordan: On my side, at the end of 2022, I sent a letter to our team and to investors, and I said, if there was one word to sum up 2022, it's lucky. I feel incredibly fortunate that we hitched our star to DuckDB and to Hannes and Mark and their team, because there are such incredible tailwinds behind this really groundbreaking technology. We were in the right place at the right time, and, hopefully, we're going to have this great partnership going into the future. One of the things that worries me most is that we'll do something to screw up that relationship. What other founders and people who have commercialized open-source technology, including the Databricks founder, have shared with me is that it's going to get hard: your incentives are going to diverge, and the things you care about are going to be at odds. So it's something to actively maintain. As fortunate as we are, we want to acknowledge that, and also acknowledge that for this partnership to be successful in the future, it's going to take active work and deliberate trust, being willing to say, "Okay, well, maybe we wanted to do this, but for the sake of the relationship, we'll take a slightly different approach."

Jon: Hannes Mühleisen, Jordan Tigani, thanks so much for your time. This has been a lot of fun.

Hannes: Thanks for having us.

Jordan: Thanks, Jon. Thanks, Hannes.

Coral: Thank you for listening to this week’s episode of Founded & Funded. If you’re interested in learning more about MotherDuck, please visit MotherDuck.com. If you’re interested in learning more about DuckDB, visit duckdblabs.com. Thank you again for listening, and tune in in a couple of weeks for another episode of Founded & Funded with Panther Labs Founder Jack Naglieri.

GitHub CEO Thomas Dohmke on Generative AI-powered Developer Experiences

GitHub CEO Thomas Dohmke talks with Madrona Partner Aseem Datar about Copilot X and the evolution to generative AI-powered developer experiences.

Today we have the pleasure of hosting GitHub CEO Thomas Dohmke. He and Madrona Partner Aseem Datar talk about how Thomas got into working with computers and coding, and the work he's been doing since becoming GitHub CEO in November 2021, including the recent launch of Copilot X. But these two discuss so much more around the rise of generative AI: how it gives developers – everyone, really – a new way to express their creativity, how it democratizes many skills and access to those skills, and how the constantly evolving world developers have always worked in has given them the perfect safety net to leverage generative AI to its fullest potential. Thomas also offers up advice for people just launching a startup. But you'll have to listen to hear it all.

This transcript was automatically generated and edited for clarity.

Aseem: Hey, everybody. My name is Aseem Datar. I’m a partner at Madrona Ventures. Today I have my close friend and GitHub CEO Thomas Dohmke. I’m excited to chat with him on this wonderful topic of generative AI.

Thomas, welcome.

Thomas: Yeah. Hello, and thank you so much for having me, Aseem.

Aseem: We are excited more than you are, Thomas. It’s always fun to talk to somebody leading the charge on innovation in this industry. Maybe start by giving us a little bit of your story and introducing yourself.

Thomas: I'd like to say I'm Thomas, and I'm a developer. I've been identifying as a developer ever since the late '80s and early '90s, when I was about 12 or 13 years old and got access to computers, first in the geography lab at school and then later when I bought a Commodore 64. I've been fascinated by building software ever since, and, obviously, as a kid, also gaming and playing with all kinds of aspects of computers. I've been working with code and passionate about code ever since: building my own applications, studying computer engineering first in Berlin, and then doing my Ph.D. in Glasgow. I worked at Mercedes, building driver assistance systems. And then, in 2008, Steve Jobs announced the App Store, and it pulled me into the app business. I had a startup called HockeyApp that was ultimately acquired by Microsoft in 2014, and that moved me from Germany all the way here to the West Coast and into Microsoft. That path then led me to GitHub through the acquisition, running special projects at GitHub, and since November 2021, I've been the CEO.

Aseem: What a fun journey. Thomas, I can't stop myself from saying developers, developers, developers, all the way from the Steve Ballmer world. And it's so much fun to be talking to you. Clearly, a lot has changed in the world. There's this rapid pace of innovation that we are seeing with this new capability set called generative AI. And we are all excited about talking and hearing more about generative AI. What's your worldview? I would love to understand that.

Thomas: If I look back over the last six months or so, we had multiple moments that you could compare to the App Store moment I described earlier, which happened in 2008. I think the biggest of those moments was clearly ChatGPT late last year. I have heard people describe that moment of ChatGPT launching and seeing fast adoption as the Mosaic moment of the 2020s. If you're old enough, you might remember the first browser, Mosaic, quickly followed by Netscape. And, actually, last night over dinner, I argued with folks: is it the Netscape moment or the Mosaic moment? I think it doesn't really matter. What matters is that within a very short time, people adopted ChatGPT and saw the way they work shifting. And before ChatGPT, we had already seen a shift through Midjourney and Stable Diffusion — those image models. I think those models are great for describing what generative AI does, and part of it is really creating a new way for people to express their creativity. We have heard stories of folks spending their evenings rendering images instead of watching Netflix. I think that's exciting. My example, depending on what city I'm in and what customers I'm speaking to, is: ask Stable Diffusion to render the skyline of Tel Aviv as if it were painted by the French impressionist Monet. Obviously, Monet never saw the skyline of Tel Aviv as it looks today. And yet, those models generate a picture that resembles a Monet rendering of the skyline of Tel Aviv, Sydney, or San Francisco. I think that is really the power of this new world of generative AI.

And the other thing it brings is that it democratizes a lot of skills and access to those skills. Think about students: kids sitting in a class of 30, where the teacher just doesn't have the time to be a tutor for every single kid. Now you can give them an AI assistant where they can ask all the questions they might not dare to ask in class, or that the teacher didn't have time for, or that the parents don't have time for because they're working three jobs. I think that is where the power of this AI moment really comes from, and where we see tremendous excitement in the industry and in everybody you talk to.

Aseem: Yeah, I mean, no question, right? I think productivity is such a massive space where generative AI is having an impact today. It's awesome to see these scenarios come to light in real life, whether it's for students, business workers, or information workers. But behind it all, the ethos of creativity in the software world is, in some sense, the developers, right? And you can't run away from the fact that developers are creating these intelligent applications and embedding AI into them. So what does this moment really mean for developers? How do you think generative AI-powered developer experiences will change?

Thomas: The role of developers has always changed, right? If we look back over the last 40 years, we went from punch cards and machine language and mainframes and COBOL and whatnot to modern programming languages. We went from building everything ourselves before the internet to leveraging thousands of open-source components ever since the early 2000s, I'd say.

Aseem: By the way, I thought Visual Basic was a big moment, just going back to those days, but carry on.

Thomas: And you can probably make that argument for many programming languages in their own right. I think Ruby was a great moment as well, and a lot of startups in the last decade or so were founded on Ruby on Rails because it's just so easy to iterate with Rails. And Python unlocked a lot of the machine learning we are now seeing. The nice thing about software development is that solving issues has always been part of the practice, right? No developer is perfect: we made mistakes on punch cards, we made mistakes in assembler, and now we make mistakes in code. It has always been about solving issues, fixing your own bugs, or fixing your team's bugs. The word bug even comes from the bug on the punch card. And so, we built all this tooling, compilers and debuggers, to find issues while writing code. We invented practices like unit testing to make sure that what we're building is the thing we wanted to build. And in the last decade or so, we introduced DevOps and agile practices: code review, pull requests, pair programming, continuous integration and deployment (CI/CD), and code and secret scanning. If you tie this to AI now, it's actually fascinating: we've built, within software development, the safety net for leveraging generative AI to its fullest potential. We all know that those large language models are not always right and that they have something called hallucinations: they think they have the answer, and they're confident in what they're saying, but it's wrong. With all these practices that software developers have, we have the safeguards in place to work with a model suggestion and either take it and modify it, or take it and then figure out in code review that it's not exactly what we want to do. You could argue we built DevOps with the aspiration that in the future, there would be a moment like ChatGPT, where we can unlock more productivity and more creativity in developers to ultimately realize even bigger ideas. I think that's ultimately what this is all about.
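As a tiny illustration of that safety net, consider a hypothetical model-suggested helper with a plausible-looking but subtle bug, and the ordinary unit test that catches it before it ships; both are invented for this example:

```python
# A plausible-looking, AI-suggested helper with a subtle bug: it drops
# the final partial chunk when len(items) isn't a multiple of n.
def chunk(items, n):
    return [items[i:i + n] for i in range(0, len(items) - n + 1, n)]

# The everyday unit-test safeguard catches the wrong suggestion.
def test_chunk_keeps_the_tail():
    # Fails as written: chunk() returns [[1, 2], [3, 4]] and loses [5].
    assert chunk([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]
```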

At GitHub, over two years ago now, in 2020, we started working on Copilot, which is one of the first AI pair programmers. It sits in your editor, and when you type as a developer, it suggests code to you. It can complete a line, but it can also complete whole methods — multiple lines of code, lots of boilerplate, import statements in Java and whatnot, test cases, complex algorithms. It's not always right, but developers are used to that. You type in the editor, and it shows the suggestion. If that's not what I want, I can just keep typing; and if it's close enough to what I want, I press the Tab key and can use it and modify it. That's no different than copying code from Stack Overflow or from GitHub and then modifying it. You almost never find a snippet on the internet that's exactly what you want.

Generative AI-powered developer experiences give developers a way to be more creative. I mentioned DevOps earlier. DevOps is great because it has created a lot of safeguards, and it has made a lot of managers happy because they can monitor the flow of an idea all the way to the cloud, they can track the cycle time, and they have a certain level of confidence that developers are not just SSHing into a production server, because there are safeguards in place. But it hasn't actually made developers happier. It hasn't given them the space to be creative. So, by bringing AI into the developer workflow and letting developers stay in the flow, we are bringing something back that got lost in the last 20 years, which is creativity, which is happiness, which is not bogging developers down with debugging and solving problems all day but letting them actually write what they want to write. I think that is the true power of AI for software developers.

Aseem: I remember my days of writing code in an Emacs editor, which was just slightly better than Notepad because it had a few color schemes and whatnot. Two things you mentioned that I latched onto: one is productivity, and the second is creativity. I think those two are certainly top of mind for developers. What are some of the things developers should be excited about, and what are some of the areas you have doubled down on and will continue to double down on?

Thomas: Yeah. Let me take you on a bit of a history lesson. In the summer of 2020, GPT-3 came out, so that's almost three years ago, and back then our GitHub Next team, the team within GitHub that looks into the future, asked themselves: can we use GPT-3 to write code? We looked into the model, and we came up with three scenarios. It's fascinating now, in 2023, to look back at these three scenarios. There was text-to-code. That's what Copilot does today, right? You type text, and it suggests code to you. There was code-to-text, where you ask the model to describe what the code is doing. We just announced that as part of Copilot X, where you can have Copilot describe a pull request to you. If you're a developer, you know what that's like. You're working all day on a feature, you're submitting a pull request, and now you have to fill out all these forms, the title and the body, and, ah, I know what I did today. It's all obvious to me because I built all this code; I don't want to spend too much time describing it to others. With Copilot for pull requests, we are doing just that for people. It describes the code, and it's not only about the pull request. It can help you understand code you might be reading from a coworker in the editor, or just help you remember what something was. And it can help people describe old code: the old COBOL code some banks are still running, code from the '60s running on mainframes, where the people who wrote it are long in retirement, I hope. The expertise is gone. The last scenario was conversational coding. We didn't build that at the time because we felt the model was not good enough to have those kinds of conversations. Clearly now, with GPT-3.5 and GPT-4, we have reached the point where those chat scenarios are useful, and more often right than wrong. Back in 2020, we explored these three scenarios, and the way we validated that this was good enough to build a product on was by asking our staff and principal engineers to submit coding exercises, things we would use in an interview loop — a description, a method declaration, and a method body. We got about 230 of these exercises, stripped out the body, and gave only the declaration and the description to the model. We gave the model 150 attempts per exercise to solve it and get close enough to the solution. What we figured out from this experiment was that 92% of those exercises could be solved by the model, back then, in 2020. Even then, the model was already good enough for a lot of these coding exercises. So we took that as inspiration to build Copilot and ship Copilot to the world.
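The experiment Thomas describes amounts to a sampled functional-correctness evaluation. A rough sketch of such a harness, where `generate_completion` and `passes_hidden_tests` are hypothetical stand-ins for the model call and the reference tests, not GitHub's actual code:

```python
def solve_rate(exercises, attempts=150):
    """Fraction of exercises solved within the attempt budget."""
    solved = 0
    for ex in exercises:
        # Strip the body; the model sees only description + declaration.
        prompt = ex["description"] + "\n" + ex["declaration"]
        for _ in range(attempts):
            candidate = generate_completion(prompt)   # hypothetical model call
            if passes_hidden_tests(ex, candidate):    # "close enough" check
                solved += 1
                break
    return solved / len(exercises)  # ~0.92 in the 2020 experiment Thomas cites
```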

On March 22nd, we announced Copilot X, the next generation of Copilot, really bringing the power of these AI models into all parts of the developer experience, whether it's coding in your IDE or chat scenarios where you can explore ideas. The first example I tried was asking it how to build a snake game in Python. You know, the game we were playing on cell phones before they had touchscreens. It starts showing an explanation of how you do that, and then you can just ask it, "Tell me more on step one," and it shows you some code, and you can start building with that. I think that's the true power here: you can rediscover your love for programming if you lost it, or you can explore a new programming language, or you can just ask the chat agent to fix a bug in your code or fix a security issue, like removing that SQL injection you accidentally put there. We announced Copilot for pull requests; I mentioned that already, describing pull requests. And soon enough, we will also have test generation: the pull request will check whether you actually wrote the tests you're supposed to write and then generate those tests for you. The other cool thing we announced is Copilot for Docs. We built a feature that lets you ask questions about the documentation for React, Azure, and a couple of other projects.

The model has a training cutoff date, and training is a really expensive process. It takes weeks on a supercomputer to train the model again. The current GPT-4 has a cutoff date of September 2021, and it will actually tell you that if you ask about things that happened since then. So it doesn't know about changes to open-source projects and their documentation that happened in the meantime. And September 2021 to March 2023, when we are recording this, is a long time for the APIs of open-source projects. What we're doing is collecting that data from those open-source projects and feeding it into the prompt, the part you don't see as the person asking the question, so Copilot can answer up-to-date questions about those projects.
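In outline, that is retrieval into the prompt. A minimal sketch, with `search_doc_index` and `complete` as hypothetical stand-ins for the documentation search and the chat-model call:

```python
def answer_with_docs(question: str) -> str:
    # 1. Retrieve the most relevant, freshly crawled doc snippets.
    snippets = search_doc_index(question, top_k=3)  # hypothetical retriever

    # 2. Prepend them to the prompt (the part the user never sees)
    #    so the model can answer past its training cutoff.
    context = "\n\n".join(snippets)
    prompt = (
        "Answer the question using only the documentation below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return complete(prompt)  # hypothetical chat-model call
```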

Aseem: I am so excited about Docs, right? I go back to my days as a developer: so much time was spent going and reading docs and pulling things from different places, and it was just a productivity suck. So, congrats and kudos. And I do want to point out that GitHub created this notion around Copilot, which is now injected all across Microsoft; now there's a copilot for Office and a copilot for Teams. I couldn't be more excited to see where this goes. Shifting gears a little bit, Thomas: one thing that gets me excited, especially in the world of venture, is that our startup founders and teams can now go from zero to production very quickly. What advice do you have for somebody starting out, building a business or creating a team? What should they be bullish on? What should they be worried about?

Thomas: I think a lot of creativity is about staying in the flow and not getting distracted by all the things happening around you. And oftentimes we gravitate toward those things, whether it's the browser or social media and whatnot. So my first advice to startup founders is: stay focused, and leverage the time of day when you're actually creative, because that time is so limited. Our creativity is infinite, but the time during a day when we are actually creative, when we have the energy to build cool things, is fairly limited. For some people, it's early in the morning. For me, it's usually after my first cup of coffee; that's when I'm the most creative. And then I always want the second cup of coffee to have that same impact, and it doesn't, right? It never works that way. I am also creative at the end of the day when it's dark outside, and I'm a bit of a night owl as well. So I think, as a founder, you have to find those moments during the day and keep that energy flowing.

We live in this world right now where, whether you call it a recession or not, I think we are in a complicated macroeconomic environment, to put it more politically correctly. But those times are always challenges and opportunities at the same time. We saw this in the last downturn in 2008 — many of the startups that are now part of our lives, like Airbnb, Uber, Slack, or Netflix, were founded around that same time. And Shopify, actually, is another great example: founded during a downturn, building the technology, and then, as we came out of it, everybody wanted to have an e-commerce store and buy from those stores. I think that's the opportunity we have now, today, this year: leveraging generative AI as the foundational layer. Many startups will build on top of that, and they will have to find differentiation and defensibility for their idea. We'll see a lot of cool ideas built on top of ChatGPT or GPT-4, and a lot of them are really cool, but they're also probably not going to survive as a company on their own, because it's a small idea, say, summarizing your emails in Gmail. I would think Google will build that into the product, and then you really have to push hard to make that a paid product customers will pay for when they already have it built into Google.

Aseem: I couldn't agree more. We've always talked about doing more with less, but the AI capabilities we are seeing pop up are all about doing much more with much, much less. And that's, I think, the beauty of the pace of innovation we are seeing all around us. Thomas, I know that you're deeply plugged into the startup ecosystem. You see a lot of these open-source projects come to life. Are there any projects or startups that you are really, really excited about?

Thomas: I'm staying bullish on ChatGPT and OpenAI, and we at GitHub are very excited about the future of Copilot. I mentioned earlier things like Stable Diffusion and Midjourney, which make me really excited. I'm not an artist at all. I can't draw, and I certainly cannot paint something that looks like a Monet. And if you take that a step further, I'm really bullish and excited about a startup called Runway that lets you generate videos from images, from video clips, but also from text prompts. I think there's going to be a moment where you can just write a script into a text field, and it generates a full animated video for you. That will allow us to take the stories we heard as kids from our parents or even grandparents and turn them into little video clips we can show to our kids. I think it will be so cool if you can tell the stories from two or three generations ago in little videos to the next generation. You and I both sit on the board of a company called Spice AI that explores AI from a different perspective: not large language models or image models, but time-series AI and finding anomalies in time-series data. It allows you to query that data. They started with blockchain and Web3, where you can write your own queries and quickly figure out what Bitcoin is doing, but you can also run AI on top of that and find things that are interesting, find alerts, or find price changes. In the future, I think there's a huge space there. You can apply this to your server data, your server monitoring, maybe your Kubernetes clusters. There are all kinds of time-series data that affect us every day. Weather is ultimately also time-series based, right? It's cold at night and warm in the day. So I'm excited about that. In general, the AI and ML space is super exciting for me. There are so many startups I could list here. There's Replicate, a startup based in Berkeley that lets you run machine learning models with just a few lines of code, without having to understand how machine learning works. There's OctoML, based in Seattle, that uses machine learning to deploy machine learning models to the cloud and find the most efficient version: the right GPU type and the right cloud provider for your model. The ML and AI space is super exciting, and I'm sure we are going to see lots more ideas that nobody thought possible and nobody is thinking about right now. Similar to ChatGPT: in hindsight, it seems so obvious, but until it came and conquered the world, nobody else had built it. So, I couldn't be more excited about that future.

Aseem: Yeah. And I echo that sentiment. We at Madrona are really excited about being able to help Runway, OctoML, and Spice AI in their journeys of building for the future. It's always interesting to see the future getting accelerated in ways we can't even imagine, to be honest. And yes, there are scenarios around hallucination, et cetera, that we've all got to watch out for. I think you said it well: it's a start. There's still going to be a developer or a human in the loop, at least for the short term, until it gets to a point of high confidence.

Thomas, one other interesting notion I wanted to pick your brain on: if I'm a startup founder, what should I look forward to in the distant future? I mean, we talked about all these modalities, but one of the challenges founders have is that developers are hard to come by, and top talent is very hard to come by. There's this notion of tools being built to tackle the low-code, no-code space or democratize development. What's your view on that from a GitHub perspective?

Thomas: You know, there's this slogan: fake it till you make it. And that's true for many founders as well. You don't have to have a perfect solution right from the start. You can combine all these AI tools that are available to you now to stitch something together really fast, whether it's Copilot, Stable Diffusion, or some of the other AI tools that help you write your marketing copy. Embrace those things as much as possible and adjust your style to them. I think what will happen is that developers will learn how to leverage AI at its best. Andrej Karpathy tweeted about this recently, where he basically says: I changed my programming style, and by writing a bit more commentary and a few more declarative statements, I can get Copilot, or AI, to synthesize more code for me. That's what we are going to learn, and it's why I'm bullish on building AI in the open: having those models out there, building with them, and learning how to use them as early as possible, before we get to AGI. There's a certain amount of fear about this and about what we can do. But today, those models are not sentient. They're not actually creative. They're predicting the next word. And if you want to switch them off, you can just go to an Azure data center and switch them off. So we need to build this in the open, and we need to learn where the model is good and how it can help us as humans, and also where the model is bad and where it makes mistakes or wrong predictions.
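The comment-led style Karpathy describes might look something like this: a hypothetical sketch where the intent is stated declaratively up front, and the body is the kind of completion an assistant could synthesize and a developer would then review:

```python
import csv
from collections import defaultdict

# Read a CSV of daily temperatures with columns date, city, temp_c.
# Return a dict mapping each city to its average temperature,
# skipping rows where temp_c is missing.
def average_temps(path):
    totals, counts = defaultdict(float), defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["temp_c"]:  # skip missing readings
                totals[row["city"]] += float(row["temp_c"])
                counts[row["city"]] += 1
    return {city: totals[city] / counts[city] for city in totals}
```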

And actually, I think the model itself will be able to correct itself. There was recently an example from Ben Thompson's blog, Stratechery, where somebody on social media posted, I think, four paragraphs of a blog post from Ben into ChatGPT and asked it who wrote this. It detected that this was a blog post by Ben Thompson without being told that information. In the same way, we will be able to use AI to detect something that was wrongly written by AI. The technologies work with each other. And by building this in the open, we are preparing for a future where AI plays a bigger role for us on this planet.

Aseem: Hey Thomas, I know we are out of time. Thanks so much. This has been a blast, and I’m sure our startup founders, our listeners, are taking so much away from this discussion with GitHub CEO Thomas Dohmke. And I couldn’t thank you enough. Thanks for being on with us. And we’re excited to be able to partner and work together.

Thomas: Yeah. Thank you so much for having me on this podcast.

Coral: Thank you for listening to this episode of Founded & Funded. If you’re interested in learning more about what’s going on at GitHub, check out their blog at Github.blog. Thanks again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with the founders of MotherDuck and DuckDB.


Credo AI Founder Navrina Singh on Responsible AI and Her Passion for an ‘AI-First, Ethics Forward’ Approach

Credo AI's Navrina Singh on ‘AI-First, Ethics Forward’ Responsible AI

In this week’s IA40 Spotlight Episode, Investor Sabrina Wu talks with Credo AI Founder and CEO Navrina Singh. Founded in 2020, Credo’s intelligent responsible AI governance platform helps companies minimize AI-related risk by ensuring their AI is fair, compliant, secure, auditable, and human-centered. The company announced a $12.8M Series A last summer to continue its mission of empowering every organization in the world to create AI with the highest ethical standards.

Navrina and Sabrina dive into this world of governance and risk assessment and why Navrina wanted to make governance front and center rather than an afterthought in the quickly evolving world of AI. Navrina is not shy about what she thinks we should all be worried about when it comes to the abilities of LLMs and generative AI, or about her passion for an "AI-first, ethics-forward" approach to artificial intelligence. These two discuss the different compliance and guardrail needs for companies within the generative AI ecosystem and so much more.

This transcript was automatically generated and edited for clarity.

Sabrina: Hi everyone. My name is Sabrina Wu, and I am one of the investors here at Madrona. I’m excited to be here today with Navrina Singh, who’s the CEO and founder of Credo AI. Navrina, welcome to the Founded and Funded podcast.

Navrina: Thank you so much for having me, Sabrina. Looking forward to the conversation.

Sabrina: So Navrina, perhaps we could start by having you share a little background on Credo and the founding story. I’m curious what got you excited to work on this problem of AI governance.

Navrina: Absolutely, Sabrina. It's interesting. We are actually going to be celebrating our three-year anniversary next week, so we've come a long way in the past three years. I started Credo AI after spending almost 20 years building products in mobile, SaaS, and AI at large companies like Microsoft and Qualcomm. And I would say in the past decade, this whole notion of AI safety took on a very different meaning for me.

I was running a team focused on building robotics applications at one of those companies, and as we saw these human-machine interactions in a manufacturing plant, where robots were working alongside humans, that was really an aha moment for me in terms of how we ensure the safety of humans, obviously, but also how we think about the environments in which we can control these robotics applications so they don't go unchecked. And I would say that, as my career progressed, moving to cloud and building applications, especially focused on facial recognition, large language models, and NLP systems, and running a conversational AI team at Microsoft, what became very clear was that the same physical safety was now becoming even more critical in the digital world. When you have all these AI systems literally acting as our agents, working alongside us, doing things for us, how are we ensuring that these systems are really serving us and our purpose? So a couple of years ago, we really started to think about whether there is a way to ensure that governance is front and center rather than an afterthought. And six years ago, we started to dive deeper into how to bridge this gap, this oversight deficit, as I call it, between the technical stakeholders, the consumer, and the policy, governance, and risk teams, to ensure that these AI-based, ML-based applications all around us don't become the fabric of our society and our world while going completely unchecked.

For me, that was an idea I just could not shake off. I really needed to solve it. Especially in the AI space, there's a need for multiple stakeholders to come in and inform how these systems are going to serve us. So that led me to start looking at the policy and regulatory ecosystem. Is that going to be the impetus for companies to start taking governance more seriously? Credo AI was born out of that need: how can we create a multi-stakeholder tool that looks not just at the technical capabilities of these systems but also at their techno-social capabilities, so that AI and machine learning serve our purpose?

Sabrina: And I think at Madrona, we also believe that all applications will become intelligent over time, right? This thesis of taking in data and leveraging that data to make an application more intelligent. But in leveraging data and using AI and ML, there arises this potential AI governance problem, which is what you just alluded to there.

We even saw GPT-4 released, and one of the critiques, among the many, many amazing advances that came with it, is how GPT continues to be a black box, right? So, Navrina, I'm curious: how exactly do you define responsible AI at Credo? What does that mean to you, and how should companies think about using responsible AI?

Navrina: That's a great question, and I would say this is one of the biggest barriers to this space growing at the speed I would like. The reason is that there are multiple terms: AI governance, AI assurance, responsible AI, all being put into this soup, if you will, for companies to figure out. So there is a lack of education. Let me step back and explain what we mean by AI governance. When we think about AI governance, it is literally a discipline and framework consisting of policy, regulation, and company and sector best practices that guide the development, procurement, and use of artificial intelligence. And when we think about responsible AI, it is literally the accountability aspect: how do you implement AI governance in a way that you can provide assurance? Assurance that these systems are safe, assurance that these systems are sound, assurance that these systems are effective, assurance that these systems are going to cause very little harm.

And when I say very little, I think we’ve found that no harm is, right now, an aspirational state. So getting to very little harm is certainly something companies are aspiring for. So when you think about AI governance as a discipline, and the output of that is proof that you can trust these AI systems, that entire way of bringing accountability is what we call responsible AI.

Who is accountable, and what is that person accountable for in ensuring AI systems actually work the way we expect them to? What steps are we taking to minimize the intended and unintended consequences? And what are we doing to ensure that everything, whether it's the toolchain, the set of company policies, or the regulatory framework, evolves to manage the risks these systems are going to present?

And I think that for us, in this very fast-moving and emerging space, bringing that focus and education to AI governance has been critical.

Sabrina: Maybe we could just double-click on that point. How exactly is Credo solving the problem of AI governance?

Navrina: So Credo AI is AI governance software. It's a SaaS platform that organizations use to bring oversight and accountability to the procurement, development, and deployment of their AI systems.

So what this means is, in our software, we do three things effectively well. The first thing we do is bring in context. This context can come from new or existing standards, like the NIST AI RMF. It can come from existing or emerging regulations, whether that's the EU AI Act as an emerging regulation or an existing one like New York City Local Law 144. Or this context can come from company policies. Many of the enterprises we work with right now are self-assessing; they're providing proof of governance. In that spirit, they've created their own set of guardrails and policies that they want to make sure get standardized across all their siloed AI implementations.

So the first thing Credo does is bring in all this context: standards, regulations, policies, best practices. We codify them into something called policy packs. You can think about these policy packs as a coming together of the technical and business stakeholders, because we codify them into measures and metrics that you can use for testing your AI systems, but we also bring in process guardrails, which are critical for your policy and governance teams to manage across the organization. So this first stage of bringing in context is really critical. Once Credo AI has codified that context, the next step is the assurance component. How do you actually test the data sets? How do you test the models? How do you test inputs and outputs, which are becoming very critical in generative AI, to ensure that against whatever you've aligned on in the context, you can actually prove soundness and effectiveness? So our second stage is all about assurance: testing and validation of not only your technical system but also your process. And then the last component, which is super critical, is translation. In translation, we take all the evidence we have gathered from your technical systems and from the processes that exist within your organization, and we convert it into governance artifacts that are easily understandable by different stakeholders. That might be risk dashboards for your executive stakeholders, transparency or disclosure reports for your audit teams, impact assessments for a regulator, or simply a transparency artifact to prove to consumers that, within that context, you as a company have done your best.

So, putting it all together, Credo is all about contextual governance. We bring in context, we test against that context, and then we create these multi-stakeholder governance artifacts so that we can bridge this gap, this oversight deficit, that has existed between the technical and business stakeholders.
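To make the three stages concrete, here is a minimal sketch in Python of how codified context could drive testing and stakeholder reporting. All of the names here (PolicyPack, Requirement, the metric callables, and so on) are illustrative assumptions for this article, not Credo AI's actual API.

```python
# Sketch of "context -> assurance -> translation", assuming hypothetical types.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Requirement:
    name: str                         # e.g., "demographic parity gap"
    metric: Callable[..., float]      # how to measure it on a model/dataset
    threshold: float                  # the guardrail codified from a policy

@dataclass
class PolicyPack:
    source: str                       # e.g., "NIST AI RMF" or a company policy
    requirements: List[Requirement]

def assure(pack: PolicyPack, model, dataset) -> Dict[str, bool]:
    """Stage 2: test the system against the codified context."""
    return {r.name: r.metric(model, dataset) <= r.threshold
            for r in pack.requirements}

def translate(results: Dict[str, bool], audience: str) -> str:
    """Stage 3: turn raw evidence into a stakeholder-readable artifact."""
    failing = [name for name, ok in results.items() if not ok]
    return (f"[{audience}] {len(results) - len(failing)}/{len(results)} "
            f"guardrails satisfied; failing: {failing}")
```

The point of the sketch is the separation of concerns the conversation describes: the policy pack is data, the assurance step is mechanical testing against it, and the translation step reshapes the same evidence for different audiences.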

Sabrina: I’m curious as it relates to the policy packs, are they transferable across different industries? Do you work with different industries? And, and are there certain regulations that are coming out where Credo is more useful today? Or do you see that kind of evolving over time?

And then I have a couple of follow-up questions after that, but maybe we could start with that.

Navrina: Right now, as you can imagine, the sectors where Credo AI is getting a lot of excitement are regulated sectors. The reason is that they've been there, they've done that: they've been exposed to risks, and they've had to manage that risk. So our top-performing sectors are financial services, insurance, and HR. And HR has been, I would say, a new addition, especially because of emerging regulations across the globe. Having said that, when we look at the regulated sector, the reason companies are adopting Credo AI is, one, that they already have a lot of regulations they have to adhere to, not only for old statistical models but now for new machine learning systems.

However, what we are finding, and this is where the excitement for Credo AI just increases exponentially, is that unregulated sectors, whether it's high tech or even government, which as you can imagine has a lot of unregulated components, are adopting AI governance too. They are recognizing how crucial trust and transparency are as they start using artificial intelligence, and how critical trust and transparency are for them to win in this age of AI. They can be proactive about showing, for whatever black box they have, what guardrails were put around that black box. And by the way, it goes way beyond explainability. It's transparency around what guardrails we are putting across these systems, who can potentially be impacted by them, and what I, as a company, have done to reduce those harms. By being very proactive about those governance artifacts, we are seeing an uptick in these unregulated sectors around brand management and trust building. Because these sectors want to adopt more AI. They want to do it faster, and they want to do it by keeping consumers in the loop about how they're ensuring, at every step of the way, that the harms are limited.

Sabrina: When you talk about explainability, I think one thing that's interesting is being able to understand what data is going into the model and how to evaluate the different data sets. Is Credo evaluating certain types of data, like structured versus unstructured data? How are you thinking about that level of technicality, and how are you helping with explainability?

Navrina: I think this is where I'll share with you what Credo AI is not. And this goes back to the problem of education and the nascency of the market. Credo AI is not an MLOps tool. Many companies have, in the past five to six years, adopted MLOps tools, and those tools are fantastic at helping test, experiment with, develop, and productionize ML models, primarily for developers and technical stakeholders. Many MLOps tools are trying to bring in that responsibility layer by doing much more extensive testing and being very thoughtful about where there could be fairness, security, or reliability issues. The challenge with MLOps tools right now is that it is very difficult for a non-technical stakeholder, whether a compliance person, a risk person, or a policy person, to understand what those systems are being tested for and what the outputs are. So this is where Credo AI comes in. We really are a bridge between these MLOps tools and what you can think of as the GRC ecosystem, the governance, risk, and compliance ecosystem. That's an important differentiation to understand. We sit on top of your ML infrastructure, looking across your entire pipeline, your entire AI lifecycle, to figure out where hotspots of risk might be emerging, measured against the context we've brought in, the policies, and the best practices. And then Credo AI is also launching mitigation, where you can take active steps.

So, having said that, to address your question a little more specifically: over the past three years, Credo AI has built a strong IP moat, and we can tackle both structured and unstructured data extremely well. For example, in financial services, which is our top-performing sector, Credo AI is being deployed right now to provide governance for use cases from fraud models to risk-scoring models, anti-money-laundering models, and credit-underwriting models. In the high-tech sector, we are being used extensively for facial recognition systems and speech recognition systems. And in government, where we are getting a lot of excitement, there is a big focus on object detection in the field, so situational awareness systems, but also on the back office. As a government agency or a government partner, they are buying a lot of commercial third-party AI systems, so Credo AI can also help you evaluate third-party AI systems that you might not even have visibility into.

So how do you create that transparency, which can lead to trust? We do that very effectively across all the sectors. And I know we'll go a little deeper into generative AI and what we are doing there in just a bit. But right now, we've built those capabilities over the past three years; both structured and unstructured data sets and ML systems are a focus for us, and that's where we are seeing the traction.

Sabrina: Is there some way that you think about sourcing the ground truth data? As we think about demographic data in the HR tech use case, is there some data source that you plug into, and how do you think about this evolving over time? How do you continue to source that ground truth data?

Navrina: It’s important to understand why do customers use Credo ai and then it then addresses the question that you just asked me. There are three reasons why companies use Credo AI. First and foremost is to standardize AI governance. Most of the companies we work with are global two thousands, and as you can imagine, they have very siloed ML implementation, and they’re looking for a mechanism by which they can bring in that context and standardize visibility and governance across all those different siloed implementation.

The second reason companies bring in Credo AI is so they can really look at AI risk and visibility across all those different ML systems. And lastly, they bring in Credo AI to be compliant with existing or emerging regulations.

What we are finding is that across most of these applications, there are two routes we've taken. One is that we source the ground truth for a particular application ourselves. In that case, we've worked with many data vendors to create ground truth data for different applications that we know are going to be big and massive and for which we have a lot of customer demand. On the other side, where a customer is really looking for standardization of AI governance, or is really looking for compliance, we work with the ground truth data the company already has and use it to test against. Because, again, they're looking for standardization and regulatory compliance; they're not looking for that independent check where we provide the independent data sets as the ground truth.

Sabrina: In the compliance and audit use case, is this something that companies are going to have to do year after year? How should they be thinking about this? Is this something they’ll do time and time again, or is it a one-time audit, and then you check the box and you’re done?

Navrina: The companies that think about this as a once-and-done checkbox are already going to fail in the age of AI. The companies we work with right now are very interested in continuous governance. From the onset, as I'm thinking about an ML application, how can I ensure governance throughout the development process or the procurement process, so that before I put it in production, I have a good handle on potential risk? And once it's in production, through the monitoring systems they have, which we connect to, we can ensure continuous governance. Having said that, the regulatory landscape is very fragmented, Sabrina. Right now, most of the upcoming regulations will require, at minimum, an annual audit, an annual compliance requirement. But we are seeing emerging regulations that need that on a quarterly basis. And especially with the speed of advancements we've seen in artificial intelligence, and especially with generative AI, where things are going to change literally on a week-by-week basis, it is not so much about the snapshot governance viewpoint; it is going to be really critical to think about continuous governance, because it takes just one episode. I always share with my team: AI governance is like that insurance policy you wish you had when you are in the accident. For the companies that say, "Oh, let me just get into the accident and then I'll pay for it," it's too late. Don't wait for that moment when everything goes wrong. Start investing in AI governance, and especially make it front and center, to reap the benefits of AI advancements like generative AI that are coming your way.

Sabrina: I love that analogy around insurance: you get into the accident and then wish you had the car insurance. I think this is a good place to pivot into this whole world of generative AI, right? There's been a ton of buzz in the space. I read a stat on Crunchbase saying there were something like 110 new deals funded in 2022 that were specifically focused on generative AI, which is crazy. I'm curious, when it comes to generative AI, what are some of the areas where you see more need for AI governance? And I know Credo also recently launched a generative AI trust toolkit. How does this help intelligent application companies?

Navrina: Yeah, that really came out of a need: all our customers right now want to experiment with generative AI. Most of the companies we work with are not the careless companies. So let me explain how I view this generative AI ecosystem.

You have the extremely cautious, who are banning generative AI. Guess what? They're not going to be successful, because we are already getting reinvented. We got reinvented with GPT-4. Any company that is too cautious, saying, "I'm not going to bring in generative AI," has already lost in this new world. Then you have the careless category, at the other extreme of the spectrum: let's wait for the accident before taking action. But by that time, it's already too late. And then there is the central category, which I am super excited about: the clever category. The clever category understands that it's important to use and leverage generative AI.

But they’re also very careful about bringing in governance alongside it because they recognize that governance keeping pace with their AI adoption, procurement development is what’s going to be the path for successful implementation. So, in the past, I would say, couple of months, we heard a lot from our customers, that we want to adopt Gen AI, and we need Credo AI to help us adopt generative AI with confidence. Not like necessarily solving all the risks and all the unknown risks that Gen AI will bring, but at least having a pathway to implementation for these risk profiles.

So the generative AI trust toolkit we have right now, we are literally building as we speak with our customers, but it already has four core capabilities. The first capability we've introduced is what we call generative AI policy packs. As you can imagine, there are a lot of concerns around copyright and IP infringement issues, so we've been working with multiple legal teams to really dissect what these copyright issues could be. As an example, just this week, the Copyright Office released a statement about how it handles work that contains material generated by AI. They've been very clear that copyright law requires creative contributions from humans to be eligible for copyright protection. However, they've also stated very clearly that they're starting a new initiative to think about AI-generated content and who owns its copyright. But until that happens, making sure companies understand and abide by copyright law, especially in their data sets, is critical.

So one of the core capabilities in our trust toolkit is a policy pack around copyright infringement, where you can surface and understand your exposure (I wouldn't say quickly; there is obviously work involved depending on the application). For example, we have a copyright policy pack for GitHub Copilot, and we also have one for generated content, especially content coming from Stable Diffusion. The second category in our trust toolkit is evaluation and testing. What we've done is extend Credo AI Lens, our open-source assessment framework, to include additional assessment capabilities for large language models, like toxicity analysis. This is where we are working with multiple partners to understand what new kinds of assessment capabilities for LLMs we need to start bringing into our open source.
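For readers who want a feel for what an LLM toxicity assessment involves, here is a hedged sketch. It is written in the spirit of an open-source evaluation framework but uses hypothetical names throughout; in particular, `toxicity_score` stands in for whatever classifier you plug in (a moderation endpoint, a local model), and none of this is Credo AI Lens's actual interface.

```python
# Minimal sketch of a toxicity evaluation over a prompt suite; all names are
# placeholders, not the Credo AI Lens API.
from statistics import mean
from typing import Callable, Dict, List

def evaluate_toxicity(generate: Callable[[str], str],
                      prompts: List[str],
                      toxicity_score: Callable[[str], float],
                      max_mean: float = 0.05,
                      max_single: float = 0.5) -> Dict[str, object]:
    """Run each prompt through the model, score the completion, and compare
    aggregate and worst-case toxicity against thresholds from a policy pack."""
    scores = [toxicity_score(generate(p)) for p in prompts]
    return {
        "mean_toxicity": mean(scores),
        "worst_case": max(scores),
        "passed": mean(scores) <= max_mean and max(scores) <= max_single,
    }
```

The design choice worth noting is that the thresholds are inputs, not constants: they are exactly the kind of guardrail a policy pack would codify.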

And then the last two components in our trust toolkit are around input/output governance and prompt governance. A lot of our customers in the regulated space right now are being clever: they don't want to use LLMs for very high-impact, high-value applications. They're using them for customer success, maybe for marketing. In those scenarios, they do want to manage what's happening at the input and, in real time, at the output. So we've created filter mechanisms by which they can monitor inputs and outputs. We've also launched a separate toolkit (it's not part of the Credo AI suite) for prompt governance, so that we can empower end users to be mindful: is this the right prompt to use, or is this going to expose my organization to additional risk?
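The input/output governance idea is simple to picture: wrap every model call with a pre-filter on the prompt and a post-filter on the completion. The sketch below is a minimal illustration under assumed patterns; the regexes and blocked-message text are invented for this example and say nothing about how Credo AI's product actually filters.

```python
# Toy input/output governance wrapper; patterns are illustrative placeholders.
import re
from typing import Callable

BLOCKED_INPUT = [r"\b\d{3}-\d{2}-\d{4}\b"]                 # SSN-shaped strings
BLOCKED_OUTPUT = [r"(?i)internal[- ]only", r"\b\d{16}\b"]  # e.g., card numbers

def governed_call(model: Callable[[str], str], prompt: str) -> str:
    # Pre-filter: refuse prompts that appear to carry sensitive data.
    if any(re.search(p, prompt) for p in BLOCKED_INPUT):
        return "[input blocked: prompt appears to contain sensitive data]"
    completion = model(prompt)
    # Post-filter: withhold completions that trip an output guardrail.
    if any(re.search(p, completion) for p in BLOCKED_OUTPUT):
        return "[output withheld: completion tripped a governance filter]"
    return completion
```

A real deployment would log both decisions as governance evidence rather than silently swallowing them.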

I’m very excited about the trust toolkit, but I do want to caveat it. We should all be very worried because we don’t understand the risk of generative AI and large language models. If anyone claims they understand, they’re completely misinformed, and I would be very concerned about it. The second is the power of this technology. When I think about things that keep me up at night, LLMs/ generative AI have literally the power to either make our society or completely break it. Misinformation, security threats at large. We don’t know how to solve it, and Credo AI is not claiming we know how to solve it, but this is where we are actually going to be launching really exciting initiatives soon. Can’t share all the details, but how do we bring in ecosystem to really enable understanding of these unknown risks that these large language models are going to bring?

And then thirdly, companies should be intentional about whether they can create test beds within their organization and, within those test beds, experiment with generative AI capabilities alongside governance capabilities, before they open the test bed up and take generative AI to the full organization. That's where we come in. We are very excited about what we call generative AI test beds within our customer implementations, where we are testing out governance, as we speak, around the unknown risks these systems bring.

Sabrina: Wow, a lot to unpack. A lot of exciting offerings in the generative AI trust toolkit, and I totally agree with you about making sure people are using responsible AI and large language models in ethical and responsible ways. One of the critiques is that these LLMs may output malicious or simply incorrect information and can guide people down potentially dangerous paths. One thing I'm always interested in understanding better is whether there are certain guardrails companies can put in place to make sure these things don't happen. And I think you just alluded to one, the test bed example. So I'd love to understand other potential ways companies can use Credo to put these guardrails in place. Maybe it's more from a governance standpoint, saying, "Hey, are you making sure that you're checking all of these things when you should be?" Or potentially it's, "Hey, are we testing the model? Are we making sure we understand what it's outputting before we take it out to the application use cases?"

It’s certainly a question and big risk in my mind of the technology, right? And we don’t want to get to a place where maybe the government just shuts down the use of larger language models because it becomes so dangerous and because it is so widely accessible in the public’s hands. Just curious how you’re thinking about other guardrails that either companies can do using Credo or otherwise.

Navrina: This is where our policy packs are literally, I would say, the industry leader right now in putting those guardrails in place. When you have an LLM that you've retrained on your own corpus of data, or that is basically just searching over your corpus of data, there's a little more relief, because you can point to factual information. The propensity of these LLMs to hallucinate decreases if you put guardrails around what they can draw on: if the system can go through only this customer data, which my company owns, and use just that corpus, those guardrails become really critical. And this is where Credo AI policy packs, for copyright and for guardrails on what corpus of data a system should be using, become really important, along with the input/output governance I was mentioning.
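Here is a minimal sketch of that corpus-restriction guardrail: only answer from documents the company owns, and refuse when nothing relevant is retrieved. The retrieval step is naive keyword overlap purely for illustration (a real system would use embeddings and a vector index), and every name here is an assumption, not a description of Credo AI's implementation.

```python
# Toy retrieval-grounded guardrail: no grounding in the approved corpus,
# no answer. All names and the prompt template are illustrative.
from typing import Callable, List

def grounded_answer(llm: Callable[[str], str],
                    question: str,
                    corpus: List[str],
                    min_overlap: int = 2) -> str:
    words = set(question.lower().split())
    # Naive relevance check: keep documents sharing enough words with the question.
    relevant = [d for d in corpus
                if len(words & set(d.lower().split())) >= min_overlap]
    if not relevant:
        # Refusing is the guardrail: without grounding, do not let the model answer.
        return "I can't answer that from our approved data."
    context = "\n".join(relevant[:3])
    return llm(f"Answer ONLY from the context below.\n"
               f"Context:\n{context}\n\nQuestion: {question}")
```

The refusal branch is the point: hallucination risk drops when the system is forced to say "I don't know" rather than generate unsupported text.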

Recently I was having a conversation, and I'm not going to name this company, because I think they're doing phenomenal work, but an individual from this organization said that we should not be overthinking the risks of generative AI systems; we should just launch them into the market, let the world magically converge on what the risks are, and then, magically, we will arrive at solutions.

And I think that is the kind of mindset that's going to take us down the road of AI going completely unmanaged. That's what keeps me up at night: when you have so much belief in the technology that you turn a blind eye to managing risk. And we do have a lot of people in this ecosystem right now with that mindset, the careless category I was mentioning. So I think this is where education becomes really critical, because as we have seen, and as I have been exposed to in the past six weeks, the capacity within regulators right now is very limited. They are not able to keep up with the advancements in artificial intelligence.

They’re really looking to technologies like us to help work with them, to really think about these guardrails. So, either we are going to run into a future scenario where, there’s heavy regulation, nothing works, and technology is very limited. Or we are going to run into a situation where there is no thinking around these guardrails that we are going to see mass national security threats, misinformation at large.

And I think, I’m trying to figure out right now with the ecosystem, what is that clever way to implement this? And I think one of the cleverest ways is public-private partnership. Because there’s an opportunity for us to, for example, for Red teaming, bring in more policymakers, bring in impacted communities, and make sure that the outputs of those red teamings are exposed to the folks around what potential harms have been uncovered and what commitments a company can make to ensuring that harm does not happen.

Or think about system cards. I'm excited for ChatGPT as well as GPT-4 to release their system cards, but there are a lot of questions, and the mechanism by which those questions around system cards get answered is going to be really critical. Or the work being done by Hugging Face, kudos to them, around the RAIL license. We are partners in their RAIL initiative, a responsible AI license that is very prescriptive about where an AI or machine learning system can and cannot be used. I think the area and opportunity we are getting into is being very clear about the gap between the intent of an application and its actual use. Bringing transparency between those two is going to be a big responsibility of the developers building these systems, but also of the enterprises consuming them. And Credo AI has such a unique role to play in that, as an independent third party bringing this transparency. That's the world we are getting into right now.

Sabrina: And I wonder if there are other ways that we as a collective community, as investors investing in this space and also as company builders, can continue to educate the ecosystem on AI governance, what that means, and how we should collectively make sure we're implementing these responsible AI systems in an ethical way.

Navrina: So, Sabrina, there are actually a lot of great initiatives being worked on. We are an active partner of the Data & Trust Alliance, which was started about a year and a half to two years ago by the founder of General Catalyst, and it has attracted some of the largest companies to the partnership.

And we worked with the Data & Trust Alliance on an assessment for investors, whether they are VCs looking to invest in AI companies, part of a corporate venture group, or part of an M&A team doing due diligence on a company: what are the questions you should be asking of these AI companies to really unpack what kind of artificial intelligence is being used? Where are they getting their data sets from? How are they managing risk, and if they're not managing risk, why not? What are the applications, and how do they categorize the risk profiles of those applications?

The hype in generative AI is exciting. I'm excited about the productivity gains. I'm super excited about the augmentation and creativity it's already unleashing for me and my eight-year-old daughter, by the way; she's a huge fan of ChatGPT. She loves writing songs. She's a fan of Taylor Swift too, so she mixes the two. So I see that. But the issue is really making sure we are being very intentional about when things go wrong. When things are going right, phenomenal. It's when things go wrong. So, the Data & Trust Alliance: I highly encourage you to look at them.

SD is another initiative, part of the investors-for-sustainable-development ecosystem; its investors have a total of about $2 trillion in assets under management. Those investors are asking the same questions, right? How do we think about AI companies potentially contributing to misinformation? How do we think about an investment? How can we create disclosure reporting for public companies as part of their 10-Ks? Is there a way we can ask them to report on their responsible procurement, development, and use of artificial intelligence? More to come on that, because we are working pretty hard right now on responsible AI disclosures, similar to carbon footprint disclosures. We'll be able to share more with you at the end of this year about an initiative that is gaining a lot of steam to have public companies actually talk about this in their financial disclosures. So good work is happening, more is needed, and this is where Credo AI can really work with you and the rest of the ecosystem to bring that education.

Sabrina: I’m excited to check out those different initiatives and continue partnering with Credo. And I think just to shift a little Navrina. You’re also a member of the National AI Advisory Committee, and as a part of that, to my understanding, you advised the president on National AI Initiatives, and as we were just chatting about, this is extremely important as it relates to the adoption of new regulations and standards. What are some of the initiatives that you’re advising on? And do you have any predictions as to how you see the AI governance landscape shifting in the years ahead?

Navrina: So, Sabrina, just FYI and full disclosure: I'm here in my personal capacity, and what I'm going to share next is not a representation of what's happening at NAAC. A couple of things I can share, though, and this is all public information. First and foremost, NAAC really emerged from this need: when we look at the United States globally, we are not the regulators; we are the innovators of the world. Europe is the regulator of the world, if you will. But when we have such a powerful technology, how do we think about a federal, state-level, and local ecosystem to enable policymaking, enable a better understanding of these systems, and bring the private and public sectors together? That was the intention behind NAAC. Having said that, as I mentioned, I can't talk about the specific things we've been working on. Putting NAAC aside, I do want to give you a frame of reference: both as Credo AI and personally, I've been very actively involved with global regulations, whether with the European Commission on the EU AI Act, with the UK on their AI assurance framework, with Singapore on their phenomenal model governance framework, or with Canada on their recently launched AI and data work. Having said that, a couple of things we are seeing. We are going to see more regulations, and those regulations are going to be contextual. What I mean by that: in the United States, as an example, New York City has been at the forefront with Local Law 144, which is all about ensuring that any company procuring, using, or building automated employment decision-making tools has to provide a fairness audit for them. That kicks in next month, so April 16th is going to be an interesting day to see which enterprises take that responsibility very seriously and which enterprises are bailing on it. And then the question is enforcement: how is that going to be enforced? So, first and foremost, we are going to continue to see a lot of state and local regulations.

On the global stage, I think the EU AI Act is going to fundamentally transform how enterprises work. If you thought GDPR was groundbreaking, think of the EU AI Act as 10x that.

So we are going to see the Brussels effect at its best in the next year. The EU AI Act is going to go into effect this year, and it's going to be enforced over the next two years. So this is the moment when companies have to start thinking deeply about how they operate in Europe. Having said that, there is a bit of a curveball that generative AI threw at the regulators. Right now, there's an active debate in the European Commission about what the EU AI Act covers, which is general-purpose AI systems, and whether all generative AI falls under general-purpose AI systems. And there's active lobbying, as you can imagine, from some of the larger, powerful big-tech companies to avoid generative AI being clubbed into that category, because there are a lot of unknowns in generative AI.

So what we are going to see this year is a very interesting policy landscape, one that needs capacity building to come from the private sector. But it is also going to be a really critical foundation for how we are going to govern generative AI and keep stakeholders accountable for it.

Sabrina: Do you think there are ways that enterprise companies can start getting prepared or ready for this?

Navrina: First and foremost, I think the C-level really needs to acknowledge that they have already been reinvented, as of yesterday. Once they acknowledge that, they have to figure out: "Okay, if I am going to be this new organization with new kinds of AI capabilities in the future, do I want to take the careless approach, the clever approach, or the cautious approach?" What is going to be really critical, and this is a big part of the work I do in addition to selling the Credo AI product, is sitting down with C-level executives and honing in on why AI governance needs to be an enterprise priority, similar to cybersecurity, similar to privacy. We've learned a lot of lessons in cybersecurity and privacy. So how does AI governance become an enterprise priority? Why do you need to do that, and how do you adopt AI with confidence? Right now, it is less about regulation and trying to be compliant. It's more about how you can be competitive in this age of AI, how you can bring in new AI technologies, and how you can have a good understanding of what the potential risks can be. Managing regulatory compliance and managing brand risk come a little bit second right now. It's literally: do you want to compete in this new age of AI or not?

Sabrina: I think if you're an enterprise company not thinking about leveraging generative AI, or AI in some way, it's going to be a very tough couple of quarters and years ahead. Just to wrap up here, I have three final lightning questions, which we ask all of our IA40 winners. The first question is: aside from your own company, what startup are you most excited about in the intelligent application space, and why?

Navrina: I would say that I am a big fan of the work companies like OpenAI have done. This whole notion of a co-pilot, someone who is with you wherever you are working, augmenting your work, is something I get really excited about, especially the ease of use.

Sabrina: Yeah, I love the notion of a co-pilot, right? It's the ability to democratize AI and allow people who may not have a technical understanding of what's going on in the backend to really use and leverage the application. Okay, second question. Outside of enabling and applying AI to solve real-world challenges, what do you think is going to be the next greatest source of technological disruption in the next five years?

Navrina: Wow. Right now, my head and brain are literally all about artificial intelligence. The thing that keeps me up at night, as I mentioned, is really thinking about whether we will have a future we are proud of. So I spend a lot of time thinking about climate companies and sustainability companies, and especially about how the AI and climate worlds are going to come together to ensure that, one, we have a planet we can live on, and two, a world we are proud of, one that is not fragmented by misinformation and the harms AI can cause.

Sabrina: Third question. What is the most important lesson that you have learned over your startup journey?

Navrina: Wow. I’ve learned so many lessons, but I think the one that was very early on, shared by one of my mentors 20 years ago, and holds even more importance to me now. He would always say that a good idea is worth nothing without great execution. And I would say in my past three years with my first startup, all things being equal, the fastest company in the market will win. So when I think about a market that has not existed, and you are a category creator in that market, I am okay with, if the market doesn’t pan out. I’m okay with if the enterprise customers are not ready and they need change management. But the thing that I share with my team is I’m not okay if everything is working in our favor, and we get beat because we didn’t move fast. So that is really important. And we have within Credo AI, our, one of our values is what we call intentional velocity because as you can imagine, the speed by itself doesn’t do much good. It has to be married with this intentionality.

Sabrina: I love that. Well, Navrina, this has been really fun for me. I'm excited to continue following all the great work Credo AI is doing. Thank you again.

Navrina: Thank you so much for having me, Sabrina. This was a fun conversation.

Coral: Thank you for listening to this week's IA40 Spotlight Episode of Founded & Funded. If you're interested in learning more about Credo AI, visit Credo.AI. If you're interested in learning more about the IA40, visit IA40.com. Thanks again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with the CEO of GitHub.

Acquired Hosts Ben and David on Getting Started at Madrona and Tom Alberg’s Legacy

This week, we're excited to release a special live episode recorded during the community event portion of our 2023 annual meeting. Madrona Managing Director Matt McIlwain talks with the hosts of the Acquired podcast and Madrona alumni Ben Gilbert and David Rosenthal. The three reflect on Acquired episode No. 28, which dove into the Amazon IPO with the late Madrona co-founder and original Amazon board member Tom Alberg, and on the early days of getting the show off the ground from the Madrona offices.

You can watch the live video here.


This transcript was automatically generated and edited for clarity.

Matt: I’m Matt McIlwain, I’m one of the partners at Madrona and we’ve had a very action-packed day, and some of you are here for the day and many of you have been investors and partners and friends. At the beginning of the day, we reflected back on one of our co-founders, Tom Alberg. Tom was a co-founder of Madrona. He built, helped build Perkins Coie Law Firm, sold McCaw to AT&T. He started Madrona, led the first investment in Amazon and a bunch of other exciting things over the years. And he always had this incredible ability to be curious, to be impact-oriented, and to think long-term.

As we were thinking about how to celebrate this community, this amazing ecosystem (he even wrote a book on flywheels), we were talking about the University of Washington and how Tom had helped raise the money for the first standalone computer science building 20 years ago. Think about how Madrona has invested in well over 20 companies out of that computer science school and now, increasingly, in places like the Institute for Protein Design and the intersections of those. So we wanted another fun, engaging way for the whole community to share memories of Tom and, through that, Tom's legacy. We thought, what better way to do that than hang out with Dave and Ben.

I think, for very few of the people in this room do David and Ben need an introduction. They are the co-founders of the Acquired podcast and Madrona alumni, and Acquired is consistently a top-10 technology podcast in the world.

Ben and David working on an early Acquired Podcast episode in the Madrona office.

David: And it all began after hours in Madrona conference rooms.

Matt: There you go.

David: And maybe a few partners’ offices that we commandeered without them knowing.

Ben: Thanks, Tim and Scott.

David: And Soma a few times!

Matt: I think that’s a part of the fun aspect of this story is that David and Ben were at one point working at Madrona. But let’s even take a little bit of a small step back from that. They’re both accomplished venture capitalists, incredibly community-oriented people. But I think our good friend Greg Gottesman was the person that gets credit for bringing you two together, and I think it might have even been at a Seder dinner at his house. Tell us about that story.

Ben: Yeah, I’m originally from Ohio.

David: I’m going to cut you off even before we start.

Matt: See, they’re not used to being the ones that get interviewed.

David: We’re like an old married couple at this point. So, there were ulterior motives for both of us at that Seder, I believe. And the irony is we ended up getting together.

Ben: No. I didn’t have an ulterior, I was from Ohio. I did not know people in Seattle. Greg wanted generously invited me to his home for Passover…

David: Because he wanted to recruit you to Madrona…

Ben: It worked.

Matt: Well. Was that the whole story?

David: The other half of it was that I was just about to come back to Madrona from business school and was working on trying to close what I thought would be my first deal and court an entrepreneur. And Greg was helping me and said, “Okay, why don’t you invite him to Passover Seder?” And so, I was trying to work on a deal at this Passover…

Ben: David’s like reading the Haggadah trying to sell.

David: And there was this Ben guy there.

Matt: That eventually did lead to both of you working together at Madrona. And so maybe share a little bit about how that started to shape your friendship long before there was a gleam in your eye about doing some podcast together.

Ben: It’s interesting. I was fortunate to work at what we call then Madrona Labs now Madrona Venture Labs. And I didn’t know anything about venture capital, and so I had this sort of immense respect and looked up to everyone who was an actual investor at the firm. And I always, I have this sort of mental framework that in any business, you want to work in the core competency of the thing that the business does. I was looking around the thing here is investing and like David’s a guy that I had this relationship from a Passover Seder with, and I just want to absorb everything in his brain. And Acquired really was me attempting to find a way to get to spend more time with David.

Matt: Now the true story, now the true story.

David: The feeling is definitely mutual, because of course, as a venture capitalist, you realize that it's the entrepreneurs and the builders who create all the value. So I just wanted to extract everything from Ben's brain, like, "Hey, here's a Microsoft product manager who…" Ben chronically undersells himself. He was on the cover of the Seattle Times as the future of Microsoft.

Ben: The day before I joined Greg.

Matt: Hey. Timing, timing.

Ben: And it was super unintentional. I thought the story wasn't going to run, because it was a long-lead piece and I had done the interview like three months before. I assumed that because it hadn't come out yet, the story was dead. So I signed the offer letter with Greg to come to Madrona Labs, and then I got the email from the Seattle Times: "Hey, watch the paper tomorrow morning."

And it was on a morning run, as I was mentally preparing to go in and give notice ("Hey, I signed an offer last night"), that I ran past one of those newspaper bins with my face on the cover. A hundred percent true story. No exaggeration.

Matt: There’s been some newspaper bins that my face has been on the cover. But that’s a different topic though. But I don’t think I’d ever heard that story. That’s really cool. And is that why you’ve never gotten to interview Satya?

David: It might be… despite repeated attempts.

Matt: Maybe we can all figure that out someday here together. Podcasts were not quite what they are today, back when you all started the Acquired Podcast. So how did this idea even come about? How did you first get started on it?

Ben: I do mean it when I say I was trying to spend more time with David. We were at drinks, and I was pitching him on two ideas. I said there were two podcasts I thought would be good. I was really big into podcasts. I was loading podcasts over FireWire onto my iPod in 2009.

Matt: You were an early adopter, an early adopter.

David: Back when podcasts were cast that you put on your iPod. You were doing that?

Ben: Yes. And so the two concepts were, first, acquisitions that actually went well. Because I have always felt the media narrative is, "Look how terrible a deal this was," incinerating billions of dollars, and three quarters later there's a write-down. If what we're doing in the world is creating companies that we want to one day go public, or, as the vast majority of companies do, get acquired, we should understand how to work backward from when those go well. That was topic one. Topic two was, let's dive into businesses that have had multiple billion-dollar insights. Because my working hypothesis at the time was that most businesses that get to scale forever draft on the core insight they had when they started the business, and can never transform and come up with a second one. Occasionally you get an iPhone or an AWS, but very rarely. And David, I remember your comment to me was, "I will 100% do the first one with you. And for the second one, I think we'll run out after five episodes."

David: And the first one only got us about 20 episodes before we ran out. But to me, the magic of how we started is that both of those were small ideas. Starting a podcast was a small idea. The feeling genuinely was mutual; it was an excuse for us to spend time together and build our friendship. We never could have imagined that podcasts would grow as big as they have, or that we would grow as big as we have. We were talking with Nick from Rec Room earlier about when he started a VR company, however many years ago; that was about the same time.

Matt: By the way, Nick, we got grilled on VR companies today. They were like, “What were you guys thinking?” We’re like, “Wait, we invested in Rec Room.”

Ben: Rec Room is an open world multi-platform experience. I don’t know…

Matt: Because Nick listened to the market.

David: Sometimes you just get really lucky in life and magic happens with the right set of people and things change and you figure them out along the way. And that’s what we’ve done since the days in the Madrona offices.

Matt: And I think you all were recording in the offices if I remember correctly.

Ben: Yeah. We’ve got a picture, I think in what is now Soma’s office, of us sharing a headphone because we didn’t buy multiple sets of good headphones. (See photo above)

Matt: It’s a good frugal startup. There’s nothing wrong with that. That’s perfect. Since you did go down the Acquired Podcast path, tell us about those first few episodes. And then, I want to get to this one that was, I think, your second break from actually the Acquired story. But you did a few before that, maybe 20, 25 before you went up in a different direction on IPOs.

Ben: Yeah, the early ones, it was interesting. If you look at our analytics now, what you will see is that every time we break our record and achieve the greatest episode we've done to date (which is cool, and it happens every couple of months)…

Matt: Especially because they’re so freaking long. But that’s just my opinion.

Ben: Totally. It is an episode that is just David and I. The canonical wisdom when you start a podcast is: 30 to 40 minutes, released weekly to set a listener habit, and have guests, because the guests can promote the show. And zero times in at least the last five years has a guest episode been our highest ever.

And that’s forced a lot of introspection. And I think at least one of our big takeaways is to always focus on the thing that you do that is unique and differentiated. And almost everyone who’s a good podcast guest goes on multiple podcasts. And so, it’s less…

Matt: Concentrated.

Ben: Exactly. The thing we can do that's different from everybody else is the format we have developed. The rare thing is David and I doing that format.

David: And I think the core of the magic that was there in those first days, and that carries through to today, is that it's about our friendship and about us learning together. That's what we were doing in the Madrona offices. Matt, as you were referring to, I think it was until the Facebook IPO episode that every episode we did was so tightly constrained: this has to be an acquisition of a technology company that went well, and that's what we are going to analyze. And then, this must have been 2016, we were like, well, the Facebook story is so important, maybe we can expand to do IPOs as well.

Ben: And the thing we thought was true was that people wanted to listen because they liked us grading acquisitions. We were super wrong about the job to be done of Acquired in the mind of a listener. What they were actually there for is great storytelling, structured analytical thinking, and Ben and David. And over time, we learned we can apply that to anything. There doesn't have to be a liquidity event. It could be Taylor Swift's music career.

Matt: When’s the food show coming out?

Ben: We did LVMH. Dave and I talked about handbags for four hours.

Matt: So eventually, and I'm thinking specifically of episode 28, you not only went to a second IPO, but you had a guest on to help tell the story. That was, of course, the story of Amazon's IPO, and Tom Alberg was the guest. Take us back to that episode. I've re-listened to it multiple times now. Tell us a little bit about how that idea came about and how you prepared for that particular moment.

Ben: First of all, I was super nervous, because I didn't really have a relationship with Tom, and he's so unbelievably accomplished. So, David, you were the one to swing into his office and say, "Hey Tom, do you think you might be willing to come on our pathetic little podcast as a guest?"

David: And Tom, of course, said yes, as anybody who knows Tom would have known immediately he would. I was also a little nervous, in part because he was one of my bosses. I guess one of your bosses, too, in a sense. But Tom, you know, it's funny, Ben said that guests go on multiple podcasts. Tom probably did some others, but he was just quiet and unassuming and so humble, genuinely. And before Madrona, I had worked at News Corp, where, not my direct boss, but my…

Ben: Little bit of a different culture.

David: The culture of the founder was a little different.

Matt: I think we’re picking up what you’re putting down.

David: Yeah, a little different than Tom was. I re-listened to the episode on the flight up here, and one of the things I didn't even pick up on at the time is that while we're interviewing him, he just casually drops the line, "Oh yeah, I was involved with Visio, too." Visio was a multi-billion-dollar outcome, and he was just like, oh yeah. I think it was when Ben was introducing him.

Ben: And there were moments where we would ask a question and Tom would say, ” I don’t know. That’s a good question.” And most other people would’ve conjured some sort of answer to try to sound smart. That was of no interest to Tom.

Matt: I think the Visio connection, if I remember correctly, was that Doug Mackenzie from Kleiner was on the board.

David: That’s how it came up. We were talking about what happened about John Doerr and Amazon. Yes. And that’s how the Vizio connection came up.

Matt: Re-tell that story. That’s a great story.

David: The John Doerr story. Oh, this is so great. Re-listening to the episode, I tweeted about it from the plane. It is so cringeworthy to me and to Ben, too, to re-listen to it, because we hear ourselves and think, we were talking too much.

We just had this amazing person, and we were so young and green in what we were doing; if we had the chance to redo it, we would do it differently. But Tom is amazing, and one of the stories he told, which we almost didn't let him tell, was about when the second venture round for Amazon came together.

Ben: And Tom was on the advisory board for Amazon.

Matt: It wasn’t quite an official board yet.

Ben: Because I think Amazon had raised maybe $1 million on a five pre. Tom was one of the angel investors. And so, he was maybe Jeff's only advisory board member at that point.

David: I think he said there were a few.

Matt: Yeah, it was a small group, and I think Tom was the only one with substantial business experience.

David: And now, putting it together, the Visio connection is probably how this came about. But Tom says, "I came home from work one day, when we and Jeff were starting to think about raising some more money, and my wife said, do you know some guy named John Doerr?" And Tom was like, "Well, yes, I know who John is." And she said, "You need to call him back, because he's been calling here every 15 minutes for the last several hours, on your home phone number, saying he needs to talk to you." And it was all about trying to get an angle to lead the round.

Matt: And he beat out General Atlantic, which was in the news the last two weeks for different topics we probably won't get into right now. But yeah, that was a pretty interesting time.

David: And not only beat out General Atlantic, but beat them out at half the price. Yep. That was how powerful John was.

Matt: But almost lost the deal because…

David: He tried to hand off the board seat.

Matt: Ah, there you go. What else do you remember from that episode, or things that stood out to you two as you looked back and listened to it, other than, "Hey, we've come a long way"?

I thought you guys did a great job on that interview.

Ben: Tom did a great job of helping David and me understand when certain things got introduced into the Amazon dogma that today we just assume have been there forever. He set the record straight on the famous flywheel diagram. We all talk about the Amazon flywheel, and everybody tries to graft their own business onto Amazon's flywheel, but no one's flywheel is as strong as Amazon's. That conversation came up about two years post-IPO. A lot of us try to say, "Oh, Amazon, from the very moment Jeff conceived of it, was exactly this way." And Tom very graciously was both respectful of the genius of Amazon from the very start and helped unpack when certain components of the Amazon lore actually got layered on.

David: Yeah. There was a moment, I think, Ben, when you asked him, "When you met Jeff in that first fundraising round, was he something special? Was it clear to you that this was one of the most amazing entrepreneurs in history sitting before you?" And Tom, again so graciously, was like, "No."

Ben: I think he said, "He was very good. Definitely in the top 10 or 20% of the entrepreneurs that I meet with. But, you know, we work with a lot of great entrepreneurs."

Matt: We do work with a lot of great entrepreneurs, for the record.

David: And I think he just made the point so well. Yes, of course, Jeff was incredibly smart, incredibly driven, had a great idea, and was operating in a great market. But nobody, and I think Tom even said this, not even Jeff in his wildest dreams at that moment, as ambitious as he was, could imagine what Amazon was going to become. The conversation with him was great because it so reflected the style I absorbed from working with him: you can't have delusions of grandeur. You need to work every day along the way and respond to the market as it develops. That is the story of Amazon, and Tom is such a big part of it.

Matt: And then there's this whole thing about how, as a CEO, you're always busy, there's always a lot of work, but there are really just a handful of consequential decisions. I remember the Barnes & Noble story; that's another one that really stood out to me from that episode. He unpacked that whole thing with the Barnes & Noble folks. You two know this better than me, but to me it stood out for how the CEO's job is to be super thoughtful about being the "aligner in chief," but then also be there to lead the couple of key decisions that have to be made every year. And that felt like one of them.

David: It absolutely was. The quick Barnes & Noble story is that the Riggio brothers were these rough-and-tumble Brooklyn, New York guys. Not Tom's style, and at that point in time, not Jeff's style. Jeff then was very different from Jeff today. And at a dinner with Tom and Jeff, they basically said, "We're going to kill you. We're coming for you."

Ben: “You can either partner with us, or we can buy you on terrible terms, or we can leave dinner and we will kill you,” was effectively the message.

David: And Tom and Jeff together, decided to fight. It was Jeff’s decision, but I think Tom helped him get to the decision to fight.

My favorite non-Amazon Tom story is from my last day. I will never forget this. On my last day at Madrona, before I went to business school, after two years as an associate, I was locked in a Wilson Sonsini boardroom with Tim and Tom, negotiating a recap of a company that had raised too much money, fallen on some challenges, and had unhappy people on the cap table. I remember the whole time thinking, this is my last day before my summer, before business school, and I'm locked in this boardroom. What are we doing here? Venture is about focusing on your winners. What are we doing? But that was just not Tom's style. And that company today is Impinj, a three-and-a-half-billion-dollar public company.

Matt: Yeah, that’s that long-term mindset.

David: That is that long-term mindset.

Matt: One of the things that really impresses me, and leads me to listen to your podcast on my jogs (it takes multiple jogs to get through one episode, and just to be clear, I don’t run that far these days), is how good you are at research and preparation. How do you process the external research alongside what you want to ask and the analysis you want to do yourselves? On that episode in particular, but maybe more generally, too.

Ben: If there’s any metric we watch the most carefully, this is the one we care about most: is it possible for us to have no inside information, but for insiders to think we did because we understood it that well? It’s not like it shows up in a graph, so it’s a hard thing to track over time. This has been a superpower of David’s and a thing that sets us apart from other podcasts or Substacks. There is an immense amount of public information available on a company, and if you’re willing to take a month and use the internet the very best way you can, you find so much. If you scope with date operators, so you’re only looking at New York Times articles on Nintendo from 1989 to 1990, like I was last week, you see that public sentiment and journalist sentiment change so much from year to year. Five years from now, the way people talk about companies will completely overwrite how they’re talked about today.

Another great example is conference talks. People go to conferences all the time, and lots of times the talks are recorded on YouTube. A lot of times they’re boring industry conferences that don’t get much fanfare or coverage. But David will go find a talk that some mid-level executive at SpaceX gave at an aerospace conference, one that got maybe 2,000 views on YouTube, and pull out an insight on SpaceX that is completely overlooked by the media yet a key part of their story.

Matt: So I wasn’t planning to ask this, but I can’t help asking: have you tried using ChatGPT yet?

David: Yes. And it’s wrong. Sometimes very wrong. I can’t remember what the example was, but I…

Ben: Oh, when we were doing the NFL episode, I asked what was the most recent stadium built without a dome, and it gave me the wrong answer. And I was like, that’s not true, give me the one before that, and it gave me a wrong answer again. It’s fun to talk about, right? In rooms like this and on Twitter, these things get a lot of pickup: GPT is so wrong, it hallucinates. But do you remember talking to your friends in 2004 and saying you can’t trust anything on the internet?

Matt: Or, as Mikhail pointed out at lunch today: can you trust your friend? Is it a hallucination or an insight? That was a really powerful moment we had earlier in the day.

What about the couple of more recent episodes you’ve done on the history of Amazon and the history of AWS? I enjoyed them immensely. Maybe share an extra thought or two on what you have learned, and what we all have learned in this ecosystem. We spent a lot of time talking about some of the amazing things Microsoft is doing, and less so this year and today on Amazon. Any big-picture takeaways from all the research and the podcasts you’ve done on Amazon over the years?

David: Yeah. It’s interesting. My view on Amazon has evolved quite a bit, even up through our most recent episode, which we’ve recorded but which isn’t out yet. We did another episode with Hamilton Helmer, who wrote the great book “7 Powers.”

Ben: Best business strategy book out there.

David: Absolutely. Up there with Porter’s “5 Forces” and “The Innovator’s Dilemma.” Anyway, this episode we did with him was specifically about “transforming,” as he calls it, which, ironically, was Ben’s second idea for a podcast: companies that have had a second act.

And that’s the amazing thing about Amazon, and the AWS story within it. As we talked about on the episode, there are many origin stories of AWS, all of which have some element of truth.

Ben: Except for the one that they had excess servers. That is patently false.

David: That is false.

It seems so far afield from what Amazon retail was. But what Hamilton’s research on this shows is that if you think about a two-by-two matrix for companies, with your existing customer base on one axis and your existing capabilities within the company on the other, AWS serves a completely, 100% different customer base than Amazon retail. But the capabilities Amazon had to build to serve retail were the same ones needed to build AWS. They had, at the time, the world’s best internet architects, backend servers, etc. Everything that goes into building AWS, they had already built in-house. So it actually was a very natural thing. And in Hamilton’s research, for companies that have done this, that is almost always the case: a different customer set, served by the same set of capabilities within the company. I always used to think about Amazon as this incredible wild-idea factory. And I’ve come to appreciate, at least in the AWS case, and I think in some of their other successful forays, too, that there is a little more science to it than that.

Ben: Hamilton’s advice to founders looking to figure out the next S-curve to stack on top of their S-curve is: what do your capabilities uniquely enable you to do versus your competitors? Or versus all the other companies out there, even if you’re not competing with them right now. What can you uniquely do, even if it’s serving a different need in the world?

Matt: I think back 16 years ago, when we hosted an event up on Capitol Hill with Andy and a couple of startups. I think Eric Brown, who’s in the room, presented how Smartsheet was using AWS at that event. And that was the launch of AWS. It was clearly focused on startups and developers, but it was that developer orientation that let them build a platform strategy, too. And you think about the companies built on top of that platform over time: Snowflake, our portfolio companies, and many others. That kind of platform capability is a core competency of Amazon in other areas as well. Of course, Prime being a prime example.

David: The other amazing piece of the AWS story, which I didn’t appreciate until we did our episode (in that case, we did get to talk to some folks within the company), is the go-to-market organization around AWS. That was not an existing capability within Amazon, and the story of Andy Jassy building it is one of the most incredible entrepreneurial journeys. There’s this trope in venture that, at the end of the day, some huge percentage of enterprise software, 50%, 70%, whatever it is, gets sold through the four or five big enterprise software sales giants, whether it’s Microsoft or Oracle or Salesforce, or what have you. Amazon built another one of those giants from scratch, which is amazing, with a lot of Microsoft DNA.

Ben: And then retained first place. They had a five-year lead in cloud, and they have retained first place 15 years later.

Matt: And they’re always continuing to learn. I think both Amazon and Microsoft have now built, or rebuilt, that muscle of continuous learning, and that is possibly why they’ve both become such major forces in the technology-driven ecosystem. Speaking of ecosystems, you both, and particularly David, have exposure to the Seattle ecosystem and Silicon Valley. You and your wife, Jenny, moved down to Silicon Valley many years ago, long before we opened a Silicon Valley office. How do you compare and contrast the ecosystems, from both a startup perspective and a venture perspective? You’ve seen both, and you’ve certainly got plenty of experience with companies you’ve worked with, built, and co-invested in with others, too.

David: This is just my perspective, and I’m not sure that it’s right, but it’s the lens through which I think about this. Even when I was at Madrona, now many years ago, I thought this was starting to happen, and now I think it’s really progressed: I don’t think of them as different. I think of them as the same ecosystem. The number of people I’ve met and become close with in the Bay Area ecosystem over the past few years who moved up to Seattle during Covid is enormous. And all of those companies, maybe they’re Seattle companies, maybe Bay Area companies, I don’t even know how to classify them. It’s the same thing. They’re cross-border, so to speak. So, for me, I never thought about them as totally separate. Now, I do think, on the margins, technology workers in the Bay Area historically have been more likely to start startups, and on the margins, Seattle technology workers have been more likely to stay at Amazon and Microsoft. A large part of that was the Amazon share price; it was a good incentive to stay. But in my investing, working with founders, and getting to know folks through the podcast, I think that’s changed as well. I don’t see the appetites as being different now. What do you think, Ben?

Ben: I’ve had to redefine my lens. For PSL Ventures, I’m a Seattle, Pacific Northwest-focused venture capitalist. What does that mean? It used to mean we invest in butts in seats here, but that’s stupid. What I care about, and the reason this is our fund’s thesis and part of Madrona’s thesis, is the talent pools that get trained at these institutions: the University of Washington, Amazon, Microsoft. I don’t care what physical location people are in when they’re starting companies. I care that they have unfair access to the talent networks coming out of these institutions. That is where there’s alpha generation.

David: One thing that’s interesting: we’ve gotten to know a bunch of Stripe folks over the past couple of years through the Acquired Podcast, and a huge percentage of those people are now in Seattle for various reasons.

Matt: Yeah. I was with a bunch of Stripe senior execs the weekend of SVB, and a lot of them live here, to your point. That is an interesting point, and one we didn’t explore as much today at our annual meeting: this notion of hybrid. We still think that having a nucleus of the team close by, especially at the early stage, matters a lot. But a lot of these teams are increasingly hybrid, increasingly distributed, and you want to have the best talent in the best roles in the world. So, it is evolving. And I’d like to think that Seattle’s got a little bit of a different culture than the Valley, based on my experiences. I don’t know if you guys would agree with that or not.

Ben: Totally. I think we are, for better or for worse, and the answer is both, insulated from hype. You’re not going to see as many companies raise four consecutive funding rounds with very little revenue growth or product development advancement. But also, when the market falls apart and you’re like, “Oh no, where’s the intrinsic value,” the companies in the portfolios of Seattle venture funds tend to be correctly valued. It’s a double-edged sword.

David: I do think, though, there is tremendous demand among Silicon Valley venture capitalists to invest in Seattle companies. I’m sure you both see this every day. I haven’t lived here since 2016, but I still get asked all the time, what are the best companies in Seattle?

Matt: And I’m sure you just point them our way.

David: Exactly. I say I have two great venture firms I can introduce you to.

Matt: That’s awesome. Coming back to this ecosystem, and of course to Tom: his book about flywheels, that episode, and his impact on really all of our lives. We sat next to each other for 20 years, and he was an amazing human being, friend, and mentor. Is there one last thought you might share about how you think about him and his legacy, something that might leave the group with a bit of inspiration?

Ben: In every conversation I ever had with Tom, he was a very curious person. And I hope that if I have a fraction of the success Tom had in life, I stay as curious as he did.

Matt: Love it.

David: I completely agree with that. I was chatting with someone earlier about our interactions with Tom and how he approached things, and I was reminded that he was so much older than us. I don’t mean that in a bad way, just in a factual way. But from talking to him, you would never think that. He always had a young mind. And I hope I can be the same way when I’m further up there in years.

Matt: Guys, you’ve been really kind to let me turn the tables a little bit on you and do this interview and have this discussion.

You’re great people and great friends of the firm. We wish you continued success as investors, ecosystem builders, and, of course, with the Acquired Podcast. So, thanks so much for being here today.

Ben: Thanks, Matt.

Thanks again for listening to this week’s live episode of Founded & Funded. Tune in in a couple of weeks for our next episode of Founded & Funded with the CEO of Credo AI.

Common Room’s Viraj Mody on Building Community, Foundation Models, Being Relentless

Madrona Managing Director Soma dives into the world of intelligent applications and generative AI with Common Room Co-founder and CTO Viraj Mody. Madrona first invested in Common Room in 2020 — and we had the pleasure of having the founders join us on Founded & Funded the following year.

Common Room is an intelligent community growth platform that combines engagement data from platforms like LinkedIn, Slack, Twitter, Reddit, GitHub, and others with product usage and CRM data to surface insights from across an organization’s entire user community. Customers like Figma, OpenAI, and Grafana Labs use Common Room to better understand their users and quickly identify the people, problems, and conversations that should matter most to those organizations.

Soma and Viraj dive into the importance of deeply understanding the problem you’re trying to solve as a startup — and how that will feed into your product iterations — why organizations need a 360-degree profile of their user base, how Common Room has utilized foundation models to build intelligence — not just generative intelligence — into its platform — and so much more. So I’ll go ahead and hand it over to Soma to dive in.

This transcript was automatically generated and edited for clarity.

Soma: Hi, everyone. My name is Soma, and I’m a managing director at Madrona Ventures. Today, I’m excited to have Common Room co-founder and CTO Viraj Mody here with me. I’ve been fortunate to be a part of the Common Room journey from day one, when we co-led the seed round that Common Room did a couple of years ago. It’s been fantastic to see the company go from start to where it is today. Viraj, welcome to the show.

Viraj: Thank you for having me, Soma, and thanks for the partnership from the early days.

Soma: Absolutely. So Viraj, why don’t we start with you giving us a quick overview of the genesis of the idea and how you guys decided this was the problem space you were going to go after?

Viraj: So one of my co-founders, our CEO, Linda Lian, led product marketing at AWS and did a lot of things by hand. AWS has a phenomenal champion development program called AWS Heroes, and Linda was involved in that. It planted the seed for her in terms of the power of unlocking the community and champions out there to help. Then, independently, my previous experience was at Dropbox, which was pretty early in the product-led growth journey, and we spent a bunch of time there building internal tools that essentially unlocked a lot of the same insights Common Room does.

Tom, one of our other co-founders, worked with me at Dropbox. And our fourth co-founder, Francis, was one of the early designers for Facebook Groups, which was a very community-led, powered-by-the-people type of surface within Facebook. So all of us had various perspectives on the same problem, and it was a very natural fit when we started chatting and exploring how to convert our various experiences into a product that could help other companies leverage their communities and customer bases.

Soma: People say that hindsight is always 20/20. Today, you’ve got a product out in the market and a lot of great logos as customers. So you can say, “Hey, based on the traction, I can project what the future could look like.” But when you started Common Room, how did you decide that this was a bet worth taking, and why did you decide to spend the next chunk of your life working on this company, building this set of products, and doing something fantastic in the process?

Viraj: I think it comes from really understanding the problem space, and that’s why having co-founders with complementary skills is really important. Each of us brought a unique perspective but had a pretty unified vision for where we wanted to see the product and the company go. We were pretty early in seeing some of the motions being unlocked by community, both from our own experiences and, more importantly, from talking to customers who were already doing this as part of their product journey. Some of our early customers, like Figma and Coda, were great partners in helping us think through how they would like to shape their business growth engine. And that spurred a bunch of ideas for us.

One thing we did a lot early on, and that I’d suggest everybody on an early team spend time doing, is talking to customers, not with the idea of having them give you solutions, but to really deeply understand their problem. Then you apply your unique perspectives and experiences, plus your context on what’s going on in the ecosystem. We’re pretty well connected, not just through our networks but also with a lot of peer companies. So, connecting the dots between the problems customers face, especially progressive customers who see where the world is going, and then partnering with them to build that vision of the future, I’d say that’s been a key ingredient from the early days. And then, for me personally, I have a lot of experience and confidence in my ability to build this.

Back when Common Room started, there were a few fads going on. There was crypto, and there was fintech, and those were all very exciting spaces. But for me personally, my strengths and experience lined up really nicely with building enterprise-scale SaaS software. And that beautifully coincided with some of these leading customers we were talking to, who had real problems we thought we could uniquely solve that no one else was paying attention to.

Soma: Love to hear that confidence, Viraj, both in yourself and your co-founders, and in the signals you were seeing in the market. I don’t know if you remember this, but we had you and Linda on this show a little while ago. At the time, we talked a bunch about what goes into making a great founding team and how you find people aligned and bound by a common mission and vision. You told a great story about what you were looking for, some of the mishaps you had along the way, how you ended up with the founding team you have now, and all that fun stuff. Fast-forwarding two years, how do you think that journey is going? Do you still feel the same level of excitement and energy around the founding team, or do you wish you had done anything differently?

Viraj: There are plenty of things I could have done differently, but all in all, I feel fortunate in my founding partners, and the rest of our team has been phenomenal, too; working with all of the Roomies has been great. I wouldn’t change the team for anything. I think we have a great crew. One thing that summarizes how we’ve operated over the last few years is relentlessness and a focus on executing with velocity. Those two have been pretty consistent. At every step of scale, we’ve had to change our approach, but we haven’t changed either of these: relentless focus on customers and their problems, and internally, relentless focus on speed of execution. Both of those together have been really impactful in helping us get to where we are, so I definitely wouldn’t change them. Anyone who says they wouldn’t change anything about the past is generally not seeing the whole picture. So obviously, there are things. But all in all, I feel like we’re positioned to do well as long as we continue focusing on the things that matter.

Soma: I always say, Viraj, that having a great founding team is half the battle won. I work with a variety of companies, and when I look at the founding team you guys have put together, I feel really good about what you’ve done. People talk about ICP, or Ideal Customer Profile, but I want to take a step back and ask: what kinds of companies do you think need engagement with their communities, and how do you think it impacts the business? Is it all about customer satisfaction, or does it go beyond that: “Hey, I can help you with the top line, the bottom line, product adoption, this, and that”? How do you think about it?

Viraj: Broadly speaking, this is important for every company out there, because it all starts with the definition of community. It’s very easy to paint a very narrow picture of what a community is, but really you should think about your community as existing users, future users, people who engage with your brand, and people who have heard about your company but not really used it. You can encompass all of these and then build a bottom-up community strategy. And I think the answer goes way beyond just the customer satisfaction part; that’s the bottom of the funnel in many ways. Every company in the world really needs to think about how it can accelerate its own growth with data and insights unlocked from its broad community of users, not just a narrow definition of a social media community or forum community or Slack community. Companies that do this right unlock all sorts of superpowers, not just in growing their top line and bottom line in absolute dollars, but also in getting really high-quality signal from people out there about problems that may not be on their radar, or in identifying some of their most active champions in different parts of the world, who exhibit behaviors that are not easily spotted using conventional tools.

Soma: I’ll tell you, from my vantage point, one of the things you guys have done a great job with, even in the last 12 to 18 months, is the kinds of customers you’ve been able to sign up to start using Common Room and to see the benefits of community engagement. You go down that list, and it’s literally the who’s who of technology customers and logos. That’s a fantastic place to be, because it’s one of those things where, when you get the leading companies, others follow fast. Was that a strategic imperative you set, and how did you end up with such a phenomenal set of logos and customers at as early a stage as you are?

Viraj: That’s been a key focus for us from the early days: making sure that we have some of the best thought leaders in our space. When partnering with early customers, it’s important to identify companies that see the future the same way you do. Many of the logos you see on our website and using Common Room embody that; they are at the bleeding edge of how to engage with their community, how to leverage it, and how to grow and cultivate champions. So it was a very deliberate decision on our end to identify who we think those companies are. And then it’s been almost a “practice what you preach” motion. Once you work with people who are really aligned with your vision, they help promote your product and your vision to their peers, who, by definition, are other leading logos. So it’s been really helpful to use what we help our customers do to grow our own customer base. Between that and the networks we’ve been able to unlock, from the team here but also from our investors and partners, it has been very deliberate, and I think it’s been paying off, at least so far.

Soma: Can we now go one step further and talk about a couple of specific examples? I know for example that Figma and OpenAI are a couple of your customers. Can you specifically talk about what they do with Common Room?

Viraj: We’ve been fortunate to work with some of the best companies out there and some incredibly well-known logos. Each one obviously has a different focus, but there’s a pretty common overlap of use cases across all of them. Since you mentioned Figma and OpenAI, I can chat a bit more about those. Both are very community-first, in terms of not just a community of users but also a community of practice. The mindset is, “We are building a product, and obviously we want users to use and champion it. But independently of that, we also want to build a very robust community”: for Figma, a community of designers who speak with each other, bounce ideas, and help each other grow; for OpenAI, a bunch of researchers and people at the forefront of AI practice. And then they tie that back to business outcomes.

So when Figma launches a new feature, how do they go and reach out to their champions who’ve been requesting that feature to help them spread the word and generate content? How do they host the best events geographically across the world, bringing together in real life or online people who are their biggest champions, people who’ve been generating content independently and talking about Figma features?

Similarly for OpenAI, as they’ve gotten to where they are, they’ve had several versions: GPT-2, GPT-3. Along the way, they’ve developed communities on Discord and forums, where people discuss best practices for taking this really nascent technology and unlocking powerful use cases with each other, but also collaborate with the company. So how do you make sure that people at the company are paying attention to those conversations?

Both of these companies have the fortunate problem of having so much engagement that one thing that’s been powerful for them is being able to pull signal out of the noise. When you have something as powerful as GPT-3 and ChatGPT unleashed on the world, everybody’s talking about it, and these companies are seeing some of the most significant community growth among any of our customers. But from the company’s perspective, how do you take all of this great activity going on everywhere and extract the things that actually matter to you? That is where we use a lot of machine learning models of our own, but also foundation models from OpenAI itself, to help.

Soma: One of the things that I’ve heard you and Linda talk a lot about recently is the intelligence layer you’re building into Common Room. Tell us a little bit about why you think that is critical to what you’re building and, more importantly, how you go about building it into your platform.

Viraj: Once you start collecting information about all the conversations happening across various digital channels, the volume for some of the best companies out there is pretty overwhelming. The typical company will have a social media presence on Twitter, Facebook, and so on, plus conversations on Reddit. Then they’ll have closed forums, for example, where people ask usage or product questions, or open and/or closed conversational communities like a Slack or Discord server. Then there are technical conversations on GitHub and Stack Overflow. Plus, you have a lot of internal systems with your CRM and product usage data.

When you start thinking about each of these merging with the others, the amount of data you have starts to grow exponentially, and being able to convert that into meaningful signal is where some of the most impactful outcomes happen. So one place we’ve invested a lot in the intelligence aspects of Common Room is on the axis of community members. Once you have members in your community, really understanding who they are and building a 360-degree profile of them across all of these channels, that’s one layer. The other is the activity they generate. Someone publishes a YouTube video, someone else posts on a forum, and someone else opens a GitHub pull request or issue. How do you fold all of these into that 360-degree profile and paint a very clear picture of what sentiment this member is expressing? What are their key frustrations? What topics are they talking about? What are the different categories of conversations they’re having across all of these platforms?

So, intelligence about the members, and intelligence about their activities. Then, on a third axis, you can think about intelligence about businesses: the propensity of businesses that are either existing customers or future customers to buy your product, and how that interacts with the previous two. A business entity is made up of members who are having conversations. How do you build a model that shows you the propensity of conversion, or churn, or upsell? All of this can be derived from various signals, and each channel has different signals. So from day one, our focus has been not just on collecting data, because collecting data is a starting point, but on how you unlock key insights and outcomes from those data and then drive actions based on them.
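
To make those three axes concrete, here is a minimal sketch of how member profiles, activities, and business accounts might relate. This is our own illustrative framing, not Common Room’s actual schema; every name and field below is hypothetical.

```python
# Illustrative only: a toy model of the three intelligence axes Viraj describes.
# Common Room has not published its schema; names and fields are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Activity:                 # one event from any channel
    channel: str                # "github", "slack", "youtube", ...
    kind: str                   # "pull_request", "forum_post", ...
    sentiment: float            # -1.0 (frustrated) .. 1.0 (delighted)

@dataclass
class Member:                   # a 360-degree profile merged across channels
    name: str
    handles: dict[str, str] = field(default_factory=dict)     # channel -> identity
    activities: list[Activity] = field(default_factory=list)

@dataclass
class Account:                  # a business entity made up of members
    domain: str
    members: list[Member] = field(default_factory=list)

    def engagement_signal(self) -> float:
        """Toy roll-up: average sentiment across all member activity."""
        acts = [a for m in self.members for a in m.activities]
        return sum(a.sentiment for a in acts) / len(acts) if acts else 0.0
```

A real propensity model would replace `engagement_signal` with a trained classifier over many such roll-ups, but the member-activity-account shape is the point.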

Soma: I really like how you framed that, Viraj: intelligence on people, intelligence on activities, and intelligence on outcomes. Now, switching gears, how did you decide what approach to take with the models you use? How do you approach training and tuning them? Are you using any of the currently popular foundation models, or are you thinking of building your own? How are you thinking about all of this?

Viraj: Yeah, it’s a combination of both. We have certain layers of intelligence that leverage custom ML models built with a pretty standard tech stack, you know, XGBoost on SageMaker and feature engineering in-house. And then we also leverage some cutting-edge foundation models, for example OpenAI’s Davinci model, which we fine-tune to perform ideally for our use cases and to scale across our production data.

From a custom ML capability perspective, we have a bunch of features built around the ability to auto-merge members and organizations across the various signals we have about them. Common Room integrates with Slack, Twitter, Discord, GitHub, Stack Overflow, LinkedIn, Meetup, and dozens more, and the same person may have different profiles and different conversations on each of these, plus internal systems like Salesforce and HubSpot. So we’ve built custom ML models, trained in-house using some of the technologies I mentioned earlier, that use signals from all of these sources. Then we’ve built out scoring that allows us to say, “Hey, look, with a high degree of confidence, we think this is the same person, regardless of a different name here, a different avatar image there, or a different email address.” So there’s one set of custom machine learning models built for the use cases of merging members and merging organizations.
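
As a hedged illustration of that kind of pairwise identity scoring, the sketch below trains a gradient-boosted classifier on synthetic profile-pair features and only auto-merges above a confidence threshold. The features, labels, and threshold are assumptions for illustration, not Common Room’s actual pipeline.

```python
# Illustrative sketch of identity-resolution scoring with XGBoost.
# Features, labels, and threshold are made up; real systems train on
# human-labeled profile pairs.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)

# Each row compares two profiles from different channels (e.g., Slack vs. GitHub):
# [name similarity, email-domain match, avatar-hash similarity, handle overlap]
X = rng.random((500, 4))
y = (X.mean(axis=1) > 0.6).astype(int)   # toy label: "same person" if similar overall

model = XGBClassifier(n_estimators=50, max_depth=3)
model.fit(X, y)

pair = np.array([[0.92, 1.0, 0.85, 0.70]])         # a likely match
p_same = model.predict_proba(pair)[0, 1]
if p_same > 0.9:                                    # auto-merge only at high confidence
    print(f"auto-merge (confidence {p_same:.2f})")
else:
    print(f"queue for human review (confidence {p_same:.2f})")
```

The design choice worth noting is the threshold: merging two different people is far more damaging than leaving a duplicate, so the cutoff for automatic merges is set high and everything else falls back to review.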

Then there is another use of custom models around propensity. Once you see a community-qualified lead of some sort, either through your CRM or through community activity, how do you build a model that predicts the propensity of certain outcomes? Like, “Hey, this organization is ripe to adopt your technology based on their champion behavior,” or, “Here’s one that’s likely to churn, so please invest some time in making sure they don’t.” Those are two examples where we’ve built in-house models. But where the world is really exciting now is NLP and LLMs: foundation models provide a capability that just didn’t exist until recently, where you can quickly extract sentiment or conversational topics that aren’t just keyword matches, or categorize conversations: “Here’s a feature request, here’s somebody asking for support, here’s somebody complimenting your product, maybe you want to use that in marketing material.” This is where we use foundation models from companies like Amazon and OpenAI. But to scale them for production use cases, we have to be able to fine-tune them. OpenAI has fine-tuning capabilities, so we’ve been able to take the Davinci foundation model and fine-tune it for our use case, both as a performance optimization for our specific customer base and as a cost optimization, so that we can apply these models in a scalable way across our entire user base. Without that, it can get really costly. It’s very easy to put on an exciting demo that leverages the hot new foundation model, which is great for a weekend project or with toy data. But the minute you want to scale it to the kind of customer base we have, or beyond, you have to start worrying about the practicalities. Downtime: if these hosted models have downtime, you don’t want to have downtime yourself. Cost: if you simply pass all of that cost through, it’s going to become really expensive for you or for your customers.
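
For readers who want to picture the mechanics: at the time of this conversation, OpenAI’s fine-tuning flow took prompt/completion pairs in a JSONL file and produced a private tuned model. The sketch below uses the v0.x Python SDK of that era; the file name, labels, and prompts are hypothetical, and Common Room’s actual training data and setup are not public.

```python
# Hedged sketch of the legacy OpenAI fine-tuning flow (v0.x Python SDK).
# File names, labels, and prompts are hypothetical.
import openai

openai.api_key = "sk-..."  # your API key

# labeled_messages.jsonl holds one labeled community message per line, e.g.
#   {"prompt": "Can you add dark mode? ->", "completion": " feature_request"}
#   {"prompt": "Love the new autosave! ->", "completion": " praise"}
training = openai.File.create(file=open("labeled_messages.jsonl", "rb"),
                              purpose="fine-tune")

# Kick off a fine-tune on a Davinci-class base model.
job = openai.FineTune.create(training_file=training.id, model="davinci")

# In practice you poll until the job completes and a tuned model is assigned.
job = openai.FineTune.retrieve(id=job.id)
tuned_model = job.fine_tuned_model  # e.g., "davinci:ft-yourorg-..."

# Classify a new community message with the tuned model.
resp = openai.Completion.create(model=tuned_model,
                                prompt="Checkout keeps timing out ->",
                                max_tokens=5, temperature=0)
print(resp.choices[0].text.strip())  # e.g., "support"
```

The cost point Viraj makes follows directly: a short completion over a one-line prompt can be far cheaper per classification than stuffing few-shot examples into every request, since the examples no longer travel with each call.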

The other one, obviously, is precision and recall. A lot of the foundation models are built for general-purpose use cases, and they do a phenomenal job at them. But at the end of the day, your specific use cases are going to be slightly more nuanced, so how do you tune the models so that precision and recall are both even better for your customers? That’s where we’ve spent a bunch of time investing. I know generative AI is the buzzword of the day, and we have some pretty clever ideas there. But even before you go there, there are so many powerful things you can do that don’t need generative capabilities at all, just extracting signal from noise in interesting and meaningful ways. There’s huge opportunity there as well.

Soma: Whenever you talk about OpenAI today, most people immediately jump to GPT-3 or ChatGPT. The fact that you’re not necessarily using those off the shelf, but fine-tuning a model to make it work for what you’re looking for, and doing it in a cost-effective way, is great to hear.

It’ll be interesting to hear how OpenAI is helping startups like yours. There are people who use ChatGPT, and there are people who use GPT-3, and that’s one set of users. For people like you, has OpenAI been helpful?

Viraj: Yeah, absolutely. We’ve been partnering with them since the early days. We’re fortunate to have worked with some of the early OpenAI team, and it’s been really interesting to explore how to take some of these research and exploratory models and help commercialize them. OpenAI has different tiers of models internally; there’s Ada, Davinci, and several others. Each of them has a different cost, different performance characteristics, and different use cases it’s optimal for. We’ve had a pretty open channel with them, trying new things before they’re available to the general public and giving feedback both ways on what’s working and what’s not, on pricing models, etc. So it’s been extremely collaborative since the early days. And part of it is walking the walk: we are OpenAI’s community, along with every other developer out there dabbling in their technology. Making sure they can get feedback at scale is pretty important to them, and I’m glad we’re able to make that happen.

Soma: You did mention cost, and in today’s economic climate, managing the burn rate is super critical for every startup, for every company, for that matter. There is so much hype, buzz, excitement, and craze around generative AI, and everybody’s experimenting with it in one way, shape, or form. The cost can add up pretty quickly before you realize what’s going on. Do you feel like you’re encountering that, or has your approach enabled you to stay ahead of the curve?

Viraj: One of the models we use is 10x cheaper than the off-the-shelf models we could simply pass requests through to. A lot of that is the result of the fine-tuning I mentioned earlier. It helps us not just get higher-quality results than basic prompt design, but also train the models to optimize for our use cases without all of the extra cost. And from a deployability perspective, it brings a lot of the critical dependencies under our control as well. Monitoring cost is super important, especially as you make some of these foundational capabilities available to customers, because for some of the problems we solve, activity can change wildly. If a customer has a conference, you’ll get a week with a huge spike in activity, which will drive a whole bunch of additional cost if the system isn’t built to foresee that event. So if you build a company and deploy to production while simply passing everything down to some foundation model, be it OpenAI’s or anyone else’s, you’re likely in for a surprise when the volume you’re driving varies. That’s where one of our lessons has been to focus on, “Hey, how can we keep our costs under control while still leveraging some of the most exciting capabilities out there?”

Soma: Before we wrap up, Viraj, were there hurdles you ran into getting the company off the ground and to where it is today? And what did you do to get over them, as lessons that might be helpful for other founders coming up behind you?

Viraj: I think there is a level of paralysis that can happen if you try to game-theory out every potential outcome. Even in your product, you could hypothesize until the end of the world about what customers actually want, what they’re saying, and what they’re not saying, but nothing beats shipping product and watching customers use it or not use it. Being comfortable shipping things at extremely high velocity and with high quality is a hard balance. So my advice would be to have strong conviction within the team, not just the founding team but the broader team, around your expectations for what it means to ship, and what it means to ship quality software. You don’t always want to throw stuff over the fence just to say you ship a lot of code, but you also have to have some ability to ship an MVP. So develop a consistent understanding within your team of what is and isn’t acceptable for who you are as a company, and then live that day in, day out. Many companies will say, “Oh, we should embrace failure,” but then they don’t actually embrace failure. Or they’ll say, “Hey, we should ship MVPs,” but when you ship an MVP, they point out the hundred things that are broken. So clearly defining how you want to operate as a company, and then backing it up with how you actually execute, is important. There’s no single answer that works for every company, but each company needs a well-understood definition of how they ship and what they ship.

Soma: That’s awesome. That’s a great answer as people think about getting off the ground and working through their execution environment and, more importantly, their culture. I think these things all come together, and you really need to think about how the different pieces of the puzzle fit as you build and scale a team. So with that, Viraj, I want to say thank you for being with us; it’s been fun talking to you. As much as I’ve been part of the Common Room journey from day one, hearing it, and some of it is re-hearing it, gives me a lot of energy and excitement for who you guys are and what you’re doing. Thank you again for being here.

Viraj: Absolutely. It’s been great so far. I’m looking forward to more fun times ahead.

Coral: Thank you for listening to this week’s episode of Founded & Funded. If you’re interested in learning more about Common Room, please visit commonroom.io. Thank you again for listening, and tune in in a couple of weeks for an IA40 Spotlight Episode of Founded & Funded with the founders of the Acquired Podcast.

Leaf Logistics CEO Anshu Prasad on Applying AI to Freight and Transportation

In this episode of Founded & Funded, partner Aseem Datar talks with Leaf Logistics Co-founder and CEO Anshu Prasad. Leaf is applying AI to the complexities of the freight and transportation industry, connecting shippers, carriers, and partners to better plan, coordinate, and schedule transportation logistics. This enables network efficiencies and unlocks a forward view of tomorrow’s transportation market while simultaneously reducing carbon emissions.

Leaf Logistics was founded in 2017, and Madrona joined Leaf’s $37 million series B in early 2022. From the beginning, Leaf has been fighting the one-load-at-a-time way that trucking has historically been conducted. The company analyzes shipping patterns to make sure that when a truck is unloaded at its destination, another load is located to return to the city of origin. Identifying these patterns allows Leaf to coordinate shipments across shippers at 1000x the efficiency and effectiveness typical in the industry.

Aseem and Anshu dive into the story behind Leaf, what makes logistics so complex, and how AI can continue to improve it. And Anshu offers up great advice for founders that he’s learned on his own journey.

So, I’ll go ahead and hand it over to Aseem to take it away.

This transcript was automatically generated and edited for clarity.

Aseem: Hi, everybody. My name is Aseem Datar, and I’m a partner at Madrona Ventures. And today, I’m really excited because I have here with me Anshu Prasad, who’s the CEO of Leaf Logistics and also the founder. Anshu, welcome, and glad that you’re spending time with us today.

Anshu: Thank you, Aseem. This is great to chat with you.

Aseem: So, you know, Anshu, as they say, it all starts with the customer. Tell us a little bit about your journey and the problem space, what you observed talking to customers in this space, and how you narrowed down the problem you’re solving.

Anshu: I had the benefit of working in the space for some time before starting Leaf. And in that journey, what I got to observe was that something that seemed, when I started in the space, like a winnable game slowly and undeniably became an unwinnable game. It was hurting both the buyers of transportation and the providers of transportation: their ability to sustainably run their businesses, earn healthy returns, and have some reliability on a day-to-day basis. It’s a big part of our economy, as we all appreciate, and if anything, the pandemic shone a bright light on the essential nature of a well-functioning supply chain, and on what happens when it doesn’t function all that well. In the in-between times, between crises, transportation and logistics is something we’d all frankly wish we could ignore, because it just works in the background. But if it doesn’t work for the participants, the customers who are invested in the supply chain, it doesn’t really work.

And over the last couple of decades, it’s become clear that even big, sophisticated companies for whom transportation is a big deal are finding it less reliable and less plannable than they’d like it to be. So that was really the core problem: seeing some very smart, very hardworking people that I had a chance to work alongside and serve struggling with a critical part of their business. And it became clear that it was time for us, and for many of the folks now working at Leaf Logistics who have spent similar amounts of time poking at this problem, to do something different rather than just wade into the same fight, try more of the same, and hope for a different outcome.

Aseem: Anshu, maybe just double-click into this a little bit and give us a sense of the kind of problems both the shippers and the carriers face on a day-to-day basis.

Anshu: At the fundamental level, this is a very transactional industry. A truckload from point A to B is seen as a snowflake. And for anyone outside the industry, it seems really alarming that it would be that way. Because we have all had the experience of driving down the highway and seeing every color of truck you can imagine on the road — how is this being done one load at a time? But that’s really, for a host of reasons, the way that this industry has evolved. So, the core problem as it gets felt by shippers is transportation becomes a one-load-at-a-time execution challenge. And if you’re a big shipper, you might have 500,000 or 600,000 loads a year that you need moved, and you’re treating each of them as an individual OpEX transaction.

And on the flip side, the carriers are responding to a demand signal that is very fleeting. It is, again, just one load at a time. I’m getting a request 48 hours in advance to go pick up a load in LA and then drop it off the next day in Phoenix. But I don’t know anything else beyond that. I don’t know what I’m supposed to do once I unload in Phoenix. Where is that next load going to come from? And I’m supposed to get up and start to play this game again, one load at a time, tomorrow. So, the ability to keep my truck utilized or my driver paid, maybe even return the driver back home, which is very important for the driver, becomes really hard for me to manage. So it becomes a constant challenge of trying to catch up with the transactional intensity but not really solving the traveling salesman problem. We think that should be solvable, but it’s not really what the data on the table allow us to do.

Aseem: Yeah, it’s just fascinating to understand and learn, and, as we’ve worked together, to get educated every day on the complexity of this industry. I’ve been curious about this for quite some time: how did your background, your consulting mindset, set you up for tackling this huge problem and ultimately achieving success in this industry?

Anshu: When I entered the startup world in the late ’90s, the flavor of the month was applying technology to old problems. And we lucked upon an area, freight buying, that is a big deal for many companies in CPG, for example. We helped them buy their freight using basic technology that automated a process they’d been running for decades with floppy disks, in web 1.0 kinds of ways. That streamlined some of the basic procurement processes, but it also gave us an appreciation for how central this purchasing decision is to their core business operations. At the end of the day, everyone obsesses over their customer, and transportation is often the last point of interface with your customer, yet we were buying it as if it were a free-for-all transactional auction. What was getting lost was the customer engagement, the customer entanglement, that comes from a well-serviced, well-structured supply chain, replaced by something very much ephemeral.

The way I landed in the space was a little bit by circumstance and happenstance. We ended up helping companies like Unilever or Bristol Myers negotiate their transportation rates. But what really drew me in was working with the people who had to do this work on a day-in, day-out basis. Empathize with them for a moment: you come to your desk at a Coca-Cola every day, you’ve got a stack of shipments that need to get covered, you work your way through that stack as much as you can, you go home, and you come back tomorrow to exactly the same Groundhog Day problem. The only thing that shifts is outside your control, i.e., what the market is doing. If you’re a company like Coca-Cola, you’ve hedged your exposure to things like aluminum prices or sugar or high-fructose corn syrup, and yet the second- or third-largest cost in your business, your transportation and logistics, is a bit of a guess and a gamble, and it shouldn’t be. As I got into consulting, I saw this problem over and over again around the world: the unreliability not just of service, but the exposure our core businesses have to this big cost item getting out of control. Last year was a good example. Several shippers who make the top fold of the Wall Street Journal, all of whom do a great job managing their transportation, were subject to the whims of the market and ended up tens of millions of dollars over budget, to the point where their earnings were depressed.

That is something that needs to get solved better with today’s data. The reason I, and many of the folks working at Leaf Logistics, focused our energies on solving for something different is that this is one of the big remaining risks looming in people’s business operations. And I’ve talked a lot about the shippers. If you think about the million or so trucking companies registered in the country, operating at razor-thin margins, the roller coaster of the freight industry hurts them just as badly. So no one is really winning, and people are paying in very specific terms: layoffs and bankruptcies are hitting this industry, with increasing frequency over the last several years. Something should be done differently, as opposed to more of the same.

Aseem: I think the meta takeaway for me is that you have an asymmetric advantage, having spent so much time in this industry and really understanding the business processes like you described. It’s amazing to me that such a big spend has often gone ignored, and you guys are doing a killer job building what I would call smart systems and intelligent applications around it. One question that comes to my mind, Anshu: you’ve been steeped in this industry for quite a while, on the consulting side and the advising side. Starting a business is no small task, especially in an age-old industry like this, where things are often done the way they’ve always been done, and have been for many years. What headwinds did you face in tackling this problem in this industry, like landing your first big customers? Can you tell us a little bit about that journey?

Anshu: There were three areas I really focused on. One was customers: if we built something fundamentally different, would they be willing to take a risk and try it? Because, let’s face it, trucks are moving around the country, and they have been; somehow, people are muscling it through. So would there be a case for change? One test was talking to 50 prospective customers and asking, if we built something, would they be willing to test it? Second was the earned insight we had from so many years poking at this problem. What were we seeing that other people were not, because they were caught up in the day-to-day fray? That insight was, fundamentally, that much of this problem is plannable. If you apply a different analytical approach, you can uncover the plannable bits and, at minimum, take them off the table, so people can focus their creative energies on the stuff that just needs to be triaged through brute force.

And then the third was that you had to put the pieces together. For me personally, the third piece was the most important: finding other people I had worked with, who had seen this problem from different perspectives and angles, who all saw the possibility of solving it differently enough that they would drop their current work and come do this. The personal conviction of individuals I had a ton of respect for, who I knew brought special skills to the table, jumping into the boat and starting to row, gave me more momentum than anything.

So getting that first customer was as much about having built something from a particular understanding, with a set of folks who had special skill sets, as it was about convincing that customer. To be honest, I think some of the early customers said, “I understand what you’re describing. I think you guys will figure it out.” They were betting on us as much as anything, and it was as much a partnership, sniffing through the common problem we saw and iterating on it to solve for a different outcome. We were as invested as the customers were. Those early customers saw as much in the promise of what we were trying to build as we did. They just didn’t have the same sleepless nights.

Aseem: I had the privilege of talking to a few of your customers, and I can say they were not just fans, but raving fans. I remember one comment where one of them said to us, “I think Anshu and the team understand the problem more than we do,” which is a testament to your empathy and to putting yourself in your customers’ shoes. Anshu, was there a moment, when you started talking to those 50 and going deeper into the problem, when you thought, “I’m onto something”? Was there a turning point, or did the conviction just build at a consistent pace?

Anshu: I think what built conviction the most was how quickly we could arrive at a common understanding of the problem. It became very crisp and clear. This past year is a good example: “This past year can never be allowed to happen again,” says the budget holder at a big shipper. That is an acknowledgment that something is fundamentally broken. It may be incredibly complex to solve, but there is a common understanding that there’s a problem here. The alternative, “things are happening, things are getting done, there isn’t a compelling case for change,” would have been a warning sign.

There were a couple of ideas I’d been chatting to folks about, and they all agreed there was some value to be delivered, but it wasn’t clear the problem was compelling enough to go take a risk on. And in this particular case, the risk was: give us data that you’ve never given to anybody else. Asking someone to hand a brand-new startup, still building its technology, data they’ve never given anyone, and to trust us to be good stewards of those data, is a big ask. I was surprised and really encouraged by how many people were willing to part with these data in such a transparent way, because it signaled to me that they appreciated the importance of a potential solution. They didn’t know what that solution was quite yet, but they were invested in working toward one.

Aseem: Great point. I think often, people look at their data and say, hey, this is data that I’ve collected. It’s my crown jewel. And it’s a huge testament to you and the team that customers came to you and said, look, I’ve got all this historical data, but in some senses, I don’t know what to do with it. If we can find a meaningful way to mine that data, not just look at it as flat files, but derive insights, take action, and complete the loop, that’s the holy grail, which I think Leaf Logistics is doing so beautifully.

How are you thinking about building intelligence into your solutions? Tell us a little bit about your vision around the smart applications, the ML/AI-infused things. How are you thinking about next-generation technology as an enabler to solve this unique problem?

Anshu: Yeah, it’s actually an interesting area to apply that branch of analytical thinking and algorithmic decision-making. We apply machine learning to large longitudinal data sets as the starting point for the work that we do. We learn that there are patterns that will hold, and we can plan and schedule freight against those patterns. Just doing that, using some of the technology we’ve built, allows us to coordinate those shipments across shippers at a thousand times the efficiency and effectiveness of the manual way the industry works today. That confers a pretty significant advantage: just planning and scheduling with the benefit of machine learning pointing us to where we know the patterns will hold.
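(To make the pattern-mining idea concrete, here is a minimal sketch in Python. The lane/weekday representation, the sample data, and the coefficient-of-variation threshold are all illustrative assumptions, not Leaf’s actual model; the point is simply separating “plannable” freight from freight that must be triaged.)

```python
from collections import defaultdict
from statistics import mean, pstdev

# Illustrative history of loads: (origin, destination, weekday, volume).
history = [
    ("Dallas", "Atlanta", "Mon", 12), ("Dallas", "Atlanta", "Mon", 11),
    ("Dallas", "Atlanta", "Mon", 13), ("Chicago", "Denver", "Thu", 2),
    ("Chicago", "Denver", "Thu", 9),
]

def plannable_lanes(loads, max_cv=0.2):
    """Flag lane/weekday pairs regular enough to plan and schedule ahead.

    A lane counts as plannable when the coefficient of variation
    (stdev / mean) of its historical volume is below max_cv. This is a
    crude stand-in for pattern detection over longitudinal data.
    """
    by_lane = defaultdict(list)
    for origin, dest, weekday, volume in loads:
        by_lane[(origin, dest, weekday)].append(volume)
    return {
        lane: vols
        for lane, vols in by_lane.items()
        if len(vols) >= 3 and pstdev(vols) / mean(vols) < max_cv
    }

print(plannable_lanes(history))
# {('Dallas', 'Atlanta', 'Mon'): [12, 11, 13]} -> schedule in advance;
# the erratic Chicago-Denver lane stays in the brute-force triage bucket.
```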

Where we’re starting to see decision-making get enhanced is where there are way too many inputs. Even with two or three shippers, the number of decision variables you might need to consider explodes. For example, we’re working with a fleet right now that runs across multiple shippers in eastern Pennsylvania. It keeps 10 trucks and drivers and 32 trailers busy on a continuous basis. At the load-by-load level, that could mean a load is being fit into a standard plan where the pattern identified by machine learning holds over and over again. But it could also mean there’s a load that needs to be taken to Long Island or to Ohio, and you need to be able to solve for that, and the consequences of stretching the fleet to Ohio need to be factored in. That sort of supervised learning based on those different inputs, so the algorithm is smarter the next time an Ohio load pops up on the board, becomes important. We build the technology to think about that because we know those problems exist; we see them in the historical data. So, how can we train the algorithms to do that? To give you an example: optimizing those decisions on a weekly basis, as opposed to the annual basis the industry typically uses, confers anywhere between a 6% and 16% advantage. Just literally taking the learnings from week one and applying them to week two, and from week two to week three, can have that kind of impact. Call it 10% on a $500 million spend, and that is an enormous impact for a company.
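(A toy illustration of why the weekly cadence compounds: if folding each week’s execution data into the next week’s plan shaves even a fraction of a percent off remaining cost, a year of weekly re-planning lands in the 6–16% band Anshu cites. The learning rate below is made up for illustration; it is not Leaf’s number.)

```python
def replan_weekly(weeks, baseline_cost, learning_rate=0.002):
    """Toy comparison of weekly vs. annual re-optimization.

    Each week's execution data nudges the plan, shaving a small
    (invented) fraction off the remaining cost; an annual plan stays
    fixed at the baseline all year.
    """
    cost = baseline_cost
    for _ in range(weeks):
        cost *= (1 - learning_rate)  # fold last week's learnings into this week
    return cost

annual = 500_000_000                 # fixed plan, re-optimized once a year
weekly = replan_weekly(52, annual)   # re-planned every week
print(f"savings: ${annual - weekly:,.0f} ({(annual - weekly) / annual:.1%})")
# ~9.9%: within the 6-16% band quoted above.
```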

What we don’t know is what shape the future working of this industry will take. There are something like 5 million white-collar workers in U.S. logistics. Do we arm them with better decision-making tools, so that for the transactional work they do now, they have better data at their fingertips and can execute smarter decisions? Or do we do what media buying and ad buying have done, where the algorithms take over some of the rote decision-making and execute it, so that the creative brainpower of the humans can be focused on the upstream and downstream decisions that are impacted by transportation? I don’t know which way the industry will evolve, or at what pace, but there are significant opportunities for bringing these technologies into this industry.

Suffice it to say that the millions of man-hours spent doing transactional work will be wrung out of the system. I think, for most people outside the industry, the level of manual intensity that transportation still requires would be alarming. Exactly how it gets wrung out matters, so that people can work on questions like: if I now know the rate from Dallas to these two locations three months in advance, how would I structure my production scheduling and my manufacturing processes differently? You just can’t answer that question today because those data don’t exist. But when they do exist, there are some very interesting problems for humans to spend their energies on versus what the machine or the algorithm can take off their plate.

Aseem: That’s fascinating. I think this is really unique, how you guys are thinking about the problem and bringing the technology of today to solve a very well-known, complex problem.

Anshu: It is fundamentally something we’ve all bashed our heads against for a long enough time. We talk a lot about waste in the industry in terms of empty miles and the emissions associated with them, but there’s also the waste of human capital. Today, if a truck driver in our industry is driving empty to pick up the next load, they don’t get paid; in a lot of cases, they’re paying for diesel out of their own pocket. And then there are, of course, the man-hours wasted: on average, over four hours at pickup and at delivery, loading and unloading. There are so many inefficiencies that we’re all paying the tax for. If we can free up that human capital to work on more interesting, more valuable problems, we’re all going to be better off.

Aseem: I wanted to pop back up for a bit to the 30,000-foot view. How are you thinking about scaling, and what are the challenges in front of you?

Anshu: It’s a very interesting problem, and in some weird ways, because of the complexity of the problem, there are multiple areas to pursue. So, one of the main things for scaling is to continue to have a disciplined focus on the few things that we think will make the most difference to our customers over the next handful of stages of our growth.

That focus and discipline become really important for the management team, which brings me to maybe the most important point. One of the things we’ve been very clear-eyed about is that the team it took to muscle through and get from zero to one may not be, and likely isn’t, the team that scales from where we are to where we’re trying to go. They’re just different skill sets. The obsession with the problem, the ability to iterate and think from first principles, was essential for us to get off the starting block. Now we have to take the pieces of product-market fit and repeatability and drive toward scalability, by looking at patterns and executing against those patterns with discipline. Hiring and upgrading our talent, and challenging each other to make sure we’re not settling for the status quo, have been really important.

Culturally, we have a very transparent and open culture. Many of us have had the opportunity to work with other people on the team in past lives, so there’s built-up trust. And yet, scaling is an interesting word to use in the context of a startup, because human beings don’t scale very well. There are certain things that we do remarkably well, but this is an organizational culture challenge: building something that scales. That might mean that I and others need to give up things we used to get our hands dirty with, to allow other people to pick them up and do a better job. And that is, frankly, a big challenge: hiring for and building the organizational muscle to genuinely scale, as opposed to just doing a few of the things we’ve been successful at a few more times. What’s been interesting is the learning that, almost on an individual level, you can palpably feel people going through. This idea of letting go of something you used to obsess over all night, because somebody else can come pick it up and, within a couple of hours, have a different solution than you, just because they look at the problem, and frankly the world, differently than you, is a growth opportunity for many of us.

Aseem: Well said. So much of it is building the right team: hiring folks who come from different backgrounds, with different points of view, who look at the problem differently, but who are also world-class at what they do. And oftentimes, that’s probably not the existing team, because they have different domain expertise or come from a different stage of a company’s life cycle as you scale fast. Another question on that front: how do you think about repeatability and understanding patterns for what to invest behind? The challenge with scale is often prioritization, because you can’t scale if you’re focused on too many things. Yeah, you can scale horizontally, but that often doesn’t make you best-in-class in certain areas. So, what’s your guidance to founders or companies who are just slightly behind you on prioritization and being maniacally focused on a few core areas?

Anshu: I think that’s one of the toughest things for us to do, honestly: to challenge ourselves to keep applying the same strict filter continuously. Because sometimes we fall in love with our own ideas. One of the challenges for an experienced team like ours is that we have so much familiarity with the problem that we might be a little too close to it, so our hypotheses are sometimes tough to let go of. And similarly, even our most high-conviction customers might not be able to tell you what it is they want next, because when we’re talking about something different, we’re skipping a few logical steps in the solution design. So asking the customer for a set of features might lead us down the wrong path, and really understanding this requires more than a handful of data points. We have an ethos here that zero to one is very hard, and one to 10 gives you data. One to 10 means we’re going to be disciplined about getting enough data points that we don’t double down on skewed perspectives, and we don’t scale until we’ve understood that repeatability. So repeatability and scalability are seen as distinct. Oftentimes, the people who are invested in innovation are not the people to take it to repeatability, and the people in the repeatability motion for the new ideas we germinate with our customers are not the people responsible for, or charged with, scaling them.

And that sort of baton handoff has been helpful, because it’s something we struggled with. Just switching gears as the founding team was very hard to do because, to your point, each idea deserves a ton of scrutiny and attention.

The other lens we apply is our north star metrics: we look at the differential impact each idea has on those metrics. It’s almost a mini P&L, an ROI-based argument per idea. It’s an exercise we go through that is different from the standing reviews of project plans and metrics: stepping back and saying, if I had to draw the line at three, which ideas would be above and below the line, and forcing a debate. People are then debating the data they have and the ideas, as opposed to exercising custodianship over the work they’ve been doing. It’s learning from great companies that will put competing teams to work on the same feature, because they learn so much from diverse teams tackling the same problem independently. We’re trying to borrow from those pages. That means reprioritization is as important as prioritization in the face of new data. It just becomes the way we work, and we’re trying to develop that muscle as we scale, because resetting priorities doesn’t seem to be something big companies do very well.

Aseem: The one thing that I’ve observed in partnering with the team, which is just amazing, is you all think in terms of 10x; you all think in terms of the outsized impact an effort or a project or an idea could have on the metrics. Does it improve them by 5% or 10%, and is it worth doing, or does it have its own leapfrog moment where it has an outsized impact if you fund it and execute on it? That’s a good framework for somebody to have as they think about scale.

Anshu, let’s talk a little bit about your experience and your principles around hiring and adding to the team. What tenets do you keep in mind when you hire people, especially at this stage, in attacking this scale challenge? What roles are you adding? What should founders in your position know and learn from you about how you’re thinking about the right kind of team, and shaping the right kind of, I would say, family, to go after this problem?

Anshu: Yeah, I think family is a great term for it. One fundamental thing is that the needs shift. Early on, one of the most important things we looked for was people who had demonstrated grit, who had to go out and find a way through. That can come in multiple disciplines, but there’s something to be said for finding a way through. The shift between that and the scaling phase is the thing we now spend a lot of time on, even in panel interviews and group discussions with final-round leadership candidates: filtering for the ability to distinguish between the things to pursue and the things to leave behind. That honestly is very hard for the founding, grit-based team, because that grinding team can’t leave any stone unturned; you just keep working. The problem with scaling is you can’t afford to have every detail consume your time, because it gets in the way. So the ability to put the blinders on, and to make sure the blinders get tighter with each iteration, is a skill that people who’ve scaled before seem to demonstrate, and they can prove it to you. We can take our current set of priorities, the ones people working at Leaf Logistics right now are struggling to force-rank, and put them in front of someone who’s had scaling experience, and they’ll ask the right essential questions to distinguish and at least relatively prioritize the items on the list. That clear-eyed perspective at the scaling stage is distinct from the grind-it-out, find-a-way type of person you’re looking for at the early stage.

Aseem: Just following through on that thought. Leading into 2023, Anshu, I know you’re growing, you’re adding to the team. Tell us a little bit about what ‘great’ would look like for you this year.

Anshu: There are three things, and we try to make sure that, at this point, everybody on the team knows what those three things are. The first is that we are seeing the coordination thesis we started with actually play out, and that it’s driving an improvement in our net revenue picture. As we get more scaled, people who haven’t spent as much time as you and your team have, Aseem, understanding what we’re doing and why, should be able to look at the business from the outside and understand the progress being made in pure financial-statement terms. That’s an area of focus: getting those clear financial metrics to jump out of our performance. And that comes through doing some things that are pretty cool and pretty distinct, like being able to build circuits and continuous moves, and even deploy fleets in parts of the geography where others can’t. We’ve put a lot of effort into getting to this point, but to go execute those things and show the impact on the bottom line is job one.

Second, the only way, or at least the best way, we think we’re going to get there is to double down with some of our key customers who are growing very rapidly with us, but where there’s still another gear we can hit together. So account management becomes incredibly important as a discipline, not just to build out further but to enhance. And the amazing thing is that there’s just as much appetite from our customers. We’re finding engagement at so many different levels and across so many different personas that it’s an incredibly intellectually stimulating exercise to find those different perspectives, because to many of these customers, this matters a lot. Just earlier today, we had the CEO and CFO of one of our logistics service providers in the office, specifically talking about their 2023 plans and how much the work we’re doing together could impact that trajectory. That’s the kind of partnership we’re really looking for from an account management perspective.

And then the third thing for us is to make sure that we are prioritizing the 10x moonshots that are coming next. You know, how do we build upon some of the early advantages that we’ve established to continue to do things that other people just don’t have the established foundation to do?

So, we’re really excited about some of the payment- and lending-type solutions that we can bring to market right now, in an economy where those types of solutions are pretty few and far between. This is still a massive industry with huge inefficiencies and a recognized need for change. A lot of innovation needs to be brought here to mitigate the significant waste in the industry. That waste hurts people directly, and all of us indirectly, as the environmental impact of an inefficient supply chain is felt across the economy and the climate. How do we make those investments possible? That’s going to require innovation and the 10x ideas that we’ve been working on: making sure the ideas we’ve germinated see the light of day, but also talking about some of those things to pull the next set of customers and prospects into the journey with us. There’s a fair bit of growth ahead for us this year, Aseem, but we will sacrifice top-line growth for growing with the right people, at the right pace, with the right level of innovation, to set us up for the future potential that we see for the company and for the impact we can have on this industry.

Aseem: One of the things I admire about you and the team, Anshu, is this notion of growing right versus growing in an inflated way. And I love the fact that you’re looking at 2023 by asking: what does today look like, what does one year out look like, and what does the 10-year-out change in this industry look like, and aligning yourself with that. Anshu, it’s amazing to me how the team has come together, how you’re hiring, and how you’re growing. You talked a lot about being so focused and so close to the problem, but who is your sounding board? Who do you go to for advice? Tell us a little bit about that.

Anshu: As I said before, I think people don’t scale, and that goes for founders too. I think a lot about people who can work on the business as opposed to in the business. That’s where our literal board, the folks at Madrona, but also people from outside of Leaf Logistics generally, can look in and tell us what they see. So, I make it a point to start my week with external advisers and to bookend the back half of the week with the same, because at the end of the day, I’m not building this company for anybody aside from the problem. And the problem needs to be solved furiously; just working on it from the inside is insufficient. It needs to translate. So that external perspective is really important, for me personally, in making sure the blinders aren’t on too tight in terms of narrowness of scope. I probably read more widely than I did in the earlier parts of the company’s growth, and I’m always on the lookout for things that are absorbing and bring different perspectives: understanding what’s going on in other fields, speaking with people who’ve done outlandish things in other disciplines, and understanding what models of leadership there are. Entrepreneurs who are further along on the journey are incredibly helpful to learn from, because they have run through some of the roadblocks, and they’re incredibly generous with their advice. So I think those are the three areas. It’s really about making sure that personal growth at least tries to keep pace, because it’s not realistic that any of us evolve that rapidly. But it’s a lot of fun, and it’s a lot more interesting and multifaceted than it might feel on a day-to-day basis.

Aseem: Anshu, any parting thoughts for founders who are new, or first-time founders thinking about walking in your shoes, who are maybe a year or 18 months behind you?

Anshu: I asked that question of founders 18 months ahead of me, and I benefited a lot. So, hopefully, this is of use to some of you. Really, two things. One is to ask for advice and ask people to pressure-test your problem. Funding and fundraising will come with that. Asking for advice will bring you money; asking for money will bring you advice, as somebody told me early on in my journey. That was really helpful. Every single partner we have today, we engaged in conversation outside of, and well before, any fundraising opportunity. It was really about understanding whether we saw the problem the same way, or saw pieces of the problem where we could be helpful to each other, before it was time for fundraising.

The second is that, clearly, there are a lot of exciting things about the fundraising process itself, but if nothing else, it’s a learning opportunity. You’re not just pitching your company; you’re getting an understanding of who else is out there, what perspectives they have, and what you can learn from them. If you really love the problem you’re trying to solve, it’s not about winning the argument or getting your point of view across. As fun as it is to watch Shark Tank, it is not about convincing people to just follow along with your way of thinking. It’s about making your thinking better, so you can solve the problem you came here to solve. And the best thing I can say about working with Madrona and the other key investors, partners, and advisers we have around the table is that they love the problem just as much as we do.

I will get pings, texts, and phone calls with ideas, “why this?” or “why not this?”, as much from people who are not working the problem day to day as from those who are, which gives me a lot of confidence that we have the right team assembling to really solve something that matters over time.

Aseem: I will say that, having known the team for quite a bit now, I have developed a deep appreciation for this industry and the challenges your customers face on a daily basis. I often find myself wondering about the person driving a truck from point A to point B: where are they going? How much load are they carrying? It’s fascinating. If you’re in that job, you’re powering the economy, and yet you have a suboptimal experience as a person doing a very tough job, and I feel nothing but empathy on that front. Anshu, thank you so much for taking the time. We couldn’t be more thankful to have you on this podcast and for sharing your words of wisdom. Good luck in 2023, and we are excited to be on this journey.

Anshu: Thank you, Aseem. Thanks for the continued partnership. It’s going to be an exciting year, but there is much more to do.

Coral: Thank you for joining us for this episode of Founded & Funded. If you’d like to learn more about Leaf Logistics, please visit their website at leaflogistics.com. Thanks again for tuning in, and we’ll be back in a couple of weeks with our next episode of Founded & Funded with Common Room Co-founder and CTO Viraj Mody.


Lexion’s Gaurav Oberoi on Applying AI to Change an Industry


Founded & Funded IA40 Lexion with Gaurav Oberoi

In this week’s IA40 Spotlight Episode of Founded & Funded, Investor Elisa La Cava talks to Lexion Co-founder and CEO Gaurav Oberoi. Lexion was one of the first spinouts from AI2, and Madrona invested in the first round the company raised in 2019. Like many companies that come out of AI2, Lexion is focused on applying AI to change an industry. Specifically, Lexion helps legal teams at enterprises manage contracts, and it is one of the companies that have added features using OpenAI’s GPT technology.

I’ll go ahead and hand it over to Elisa to dive in.

This transcript was automatically generated and edited for clarity.

Elisa: Hello everyone. My name is Elisa. I’m an investor with Madrona Venture Group, and I am excited to have the CEO of one of our portfolio companies on the podcast today. Please meet Gaurav, the CEO of Lexion. Gaurav, welcome to the Founded & Funded podcast.

Gaurav: Thank you Elisa. It’s great to be here. I’m excited to talk to you about Lexion.

Elisa: One of the things that we love to talk about is the founding journey. You’re not a first-time CEO, first of all, and you’re not a first-time founder. I would love to hear some background on what in your career led you to this point of founding Lexion.

Gaurav: Yeah, absolutely. I studied computer science in college. I enjoyed writing code as a teenager, and I knew I wanted to get into technology when I grew up. I moved to Seattle a long time ago, in ’04, to work for Amazon as a junior engineer. It was a really amazing place to learn how to build software at scale, build amazing products, and work with incredible engineering talent. I’ve always had the entrepreneurial bug, and soon after that, I left to bootstrap my first company. In my career, I’ve bootstrapped three companies, exited two, and one still runs itself. This is my first venture-backed company. We launched Lexion in 2019. The year before that, I was at the Allen Institute for Artificial Intelligence as their first entrepreneur in residence, working with their research scientists and team, looking at the new advancements coming out of AI and at ways to commercialize them. This was 2018, and in the years leading up to it, we had seen huge step-function improvements in a lot of machine learning applications: ImageNet and our ability to detect and classify objects in images, and then revolutions happening in NLP. So for me, the question was how to take these innovations, which are real, and bring them into products that actually help people get their jobs done.

I met both of my co-founders, James and Emad, at the Allen Institute. And what has become Lexion today really started with Emad coming in and pushing us to look at some of the new NLP breakthroughs that were happening and apply them to a problem his wife felt at work. She worked in procurement at a very large telco.

Elisa: I’m sure you get this from other aspiring founders and CEOs thinking, “Hey, I have this inkling of an idea; I kind of want a co-founder to join me.” What advice do you give future founders on how to find a great co-founder to build a big idea with?

Gaurav: My advice to folks looking for a co-founder: find people who are complementary to you, who augment your strengths rather than duplicate them. Find people you can have a very honest relationship with right from the get-go; that is central to building a business. And find people you really like and admire, people you can learn from and respect, where you think, wow, I wish I had some of that person’s qualities. Those things, the right strengths to help you build your business, a good trusting relationship, and someone you look up to, actually go a long way.

Elisa: Wow, such deep-seated respect there. It’s just incredible.

Gaurav: It doesn’t hurt that Emad is incredibly accomplished in the field of natural language processing and James is one of those 100x engineers you hear about. So those strengths definitely help, but I also love hanging out with them. They’re wonderful people, and that really helps us get through the difficult challenges of building a business.

Elisa: So you started in mid-2019, and now we’re in early 2023. You’ve been at this for a few years. You’ve gone from your three co-founders to a team of, is it 50? 52?

Gaurav: That 50 number sounds great, except as I learned from my legal team, it comes with all sorts of new rules and regulations we have to adhere to. So there’s also that.

Elisa: Well, maybe you put some of those contracts through Lexion to help you understand everything.

Gaurav: Oh, yes. Absolutely.

Elisa: Well, let’s use that as a perfect segue into talking about how you, Emad, and James, and now your expanded leadership team, including Jessica, have thoughtfully considered how to grow Lexion into an incredible, outsized business over time. What was that philosophy going in together?

You mentioned earlier that Emad is the real deal when it comes to natural language processing and his understanding of ML, and James Baird is an incredible engineer. How did you think about that formation at the very early stages, when you all sat down and began to code? What did you do?

Gaurav: For us, the early days were really about recognizing that there was a huge opportunity in automatically understanding the contents of long-form documents. At the time, you have to remember, the big cloud providers, Microsoft, Amazon, and Google, did have APIs for text parsing, but they were all aimed at short-form text: think a customer review, a support ticket, or a tweet. There was really nothing out there that would help you understand the contents of even a page, let alone a 50-page contract. And none of them dealt with the complexities of the very nasty format many contracts are in, which is PDF, a very poorly standardized format. So that was the opportunity. We knew from the early days that there are myriad applications, and we’ve seen companies over the years expand into many of them: commercial contracts, which we focus on, being one. But there are also companies that focus on case law and what’s happening in the courts, companies that focus on patents, and companies that look at big banking contracts, like the whole business that grew up around the LIBOR scandal and the move away from LIBOR.

So we knew that opportunities abounded, because back then, while those companies didn’t exist in their current forms, there were businesses doing this work with lots of human labor. And for us, if you fast-forward five years in the thought experiment and this NLP technology matures, the world isn’t going to look like that. There’s going to be technology either assisting these humans or removing some of the really low-value tasks they do. So right from the get-go, our hypothesis was: can we build a system where you can come to us with a pile of documents, like Wilson Sonsini did in the early days for a proof of concept? They came to us with a pile of venture financing documents and said, “Hey guys, if you can quickly design models that will go through term sheets, certificates of incorporation, voting agreements, and stock purchase agreements and spit out answers like: what are the liquidation preferences, what are the pari-passu rights, what’s the pre- and post-money? That could be really interesting, because we spend considerable time extracting this information when a deal is done.”

Elisa: And let’s pause for a second. We’re talking about thousands of pages of documents that a lawyer would normally need to review, getting an eye on each page, going into different clauses, finding out what the terms are, tracking them down, and organizing it all in one place, before getting to the part that is actually their job: what do you do about it, and what are the next steps?

Gaurav: That’s right. And with our prototype back then, our goal was: how can we quickly go from “here’s a pile of documents” to very high-quality AI models that can extract this information reliably, plus a UI that helps the user gain confidence and identify any issues? Because I think those two things go hand in hand. So that’s where we started; it was the crux of our technology, and we knew we had something, because Wilson Sonsini, based on that early proof of concept, entered into a commercial relationship with us to do this processing for their knowledge management teams, and then also became an investor. David Wong, their chief innovation officer, still sits on our board and has been an enormous help. The other cool thing to recognize is that in 2018, a groundbreaking paper came out of AI2: the ELMo paper. Soon after that, Google built on top of it and cheekily called its model BERT. BERT was one of the early large language models, and it made representations of language, embeddings, much richer, so you could get more value out of them when you fed them into your neural nets.

And so that was one of the innovations. The other innovation we were highly inspired by was the DeepDive project, which eventually became the Snorkel project, and its approach of weak supervision. We combined a variety of these ideas and built the initial technology prototype that got us to launch the company and raise a seed round. Those were the early days.
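(For readers unfamiliar with weak supervision, here is a hand-rolled, Snorkel-flavored sketch, not Lexion’s actual stack: several noisy labeling functions vote on whether a paragraph is a liquidation-preference clause, and their aggregated votes become training labels without page-by-page hand annotation.)

```python
import re

# Each labeling function votes LIQUIDATION or ABSTAIN on a paragraph.
LIQUIDATION, ABSTAIN = 1, -1

def lf_keyword(text):
    return LIQUIDATION if "liquidation preference" in text.lower() else ABSTAIN

def lf_multiple(text):
    # Multiples like "1x" or "2.0x" are a strong hint in financing docs.
    return LIQUIDATION if re.search(r"\b\d+(\.\d+)?x\b", text) else ABSTAIN

def lf_heading(text):
    return LIQUIDATION if text.strip().lower().startswith("liquidation") else ABSTAIN

def weak_label(text, lfs=(lf_keyword, lf_multiple, lf_heading)):
    """Aggregate noisy votes; real systems model LF accuracy, this is a crude majority."""
    votes = [lf(text) for lf in lfs]
    return LIQUIDATION if votes.count(LIQUIDATION) >= 2 else ABSTAIN

para = ("Liquidation Preference. Holders of Series A shall receive 1x the "
        "Original Purchase Price prior to any distribution to common.")
print(weak_label(para))  # 1 -> becomes a (noisy) training example
```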

Elisa: And this is exciting, because what you’re talking about is the intelligent repository, which was your very first product. You were testing it, ingesting and training with information from Wilson Sonsini, the law firm. But how did you then move to your sector of customers today, which looks a little different?

Gaurav: Yeah, they’re very different. We did not keep growing into big law; we moved to selling to corporates. There were a few reasons. The biggest one is just market size: there’s a much larger market serving corporates.

The second was the magnitude and urgency of the problem. As we started talking to in-house counsel, we found there are all sorts of challenges they run into with managing their agreements, specifically executed agreements. Things like, “Hey, what are all the vendors coming up for renewal next month?” We don’t know. Or, “Hey, there’s been a data privacy change in California. Which contracts do we need to go amend to address it?” We don’t know. So, all sorts of analytic questions that, one, help with the day-to-day running of the business: finance trying to do revenue recognition, somebody trying to pay a vendor, or, “Hey, is this NDA active or not?” And two, more complex queries that come up periodically, like the ones I described when laws change. So the urgency of the problem was there. We also had insight from Emad’s wife, working in procurement, and other friends looking deeper into the bowels of large enterprises and seeing how those problems manifest there: “Hey, until we structure these contracts and type them into the large procurement systems, for example, we can’t really operate our procurement correctly.”

So a lot of energy and effort goes into people reading contracts and then typing in the handful of things that are in there. It was clear from all this that the market opportunity, the need, and the urgency all exist in corporates, and so we started to make our way toward them. Unlike a lot of other CLM companies, we took a very different route in that we didn’t build the end application, the repository with all its features, first. We instead focused heavily on building our machine learning stack. That meant it took us longer to get to market and required a higher initial capital investment. But it was a very conscious decision on our part, because we recognized that, again, if you look five years into the future, there will be players going after this opportunity, and the ones that differentiate themselves will be the ones that can dramatically reduce the amount of effort human users have to put in.

Elisa: I love what you’re touching on, because one of the ‘aha’ moments over the past few years is the realization that with these in-house counsel, there’s often only one person volleying these contracts back and forth. There’s a whole sales team trying to get their deals done, an ops team supporting them, maybe a customer success or account management team working with existing customers on renewals or upsells, and in a certain way there’s this critical bottleneck with the in-house counsel working through all of these contracts. And so the ‘aha’ is: what tools do these people have at their disposal? Not many …

Gaurav: Email. Spreadsheets.

Elisa: Email and spreadsheets, which is a very natural workflow, and you make it work. But they’re a fairly under-resourced and super-critical part of what we’re now calling the ‘deal ops’ process. So that intentional investment in embedding AI and ML deep into the platform at the very beginning, so that it could create that seamless magic for in-house counsel, is what you see now at your current customers, and it’s paying dividends.

Gaurav: It really is. And you’re absolutely right. We have since evolved the platform to help all of these teams get deals done faster. If you go back to the beginning, we started with “let’s invest heavily in a deep NLP platform,” and we continue to invest in that. On top of it, we built a repository application. As soon as your contract is signed in DocuSign, your general counsel doesn’t have to download the PDF, open it to find out who it was with and when it was signed, rename it to their file convention, and drag it into the right folder in Google Drive. No. Instead, we do all that automatically. We’ll pull it from DocuSign, identify who it was with, and index it away. We’ll give you great reporting and automatically generate alerts and reminders. Boom. So that was our repository. Then, as we continued to grow the business, we started to hear customers say: your repository is amazing; it helps us with post-signature contract management, after the contract is signed. But what about pre-signature? What about everything leading up to signing: generating the contract from a draft or a template, the negotiation and managing all the various redlines and drafts, getting approvals, which is a huge part of the process, and then things like task and status reporting. We started to see a lot of demand from our prospects, and this led to the next evolution of the company, which is where we are now: helping operations teams accelerate deals.
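(A minimal sketch of that post-signature flow, with toy regexes standing in for the trained extraction models; the field names, file convention, and 30-day-month renewal math are all hypothetical, for illustration only.)

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class ContractRecord:
    counterparty: str
    signed_on: date
    renewal: date
    filename: str

reminders = []  # stand-in for the alerts/reminders report

def ingest_signed_contract(pdf_text: str, signed_on: date) -> ContractRecord:
    """Extract, rename, index, remind: the shape of the repository flow."""
    counterparty = re.search(r"between .+? and (.+?)[,.]", pdf_text).group(1)
    term_months = int(re.search(r"term of (\d+) months", pdf_text).group(1))
    renewal = signed_on + timedelta(days=30 * term_months)
    record = ContractRecord(
        counterparty, signed_on, renewal,
        filename=f"{signed_on:%Y-%m-%d}_{counterparty}_MSA.pdf",  # house file convention
    )
    reminders.append((renewal, f"{counterparty} renewal coming up"))
    return record

doc = "This Agreement is made between Acme Corp and Globex Inc, for a term of 12 months."
print(ingest_signed_contract(doc, date(2023, 1, 15)))
```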

At the time, I was both excited to go after this opportunity and a little hesitant. The excitement came from natural market pull. When you hear something from your customers, listen to them, and we kept hearing it again and again; there were people ready to open their wallets for this. On the other hand, it’s a very competitive space. There are other contract management and workflow automation companies that try to help with a variety of these areas. So before we jumped in, we spent a lot of time speaking with over 50 in-house counsel: current customers, users of competitor products, and prospects, to understand what was working and what wasn’t. And we heard one thing very consistently: a lot of these CLM projects fail in the implementation phase. When you look at the effort companies put into this, it’s not just the license fee you pay the vendor; it’s the training time, it’s the change management of it all. And if you don’t realize the benefits, it’s very taxing on the organization, because everyone still has to get their daily jobs done. That’s when it clicked. We said, “You know what? The way we can really make a product that addresses change management and adoption is to focus on making something the rest of the business can use out of the box.”

Elisa: And that’s an incredible learning. How do you iterate on feedback from an early set of customers or design partners who are helping you understand which features are most valuable? How much pull are you hearing, and how much repetition, about certain things they want to be able to do moving forward?

Just hearing you talk about that now, what we can share with other founders is: listen to your customers. Come in with a point of view on the workflows, not just of the team you’re helping, but of the surrounding teams they touch and how they get their jobs done. Then find the different ways you can remove friction in that process. That’s what makes a software solution something the core users will say “over my dead body are you taking this away, because it’s so critical to helping me get my job done” about.

Gaurav: That’s right. And it is common advice: listen to your users and iterate based on them. There are a few things we did that helped us get there. One, very early on we recognized that this is a very specialized group of people, in-house counsel. And so we were very lucky to bring Jessica Nguyen on board. She has had a career spanning lots of different scales of business, working in-house from Avalara in the early days, to Microsoft, a large tech company where she got to meet Emad, to being general counsel at PayScale. She has kind of seen it all, and she was really the ideal person because she understood the problem, she has a lot of respect in the community, and she’s an incredibly charismatic person. She helped us in so many ways: really understanding what problems customers face, what is useful to them, and what is fluff; how to message so that people listen to you and aren’t just turned off with “oh, you said AI, and I don’t really know what that means, and it sounds buzzy,” versus “we’ll automatically file your contracts for you and give you a reminders report.” Okay, now you’re talking value. From those kinds of things, to testing product features, all the way to helping us bring in customers and run webinars, she really helped us understand all of that.

The other thing that has really helped us is that, right from the beginning, our culture has been very customer-obsessed. One of the core traits in our culture is to always ask why. I think it really shows when we talk to customers and they tell us, “I really need help with pre-signature.” Why do you need help with pre-signature? And then we’d start to hear stories like: I’m a team of three, me and two other counsel, and we’re supporting 60 salespeople. Our company is growing; next year we’re going to grow to a hundred salespeople, and I’m not getting any headcount, so I need help.

Why? Why do you need help? What do you do? Oh, all these contracts come in. Some of them are bespoke, and we have to negotiate them. Others, I feel like we don’t need to touch, and I want to automate them. How do we do that? And then we start to get the picture, and we start to hear other stories. Oh, at the end of the quarter when we’re doing a deal rip, we’re going through our 40 deals, and these 11 just say “legal.” What the hell does that mean? And I tell my AEs to email them a week in advance to get an update, but it takes days to get one. So I’ll ask the counsel: why does it take days? Because there are three of us, it’s all in our mailboxes, and we have to update a spreadsheet. And you begin to understand that these teams are trying to do the best they can with the tools they have. That has been our natural evolution of being closely embedded with these teams and listening to them: we’ve gone from a core NLP technology prototype, which is still at the heart of our machine learning infrastructure, to an intelligent repository for your contracts, to an end-to-end workflow automation platform for these operations teams.

Because here’s what we also learned when we went back to counsel and asked about those 11 deals that just say “legal.” They would say, “Dude, no, only three of them are in legal. Of the others, a few are in IT security review, finance is looking at some because of the commercial terms, and two are sitting with the CEO, the very person complaining about it; it’s in his mailbox, and we need his approval.” When you really dig in, we began to realize: wow, this isn’t just about the legal team. To get deals done in a company, whether you’re closing a customer deal, onboarding a vendor, onboarding an employee, or doing an RFP or a security review, you’re going to touch a lot of the same teams.

You’re going to touch sales and revenue ops, or procurement. You’re going to touch IT, finance, legal, and maybe HR if it’s an employment matter. And in all these cases, the activities are similar. You might be generating something from a template, or negotiating something back and forth with redlines, and at some point you’ll need to get a bunch of approvals. As soon as you close a customer order, finance is going to say, “Hey, where is it? I need to send an invoice.” And that invoice data is in the contract; it doesn’t go into your CRM. So…

Elisa: Right. And by the way, you’re going to be renewing said customer a year down the road, and they might get a new contract, and the process starts over. To put a finer point on one of the things you just said around the evolution of the product: one of the beautiful mixes of art and science in early-stage company building is deciding what features to build and in what order. You mentioned deciding one day as a team to add pre-signature work into Lexion. Based on what I remember, pre-sig was at least a few quarters out on the roadmap at that time. It wasn’t an imminent build scenario; you had a lot of other high-priority items the team was working on. I would love to hear how you decided that certain capabilities, in this case pre-sig, needed to be prioritized so highly that they got pulled forward. How does that decision-making process work for you?

Gaurav: As you know, these decisions don’t happen overnight. There’s some undercurrent of “we’re hearing this from customers,” and we discuss it in a meeting and say, “Hey, we should definitely do it in the second half of next year.” That’s how it gets into a later part of the roadmap, and you start to build a thesis on why it’s important. As for what can cause it to tip over into “actually, we need to accelerate the priority on this,” one thing I’ve found really useful is board meetings. In bootstrapped companies, you don’t always do board meetings; you’re not going to convene just to hang out and chat with your founders. But one thing I found very helpful about our board meetings is that they’re a time to pause, reflect on the business, and look at some of the data, because day-to-day you’re very involved in getting it done: shipping product, closing customers, and helping existing customers succeed.

These board meetings allow us to reflect, and one of the things we’d look at is the wins and losses in our pipeline. As we started to look at the reasons, we increasingly heard comments like: we love your repository, but we wish you also had pre-signature workflows. And when we talked to some of our customers who had purchased our repository but used another product for workflow, they expressed the same thing. They said, yeah, we would love to have one platform; we only have two because neither has everything we need. So there was a huge amount of signal in our active customer pipeline telling us that if we made this change now, we would see different results a quarter from now, because our salespeople would be able to close the customers we were having to turn away. In this case, the acceleration was a combination of that customer demand and having built up a thesis through customer research, a point of view on: if we were to execute this tomorrow, what would it look like? How are we going to do something that really stands out and gives us a moat, as opposed to just copying what other products are doing?

Elisa: What you’re describing is what is classically referred to as reinforcement of strong product-market fit. You are hearing pull from your existing customers saying, we love what we have, and we want more. And so this is an amazing moment where, as a founder, you probably said to yourself, “Hey, we’ve got something. We’re rocking. We have momentum. Let’s keep going.” We need to pull forward these really critical…

Gaurav: This is what you wait for. This is what you hope for in Startupland: when a bunch of customers with a real problem tell you, “this is a problem. If you solve it, it’s really worth it to both of us.”

We had another such moment earlier this year, when one of the things we started to hear from customers was, “Hey, look, you do have ways to gather approvals in Lexion, but we need more sophisticated automated approvals.” We again talked to our customers and looked at what competitors were doing, and this was another case where it was clear we needed to do it. But Emad and Chris, who runs product, came from Smartsheet, and is an incredible product manager, both said there was a missed opportunity if we just went and built a rote approval chain like a lot of the other products on the market. The real opportunity was to build a generic workflow automation tool. Because if you think about it, you could say: if this deal is greater than 50 grand, then seek approval from finance. But what if the action were to seek approval from finance, and then update the Slack channel saying, “Hey, finance, we’ve sent you a ticket; if someone can jump on this, it’ll be helpful”? Or, once finance approves, why not go file a Jira ticket for the security review, because that team works in Jira and that’s where they want to be? So the realization became: if we do this right, we can build an incredibly flexible tool that allows these back-end operations teams to stitch together the systems they use and express even more complex chains automatically, all using the same UI and the same infrastructure. It’s really triggers and actions. What’s the trigger? Hey, did you submit this ticket? And then what’s the action? Ask for approval, add a follower so they’re in the loop, set an owner so they work on it, update a Jira ticket, and so on.
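(Here is a minimal sketch of that triggers-and-actions idea in Python. The rule shapes and action names, notify_slack, file_jira, and so on, are illustrative guesses, not Lexion’s API: each rule pairs a predicate over a deal with a list of actions to run.)

```python
# Actions: stand-ins for real Slack/Jira/approval integrations.
def notify_slack(deal):
    print(f"#finance: approval ticket sent for {deal['name']}")

def request_approval(team):
    def action(deal):
        print(f"approval requested from {team} for {deal['name']}")
    return action

def file_jira(project):
    def action(deal):
        print(f"{project} ticket filed: security review for {deal['name']}")
    return action

RULES = [
    # trigger (predicate over the deal)      -> actions, run in order
    (lambda d: d["value"] > 50_000,           [request_approval("finance"), notify_slack]),
    (lambda d: d.get("finance_approved"),     [file_jira("SEC")]),
]

def run_rules(deal):
    for trigger, actions in RULES:
        if trigger(deal):
            for act in actions:
                act(deal)

run_rules({"name": "Acme renewal", "value": 80_000, "finance_approved": True})
```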

And so this was a case where we knew that if we raced to build this one feature, we would close more deals in our pipeline and have more upsell opportunities with our customers. But we made the conscious decision, much like our investments in ML, to step back and look at what these teams are going to be doing on our platform. If we pause and build a richer infrastructure, we’ll be able to deliver a lot more value.

Elisa: Having the thoughtfulness to understand that customer journey, that user workflow, and this goes for any company, is the most critical part of deciding how to build the infrastructure of your platform today, so that you have option value in the areas you already have conviction will become critical capabilities in the future.

Gaurav: Yep. I have to say this is possible because of a few amazing things that happen at Lexion. One, Emad has built an incredible engineering org. We have a very small engineering organization for the surface area of product that we support, and Emad has been very methodical in how he’s built up the team. The other is making sure that everybody in the company, down to the engineer building the feature, is very aware of why we’re doing something, so it doesn’t feel like there’s whiplash.

In fact, we follow a great road-mapping process. Chris, again, is just an amazing product manager. I love going to Chris and saying, “Chris, oh my God, we really need to build this thing. I really think it’s time.” And he says, “Yeah, I agree with you. What are we going to cut? Let’s look at the list.”

Elisa: Ruthless prioritization!

Gaurav: And sometimes I walk away thinking, oh no, you’re right. Actually, we’ve already done this exercise. So we really have an amazing team.

Elisa: Everyone says they have an amazing team. Everyone says, “Oh, my CTO is great. We have AI built in at the core.” But in the case of Lexion, the power of the customer love you receive echoes and reverberates across the investor community, where we perk up our ears and say, hey, what’s going on with Lexion? And Lexion was a winner this year of the Intelligent Applications 40, an award that we run at Madrona, where we ask 50 of the top VCs across the U.S. to vote for the companies they believe are building the most intelligence into their products. Lexion was a winner in the IA40 this year in the early-stage company category, which is incredible.

Gaurav: We were so grateful for the recognition.

Elisa: Yes. It’s amazing.

Gaurav: It’s been a really good year for us, and it feels wonderful to get this recognition, given the amount of effort and investment we’ve put into our AI. One of the things we’ve learned over the years: someone comes to you and says, I have 5,000, 10,000, 20,000 contracts that have been sitting in my SharePoint folder or Google Drive, and we really need to understand what’s in them, because we want them loaded into Salesforce and Coupa, we want to understand renewals, and so on. We need your help. From the beginning, when we would ingest these contracts, we knew the algorithms aren’t perfect. Depending on the model, you could have 95% accuracy, or it could be a little lower, so you’re going to have some errors. And right from the beginning, we knew there should be a human in the loop before we hand results off to the customer.

Now, when we look at the competition out there, we’ve seen companies that have themselves gone out and said, oh, we have lower gross margins because we have a large team doing the human-in-the-loop part. To me, that really means their algorithms are not doing the amount of work you’d want them to do; they’re not working as well as they should. For us, I remember early board meetings where, as our volume started to grow and customers started to come in, even we discussed, “Hey, should we look at augmenting our labor by partnering with larger offshore teams?” We made a very conscious decision and said we don’t want to create a tech-enabled services business. On the contrary, we should keep higher-skilled people on our team full-time, but far fewer of them, and we’ve reduced that team’s size over the years even as volume has gone up. Instead, our whole process is: if somebody comes in with a bunch of contracts and there are errors, we’re not going to fix any of them manually. There should be close collaboration with the ML team, and we should build the right tooling so you can click to retrain the model, rerun it on our validation set to make sure it hasn’t regressed, and then roll it out to the customer and improve the customer’s results. There are significant investments there. We’ve built our own annotation tool, our own model training, model versioning, and staged deployment, and it has really paid off, because it’s allowed us to accelerate how we improve these models, accelerate delivering new models, and serve a higher volume of customers with more contracts at an incredible price, because we’re using technology to solve the problem, not people.
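(A sketch of the shape of that click-to-retrain loop, with stand-in train/evaluate/deploy callables rather than any real tooling; the regression gate against a frozen validation set is the essential part, and the threshold below is invented.)

```python
def retrain_and_roll_out(model, new_examples, validation_set,
                         train, evaluate, deploy, min_f1=0.95):
    """Fold corrected labels into a candidate model; ship only if it
    still clears the bar on the frozen validation set."""
    candidate = train(model, new_examples)       # retrain with the fixes
    score = evaluate(candidate, validation_set)  # rerun the validation set
    if score >= min_f1:
        deploy(candidate, stage="customer")      # staged rollout
        return candidate
    return model                                 # regression: keep old model live

# Toy usage with lambda stand-ins for the real training infrastructure.
live = retrain_and_roll_out(
    model="v1", new_examples=["corrected label"], validation_set=["held-out docs"],
    train=lambda m, ex: "v2", evaluate=lambda m, v: 0.96,
    deploy=lambda m, stage: print(f"deployed {m} to {stage}"),
)
print(live)  # v2
```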

The other thing we do is we stay abreast of the latest research. We’ve been lucky to hire an incredible machine learning team. And with them, they’ve brought a culture of paper reading, of wanting to contribute to academia. Along with this, when GPT-3’s API was first released in 2021, we started to experiment and said, well, first we need to understand how these models perform at the tasks we already do. Can they extract parties and dates and summarize clauses better than we do, faster, and at a better price? And, you know, we were quickly able to identify that, not really, there were still a lot of gaps to get there. But we also asked what new things they can do and how those map to customer value. Even back then we knew that eventually we would want to get into helping teams negotiate the actual contract itself or complete a security questionnaire.
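A head-to-head of that shape can be tiny. The sketch below is hypothetical (the field names and data shapes are invented, and the extractors are passed in as plain functions rather than real models), but it shows the comparison Gaurav describes: score the incumbent pipeline and an LLM-based extractor on the same gold-labeled contracts, field by field:

```python
def score(extract_fn, labeled_docs):
    """Exact-match accuracy per field. extract_fn(text) returns a dict of
    fields, e.g. {"party": ..., "effective_date": ...}; labeled_docs is a
    list of (text, gold_fields) pairs."""
    totals, hits = {}, {}
    for text, gold in labeled_docs:
        pred = extract_fn(text)
        for field, value in gold.items():
            totals[field] = totals.get(field, 0) + 1
            if pred.get(field) == value:
                hits[field] = hits.get(field, 0) + 1
    return {f: hits.get(f, 0) / totals[f] for f in totals}

# Usage sketch: run both systems over the same gold set, then compare
# accuracy field by field (alongside latency and cost) before deciding
# whether the new model clears the bar.
# incumbent_scores = score(incumbent_extract, labeled_docs)
# llm_scores = score(llm_extract, labeled_docs)
```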

Elisa: Everyone’s talking about ChatGPT and foundation models. What does this mean? And you’re saying, Hey, we were testing this a year ago. But the great news is what that evolution meant in terms of new magic in the product.

Gaurav: Yep. Absolutely. What it’s meant for us is that earlier this year we released a Word plugin. This was in preparation for releasing AI features alongside your contract-editing experience. And even there, we talked a lot about whether we should build an in-browser editing experience or do it in Word, which is where a lot of our customers live. And that’s where we ended up going. The initial release was really about just getting the canvas out there and providing some value. So, you can just click a button in Lexion, the document opens in Word, you can edit it there and hit save, and we’ll do version management. It’ll go back into Lexion without you having to upload files and rename files and all that. We have a roadmap on what we want to release, starting with being able to identify clauses and say, “Hey, here are clauses from your playbook or from your prior contracts that you may want,” and going all the way to automatically looking at a contract and saying, “Here are the high-risk and low-risk areas.”

When DaVinci 3 (OpenAI’s text-davinci-003) was released in late ’22, we continued to experiment and said, “Hey, let’s go see how this performs against some of the tasks that we’re giving it.” And we saw remarkable improvements in generating contract clauses and in proposing edits. We also have several in-house lawyers — Jessica is just one of them. Krysta, who runs BizOps, has lots of experience negotiating deals and contracts. Staci on sales enablement used to be a lawyer. We have a lot of people with experience that we can test these ideas with. And when we started showing it to them, the ‘wows’ were immediate, and we were like, okay, this is great. Let’s start showing it to some customers in this prototype form. And again, we started to see, oh wow, this is very helpful. And that’s when we knew we should really move around a bit of our roadmap and accelerate getting this into the hands of customers.

And so we rolled out AI Contract Assist in December. It is a Word plugin that helps you generate new clauses or edit existing ones. You can do things like highlight a big old payment clause and in colloquial language say, “Hey, can we modify this to be net 90 and quarterly billing, and I don’t want any payment fees.”

Hit enter. Literally, you can type that, and we will show you redlines inline that you can then accept or reject or ask for another suggestion. And this is just the beginning, because we have your entire team’s workflow. We have the email that came in: “Hey, the customer wants quarterly billing and net 90, are we cool with this?” We have all your historical redlines because we manage drafts, and we have the final executed contract. We have a wealth of data to train these models. This is a very exciting area of research, but also exciting because we think this is going to turn into products very quickly and start adding value.
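Mechanically, that flow is an instruction-driven rewrite plus a diff. The sketch below is a hypothetical illustration of the shape only: the rewrite step is stubbed out as a plain function (in a real system it would be a language-model call), and difflib is used to derive word-level redlines the user can accept or reject. Only difflib is real; every other name is invented:

```python
import difflib

def redline(original: str, revised: str):
    """Yield word-level edits between two clause versions as
    ("delete", word) / ("insert", word) tuples."""
    for token in difflib.ndiff(original.split(), revised.split()):
        if token.startswith("- "):
            yield ("delete", token[2:])
        elif token.startswith("+ "):
            yield ("insert", token[2:])

def suggest_edit(clause: str, instruction: str, rewrite_fn):
    """rewrite_fn(clause, instruction) -> revised clause text; in a real
    system this would be a large-language-model call."""
    revised = rewrite_fn(clause, instruction)
    return revised, list(redline(clause, revised))

# e.g. suggest_edit(payment_clause,
#                   "net 90, quarterly billing, no payment fees",
#                   rewrite_fn=my_llm_rewrite)   # user accepts/rejects inline
```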

Elisa: The magic of what you’re talking about is incredible. I know this has been a hot item since you’ve shown it to some of your customers, and we’re very excited for the year ahead. We’ll pause there. One of the things we like to end on is a couple of lightning round questions. So to wrap things up, the first lightning round question is: what do you think will be the greatest source of technological disruption?

Gaurav: It’s probably already passé to say GPT X or N. I don’t know. There are so many exciting and interesting things happening in the world right now. It’s hard to say which is going to be the most disruptive. Some of the things that I look at: we’re going to see AI become pervasive in applications. I’d say this is a foregone conclusion and not something anybody is surprised by. But I think we will also see changes to how we consume and produce energy over the next 10 to 20 years and how we think about long supply chains versus shorter ones. And I think there’s going to be a lot of opportunity in how we think about tackling climate change using energy-consuming products.

Elisa: Next question. Aside from your own, what startup or company are you most excited about in the broader intelligent application space and why?

Gaurav: I’m inspired by GitHub Copilot. It’s a remarkable addition to a developer’s tool set. I think it’s overblown to say that Copilot will start writing entire blocks of code. Having been a developer and enjoying writing code for fun, I see the benefit of removing a lot of boilerplate, making it easier to work with new libraries, and being able to jump in and produce something a little bit faster. And I think it’s early innings for Copilot. I think it’s going to be quite a transformative product for GitHub, but time will tell.

Elisa: All right. Next one is: what is the most important lesson, likely something you may wish you did better, that you’ve learned over your startup journey?

Gaurav: I think energy and emotion management is closely tied to sleep and diet and exercise. I find that if I’m having a bad day or, you know, a tough day, I’m able to deal with it a lot better if I’ve just taken care of myself and done those things. And it’s a lot harder if I haven’t. I’ve generally not been very good at listening to my body and taking care of those things, but making a conscious effort really pays huge dividends in my ability to have quality output in a given day. I know it’s such obvious advice, but I think a lot of people don’t really take care of themselves in the way they need to. And when you’re in a high-performance, dare I say, sport like running a startup, you need to be in good physical and mental shape. And it is just as important to take care of these things as it is any aspect of your business or your team or your family.

Elisa: So important, because it never really slows down, so you just have to make space for yourself. Okay, last question. What’s something you’re watching or reading right now?

Gaurav: I am a science fiction junkie, so I just finished reading “Termination Shock” by Neal Stephenson, and now I’m reading “The Rise and Fall of D.O.D.O.,” another of his books. I’m a big fan of Neal Stephenson. He’s also a Seattleite, which I love.

Elisa: All right. Well, Gaurav, thank you so much for this amazing conversation. It’s been such a pleasure to work together the last few years and to chat today, and I’m so excited for the direction Lexion is headed and this big opportunity. So, thank you.

Gaurav: Thank you for having me. This has been really fun.

Coral: Thank you for listening to this week’s episode of Founded & Funded. To learn more about Lexion, please visit Lexion.AI – that’s L-E-X-I-O-N.AI. If you’re interested in learning more about the IA40, please visit IA40.com. Thanks again for joining us and come back in a couple of weeks for the next episode of Founded & Funded.


Magnify’s Josh Crossman on Incubating a Startup and Bringing AI to the Customer Lifecycle

Welcome to Founded & Funded. My name is Coral Garnick Ducken — I’m the digital editor here at Madrona. Today, investor Elisa La Cava is talking with Josh Crossman, CEO of Magnify, which was incubated at Madrona Venture Labs. Josh was actually recruited by Madrona Venture Labs to help launch a business that would bring AI to the customer lifecycle. He was quickly persuaded and signed on in 2021, spinning out of MVL in less than three months with a $6 million seed round.

In today’s episode, Josh and Elisa discuss the incubation process and how Magnify’s intelligent application to improve customer retention, expansion, and adoption is solving one of the biggest pain points of the customer experience — focusing on and understanding the needs of every single user. They also dive into finding product-market fit and the importance of incorporating AI/ML into a product from the very beginning. I’ll go ahead and hand it over to Elisa to dive into all this and so much more.

This transcript was automatically generated and edited for clarity.

Elisa: Well, I’m so excited to be here today. I’m Elisa La Cava, an investor on the Madrona team, and I have the distinct pleasure of having a conversation with Joshua Crossman. Josh and I have known each other now for a little over a year, since we invested in Magnify’s seed round. And today’s topic is all around very early-stage company building. Josh, you have been in the trenches, literally going from zero to one, building a company from ideation to launch, and there are so many incredibly meaty topics we can cover, and I’m excited to dive in. But first I just want to give you a chance to introduce yourself.

Josh: Yeah. Thanks Elisa. So I’m the CEO and Co-founder of Magnify, the operating system for the customer lifecycle, and I’m really excited to be here and — let’s dive into it.

Elisa: So the customer life cycle, that’s like a big statement, right? Everything, especially in today’s economy, is about how do we make sure our customers are happy. In software, that means: are they using our product the way that we wanted them to? Can we get more folks to use the product over time? What does it look like years down the road when you want them to continue to be happy, thriving customers, utilizing all of the new products and features that you’re building? Sometimes we call that process post-sales orchestration, which can be a mouthful — how do you think about what that space is?

Josh: Yeah. The whole concept is exactly what you’re talking about. The challenge is that we’ve built these really amazing technologies in the software industry — particularly in B2B software and enterprise software — but most enterprise software companies have real, significant challenges with adoption and retention. And so what Magnify is doing is bringing automation, bringing AI/ML, bringing software to a problem that software itself faces. It automates and personalizes that customer life cycle. And fundamentally, that’s about understanding the needs of the user — what are their challenges, and how do we automate and connect with them in ways that can transform their experience in really delightful ways? It’s a real shift from the way that traditional enterprise software thinks about working with their customers and working with their users. And we think it’s one the industry’s really ready for.

Elisa: Let’s talk about that shift. What is the status quo? Why does enterprise not get this today, and why is this a huge pain point that is worth solving?

Josh: Let’s start with kind of the current state of the state. Let’s say you buy a piece of business software — what typically happens is that you’re assigned a human being called a customer success manager. And this customer success manager is this wonderful human that is going to work with you over the course of your onboarding, your adoption of the product, ongoing usage, retention, your experience. This is the account manager for you that is really centered on making sure that you get value out of your software product. And I’ve run some of the largest and most highly regarded customer success teams in the industry. And I have nothing but love for customer success.

Elisa: Well, and it’s important, right? You are the voice and the face of that company and that software. You are the point person they go to when something goes wrong.

Josh: That’s right.

Elisa: You know, when they have questions about things they can do better. It’s like, this is the one person on your team that is always a go-to.

Josh: That’s right. Sometimes we call it “the one throat to choke” when things are going bad. Usually, we like to say it’s your single point of contact that’s here to make sure you are delighted by our piece of software. And so this human being is doing great work, and we’ll call her “Sally CSM,” and Sally CSM is talking to you every day, maybe every week, or maybe every month, depending on the nature of the engagement. The challenge is that Sally’s only talking to Elisa, and Sally has another 15, 20, 30 accounts that she’s managing. Sally just doesn’t scale. Really, the model in the industry has been relying on that customer success manager working with a single point of contact and then that point of contact, in turn, driving adoption and usage out into the rest of that business, into that enterprise. And Sally never talks to the thousands of users at Elisa’s company. She can’t. She’s a human being with finite time and finite capacity. And that just doesn’t scale. You can’t scale the personalization.

And with the rise of product-led growth, with the rise of user adoption, we think there’s a real change needed in the way that we think about it. We’re still going to have CSMs. We still need that. That’s absolutely essential. And we think that we need to start centering on the user. So much of everything we do in software and enterprise software is centered at the account level. And we have to get to a place where we’re thinking about every single user, no matter the size of the account, and understanding their needs, understanding their behaviors, understanding where they are in their adoption journey, understanding where they’re excited and thrilled and where they’re frustrated, and understanding how we can automate and personalize that experience for each and every user.

Elisa: This is exciting. I love this space. I love what you’re building. Let’s talk about what building a day-one startup feels like. You co-founded this company, taking it literally from day zero to product launch. And you also had an interesting path in how you did that, which I want to talk about: it was through a venture lab. When you started working with Madrona Venture Labs to build the idea for Magnify, what was your goal?

Josh: So, I’ve been in tech for over 20 years now. I’ve been a senior executive at a bunch of different companies, scaled companies, taken companies public, and as a GM I took a company from single-digit millions to almost a hundred million and profitable. The last one we scaled, and then it was acquired by a public software company. So, where I was in my process: after we sold the last company, I took some time off. I sort of poked my head up and said, “What am I going to do?” And I was in conversations with a late-stage data observability company doing some really interesting stuff. They’d just taken down a big round, and they wanted to bring in sort of a COO/president type to help run the company and work with the CEO founder. And I thought, oh, that’s great. It’s a perfect kind of role for me, let’s go do that. And then I get this phone call, and this phone call is from Madrona, and I know some of the partners and folks, and they say, “Hey, we’ve got this really interesting idea that we’d love to talk to you about.”

And I think that’s where the conversation with MVL (Madrona Venture Labs) starts. So, MVL has two models. One model is that you are a CEO/founder type that wants to come in, and you’ve got this concept, you’ve got this passion. And if it aligns with what MVL is thinking about and how they think about the world, they can help bring you along on that journey. There’s a second type of idea, though, and that is ideas that they’ve been thinking about, saying, hey, we want to bring AI to finance, or we want to bring AI to travel. To just take AI for a second: we want to bring AI to the post-sales world. And so Madrona Venture Labs has been thinking hard about this problem, and then what they say is, I think there’s something here. We need to bring in a co-founder to help solve this problem, to help create the idea. We’ve sort of developed it, we’ve tested it with other companies. Depending on where they are in that process, the idea can be relatively far along, and then they’re trying to find someone to help take it all the way.

And that was what happened to me. So they said, “Hey, we’ve got this thesis around bringing AI automation into the customer experience, into the post-sales world. Sounds like it’d be a fit given your background, Josh, why don’t we talk?” And I said, you know, “Tell me about the company.” And they said, “It’s pre-revenue…”

Elisa: It’s not a company, yet.

Josh: It’s not a company yet, it’s an idea.

And I said, oh, no way. That’s crazy talk. You know, that’s a young man’s game. What are you talking about? You know, I’m not that old. And I’ll tell you what, once I talked to them, I absolutely fell in love with the idea, fell in love with MVL and what they’re doing there, and got so excited. And, you know, I think the process took about three weeks.

Elisa: Wow, that’s fast. That’s a quick turn. So you were instantly hooked. So how was that process? Tell me more!

Josh: So, in my case, you know, it moved really quickly. They already had a relatively strong idea, so we took that idea, and, you know, I’d say I put my 25 to 30% spin on it. Like, okay, let’s change this. We’re going to modify the product concept this way. We think if we add this capability or that capability, it could make a real difference. And then MVL really just worked with me. So the way that Madrona Venture Labs works is that they have a set of functional capabilities and functional experts on the team.

So you may have a CTO type; in my case, that was a guy named Keith Rosema. And Keith is amazing and super smart and talented technically. So, they’ve got sort of the engineering side. Then they may have a product side, they may have a marketing and sales and GTM side, they have a design capability and branding, and then some general management experience and all those things. And so in my case, what happened was we took the different functional capabilities and used them as we needed to refine the product concept and get the pitch right. I probably needed a little less of that and needed more help on the product side. I’ve got a lot of product perspectives on what the product needed. But I’m not an engineer. I’m not a VP of product. And so what I really needed was the help: let’s get the mocks right, let’s get the designs right. Let’s talk about the backend architecture. How are we going to think about data stores? That allowed us to accelerate and move on the build right away rather than maybe being gated until I had a technical co-founder, until we had made some hires, until we had a bunch of other stuff. So we could accelerate that process in some really important ways, and that allowed us to get to market much more quickly.

Elisa: Okay. Switching gears — I want to talk about product-market fit. In your case with Magnify, you had an idea, you spun that out, you did fundraising, you found a co-founder, you started building your initial team. You really went from zero to one. Like, you had mock-ups that you were thinking about, and then you were trying to build kind of three products in one to go to market in a really strong way. And that’s a hard and long process. So, I would love to hear what your approach or philosophy has been over the past year as you have gone from a napkin sketch to bits flowing and a product-launching software company.

Josh: First of all, it happened over the course of about a year, but it feels like it’s happened really quickly. Just to help put it in context, I think we started talking with MVL in July of 2021. I came on board in August. We pitched to Madrona as the primary pitch and then to Decibel, the other investor, in September. We funded in October. That was a very quick process. And then you’re kind of like, oh, great, now I’ve got this check for $6 million. You’re kind of like the proverbial dog that caught the car. It’s like, now what? I’ve got this big check. I’ve got to go build a product. Holy smoke.

So, I think a couple of things, at least in our journey, were specific to us. MVL specifically, Madrona Venture Labs, had already talked to a bunch of customers and had a thesis on the market. And so there was a bunch of research that had been done that we could immediately plow into. I’ve led large customer success and customer-facing teams for many, many years now. And so I had a point of view on the market as well. So, in some ways, we could immediately accelerate the process of, okay, what do we want to build? What do we want to design? Because I can say, I want this, not that, I want this, not that. And that’s really dangerous, for me to be like, hey, I know all the answers, because that very quickly becomes a segment of one. But I want to be clear: if you are a founder and have a vision for the product, yes, you need to get external points of view, and also, my counsel is to build what you think you need to build. I don’t think it’s going to be crazy off the mark. If you’re in the space and you know what you’re doing, or you know what you want to build and understand and have empathy for the users of your product, then there’s a lot of real estate that that covers. And so I think that was kind of the first moment, or first few months, of the process: hey, we know we’re going to need this thing, that thing, that thing. Let’s go work on those and design the process.

Now, once we got into that, then you very quickly go and get a set of design partners. So, we talked to a number of folks in my network and other networks and said, hey, I’m going to test the positioning, the concept, with you. And so there was a sort of quasi pitch deck that we built, that we tested with them, and a bunch of questions and some really helpful scripts that Decibel had actually given us around ways to ask questions and do focus groups and things like that. There were some really good things about the product that we learned during that process, and we learned by getting input and counsel from some subject matter experts into the design. Then you start putting mocks in front of your customers. And by mocks, we’re talking stuff in Figma that we designed. They’re actual clickable demos, but there’s no code. It’s all slideware. Here’s one of the things that, you know, you’ll sometimes hear: the “nos” were much more helpful than the “yeses.” People will often say things like, yeah, yeah, yeah. Or, I like that. Because people generally don’t want to offend other people.

Elisa: Said another way, if someone’s saying “yes,” maybe they’re just not engaging. Maybe they’re not into it. So, if you get someone saying “no,” it means some part of them cares enough to give you that feedback, and some part of them thinks the problem is annoying enough to talk about.

Josh: That’s right. And so what would happen is sometimes we’d test a product concept, and I’d be like, oh, this is like the best idea ever. And I remember this one call with this one woman who is a chief customer officer at a fairly large company, and she’s like, I don’t get it. This doesn’t make sense. I’ve already got the solution — it is a solved problem. I’m like, here are the differences. Here’s why this isn’t a solved problem for you. And she’s like, yeah, I’m not compelled. And I remember getting so pissed at the end of that call. I was externalizing my failure to persuade and convince. And, you know, you ask all the why, how, what questions in that conversation, trying to unpack: well, what would make it compelling? How do you solve this problem today? So, we got a ton of data, but I remember the whole time just sort of seething inside, being like, how dare you not believe in my vision? And like…

Elisa: That’s the fire you need. You’re talking about building these mockups in Figma, which is basically a pretty picture for a prospective customer to look at and react to. They can’t really click through to get genuine insights, but it gives them enough context for the conversation that they can give you feedback. And so I can imagine that kind of a conversation with a number of prospective customers kind of put some fire in your belly. Like, how do I translate my belief and conviction in what my product can solve onto a page, and do it in a flow that resonates with my prospective customers?

Josh: That’s right. So after we did, you know, a bunch of these mocks, we did proofs of concept. And this is, you know, not very revolutionary, but we did a lot of stuff manually, some in code, some with a little bit of mechanical turk, like human beings sort of doing the work, to show that yes, indeed, this would work and that we can solve these problems. In one case, we worked with a company that had taken a year to build basically an equivalent product and generate some insights. They had sort of built something homegrown, and we have a product that can replace their homegrown solution. And we were able to do something in the span of a couple of weeks that had taken them almost a year. And that is awesome, when you can do that sort of proof of concept and get somebody excited. And in particular, I’d say a great indicator of product-market fit is when you start running into multiple customers that have built something similar as a homegrown solution. Because that means it’s important enough for them to invest dollars. It’s important enough for them to invest time. They think it’s a real problem. And so even if you can’t displace a homegrown solution immediately, because maybe you’re just not far enough along in your development process or whatever it is, that’s still great validation of product-market fit.

And so anyway, we did a bunch of proofs of concept, got good validation, good fit. And then ultimately we got to the place where we are today, which is releasing the product and getting to minimum viable product. And what we like to talk about a lot is MDP, which is minimum delightful product.

Elisa: You brought up an interesting point, and I’m curious if this has been part of your discovery with your prospective customers: selling and figuring out whether there is space in the market, and budget in the market, for you to build this incredible solution that a homegrown product couldn’t match, for example. And what we see with a number of the other companies we work with is that decision companies face between buy versus build, and how do you make that decision? Engineering resources are scarce. Time is always a really critical factor. And so what it sounds like with that discussion you had is: hey, what took you guys a year to do with part-time resources, we could do in a number of months.

Josh: Weeks, days. You can’t sell me short.

Elisa: Okay. Weeks to days. Plus, even more importantly, what you plan to do and what’s on your roadmap for the year ahead in creating value and helping them effectively drive top-line revenue for the business is a very strong value proposition.

Josh: And I think the way that you get there is you have to demonstrate competency, and I think you have to create transparency and trust. It’s a two-part process, because first, the product has to work, or you have to be able to show that you can actually do something. With all three of the initial customers we did proofs of concept with, in every single instance we had to demonstrate competency and trust: that the product worked, that our ML insights were significant and superior, all of that sort of stuff. And if we didn’t do that, I’m not sure we’d be in the place we are today with the product. Because that pushes you to excellence, it pushes you to build something that’s very compelling. From our point of view, it’s not enough just to meet a minimum bar. It’s like, no, we have to demonstrate real and significant value. Then, at the same time, the whole time I’m in conversation with the executives that are sponsoring projects, saying, hey, I’m going to be really transparent about where we are, here’s what we’d like to do. We’d like the opportunity to earn your business. We’d like the opportunity to partner with you. You know, is that a ridiculous question, for you, a big company, to work with a small company? And then they say, no, it’s not ridiculous. That’s great.

And I say, okay, good. Now what would you need to see from us to feel comfortable doing that? That then also shapes your product roadmap. That shapes your deployment model. It shapes a bunch of different things around what they need to see from you. And I think that becomes really important when you’re moving from zero to one, when you’re moving from some proofs of concept to actual revenue. What are those components that you need to have? And by the way, it might not just be product features. It could be, well, am I going to get a customer success manager to help me, or what are your service level agreements? There are a bunch of things that sit in there that you have to understand, and if you’re open early in that process and you’re clear and transparent about what your priorities are and what their priorities are, you can usually get to a place where that translates to revenue in a relatively short period of time.

Elisa: Okay, one last topic.

Josh: We still haven’t talked about the product, by the way.

Elisa: Okay. Let’s get into the product. Well, this is almost a segue into that, and it’s thinking about how you build intelligence. You effectively have multiple different products that you’re building right now inside of Magnify. And one of the beautiful and magical things is how you build artificial intelligence, machine learning, and deep learning into the data that you’re collecting and pulling to create these incredible insights for your customers. So let’s hear more about what those multiple products are, and then I would love to hear how you are thinking about building an intelligent application.

Josh: So, Magnify automates and personalizes the customer life cycle for every single user. The way to think of it is: it’s taking all of the data in your systems, across all the disparate, disconnected systems, taking that in, generating business insights on it, and then automatically engaging every single user to drive product adoption and satisfaction. And if you can do that, you’re going to grow your revenue, you’re going to improve your retention rates, you’re going to get your upsell, and your customers are going to feel better for it, which is arguably the most important thing. And we do that through three components. The first component is that we have to ingest all this data. So, there’s user behavior from product telemetry and from other systems. There’s sales revenue data, there’s customer success data, there’s marketing data, there’s all this stuff. So, you have to pull all these different disconnected systems into a single repository.

Now, some folks have a CDP, a customer data platform, or other data warehouses that grab all that information, which is great, and that makes our life a lot easier. Most do not. And so, in our experience, we take all that data and we connect and stitch it all together, and we create what we call the lifecycle graph. That graph captures where every single user is in their adoption journey and how that connects to other signals like revenue or retention or churn or growth. Then the second piece of our product applies user insights on top of that. So, we can then say, ah, we have now looked at every single user, we can understand where they are in their adoption journey and what the next best step is for them. So, user 258 is in this place in their adoption journey, and here’s what they need to do next. And that’s different than user 259 and different than user 260, and so on and so forth. And what that allows you to do is actually create a real level of personalization for every single user. That’s fundamentally important.

What matters there is that we can also then say, how does that correlate or connect with things like churn, revenue growth, and trial conversion? So, we can then say, great, here are the risk indicators. Here’s revenue prediction. Here’s how we think about the world both at the account level and at the user level, and with this set of insights, we can literally predict revenue quarters out now for our customers. Then, okay, cool, great that you’ve got these insights, but frankly, insight without execution is kind of worthless. And so what we ultimately do is connect and drive orchestration and execution. Most companies have lots of different digital touchpoint systems. So maybe they have an email system, maybe they’ve got some sort of in-product platform — there’s this laundry list, this Cambrian explosion of tools: Slack, text, chat, all of that stuff that sits out there. And each of those is disconnected. And so what we do is connect via API to every single one of those different systems and then drive the next best touchpoint. So, let’s say user 259 needs to use this feature. The next time they log into the product, that feature pops up. Did they use it? No. Oh, maybe we send them a text reminder. Maybe we send them a Slack reminder. Maybe we open a support ticket because they could only get a third of the way through and they seem stuck somewhere. And we execute and drive each of those actions in those different existing digital touchpoint systems, which allows our customers, the users of Magnify, to ultimately get more value out of their go-to-market investments.
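As a rough mental model of that ingest-insight-orchestrate loop, here is a deliberately simplified sketch. The rules, channel names, and connector interface are all hypothetical stand-ins for illustration; Magnify’s actual lifecycle graph, models, and integrations are far richer than a few if-statements:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class UserState:
    user_id: int
    features_used: set = field(default_factory=set)
    stuck_on: Optional[str] = None   # onboarding step the user abandoned

def next_best_action(user: UserState):
    """Insight layer: pick the next touchpoint for this specific user."""
    if user.stuck_on:
        return ("support_ticket", f"User stuck on {user.stuck_on}")
    if "reporting" not in user.features_used:
        return ("in_product_prompt", "Introduce the reporting feature")
    return ("email", "Share an advanced tips guide")

def orchestrate(users, connectors):
    """Execution layer: route each action to the right downstream system
    (email, Slack, in-product messaging, support) via its API connector."""
    for user in users:
        channel, payload = next_best_action(user)
        connectors[channel].send(user.user_id, payload)
```

The point of the sketch is the separation Josh describes: the insight layer decides per user, and the execution layer fans that decision out to whichever existing touchpoint system fits.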

Elisa: And I love this because it’s personalization and almost magic at scale. Those three product pillars that you’re building inside of Magnify to deliver that, that is the hardest technical challenge and the most beautiful and magical solution when you can deliver it. So that’s the vision for Magnify today.

Josh: That’s right. And so you’d asked the question, like, hey, how does AI connect to this? Here’s what I think some companies do that is the anti-pattern: you can’t just sprinkle some sort of magic pixie dust on something and it will all of a sudden fix these problems. The way that we’ve thought about AI and ML is that it has to be incorporated from the very beginning. We had applied scientists on our team from day one, and they were influential in understanding the product requirements and how you think about that. For instance, we pull in all this data to create the customer lifecycle graph, but you’re pulling data from all these different systems, and each of those data fields has different context. There’s different metadata around it that the AI systems need to understand in order to be able to correlate and compare signals across different systems. If our applied science team isn’t talking to our product and engineering teams about their upstream requirements for data ingest and the work we’re going to have to do, and to our customer success teams about what is needed for customer onboarding and onboarding wizards and things like that, then saying “I’ve just created this AI insight engine” means nothing. All of that needs to be incorporated into the product. So, when you think about AI – AI is delivering value because it’s giving these insights, and it can optimize the journeys, the next best steps, for every user. There’s all the stuff that it can do, but in order for it to do it well, it has to be woven into the very fabric of your product from day one.

Elisa: We should talk about this in a part two because this is exciting and I know that the journey for Magnify is just beginning. But I have a couple of lightning round questions I do want to end on.

Josh: Bring it on.

Elisa: Okay. So the first one, what do you think will be the greatest source of technological disruption in the next five years?

Josh: Hmm, I’m really torn. I’d say the short answer is that generative AI will be a significant source. If you asked me about 15 to 20 years out, I think it’s going to be something in biotech.

Elisa: Second question, what is an important lesson you’ve learned becoming a first-time CEO?

Josh: I think how important it is to constantly reinforce the vision and the messages you’ve already communicated over and over and over.

Elisa: It’s so funny hearing you say that because every single CEO I talk with says the same thing. It’s like half my job is just reinforcing the vision, repeating the vision, making sure that what we are doing and our go-forward plan is ingrained with all of our people. Alignment is huge, especially at these early stages when you’re building your first product. Okay. Finally, what’s one thing you’re watching or reading right now?

Josh: Ooh. I just finished binging “Welcome to Wrexham” with Ryan Reynolds and Rob McElhenney, which is the story of them buying a tiny little football slash soccer team in the middle of nowhere in North Wales and going on this turnaround journey. And neither of them are business types. That’s very clear. And it is gripping television in a fun, lighthearted way. It is a real-life Ted Lasso, if Ted Lasso were the owner of the football team. It’s a great show.

Elisa: Well, Josh, thank you for spending so much time with us. I so thoroughly enjoyed the conversation, love working together, and it’s exciting to get more of the Magnify story out into the world.

Josh: Yep. Fantastic. It was great to see you, Elisa, and I hope you have a wonderful next few days.

Coral: Thank you for joining us for this week’s episode of Founded & Funded. If you want to learn more about Magnify, visit Magnify.io. That’s M-A-G-N-I-F-Y.io. Thanks again for listening, and tune in in a couple of weeks for another episode of Founded & Funded.

A-Alpha Bio’s David Younger on machine learning in biotech, building cross-functional teams


This week on Founded & Funded, Partner Chris Picardo is talking with A-Alpha Bio Co-founder and CEO David Younger for our first Intelligent Application 40 Spotlight episode of 2023. We announced the 2022 IA40 winners in October, and A-Alpha was the first biotech company to make the list, which was no surprise to us. As one of our portfolio companies, we know the work David and his team are doing at the intersections of biological and data/computer sciences will change the world — but having a group of judges agree with us makes us all the more certain.

Protein interactions govern just about all of biology and A-Alpha uses synthetic biology and machine learning to measure and engineer protein-protein interactions, speeding up a traditionally slow wet lab process. The company’s proprietary platform — AlphaSeq — uses genetically engineered cells to experimentally measure millions of protein-protein interactions simultaneously, generating enormous amounts of data to inform the discovery and development of therapeutics.

It is within that enormous amount of data that the company is adding to every day that so many answers will start to be found as the company is able to use machine learning to train predictive models and begin predicting new antibody sequences that could be effective against different viruses and diseases — improving the way that we are able to discover drugs.

Chris and David dive into that future that A-Alpha is working toward and so much more — the power of data engineering, building business models around data, building cross-functional teams, having both tech and biotech-focused investors. It is an episode you won’t want to miss. So with that, I’ll hand it over to Chris to take it away.

This transcript was automatically generated and edited for clarity.

Chris: Thanks everyone for listening today. My name is Chris Picardo. I’m a partner at Madrona, and I’m super excited to be here with the co-founder, and CEO of A-Alpha Bio, David Younger. Welcome, David.

David: Thank you, Chris. Yeah. Wonderful to be here.

Chris: So, A-Alpha is one of our intersections of innovation portfolio companies. And for those who are new to that term, at Madrona, we use that to mean companies that combine machine learning and wet lab life sciences on a day-to-day basis as sort of a core part of what they do. And I think A-Alpha is an extremely good example of this. And building on that, they were named in the 2022 Intelligent Application 40 list — and were actually the first IOI, or life science, company named to the list. David, could you just share a little bit of background about A-Alpha Bio — kind of the founding story and how you started working on this problem?

David: I didn’t realize that we were the first IOI company on the top 40 — that makes it even more special. So A-Alpha Bio is the protein-protein interaction company. We use synthetic biology, machine learning, and protein engineering as tools to improve human health. We use these technologies to measure, predict, and engineer protein-protein interactions for a variety of different therapeutic applications. Protein interactions govern just about all of biology, from how your cells communicate with each other to how genes are regulated. So, for example, how does coronavirus enter your cells? It’s through proteins on the surface of the virus binding to proteins on the surface of your cells. And so by understanding protein interactions, we can do things like design therapeutic proteins that will bind and block to prevent those proteins from interacting, thereby curing someone of coronavirus, for example.

Chris: Historically this has been a really hard problem for people to figure out, right? How does one protein interact with the other protein? Because, as you said, that dynamic is really key to understanding what’s going on but also designing possibly an effective therapeutic for some disease that you’re looking for. I’d love to just go a little bit deeper into why this has been a hard problem and how you guys uniquely approach mapping and understanding these interactions.

David: Across all of biology, the traditional approach is to do wet lab experiments, to express proteins, to purify proteins, to measure things more or less one at a time. And those approaches are powerful but very, very slow. So, I think a great analogy to this is determining the structure of a protein — where historically, once you have a protein sequence and you express and purify that protein, it can take months or potentially even years to figure out how to crystallize it and then solve a crystal structure to determine a three-dimensional experimental structure. What we have been seeing over the last couple of years is essentially the infusion of data science, the infusion of machine learning, into this space, which has now allowed groups like DeepMind and my alma mater, David Baker’s lab, with RoseTTAFold, to develop software that can near-instantaneously predict the structure of a protein without doing any of those experiments.

With protein binding, the complexity of the experiments is maybe not quite as challenging as solving a structure, but they’re pretty close. We have to purify proteins, so express them, purify them, and then, one at a time, measure to see whether or not they interact. With the example of coronavirus, because this is a virus that is mutating so rapidly, there are hundreds of different variants that have been observed. There are near-infinite numbers of combinations of those mutants. And there are vast numbers of mutations that could potentially occur in a coronavirus that haven’t ever been observed. And so, if we really want to understand how an antibody or multiple antibodies are going to bind to the coronavirus, we can’t possibly use experimentation to measure antibody binding against all of those different variants. So, machine learning becomes a really effective tool: measure subsets of those interactions and then train models that can essentially infer the binding properties of the remaining ones.

Chris: You talk about the old version of experimentation — one-to-one throughput. You know, it might be worth mentioning a little bit about how A-Alpha differs on the scale side of your capabilities.

David: A great way to kind of put this into perspective at a very high level is to say that the largest public repository of protein-protein measurements is a database called BioGRID. Any measurement of protein interactions that anyone publishes gets collated into BioGRID. And BioGRID currently contains about a million and a half protein interaction measurements. At A-Alpha, with a relatively small team of 39 folks, we’re measuring about 6 million protein interactions each week. So, we have a database that now has over 200 million protein-protein interactions. We expect that it’s by orders of magnitude the largest repository of PPI data in the world. And in each assay that we perform, because we’re able to leverage synthetic biology and next-generation sequencing and advances in DNA synthesis to really scale these experiments up, we’re able to measure millions of interactions at a time instead of one at a time.

Chris: It just impresses me every time you say the numbers. When we talk about data generation in the life sciences and in biology, A-Alpha is a really good example of differentiated data at scale. And I think that brings us to this machine learning point: I don’t think that what you guys are doing on a daily basis is really possible to parse through without machine learning. So, I’d love to get your take on how you think about the role of machine learning at A-Alpha and how it fits into the way you run the company and run your assays on a daily basis.

David: I think that’s exactly right. I mean, we think about the power of our platform at two different scales. One of those scales is what we call our platform advantage. And essentially our platform advantage is a technology that allows us to measure protein interactions faster, more quantitatively, and at higher throughput than other techniques. And even without machine learning, we can use our platform advantage to solve high-value problems across the pharmaceutical industry better, faster, and cheaper. It’s not that without machine learning there isn’t value to the data that we generate. But without machine learning, we’re leaving so much on the table, because we are only able to extract insights from a tiny, tiny subset of the data that we generate. So, if we’re measuring a million interactions in a single assay, sure, we can find the interactions that are the strongest, or we can find the interactions that fit a particular profile most closely. We can do those things without machine learning, but what we can’t do is uncover all of the nuances of the patterns behind each possible amino acid mutation and how it influences binding, stability, expression, and all of these other really important biological characteristics.

So, if we start to get to, how do we optimize proteins faster? How do we do multi-parameter optimization for different properties like affinity and specificity and cross-reactivity and epitope engagement? To optimize these types of properties without machine learning, we’d be living in the Stone Age.

Chris: A couple of things David just said came in at rapid fire, but I think one of the ways to understand this is that there are a bunch of different ways that proteins can interact with each other, from the strength of the binding to the behavior to the specific physical places where they actually touch. What David’s talking about here is that A-Alpha has built a way to interrogate all of those types of interactions at scale in a way that nobody else can. And I think one of the interesting things, too, is that for a long time, a lot of these experiments, as you talk about, were kind of binary, one-on-one. Put two things in a tube and see what happens. And with A-Alpha, you get all of these other variables. You get the stuff that doesn’t happen, you get the stuff that kind of happens. And so, as you parse that apart, it’s really this machine learning that kind of takes over. And I’m curious, as you start to observe those types of things, where is it going? Now you’ve got a 200-plus-million data point training set, you’ve got the ability to screen 6 million PPIs every week, and it’s getting bigger. Where are you pointing this? What’s ML going to start to really unlock when you think about what the outputs look like down the line?

David: So, the first place where we’ve already seen our database have a substantive effect is in understanding which antibody sequences don’t work. Being able to rule out particular sequences, particular patterns of amino acids that just don’t produce good binders or stable antibodies. By ruling these things out, we can design libraries that are essentially enriched for things that are better, or we can computationally knock down observations we see about sequences that we know are not going to behave well.

If, for example, we generate a lot of data around a library of antibodies binding to a number of different targets, and now we want to generate a second set of newly predicted antibodies that have an improved binding profile, we can use all the data from that experiment, but we can also essentially reference all of the historical data that we’ve generated in order to take those predictions and screen them for how likely each antibody is to actually function properly. That allows us to essentially move faster, and we can optimize these really complex problems in fewer iterations, which means cost savings and time savings.

Where we see this going eventually is getting to a point where we’re doing more and more in silico prediction of new sequences. So coming back to the coronavirus example, if we have a dataset that consists of thousands or millions of different antibodies binding to thousands or millions of different coronavirus variants, we can start to map a landscape such that if a new coronavirus variant crops up, even if it’s one that we’ve never seen before, we can take all of the data that we’ve generated historically and predict a new antibody sequence that’s likely to be an effective drug against that never-before-seen virus. That’s an incredibly exciting promise of what we can really harness this quantity of data for.
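A toy version of that train-then-rank loop helps make the idea concrete. The sketch below is purely illustrative: the k-mer featurization and random-forest model are generic stand-ins, not A-Alpha’s actual pipeline, and the data shapes are invented:

```python
from collections import Counter
from itertools import product
from sklearn.ensemble import RandomForestRegressor

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VOCAB = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # all 2-mers

def kmer_features(seq, k=2):
    """Count k-mer occurrences in a protein sequence (fixed-order vector)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get(kmer, 0) for kmer in VOCAB]

def featurize(antibody, antigen):
    # Concatenate antibody and antigen features so the model sees the pair.
    return kmer_features(antibody) + kmer_features(antigen)

def train(pairs, affinities):
    """pairs: (antibody_seq, antigen_seq) tuples; affinities: measured values."""
    X = [featurize(ab, ag) for ab, ag in pairs]
    return RandomForestRegressor(n_estimators=200).fit(X, affinities)

def rank_candidates(model, candidates, new_variant):
    """Score candidate antibodies against a never-before-seen variant."""
    X = [featurize(ab, new_variant) for ab in candidates]
    return sorted(zip(candidates, model.predict(X)), key=lambda t: -t[1])
```

In practice, each new assay run would add measured pairs to the training set, so the ranking of candidate antibodies against an unseen variant keeps improving as the database grows.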

Chris: I think that’s just such an important point that you just made. In the sort of early drug discovery process, it’s just been a very difficult search problem. Where you have to have, to use the industry’s term, a library of specific things that you want to try out, and then you search it — slowly, one by one. And if nothing works in that library, you have to find another library. And I think the way you’ve reframed the problem, which is, hey, if something kind of works or if a subset of things totally doesn’t work, oh, we go get another library, but we build that based on all of the information that we just learned and all of the information that we previously screened. I think that’s a powerful new paradigm in this entire space. And I think A-Alpha is the type of company that’s pushing that forward from like a “how do we change the game perspective here?”

David: I think that’s exactly right. The technologies that have generated many of the antibodies that are currently in the clinic as cancer therapies or autoimmune therapies were discovered by essentially throwing millions and millions of darts at a dart board and picking the ones that hit the bullseye and ignoring all the others. And we’re in a day and age, and our approach at A-Alpha is: Sure, we’re going to pick the ones that get closest to the bullseye, but we’re also going to take a snapshot of that entire landscape and learn from it so that when we’re developing the next therapy, we can get more shots on goal.

Chris: Yeah, every time you do it, you’re just making better darts, and you’re really figuring out what that looks like. It’s such a cool way to think about it. One other thing I wanted to ask you about here is just the power of data generation. I would say it’s something that we hear about a lot in biology and life sciences, but it’s hard to wrap your brain around it. How important do you think novel data generation is? And if you look forward 10 years, is every company going to be just figuring out, like, how do we produce huge novel data generation at scale? What’s the sort of importance level in terms of how you think about that as a component of strong discovery?

David: Yeah, I think it’s central to everything that we do. It’s central to many companies that are taking a similar thesis to ours of really being a data-driven drug discovery company. And I think that this isn’t just true for companies that are developing therapeutics. This is really a paradigm shift across all companies and all academic groups that are doing biological research. The tools at our disposal today are enabling a step-function increase in the pace of scientific discovery. And that’s because experiments don’t have to be run one at a time anymore. We can now synthesize millions of different defined DNA sequences, all in a test tube, that arrive in a week and at an affordable price. And we can build experiments essentially thousands or millions at a time and then use next-gen sequencing as an output. No longer do we have to create a single controlled experiment that tests one hypothesis and focuses on just one question. We can now ask a million questions simultaneously, which allows for very, very rapid discoveries for therapeutic applications, but also just accelerates our understanding of biology.

Chris: And it becomes such an important machine learning problem, because historically there have been a couple of data sets, like you mentioned BioGRID, and you could try to throw the 500th algorithm at that and see if you can find anything interesting, or you can go generate totally novel training data and do that iteratively. I think this is an interesting transition when you think about applying machine learning to this space — this is something that’s really compelling to people who want to do that. And so, when you think about that, something that you’re doing obviously every day is building a cross-functional team — software engineers, ML engineers, wet lab scientists — how do you think about this? It’s not a challenge that every company faces. It’s very multidisciplinary. What is your thought on team building and bringing different personas and backgrounds together?

David: I think the most important thing in our experience is that the teams have to be really excited and passionate about working with each other. It’s not necessarily that a data scientist needs to come into A-Alpha knowing everything about biology; it’s that they have to be incredibly excited about getting into the weeds and learning, at least to the depth that’s needed for them to understand the biological context and to ask the right ML questions.

The same is true for biologists and biochemists. They need to understand what the data science is capable of and the parameters by which we need data in order to effectively train models. So having a very close collaboration between those teams and a mutual interest in understanding what each other does is essential for the company to be effective. I think if we were building a team that was just wet lab and we were outsourcing all of our data science, or vice versa, building just a data science team and outsourcing all of our wet lab, we would not be able to do what we do as effectively and certainly not nearly as quickly.

Chris: That leads me to a question, because we hear this a lot now, right? People are encouraging ML scientists to go work on life science. And the problem can seem scary. You’re walking into something where there are all these biological terms that don’t really make any sense — it feels like people are speaking a foreign language. Is it possible to teach an ML scientist biology and vice versa?

David: I think both are possible, but both are hard. Speaking from my own experience, I was trained as a wet lab biologist and would certainly not consider myself a computational biologist at all. But during my graduate work, I realized that in order to stay relevant long term in biological science, I needed to figure out how to get more proficient at data science, because biological science is moving more and more toward these massive data sets that are impossible to parse without some sort of data science or bioinformatics tools. And so, from that perspective, there’s a good amount of healthy pressure on biologists to pick up some of those skills. If a biologist does not have any of those data science skills, over time, they’re more or less going to be relegated to generating data and then not having the tools to be able to play with that data. And I think, from the perspective of just about any scientist, the fun part of the job is not just generating the data, but digging into that data and trying to get insights. From a data science perspective, I think a lot of people who choose to work in companies like ours have some reason to be passionate about biology. Maybe it was an interest from high school. Maybe a family member had a particular disease, and that launched a passion to get involved in some healthcare-type aspect of a career. But you know, typically, there is some driving force that leads to that interest.

Chris: I’ve watched you move between these two, so I can tell everybody that David is more than capable of it. But I do think these worlds are colliding in a very interesting way. I think what you mentioned about the scientists realizing that the quantitative tools are becoming table stakes is getting more and more true. And the ML engineers and the ML scientists are realizing there are some interesting problems to go solve on the life science side, where you’re just going to get handed a pile of novel, interesting, totally untouched data, and you’re going to go generate new insights. And that has to be a powerful message, right?

David: Absolutely. Part of it is the impact, right? You’re potentially involved in better understanding biology or curing disease. But I think also there is something that’s just innately complicated and messy and noisy about biological data. It’s just a very exciting source of data that I see as very much the next frontier of data science.

Chris: Yeah, as we talk about, you know, the Intelligent Applications 40 and the companies that are on it, this is a core theme, especially as the world of biology and life sciences moves forward — you’re not going to be able to interrogate the data without machine learning tools. The scale will be too high. You won’t be able to do it manually in the ways that you used to. And so, that really brings up another question, which is on the company creation side — it used to be that life science companies generally spun out of an academic lab, got a really core life-science-focused investor, and then tried to get something into the clinic. A-Alpha and companies of your style have been built differently than that, right? You did spin out of an academic lab, but the funding and the investors are different, and the goal is a little bit different. So, I’d just love you to talk a little bit about how the different company-building process here has worked. You have a tech-focused investor on your board — us at Madrona. You have a much more life-science-focused investor on your board over at Xontogeny. What has that dynamic been like, and how has that influenced the way you think about building the company?

David: We are very much a platform company in the biotech universe, so we are not focused just on getting single drugs to the clinic. We’re focused on building a platform that can be used over and over again to glean biological insights and to develop multiple therapeutics across potentially many different disease areas. It was very important to us to have the credibility and the know-how of a group like Xontogeny Perceptive and Ben Askew, who is on our board — traditional life science investors who know that process of taking biological data, finding targets, finding drugs, getting those drugs into the clinic — really core expertise that we absolutely need. But we’re also thinking about building this platform that can have a long-term impact across lots of therapeutic programs, which is a big part of all of these efforts in machine learning to train predictive models and improve the way that we’re able to discover drugs over time. This is all core to our platform thesis. And so, bringing Madrona and you, Chris, and Matt onto the board has really helped to bring that sort of platform perspective, and it has led to very productive conversations with some amount of healthy tension between that traditional life science investor and the kind of longer-term, tech-enabled VC perspective.

Chris: As someone who gets to go to the board meetings, it’s been so fascinating to see these discussions play out across a bunch of different angles because, at the end of the day, one way or the other, you’re going to help a therapeutic get to market. It might come directly off of the platform, or you might enable someone else to do it, but that would be the goal. Either way, you’re going to use all of these modern approaches and tools to do it. And so, you kind of have to think equally about how you build the software and ML capabilities of your business and how you build them in line with the life sciences and the biological capabilities of the business. And it seems like this paired investor approach, at least in your case, has worked really well. Is that something you’d recommend to other companies — this kind of hybrid style of investors?

David: I think that if you’re building a company that really lies at this intersection of data science and biological science, and your goal is to build a platform company, then having investors and board members who have experience in those two different domains is incredibly valuable. A good example: across the pharma industry, across biotech, there is still a lot of uncertainty around how to structure business models around data, right? You have all of these companies that are generating massive data sets, and they’re using those data sets internally to discover drugs. But one of the things that has been fun about many of our board meetings is that we’ve started to have discussions about, you know, how else we might be able to use that data in creative ways. And I think those conversations only happen when you have folks with very different perspectives and different experiences at the table.

Chris: Yeah, you beat me to the question, because before we jump into the lightning round, I was going to ask you about changes in business models. Historically, there have been three ways to build a business model in this platform space. One, you build a drug internally, you take it into the clinic, and then likely you license it to someone else.

Two, you have an amazing platform capability, generate insights, and partner with people who have interesting things for you to work on — and hopefully, they take a drug to the clinic. And three, you can be a little bit more, call it, service or CRO-like (contract research organization), which has often been: hey, someone needs some stuff done, so they’ll send it to you. And A-Alpha, I think, has been very creative in figuring out how to create different versions of all of those models. I’m curious: how are you seeing the business models evolve, and is there a type of business model here that you’re more excited about or really excited to see emerge?

David: So, because we have a highly differentiated technology that allows both ourselves and potential partners to discover new drugs that they wouldn’t be able to discover otherwise, we are able to work with partners in the context of a service-like model. But it’s a very high-value service. It’s a partnership that is structured around work done by A-Alpha but that comes with an upfront payment, milestone payments, and eventually royalties, which is the traditional pharma partnership model. We can also use our capabilities, turn them inward, and build a pipeline of our own, and there are certainly exciting opportunities there. What we’re also starting to think more about is, again, how we can really leverage data and data generation as an asset in and of itself. So, we’re talking with partners about the potential to build data sets together that can be used to train predictive machine learning models in a way that is only enabled by the type of tool we have, which gives us a competitive advantage for generating data sets that no one else is able to generate. I think there are a number of different exciting ways, over different time scales, in which we can leverage the types of data and capabilities that we have, both directly for discovering and optimizing therapeutics, and also for moving the whole field forward by generating these massive data sets.

Chris: Personally, like you just said, I think these data partnerships are going to be crucial for how this industry as a whole pushes forward with machine learning. As these data sets get created and these models get trained, we really will start to see these new types of deals emerge. And I think it’s emblematic of what companies in the IA40 are doing in general, which is pushing the boundaries of the existing business model by leveraging data — one way or the other, right? Data plus intelligence. So, I think you guys being out on the leading edge of that is particularly telling and explains why you are part of the IA40.

David: I think, as well, that in the pharmaceutical industry there has been a very strong historical resistance to sharing any type of data. Targets are confidential. Data is confidential. I mean, everything is behind closed doors, under lock and key. But because you’re starting to see more and more proprietary sources of niche data sets that help to explain different aspects of biology, data sets that are hugely valuable in and of themselves but might be even more valuable when they’re all combined together, it really starts to create better incentives for companies to get creative about data sharing and about different ways to enable the entire industry to grow through data partnerships.

Chris: And I just think it’s going to be such an exciting thing to watch going forward. To wrap up, we’re going to do a lightning round of the three questions that we’re asking all of the IA40 companies. So, I’ll start with the first one: aside from your own company, what startup or company are you most excited about in the intelligent application space, and why that company?

David: There are so many of them. I’m most familiar with biotech, biopharma, so I’m going to give one from that space, which is Octant. This is a company that I’ve been a fan of for a long time — founded by Sri Kosuri, who I’ve also been a fan of for a very long time. They are using synthetic biology to engineer cells to essentially develop different disease models. They’re leveraging the power of DNA synthesis and DNA sequencing — but to create cellular models so they can test small molecule drugs in a much higher throughput setting.

But they, like us, can use the power of high throughput experimentation to develop these massive data sets and essentially improve their platform over time by training predictive machine learning models. So yeah, a very, very cool company that has some analogies to what we do but really focused on the small molecule drug discovery space.

Chris: Totally agree, Octant is a super cool company, pushing the boundaries. Second question: outside of enabling and applying artificial intelligence to solve real-world challenges, what do you think is going to be the next greatest source of technological disruption and innovation in the next five years?

David: Keeping this close to home, in the realm of biology and biotech, I think it’s DNA synthesis. That has been a major limitation for many decades around, you know, how quickly you can do experiments and how high throughput you can make them. We’ve made a lot of progress even over the last few years. We’re now at a point where we can order from a company like Twist on the order of tens of thousands of 300-nucleotide oligos and get them back in about a week. But that’s only long enough DNA for us to produce very, very short proteins. So over time, what we really want to see at A-Alpha is advances in DNA synthesis so that we can synthesize arbitrary-length proteins and massive, massive libraries of those arbitrary-length proteins. I think that’s going to be a major driver for even higher-throughput and even more precise experimentation, because then we can fully define the proteins that we make. And it’ll have a really, really huge impact across biotech and biopharma. There are a lot of companies working in this space, and I think there’s a good chance that at least one of them really cracks this.
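To put rough numbers on why 300-nucleotide oligos cap protein size: three nucleotides encode one amino acid, and part of each oligo is consumed by cloning adapters. A back-of-the-envelope sketch (the 20-nucleotide adapter length here is an assumption for illustration):

```python
def max_protein_length(oligo_nt: int, adapter_nt: int = 20) -> int:
    """Rough upper bound on the protein length encodable by one oligo.

    Subtracts two flanking adapters (assumed 20 nt each, for cloning),
    then divides by 3 because each 3-nucleotide codon encodes one amino
    acid. Real designs also spend bases on start/stop codons.
    """
    coding_nt = oligo_nt - 2 * adapter_nt
    return coding_nt // 3

print(max_protein_length(300))  # ~86 amino acids: a very short protein
```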

Chris: All right. Last question. What is the most important lesson or something that you look back on and you’re like, boy, I wish I could have done that better, that you’ve taken away from your journey building A-Alpha so far?

David: One of the things I got to eventually, though maybe it took me a little bit too long, is finding the right balance of stepping away from all of the technical details. I think being a technical founder is a blessing and a curse. It’s a blessing because you understand the ins and outs of the system, of the platform. You can be involved in those technical conversations and help to steer the technical strategy of the company. But I think, at a certain point, there is a drawback to being too involved. Partly that’s a bandwidth issue, right? I mean, I have other things that I need to be focusing my time on. But the other thing that took me maybe too long to realize was that when I’m in the room, the conversation is inherently different. So, sometimes, to have the best technical discussions, it’s important for the CEO not to be in the room. That was something that probably took me too long to figure out, and I’m glad that I got there eventually.

Chris: I think that’s a really important and thoughtful insight for similarly technical founders out there, and it’s a good one to end the conversation on and a good note for everybody to take home. So, David, we really appreciated the conversation and having you on this episode of Founded & Funded.

David: Thank you, Chris. Wonderful to be here.

Coral: Thank you for listening to this IA40 Spotlight episode of Founded & Funded. To learn more about the IA40, please visit IA40.com. To learn more about A-Alpha, visit AAlphaBio.com. That’s A-A-L-P-H-A-B-I-O.com. Thanks again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with Magnify CEO Josh Crossman.

dbt Labs’ Tristan Handy on the Modern Data Stack, Partnerships, Creating Community


In this week’s IA40 Spotlight episode of Founded & Funded, Madrona Partner Jon Turow talks with dbt Labs Founder and CEO Tristan Handy. Here at Madrona, we just announced our 2022 IA40, and dbt Labs is one of our two-time winners. The company has positioned itself as the industry standard for data transformation in the cloud, and it raised $222 million at the beginning of the year in a round led by Altimeter, with Databricks and Snowflake both participating, further solidifying dbt’s place in the modern data stack.

Jon and Tristan dive into the concept of epistemic truth, how the modern data stack has significantly improved the frontier of what’s possible in data, the collaboration that needs to happen between data analysts, analytics engineers, and data scientists, the right way to use partnerships, and how the best way to create a community around your product is to not try to create a community — but you’ll have to listen to Tristan’s explanation. So I’ll let Jon and Tristan take it away.

This transcript was automatically generated and edited for clarity.

Jon: I am thrilled to have Tristan Handy here, who is the founder of dbt Labs. dbt is an analytics-as-code tool that has quickly become an industry standard, and it has championed a new type of role — the analytics engineer — whose job it is to provide clean data sets to end users by modeling data in a way that empowers them to answer their own questions. And with all this, they’ve attracted more than 9,000 weekly active customers and a Slack community of 38,000. So, thanks for joining, Tristan.

Tristan: Yeah, thanks for having me. To set the record straight, we are at 15,000 companies using dbt today, which is really unbelievable. I remember in the early days thinking, “Hey, maybe a hundred companies would use this thing.” So this is a very unexpected world that we’re living in.

Jon: Maybe unexpected, but you do have a very special product and community. Tristan, to get us started off on the right foot, can you tell us in your words what exactly is dbt?

Tristan: dbt has two parts: a commercial part and an open-source part. Overall, dbt wants to enable data professionals to work more like software engineers — to steward an analytic code base that describes reality for an organization, or at least as close to reality as we can get, and that allows people to move quickly. It doesn’t unnecessarily force people to go through a bunch of hoops, but it allows people to work with a lot of maturity and governance, with confidence that the data is accurate and timely, etc. The open-source part of dbt is all of the language features: how do you write dbt code, and how does it run against a data warehouse? And then dbt Cloud is functionality that makes it easier to run dbt in production and easier to write dbt code, this kind of stuff.

Jon: So, considering the two sides of dbt, how do you think of who is the customer and what problem he or she is trying to solve?

Tristan: Let’s leave the word customer aside and just talk about users for a second, because those two things are not the same in an open-source context.

Our users are typically folks on data teams, or folks who are at least in a line of business that is very quantitative. Their companies have invested in modern data platforms, you know, Redshift, Snowflake, BigQuery, Databricks, etc. And they are tasked with either providing clean data sets to folks who are going to answer business questions, or they are themselves tasked with answering business questions and need to do data preparation to get there.

So, the core problem: I ran into this back in 2015, which is kind of where the initial insight came from. I was involved in launching a product called Stitch, which is a data ingestion tool. When we launched, it was just pulling data from about 10 different SaaS products into Redshift. And I got to use some of this data, and it was awesome. It was amazing to have my hands on all of this without having to do any CSV exports and VLOOKUPs and all this stuff. But the amount of data was pretty overwhelming. I remember having to do an analysis of Facebook ad ROI, and I was looking at literally a hundred tables in this one schema, and I was just like, this should be easy. There should be a table that has campaign names and dates, and it should tell me the ROI. And it turns out that you actually have to make that for yourself. And so, dbt is the tool that I initially built for myself to sort through all of these ingested data sets and turn them into the thing that would actually be useful to answer business questions.
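To make that concrete, here is a toy sketch of the kind of rollup a dbt model encodes (table, column, and value names are invented, and dbt models are ordinarily written in SQL; pandas is used here purely for illustration): join spend against attributed revenue and compute ROI per campaign per day.

```python
import pandas as pd

# Invented stand-ins for two of the ~100 raw ingested tables.
ad_spend = pd.DataFrame({
    "campaign": ["spring_sale", "spring_sale", "brand"],
    "date": ["2015-06-01", "2015-06-02", "2015-06-01"],
    "spend": [100.0, 120.0, 80.0],
})
attributed_revenue = pd.DataFrame({
    "campaign": ["spring_sale", "spring_sale", "brand"],
    "date": ["2015-06-01", "2015-06-02", "2015-06-01"],
    "revenue": [260.0, 180.0, 40.0],
})

# The "table that should exist": one row per campaign per day, with ROI.
roi = (
    ad_spend.merge(attributed_revenue, on=["campaign", "date"])
            .assign(roi=lambda df: (df.revenue - df.spend) / df.spend)
)
print(roi)
```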

Jon: So you’ve spoken about your users, but who are your customers?

Tristan: It really spans from one-person data teams at 20-person seed-funded startups all the way to some of the largest companies in the world with the most data professionals. At a super high level: we started the company in 2016, dbt barely exists, we land a small handful of consulting clients, and a small community develops. We continued to bootstrap the business over the course of three and a half years as a consulting company. Over that time period, the dbt community grows to about 4,000 individuals, and there are about a thousand companies using dbt in production. That’s when we raised our first round of funding. At that point, the company had maybe eight or nine consultants and four or five engineers, and that was it. That was the whole company. We did have a SaaS product that was hacked together by this very small number of folks, but it was a way to help productionize dbt. You write all this dbt code that does all this data transformation, and you need a way to actually run it in production on an ongoing basis and to do logging and alerting and all of this kind of stuff. And so it provided all of this around the edges of the open-source experience. It barely existed when we first raised money, and since then, it’s grown very quickly. It’s turned out that dbt scales well into the enterprise, and that’s something I think you often don’t find, because the enterprise needs a different set of features and functionality to deal with the complexity there. But what we found as essentially a developer tool is that as long as we are providing the language constructs that people can use to describe their business, we don’t get caught up in a features-and-functionality conversation going into the enterprise. So I think we’ve had a little bit of a smoother transition there.

Jon: So, you know, before we get a little more technical, I want to talk about a concept that you’ve spoken about a few times, which is the concept of epistemic truth. Can you talk about an example of how truth has played out with a customer and the analysis that you’ve run, and how the customer has benefited from that?

Tristan: I’ve become very interested in this topic, and I’ve realized there are actually two parts to it. The first one is: how can you know that something is true? Originally, I was of the opinion that your e-commerce company processed a certain number of orders yesterday, and that number of orders was simply the ground truth of the matter. And our job was to write analytic code that reflected that ground truth. What I have come to believe is that that is insufficiently nuanced. If you look closely enough, there are always these cases of, well, does this one count? It was returned. Does this order count? It was, whatever, in some different state. It turns out that the question, “How many orders did your company process yesterday?” is actually undefined without additional specificity. And so, what we do in our companies is create these windows into the world, and we kind of socially construct these definitions.

Let’s leave the concept of ground truth aside. What I actually believe the job of data teams and dbt to be is to steward this social construction process. What happens in many companies is that there’s a tremendous amount of chaos. There’s no accepted definition for metric A, metric B, metric C, so as a result, you get a lot of confusion about, “How do we talk about our reality?” There’s no good way to have a conversation about reality when everybody’s using different definitions. And so, while I don’t think there’s a way to reflect ground truth about the number of orders, there is a way to run a social process whereby, at the end of it, we all agree: even if this is not the perfect metric, it is the metric that we use, and we can all consistently use it to talk about our business. And it turns out that this is something software engineers already know how to do. There’s no such thing as bug-free code, but there is code that works in production, and there is a process to get that code to production. And there’s a process for when you find a bug in production: what do you do about it? So the goal in software engineering is not to architect some perfectly working set of routines. It is to build something that basically works and that you know how to, over time, continue to iterate toward correctness. That’s a big topic. I hope that was useful at all.

Jon: It’s very useful. My question, though, is: how have organizations responded to that kind of discipline with the benefit of a tool like dbt? Has it changed the thinking they did — the decisions they made?

Tristan: Here’s how I’m going to try to answer this question, because it’s a big one. How have organizations responded to dbt and dbt’s way of thinking? I think that is very mixed. It’s not surprising that it would be mixed, because there’s this Harvard Business School article from, like, the early ’90s that essentially says the primary task of any company is to create and disseminate knowledge. Everything’s changing all the time, so for this group of humans who are here to advance a goal, the most important thing we do is figure out how to learn new stuff and then spread that learning to everybody else. So, you would then expect a response — and I would not actually say this is just in response to dbt, but in response to what in the industry we would call the “Modern Data Stack.”

I think the modern data stack has very significantly improved the frontier of what is possible in data. And it’s not that any one given analysis using the modern data stack was not possible previously. I think about it as this kind of two-by-two: on one axis there’s governance. Like, do you have any control over how knowledge is flowing through your organization? Can you trust it, etc.? And the other axis is velocity. And previously, you could have things that were fast and not very trustworthy, or you could have things that were very trustworthy and incredibly slow — but you couldn’t have fast and well-governed.

And what the modern data stack has allowed companies that really adopt it to do is get fast and governed. The implications of this for how organizations operate will, I think, continue to play out for a very long time.

Jon: Let’s dig in a little bit to dbt and the world around you. Way back, how did you get the dbt community started? How did you kick that flywheel into motion?

Tristan: I think of it this way: the best way to see stars in the night sky is to not look directly at them, because it turns out that the edges of your eyesight are much more low-light sensitive. And I think one of the best ways to start a community is to not try to start a community. Sometimes I field questions from software founders who have gotten from zero to one, and now they’re thinking about scalable marketing channels to continue to grow. And they say, “Ah, I want to start a community.” And it’s like, that’s not generally how that goes. For us, we were practitioners, and we were trying to solve real problems. We were using dbt as a tool in solving those problems, but we were not really just talking about dbt; we were talking about the problems we were facing in the context of doing the kinds of analytics that digital-native businesses do. And so we had a Slack group, and the Slack group was for our little network of people to collaborate. And three months in, we ended up working with a company called Casper. They were very central in the New York tech community in 2016. Their data team was like a dozen people, and they hosted meetups and all this stuff. And as a result of the project we worked on together, they adopted dbt, and they became this kind of central locus. They brought in other folks from the New York tech community and told them about this new tool they were using. And things kind of spread from there. And Drew and I, who don’t live in New York, were observing all this from afar. We just continued to be helpful on Slack while people were trying to get set up and running and trying to absorb the things that dbt kind of assumed about how you were going to work. We just worked with people without any expectation of an immediate payoff. So if I were going to summarize all that: create some focus that people can organize around, and then just provide value. But then do it for a very long period of time. I mean, it was years and years before there was a real, meaningful number of people in there.

Jon: You know, one thing I asked myself looking at the history and the growth of all of this is at what point did you start to get access and support from bigger tech players who were in the ecosystem as well?

Tristan: Hmm. Like as users or as partners, or…

Jon: However you would define it. If you think about the hyperscaler clouds — AWS, Azure, GCP — or you think about Snowflake or Databricks or whatever else. Did you get engagement from them? Did you get what you needed? Would you have liked to have more in looking back?

Tristan: For a long time, we thought that would be a really big accelerator. In the early days, we tried to get the Redshift team to talk to us, and they were like, “Come on, two guys with a little open-source project. Sorry, we don’t have time.” And that’s fair. That’s probably the right decision on their part. But what that ended up making us do was learn Redshift inside and out. I don’t know that there are that many people in the universe who know the Redshift optimizer as well as my co-founder Drew. That story has played out similarly with other big tech companies over the years. Certainly at this point, we do have fantastic partnerships with really all the major players in our space. But what I’ve come to terms with is that if your product is unique, and if it assumes that users are going to need to go through some learning, some mindset shift, in order to use it, then really you’ve got to do that work yourself. No other vendor is going to understand how to sell our version of reality — that’s on us. And maybe we can get invited to do road shows with partner A. We can get invited on stage at partner B. That stuff is really good. It’s good exposure. But at the end of the day, none of that stuff matters to us as much as being consistent with our own messaging, consistently growing our own community of users, getting them to see the value, and getting them to share that value with their colleagues.

Jon: Got it. That helps.

Tristan: Let me just say, though, we would not be successful without our partners. No enterprise technology company, I think, is really successful without its partners. But I think that sometimes, especially founders in the earlier stages, think that partnerships are going to save them. And that’s not true. You have to do the work yourself, and your partners are an accelerator once you’ve done the work.

Jon: And you’re in the other position now, where you’re the captain of an ecosystem, and you can see any number of younger companies — Features & Labels, Continual, Monte Carlo, Datafold — who are explicitly playing in the dbt ecosystem. And they are finding their own path. If you speak to any of them, they’re going to tell you why what they’re building is important and how they’re making it happen. But as the captain of that ecosystem, how are you able to help, and what should new founders know who are going to forge that path in your orbit?

Tristan: Yeah, you’re totally correct that we’ve had to do a lot of navigating on this front ourselves over the past couple of years. I have never been at a company that is in the position that ours is. So, I don’t know how all partnerships teams think about this, but my guess is that frequently partnerships are thought of primarily from a commercial lens — how do we drive revenue together? And not that that’s unimportant to us over the long term, but we are a community first — we’re open source first. And so, our primary relationship with partners is one of neutrality, which is a strange thing to say, but actually, the last thing that we want to do is try to influence the technical choices of our community members by putting our thumb on the scale for one partner versus another partner because they drive more pipeline. We think that it is a real privilege that so many companies have chosen to use dbt. Dbt has become the standard for the layer in the modern data stack that we are. And, if we are going to continue to be trusted in this way, we have to act in a way that is good for all of the community as opposed to just constantly pushing things toward our own commercial interest.

So that’s lens one: how do we stay neutral here and create space for these vendors to participate? The second part of this is that we want to paint a vision of where we hope and believe the ecosystem can go, so that vendors can try to figure out if they are aligned with that. And if so, how do we work together to get there? Because for any vendor that is explicitly trying to build toward the same type of future that we envision, we want to roll up our sleeves and do everything we can to make that happen with you. We have this explicitly ecosystem-focused view of how the future will unfold. What that means is we cannot sit in a room by ourselves and just decide on a product roadmap for the entire ecosystem. That’s not how that works. We can talk about where we hope everyone will get to and what our part of that can be, but then we need everybody to build all those different puzzle pieces.

Jon: So how do you signal to the ecosystem what principles or visions you have for the future?

Tristan: I get one time a year to talk live to everybody in the community — my Coalesce keynote. I also have a newsletter that I use to test ideas and ask challenging questions about the future.

One of the interesting things about this ecosystem-curation role, I think, is that many technology companies only want to talk about how amazing their stuff is. But if you think about the future, inevitably, if you want to improve along some axis, then you are implicitly saying that we are not as good today as we would like to be along that axis, which is something that, I think, can be very uncomfortable for many commercial software companies. But if we don’t actually paint that vision of the future, and if we’re not transparent about the ways in which the lives of practitioners today could probably be better, then we’re not going to be able to all get there. I think this is one of the interesting tensions between open source and commercial. Open source wants to talk about all of the terribleness that we all live in today and how tomorrow we’re going to build the following things to fix it. And commercial wants to be focused on, “Look how amazing the world is today.” So, we have to do both.

Jon: Well, let’s talk a little bit about the technology environment. You said that dbt was created to take advantage of all this latent compute available via SQL runtimes, and now there are new things like Snowpark or the notebook runtime from Databricks, so that you, Tristan, and dbt can access that same compute with the flexibility of a procedural language like Python. What are the implications of that for what dbt can now achieve, under the covers for your customers, that it couldn’t achieve before?

Tristan: I come from a SQL background. I think dbt’s original M.O. was to take people like me, who were not fully fledged data engineers but were reasonably technical data analysts, and give them tooling to build mature data pipelines. This made sense because originally Redshift, and later Snowflake, BigQuery, Databricks, etc., had this cloud infrastructure where, all of a sudden, you didn’t have to think about how your compute and storage were provisioned. You just signed up online, clicked a few buttons, maybe opened up an IP address in your firewall, and you had infrastructure. This was a permissionless way to get a new set of capabilities into a lot more hands. The challenge there, though, is that that’s only a subset of data practitioners. There are the folks who consider themselves more data scientists or statisticians, or other folks who are using more statistical methods in their work. And then there’s also a set of folks who are less comfortable describing all their work in code. If we take seriously the idea that dbt’s job is to create and disseminate knowledge, that’s not a job that is done exclusively by one role at a company. We’re trying, over time, to address more and more of these user profiles. So, while we didn’t start with Python, because Python was not quite as turnkey as SQL was, we’re seeing the big data platforms roll out services that make Python more and more turnkey. We saw some of these innovations, and we said, “Okay, this is a great time to welcome a new set of folks into the dbt workflow.” And it’s a big experiment. It is actually the first time in six years that we have changed the primary persona of the product. And I don’t actually know if that means the floodgates will open, and honestly, that was never our experience with our original persona, so I don’t think it’s going to be our experience with persona number two. But what I do think is that, over time, we need to take data analysts and analytics engineers and have them share a lot more of their infrastructure and tooling with data scientists. The work they do is too related for them to be operating in 100% separate ecosystems.
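For readers who have not seen one, a dbt Python model looks roughly like this (a minimal sketch; the model and table names are invented, and the exact DataFrame type and join syntax depend on the warehouse adapter, e.g., Snowpark or PySpark):

```python
# models/daily_roi.py -- a dbt Python model (supported since dbt v1.3).
# dbt discovers the model() function, hands it a `dbt` context object and
# a warehouse `session`, and materializes whatever DataFrame is returned.

def model(dbt, session):
    # Materialization is configured in code, just as with SQL models.
    dbt.config(materialized="table")

    # dbt.ref() resolves another model into a DataFrame, so lineage
    # between models is tracked exactly as it is for SQL models.
    spend = dbt.ref("stg_ad_spend")
    revenue = dbt.ref("stg_attributed_revenue")

    # Join syntax here is Snowpark/PySpark-style; pandas adapters differ.
    joined = spend.join(revenue, ["campaign", "date"])
    return joined  # dbt writes this DataFrame back to the warehouse
```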

Jon: There’s something I don’t hear you saying, and I just want to press on that for a moment, if I may. For the SQL persona, Jon Turow writes some SQL, and dbt is going to go execute it, historically in SQL. dbt now has the ability to execute Jon’s instructions that came in as SQL but, by dbt’s choice, could execute them in Python instead. Does that give you the flexibility to do things that you’ve never done before? Have you started to look at that?

Tristan: I want to make sure I’m understanding your question right. In your example, you are writing SQL inside of dbt, and dbt is then turning around and running that SQL against a data warehouse. In the future, what if you wrote SQL and then dbt turned around and executed Python against the warehouse? Is that right?

Jon: Yeah.

Tristan: That’s theoretically possible. But here’s the way that we think about dbt. I don’t know if this dates me, but I learned to build web applications in the mid-2000s using Rails. I am by no means a gifted programmer — I figured out in my time how to build some basic web applications. But I think the only way I was able to actually do that was that I had a framework that told me to put these files here, we’re going to wire them all together, and we’re going to deal with the complexity. Rails doesn’t actually go about writing your code for you. It doesn’t express your business logic. It doesn’t do any real magic under the covers. It just provides you a framework wherein you can focus on the high-value-add activities and not focus on, like, how a web request passes from M to V to C. So, we think about dbt in the same way. It is a way to create a lot of leverage for practitioners such that they don’t have to think about “Oh, what’s the create-table syntax on this data warehouse?” or “How do I ensure transactional consistency?” or any of this stuff that’s critical but also kind of boring. We take care of all that stuff. But what that means is that the user actually has to ultimately express the thing they want. And so, with Python in dbt, we’re allowing users to express what they want in Python, but dbt itself does not write Python on your behalf.

Jon: You’ve spoken about SQL as almost a TCP/IP layer between components of the modern data stack, in the sense that it’s ubiquitously implemented and, therefore, kind of a lingua franca for those infrastructure components. But, of course, there’s the other benefit that very many people, very many information workers, can at least write bad SQL. The stories you’ve told, Tristan, and that people tell about you, are that you can write queries like music. But I, Jon Turow, cannot. I can write queries that run, and I think we see a lot of queries that are even sort of hacked and edited and remixed over Slack. So my question to you is: as an industry, what do we need to do to further democratize and expand access to data and analytics?

Tristan: Okay, let’s start small and tactical. There’s remix culture in software engineering, right? Remixing was a big innovation in music, and it’s where most content on the internet comes from at this point, with memes. It exists in software engineering too. It’s Stack Overflow culture, right? It’s something that is often made fun of, but at the same time, it’s tremendously valuable to be able to say, “I have problem X; I am going to go find other people who have problem X” — and there is a way to find those people and to learn about their solutions and potentially adopt them. That’s incredibly valuable. We don’t really have that in data. And I think there are a couple of reasons why. One of them is that while you can share the code, you can’t actually understand what the code is doing without data. You can’t execute that code without the data that backs it. And often that data is proprietary. And so, while the code, some seven-line subroutine, is not particularly proprietary, and generally employers don’t really care if you share seven lines of code on the internet, you can’t share a single row from a database on the internet. So, it makes it very hard to collaborate in this way.

I also think that there’s tooling that doesn’t exist. Data people generally have been kind of far away from the Git workflow, and so we don’t really have the concept of a gist. There’s not a lot of public code sharing in this way, and then commentary on it. This is one potential answer for DuckDB: what if something like DuckDB made it easy to package up little data sets that you could run shared code samples on top of, so that you could start to build some of this remix culture? Because not everybody has a computer science degree, but it turns out that by using Stack Overflow, you can still build some actual working code, and we need that to start to be true in data as well.
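As a sketch of what that could look like (a hypothetical sharing workflow, not an existing product feature): one person packages a small sample table into a single DuckDB file, and anyone who downloads the file can run the shared SQL against it locally and remix it.

```python
import duckdb

# Author side: bundle a small shareable sample into one portable file.
con = duckdb.connect("orders_sample.duckdb")
con.execute("""
    CREATE OR REPLACE TABLE orders AS
    SELECT * FROM (VALUES
        (1, '2023-01-01', 'returned'),
        (2, '2023-01-01', 'shipped'),
        (3, '2023-01-02', 'shipped')
    ) AS t(order_id, order_date, status)
""")
con.close()

# Reader side: open the same file, run the shared snippet, remix it.
con = duckdb.connect("orders_sample.duckdb")
print(con.execute(
    "SELECT order_date, count(*) FROM orders "
    "WHERE status = 'shipped' GROUP BY 1 ORDER BY 1"
).fetchall())
```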

I also think that since dbt took the hard line that we needed to bring Git into data teams, we’ve seen more and more products bake Git in as a first-class element of their experience. One of the elements of the “SQL process” is that none of that SQL that’s passed around feels permanent; it doesn’t feel like part of some mission-critical working application. It’s just like a message that you sent to your coworker in Slack. But if you take that collaborative process out of DMs and put it into Git, you will find that just the mechanisms of Git and pull requests and comments will slowly tend the process toward higher quality over time. And I’ve watched individual data professionals go through this experience time and time again. And it has everything to do with the signals that their context is sending to them about, like, does quality matter? And then, longer term, we could get into: how do you create a no-code tool that still participates in the Git workflow? Because at the end of the day, not every single person sitting in front of a computer for work is going to learn to write SQL. And that’s okay. We have to allow these folks to have experiences that give them the accessibility they need but do not sacrifice quality for that accessibility, as tools in the past have. We have to be able to have both. And I think there are ways to do that.

Jon: I just want to touch on something you said and maybe expand on it — there are these signals, either active or passive, that happen in the communication between producers and consumers of data, or between consumers and consumers of data. You can even think about things like Amundsen catalog types of tools that add a social layer, where if I know there are three fields called active customer, but Tristan uses the second one, I might use the second one too for a similar query.

Tristan: I think that what we are fundamentally trying to do is not so dissimilar from what Wikipedia did. Whether you want to call it the grand total of human knowledge or whatever it is that Wikipedia represents. The thing that was built there was not some piece of software. It was a sociotechnical system. It was an interlocking between software tooling and community and individual motivations, and the end result was this process whereby subsequently, more and more information was contributed, and that information had a process to go through that could be more and more correct and more and more validated.

So, what I think we need to do inside our companies now is create a similar sociotechnical system. And it’s funny that most tooling providers don’t seem to think like that. They’re worried about capabilities. Does my thing do X? And what I think, actually, we need to start thinking about is — when I build my tool in the following way, what behaviors does that encourage this set of people to undertake?

And it’s actually not possible to know the answer to that at the outset. It’s a much more experimental mindset. And I think that in the past, enterprise software broadly, and data tooling as well, has not tended to think about experimentation as much as the consumer space has. But I think that what we’re trying to do is create this emergent behavior where companies are constantly contributing new knowledge and refining it and collaborating on it. And that’s not just a list of features that’s required to do that.

Jon: Super cool. So, looking ahead, we spoke about who are your users and your customers and what problems they all want to solve. And you’ve really been consistent about data analysis and data analysts and analytics engineers. You’ve separately spoken about how ML engineers occupy a different space in many organizations than their data colleagues do. They have different tasks. They get treated differently. They may have different tools today. But, of course, they still have lots of appetite for distilling truth from data. And my question to you, Tristan, is whether you see ML engineers as sufficiently similar to data engineers that ML engineers should also use dbt, or whether you think it’s such a different persona that they need their own community, their own technology, and their own product?

Tristan: What I feel very strongly about is that the world of the ML engineer and the data scientist should be less bifurcated from the world of the data analyst and analytics engineer. I’m not a maximalist when it comes to what the product surface area of any given tool should be. I have no need to conquer the world. I don’t need to convince ML engineers to use dbt if that’s not the right answer. Honestly, the thing that feels problematic today is that there are huge teams whose entire reason for existence is to take the upstream mess of data that flows into the cloud data environments and distill it into something useful. And that process typically happens twice in most companies. A majority of that effort happens to support analytics and BI. But then there’s this separate effort that does much of that again in support of machine learning projects. And sometimes, using today’s tooling, that’s actually required.

Sometimes the latency requirements are completely different. There are different types of transformations that are needed. Whatever. So, sometimes there are good reasons for this. But I think we need to sand those edges off. This whole refinement-of-the-truth process needs to happen as a collaborative effort. Now, if you take this truth, and you want to run ML models on it, and that process does not happen in the same stack, that seems completely reasonable to me. I don’t think you’re going to find a lot of folks who are strongly opinionated that, like, Tesla’s self-driving car algorithms should also run inside of dbt. So there’s a line somewhere. But I think we need to bring this lower-level processing closer together and then see where things go from there.

Jon: I think that’s a great answer. Just a couple more questions — I want to go through kind of a lightning round. First, either inside your ecosystem or outside it, is there a startup that we haven’t discussed today that you’re particularly excited about?

Tristan: Oh, oh my gosh. Okay. I don’t want to name a specific startup. There are actually two attacking a similar problem today. I honestly don’t know if it’s time for this problem to be solved, but it’s a problem that I’ve been asked to solve for 20 years: the ability to ask questions in plain English and get correct answers out of structured data. And the thing that I think these two startups are betting on, with some success so far, is that large language models are the bridge we have been waiting for to finally do this. So it is way too early to tell if this is going to be successful. But I do think that, across the two decades of my career, it’s been that thing everybody wishes they had for the data last-mile problem.

Jon: We could unpack that a lot, but I did say lightning round, so we’ll save it for next time. Second question. In the bigger picture, what do you believe will be the greatest source of technological disruption or innovation over the next five years?

Tristan: Hmm. This is a boring answer, but I still think that we’re incredibly early in the cloud transition. All of the work that we do is part of the cloud story. And people say things like, “Cloud adoption is at 50% of enterprises globally.” But what does that mean? Does that mean that somebody bought an EC2 instance? That’s not cloud adoption. Cloud adoption actually implies organizational change. And that takes a long time. And I think that we’re still pretty early in that process.

Jon: Next question. What is the most important lesson, maybe from something you wish that you did better, that you have learned over your startup journey so far?

Tristan: I got a coach about three years ago, and it was the single most impactful decision I ever made. At the time, we spent 2% of our monthly revenue — this is when we were bootstrapped — 2% of our monthly revenue on this coach. But it ended up being money very well spent. I think that any leader has to come to terms with the fact that many of your instincts as a human do not serve you well in a leadership position. I felt like I was never good enough for the people around me — like people always wanted more from me, or like I made a decision that wasn’t as good, or there were always things that could be better. And as a human, you end up being very defensive. I found myself wanting to explain, “Well, we had to do it that way because…” And a lot of times, your role as a leader is actually just to absorb those thoughts and feelings and say, “Okay, I hear you. Let’s talk more about that.” That’s only one example of something I worked through with my coach. But I think that investing in the human side of leadership is all too infrequently done and so, so important.

Jon: I will follow up with just one point on this. What have you learned about doing that in a Zoom world?

Tristan: I’m better at this on Zoom. I bet you that many people are better at it on Zoom because I think that sometimes getting into some of the harder human topics can be very intense when you’re literally right there with the person. I actually like having the safety of being like, “Okay, at the end of this, I’m going to close my browser, and that will be that.” I don’t know that that’s true of everybody, but I’ve learned to be really on and present via a screen in a way that I wasn’t actually as good at in person.

Jon: Tristan, thank you so much for your time today. I really enjoyed the discussion.

Tristan: Thank you. It’s been a lot of fun.

Coral: Thank you for listening to this week’s IA40 Spotlight episode of Founded & Funded. Keep an eye and ear out for coverage from our inaugural IA40 Summit. And tune in in a couple of weeks for our next episode of Founded & Funded.

Data Visionary Bob Muglia on the Modern Data Stack and Lessons from Snowflake

In this week’s episode, which is leading up to our Intelligent Applications Summit on November 2nd, Soma speaks with Bob Muglia. Bob has thought deeply about the Modern Data Stack, and they speak about it here — what is needed in the data stack to enable intelligent applications (or data-driven apps, as Bob calls them) and the opportunities for new companies to innovate. Bob is also well known as the CEO who took Snowflake from a promising application for the public cloud to success, by focusing on the problem of scaling a data warehouse in the cloud and building product and sales teams that could win the hearts and minds of loyal customers. Bob talks here about the early days after he joined Snowflake and what he did to get a product to market, and how partnering with the big public cloud providers worked, and had its challenging moments. It’s a great view into how both Soma and Bob are thinking about the future of enterprise data and intelligent applications.

Note: Bob is on the board of IA40 company Fivetran, which was the focus of a recent podcast, and he is chairman of the board of FaunaDB and RelationalAI, both Madrona portfolio companies. Madrona holds shares in Snowflake.

This transcript was automatically generated and edited for clarity.

Soma: Bob, good afternoon. It’s fantastic to have you here with us. I’m very excited to talk to you about the future of data and data-driven and intelligent applications.

Bob: Good to be here with you, Soma.

Soma: Absolutely Bob. And as you know, we at Madrona have been longtime believers in ML/AI and, more importantly, how do we apply ML/AI to different enterprise use cases and to different scenarios to be able to build what we refer to as next-generation intelligent applications.

And I was thinking about this, and as I was getting ready for the session, I couldn’t think of a better person to have this conversation with — and let me tell you why I say that. First of all, let me introduce you. Here with me is Bob Muglia, the former CEO of Snowflake, and prior to that, a long-term senior executive at Microsoft. He has done a variety of incredible things in his career — a lot of it data-driven, and this is where I come back to why I think you are the best guy for this conversation. Bob. Ever since I’ve known you, and I’ve known you for almost 30 years now, I think about you as a data guy first and foremost.

Go back to when you started your career at Microsoft: you were a Product Manager on SQL Server. And through the following decades, I’ve seen you do something or other with data in one way, shape, or form. After leaving Microsoft, you decided to take on the reins of Snowflake when it was a pre-product and pre-revenue company. You spent over six years at Snowflake, growing it literally from zero to hundreds of millions of dollars of revenue. And I think you laid a lot of the foundation for Snowflake to be the leader that it is today in the cloud data platform world. After your stint at Snowflake, you’ve been working with half a dozen or more startups, private companies, as an investor, adviser, and board director.

The one common thread among all these companies is they all are doing something or other with data. I just look at the body of work behind you, and I say, “What a fantastic opportunity for us, and by extension, our audience to be able to hear from you about the future of data and how you see the world of intelligent applications evolving.” With that as a backdrop, I thought let’s just dive into some questions to kickstart this conversation. Let’s first go back to your days at Snowflake. As I just finished mentioning, when you started at Snowflake the team was still working on a product.

The product wasn’t in the market, and you went through this sort of what I call the “growing pains” of birthing a product and bringing it to market, thinking about the business model and getting it to scale. But along the way, I’m sure there were a handful of what I call “defining moments.”

Moments where you had to make a decision, or think about something, that literally laid the foundation for why Snowflake is what it is today. Can you think of a couple of those defining moments and share with us what they were and how you navigated through them?

Bob: Sure. There were a couple of things that happened in the early time of Snowflake. You’ve got to go back to the period we’re talking about, 2014 and 2015, which was the early days of the cloud. Really, AWS was the most viable cloud at the time; Azure was still very early, and GCP was in some ways even earlier. It was a very different time, and a lot of the focus of Snowflake was really about changing that. But a big part of it was also getting the product to market, because we were fortunate in the sense that we could scale to data of basically any size and as many users as you wanted to throw at it, with only one copy of data for the whole organization instead of copies scattered hither, thither, and yon, which was the default at the time. So it was a revolutionary product, but it still had to come to market.

And it was funny, because when I started at Snowflake, the founders said to me that their plan was to make the product generally available to enterprises in about six months. That was in June of 2014. I knew that was somewhat unlikely, frankly, from all of our experience. You’re smiling, Soma, so you know what this is like with developers in the early days. I watched them for a period as they went through a couple of these two-month milestones they were doing, and I had this observation that during those two months, they said they were going to do a bunch of things, and basically none of them got finished during that period of time.

Other things did get done. They were certainly working hard, but it wasn’t like they were working toward well-defined goals. One of the things I focused on was trying to bring some rigor and discipline to what it means to be an enterprise-class product. Over the next year or so, a little less than that, we went through a process whereby we really defined what general availability meant, and we focused on getting those tasks done.

I literally turned the weekly team meetings into a project review. The only thing the salespeople cared about was the status of the product, so everybody cared about it. We went through a focused effort and got the product shipped in the middle of June, and that was really the beginning of the Snowflake experience.

The other thing: there are always these sorts of things that happen to companies in their early days, and they survive them or they don’t. One of the more challenging things we went through early on involved the transactional heart of the product inside Snowflake, a technology called FoundationDB.

FoundationDB at the time, in 2014, was a company, actually a sister company of one of our VCs, Sutter Hill, so we knew the company well. I was in the process of negotiating an agreement, and I was able to negotiate one saying that if anything happened to that technology, if it went off the market, we had access to it through a code escrow. Of course, we hoped that would never happen, but it turns out it did.

Seven or eight months later, Apple bought FoundationDB and immediately took the product off the market and made it unavailable, which was our worst nightmare. Fortunately, we were already running it, we were a bunch of database people, and the source code escrow actually worked. We got the product through that, but we had to do what it took to learn how to be good at fixing bugs in FoundationDB ourselves.

We pretty much had to do that all from scratch, and that was a very big deal. Had we not had access to FoundationDB, there really was no other good choice, not at that time. Products like Fauna and Cockroach didn’t exist, and I don’t know what we would’ve done, to be honest with you.

I don’t know. I honestly don’t know what we would’ve done. But we survived that and now, fortunately, Foundation to be as open sourced and that’s very healthy. It’s actually healthy and Snowflake’s a major, actually a major contributor to it. So it’s actually a really good story, but it was a tough one.

So there’s, things like that happen in your process. I would say the other thing is just customers, I focused on all the time and being successful with customers. And, we didn’t lose customers basically speaking because didn’t take on things we couldn’t do, times would turn people down because we couldn’t do the work they wanted us to do occasionally.

And and really focused on the success of working with costumers.

Soma: That’s super helpful, Bob. I also remember that there was a lot of talk during the initial days of Snowflake about hey, we should think about like, you separating out computer and storage and that could enable us to get to the level of scale and economics, right?

That would be good for our customers and hence for us kind of thing. Any sort of color on, on, like how that came about.

Bob: I think architecturally, separating compute and storage was always part of the design. The architecture at Snowflake has something called Global Services, which manages the metadata and does the query planning.

Then they have an execution processor, the virtual warehouse, that runs the actual SQL jobs. Now I believe it’s running Python jobs too, and other languages, so it’s become multi-language, really. The evolution of that whole thing, how we stored the metadata and everything, changed dramatically over time, but that separation of the metadata was a fundamental component of Snowflake.

The way we did it certainly changed over time, and I think we were able to stay ahead from a scale perspective. I always said it was interesting because we were just ahead of our customers. In the early days, they were chasing our tail on scale in a variety of ways, and we were always working hard to stay ahead of customers, so customers had a great experience.
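Note: To make the compute-storage separation Bob describes concrete, here is a minimal sketch in Snowflake SQL (the warehouse names are hypothetical). Compute clusters, called virtual warehouses, are created, resized, and suspended independently of the single copy of data they query.

```sql
-- A "virtual warehouse" is just a named compute cluster. It can be
-- created, resized, and auto-suspended without touching the data.
CREATE WAREHOUSE reporting_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND   = 60      -- pause compute after 60 idle seconds
  AUTO_RESUME    = TRUE;

-- Scale compute up for a heavy workload; the stored data never moves.
ALTER WAREHOUSE reporting_wh SET WAREHOUSE_SIZE = 'LARGE';

-- A second team gets isolated compute against the same copy of data.
CREATE WAREHOUSE data_science_wh WAREHOUSE_SIZE = 'MEDIUM';
```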

Soma: Bob, I do want to take this opportunity to say thank you for allowing both Madrona and me to invest in Snowflake, be a part of the journey along with you, and see it come to scale.

Bob: …you guys were super helpful too. At the time, we were opening the Bellevue office, which I think was a very pivotal office for Snowflake, and of course Madrona has such strength in the Seattle area.

Soma: I’m glad it all worked out well. But when I got involved in Snowflake, one of the things I heard a fair bit was that all these big cloud platform providers, whether it’s AWS or Azure or GCP, wanted to have their own solution in the space, that they were going to crush Snowflake, and that a fledgling startup just couldn’t compete with any of these massive, at-scale cloud providers. But somehow Snowflake navigated through that, reached a level of scale and success, and is literally a leader in the cloud data platform world today.

There are two questions I want to ask you in that context. First, how did you feel when everybody was telling you, or you kept hearing, “Hey, all these big guys are there, and they’ve got their own data warehousing solution in the cloud”? How were you confident that Snowflake was going to be able to navigate through it?

The second part, which I want to focus on, is the partnership that Snowflake had with all the cloud providers. On the one hand, you could argue that if a customer goes with Snowflake on Azure, it is still a win for Azure. On the other hand, if you think about Snowflake running on AWS, Snowflake is competing with Redshift on AWS, right?

So you’ve got what I call co-opetition: you are partnering with the platform, but you’re competing with the service. How did that whole landscape work out for you?

Bob: Yeah. In terms of how we competed with the big cloud vendors, we had a better product. It was really that simple. As I’ve said many times, if Snowflake had been 10% or 20% better than Redshift, Snowflake wouldn’t have gained any material share, but it was many times better.

It worked in situations where Redshift didn’t work. And Redshift is a very good product; it paved the road for the cloud data warehouse, and for that I’m eternally grateful.

Amazon brought Redshift out very early in the marketplace. But the thing is, it was an on-premises product brought to the cloud, so it didn’t really take advantage of the cloud. What it was, was cheaper, definitely cheaper than anything you could buy on-premises. But it didn’t ultimately scale.

People in the cloud world particularly wanted it to scale. In fact, I remember vividly being on the subway in New York with one of my earliest salespeople, Vince Trada, in February or so of 2015, and we were seeing customers who were talking about adopting Redshift.

Vince said to me, “Bob, don’t worry about this, because every one of those customers that adopts Redshift is going to come to us in the next 18 months when they run out of gas.” And he was right; that’s essentially what happened. A lot of Snowflake’s early business was Redshift conversions, as well as working with semi-structured data, which we did a good job on and nobody else did.

Certainly we were better than Hadoop, which is what people were using at the time, and that was a major part of the success. So we were just better, and frankly, particularly on AWS, we had a much better product. The other thing is that we were lucky with the timeframe: it was the right time for establishing a position in the data space in the cloud, because it was all pretty new.

In terms of our relationships with the vendors, they were challenging, to say the least. We certainly had many challenging times with Amazon, who we were competing with via Redshift. What I would say, first of all, is that Amazon did an incredibly good job of supporting Snowflake at all times. They were great at support, and AWS is a great product to build on top of. But they fought brutally against us in the business marketplace in the early days, and it was pretty challenging at times, but we were winning. We won those battles partially because, again, we had a better product, and frankly, we had a much better-trained sales team.

Our sales team was able to outsell Amazon’s. So those were the early days with Amazon. Then we established Azure as our second cloud, in part because of my relationships with Microsoft people. We were able to build good partner relationships there and actually had some amazing, very positive go-to-market motions with Microsoft in the early days, where they did a bunch of joint selling with us, and we really discovered a whole different business.

What we discovered on Azure was a whole set of customers we’d just never seen before, almost a whole different market. And we always said this: customers choose their cloud first and then choose their data warehouse.

Snowflake ran on all of them, which makes it a little easier, but at the time we were just running on AWS and then Azure, so it was a win-win situation in some senses for Microsoft and Snowflake to go together. I think that around the time I left Snowflake in 2019, Snowflake was probably becoming more competitive in a number of ways.

In some senses, the strength of Snowflake’s partnerships flipped, really. They had a rough time with the Azure folks for a while, and they actually built some very strong relationships with AWS.

There’s lot of good things happening there. I think Google is still tough, if I’m not mistaken. , Google has, generally speaking, not the most partner-centric company on the planet. And I know that’s been a little bit more challenging for Snowflake, in part because they really love Big Query and they have the same feelings about Big Query that, the folks used to have about Redshift.

Only time will tell. These relationships are challenging because they’re definitely both complementary and competitive.

Soma: Yeah, the thing that was interesting for me to watch is that there would be a time when you would think, hey, this particular cloud provider is the best partner, and then things would change, and then change back. The volatility of the partnerships as Snowflake went from strength to strength, depending on where the other cloud providers were, was just fascinating to see. It was a very interesting and ever-changing landscape.

Bob: It just proves my first rule of partnerships, Soma: partnerships are tactical. When it’s win-win, they work. When it’s not, they start to falter a bit.

Soma: But I think Snowflake could be, what should I say, an inspiration or a good role model or a good case study for a lot of the other new startups that are coming up and asking, “Hey, am I competing with the cloud providers or am I partnering? How do I navigate this?”

Depending on what space they are in and what the cloud providers’ aspirations are, many companies could be in a similar situation. That’s why I wanted to make sure that we talked a little bit about it…

Bob: That’s very true in fact. I have this conversation with a number of the companies that I talked to about potential, about their potential conflicts with cloud vendors. A lot of the stuff people are working on these days is complimentary to it’s new things that I don’t think have the same kind of conflicts that we had with Snowflake.

I do think, in general, that Snowflake is a good role model for building a partner-centric company. In addition to really working with the strategic cloud vendors and spending a lot of energy there, we spent a massive amount of time building an ecosystem and working with partners all around, whether BI partners, ML partners, or ETL partners, of all sizes, and I feel very good about what Snowflake has done in that space.

I definitely felt like I had something to do with that, and our shared history together at Microsoft is where the lessons I learned came from.

Soma: Great, Bob. I thought it’d be good to take a step back now. As I mentioned, you’ve been working in data in one way, shape, or form for the last 30 years, over 30 years. How have you seen the…

Bob: It’s been over 30 years. It goes back to the Windows NT days.

Soma: Over 30 years. During this period of time, Bob, how have you seen the world of data evolve? New platforms, new computing paradigms, new devices: everything has happened. But the importance of data seems to have only gone from strength to strength, and it has risen exponentially in the last 10 to 15 years.

Now, I wanted to get your quick thoughts on “Hey, where do you see data today and where do you see data moving forward?”

Bob: It was just over 30 years ago that Bill gave his “Information at Your Fingertips” talk. I was a program manager on SQL Server when I started at Microsoft, and I had been working on database things, really building applications inside a company, before I joined Microsoft.

So I’d been focusing on data pretty much my whole career. And while I’ve been focused on SQL on the business side, I still feel in some senses that the perspective began with “Information at Your Fingertips” and all the focus we had at Microsoft on information of all kinds, building out businesses and enabling people to work with data.

I was involved in SQL Server from very early on, and then I watched as other folks at Microsoft built SQL Server into the business that it really became. I watched these kinds of data systems, together with the applications that sit on top of them, transform businesses of all sizes.

And Microsoft’s contribution was the “of all sizes” part, really. If you were a big company, you could buy a big, expensive set of systems from IBM, Digital, or Sun, but Microsoft made servers that were quite inexpensive and brought computing to literally millions of small businesses around the world, maybe tens of millions, that never had it before.

Data was a central part of that. Since then, we’ve gone through the internet era, and the evolution there has been new types of data becoming important, in particular semi-structured data that’s generated in large quantities by machines.

In some ways, that is some of the most important data we analyze today. We’re now living in a cloud-centric world, which allows us to do things that we never could do before. I am a big believer that data is generated everywhere, but you need to centralize it to a certain extent to do analysis around it: to bring different types of data together so that you can look at the relationships between them and perform the kinds of reporting and dashboarding people want to do, as well as deeper analysis with machine learning. So things have changed dramatically from a fairly simple environment where people literally worked with pencil and paper.

Literally we’re, Excel was a massive step forward and in or 1, 2, 3, was a massive step forward in dealing with information now this world of the cloud where we have this vast amount of data available to us. It’s pretty amazing really.

Soma: You just summed it up, Bob. It’s pretty amazing, actually, how far we’ve come. But for all the progress we’ve made, I feel like there is still a ton more waiting to happen, and the rate of innovation is only getting faster, not slower, as we move forward.

Today, you can’t have a conversation about data without talking about the modern data stack. It’s a buzzword, or a new concept, or however you want to think about it, but everybody talks about the modern data stack. In your mind, how do you define the modern data stack?

Bob: People have been trying to work with data in a variety of ways, and fundamentally, the cloud, and the ability for companies to work together to provide a complete solution for organizations on the cloud, has never been as strong as it is today. That’s really what the modern data stack is about.

It’s really about enabling the industry to work together to provide solutions to companies, and those solutions take on a certain shape in the modern data stack. There are three defining characteristics that I think exemplify it. The modern data stack is really about building data analytics.

First and foremost, it’s delivered as a SaaS cloud service, which means that rather than building these components, you’re purchasing them from third parties that provide the service for you, and a lot of things are taken care of for the customer. So the first thing is that it’s a SaaS service.

The second thing is that it runs in the cloud and takes advantage of the scalability the cloud provides, so you can work with all of your users and any kind of data. I mentioned earlier that data is both structured data that people work with in SQL and semi-structured data that comes out of machine-generated systems.

But it’s also more and more other types of data that are, is quite rich in terms of its content. People sometimes refer to this as unstructured data. I really tend to think of it as complex data. Data types such as video, audio, photos, turns out to be a rich source of complex data.

All of these things that exist in business in the form of documents of all kinds and recordings of all kinds are essentially sources of data for the modern data stack. And with the cloud, it needs to scale to work with as much data and as many users as you want. The final point is that when you’re doing the analysis, the data is modeled for a SQL database.

And that’s, that I think is a distinguishing element of the modern data stack. Is when the data comes into the system the way you actually transform it, there are multiple techniques for doing it. And so let’s put that aside. But the target environment you’re modeling it for is a sequel database and you use relational commands.

Relational algebra, basically to operate against that, that data in a relational form. So three things. Data analytics is a service that leverages the cloud for scale and models data for SQL.
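Note: As a concrete illustration of “any kind of data, modeled for SQL,” here is a minimal sketch in Snowflake-style SQL (the table and field names are hypothetical). Machine-generated JSON lands in a semi-structured column, yet analysis is still done with ordinary relational operators.

```sql
-- Machine-generated JSON lands as-is in a semi-structured VARIANT column.
CREATE TABLE raw_events (payload VARIANT, loaded_at TIMESTAMP);

-- Analysis is still plain SQL: path expressions project JSON fields,
-- casts give them types, and relational operators do the rest.
SELECT
    payload:device_id::STRING               AS device_id,
    AVG(payload:reading.temperature::FLOAT) AS avg_temp
FROM raw_events
WHERE loaded_at >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1;
```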

Soma: That’s great. And today, Bob, if you look at it, and I don’t know whether I would’ve predicted this, maybe in hindsight it’s easy to say I thought about it this way, you could say that there are a few key or big technology vendors providing vast parts of the modern data stack.

You’ve got the three cloud platform guys in Microsoft, Amazon, and Google, and then you’ve got Snowflake and Databricks. The fact that Snowflake and Databricks have literally come out of nowhere in the last, let’s say, eight years or so is fantastic, because it shows that you can innovate, get to scale, and get to a level of success even outside the biggest platform guys.

And that, I think, is just goodness for the whole innovation ecosystem. Do you feel like five is too many? Do you feel five is going to become eight? Any thoughts on that?

Bob: It’s about right. The database market has always been somewhat fragmented; it’s never been winner-take-all. Oracle has classically been the largest winner in the database market, but even they’re only at something like 40%. It is a market that has a number of vendors, and I think that will continue.

My guess is you’ll actually see some more vendors appear. We see some smaller players coming in, trying to take on these five vendors in a variety of ways. I think it’s hard. We may see a sixth or seventh having some small percentage share in some more niche markets, but I think these are the big five.

There is a big dogfight happening between Snowflake and Databricks, and we’ll watch that get fought out over the next year or two. Meanwhile, the cloud vendors will just do what the cloud vendors do, and their products will all get better. I do believe that the cloud vendor products, while clearly behind things like Snowflake, are getting better.

Google is probably the furthest along. And I know a fair amount of what the Microsoft team is doing; there’s actually a lot of great work happening there. I see some good stuff coming in the future.

Soma: That’s excellent. Beyond these five players, there is a whole host of other companies saying, “Hey, I’m building this for the modern data stack.” As you see what is happening in the world today, do you see any big gaps, things that need to happen in the modern data stack to make it more complete and more robust for the next set of applications?

Bob: Yeah, I think there are a number of really major gaps. I’m fairly sure that the platforms people are using for machine learning are fairly nascent and will evolve. Spark has a lot of adoption, but I don’t think it’s the end answer to every problem,

and I think we’ll see evolution in that space. There are also data types and problem characteristics that are very poorly solved today, like graph. Graph problems are really situations where you have a lot of relationships between things, and if you look at the data model, there is a very large number of relationships that need to be managed, more than a SQL database can handle.

In general, the graph problem is poorly solved by today’s products. Meanwhile, there are other things that are critical to business logic, like reasoning, which are still done pretty separately from the modern data stack. You have bits and pieces of code all over the place, and I think that will converge into more model-based things over time.

I think a lot of the future is really around the evolution of model-based development, and we’re in the early stages of that.
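Note: To see why many-hop relationship queries strain a SQL database, here is a hedged sketch using a recursive CTE over a hypothetical edge table. Each additional hop of the graph is effectively another self-join, which is where dense, highly connected graphs start to hurt.

```sql
-- A graph flattened into a relation: one row per edge.
CREATE TABLE reports_to (employee_id INT, manager_id INT);

-- "Everyone in this manager's reporting tree" is a transitive closure.
-- Standard SQL expresses it with a recursive CTE; each level of depth
-- is another self-join over the edge table.
WITH RECURSIVE org_tree AS (
    SELECT employee_id, 1 AS depth
    FROM reports_to
    WHERE manager_id = 42          -- hypothetical root manager
    UNION ALL
    SELECT r.employee_id, t.depth + 1
    FROM reports_to r
    JOIN org_tree t ON r.manager_id = t.employee_id
)
SELECT employee_id, depth FROM org_tree;
```

This works fine for trees, but for large, densely connected graphs the join volume grows quickly, which is the gap Bob describes newer algorithms as addressing.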

Soma: You talked about SQL systems and you talked about graph databases. Bob, my perspective, and tell me if it makes sense or not, is that historically, and even today, the world is bifurcated: you can go deal with relational database systems, or you can go deal with knowledge graph systems. Those two worlds are what I call two silos; they really haven’t come together.

Bob: You mean relational systems versus procedural systems today?

Soma: Or procedural systems. Yeah.

Bob: Yeah, like you’re writing code in Python on the one side and then SQL on the other side. Is that what you mean?

Soma: Do you think they’ll come together? Should they come together? Do you think there is an opportunity?

Bob: I do. And as you said, that’s what a knowledge graph really can do. The idea behind a knowledge graph is that you can encode the attributes of the business into the database, the logic associated with the business.

The idea is that it then becomes a complete model of the organization that is actually executable, where the model is the code itself. In a way, this has been a dream of computer science since I was a kid. When I was not far out of school, I did work on early model-based things where you modeled stuff with diagrams and they spit out COBOL code at the bottom, which of course was unmaintainable, didn’t work, and had all kinds of issues.

Because it never worked, those sorts of modeling ideas became more of a whiteboard effort. I will argue that people always model the business; whenever you’re working on anything, you’re modeling. But in today’s world, we do it implicitly. We might write a model of something that’s relatively well thought through on a whiteboard, but then it gets implemented as bits and pieces of code all over the place, implicitly, within the systems.

I think we’ll move to a world that’s much more explicit in what we’re defining, and that will happen when the knowledge graph comes about. When we think about implementing knowledge graphs, I’m pretty clear that they will be relational and that they will leverage relational algebra and relational mathematics.

That’s partially because the industry has moved forward significantly in the last 10 years in terms of algorithms: new algorithms that allow you to work with large numbers of relationships efficiently and actually do things that you could never do previously with a SQL database, because we just didn’t have the sophistication of the algorithms that are now appearing. So it’s pretty exciting, actually. But it’s also early.
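Note: Here is a tiny, hedged illustration of “the model is the code” in plain SQL, rather than in any particular knowledge-graph language (all names are hypothetical). The business rule is declared once, relationally, instead of being re-implemented in application code; a relational knowledge graph generalizes this idea to many such rules, including recursive ones.

```sql
-- Base facts about the business live as relations.
CREATE TABLE customers (customer_id INT, segment TEXT);
CREATE TABLE orders    (order_id INT, customer_id INT,
                        total NUMERIC, placed_at DATE);

-- The rule "a VIP is an enterprise customer with over $100k of orders
-- in the last year" is declared once as a derived relation.
-- Applications query the rule; they don't re-implement it.
CREATE VIEW vip_customers AS
SELECT c.customer_id
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
WHERE c.segment = 'enterprise'
  AND o.placed_at >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY c.customer_id
HAVING SUM(o.total) > 100000;
```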

Soma: That’s true. But as data becomes, what I call, more democratized, when you talk to enterprise CIOs, they’ll tell you, “Hey, we are really putting a lot of energy and effort into consolidating and standardizing our data infrastructure.”

Along with these huge volumes of data, and the question of what you can do meaningfully with the data, one issue that keeps coming up in pretty much every customer conversation today is data governance. This is also an area where, particularly in the last two years I would say, a ton of new startups have been emerging,

all addressing one part of data governance or another. How do you broadly view the space of data governance and the kinds of companies that are coming up? Are there any specific companies that particularly catch your attention in the space?

Bob: Not really. There are some good companies, and they’re solving pieces of the problem. But when I think about the issue with the modern data stack, governance is a very real problem. It was always going to emerge as a major issue: when we take data that was scattered everywhere and put it together, it creates a certain risk profile, which makes access control very important.

In particular, that’s the aspect. There are many aspects of governance, data modeling, data observability, many things, but the one that I think is at the top of people’s list is access control. And while there are products in the market that address some elements of that, I don’t think we’ve really reached the pinnacle of where we need to be.

I don’t feel like we, or our customers, are well served here. There are different ways to solve the problem, and perhaps there are some shortcuts people can take, but I think in the long run, the right way to solve it is by establishing a semantic model that understands what the business is, which is essentially a knowledge graph.

Then from that you can derive the rules for the policies that you want to establish on your data, very much a policy-based approach that’s based on the business data itself. I think we’re still a ways away from having a standardized platform to enable that, and that’s what we really need.

You know, one of the challenges we have, and I think one of the reasons we’re not seeing as much success in governance in the modern data stack as customers might want, is that all of these tools that are coming out don’t use the modern data stack as their database. And the reason they don’t is because they can’t: it doesn’t solve the problem for them.

So they all use some sort of operational database of their own. They take different approaches, but none of them interoperate. I think what we need is a common platform for a semantic model that will become the basis for modern data stack governance, and I believe that platform will be a relational knowledge graph.

It’s still early. But that’s where I think it’ll go. In the meantime, I hope we can get customers and get some answers out there if it helps to solve their problems.

Soma: True. Let’s move up the stack a little bit. You’ve seen OpenAI do some great work over the last many years on large-scale machine learning models, and you’ve got all kinds of face recognition and other machine learning models coming up at scale.

What do you think the situation is today in terms of these machine learning models? Do you feel the right amount of innovation is happening there? How do you think these models are going to evolve over time? Any perspective on where we are with models and where we are going?

Bob: It’s very exciting, I have to say. We’ve seen incredible progress in the last five years even; I would say it’s accelerating progress. I recently had a conversation with Xuedong Huang, who runs the machine learning team at Microsoft and is working with OpenAI on foundation models. They’re doing a lot of work on combinatorial foundation models, where they bring multiple different types of data together into one large model.

These foundation models, let’s talk about them for just a second. Sometimes they’re called large language models, which is fine, but that only speaks to one domain, the language-oriented one, because some foundation models also apply to photos and other domains besides written language.

What they really are is machine learning models trained on a corpus that approaches global scale. What they become, essentially, is incredible concentrators of human knowledge into a model. Now, the models are statistically driven.

They’re not perfect. There’s still advances that need to be done in these. But this idea of of using machine learning to take the expertise of a given domain in the world and distill it into a model is fairly incredible in my opinion.

I don’t, I can’t think of a domain that it won’t effect. Honestly, I think it affects everything. I think it affects every single element of everywhere we go. So I think that’s a very exciting, element of what’s happening. We see some incredible stuff. This DALL-E stuff is interesting. Now people are doing videos against it.

The model that came out of OpenAI as one of the early code-writing models, Codex, has done some amazing things. GitHub Copilot has been an incredible success for Microsoft and is really doing dramatic things to improve developer productivity.

And, I’m seeing people use that for different purposes. Can take and improve it writing in and do things around running sequel as an example. And very powerful ideas can come from that. On the other end of the sort of spectrum I think that there’s an opportunity, where you’re trying to use artificial inte machine learning AI to improve the business workflow in a given organization where the domain is actually the terminology of that organization itself.

And it, it’s much smaller and it’s, there’s no global model to look at. There’s some local set of content you can look at. And in that case, the interesting thing is how do you inform the model more and more about the business? I think what we’re going to see is, user assisted, interactive training models appearing, which are really applications they look like, applications are working with a given domain and then leveraging machine learning to really improve business process.

The company I’ve been involved in that’s been working on this, and that I’m pretty excited about, is Docugami; our friend Jean Paoli of Microsoft is the CEO. They’re really focused on taking high-value business documents and turning them into data that can be processed by data systems.

In order to do that, you really need to understand the semantics of the document, and that requires user assistance, which is why this interactive development is important. It’s a real UI kind of experience. Those are two different ends of the spectrum in some senses, but both examples are pretty interesting to me.
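Note: As a small, hypothetical illustration of the “assistants writing SQL” idea Bob mentions, the developer writes only the natural-language comment, and an assistant drafts the query beneath it (the schema here is invented: an orders table with region, placed_at, and total columns).

```sql
-- Prompt, written by the developer:
-- "monthly revenue by region for the trailing 12 months"
SELECT region,
       DATE_TRUNC('month', placed_at) AS month,
       SUM(total)                     AS revenue
FROM orders
WHERE placed_at >= DATEADD(month, -12, CURRENT_DATE())
GROUP BY 1, 2
ORDER BY 2, 1;
```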

Soma: Bob, you’ve heard me talk about this quite a bit over the last many years. We are absolutely convinced that every application that gets built today and moving forward is what we refer to as an intelligent application. You like to call them data-driven applications, but it’s basically taking the corpus of data that is available to the application, building a continuous learning system using machine learning models, and then continuing to get better and better. You deliver a better service, you get more data, and the process iteratively makes the application and the service better. That’s the world we see happening today and as we move forward. (A) Do you agree with that viewpoint? And (B) what are some of the core things you think are happening that are going to drive the world there?

Bob: If I didn’t agree with that viewpoint, I wouldn’t be continuing to do this work. I Let’s face it, that’s the reason I continue to do the work. Look, my, my whole purpose basically, in, in my business career has been to build infrastructure components that make people’s lives easier in business in one sense or another.

Data has been a huge part of that, but I also worked on Windows Server for a lot of years, and I built System Center and helped with Visual Studio, all those sorts of things that were not databases but were all about making it easier for people to build systems that help them be more effective in their lives. And I’ve mostly focused on the business side, not the consumer side. It’s interesting, because when we think about this world today, we’re seeing a world where machine learning is transforming pretty much every application category. I talked about foundation models essentially distilling the world’s knowledge into a model in some sense or another.

It’s not perfect. And even though I would say it, it’s it it is a great learn model. It might not it probably isn’t what one would call fully intelligent. It doesn’t reach the point of saying this is intelligent. However, it is an incredible source of information and can be used as a base as many things.

But there are things that are missing from machine learning, things missing on the way to fuller intelligence, and in particular, that’s things like reasoning: the ability to reason over something and say, “this is true because I know this other thing.” These models have a very hard time with that today.

Sometimes when they go off into wacko things, it’s because they don’t have the ability to add reasoning. Now, I’m sure that’ll change. I’m very confident that we’ll see reasoning get added to these models in a variety of ways.

I think of this problem as: what’s the infrastructure that would actually solve this problem in a more generalized sense? I mean, give somebody a Python compiler or a C compiler and a hundred nodes, and they can do an awful lot. But to me, it’s about what sort of infrastructure you can build to make these systems more available to a larger number of companies.

And that’s why I think it all consolidates ultimately into a relational database that will take the form of a knowledge graph. And I do think ultimately these things will come together that where you can take all of the components of intelligence, and let me to somewhat define that, is a program that can sense.

It can reason, it can plan, act, and adapt. And we see these components coming together in different systems today, in different parts of intelligence systems today. But the idea of them coming together in a cohesive platform, we’re still some distance away from that. And to some extent where I’m thinking, how does the world, lots smarter people than me are going to build these models that do these amazing things. But to me, I’d like to figure out how I can help facilitate the creation of platforms to enable all these things to be created cohesively by mere mortals. Not just the great, smartest minds on the planet.

Soma: That, that’s awesome, Bob. That’s great to hear. And I’m so glad you’re continuing to be fully focused on that mission because I think the world needs that kind of an infrastructure and the kinds of innovation that the infrastructure can provide. Like you say, that makes what I call building an intelligent application, something that every developer can do and not just the rocket scientist in the world kind of thing.

So that I’m a big believer in democratizing access to all the developers and so kudos to everybody who’s working on the infrastructure that’s going to enable that to happen. In a, I know that we are coming up on time before we wrap up. There is one thing that keeps coming up quite a bit.

When I talk to all these, what I call, modern data companies, startups, right, there are two things they ask about: “How should I think about open source?” and “How should I think about product-led growth?” These are things every startup founder or CEO is thinking about: when does it make sense for me to think about open source, and when does it not? Particularly given your experience with a variety of proprietary and open-source work, and with product-led growth versus enterprise sales, are there any parting words of wisdom you want to share with the next generation of entrepreneurs?

Bob: What I would say is, the biggest advantage of open source is the potential path to rapid adoption, particularly for a developer-focused technology: the ability to get more end users and developers using it more quickly. It’s appropriate if the component runs in the customer’s infrastructure.

I would say today it may even be essential if you expect a component to run as a core, integral part of an infrastructure. Kafka is a good example of this, right? Kafka is a perfect example of something that’s going to be sitting all over the place inside customers’ infrastructure, and they just want it to be open source for their ability to choose vendors and all sorts of things.

Those are good reasons to do open source. The challenge with open source is that you essentially have to abandon it to build a business. I’m not going to say it’s a ruse, but you’ve got to have an extended focus at the very least, where you’ve got open source and then you have something that’s commercial, because that’s the only way to monetize.

In the old days, people monetized open source with services, and that was Red Hat’s business model. That’s gone away with the cloud; the cloud doesn’t help that. You can’t just take what you put in open source and classically run it in the cloud, because the cloud vendors can do the same thing, and they have infinite distribution.

And their COGS are lower than yours, so you’re screwed from day one. But if you differentiate, starting with an open-source integral component and then building on top of it, in some ways it can be very successful. There are certainly examples of that. But again, a lot of these companies are going off and innovating in non-open-source ways right now.

Soma: Bob, fantastic to chat with you as always. Fun conversation, and I really appreciate you taking the time to be with us here and do this podcast. Thank you.

Bob: Great. It’s good to talk to you again, Soma. Thanks.

Coral: Thanks for joining us for this week’s episode of Founded & Funded, and don’t forget to check out our Intelligent Applications Summit event page if you’re interested in these types of discussions. Thanks again for joining us, and tune in in a couple of weeks for our next episode of Founded & Funded with dbt Labs Founder and CEO Tristan Handy.