Chroma’s Jeff Huber on Vector Databases and Getting AI into Production

Chroma's Jeff Huber on Vector Databases, Community-Led Growth, and Competition

This week, Vivek Ramaswami talks with Chroma Co-founder and CEO Jeff Huber. Chroma is an open-source vector database, and vector databases have taken the AI world by storm recently, with Chroma quickly emerging as one of the leading offerings in the space. Jeff’s background is in machine learning, and for him and his co-founder, starting Chroma was a classic “there has to be a better way” moment when it comes to going from demo to production. Vivek and Jeff talk about why on earth we need another database, the new wave of software engineering happening right now, community-led growth, the competitive landscape, and so much more.

The 2023 Intelligent Applications Summit is happening on October 10th and 11th. If you’re interested, request an invite here.

This transcript was automatically generated and edited for clarity.

Vivek: I’m Vivek Ramaswami. I’m a partner at Madrona, and today, I am joined by Jeff Huber. Jeff is the co-founder and CEO of Chroma, an AI-native open-source embeddings database, a category more commonly known as vector databases. Jeff, thanks so much for joining us today.

Jeff: Yeah, thanks for having me. Looking forward to the conversation.

Vivek: Jeff, you’ve got a history in the tech and startup world that predates your founding story at Chroma and I would love to hear about that. Maybe you can just share a little bit of background on yourself. How did you get into this world? How did you get to the founding of Chroma?

Jeff: Yeah, maybe the quick life story version of this is I actually was born on the Peninsula, grew up in North Carolina, and then moved back out to the Bay Area about 11 years ago to work in early-stage startups. It’s really fun to build things from scratch. A number of at-bats there. The most recent was a company called Standard Cyborg. We went through Y Combinator in the winter ’15 batch, which at the time felt like being late to YC, and now there’s winter ’23 or whatever, or summer ’23. So now winter ’15 feels like a long time ago. Standard Cyborg had many journeys, and I think a few of them are maybe relevant to our conversation today. We had built out an open-source, online, dense reconstruction pipeline for the iPhone. The iPhone 10 came out in 2018, and we thought this was really interesting. There was a new sensor that was going to be in everybody’s pockets, and that was going to enable accurate 3D scanning to be in every person’s pocket for the first time.

Apple just gave you the raw depth data, so we built up this whole SLAM pipeline around that, open-sourced it, and then we were working with a bunch of companies to develop both conventional computer vision-based approaches as well as machine learning-based approaches to analyze that data at scale. We were doing things like virtual helmet fitting, powering a smart glasses company’s virtual try-on and fitting solution, working with a cycling company around shoe sizing and fitting, all kinds of interesting sizing and fitting applications.

We were doing stuff with ML. It was, like we said, the Wild West. It still is today, but at the time, TensorFlow 1.5 had just come out. PyTorch wasn’t really a big deal yet. We were doing point cloud segmentation, going way beyond object detection in images. The thing that we really felt the pain of quite intimately was how hard it is to take anything that has machine learning in it from demo to production. That was the pain point that my co-founder, Anton, and I connected over: wow, machine learning is really powerful and likely has the opportunity to change the way that we build software and change the world more broadly, but getting from demo to production is incredibly hard. That’s actually part of the reason we started Chroma: to solve that problem.

Vivek: Super interesting. Well, first of all, you were building in machine learning far before it became cool in this current hype cycle that we’re seeing and with all the interest. What were the challenges of building in that period relative to today?

Jeff: Yeah. Obviously, large language models are a bit of a paradigm shift. Before, if you wanted to build a machine learning model to do a thing, let’s say, for example, draw a bounding box around all the cars in a picture, a classic computer vision use case, you had to train a model to do exactly that from scratch in many cases, or maybe you used a base architecture and fine-tuned on top of it. And cars are common enough. You could pick up a model probably fairly easily and get something decent, but let’s say that you wanted to draw bounding boxes around chairs, maybe that data doesn’t exist. It’s a huge effort to collect the data. It’s a huge effort to label the data. It’s a huge effort to train anything at all. And you can put in all of that effort before you even get your first taste of how good this thing could be.

Then you have to put in a month or more of work probably for each iteration cycle. You use the thing that you built, it works, we kind of have a sense for where it’s weak, and let’s do some things, label some more data to go try to make it better in that way, and then you wait two to four weeks to get your next answer. It’s a really long feedback loop. You basically have very little information as to why the thing is doing the thing that it’s doing, and really honestly, you have educated guesses for how to make it better, which makes for an incredibly infuriating developer cycle. Take somebody who has also built out React applications: you can change the style sheet and it dynamically updates instantly. You can get that feedback loop down to milliseconds. Feedback loops of months or more are really painful in software development.

So yeah, those are a bunch of challenges with traditional deep learning. I think with language models and this new zeitgeist that people are calling generative AI, the interesting change that happened is that there are now general-purpose models, and you can do things with just base models that before you’d have to put a ton of effort into. Now application developers can pick up the GPT-3 API and pick up some embeddings and, in a weekend, have an interesting thing, and that is a very new thing. Before, building even the small demos was only the purview of machine learning engineers, data scientists at best and was sort of untouchable for your average developer, not average in terms of skillset, just average in terms of backend development, frontend development, full stack development, infra. And now those engineers can build things, and they are building things. And that’s a pretty incredible and exciting thing.

Vivek: Yeah, the resources and tools available to you as a builder in this space now are obviously significantly greater than they were not even just five years ago, even a year or two ago, and that’s terrific for the devs and the engineers in this space. I definitely want to get into how you think about generative AI, and even before getting into Chroma itself. Let’s just start at a higher level. We’re all hearing about AI, we’re hearing about vector DBs, we’re hearing all of these things. Just in terms of this market, why does AI memory matter at all? When we talk about things like AI memory retrieval, just paint a picture of why that matters, how that connects to this generative AI boom that we’re seeing today, and all the applications like ChatGPT and others that so many of us are familiar with?

Jeff: One way to think about what’s happening right now in AI is a new wave of software engineering is starting. Historically, the only way to get a computer to do a thing was to write explicit instructions in the form of code, and hopefully, the computer followed those instructions, and hopefully, you didn’t write any bugs. If you followed those guidelines, you didn’t write any bugs, that program’s going to run 99.999999 times out of 100. It’s always going to work. That is not the case with AI. At least today, we don’t yet have the tools necessary to get that level of reliability. That being said, traditional software is also bounded because there’s only a very narrow set of use cases where you can so tightly define all of the inputs such that traditional software development works. Imagine the space of all possible programs that can exist; code written by humans likely only covers 1% of that map of all possible programs that can exist.

It’s a classic example in xkcd. I believe it’s a manager goes to his employee and says, “Hey, can you add this new column to store a last name?” And the person says, “Yeah, it would take three minutes.” And then he says, “Hey, can you also add a thing that says if they upload this photo, whether it’s a cat or a bird,” and the person says, “Yeah, that would take three years.” That was an xkcd comic from a while back. I think that has changed and has gotten a lot easier, but it remains true that it is difficult to bring things in AI to production. So there’s, I think, a lot of exciting work going on. Obviously, retrieval is one method. So zoom out again. We want to be able to program language models, we want to be able to get them to do what we want them to do every time. And you basically have two options, and they’re not mutually exclusive.

One option is to change the data that the model has seen in its training, and you usually do that today through fine-tuning. The other method is to change the data the model has access to at inference time, and that is through the context window, and that is through, generally, retrieval. Instead of programming with code, we program with data. And that’s what this new zeitgeist in programming will be about — how do we guide the execution path of language models, this new generation of software, and how do we guide it using data? And again, retrieval is a really, really, really useful tool. I think that most production use cases today need to use retrieval, and I think most production use cases in the future will also need to use retrieval. I think fine-tuning today is probably a fairly low percentage of use cases, maybe like 10% need fine-tuning, though I expect that number to go up over time.
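Editor's note: the second option Jeff describes, changing what the model can see at inference time, can be sketched in a few lines. This is only an illustration under stated assumptions, not Chroma's implementation: the word-overlap scorer stands in for a real embedding search, and the names `tokenize`, `retrieve`, and `build_prompt`, along with the sample documents, are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Return the k documents sharing the most words with the query.

    A toy scorer (assumption): a real system would rank by embedding
    similarity instead of raw word overlap.
    """
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Put the retrieved text into the prompt, i.e. the context window."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base, invented for illustration.
docs = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping to Canada takes five business days.",
]

prompt = build_prompt("How long do refunds take?", docs, k=1)
print(prompt)
```

The model itself is unchanged here. Only the data placed in front of it changes, which is the sense in which this style of development "programs with data".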

Vivek: That makes sense. And so retrieval is important, and I think there’s this new paradigm of companies that are being built. Especially if you are leveraging all of the AI tools that we have today, retrieval is more and more important. At the same time, one could argue that there’s a number of databases that already exist. And you just go on DB-Engines, and it lists 400 plus. Why do we need a new type of database? Why are we hearing so much about new databases and new vector DBs that are popping up, not just yours but others? What’s your sense of why we have this novel need?

Jeff: I think each generation of software, whether you look at web, mobile, web 1.0, they have different ergonomics and they have different needs that developers have, and that leads to new kinds of databases. What happened with web and mobile, developers both wanted to have faster development speed, so they wanted to get away from tightly rigid schemas and migrations, and then they also wanted to be able to support much larger scale. There’s the classic meme version of this, which is MongoDB is Web Scale, which is pretty funny to go back and watch, but there are some truths to that: different tools are better at different things. And you could put racing tires on a Honda Accord, but it does not make a race car.

In the current phase that we’re in, my recommendation to application developers is to not necessarily try to take an existing technology that has supposedly added vector search on top of an existing database and try to make it work for your application because application developers need to squeeze all of the juice and get all of the alpha out of the latest and greatest technology. And you can’t be six months behind or a year behind because you’re too far behind. Furthermore, you really need to have things like really great recall. Again, production is very hard in this world, and so doing things like getting much worse recall but in an existing database is not a trade-off that I think developers should make.

So there’s a bunch more to it, which I can speak to. Everything from integrations, ergonomics, dimensions around scalability. There’s a lot to do with even just the low-level data structures that certain storage algorithms inside of traditional relational databases, for example, are good for and certain things that they’re not good for. I think, in some ways, the proof is in the pudding. Developers are reaching for, in many cases, these purpose-built solutions because they solve their problems. And ultimately, that’s what developers want. Developers want to build things, and developers do not want to spend lots of time taking a Honda Accord and retrofitting it to be an F1. They just want to go. They want to build something. And if both tools are open-source and both tools are free, there’s kind of an obvious answer there.

Vivek: I think you named a number of dimensions in which, to your point, you can’t retrofit a very different kind of car with different types of tires. It needs to be purpose-built for what the engineer and what the developer is trying to do.

Jeff: Adding to that, we’re just so early in all of this stuff. We’re just so early, and I think it’s comical how people are adding vector search to their existing databases. Oftentimes, they are pretty bad versions of vector search. There’s a ton of hair on them not advertised, but they add it and they throw up the marketing page, and the VP of marketing is happy, and so is the CEO. Maybe the public market is happy too. We’re so early.

It is not even obvious that cosine similarity search will be the predominant tool 18 months from now. Vectors are points, and search is a fancy ruler. Points and rulers: I do not think that’s a sufficient tool set to build a production application. A lot more has to exist and has to exist at the database level. I think the other point here is that we’re just so early to what memory for AI means and looks like, and it’s pretty likely that a learned representation ends up being the best anyways. You actually want to have neural nets running against your data at this very low level to power retrieval. And that’s how you get the best memory. And if that’s how you get the best memory, that’s what you’ll want to use.
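Editor's note: for readers unfamiliar with the "points and rulers" picture, a brute-force cosine similarity search might look like the sketch below. The three-dimensional vectors are made up for illustration (real embeddings come from a model and have hundreds or thousands of dimensions), and `cosine_similarity` and `nearest` are hypothetical helper names, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """The 'ruler': cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, points, k=1):
    """Rank every stored point against the query and keep the top k names."""
    ranked = sorted(
        points.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# The 'points': invented embeddings for three items (assumption).
points = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}

print(nearest([1.0, 0.0, 0.0], points, k=2))
```

The ruler here is fixed: it measures the same angle for every query. That is the limitation Jeff points to when he suggests a learned representation may end up serving retrieval better.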

Vivek: Yeah, you made the point. It’s sort of like how every app is trying to figure out their generative AI strategy or their generative strategy, and they’re just creating something that sits on top of OpenAI or GPT-3 or 4. And now it’s, we can add this vector capability, but time will tell how much that really moves the needle or not.

Jeff: For the record, for the database companies listening to this, keep doing it, please, because you’re educating the market and doing me a huge favor, so we appreciate it.

Vivek: That’s great. Well, let’s get to Chroma then. So you kind of laid out really nicely what the market looks like and the needs for developers today in the kind of databases that they need and especially as they’re building these kinds of products. So what was the founding story to get to Chroma itself? Where did you see that there was a unique need for you and your co-founders that led you to this ah-ha moment?

Jeff: I’ll say upfront, in many ways, I think we’ve gotten really lucky, and I think the community ultimately is why the project has grown. We released the project on Valentine’s Day of this year. It was the day after it was supposed to release, because we had a bug we had to fix. The growth we’ve seen from those early days to this point in terms of the number of users and people talking about it online has been entirely organic. For most of this year, we’ve been a very small team. We still are, to this day, a very small team. That will remain true for probably some time, honestly, into the future. With a small team, you can only do so many things. And the thing that we chose to focus on was trying to serve our users, the developers who use Chroma: trying to serve them as best as possible, take their feedback seriously, give them help on Discord as fast as possible.

The community helped to make it better in terms of giving us feedback, opening up pull requests, and opening up issues. I think we’ve been lucky thus far on our path to this point. We started the company, again, because we felt like the gap between demo and production and building anything with machine learning or AI was too hard. We suspected that embeddings mattered, that there’s something about latent space and embeddings, which, pun intended, had some latent alpha. There’s something there that has not been fully explored yet. When you go on an exploration, you can have a hunch that there’s a promised land at the other end of the journey. You don’t know. You’re not certain. And so, the thing that you want to do is run really clean experiments. They’re tightly bounded for time. You’re very, hopefully, opinionated about whether something’s working or not. And don’t trick yourself. Don’t trick yourself into thinking, “oh, this is good enough, we’re getting good enough reaction to this, let’s spend another three years of our lives.”

I think that we as former founders had a high bar for what we wanted to see in terms of what existed versus having to push it for years and years. There’s a longer story there, but basically, we had built out a bunch of retrieval systems ourselves to serve some analytics we were doing for other use cases in embedding space. We found that we didn’t feel like any of the solutions that existed really addressed us as developers. We felt like they were really hard to use, really complicated, really strange deployment models, really strange pricing models, just didn’t fit. It just didn’t make sense. One of those classics, there had to be a better way, and started talking to some more people, some more users that were building stuff back in December when LangChain was a very early project. OpenAI embeddings had just gotten 100X cheaper or something like that with ada-002 embeddings.

And yeah, just started talking to a bunch of users and found that they shared the sentiment that something that was easy to use and powerful didn’t exist. And so we’re like, that’s interesting. Let’s go see where that goes. Let’s follow that trail and see where that goes. And that led us to, again, launching the project in open-source, and now cumulatively, the Python project has crossed 1.5 million downloads since launch and 600,000 plus of those in the last 30 days.

Vivek: Let’s talk about the community. It’s important and integral in the Chroma story and the momentum you’re seeing now and where you’re going. You talk about how you got some users, and people start to find out about you. There’s some organic growth. I’m sure a lot of the founders listening to this podcast, especially the ones who are building open-source products, are like, how do I even get the first few people in? How do I seed the initial community and start to attract more people? Is there something that you and your founders did to help build that up, or would you say it’s truly organic? I’m sure there’s some elements of both.

Jeff: The general sentiment is if you build it, they will not come. I have certainly always believed that because people underrate distribution, overrate technology day in and day out. I’ve done that before. It feels like that didn’t apply here for some reason. I’m not exactly sure why. It feels like we kind of did build it and they came. So I don’t know if there’s really much to learn.

I think one thing that we did do unintentionally is at the end of December, I had reached out to a bunch of people on Twitter who had tweeted about LangChain and embeddings and all this stuff just to get their feedback on what we were building and see if they shared our opinion. What I realize now is that we not only were doing user testing and user feedback, but we were also talking to the loud users who had large presences, generally speaking, or were active on Twitter and thinking about the edges of technology. And so in a weird way, we ended up, I think, getting a lot of the early adopters and the early vocal people to be aware of what we were building and why we were building it and in many cases using what we were building. And again, that was very unintentional, but that didn’t hurt.

Vivek: It’s interesting. It’s funny when we say, “back in December” as if that’s years and years ago. It was eight months ago, but it feels like that because of how much the entire ecosystem has grown since. You talk about early days of LangChain and the number of users that they have now and what that community looks like and what Chroma is growing into. Something I think it would be interesting to hear your take on: is building a community in this era and this cycle in AI very different from building a community a few years ago? One of the reasons I ask is I think we all see the number of GitHub stars. Getting to 50,000 stars happens much more frequently now than it did before, and you can kind of go down the list of metrics that matter and what you think about this open-source community. How do you think about building a community today and growing that and sustaining that community because a lot of this is going to be novel relative to maybe the open-source products of yesteryear?

Jeff: Yeah. I don’t know that any of the best practices have changed. I think you need to care about the people who are engaging with your thing. I think you need to give them good support. I think you need to try to get them on Zoom calls, and get to know their names, and get to know what they’re working on, and figure out how you can make what they’re working on better if you can. There’s just a lot of boilerplate community stuff, which isn’t particularly glamorous, but it is the way. There’s a difference, maybe with AI. In some ways, it’s a good thing. In some ways, it’s a bad thing. So in some ways, it’s a good thing that there’s just a lot of eyeballs on this stuff, and people are paying attention to it. Top of the funnel problem is a little more solved than it is in other areas where you’re trying to get people to pay attention.

I’ve heard stories of the early days of HashiCorp with the whole infrastructure as code movement. And really, it took years of going to conferences and tons of evangelism to get people to understand what this thing is and why they should like it. It’s not quite like that. It feels like people, at a minimum, want to like it, and their CEO is saying that they should have an AI strategy. There’s the zeitgeist. The downside of that is that you get a lot more tire-kicking type applications, or not every user who joins our Discord is going to stay engaged. The Discord is bigger than probably it would be if we were building infrastructure as code in 2017 or whenever. HashiCorp got started earlier than that. But the share of engaged users is also, percentage-wise, probably lower than theirs was at the time. I don’t know, and it’s kind of a mixed bag. It’s kind of weird and interesting to have that top-of-the-funnel problem halfway solved. That’s sort of a strange thing, I think.

Vivek: Yeah, I like the top-of-the-funnel analogy you draw because you’ve got a lot of people interested and willing to try something. Now the question is, how sticky can you make it? So you’ve got this big top-of-the-funnel, and then you’re going to have some percentage that comes down, and how do you keep them on? So any lessons that you have learned from that process? Which is saying, we’ve got all these people, a lot of initial tire kicking interest, how do you get them to continue using?

Jeff: Yeah, I think we’re in some ways not that focused on that as a metric. Clearly, if you get someone to use it and then they stop using it, it could be because what you’ve made sucks, in which case you should care. It could also be because, I don’t know, they had a spare couple of hours on a Saturday while their kid was at soccer practice, and they hacked something out for fun. And they’re going back to their job on Monday, and they churned. But is that a bad thing? I think, no. I think we’re very much in the age of experimentation with all this stuff. In some ways, what we care about is maximizing the number of things that are built. Even if a lot of those things don’t necessarily stick immediately, we take a very long-term view of this.

And so what matters right now is building the best thing possible that serves the community and making sure people don’t get stuck and trying to increase the chances that they get to production, that what they build is reliable enough that they would want to use it. And then I think there’s some just returns to time. We’ve been going for five months. What will be true five years from now? What will be true 15 years from now? Just keep going. Also, if you become obsessed over things like churn and retention, you start doing a lot of bad behavior, which is anti-community. And so you start gathering people’s emails and sending out terrible drip content marketing just to try to keep them engaged. You do weird, bad stuff. And I never wanted to join a community that did that to me. And so I feel like there’s a degree to which you can overemphasize that.

Vivek: I think it’s a very good reminder of how early we are in all of this. Where my brain goes to funnel metrics, what does that look like over time? But really, this is experimentation. We’re just so early in this. To your point, ironically, I’m sure a lot of the classic marketing elements will do more to push people out than it will to bring people in and keep them engaged.

Jeff: Exactly. Broadly, people want authenticity, and the developers, first and foremost, just want authenticity.

Vivek: This is clearly one of the most exciting spaces when I talk about vector DBs in this broader category of AI we’re seeing right now, which also means that there are a number of folks that are doing this. And there are a number of competitors, and it seems more often than not, we hear of a new competitor coming up and getting funded and all of that. How do you think about operating in a space where a number of new companies have emerged and raised money? And I’ll ask this in two ways. I think one is there’s competing for users over time that’ll be customers, but the second part is competing for talent, which is not easy in any environment, but I think especially so in a highly competitive space like this. We’d love to get your thoughts on both.

Jeff: Yeah, in open-source, we don’t call them competition. We call them alternatives. I think, in some ways, the first rule of competition is don’t be distracted by the competition. If they were perfect, they would’ve already won. And you’d have no room to be good, and you’d have no room to win. I think we don’t spend a lot of time paying attention necessarily to what the alternatives are doing. We spend a lot of time focusing on our users and what they need. That’s sort of the reactive side. The proactive side is, again, going back to this drum I keep hitting, which is there’s a gap between demo and production. It is true about traditional deep learning. It is also true about language models and embeddings. It’s easy to build a sexy demo in a weekend, and people should. It is hard to build a robust, reliable production system today. Still very hard. And there’s a lot that needs to exist to get people there.

I think when you think about our obsession with developer experience, our rich and deep background in AI, our location in San Francisco, I think those will play out to be differentiating features and factors. I wish our competitors, our alternatives, well. I feel like we all had different theses about what’s happening here and what’s playing out. There’s maybe one point, which I will say, is for better or for worse, we are building a lot of this stuff in 2023. And in that sense, we’re “late to the game” maybe compared to other people, but I think that we’ve made some different decisions specifically in the architecture of what we’ve built and are building. Those decisions, you probably couldn’t see a couple of years ago because nobody could see a couple of years ago. Those are a little bit hard. If you’ve ever made those decisions, they’re a little bit hard to unwind. Again, not necessarily important to get into detail. I think all that really matters for us is to serve the community we have well. We just got to do that.

Vivek: When you serve the community well, good things happen. And to your point, I think that 2023 we’re saying, “We can’t call it late because the next best time is now for all of this stuff.” It’s like things are happening right now, and this is when you want to jump in. We’d love to get your thoughts if we were to zoom out for a second; we’re kind of nearing what I would at least call the first year of this new generative AI boom that we’re in right now.

If I think of ChatGPT launching in November of 2022 as sort of the initial point, you can say we’re ending the first year, and we’re already starting to see from some applications there’s been a little bit of a decrease or at least a flattening of new users and a little bit of the muted version of that exponential growth we saw a year ago, especially on the application side. You have a really interesting vantage point as a founder building this space, more so on the infrastructure side than on the end-user application side. Give us your perspective. What’s happening here? Where are we in this game? Why might we be seeing some flattening in the growth, and what do you think is happening in all of this?

Jeff: I think it’s sort of natural. It’s like there’s a new technology. If you look at the curve of LK-99 reproductions, I was just thinking it’s all a very similar curve. It’s going to be this ramp-up really fast. Everyone’s like, what’s happening? Go, go, go. And then it’s going to taper off, and there’s going to be a bunch of people trying different variants of it to see if you can do something differently. I’m not an expert in this. I won’t pretend to be either. Eventually, maybe somebody finds a variant that works, and then it has this long, what is it called, the plateau of productivity in the adoption cycle, the hype cycle. It feels like with Twitter and AI stuff, these hype cycles are extremely compressed. It was pretty funny to watch.

Right before we raised, vector databases were just on the way up. And then everybody on Twitter is like, vector databases are stupid. There was an immediate trough of disillusionment four weeks after we raised. And now it’s just a long road back up where people are like, “Oh, actually, long context windows, maybe they’re not a panacea with current architectures. Maybe it is useful to be able to control what goes into these context windows. We don’t want to just dump it all in there because these things could distract.” It’s just funny. These hype cycles are so silly. I think that the biggest risk here, period, to this entire space is whether we can figure out a way as a community, as an industry, whatever you want to call it, to cross the chasm from a lot of demos that have been built to a lot of useful production applications, and that means many nines of reliability. Not one nine, 90%, which is even probably pretty good for a lot of LLM-based stuff today, but many nines of reliability.

If we can figure that out, then the amount of value that will be created, I think, through this kind of technology is easily in the trillions. It’s just really, really, really, really big. If we cannot, if it remains more, it’s great for some internal tools, it’s cool for some creative stuff, but we’re not going to put it in front of a user because it’s not good enough, or we can’t predict where it’s going to do well enough, so we’re just not going to put it in front of a user. We can’t trust it, ultimately, if we can’t get to that point. It’ll still be valuable, but it won’t be as valuable. It’ll be 1/10, maybe 1/100 as valuable. That’s something that we think a lot about, and so do the language model companies, and so do many of the other observability folks in the space. Everyone has the same goal, the same mission, which is how can we turn this thing from today, which is alchemy, and how can we turn it into engineering?

Vivek: Oh, that’s a great point, the alchemy to engineering point. I think just going back to what you said earlier is so true, which is the hype cycles were always fast in tech, but they have been increasingly compressed. I’m sitting here talking about nine months ago, there’s euphoria, and now the whole thing has been compressed so much. We’re in agreement with you, which is you’re going to have a natural tapering off of these things. I do think it encourages everyone in this space. There’s great experimentation happening, and then the next phase is going to be what’s useful? How do we get this in production and get someone to actually use it versus talking about all the cool marketing, hand-wavy use cases? I think when that happens, there’s no doubt that it’s going to be game-changing for every application, every piece of software, every workflow that we can think of.

Jeff: Exactly. I think more broadly, multimodal, large model-aided software development will more fully approach the space of all possible programs. That’s a good thing for humanity.

Vivek: Well, before we get to the lightning round, take us through where Chroma is going next. You’ve built this amazing community, and you’re getting an incredible amount of traction. In the last month, two months alone, you’re seeing the majority of your usage and growth come in, and so there’s this amazing compounding effect. What’s next for the company, and the community, and the business going forward?

Jeff: We want to do what’s best for the developers that use Chroma today and in the future. A few things need to happen to do that better than we do today. Number one, we need to have a good story around the distributed version of Chroma, which is mostly about scale. I think most data is not big data, but big data does exist and people want to have a path to scale, and I understand that. We’re working on that right now. It’s a distributed, cloud-native database version of Chroma, which will scale forever in the cloud. The next thing we’re working on is a hosted offering. Many developers start out hacking on something on their local computer, and then they want to throw it up on a hosted offering to make it easy to share with their team or with their friends. There needs to be an answer to that, and it feels like none of the existing solutions are that answer. We’re really excited about that, and that’s coming down the pipe pretty soon.

And then the other big basket of work, I’d say, is that I want to help the community answer a lot of the questions that every developer faces in building applications. It’s very basic things. How should I chunk up my documents? Which embedding model is best for my data, both my data and my query pattern? How many nearest neighbors should I retrieve? 3? 5? 10? Are these nearest neighbors relevant or not? Did I get five of the same thing? Did I get five that are super far away? This is the most basic workflow that every single developer faces when building with these tools and technologies. And there are no good tools or strategies really that exist, at least that have been productized or popularized to help people cross the chasm.

So I would say those kinds of questions are the first task that our applied research team will focus on. Going deeper and going longer, again, our goal is to create programmable memory for AI. I alluded to earlier in the conversation that viewing cosine similarity search as the end of the road for advancements in programmable memory for AI would be extremely foolish. We’re not going to do that. We’re going to continue to aggressively research and build more and more powerful approaches to do that well because it matters. We have to cross the chasm here.
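The retrieval questions Jeff lists (how many nearest neighbors to fetch, whether they are relevant, whether they are all the same thing or all far away) can be made concrete with a small sketch. This is not Chroma’s API or anything described in the conversation. It is a plain-Python illustration of cosine similarity search with two sanity checks, and the function names, the toy three-dimensional vectors, and the `min_similarity` and `duplicate_similarity` thresholds are all assumptions chosen for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(query, docs, k):
    """Return the k (doc_id, similarity) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def diagnose(results, docs, min_similarity=0.5, duplicate_similarity=0.98):
    """Flag retrieved items that look irrelevant or redundant.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    # "Did I get five that are super far away?"
    too_far = [doc_id for doc_id, score in results if score < min_similarity]

    # "Did I get five of the same thing?"
    near_duplicates = []
    for i in range(len(results)):
        for j in range(i + 1, len(results)):
            first, second = results[i][0], results[j][0]
            if cosine_similarity(docs[first], docs[second]) >= duplicate_similarity:
                near_duplicates.append((first, second))

    return {"too_far": too_far, "near_duplicates": near_duplicates}


# Toy embeddings (hypothetical); real ones come from an embedding model.
docs = {
    "a": [1.0, 0.0, 0.0],
    "a_copy": [0.99, 0.01, 0.0],
    "b": [0.7, 0.7, 0.0],
    "c": [0.0, 0.0, 1.0],
}
query = [1.0, 0.1, 0.0]

results = top_k(query, docs, k=4)
report = diagnose(results, docs)
print(results)
print(report)
```

With these toy vectors, the report flags `c` as too far from the query and `a` and `a_copy` as near-duplicates of each other. On real data the thresholds would need tuning against the embedding model and the query pattern, which is exactly the kind of open question described above.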

Vivek: Really exciting roadmap and I’m excited to see where all this goes. Let’s dive into the lightning round here. Let’s start with the first one. Aside from your own company, what startup or company in general are you most excited about in the AI space, and why them?

Jeff: I’d say Meta. Not a startup, clearly, but I think they’re interestingly counter-positioned to a lot of the other labs. I think that the open-source motion has been extremely bold. And I think that open-source matters here for a few reasons. Number one, it really matters because a lot of the advanced retrieval and a lot of the advanced applications that you’ll want to build need access to a level of the model where the weights could leak if you’re using a closed-source model. Again, in that goal of going to production, open-source models matter because you want to do that level of surgery, and in order to do that level of surgery, as we currently know and understand, it would be a leaking situation for a closed-source model.

Vivek: They are doing amazing things here. Outside of AI, what do you think is going to be the next greatest source of technological disruption in innovation over the next five years?

Jeff: Does it have to be a contrarian view?

Vivek: Any. Spicy takes or non-spicy takes. Whatever you want.

Jeff: I’ve written about this before. I feel like, in some ways, the two fundamental primitives of advancement are energy and intelligence. You need to have energy to take stuff out of the ground, energy to move it around, and energy to repackage it into other things. You need to have intelligence about what should we take out of the ground, where should we move it and how should we combine it together. And so broadly, technologies that lower the cost and hopefully increase the sustainability of both energy and intelligence are sort of in the direction of flourishing. So obviously, I think AI is marching along the intelligence sector, whether you believe that we’ll get to artificial super intelligence or not, which I’m not sure that I do. And then I think on the energy side, there’s a lot of interesting things happening already around net positive fusion, new geothermal techniques, the list goes on and on. You probably know more about this than I do, but I think that’s pretty exciting too.

Vivek: I was going to say that I think this might be the first episode we have that we talk about LK-99s because it’s probably the first one after this.

Jeff: This might be the first and only episode you talk about LK-99 — now that it’s been disproved, so I guess I’ll take that honor.

Vivek: Exactly. Yeah. We caught the moment in time really, really well. That’s awesome. Okay, last one. What is the most important lesson that you’ve learned over your startup journey but specifically related to talent? You’ve been a co-founder, you’ve been a founder before, you’ve been in this game for a long time. You know that talent is difficult to acquire and retain. Any lessons that you can share with the audience?

Jeff: I think I’ve really appreciated the idea of go slow to go fast. How that pertains to hiring is I think founders should be extremely picky about the technical bar for engineers, or obviously the broad talent bar for other roles, as well as cultural fit and stuff like EQ. How much of a grownup is this person versus not? And a lot of people are not. I think that that consistently gets underrated. I think what you want to do is build this very small, tight-knit team that shares the same vision and is working in the same direction with a ton of trust. That’s so rare. If you want to make magic, you need to do that. Magic might not still happen, but that’s the best chance that you have to make magic.

And to do that, you have to think in a bit of a weird way about what you’re doing, in a, dare I say, historical sense. It’s a weird mentality to put on. I think most people don’t want to allow themselves to be that choosy, and they believe, “I’ll just play the hand I’m dealt. It’s better to hire somebody now than hire somebody three months from now.” But if you look at lots of success stories, take Stripe: I think it took them over a year to hire their first engineer. I think that even after that, they were still under 10 for a couple of years. But that early DNA and culture of Stripe, even to this day, is paying huge dividends.

Another way to say that is if you believe that what you’re doing has compounding returns across time, you should optimize for the long term. And that’s for both personal health but also things like team composition. Do the thing that you want to be true 10 years from now. I think that’s hard because most founders, including myself, are not very patient. I always want to go faster than we’re going, but I think it’s right. We’ll see if it’s right for us, but I think it’s right.

Vivek: I think what you say is so true, which is coming from the fact that you’re a second-time founder, you’ve been through this before, allows you to take a much longer view. I would think that it’s probably easier for you to step back and take a look at the bigger picture versus eight years ago, your previous startup and that original founding journey. It gives you that perspective, which I think is really helpful for everyone to hear. Jeff, thank you so much for your time. I really appreciate everything, and best of luck with everything at Chroma.

Jeff: Thanks so much. This has been great. Appreciate it.

Coral: Thank you for listening to this week’s episode of Founded & Funded. Please rate and review us wherever you get your podcasts. If you’re interested in learning more about Chroma, visit That’s If you’re interested in these types of conversations, visit to learn more about our IA Summit and request an invite. Thanks again for listening, and tune in in a couple of weeks for our next episode of Founded & Funded with NYSE Group President Lynn Martin.

Related Insights

    Cohere’s Ivan Zhang on Foundation Models, RAG, and Feedback Loops
    Seattle Tech Week: Thank You to All Who Made It a Success
    Snowflake vs. Databricks: Two Cloud Giants Battling in the AI Domain
