AWS re:Invent — It's All About Applied AI

TL;DR

With AWS re:Invent 2023 kicking off November 27th, Madrona Managing Director Matt McIlwain dives into what to expect at the conference. He examines the significance of Bedrock, an AWS service that enables developers to access, customize, and operate a broad set of AI models, including Amazon’s own. Bedrock represents a distinctly different approach from OpenAI for developers working with GenAI, and Matt expects we will hear more about Amazon’s strategy to be a central player in the rapidly growing and evolving field of applied AI. Matt shares his ideas for who may join AWS CEO Adam Selipsky on stage and what it would signal to the cloud and AI ecosystem.

A few weeks back, Amazon announced its third-quarter earnings, which included $143 billion in quarterly revenue — only 16% of which came from its most profitable business: AWS. The company generated $11 billion in operating income in the quarter, with over $21 billion of free cash flow for the twelve months ending September 30, 2023. These results were very strong and received a positive market reaction.

So, it may have been a surprise to the analysts on the earnings call that Amazon CEO Andy Jassy chose to focus a substantial portion of his prepared remarks and responses on a new AWS AI service – Bedrock. Andy named the service seven times in his prepared remarks and seven more in the Q&A. He framed it this way: “Bedrock is the easiest way to build and scale enterprise-ready generative AI applications, and a real game-changer for developers and companies trying to get value out of this new technology.”

The prioritization of this newly released service leads one to the following questions: What is “Bedrock”? Why did that service get so much attention on the earnings call? Why is Amazon in general and AWS in particular “all about AI” today? And, what does this all mean for what to expect at AWS’s re:Invent conference taking place the week after Thanksgiving?

What is Bedrock?

Bedrock is an applied AI managed service that sits above the hardware (Nvidia, AMD, other chips) and below a variety of foundation models to help customers tailor, optimize, and orchestrate those models for deployment. It is a systems layer service that simplifies experimenting with specific models and the underlying hardware and helps to optimize those models to run as a service as part of an intelligent application.

During the earnings call, Andy said Amazon Bedrock “offers customers access to leading LLMs from third-party providers like Anthropic, Stability AI, Cohere, and AI21, as well as from Amazon’s own LLMs, called Titan, where customers can take those models, customize them using their own data but without leaking that data back into the generalized LLM, and have access to the same security, access control, and features that they run the rest of their applications within AWS — all through a managed service.”

In many respects, Bedrock is a marketplace of AWS-hosted models that can be discovered, evaluated, customized, and then deployed via an API by end customers. Amazon is very familiar with marketplace models, including its third-party sellers’ marketplace, Prime Video, and the AWS application marketplace, among others, which often require substantial fixed costs to enable. These points of aggregation provide a greater selection to customers and create a layer where Amazon can offer services to both the providers and consumers of the items in the marketplace. Given its historic market leadership in cloud computing, AWS is a natural aggregation point for developers and businesses wanting to leverage their data and context to build and deploy tailored models and intelligent applications.

With Bedrock, Amazon is taking a distinctly different strategic approach to attracting developers to its platform than OpenAI does. OpenAI, backed by Microsoft, is ironically a comparatively closed system that only offers OpenAI models and products like GPT-4, DALL-E 3, and ChatGPT. In addition, the OpenAI models are far more closed when it comes to modifying the underlying model weights, fine-tuning the models, and otherwise customizing or optimizing them for specific applications. OpenAI, even after its newest model releases and pricing adjustments last week, is more expensive to run in production than most open-source alternatives.

Why Did Bedrock Get So Much Attention on the Amazon Earnings Call?

Bedrock is strategic to Amazon and AWS. As noted above, it is a layer between the computer chips/infrastructure and the AI models. Both chips and models are increasing in variety and selection, highlighting the need for aggregation and optimization. Many internal businesses within Amazon are rapidly experimenting with and deploying generative AI enhancements and, in some cases, will use Bedrock. This includes the various online stores, its advertising business, customer service operations, and others. They all need a platform for trying different models and evaluating different chips and customization strategies, such as prompt engineering, fine-tuning, and retrieval-augmented generation (RAG). Beyond these internal customers, there is a massive opportunity for AWS to support the existing ecosystem of developers, software providers, enterprise customers, and solution providers in the largest area of innovation over the coming years.
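To make the RAG strategy mentioned above a little more concrete, here is a minimal sketch in Python. It is a toy under stated assumptions: the documents, the word-overlap scoring, and the function names are all hypothetical, and a production system would typically rank passages with embeddings and send the resulting prompt to a hosted model rather than stopping at the prompt string.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=1):
    """Return the k documents sharing the most words with the question.

    Word overlap is a stand-in for the embedding similarity a real
    retrieval system would normally use.
    """
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Prepend the retrieved context to the question (the 'augmented' step)."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical company knowledge base, invented for illustration.
DOCS = [
    "Returns are accepted within 30 days of delivery.",
    "Standard shipping takes three to five business days.",
]

QUESTION = "How many days do I have for returns?"
prompt = build_prompt(QUESTION, DOCS)
print(prompt)
```

The point of the sketch is that the base model is never retrained: the customer's own data reaches the model only through the prompt, which is one reason RAG is often cheaper to try than fine-tuning.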

The landscape at each layer of the applied AI stack is rapidly changing, and a unifying layer where customers can discover and try the latest model options is likely to become an enduring hub. Amazon understands how to build these types of ecosystems and deliver iterative cost improvements and standardization to the broader market.

Nvidia and Microsoft/OpenAI are well aware of this dynamic and have their own efforts underway to attract developers and data scientists to their platforms. One of Nvidia’s more overlooked strategic capabilities is its Compute Unified Device Architecture (CUDA) software platform, a software layer that enables developers to optimize how they run models and applications on Nvidia GPUs. As Nvidia has become an increasingly successful provider of cloud-based GPUs, it has attracted over 4 million developers to the CUDA platform. It is also working to develop more direct relationships with end-user enterprises by offering the Nvidia DGX Cloud delivered through major cloud service providers like Google, Azure, and Oracle (but not AWS) and emerging GPU clouds like Lambda Labs. While CUDA and DGX Cloud are not directly competitive with services like AWS Bedrock (and a more nascent Azure offering called Semantic Kernel), the Nvidia capabilities are designed to attract similar developers and customers to a more “full stack” Nvidia platform.

These offerings will keep aggregating different model types and model-chip pairings that provide customers with the optimal cost and capability blend. Real trade-offs exist for intelligent application builders between performance, accuracy, reliability, and cost, and one customer may be willing to accept 99% accuracy for 90% less cost! There will also be ongoing security, reliability, and governance concerns for enterprise customers and software companies that embed GenAI capabilities into their applications. Outside the CSPs, Nvidia and some emerging AI-focused cloud companies, including models-as-a-service businesses such as OctoML, are also flourishing.
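The accuracy-versus-cost trade-off described above can be sketched as a simple selection rule: pick the cheapest option that still clears the accuracy bar a given customer needs. Everything in this Python sketch is an assumption for illustration; the model names, accuracy figures, and prices are invented and do not reflect any real product or pricing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelOption:
    name: str                    # hypothetical label, not a real product
    accuracy: float              # fraction of evaluation tasks answered correctly
    cost_per_1k_requests: float  # illustrative dollars, not real pricing


def cheapest_meeting_target(options: List[ModelOption],
                            min_accuracy: float) -> Optional[ModelOption]:
    """Return the lowest-cost option whose accuracy meets the target, or None."""
    eligible = [o for o in options if o.accuracy >= min_accuracy]
    return min(eligible, key=lambda o: o.cost_per_1k_requests, default=None)


# Invented numbers: the second option gives up a little accuracy
# in exchange for a 90% lower cost than the first.
OPTIONS = [
    ModelOption("large-general-model", accuracy=0.999, cost_per_1k_requests=100.0),
    ModelOption("small-tuned-model", accuracy=0.99, cost_per_1k_requests=10.0),
    ModelOption("tiny-model", accuracy=0.95, cost_per_1k_requests=2.0),
]

choice = cheapest_meeting_target(OPTIONS, min_accuracy=0.99)
print(choice.name)  # prints "small-tuned-model"
```

A customer with a stricter bar (say 99.9%) would land on the expensive option, and one with a looser bar would land on the cheapest, which is why a layer that makes these comparisons easy to run is valuable.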

Why Is Amazon, and AWS in Particular, So Focused on Applied AI?

Amazon is focused on this market because applied AI is where the greatest disruption is occurring in technology. The companies providing the tools and services developers use to store their data, train their models, build intelligent applications, and deploy them will be their most strategic partners.

Amazon has an opportunity to partner with many of the open-source model builders and systems while also creating differentiated internal models. The company can offer greater choice and flexibility on its cloud platform relative to OpenAI, which has a more proprietary approach and set of constraints. And, as techniques like fine-tuning (with LoRAs), RAG, and model ensembles enable developers to design customized applications, the ability to rapidly experiment and learn will become even more important. During the earnings call, Andy called Bedrock “advantageous” because it allows customers to leverage Amazon and third-party models of all sizes while making it simple to move workloads between them.
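A rough calculation shows why LoRA-style fine-tuning makes rapid experimentation cheaper. LoRA (low-rank adaptation) freezes an existing weight matrix and trains two much smaller matrices whose product is added to it, so only a small share of the parameters is updated. The Python sketch below counts trainable parameters for a single layer; the layer size and rank are assumed values chosen for illustration, not figures from any specific model.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter on the same matrix.

    The frozen matrix is adjusted by B @ A, where B is d_out x rank and
    A is rank x d_in, so only rank * (d_out + d_in) values are trained.
    """
    return rank * (d_out + d_in)


# Assumed, illustrative sizes for one layer.
D_OUT, D_IN, RANK = 4096, 4096, 8

full = full_finetune_params(D_OUT, D_IN)
lora = lora_params(D_OUT, D_IN, RANK)
print(f"full: {full:,}  LoRA: {lora:,}  share: {lora / full:.2%}")
# prints "full: 16,777,216  LoRA: 65,536  share: 0.39%"
```

Under these assumptions the adapter trains well under 1% of the layer's parameters, which is part of why developers can afford to try many customized variants of the same base model.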

What Should We Expect From AWS at re:Invent?

In light of Amazon’s emphasis on AI and new services, including Bedrock, what should we expect to see at AWS re:Invent? Starting with the safest prediction, we will see the usual series of new compute, networking, storage, data management, and security enhancements to AWS’s market-leading services. The most interesting of these new services will be in AWS’s AI chips (Trainium and Inferentia) and modern data management services (data ingestion, cleaning, and management for unstructured and semi-structured data that helps train AI models).

On the AI front itself, we expect a big emphasis on the tools and strategies that help customers move from prototype to production. I recently interviewed a Slalom executive who shared that they have 400 GenAI projects underway, but only 20 are in production (most of those for internal rather than external customers). The vast majority of companies are, at most, in the prototyping phase of GenAI. As the diversity and complexity of GenAI techniques expand, customers will need more guidance on optimizing their models and the applications they power. AI prototypes frequently struggle to meet the accuracy, cost, and performance goals needed for production. AWS will announce further capabilities to help different customers build or adopt GenAI services that have a clear ROI.

To conclude with some riskier predictions, plenty of key people and partners could appear on stage at the main AWS keynotes to put an exclamation point on AWS’s AI strategy. I have been wrong on these predictions before, so reader beware! For 2023, when the world is all about AI, here are a few guesses as to what might happen:

Andy Jassy joins Adam Selipsky on stage. I think this will likely happen as Andy can speak about how other parts of Amazon are embracing AI and leveraging AWS capabilities to experiment and more rapidly deploy their intelligent application services.

Dario or Daniela Amodei, co-founders of Anthropic, appear on stage. I think this is almost certain to happen given the recent $1.25 billion investment (and up to $4 billion commitment) by Amazon in Anthropic, Anthropic’s commitment to using Amazon chips, and positioning relative to Anthropic’s other key partner – Google.

Jensen Huang, Nvidia CEO, appears on stage. I think this is unlikely but possible. Even though there is some “co-opetition” between AWS (its chips, services like Bedrock) and Nvidia (pushing for DGX Cloud, GPU-focused IAAS companies), they are still strong and close partners. Having Jensen join Adam on stage would highlight that it is the early days of applied AI, and these two companies will more often be working together than competing.

Sam Altman, OpenAI CEO, appears on stage. I think this is highly unlikely. While the contractual restrictions between OpenAI and Microsoft are unclear, my sense is that it would be a “bridge too far” on the interpersonal relationships between those two companies for Sam to appear and announce an OpenAI on AWS service at re:Invent. Notably, Microsoft CEO Satya Nadella appeared with Sam at the OpenAI summit last week, making it even less likely that Sam will appear on the AWS stage soon. If it somehow happens, that is further evidence of how rapidly and dynamically the world of applied AI is progressing. (Hours after posting, OpenAI announced a leadership transition.)

Mark Zuckerberg or a senior Meta executive appears on stage. This is a longer shot since it is not really Meta’s style. But having an executive join Adam or possibly Amazon CTO Werner Vogels on stage to talk about open-source models, Meta’s Llama 2, and related offerings would be brilliant.

Concluding Thoughts

No matter who ends up on stage, the modern data stack, the models you can train with that data, and applied AI services moving from prototype to production will highlight AWS re:Invent. There will be plenty of customer case studies and an emphasis on an open-model ecosystem and real-world ROI. The world of generative and applied AI is just getting started, and AWS is where the majority of cloud data is today. And, for most customers and use cases, the cloud will be the most effective place to train and run models (until we get to the edge use cases – but that’s next year). Amazon and AWS need to demonstrate that their overarching strategy of price, convenience, and selection can be at the core of this transformative market opportunity. One might even say they are trying to be the “bedrock” of the applied AI era!