Building and Deploying Real-World RAG Applications with Ram Sriharsha

EPISODE 669

About this Episode

Today we’re joined by Ram Sriharsha, VP of engineering at Pinecone. In our conversation, we dive into vector databases and retrieval-augmented generation (RAG). We explore the trade-offs between relying solely on LLMs for retrieval tasks and combining vector database retrieval with LLMs, the advantages and complexities of RAG with vector databases, the key considerations for building and deploying real-world RAG-based applications, and Pinecone's new serverless offering, which we look at in depth. Currently in public preview, Pinecone Serverless is a vector database that enables on-demand data loading, flexible scaling, and cost-effective query processing. Ram discusses how the serverless paradigm affects the vector database’s core architecture, key features, and other considerations. Lastly, Ram shares his perspective on the future of vector databases in helping enterprises deliver RAG systems.

Thanks to our sponsor Pinecone

I’d like to send a big thanks to Pinecone for their support of the podcast and their sponsorship of today’s show. We’ve talked a lot recently about retrieval-augmented generation, or RAG, and the role of vector databases. Pinecone is a key emerging player in the space, offering a trusted vector database for ambitious AI applications. In the show, you'll learn more about their new product, Pinecone Serverless. Key innovations of Pinecone Serverless include:

  • Up to 50 times lower costs
  • Incremental indexing for consistently fresh results
  • Fast search without sacrificing recall 
  • Powerful performance with a multi-tenant compute layer, and
  • Zero configuration or ongoing management

To learn more, head over to pinecone.io.
