Document-Oriented Agents: A Journey with Vector Databases, LLMs, Langchain, FastAPI, and Docker

<h1>Introduction</h1> <p>Document-oriented agents are starting to gain traction in the business landscape. Companies increasingly use these tools to capitalize on their internal documentation and improve their business processes. A recent McKinsey report [1] underscores this trend, suggesting generative AI could boost the global economy by $2.6&ndash;4.4 trillion annually and automate up to 70% of current work activities. The study identifies customer service, sales and marketing, and software development as the sectors most affected by this transformation. Much of the change comes from the fact that the information powering these areas within a company can be made more accessible to both employees and customers through solutions such as document-oriented agents.</p> <p>Current technology still poses some challenges. Even the newer Large Language Models (LLMs) with 100k-token limits have bounded context windows. While 100k tokens sounds like a lot, it is tiny compared with the size of the databases powering, for example, a customer service department. Another recurring problem is inaccurate model output. In this article, we&rsquo;ll provide a step-by-step guide to building a document-oriented agent that can handle documents of any size and deliver verifiable answers.</p> <p>We use a vector database, ChromaDB, to work around the model&rsquo;s limited context length: only the passages most relevant to a question are retrieved and passed to the model. Langchain handles the integration between the different components in our architecture. As our LLM, we use OpenAI&rsquo;s ChatGPT. Since we want to serve our application, we use FastAPI to create endpoints for users to interact with our agent. 
Finally, our application is containerized using Docker, which allows us to easily deploy it in any type of environment.</p> <p><a href="https://towardsdatascience.com/document-oriented-agents-a-journey-with-vector-databases-llms-langchain-fastapi-and-docker-be0efcd229f4"><strong>Learn More</strong></a></p>
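<p>To make the retrieval idea concrete, here is a minimal sketch of how a document store can stand in for a longer context window: split documents into chunks, score each chunk against the question, and pass only the best matches to the model as numbered sources. This is an illustration under stated assumptions, not the article&rsquo;s implementation. The helper names (<code>chunk_text</code>, <code>embed</code>, <code>retrieve</code>, <code>build_prompt</code>) are hypothetical, the bag-of-words vector is a toy stand-in for a real embedding model and for ChromaDB, and no LLM is called.</p>

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: word counts. A real system would use an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question (the vector store's job)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Assemble a prompt containing only the retrieved chunks, numbered for citation."""
    sources = retrieve(question, chunks, k)
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(sources))
    return (
        "Answer using only the numbered sources below.\n"
        f"{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    docs = [
        "Refunds are issued within 14 days of purchase.",
        "Our office is open Monday to Friday.",
        "Shipping takes three to five business days.",
    ]
    print(build_prompt("How many days do refunds take?", docs, k=1))
```

<p>Because the prompt carries only the top-ranked chunks, its size stays fixed no matter how large the document collection grows, and numbering the sources gives the model something to cite, which is one way to make answers verifiable.</p>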