How to Chat With Any PDFs and Image Files Using Large Language Models — With Code
<h1>Introduction</h1>
<p>So much valuable information is trapped in PDF and image files. Luckily, our brains are capable of processing those files to find specific information, which is great.</p>
<blockquote>
<p>But how many of us, deep down, wouldn’t like to have a tool that can answer any question about a given document?</p>
</blockquote>
<p>That is the whole purpose of this article. I will explain, step by step, how to build a system that can chat with any PDF or image file.</p>
<blockquote>
<p>If you prefer to watch a video instead, check the link below:</p>
</blockquote>
<p><iframe frameborder="0" height="480" scrolling="no" src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FAHbM2wCyd8s%3Ffeature%3Doembed&display_name=YouTube&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DAHbM2wCyd8s&image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FAHbM2wCyd8s%2Fhqdefault.jpg&key=a19fcc184b9711e1b4764040d3dc5c07&type=text%2Fhtml&schema=youtube" title="Large Language Models: How to Chat With Any PDFs And Image Files - Source Code Available!" width="854"></iframe></p>
<h2>General Workflow of the Project</h2>
<p>It’s always good to have a clear understanding of the main components of the system being built. So let’s get started.</p>
<p><img alt="" src="https://miro.medium.com/v2/resize:fit:700/1*jMAGouB3s_LA1YoslX5Z_A.png" style="height:838px; width:700px" /></p>
<p>End-to-end workflow of the overall chat system (Image by Author)</p>
<ul>
<li>First, the user submits the document to be processed, which can be in PDF or image format.</li>
<li>A second module detects the format of the file so that the relevant content extraction function is applied.</li>
<li>The content of the document is then split into multiple chunks using the <code>Data Splitter</code> module.</li>
<li>Those chunks are finally transformed into embeddings using the <code>Chunk Transformer</code> before they are stored in the vector store.</li>
</ul>
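<p>The four steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the implementation used in the article: <code>detect_format</code>, <code>split_text</code>, <code>embed_chunk</code>, <code>InMemoryVectorStore</code> and <code>index_document</code> are hypothetical names, the extraction functions are assumed to be supplied by the caller, and the hashed bag-of-words vector is a toy stand-in for a real embedding model.</p>

```python
import hashlib
import math
from pathlib import Path

# Assumed set of supported image types; adjust to your own needs.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def detect_format(filename: str) -> str:
    """Return "pdf" or "image" based on the file extension (an assumption)."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    raise ValueError(f"Unsupported file type: {suffix!r}")


def split_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Stand-in for the Data Splitter: fixed-size character chunks with overlap."""
    if overlap not in range(chunk_size):
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed_chunk(chunk: str, dim: int = 64) -> list[float]:
    """Stand-in for the Chunk Transformer: a toy hashed bag-of-words vector."""
    vector = [0.0] * dim
    for word in chunk.lower().split():
        # Map each word to one of `dim` buckets with a stable hash.
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vector[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


class InMemoryVectorStore:
    """Toy vector store that keeps (chunk, embedding) pairs in a list."""

    def __init__(self):
        self.items = []

    def add(self, chunk: str, embedding: list[float]) -> None:
        self.items.append((chunk, embedding))

    def most_similar(self, query: str) -> str:
        """Return the stored chunk with the highest cosine similarity to the query."""
        query_vec = embed_chunk(query)

        def score(item):
            # Vectors are unit-length, so the dot product is the cosine similarity.
            return sum(a * b for a, b in zip(query_vec, item[1]))

        return max(self.items, key=score)[0]


def index_document(filename: str, extractors: dict, store: InMemoryVectorStore) -> str:
    """Detect the format, extract text, split it, embed it, and store it."""
    kind = detect_format(filename)
    text = extractors[kind](filename)  # caller-supplied extraction function
    for chunk in split_text(text):
        store.add(chunk, embed_chunk(chunk))
    return kind
```

<p>In a real system, the <code>extractors</code> dictionary would map <code>"pdf"</code> to a PDF text extractor and <code>"image"</code> to an OCR function, and <code>embed_chunk</code> would call an actual embedding model. The overall shape of the pipeline stays the same.</p>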
<p><a href="https://towardsdatascience.com/how-to-chat-with-any-file-from-pdfs-to-images-using-large-language-models-with-code-4bcfd7e440bc"><strong>Read More</strong></a></p>