Using LLaMA 2.0, FAISS and LangChain for Question-Answering on Your Own Data
<p>Over the past few weeks, I have been playing around with several large language models (LLMs) and exploring their potential using all sorts of methods available online, and now it’s time for me to share what I have learned so far!</p>
<p>I was super excited to learn that Meta had released the next generation of its open-source large language model, LLaMA 2, on 18th July 2023. The most interesting part of the release was that they made it available<a href="http://ai.meta.com/llama" rel="noopener ugc nofollow" target="_blank"> free of charge for commercial use</a>. Therefore, I decided to try it out and see how it performs.</p>
<p>In this article, I’m going to share how I performed Question-Answering (QA), chatbot-style, using the Llama-2-7b-chat model with the LangChain framework and the FAISS library over documents I fetched from the Databricks documentation website.</p>
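<p>To make that flow concrete, here is a minimal sketch of such a pipeline against the classic LangChain API. Treat it as an outline under stated assumptions: the documentation URL, the sentence-transformers embedding model, the chunk sizes, and the sample question are illustrative placeholders rather than my exact settings, and downloading the Llama 2 weights requires accepting Meta’s license on Hugging Face.</p>
<pre>
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import HuggingFacePipeline
from langchain.chains import RetrievalQA
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# 1. Fetch pages from the Databricks docs (placeholder URL).
loader = WebBaseLoader("https://docs.databricks.com/en/index.html")
documents = loader.load()

# 2. Split pages into overlapping chunks that fit the model's context window.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# 3. Embed each chunk and index the vectors with FAISS.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embeddings)

# 4. Load Llama-2-7b-chat through a Hugging Face text-generation pipeline.
model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated: accept Meta's license first
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
generate = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=512)
llm = HuggingFacePipeline(pipeline=generate)

# 5. Tie retrieval and generation together: FAISS finds the relevant chunks,
#    and the chain stuffs them into the prompt before asking the LLM.
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())
print(qa.run("What is Delta Lake in Databricks?"))
</pre>
<p>The important design point is that the model never sees the whole corpus: FAISS performs similarity search over the embedded chunks, and only the top matches are passed to Llama 2 as context for the answer.</p>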
<h1>Introduction</h1>
<p>The <strong>LLaMA 2</strong> models are pretrained on 2 trillion tokens and fine-tuned for dialogue, with 7 to 70 billion parameters, making this one of the most powerful open-source model families available. It comes in three sizes (7B, 13B and 70B) with significant improvements over the LLaMA 1 models, including being trained on 40% more tokens, having a much longer context length (4k tokens), and using grouped-query attention for fast inference of the 70B model. It outperforms other open-source LLMs on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests.</p>
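<p>As a quick sanity check, the 4k context window can be read straight from the model configuration with the transformers library (assuming you have been granted access to the gated checkpoint on Hugging Face):</p>
<pre>
from transformers import AutoConfig

# Llama 2 reports its 4,096-token context window in the config.
config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
print(config.max_position_embeddings)  # 4096
</pre>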