Fine-tune Llama 2 on Your Computer with QLoRA and TRL

Llama 2 is a state-of-the-art large language model (LLM) released by Meta. In the paper presenting the model, Llama 2 demonstrates impressive capabilities on public benchmarks for various natural language generation and coding tasks. Meta also released Chat versions of Llama 2. These chat models can be used as chatbots. They mimic OpenAI’s ChatGPT capabilities and can solve many problems with the right prompts.

Both the base and chat versions of Llama 2 are currently available in different sizes: 7B, 13B, and 70B parameters. Note: A 34B parameter version is presented in the paper but has not been released yet.

The 7B and 13B models are especially interesting if you want to run Llama 2 on your computer. With recent advances in quantization, using GPTQ or QLoRA, you can fine-tune and run these models on consumer hardware. I have written about Llama 2 and GPTQ here: <a href="https://medium.com/@bnjmn_marie/fine-tune-llama-2-on-your-computer-with-qlora-and-trl-8d2e3f46fde1">Click Here</a>
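
To get a feel for why quantization matters on consumer hardware, here is a rough back-of-the-envelope sketch of the memory needed just to hold the model weights. It is only an estimate under stated assumptions: it uses the nominal parameter counts (7, 13, and 70 billion), assumes 2 bytes per parameter in 16-bit precision and 0.5 bytes per parameter in 4-bit precision, and ignores activations, the KV cache, quantization constants, and any optimizer state. The function name `weight_memory_gb` is mine, chosen for illustration.

```python
# Rough weight-only memory estimate for the released Llama 2 sizes.
# Assumptions: nominal parameter counts, 1 GB = 1e9 bytes, and no overhead
# for activations, KV cache, quantization constants, or optimizer state.

PARAMS_BILLIONS = {"7B": 7, "13B": 13, "70B": 70}
BYTES_PER_PARAM = {"16-bit": 2.0, "4-bit": 0.5}


def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory (in GB) needed to store the weights alone."""
    # (params_billions * 1e9 params) * bytes_per_param / (1e9 bytes per GB)
    return params_billions * bytes_per_param


if __name__ == "__main__":
    for name, size in PARAMS_BILLIONS.items():
        estimates = ", ".join(
            f"{precision}: {weight_memory_gb(size, nbytes):.1f} GB"
            for precision, nbytes in BYTES_PER_PARAM.items()
        )
        print(f"Llama 2 {name} -> {estimates}")
```

Under these assumptions, the 7B weights shrink from about 14 GB in 16-bit to about 3.5 GB in 4-bit, and the 13B weights from about 26 GB to about 6.5 GB, while the 70B weights still need about 35 GB even in 4-bit. Actual usage will be higher once the ignored overheads are included.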