<h1>A Beginner’s Guide to LLM Fine-Tuning</h1>
<p>The growing interest in Large Language Models (LLMs) has led to a surge in <strong>tools and wrappers designed to streamline their training process</strong>.</p>
<p>Popular options include <a href="https://github.com/lm-sys/FastChat" rel="noopener ugc nofollow" target="_blank">FastChat</a> from LMSYS (used to train <a href="https://huggingface.co/lmsys/vicuna-13b-v1.5" rel="noopener ugc nofollow" target="_blank">Vicuna</a>) and Hugging Face’s <a href="https://github.com/huggingface/transformers" rel="noopener ugc nofollow" target="_blank">transformers</a>/<a href="https://github.com/huggingface/trl" rel="noopener ugc nofollow" target="_blank">trl</a> libraries (used in <a href="https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32" rel="noopener" target="_blank">my previous article</a>). In addition, each major LLM project, such as <a href="https://github.com/nlpxucan/WizardLM/tree/main" rel="noopener ugc nofollow" target="_blank">WizardLM</a>, tends to ship its own training script, inspired by the original <a href="https://github.com/tatsu-lab/stanford_alpaca" rel="noopener ugc nofollow" target="_blank">Alpaca</a> implementation.</p>
<p>In this article, we will use <a href="https://github.com/OpenAccess-AI-Collective/axolotl" rel="noopener ugc nofollow" target="_blank"><strong>Axolotl</strong></a>, a tool created by the OpenAccess AI Collective. We will use it to fine-tune a <a href="https://github.com/OpenAccess-AI-Collective/axolotl/blob/main/examples/llama-2/qlora.yml" rel="noopener ugc nofollow" target="_blank"><strong>Code Llama 7b</strong></a> model on an evol-instruct dataset composed of 1,000 Python code samples.</p>
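<p>To give a sense of what working with Axolotl looks like, here is an abridged sketch of a QLoRA configuration for Code Llama 7b. The specific values and the dataset path below are illustrative assumptions, not the settings used in this article; the <code>qlora.yml</code> example linked above is the authoritative starting point.</p>

```yaml
# Abridged, illustrative Axolotl config for QLoRA fine-tuning.
# Values are placeholders; adapt them to your hardware and dataset.
base_model: codellama/CodeLlama-7b-hf
load_in_4bit: true        # QLoRA: load the base model in 4-bit precision
adapter: qlora

datasets:
  - path: my-evol-instruct-python.jsonl   # hypothetical dataset path
    type: alpaca                          # instruction/input/output format

sequence_len: 2048
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05

micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002
output_dir: ./qlora-out
```

<p>A run is then typically launched with <code>accelerate launch -m axolotl.cli.train config.yml</code>, and Axolotl handles quantization, adapter injection, and the training loop from the config alone.</p>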
<h1>Why Axolotl?</h1>
<p>The main appeal of Axolotl is that it provides a one-stop solution: it packs in numerous features, supports many model architectures, and has an active community behind it. Here’s a quick list of my favorite things about it:</p>