ReLoRa: Pre-train a Large Language Model on Your GPU

In 2021, [Hu et al.](https://arxiv.org/abs/2106.09685) proposed low-rank adapters (LoRa) for large language models (LLMs). This method significantly reduces the cost of fine-tuning by training only a small set of added parameters (the low-rank adapters) while keeping the LLM's original, full-rank parameters frozen, as sketched in the code below.

LoRa still requires an existing pre-trained model to fine-tune: because of its low-rank restriction, it cannot pre-train a good LLM from scratch. Pre-training therefore remains unaffordable for most individuals and organizations.

To reduce this cost, [Lialin et al. (2023)](https://arxiv.org/pdf/2307.05695.pdf) propose ReLoRa, a modification of LoRa that makes it possible to pre-train LLMs from scratch.

[**Read More**](https://medium.com/towards-data-science/relora-pre-train-a-large-language-model-on-your-gpu-d104756f9ddf)
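To make the low-rank adapter idea above concrete, here is a minimal sketch of a LoRa-style linear layer, assuming PyTorch. The class name `LoRaLinear`, the `rank` and `alpha` hyperparameters, and the `merge_and_reinit` helper are illustrative choices, not taken from the original papers or any particular library; the merge step reflects my reading of the ReLoRa paper, where the learned low-rank update is periodically folded into the main weights so that successive low-rank updates can accumulate into a higher-rank change.

```python
import torch
import torch.nn as nn


class LoRaLinear(nn.Module):
    """Illustrative LoRa-style layer: frozen full-rank weight + trainable low-rank update."""

    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        # Original full-rank weight, kept frozen (as in a pre-trained model).
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False
        # Trainable low-rank factors: their product lora_B @ lora_A has rank <= `rank`.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus scaled low-rank update; only lora_A and lora_B receive gradients.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

    @torch.no_grad()
    def merge_and_reinit(self) -> None:
        # ReLoRa-style restart (my reading of Lialin et al., 2023): fold the learned
        # low-rank update into the frozen weight, then re-initialize the factors so
        # training continues in a fresh low-rank subspace.
        self.base.weight += self.scaling * (self.lora_B @ self.lora_A)
        nn.init.normal_(self.lora_A, std=0.01)
        nn.init.zeros_(self.lora_B)
```

During training only `lora_A` and `lora_B` receive gradients, which is where the memory savings come from. For ReLoRa-style pre-training, `merge_and_reinit` would be called every few thousand steps; the paper also partially resets the optimizer state and uses a jagged learning-rate schedule, which this sketch omits.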