A complete guide to running local LLM models

Meta just released Llama 2 [1], a large language model (LLM) that permits free research and commercial use. It's expected to spark another wave of local LLMs fine-tuned on top of it.

Ever since the first version of Llama came out in late February, the open-source community has been very active in building open, locally runnable LLMs as alternatives to ChatGPT.

I have been closely following this trend: I have run and tested dozens of local LLMs that can run on consumer hardware, because I genuinely look forward to a future where everyone can have a personal LLM assistant that isn't constrained by a central provider. For businesses building products on LLMs, these models can also be a more private, secure, and customizable choice.

In this post, I will share what I've learned about these local LLMs, including **what your best choices are, how to run them, and how to pick the right setup for your use case**. I will explain everything you need to know to get a model loaded and running, whether or not you have any programming experience. I will also share some **ready-to-use Google Colab WebUI notebooks for these LLMs** that you can try out yourself.

(If you're just interested in trying out the models, you can visit [this repo](https://github.com/Troyanovsky/Local-LLM-comparison), where I've included the WebUI Colab notebooks.)
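To make "loaded and running" concrete before we dive in, here is a minimal sketch of the basic pattern using the Hugging Face transformers library. This is just one common path, not the only setup covered in this guide; the model id assumes you have been granted access to the gated Llama 2 weights on Hugging Face, and the prompt and generation parameters are illustrative.

```python
# Minimal sketch: load a local LLM and generate text with Hugging Face
# transformers. Assumes `transformers`, `accelerate`, and `torch` are
# installed, and that you have access to the gated Llama 2 repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # illustrative; any causal LM id works

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit consumer GPUs
    device_map="auto",          # let accelerate place layers on GPU/CPU
)

prompt = "Explain in one sentence why local LLMs matter."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Whatever tool you end up using, whether a Colab WebUI or a local install, this load-tokenize-generate loop is, conceptually, what runs under the hood.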
[Original article](https://bootcamp.uxdesign.cc/a-complete-guide-to-running-local-llm-models-3225e4913620)

Tags: LLM, running