With the rise of Generative AI, Retrieval Augmented Generation (RAG) has become a popular way to put Large Language Models (LLMs) to work. It simplifies building Generative AI applications while reducing the need to fine-tune an LLM or even train one from scratch. Some of the reasons why RAG has become so popular are:
- You can reduce hallucinations, where the model tries to be “creative” and makes up false information, because answers are grounded in retrieved content.
- You can get the latest information on a topic or question without worrying about the foundation model's training cutoff date.
- You can avoid spending time, effort, and money on the complex process of fine-tuning a model or even training one on your data.
- Your architecture becomes loosely coupled: the knowledge source, the retriever, and the LLM can each be changed independently.
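
To make the second and fourth points concrete, here is a minimal sketch of the retrieve-then-augment idea. It is an illustration under stated assumptions, not a production design: the names `tokenize`, `retrieve`, and `build_prompt` are hypothetical, simple word overlap stands in for a real embedding-based search, and no LLM is actually called. The sketch only builds the prompt that would be sent to one.

```python
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag of words; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda d: sum((q & tokenize(d)).values()),  # size of the word overlap
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, documents: list[str]) -> str:
    """Augment the question with retrieved context before it goes to an LLM."""
    context = "\n".join(retrieve(question, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Hypothetical knowledge base with a single, older document.
docs = ["The 2023 pricing plan costs 10 dollars per month."]
prompt_before = build_prompt("What does the pricing plan cost?", docs)

# New information arrives: only the document store changes, not the model.
docs.append("The 2024 pricing plan costs 12 dollars per month, replacing the 2023 plan.")
prompt_after = build_prompt("What does the 2024 pricing plan cost?", docs)

print(prompt_after)
```

In this sketch, adding a document is enough for the newer fact to reach the prompt; nothing about the model is retrained or fine-tuned.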
The diagram below shows a simplified component architecture of RAG: