Emerging Large Language Model (LLM) Application Architecture
<p>Why do I say LLMs are unstructured? Because LLMs are, to a large extent, an extension of Conversational AI, and both their input and output take the form of free-flowing natural language.</p>
<p>Due to the unstructured nature of human language, the <em>input</em> to LLMs is conversational and unstructured, in the form of <a href="https://cobusgreyling.medium.com/prompt-engineering-text-generation-large-language-models-3d90c527c6d5" rel="noopener">Prompt Engineering</a>.</p>
<p>And the <em>output</em> of LLMs is likewise conversational and unstructured: a highly succinct form of natural language generation (<strong>NLG</strong>).</p>
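<p>As a minimal sketch of this loop, the snippet below sends a free-form, conversational prompt to a completion endpoint and prints the equally free-form NLG output. It assumes the legacy v0.x <code>openai</code> Python library; the model name and prompt text are illustrative, not prescriptive.</p>
<pre>
import openai

openai.api_key = "sk-..."  # your API key

# Unstructured, conversational input: plain natural language.
prompt = "Explain in two sentences why customer reviews are hard to categorise."

response = openai.Completion.create(
    model="text-davinci-003",  # illustrative model choice
    prompt=prompt,
    max_tokens=80,
)

# Unstructured, conversational output: succinct natural language generation.
print(response["choices"][0]["text"].strip())
</pre>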
<p>LLMs introduced functionality to create custom models, and the initial approach to customising an LLM was <a href="https://cobusgreyling.medium.com/how-to-fine-tune-gpt-3-for-custom-intent-classification-95973d05d7e0" rel="noopener">fine-tuning</a> it on your own data.</p>
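<p>For context, fine-tuning a base GPT-3 model meant curating prompt-completion pairs in a JSONL file and submitting it for training. The two lines below are a hypothetical example of that format, and the command uses the legacy OpenAI CLI (the file name and base model are illustrative).</p>
<pre>
{"prompt": "Classify the intent: 'I want to cancel my subscription' ->", "completion": " cancel_subscription"}
{"prompt": "Classify the intent: 'What is my account balance?' ->", "completion": " check_balance"}
</pre>
<pre>
# Legacy OpenAI CLI (openai-python v0.x); 'ada' is an illustrative base model.
openai api fine_tunes.create -t intents.jsonl -m ada
</pre>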
<p>This approach has fallen into disfavour for three reasons:</p>
<ol>
<li>LLMs have both a <a href="https://cobusgreyling.medium.com/how-to-create-a-custom-fine-tuned-prediction-model-using-base-gpt-3-models-3dfd1eb1de0e" rel="noopener">generative and predictive</a> side, and the generative power of LLMs is easier to leverage than the predictive power. If the generative side of an LLM is presented with contextual, concise and relevant data at inference time, hallucination is largely negated (see the sketch after this list).</li>
<li><a href="https://cobusgreyling.medium.com/how-to-create-a-custom-fine-tuned-prediction-model-using-base-gpt-3-models-3dfd1eb1de0e" rel="noopener">Fine-tuning LLM</a>s involves training data curation, transformation and cost. Fine-tuned models are frozen with a definite time-stamp and will still demand innovation around prompt creation and data presentation to the LLM.</li>
<li>When classifying text against pre-defined classes or intents, <a href="https://cobusgreyling.medium.com/nlu-remains-relevant-for-conversational-ai-19b5f17936c5" rel="noopener">NLU</a> still holds an advantage, with efficiencies built in for exactly this task.</li>
</ol>
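<p>The sketch below illustrates the first point: relevant context is injected into the prompt at inference time, so the generative side of the LLM answers from supplied data rather than from its frozen training data. It again assumes the legacy v0.x <code>openai</code> library; the context, question and model name are invented for illustration.</p>
<pre>
import openai

openai.api_key = "sk-..."  # your API key

# Context retrieved at inference time, e.g. from a search or document store.
# The text here is purely illustrative.
context = (
    "Acme's refund policy allows returns within 30 days of purchase, "
    "provided the item is unused and in its original packaging."
)
question = "How long do customers have to return an item?"

# Contextual, concise and relevant data is placed directly in the prompt,
# with an instruction to answer only from that data.
prompt = (
    "Answer the question using only the context below. "
    "If the answer is not in the context, reply 'I do not know.'\n\n"
    f"Context: {context}\n\n"
    f"Question: {question}\n"
    "Answer:"
)

response = openai.Completion.create(
    model="text-davinci-003",  # illustrative model choice
    prompt=prompt,
    max_tokens=60,
    temperature=0,  # low temperature keeps the answer grounded in the context
)
print(response["choices"][0]["text"].strip())
</pre>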
<p>The aim of fine-tuning LLMs is to engender more accurate and succinct reasoning and answers. Fine-tuning also addresses one of the big problems with LLMs: <a href="https://cobusgreyling.medium.com/preventing-llm-hallucination-with-contextual-prompt-engineering-an-example-from-openai-7e7d58736162" rel="noopener"><em>hallucination</em></a>, where the LLM returns highly plausible but incorrect answers.</p>
<p><a href="https://cobusgreyling.medium.com/emerging-large-language-model-llm-application-architecture-cba0e7862037"><strong>Read More</strong></a></p>