Creating a (mostly) Autonomous HR Assistant with ChatGPT and LangChain’s Agents and Tools
<p>OpenAI recently released a paper comparing two training methods aimed at improving the reliability of Large Language Models (LLMs): training by ‘process supervision’ and training by ‘outcome supervision’.</p>
<p><iframe frameborder="0" height="440" scrolling="no" src="https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&key=a19fcc184b9711e1b4764040d3dc5c07&schema=twitter&url=https%3A//twitter.com/OpenAI/status/1663957407184347136%3Fs%3D20&image=https%3A//i.embed.ly/1/image%3Furl%3Dhttps%253A%252F%252Fabs.twimg.com%252Ferrors%252Flogo46x38.png%26key%3Da19fcc184b9711e1b4764040d3dc5c07" title="OpenAI on Twitter: "We trained an AI using process supervision - rewarding the thought process rather than the outcome - to achieve new state-of-art in mathematical reasoning. Encouraging sign for alignment of advanced AIs: ...https://t.co/ryaODghohn / Twitter"" width="680"></iframe></p>
<p>Essentially, one model is rewarded for the correct ‘thought process’ or intermediate reasoning steps generated to arrive at an answer; the other model is rewarded for the correct ‘outcome’ or the final result.</p>
<p>The two models are tested on a dataset of math problems. The findings show that the model trained with process supervision, the one rewarded for a correct ‘thought process’, <em>significantly outperforms</em> the model rewarded for the correct ‘outcome’. The paper includes a graph of their relative performance.</p>
<p>Another paper, from Google, demonstrates that an LLM performs better at reasoning and decision-making tasks when used with <em>chain-of-thought</em> prompting. That is, it works more effectively if it explains its thought process ‘out loud’ on the way to an answer, as opposed to outputting the answer immediately.</p>
<p>The two papers show that by utilizing <em>chain-of-thought</em> processes during their decision-making, the models not only enhance their problem-solving abilities, but also reduce the instances of <em>hallucinations</em>, or the generation of incorrect or nonsensical information.</p>
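<p>To make the idea concrete, here is a minimal sketch of what chain-of-thought prompting looks like in practice: the same question is phrased once as a direct request for the answer, and once with an instruction to reason step by step. The helper names are illustrative, not from the article; the “let’s think step by step” phrasing is the well-known zero-shot chain-of-thought trigger.</p>

```python
def direct_prompt(question: str) -> str:
    """Ask the model for the final answer only (outcome-style prompting)."""
    return f"{question}\nAnswer with only the final result."


def chain_of_thought_prompt(question: str) -> str:
    """Ask the model to spell out its intermediate reasoning before answering."""
    return (
        f"{question}\n"
        "Let's think step by step, explaining each intermediate "
        "calculation before stating the final answer."
    )


if __name__ == "__main__":
    q = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
    # Either string can be sent to ChatGPT (or any LLM); only the
    # chain-of-thought variant elicits the explicit reasoning trace.
    print(direct_prompt(q))
    print(chain_of_thought_prompt(q))
```

<p>Both prompts ask the same question; the difference is purely in the instruction, which is what the two papers above measure the effect of.</p>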
<p><a href="https://pub.towardsai.net/creating-a-mostly-autonomous-hr-assistant-with-chatgpt-and-langchains-agents-and-tools-1cdda0aa70ef">Website</a></p>