Creating a (mostly) Autonomous HR Assistant with ChatGPT and LangChain’s Agents and Tools
<p>OpenAI recently released a paper comparing two training methods aimed at improving the reliability of Large Language Models (LLMs): training by ‘process supervision’ and training by ‘outcome supervision’.</p>
<p><iframe frameborder="0" height="440" scrolling="no" src="https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&key=a19fcc184b9711e1b4764040d3dc5c07&schema=twitter&url=https%3A//twitter.com/OpenAI/status/1663957407184347136%3Fs%3D20&image=https%3A//i.embed.ly/1/image%3Furl%3Dhttps%253A%252F%252Fabs.twimg.com%252Ferrors%252Flogo46x38.png%26key%3Da19fcc184b9711e1b4764040d3dc5c07" title="OpenAI on Twitter: "We trained an AI using process supervision - rewarding the thought process rather than the outcome - to achieve new state-of-art in mathematical reasoning. Encouraging sign for alignment of advanced AIs: ...https://t.co/ryaODghohn / Twitter"" width="680"></iframe></p>
<p>Essentially, one model is rewarded for the correct ‘thought process’ or intermediate reasoning steps generated to arrive at an answer; the other model is rewarded for the correct ‘outcome’ or the final result.</p>
<p>The two models are tested on a dataset of math problems. The findings show that the model trained with process supervision, the one rewarded for a correct ‘thought process’, <em>significantly outperforms</em> the model rewarded for the correct ‘outcome’. The paper includes a graph of their relative performance.</p>
<p>Another paper, from Google, demonstrates that an LLM performs better at reasoning and decision-making tasks when used with <em>chain-of-thought</em> prompting. That is, it works more effectively if it explains its thought process ‘out loud’ on the way to an answer, as opposed to outputting the answer immediately.</p>
<p>The two papers show that by utilizing <em>chain-of-thought</em> processes during their decision-making, the models not only enhance their problem-solving abilities, but also reduce the instances of <em>hallucinations</em>, or the generation of incorrect or nonsensical information.</p>
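<p>To make the idea concrete, here is a minimal sketch of what chain-of-thought prompting looks like in practice: the same question is phrased once as a direct request for the answer, and once with an instruction to reason step by step. The helper names are illustrative, not from the article; the “let’s think step by step” phrasing is the well-known zero-shot chain-of-thought trigger.</p>

```python
def direct_prompt(question: str) -> str:
    """Ask the model for the final answer only (outcome-style prompting)."""
    return f"{question}\nAnswer with only the final result."


def chain_of_thought_prompt(question: str) -> str:
    """Ask the model to spell out its intermediate reasoning before answering."""
    return (
        f"{question}\n"
        "Let's think step by step, explaining each intermediate "
        "calculation before stating the final answer."
    )


if __name__ == "__main__":
    q = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
    # Either string can be sent to ChatGPT (or any LLM); only the
    # chain-of-thought variant elicits the explicit reasoning trace.
    print(direct_prompt(q))
    print(chain_of_thought_prompt(q))
```

<p>Both prompts ask the same question; the difference is purely in the instruction, which is what the two papers above measure the effect of.</p>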
<p><a href="https://pub.towardsai.net/creating-a-mostly-autonomous-hr-assistant-with-chatgpt-and-langchains-agents-and-tools-1cdda0aa70ef">Website</a></p>