Check AI-Generated Code Perfectly and Automatically

<p>Do you love AI coding tools such as <a href="https://chat.openai.com/" rel="noopener ugc nofollow" target="_blank">ChatGPT</a> and <a href="https://github.com/features/copilot" rel="noopener ugc nofollow" target="_blank">GitHub Copilot</a>? Do you hate that you can&rsquo;t trust the code they produce?</p>
<p>In some (for now, limited) cases, you <em>can</em> trust their code! Using formal verification tools such as <a href="https://github.com/model-checking/kani" rel="noopener ugc nofollow" target="_blank">Kani</a> for Rust, we can sometimes automatically prove the correctness of AI-generated code with mathematical certainty. My crate <a href="https://github.com/CarlKCarlK/range-set-blaze" rel="noopener ugc nofollow" target="_blank">RangeSetBlaze</a> now includes some such code.</p>
<p>Rust already provides mathematical certainty for <strong>memory management</strong>. In this article, you&rsquo;ll also learn how to use Kani to extend that certainty to <strong>computer arithmetic and overflows</strong>. You&rsquo;ll find Kani useful for both human- and AI-generated code.</p>
<blockquote>
<p>Aside: You may object that, in general, proving a program correct is <a href="https://en.wikipedia.org/wiki/Halting_problem" rel="noopener ugc nofollow" target="_blank">undecidable</a>. Also, in general, writing a correct specification is as hard as writing a correct program. While these statements are true in general, in many specific domains writing a specification is easy: for example, we can write an obviously correct but inefficient function as our spec. Also, in some useful cases, proving the equivalence of two functions is well within the capabilities of systems such as Kani. Read on for examples.</p>
</blockquote>
<p>To understand the promise and limitations of automatic verification of automatically generated code, I applied ChatGPT and Kani to parts of RangeSetBlaze.
RangeSetBlaze is a Rust crate for efficiently manipulating sets of &ldquo;clumpy&rdquo; integers. Specifically, I looked at three programming problems with these properties:</p>
<ul>
<li>Problem 1: A one-line problem verifiable by Kani but too hard for ChatGPT</li>
<li>Problem 2: A one-line problem solvable by ChatGPT and verifiable by Kani</li>
<li>Problem 3: A more interesting problem, too hard for both Kani and ChatGPT</li>
</ul>
<p>Excitingly, when I benchmarked the ChatGPT/Kani solution to Problem 2, I found it to be 3% faster than my original code. RangeSetBlaze now includes this computer-generated and computer-verified code.</p>
<p><a href="https://medium.com/@carlmkadie/check-ai-generated-code-perfectly-and-automatically-d5b61acff741">Visit Now</a></p>
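<p>To make the &ldquo;obviously correct function as spec&rdquo; idea concrete, here is a minimal sketch of what such an equivalence check might look like. The functions <code>len_spec</code>, <code>len_fast</code>, and <code>mismatches</code> are hypothetical names invented for this illustration and are not taken from RangeSetBlaze; only <code>#[kani::proof]</code> and <code>kani::any()</code> come from Kani. The spec avoids overflow by widening to <code>i32</code>, and the candidate is the kind of tighter one-liner one might ask an AI tool to write.</p>

```rust
/// Specification (hypothetical): length of the inclusive range a..=b,
/// computed in a wider type so no intermediate value can overflow.
fn len_spec(a: u8, b: u8) -> u16 {
    let wide = b as i32 - a as i32 + 1;
    if wide > 0 { wide as u16 } else { 0 }
}

/// Candidate (hypothetical): the same result using u8 subtraction,
/// guarded so that `b - a` can never underflow.
fn len_fast(a: u8, b: u8) -> u16 {
    if a > b { 0 } else { (b - a) as u16 + 1 }
}

/// Kani proof harness: `kani::any()` stands for every possible u8 value,
/// so the assertion is checked for all input pairs at once.
/// Compiled only when the code is built by Kani.
#[cfg(kani)]
#[kani::proof]
fn len_fast_matches_spec() {
    let a: u8 = kani::any();
    let b: u8 = kani::any();
    assert_eq!(len_fast(a, b), len_spec(a, b));
}

/// Brute-force cross-check, feasible here only because u8 is tiny.
/// Returns the number of input pairs on which the two functions disagree.
fn mismatches() -> usize {
    let mut count = 0;
    for a in 0..=u8::MAX {
        for b in 0..=u8::MAX {
            if len_fast(a, b) != len_spec(a, b) {
                count += 1;
            }
        }
    }
    count
}
```

<p>Under an ordinary Rust build the harness is compiled out and only the brute-force check is available. With Kani installed, running <code>cargo kani</code> should check the same property symbolically, which is what makes the approach attractive for wider integer types where trying every input is impractical.</p>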