Deduplicating software bugs with Machine Learning at Google

<p>I&rsquo;ve been working at Google for about three and a half years. Back in 2021, we were facing a little problem. A team we were working with had too many duplicate bugs.</p> <p>How did they get into this situation?</p> <p>Well, Google runs automated tests. Thousands of them. Millions of them. Some run automatically every time there are code changes in a developer workspace, some when engineers send a code review, some after they check in, some as code reaches a deployment stage, and some on a cadence, on demand, or when triggered by an event. Because we run so many tests, we don&rsquo;t always have the human cycles to analyze the results and manually log bugs, so we&rsquo;ve created a number of&nbsp;<strong>bug auto-filers.</strong>&nbsp;An auto-filer analyzes the results of a test run and logs a bug automatically. Some of our test execution frameworks offer auto-filers as a feature.</p> <p>Bug auto-filers are great in that they reduce a lot of toil and ensure failures are not ignored. But eventually a human does need to be involved, and this is where the process can bottleneck. Somebody needs to look at the bugs and decide whether they are a real problem or not. In other words, we created bots to save human toil, and they generated human toil on the back end!</p> <p>Sometimes, a&nbsp;<strong><em>single</em></strong>&nbsp;root cause can trigger&nbsp;<strong><em>many</em></strong>&nbsp;tests to fail. If the auto-filer doesn&rsquo;t know that the failures share the same underlying cause, each test failure ends up as its own filed bug, and the result is a pile of&nbsp;<em>duplicate bugs</em>.</p> <p><a href="https://carloarg02.medium.com/deduplicating-software-bugs-with-machine-learning-at-google-857b5d3036ef"><strong>Visit Now</strong></a></p>
Tags: Software Bugs