Unnatural Keys
<p>At the time of writing, I am working in the music industry. As part of that work, we want a database of all of the songs in the world so that we can properly identify unknown songs and provide attribution, which in turn lets folks get paid appropriately. It is a noble goal with some interesting engineering challenges.</p>
<p>There are also some… <em>less</em> interesting engineering challenges.</p>
<p>One is a bit self-inflicted. The first instinct for every DB person when faced with the “database of all the songs in the world” problem is to go with a <a href="https://en.wikipedia.org/wiki/Natural_key" rel="noopener ugc nofollow" target="_blank">natural key</a>. They think: “There’s a bunch of IDs we have to store anyway that the business cares about. That’s the definition of a natural key! Let’s just use them.” After all, there are a lot of songs in the world: slightly more than 100 million, depending on who you ask and what they consider to be a song. Adding our own surrogate key means a few hundred megabytes of overhead for the key column alone, excluding indexes on the <em>other</em> IDs that the business cares about.</p>
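<p>To put a rough number on that overhead, here is a back-of-the-envelope sketch. It assumes exactly 100 million rows and a fixed-width integer surrogate key of 4 or 8 bytes. Both widths are assumptions for illustration, as is the helper name; the figure covers only the raw key column and ignores per-row and index overhead, which vary by database.</p>

```python
# Back-of-the-envelope estimate of surrogate key storage.
# Assumptions (not from any specific database): fixed-width integer keys,
# no per-row or index overhead, and a round 100 million songs.

SONG_COUNT = 100_000_000  # "slightly more than 100 million" in the text


def surrogate_key_overhead_mb(rows: int, key_bytes: int) -> float:
    """Raw size of one fixed-width key column, in megabytes (10**6 bytes)."""
    return rows * key_bytes / 1_000_000


# A 4-byte integer can still address 100 million rows; 8 bytes leaves headroom.
overhead_int_mb = surrogate_key_overhead_mb(SONG_COUNT, 4)
overhead_bigint_mb = surrogate_key_overhead_mb(SONG_COUNT, 8)

print(f"4-byte key: {overhead_int_mb:.0f} MB")
print(f"8-byte key: {overhead_bigint_mb:.0f} MB")
```

<p>Either width lands in the hundreds of megabytes, which is the cost the natural key argument is trying to avoid.</p>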
<p>There are even industry standards that <em>should</em> take care of this for us. ISRC is literally the ISO standard (ISO 3901) for “uniquely identifying sound recordings”. And if you’ve worked in software for any length of time, you know that it does not.</p>