A Little Bit of Rust
<p>A while ago, I wrote an <a href="https://medium.com/itnext/a-little-bit-of-code-c-20-ranges-c6a6f7eae401" rel="noopener">article</a> about an interesting problem with an even more interesting story. Back in the 80s, Donald Knuth was asked to write a program that solves this problem:</p>
<blockquote>
<p>Read a file of text, determine the n most frequently used words, and print out a sorted list of those words along with their frequencies.</p>
</blockquote>
<p>Knuth came up with a 10-page-long Pascal solution. In response, <a href="https://en.wikipedia.org/wiki/Douglas_McIlroy" rel="noopener ugc nofollow" target="_blank">Doug McIlroy</a>, the inventor of UNIX pipes, wrote a UNIX shell script that solves the same problem in six lines:</p>
<pre>
tr -cs A-Za-z '\n' |
tr A-Z a-z |
sort |
uniq -c |
sort -rn |
sed ${1}q</pre>
<p>Last time, I tried solving this problem in C++, using the range-v3 library to keep the code as short as possible. This time, I’ll try to do the same in Rust.</p>
<p><em>Disclaimer:</em></p>
<ul>
<li><em>Pardon the smelly code; I’m new to Rust. I’d appreciate comments and suggestions on making the code more Rust-idiomatic.</em></li>
<li><em>The article leans on C++ terms and parallels to describe the Rust code and concepts, but I tried to keep it understandable to a reader experienced in any general-purpose programming language.</em></li>
</ul>
<h1>The Solution</h1>
<p>Before getting to the code, let’s look at the problem and outline a solution. It is a straightforward problem with little room for clever tricks: read the text from the file, split it into words (stripping punctuation and converting each word to lowercase), count the occurrences of each word using a hash table, sort the resulting <code>(word, frequency)</code> tuples by frequency in descending order, and print the first <em>n</em> of those.</p>
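<p>The steps above can be sketched in Rust roughly as follows. This is a minimal illustration using only the standard library, not a polished solution. The helper name <code>top_words</code> is hypothetical, and two details are assumptions: words are taken to be runs of ASCII letters (mirroring the <code>tr -cs A-Za-z</code> step of the shell script), and ties in frequency are broken alphabetically so that the output is deterministic.</p>

```rust
use std::collections::HashMap;

/// Returns the `n` most frequent words in `text`, most frequent first.
/// `top_words` is a hypothetical helper name chosen for this sketch.
fn top_words(text: &str, n: usize) -> Vec<(String, usize)> {
    let mut counts: HashMap<String, usize> = HashMap::new();

    // Assumption: a word is a run of ASCII letters. Split on everything else,
    // skip the empty pieces between adjacent separators, and lowercase each word.
    for word in text
        .split(|c: char| !c.is_ascii_alphabetic())
        .filter(|w| !w.is_empty())
    {
        *counts.entry(word.to_ascii_lowercase()).or_insert(0) += 1;
    }

    // Sort by descending frequency. Ties are broken alphabetically (an
    // assumption of this sketch) because HashMap iteration order is unspecified.
    let mut pairs: Vec<(String, usize)> = counts.into_iter().collect();
    pairs.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));

    // Keep only the first `n` entries.
    pairs.truncate(n);
    pairs
}

fn main() {
    // A real program would read the text from a file,
    // for example with std::fs::read_to_string.
    let text = "The quick fox. The lazy dog! the end";
    for (word, count) in top_words(text, 2) {
        println!("{} {}", count, word);
    }
}
```

<p>For the sample string, this prints <code>3 the</code> followed by <code>1 dog</code>. Keeping the counting and sorting in a function that takes a <code>&amp;str</code> makes the logic easy to test separately from file handling.</p>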