The Zipf distribution and how to use it to find fake data
<p>In a course about the numerical approximation of probability measures, I recently learned about the Zipf distribution, which mathematically describes the observation that the frequency of the nth most common entry in a list is approximately inversely proportional to n.</p>
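<p>For example: the frequency of the nth most common word (in more or less every language) is approximately inversely proportional to n. Under that relation, the second most common word appears about half as often as the most common one, and the third about a third as often.</p>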
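<p>The relation can be checked on any text by counting words and comparing each count with the most common word's count divided by the rank. The following is only a minimal sketch: the helper names <code>rank_frequency</code> and <code>zipf_prediction</code>, the exponent parameter <code>s</code>, and the toy sentence are assumptions made for illustration, not part of the original analysis, and a text this short will not follow the law closely.</p>

```python
from collections import Counter
import re


def rank_frequency(text, top=10):
    """Return (rank, word, count) tuples for the `top` most common words."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    most_common = Counter(words).most_common(top)
    return [(rank, word, count)
            for rank, (word, count) in enumerate(most_common, start=1)]


def zipf_prediction(top_count, rank, s=1.0):
    """Idealized Zipf count for a given rank (hypothetical helper).

    With s = 1 this is exactly "inversely proportional to the rank".
    """
    return top_count / rank ** s


# Toy input, made up for this sketch; replace it with the text of a real book.
sample = "the cat and the dog and the bird saw the cat"
table = rank_frequency(sample, top=3)

for rank, word, count in table:
    predicted = zipf_prediction(table[0][2], rank)
    print(rank, word, count, round(predicted, 2))
```

<p>On a large corpus, the observed counts and the predicted counts in the last two columns should stay roughly close to each other for the top ranks.</p>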
<p>The following graphic shows the 30 most frequent words in three combined books by Charles Dickens, plotted against their frequency.</p>
<p><a href="https://medium.com/@bernhard.eisvogel/the-zipf-distribution-and-how-to-use-it-to-find-fake-data-c1b9ea81543f"><strong>Read More</strong></a></p>