Zipf
When researchers analyze data, they sometimes find remarkable patterns. One of these patterns is Zipf's Law, named after the linguist George Kingsley Zipf (1902-1950). He found that the frequency of any word in a body of text is inversely proportional to its rank in the frequency table. Specifically, the most frequent word occurs approximately twice as often as the second most frequent word, three times as often as the third most frequent word, and so on.
For example, in the Brown Corpus of American English text, the most frequent word, 'the', accounts for nearly 7 percent of all word occurrences (69,971 out of slightly over 1 million), and the second most frequent word, 'of', accounts for slightly over 3.5 percent of all word occurrences, and so on.
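
The relationship described above can be written as count(rank) ≈ count(1) / rank. The following is a minimal Python sketch of that idea, not a definitive implementation. The function names `zipf_expected` and `rank_frequencies` are hypothetical, chosen for illustration, and the tokenization (lowercasing and splitting on whitespace) is a simplifying assumption.

```python
from collections import Counter


def zipf_expected(top_count, rank):
    """Count predicted for a rank if frequency is proportional to 1/rank."""
    return top_count / rank


def rank_frequencies(text):
    """Return (word, count) pairs sorted from most to least frequent.

    Assumption: words are separated by whitespace and case is ignored.
    """
    words = text.lower().split()
    return Counter(words).most_common()


if __name__ == "__main__":
    # Use the count quoted above for 'the' as the rank-1 count.
    top_count = 69971
    for rank in (1, 2, 3):
        print(rank, zipf_expected(top_count, rank))
```

To check a real text against the law, one could pass it to `rank_frequencies` and compare each observed count with `zipf_expected(counts[0][1], rank)`. Real texts follow the pattern only approximately.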