[Infographic: Top 10 words ≈ 25% of all English text. the ~5%, be ~3%, to ~3%, of ~2.5%, and ~2.5%, a ~2%, in ~2%, that ~1.5%, have ~1.5%, I ~1.5%]
Language Fun | 4 min read | March 30, 2026

The 100 Most Common Words in English Do Almost All the Heavy Lifting

The word 'the' alone accounts for about 5% of all English text. The top 10 words account for 25%. The top 100 words? Nearly 50%. We obsess over rare vocabulary while common words do all the real work.

Here's a weird experiment you can try at home. Take any paragraph of English text and cross out all the common function words: "the," "be," "to," "of," "and," "a," "in," "that," "have," "I." What remains is a skeleton of content words (nouns, verbs, adjectives) that carry the meaning but can't form a sentence. The crossed-out words, the ones you barely notice when reading, make up roughly 25% of all English text. They are the language's invisible backbone. We spend all our time learning and showing off rare vocabulary like "serendipity," "ephemeral," and "ubiquitous," while the top 100 most common words do about half the actual work of English communication.
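If you'd rather not use a pen, the same experiment can be sketched in a few lines of Python. This is only an illustration: the function name `cross_out` and the sample sentence are made up here, and counting "is," "was," "has," and so on as forms of "be" and "have" is an assumption about how to apply the word list.

```python
import re

# The ten words named above. Treating the inflected forms of "be" and
# "have" as the same word is an assumption made for this sketch.
COMMON = {
    "the", "to", "of", "and", "a", "in", "that", "i",
    "be", "is", "am", "are", "was", "were", "been", "being",
    "have", "has", "had", "having",
}

def cross_out(text):
    """Return (remaining_words, share_removed) for a piece of text."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    kept = [w for w in words if w not in COMMON]
    removed = len(words) - len(kept)
    share = removed / len(words) if words else 0.0
    return kept, share

# A made-up sample sentence; any paragraph of English will do.
sample = "The cat sat in the garden and I have to admit that it was a good day."
kept, share = cross_out(sample)
print(" ".join(kept))
print(f"{share:.0%} of the words were crossed out")
```

A single sentence can land well above or below 25%. The figure is an average over very large amounts of text, so longer passages should give a steadier number.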

The Numbers

There's been extensive analysis of large text corpora (the Oxford English Corpus contains over 2 billion words from all kinds of sources), and the pattern is remarkably consistent across different types of text. "The" is always #1, accounting for roughly 5% of all words. The verb "be" (in all its forms: is, was, are, been, being) is #2, at about 3%. Together with "to," "of," "and," "a," "in," "that," "have," and "I," the top 10 words account for about 25% of everything written in English. Expand to the top 100, and you're covering nearly 50% of all words in typical text.
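To see how the arithmetic works, here is a rough Python sketch. The dictionary holds the rounded shares quoted in this article, and `top_n_coverage` is a hypothetical helper that measures the same kind of coverage on any text you give it. It counts surface forms only, so "is" and "was" are tallied separately, unlike the corpus figure for "be."

```python
from collections import Counter
import re

# Approximate shares quoted in this article (percent of all words).
ARTICLE_SHARES = {
    "the": 5.0, "be": 3.0, "to": 3.0, "of": 2.5, "and": 2.5,
    "a": 2.0, "in": 2.0, "that": 1.5, "have": 1.5, "I": 1.5,
}

def top_n_coverage(text, n):
    """Fraction of all word tokens covered by the n most frequent words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    covered = sum(count for _, count in counts.most_common(n))
    return covered / len(words)

# The rounded shares add up to 24.5, which is the "about 25%" above.
print(sum(ARTICLE_SHARES.values()))
```

Calling `top_n_coverage(text, 10)` or `top_n_coverage(text, 100)` on a long document gives that document's own version of the 25% and 50% figures. Expect a short or specialized text to drift from the corpus averages.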

What are these words? Almost all function words: articles, prepositions, pronouns, conjunctions, auxiliary verbs. These are the grammatical scaffolding of English. They don't carry much meaning by themselves, but without them, English sentences collapse into word salad. The content words, including nouns like "time," "year," "people," "way," "day," "man," "woman," "child," and "world," show up further down the list, mixed in among the function words. The most common noun in English is typically "time," appearing somewhere around #55 on the frequency list.

Zipf's Law

This distribution follows what's called Zipf's law, named after the American linguist George Kingsley Zipf, who noticed in the 1930s that in any large body of text, the frequency of a word is roughly inversely proportional to its rank. The most common word appears about twice as often as the second most common, three times as often as the third, and so on. This is not a rule of English specifically. It has been observed across a wide range of human languages, and, surprisingly, similar patterns show up in many other systems: city populations, company sizes, income distributions, even the frequencies of notes in musical compositions. It's a deep statistical regularity that nobody has fully explained, though many have tried.
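One way to get a feel for the law is to put observed counts next to what a strict "frequency is proportional to 1/rank" rule would predict. The sketch below does that for any text. The name `zipf_comparison` is invented for this example, and the exact 1/rank form is an idealization, so real text will only follow it approximately.

```python
from collections import Counter
import re

def zipf_comparison(text, n=10):
    """Rows of (rank, word, observed_count, zipf_predicted_count)."""
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common(n)
    if not ranked:
        return []
    # Under idealized Zipf, the word at rank r appears 1/r times as
    # often as the word at rank 1.
    top_count = ranked[0][1]
    return [
        (rank, word, count, top_count / rank)
        for rank, (word, count) in enumerate(ranked, start=1)
    ]

# Tiny hand-made input, chosen so the counts fit the rule exactly.
for row in zipf_comparison("a a a a a a b b b c c"):
    print(row)
```

The shares quoted in this article show how loose the fit can be: if "the" is about 5%, a strict 1/rank rule would put the second word at 2.5%, while "be" is quoted at about 3%.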

Implications for Learning

For language learners, the Zipfian distribution has a practical implication that's both encouraging and discouraging. The encouraging part: if you learn the 1,000 most common words of a language, you can recognize roughly 80% of the words in most ordinary texts. That's a manageable goal. The discouraging part: the remaining 20%, the rare words that carry most of the specific meaning, is where most of the learning effort goes. A beginner can learn "the," "be," and "to" in a day. Learning the 20,000th most common word takes years of exposure.

And then there's the poetry of it. The most common word in English, "the," is also the most humble. It's a definite article. It has no glamour. Nobody has ever won a spelling bee by spelling "the." But without it, English would be a different language. "The" holds our sentences together, silently, five percent of the time, asking nothing in return except that we occasionally notice and appreciate the work it does.