Googling around the internet I found a lot of sites where people had written in saying, “I am studying language XYZ, and I want to know how many words I have to know to be able to read a newspaper.”
This question is particularly relevant for people who are studying Chinese, where progress is usually measured in characters, and most students know the exact number of characters that they can read. Students who have been studying Spanish, German, or Vietnamese for a period of years, by contrast, generally wouldn’t know the exact number of words that they understand, and may not even know an approximate number.
This information is relevant for anyone studying a foreign language, including English, particularly if your goal is to study at a university overseas or to work in a professional job in the foreign language environment.
I checked a number of websites, and the answers varied substantially.
On aksville.com, someone took the time to write a long reply, explaining that major newspapers, such as USA Today, are written at a 6th to 8th grade level and require approximately 3,000 words to read.
Another site, called blogonebytes.com, offered this: “I read somewhere that to be able to carry on a good conversation in “Mandarin Chinese” one should know about 3,000 characters, and about 7,000 characters to read technical books.”
A follow-up comment by a reader on the same site said, “You will need to know a minimum of 3000 characters to be proficient. You will need to be able to speak and understand in the range of 5000-7000 characters.”
According to Omniglot, a site which I tend to have a lot of respect for, “The largest Chinese dictionaries include about 56,000 characters, but most of them are archaic, obscure or rare variant forms. Knowledge of about 3,000 characters enables you to read about 99% of the characters used in Chinese newspapers and magazines. To read Chinese literature, technical writings or Classical Chinese though, you need to be familiar with at least 6,000 characters.”
I had always heard that the range was somewhere between 1,500 and 3,000 words to read a newspaper. In the case of Chinese, I know that I can read about 3,000 characters, and yet, I absolutely cannot read a newspaper. If you hand me a newspaper, I can pick out words that I know, but I can’t actually read and understand the stories.
In Bangkok, I have several friends who are extremely conversant in Thai, and they can read a menu. But they would need an entire day and a dictionary to read a single newspaper story. And even then, they wouldn’t understand everything.
With German, after four years of studying and working as a translator and researcher in the country, I can obviously read anything. But I have no idea how many words I know. Now that I am embarking on my study of Bahasa Malay, and also making plans to go back and finish learning Vietnamese, I am becoming very curious how long it will take to get my reading level anywhere close to what it is in English or Spanish. My own experience with Chinese made me question this 3,000 word figure. Also, as a person who earns most of his living from writing for magazines, newspapers, and books, I would hate to believe that I only write with a 3,000 word vocabulary, and at a 6th to 8th grade level.
As many times as I attended 9th grade, you would think I would be writing at least at high school level.
The two facts that I wanted to verify were the average reading level of The New York Times, my hometown paper, and the average number of words per edition.
The first question was easy to answer.
The May 2, 2005 edition of “Plain Language At Work Newsletter”, published by Impact Information Plain-Language Services, explained that there are two generally accepted scales for determining the reading level of various publications. They are the Rudolf Flesch Magazine Chart (1949) and the Robert Gunning Magazine Chart (1952). Both charts analyzed aspects of a magazine or newspaper such as average sentence length in words and number of syllables per 100 words. Based on this information, they assigned a school grade reading level to the publication. According to this rating system, The Times of India was considered the most difficult newspaper in the world, with a reading level of 15th grade. The London Times scored a 12th grade reading level, as did the LA Times and the Boston Globe. The survey must have been flawed, however, because they assigned The New York Times a reading level of 10th grade, which is lower than the LA Times, when everyone knows quite well that New York is better than California or any other place which is not New York.
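To make the idea of scoring a text from sentence length and syllable counts concrete, here is a minimal Python sketch of the Flesch Reading Ease formula, one published formula that uses exactly those two measurements. Whether the magazine charts cited in the newsletter used this precise formula is an assumption on my part, and the score still has to be looked up on a chart to get a school grade. The function names and the vowel-run syllable counter are hypothetical helpers for illustration, and the syllable counter is only a rough estimate.

```python
import re

def count_syllables(word):
    """Rough estimate (an assumption): one syllable per run of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Flesch Reading Ease score: higher scores mean easier text."""
    # Split on sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    avg_sentence_length = len(words) / len(sentences)
    syllables_per_100_words = 100 * sum(count_syllables(w) for w in words) / len(words)
    return 206.835 - 1.015 * avg_sentence_length - 0.846 * syllables_per_100_words

if __name__ == "__main__":
    sample = "The cat sat. The dog ran."
    print(round(flesch_reading_ease(sample), 2))
```

Short sentences made of one-syllable words score very high (easy), while long sentences full of long words push the score down toward the college range.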
If you get most of your news from Time Magazine, you might be pleased to know that Time and TV Guide both scored a 9th grade reading level.
The survey didn’t cover newspapers written in languages other than English, but if we assume that we are shooting for an average 10th grade level, this will probably be close to what you need to read a newspaper in any language.
The next question was much harder to answer. How many words do I need to read The New York Times? I have never believed the low estimates of 3,000 or less, simply because every event that happens anywhere in the world, and any human situation, can appear in the Times as a news story and would, of course, require the appropriate vocabulary.
To answer the question, I went to the June 4, 2010 New York Times online and I chose 8 articles, taken from several different sections, because I assumed they would all require different vocabulary. The stories were: “Pelicans, Back From Brink of Extinction, Face Oil Threat”, “BP Funneling Some of Leak to the Surface”, “John Wooden, Who Built Incomparable Dynasty at U.C.L.A., Dies at 99”, “An Appraisal : Wooden as a Teacher: The First Lesson Was Shoelaces”, “Should you be able to discharge student loans into bankruptcy?”, “On the Road to Rock, Fueled by Excess” as well as other tidbits, announcements and follow up articles.
In some cases, if the articles were very long, I didn’t take them in their entirety, assuming there would be much repetition of words.
In all, I took parts of about 8 stories, comprising 51 pages of text. The stories I took didn’t even represent 10% of the total content of the June 4, 2010 online edition of The New York Times.
I pasted the words into a Word document and converted them to a single-column table, which ran over 450 pages long. Then I sorted the table alphabetically. Up to this point, it was easy, just pressing buttons. Next, I had to go through all 450 pages, all tens of thousands of words, removing duplicates. It was one of the most tedious exercises I have ever conducted in my life. It was exactly the type of obsessive-compulsive behavior that gets people locked up in mental institutions. It took 16 hours. By the 10th hour, I began hallucinating. Nearing the 12th hour, I believed I was a hummingbird of some kind.
I allowed plural forms of nouns, so I counted “car” once and “cars” once. I also included all forms of a verb, so “walk” once, “walked” once, and “walking” once. I counted proper nouns, including place names, as the names of people and countries will come up in the news and you need to know them. Also, in foreign language, particularly Asian languages, the grammatical forms and proper names may not even be recognizable if you haven’t studied and learned them.
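For anyone who would rather not spend 16 hours on this, the same counting rules can be sketched in a few lines of Python. This is only an illustration, not the procedure I actually used: the function name `count_unique_words` is hypothetical, and lowercasing every word (so that “The” and “the” count once) is an assumption. Plurals, verb forms, proper nouns, and hyphenated words such as “across-the-board” are each counted as their own word, as described above.

```python
import re

def count_unique_words(text):
    """Count distinct word forms; "car" and "cars" count separately."""
    # A word is a run of letters, optionally joined by hyphens or apostrophes.
    words = re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", text)
    # Lowercasing is an assumption so that sentence-initial capitals
    # do not create duplicate entries.
    return len({w.lower() for w in words})

if __name__ == "__main__":
    sample = "The cars passed the car. Walking, he walked; across-the-board cuts."
    print(count_unique_words(sample))
```

Pasting the full text of the articles into `count_unique_words` would replace the sorting and duplicate-removal steps, though the exact total would depend on choices like the lowercasing rule.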
When I was finished, I found that the random sampling of stories I chose contained 4,139 unique words. This was much higher than the estimates I had read on some websites, but was well in line with what I suspected. If I had the energy to complete a similar analysis of the entire edition, I would have to believe the number would increase. And if we monitored the newspaper over a period of one month, analyzing the text every day, and comparing the vocabulary against an accumulated list, I would imagine that it would grow. Most likely the difference in vocabulary from day to day would be small, but still, the necessary vocabulary would increase.
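The month-long experiment imagined here, checking each day's text against an accumulated word list, could be sketched as follows. This is a hedged illustration only: `cumulative_vocabulary` and `tokenize` are hypothetical names, the lowercasing rule is an assumption, and the three sample "editions" are made-up strings, not real newspaper text.

```python
import re

def tokenize(text):
    """Split text into lowercase words, keeping hyphenated words whole."""
    return [w.lower() for w in re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", text)]

def cumulative_vocabulary(editions):
    """Return the running total of unique words after each edition."""
    seen = set()
    totals = []
    for text in editions:
        seen.update(tokenize(text))  # add only words not seen before
        totals.append(len(seen))
    return totals

if __name__ == "__main__":
    # Made-up sample text standing in for three daily editions.
    days = ["oil spill in the gulf", "the oil leak", "gulf pelicans"]
    print(cumulative_vocabulary(days))
```

The list of totals can never decrease, and the gap between consecutive totals shows how many new words each day's edition demanded.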
When I compared the dialogues in my Chinese textbooks with the vocabulary that appeared in these New York Times articles, much of what I learned in school turned out to be useless. For example, all foreign language textbooks have chapters devoted to shopping at the market, where you have to memorize tedious lists of fruits and vegetables. In these Times articles, not a single fruit name was mentioned. None of my Vietnamese, Chinese, or Bahasa textbooks includes the names of heads of state of various countries. But obviously, these names came up in world news stories.
Below is a small sampling of words that I found in the news stories which I don’t know how to say in Chinese. I question, however, whether the average 9th grader would know some of these words. Do 9th graders know: abetted, absinthe, archeo-feminist, or bearish?
abetted, albeit, assesses, bankruptcy, biofuels, able-bodied, Amandine, assessment, batch, biography, abortions, ambivalent, assets, bawdy-sweet, black-clad, absinthe, anachronistic, asthmatic, bearish, bleak, absurd, anarchic, audience-pleasing, Bedford, blemish, accord, Appended, aura, befriended, blockade, across-the-board, Archbishop, autobiography, behind-the-back, blowout, activists, archeo-feminist, autograph-seekers, benefits, bond, Advocates, articulate, awfully, best-selling, booster, aerodynamic, assertion, babbles, bioenergy, breakthrough
Names and proper nouns are important for understanding news stories. In language textbooks you may learn the names of major countries and the capital cities, but news happens in small cities and even villages as well. To read the news you need to know the names of political parties, famous people, economic theories, financial indices, global corporations, educational institutions, associations, and international organizations such as the UN.
All of these names were taken from the same collection of stories. Do you know how to say these in Vietnamese or write them in Thai?
Cypriot, Delta, Geneva, Mediterranean, Bihar, Baltic, Democrat, Greece, Nehru, Turkish-controlled, Brooklyn, Denmark, Uttar, Metropolitan, Nasdaq, Iranian, Dow, Midwesterner, Mayor, Polytechnique, Louisiana, Durbin, Scotch, Reich, Iskenderun, pro-Greek, Dutch-Irish, Rev., Latino, Kentucky, California, Baptist, BENJAMIN, Bonaventure/Agence, Burke/Associated, Cambridge, Chicago-based, Berkeley, Pennsylvania, Bush, Cyprus, Barataria-Terrebonne, Navy, BP, Dallas-Fort, Audubon, Gandhi, Bess, Dalit, Arce
How many of the above terms were you able to translate or transliterate into the language that you study? This is the level of reading that an adult native speaker can do, and this should be your goal. If the task doesn’t seem daunting enough, remember, in this article, we were only concerned with vocabulary. But you could have a vocabulary of a million words and still not be able to understand a newspaper or a book. For real communication, you need a comprehensive approach to language, which includes culture, syntax, context, and grammar.
It’s a long stretch. I know. And it can seem impossible. But remember, every Sunday in New York City Catholic mass is said in 29 languages. For more than a century, large numbers of immigrants, my family included, have been coming to America and Canada in search of a better life. Most of them learned English with less than half of the education of the average person reading this article.
So, if your Grandma and Grandpa could learn a new language to a level of functionality, so can you.