Emily Oster

8 minute read

How Many Words Should Kids Say? And When?

A Cribsheet excerpt

Today’s newsletter is largely an excerpt from Cribsheet, all about kids and language. Given the Q&A with Katie Kinzler a couple of weeks ago, I thought it might be interesting to revisit what the data says about how quickly kids start talking.

Cribsheet excerpt: Kids and language

Communicating with each other—talking, signing, writing—is among the things that make us most human. The moment your child stops having to cry and point desperately at the refrigerator and can instead say, “Milk, please” is one in which you can start to see glimmers of a person in there. We usually remember our children’s first words (Penelope: “shoes”; Finn: “Penelope (Puh-Puh)”), and early on many of us will probably admit to counting just how many they have.

Talking is also a natural point of comparison—of your children to other children, of your children to each other, and (in my case) of your children to yourself. I was warned before I had Finn that this problem is especially acute if you have a daughter first, followed by a son.

“Boys are slower with language,” warned my more delicate friends. Some less delicate ones said, “You’ll think your son is stupid.” People whose children were born in the opposite gender order told me how brilliant they thought their daughter was.

Figuring out how your child compares to others is not, in fact, straightforward. As with physical milestones, doctors tend to focus on identifying children for early intervention. At the two-year-old doctor visit it is common to ask whether the child has at least twenty-five words they say regularly. With fewer than that, it may be appropriate to bring in some outside help to figure out what is wrong. But this is a cutoff to indicate a possible problem, not a measure of the average, and it says nothing about the range. The average child has more than twenty-five words at age two. But how many more?

Most pediatrics books have similar approaches—they warn you when to be concerned, but don’t give a sense of the full distribution.

Even with the full distribution, there is a second question: Does it matter? Is talking early a marker of anything later? Both of these questions have answers (the first a bit more satisfying than the second); we just have to go to the data.

Data on children and words: The distribution

In principle, it seems like it would be straightforward to collect data on how many words children say. Specifically, you could just count them. And it’s true that when a child is very small (when they have five or ten or twenty words), parents could probably remember most of them if asked. But this procedure can break down as children talk more and more. Let’s say your child says four hundred words, some of them used frequently and some infrequently. Will you really remember them all?

A related problem in comparisons is how to count words that are specific to your child. For example: At just over two, Finn became obsessed with a song entitled “Bumblebee Variety Show,” written by the local “Music Together” instructor, Jen. We played it on repeat every time we were in the car. He liked to sing it loudly—in the car with the music, in his crib, in the bath.

The primary lyrics in this song are “Bumblebee variety show.” Technically, then, he could say this, although he pronounced it as one word: bumblebeevarietyshow. So: When counting words, should I think of him as knowing the word variety? He certainly would not use it in a sentence, nor did he think of it as a separate word. So, probably not. But then, should I count bumblebeevarietyshow as a single word? This seems more plausible. But still, it’s not even clear he thought of this as a word as opposed to just a noise. Also, it is actually not a word.

Researchers get around both of these problems—recall and the comparison set—by using a standardized measure of vocabulary size from a consistently used survey. The commonly used one is the MacArthur-Bates Communicative Development Inventory (MB-CDI).

The MB-CDI is administered to parents. The vocabulary portion lists 680 words in various categories: animal sounds, action words (“bite,” “cry”), body parts, etc. Parents check off all the words they have heard their child say, which yields a count of the child’s vocabulary size on these words.

For kids above sixteen months, the survey uses words and sentences; for those younger than that, there is a separate form for words and gestures.
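For readers who like to see the mechanics, here is a minimal sketch of how a checklist score of this kind could be computed. It is an illustration only, not the MB-CDI's actual scoring procedure: the three-category mini-checklist and the function name vocabulary_score are made up for the example, and apart from “bite” and “cry” the words are placeholders rather than items taken from the real form.

```python
# Hypothetical mini-checklist for illustration; the real form lists 680 words.
# Only "bite" and "cry" are mentioned in the text; the rest are placeholders.
CHECKLIST = {
    "animal sounds": ["moo", "woof", "quack"],
    "action words": ["bite", "cry"],
    "body parts": ["nose", "toe"],
}


def vocabulary_score(checked_words, checklist=CHECKLIST):
    """Count how many checklist words the parent checked off.

    Words that are not on the form (for example, a child's own invention)
    are ignored, so every child is scored against the same list.
    """
    on_form = {w.lower() for words in checklist.values() for w in words}
    said = {w.strip().lower() for w in checked_words}
    return len(on_form & said)


# A parent's report, including one "word" that is not on the form.
reported = ["Moo", "cry", "nose", "bumblebeevarietyshow"]
print(vocabulary_score(reported))  # 3
```

Because the score only counts words on the shared list, a child-specific item like bumblebeevarietyshow never enters the comparison, which is exactly what makes scores comparable across children.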

This approach to vocabulary size works well for two reasons. First, by listing the words and asking about them rather than asking parents to remember, parents are less likely to forget words. I may never have recalled that my son knew the word shovel, but once it is mentioned, I may remember an incident in which he asked for one. Second, by looking at the same words for every kid, it is much easier to compare across children.

An obvious downside to this approach is that it will understate speaking ability for children who know a lot of unusual words but miss some common ones. For example, one of the words on the list is Coke; if your children do not drink soda, they may not know this word. Similarly, children in Hawaii may be less familiar with the word sled.

This problem is most acute as you get to ages where children know most of the words. It may not really be feasible to distinguish between a child who says 675 of the words and one who says 680. For children who know fewer words, these small differences balance out—one child knows sled, another knows beach.

Many people have completed this form. Much of this is in service of research. Some is in service of evaluating children for developmental delays or simply to satisfy curious parents. Regardless of the reason, the developers of this survey have a website where results can be uploaded. And from this, we can get a first answer to the question of the distribution of words. The graph is created out of their data—the horizontal axis is the age, and the vertical axis is the count of words as scored in the survey.

The lines in the graph show “quantiles”: basically, the distribution of words at each age. Take, for example, age 24 months. This data says that the typical child (the 50th percentile line, which is the median) at 24 months has about 300 words. A child at the 10th percentile, near the bottom of the distribution, has only about 50 words. On the other end, a child at the 90th percentile has close to 600 words.
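If percentiles feel abstract, a small sketch may help. The code below computes the 10th, 50th, and 90th percentiles of a list of word counts using linear interpolation between sorted values. The eleven counts are invented for illustration and are not the survey's data, and the helper name percentile is my own; the site that hosts the real data may compute its quantile lines differently.

```python
def percentile(values, p):
    """Return the p-th percentile (0 to 100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    # Fractional position of the requested percentile in the sorted list.
    rank = (len(ordered) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Made-up word counts for eleven 24-month-olds (NOT the survey's data).
counts = [40, 50, 120, 200, 260, 300, 340, 410, 480, 600, 640]

for p in (10, 50, 90):
    print(p, percentile(counts, p))
# 10 50.0
# 50 300.0
# 90 600.0
```

Each line on the graph is this same calculation repeated at every age: sort the children's scores at that age and read off the value at a given position.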

For younger children, these surveys and data focus on both words and gestures (i.e., signs). The graph shows similar data for children aged eight to eighteen months on this metric. One main takeaway from these graphs is the explosion of language after fourteen or sixteen months. Even the most advanced one-year-old has only a few words. At eight months, virtually no children have any words or gestures.

I was interested to note this, given my mother-in-law’s continual insistence that Jesse said the word fishy at six months.

The website for this data is publicly accessible and has the capacity to make all sorts of graphs. It can show you the data broken down by parental education or birth order (later-born children talk more slowly), for example, and it has similar data for other languages and for counts of words children understand in addition to words they can say. It is worth noting here that kids who are bilingual (that is, their parents or caregivers speak to them in two different languages) tend to be slower to talk, although when they do, they can speak both languages.

Perhaps the most interesting of these splits is by gender, given the general impression that boys develop more slowly. This is, indeed, borne out in the data. The following graphs separate out boys and girls, and we can see that boys have fewer words at all points in the distribution. At twenty-four months, for example, the average girl has about fifty more words than the average boy. By thirty months, the most advanced boys and girls are similar, but there are still large differences at other points in the distribution.

This data provides some useful norming, but it is important to be cautious about where it comes from. It is not (for the most part) nationally representative data. There are many more parents with college or graduate degrees in this sample than you would see in the overall population. This means these figures are likely to overstate the average among all children. Having said that, they give you something beyond a general guideline about when to be worried, and also provide reassurance that there is a significant range in this distribution at all young ages.
