PHILADELPHIA — New research by a team of University of Pennsylvania psychologists is helping to overturn the dominant theory of how children learn their first words, suggesting that it occurs more in moments of insight than gradually through repeated exposure.
The research was conducted by postdoctoral fellow Tamara Nicol Medina and professors John Trueswell and Lila Gleitman, all of the Department of Psychology in Penn's School of Arts and Sciences and the University's Institute for Research in Cognitive Science, and Jesse Snedeker, a professor at Harvard University.
Their work was published in the journal Proceedings of the National Academy of Sciences last week.
The current, long-standing theory suggests that children learn their first words through a series of associations; they associate words they hear with multiple possible referents in their immediate environment. Over time, children can track both the words and elements of the environments they correspond to, eventually narrowing down what common element the word must be referring to.
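As a rough illustration of the associative account described above, the tallying it implies can be sketched in a few lines of Python. This is a minimal sketch, not a model used by the researchers: the function name `associative_learner`, the nonsense word "dax" and the toy scenes are all hypothetical, and real scenes would offer vastly more candidate referents than the handful listed here.

```python
from collections import Counter, defaultdict

def associative_learner(exposures):
    """Tally every (word, referent) co-occurrence and pick the most frequent.

    `exposures` is a list of (word, referents) pairs, one per scene.
    Hypothetical illustration of brute-force cross-situational tracking.
    """
    counts = defaultdict(Counter)
    for word, referents in exposures:
        # Count each candidate referent once per scene, in a stable order.
        for referent in dict.fromkeys(referents):
            counts[word][referent] += 1
    # The learner's guess is whichever referent co-occurred most often.
    return {word: tally.most_common(1)[0][0] for word, tally in counts.items()}

# Toy data (assumed): three scenes in which the word "dax" is heard.
exposures = [
    ("dax", ["dog", "ball", "cup"]),
    ("dax", ["dog", "shoe"]),
    ("dax", ["tree", "dog"]),
]
learned = associative_learner(exposures)
print(learned)  # {'dax': 'dog'}
```

Even in this toy version the learner must store a count for every word-referent pair it has ever encountered, which gives a sense of why the bookkeeping grows quickly as scenes become more realistic.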
"This sounds very plausible until you see what the real world is like," Gleitman said. "It turns out it's probably impossible."
"The theory is appealing as a simple, brute force approach," Medina said. "I've even seen it make its way into parenting books describing how kids learn their first words."
Experiments supporting the associative word learning theory generally involve series of pictures of objects, shown in pairs or small groups against a neutral background. The real world, in contrast, has an infinite number of possible referents that can change in type or appearance from instance to instance and may not even be present each time the word is spoken.
A small set of psychologists and linguists, including members of the Penn team, have long argued that the sheer number of statistical comparisons necessary to learn words this way is simply beyond the capabilities of human memory. Even computational models designed to compute such statistics must implement shortcuts and do not guarantee optimal learning.
"This doesn't mean that we are bad at tracking statistical information in other realms, only that we do this kind of tracking in situations where there are a limited number of elements that we are associating with each other," Trueswell said. "The moment we have to map the words we hear onto the essentially infinite ways we conceive of things in the world, brute-force statistical tracking becomes infeasible. The probability distribution is just too large."
To demonstrate this, the Penn team conducted three related experiments, all involving short video segments of parents interacting with their children. Subjects, both adults and preschool-aged children, watched these videos with the sound muted except for when the parent said a particular word, which subjects were asked to guess; the target word was replaced with a beep in the first experiment and a nonsense placeholder word in the second and third.
The first experiment was designed to determine how informative the vignettes were in terms of connecting the target word to its meaning. If more than half of the subjects could correctly guess the target word, the vignette was deemed High Informative, or HI. If fewer than a third could, the vignette was deemed Low Informative, or LI. The latter vastly outnumbered the former; of the 288 vignettes, 7 percent were HI and 90 percent were LI, demonstrating that even for highly frequent words, determining the meaning of a word simply from its visual context was quite difficult.
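The classification rule from the first experiment can be written out as a short Python sketch. The function name `classify_vignette` and the "unclassified" label for vignettes that fall between the two cutoffs are assumptions made for illustration; the thresholds themselves (more than half, fewer than a third) are the ones described above.

```python
from fractions import Fraction

def classify_vignette(correct_guesses: int, total_subjects: int) -> str:
    """Label a vignette by the share of subjects who guessed the target word."""
    if total_subjects <= 0:
        raise ValueError("total_subjects must be positive")
    # Exact fractions avoid rounding trouble right at the cutoffs.
    share = Fraction(correct_guesses, total_subjects)
    if share > Fraction(1, 2):
        return "HI"  # High Informative: more than half guessed correctly
    if share < Fraction(1, 3):
        return "LI"  # Low Informative: fewer than a third guessed correctly
    return "unclassified"  # assumed label for the band in between
```

For example, a vignette that 6 of 10 subjects guessed correctly would be labeled HI, while one that 3 of 10 guessed correctly would be labeled LI.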
The second experiment involved showing subjects a series of vignettes with multiple target words, all consistently replaced with nonsense placeholders. The researchers carefully ordered the mixture of HI and LI examples to explore the consequences of encountering a highly informative learning instance early or late.
"In past studies of this kind, researchers used artificial stimuli with a small number of meaning options for each word; they also just looked at the final outcome of the experiment: whether you end up knowing the word or not," Trueswell said. "What we did here was to look at the trajectory of word learning throughout the experiment, using natural contexts that contain essentially an infinite number of meaning options."
By asking the subjects to guess the target word after each vignette, the researchers could get a sense of whether their understanding was cumulative or occurred in a "eureka" moment.
The evidence pointed strongly to the latter. Repeated exposure to the target word did not lead to improved accuracy over time, suggesting that previously formed associations were not coming into play.
Moreover, only when subjects saw an HI vignette first did the accuracy of their final guesses improve; early HI vignettes provided subjects with the best opportunity to learn the correct word, and most guessed correctly when presented with them. Confirming evidence helped "lock in" the correct meaning for the subjects who started on the right track.
"It's as though you know when there is good evidence, you make something like an insightful conjecture," Gleitman said.
However, when subjects saw an LI vignette first they tended to guess incorrectly and, although they revised these guesses throughout the experiment, they were ultimately unable to arrive at the correct meaning. This showed that these subjects had no memory of plausible alternative meanings, including the correct one, from earlier vignettes that they could return to.
The third experiment showed that this inability to hold incorrect meanings in mind is likely essential to how word acquisition works. After a delay of a couple of days, subjects saw vignettes on the same target word they had missed before but showed no evidence of retaining their incorrect assumptions.
"All of those memories go away," Gleitman said. "And that's great! It's the failure of memory that's rescuing you from remaining wrong for the rest of your life."
Future work by members of the Penn team will investigate what makes certain interactions more or less informative when it comes to word meaning, as well as the order in which people process visual information in their environment. Both avenues of research could help rewrite textbooks and parenting guides, suggesting that rich interactions with children — and patience — are more important than abstract picture books and drilling.
The research was supported by the National Institutes of Health.