News Release 7-Nov-2000

Surf like a caveman

Reports and Proceedings

New Scientist

Which of these activities occupies more of your time: foraging for food or surfing the Web? Probably the latter. We're all informavores now, hunting down and consuming data as our ancestors once sought woolly mammoths and witchetty grubs. You may even buy your groceries online.

But in an odd sort of way, Internet shopping has brought us full circle. According to researchers in the US, the strategies you use when you surf the Web are exactly the same as the ones hunter-gatherers used to find food. You may be plugged into the information superhighway, but deep down you're still a caveman.

At least that's the opinion of two researchers at Xerox's Palo Alto Research Center in California. Peter Pirolli and Stuart Card are using foraging theories from ecology and anthropology to understand how people find information in data-rich environments such as the Internet. They believe Web surfers rely on prehistoric instincts to maximise their yield when they hunt and gather morsels of information. If they're right, their results could help others design websites and search tools that are as alluring to informavores as flowers are to bees.

Biologists came up with foraging theory in the 1970s as a way of explaining some puzzling aspects of animal behaviour. A hungry fox, for example, might have the choice between chasing a big, juicy rabbit or a tiny vole. Which should it choose? Foraging theory can decide. It states that as far as possible, animals make choices that maximise their "benefit per unit cost". In other words, they'll expend food-gathering energy in ways that yield the best energy returns. The rabbit might have a high energy value, but it costs a lot to catch. The vole is much easier prey.

This cost-benefit analysis is complicated by the fact that food resources aren't evenly distributed around the world. They're patchy. The longer a forager exploits one patch, the lower the returns will be, until the patch is overgrazed and worthless. But time spent searching for a new patch is unprofitable: there's nothing to gain from a sterile space. So when is the best time to start looking for a new patch?

It turns out that the optimal strategy is to move on when the rate of return from a particular patch falls below the average rate over the whole region. This is the marginal value theorem, a cornerstone of foraging theory formulated by the University of New Mexico biologist Eric Charnov in 1976. And it doesn't just apply to animals. The theorem has been widely used in anthropology to explain all sorts of human behaviours, from food preferences to patterns of land tenure.
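To make the leaving rule concrete, here is a minimal Python sketch. It is an illustration only, not Charnov's formulation or anything used by the researchers: the diminishing-returns curve in `gain`, the parameter values and all the function names are assumptions. It scans possible staying times for a single patch and picks the one that gives the best average return over a whole cycle of travelling and foraging.

```python
import math

# Assumed shape of a patch: returns tail off the longer you stay.
G_MAX = 100.0   # hypothetical total yield available in the patch
TAU = 5.0       # hypothetical time scale of depletion

def gain(t):
    """Cumulative yield after t time units in the patch."""
    return G_MAX * (1.0 - math.exp(-t / TAU))

def marginal_rate(t):
    """Rate of return at the instant t (the derivative of gain)."""
    return (G_MAX / TAU) * math.exp(-t / TAU)

def overall_rate(t, travel_time):
    """Average return over one cycle: travel to the patch, then stay for t."""
    return gain(t) / (travel_time + t)

def best_leaving_time(travel_time, step=0.001, horizon=50.0):
    """Brute-force scan for the staying time with the highest overall rate."""
    best_t, best_r = step, overall_rate(step, travel_time)
    for i in range(1, int(horizon / step) + 1):
        t = i * step
        r = overall_rate(t, travel_time)
        if r > best_r:
            best_t, best_r = t, r
    return best_t, best_r

t_star, r_star = best_leaving_time(travel_time=2.0)
print(f"leave after {t_star:.2f} units; average rate {r_star:.2f}, "
      f"marginal rate at that moment {marginal_rate(t_star):.2f}")
```

Under these assumptions, the best moment to leave is the one at which the patch's instantaneous rate has dropped to the long-run average, which is the rule described above. Lengthening `travel_time` pushes the leaving time later: when new patches are costly to reach, it pays to stay longer in the current one.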

Pirolli and Card now believe the same idea can be used to understand information foraging. Imagine you're a financial analyst looking for data about an investment company. You've found a useful site on the Web, but it's starting to feel a bit stale. You'd like to move on, but you know that a search will take time and there's no guarantee that other sites will be any more useful. When should you abandon the dwindling supply? This, Pirolli and Card argue, is analogous to the problem faced by hunter-gatherers. And it can be solved in the same way.

The first inkling that this was the case came in 1992. Pirolli and Card were studying the relationship between humans and information, looking for a theory that explained how people performing data-intensive tasks decide where and how to look for data. They had already conducted what they call "quick and dirty" field studies of information-gathering behaviour, one on a group of MBA students and another on the author of a business newsletter.

Pirolli knew something of foraging theory, and he quickly noticed a correlation between the studies' findings and the behaviour you'd expect from animals searching for food. Like hungry foxes, information foragers try to maximise their benefit per unit cost. In this case, "benefit" means the relevance of the information and "cost" the time it takes to find it. They are also likely to move on from an information resource when it no longer yields a better-than-average return.

It was a satisfying analogy, but they needed empirical findings to back it up. So they designed a computer model that obeyed the rules of optimal foraging theory and set it to work looking for information.

The latest incarnation of Pirolli and Card's artificial forager is based on ACT, a theory of cognition developed by Carnegie Mellon computer scientist John Anderson. ACT stands for both Adaptive Character of Thought and Atomic Components of Thought, and is well suited to the research because it possesses human-like conceptual and problem-solving skills, things an information forager needs in abundance. On top of these, Pirolli and Card programmed in the rules of optimal foraging theory.

To test whether the theory produces useful results, Pirolli and Card set their model to work looking for information on a database. The database they chose was the IR Test Collection, one of the ultimate challenges in information science. It's a huge reservoir of texts from The Wall Street Journal, the Financial Times, the San Jose Mercury-News, the Associated Press newswire, the Department of Energy, the Federal Register, the US Patent Office, computer publisher Ziff-Davis and a handful of sources in Japanese, Spanish and Chinese. It contains more than a million documents.

Fastest route

Pirolli and Card pinpointed target documents in the IR Test Collection and worked out the most efficient strategies for retrieving them. For this, they used an information retrieval system called Scatter/Gather, designed for sifting through large databases. Scatter/Gather assigns each document to one of 10 groups according to its content, so documents that contain similar words end up in the same group. It presents these on screen as 10 boxes, each containing a collection of keywords. The user selects one or more of the groups. Scatter/Gather then discards the documents in the unselected boxes and scatters the remainder into 10 more groups. It repeats the process until the user is satisfied that it's worth reading the gathered texts.
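The scatter-then-gather loop can be sketched in a few lines of Python. This is a toy stand-in, not the real Scatter/Gather system: the article does not say how documents are clustered, so the sketch assumes a crude grouping by word overlap around evenly spaced "seed" documents, and every function name here is hypothetical.

```python
from collections import Counter

def words(doc):
    """The set of lower-cased words in a document."""
    return set(doc.lower().split())

def jaccard(a, b):
    """Word overlap between two word sets, from 0 (none) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def scatter(docs, k=10):
    """Assign each document to the most similar of up to k seed documents."""
    k = min(k, len(docs))
    if k == 0:
        return []
    seeds = [words(docs[i * len(docs) // k]) for i in range(k)]
    groups = [[] for _ in range(k)]
    for doc in docs:
        w = words(doc)
        best = max(range(k), key=lambda i: jaccard(w, seeds[i]))
        groups[best].append(doc)
    return [g for g in groups if g]

def keywords(group, n=3):
    """The n words that appear in the most documents of a group."""
    counts = Counter(w for doc in group for w in words(doc))
    return [w for w, _ in counts.most_common(n)]

def gather(groups, chosen):
    """Keep only the documents in the chosen groups; discard the rest."""
    return [doc for i in chosen for doc in groups[i]]

docs = [
    "fox hunts rabbit", "fox hunts vole",
    "web search engine", "web search results",
    "stock market news", "stock market report",
]
groups = scatter(docs, k=3)
for i, group in enumerate(groups):
    print(i, keywords(group), len(group), "documents")

# The user picks box 0; the survivors are scattered again.
remaining = gather(groups, chosen=[0])
print(scatter(remaining, k=2))
```

Each pass narrows the pile: the user reads the keyword summaries, chooses the promising boxes, and the system re-sorts only what is left.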

To find the fastest retrieval routes (in other words, those using the smallest number of steps), Pirolli and Card worked backwards through Scatter/Gather, starting from the target documents. Then they asked their artificial forager to find the same pieces of information within the IR Test Collection. It did so with little problem. When Pirolli and Card plotted the forager's track through the collection, it matched the ideal route almost perfectly.

They then recruited eight human volunteers and asked them to perform the same task. Again, their routes closely matched the ideal one. It seems as though informavores really do employ optimal foraging strategies to sniff out rich information patches and avoid the arid plains.

Experts in foraging theory agree. "It's likely Web users rely on problem-solving abilities with deep evolutionary roots," says Bruce Winterhalder, an anthropologist at the University of North Carolina at Chapel Hill who has studied human foragers in great detail. "Foraging on the Web presents trade-offs analogous to those of hunter-gatherers. Different context, but similar cost-benefit problems."

Biologist David Stephens of the University of Minnesota, who co-wrote the seminal 1986 book Foraging Theory, adds: "Animals have been solving search problems for millennia, and natural selection has made them good at it. It follows that we can learn something from them."

What that means in practical terms is that database and Web designers could use foraging theory to help them create more productive information resources. The theory could prove particularly useful at that crucial moment when a forager starts thinking about leaving one patch in search of another.

In this respect, one of the most useful ideas the research has produced is that of "information scent". Pirolli and Card guessed that information foragers, whether human or artificial, have some way of evaluating the likelihood of finding target information in a given Scatter/Gather box. This led them to the idea that associated concepts "rub off" on one another, leaving detectable traces, just as a watering hole frequented by woolly mammoths will smell of woolly mammoths. A hunter-gatherer seeking mammoths is likely to be drawn to the watering hole, if only to look for spoor. Information foragers do the same. Imagine you're looking for texts about foraging theory. If Scatter/Gather throws up a box containing the keyword "hunter-gatherer", you're likely to select that box. It just smells right.
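One rough way to picture information scent is as a score built from how strongly each keyword in a box is associated with what the forager wants. The Python sketch below is an assumption-laden toy, not Pirolli and Card's model: the association strengths in `ASSOCIATIONS` are invented for illustration, and the function names are hypothetical.

```python
# Invented association strengths between pairs of concepts (0 = unrelated).
ASSOCIATIONS = {
    ("foraging", "hunter-gatherer"): 0.8,
    ("foraging", "ecology"): 0.6,
    ("foraging", "stock"): 0.05,
}

def association(a, b):
    """Strength of the link between two terms, looked up in either order."""
    if a == b:
        return 1.0
    return ASSOCIATIONS.get((a, b), ASSOCIATIONS.get((b, a), 0.0))

def scent(goal_terms, box_keywords):
    """Total association between the forager's goal and a box's keywords."""
    return sum(association(g, k) for g in goal_terms for k in box_keywords)

def pick_box(goal_terms, boxes):
    """Follow the strongest scent: return the name of the best-scoring box."""
    return max(boxes, key=lambda name: scent(goal_terms, boxes[name]))

boxes = {
    "A": ["hunter-gatherer", "ecology"],
    "B": ["stock", "market"],
}
print(pick_box(["foraging"], boxes))
```

With these made-up numbers, a forager after "foraging" is drawn to the box that mentions hunter-gatherers, even though the word "foraging" appears in neither box.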

Xerox is now trying to capture the essence of information scent and infuse it into Web pages, giving surfers subtle come-ons as they sniff around for useful sites. "We are developing technologies that help designers make page layouts that give off good information scent: cues that allow users to assess the match of information to their needs and identify how to get to it," says Pirolli.

The analogy between food and information looks like being a big help to Web designers. But at some point, Pirolli says, it's likely to break down. For one thing, there's the question of evaluating costs and benefits. Biologists and anthropologists can always draw up an energy balance sheet for a foraging behaviour in joules. The value of information isn't so easy to measure. Another problem is that foraging models tend to assume environments stay the same over time, whereas information ecologies are nothing if not dynamic. Ingenious informavores, and those who seek to provide them with information, can actively manipulate their environment.

And even if information foraging theory works, there's no guarantee that it will be used to benefit the forager. Think of insectivorous flowers that lure flies with the scent of carrion. As Card points out: "The vendor's interests may not correspond with the searcher's. They may camouflage information to hide it or mimic something that they think you want. Banner ads, especially ones with fake buttons on them, are an example." So next time you're hunting down information on the Web, beware. It could smell like a juicy rabbit, but turn out to be a vole.

###

Rachel Chalmers is a technology writer based in San Francisco

Further reading: Foraging Theory by David Stephens and John Krebs (Princeton University Press, Princeton, 1986)

New Scientist issue: 11 November 2000

PLEASE MENTION NEW SCIENTIST AS THE SOURCE OF THIS STORY AND, IF PUBLISHING ONLINE, PLEASE CARRY A HYPERLINK TO: http://www.newscientist.com
