Try explaining to a nonnative English speaker why calling someone a wise guy isn’t flattering. It’s not easy. That’s why Christopher Manning, an assistant professor of computer science and linguistics at Stanford University, is turning to the Web to help people master English’s strange sayings and seemingly arbitrary rules.
Manning is surfing the Web for examples of the language as it’s used in everyday life. “A grammar book doesn’t show how people actually use language,” he says. “Real usage is reflected in what people say and write.” So Manning is continually scanning online newspapers, literature, chat groups, even real estate listings, to build statistical models of English usage. The models will help course developers create accurate and useful instruction guides, courseware and related materials, Manning says. “By taking the emphasis off the rules and placing it on how people really speak, it will show students the real world rather than a perfect world.”
Manning’s research could also prove useful to speech recognition software developers. As businesses expect speech systems to handle increasingly complex transactions, accuracy is becoming more critical. “You can’t just settle for a 60 percent probability that the user said, ‘Buy 100 shares,’ rather than ‘Sell 100 shares,’” says Manning. He claims that speech systems based on probability models of real-life sentence structures would provide faster and more accurate recognition.
Improved data-mining technology could be another offshoot of Manning’s research. Statistical models would help computers extract the key information that’s tucked inside cryptic real estate ads or court decisions. “People don’t speak or write like a real estate ad,” Manning says. “The models would help machines to better understand the specific jargon.”