by Michael Fitzgerald

The Semantic Web: Improving Search

Mar 15, 2003 · 2 mins

The World Wide Web stands as a living version of Jorge Luis Borges’ Library of Babel: a vast store of text with no order to it. That is why the Semantic Web is taking shape, to help make sense of it all.

“There’s a plethora of information, and it’s not filtered,” says Allan Goode, CEO of Agent Software in Mountain View, Calif. Agent formed last October to build a Semantic Web tool for NanoSIG, a global nanotechnology networking group that wants to give members real-time access to data about the growing nanotech field. Agent’s efforts are focused (at least for now) for a reason: Semantic Web technology is still in its infancy.

The Semantic Web, brainchild of Web inventor Tim Berners-Lee and his cohorts at the World Wide Web Consortium, intends to make Web searching more efficient. The goal is to reduce the amount of human involvement required for actions that computers can do automatically, such as extracting meaning from content.

It all sounds like XML, but XML doesn’t convey, for instance, that a dresser and a chest of drawers are the same thing. The Semantic Web will be able to draw such conclusions, thanks to the Resource Description Framework (RDF), which can describe the contents of a collection of text, making it possible for machines to automatically compare even incompatible documents.
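
To make the dresser example concrete, here is a minimal sketch in Python. It does not use a real RDF library: statements are plain subject-predicate-object tuples, and every `ex:` name, including the `ex:sameAs` predicate, is a hypothetical identifier invented for illustration.

```python
# Illustrative only: triples as plain tuples, not a real RDF toolkit.
# All "ex:" identifiers are hypothetical names made up for this sketch.
TRIPLES = {
    ("ex:doc1", "ex:describes", "ex:Dresser"),
    ("ex:doc2", "ex:describes", "ex:ChestOfDrawers"),
    ("ex:Dresser", "ex:sameAs", "ex:ChestOfDrawers"),
}


def equivalents(term, triples):
    """Return term plus every term linked to it by ex:sameAs, in either direction."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in triples:
            if predicate != "ex:sameAs":
                continue
            if subject in found and obj not in found:
                found.add(obj)
                changed = True
            elif obj in found and subject not in found:
                found.add(subject)
                changed = True
    return found


def documents_about(term, triples):
    """Find documents describing term or anything declared equivalent to it."""
    names = equivalents(term, triples)
    return sorted(
        subject
        for subject, predicate, obj in triples
        if predicate == "ex:describes" and obj in names
    )


print(documents_about("ex:Dresser", TRIPLES))
```

A search for the dresser returns both documents, because the third statement tells the program the two terms mean the same thing. Plain XML tags carry no such statement.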

Agent Software uses a variety of software agents to find nanotech-related documents on the Web, pull them into a repository and give them RDF tags. Other agents then parse the documents into a variety of categories, creating a real-time search engine that lets the user define categories and priorities.
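
The article does not describe how Agent Software's agents work internally, so the following is only a rough sketch of the general idea: one step tags documents already in a repository, and another sorts them into user-defined categories with priorities. The names `Document`, `tagging_agent` and `search`, the keyword-matching approach and the example URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    url: str
    text: str
    tags: set = field(default_factory=set)


def tagging_agent(doc, vocabulary):
    """Attach every vocabulary term that appears as a word in the document text."""
    words = set(doc.text.lower().split())
    doc.tags = {term for term in vocabulary if term in words}
    return doc


def search(repository, categories):
    """categories maps a user-defined name to (required_tag, priority).

    Returns (priority, category, url) tuples, highest priority first.
    """
    hits = []
    for doc in repository:
        for name, (tag, priority) in categories.items():
            if tag in doc.tags:
                hits.append((priority, name, doc.url))
    return sorted(hits, key=lambda hit: (-hit[0], hit[1], hit[2]))


# Hypothetical repository contents; a real agent would fetch these from the Web.
vocabulary = {"nanotube", "funding", "patent"}
repository = [
    tagging_agent(doc, vocabulary)
    for doc in [
        Document("http://example.org/a", "New nanotube patent filed"),
        Document("http://example.org/b", "Funding round announced"),
    ]
]
categories = {"Research": ("nanotube", 2), "Money": ("funding", 1)}

for hit in search(repository, categories):
    print(hit)
```

The user controls both the categories and their ranking, which is the property the article attributes to the search engine. Simple keyword matching stands in here for whatever tagging the real agents perform.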

NanoSIG is also talking with Dynago, which is testing a product that can tag documents in RDF. Dynago pulls HTML from the Web and marks it up with XML. XML documents can be easily converted into RDF, opening the data to other computers.
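
The XML-to-RDF step can be sketched in a few lines of Python. This is not Dynago's method, which the article does not detail. It assumes a made-up XML layout in which each record carries an `id` attribute, and it turns each child element into a subject-predicate-object statement.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML markup; the element names and URL are invented for this sketch.
XML = """<articles>
  <article id="http://example.org/a">
    <title>Nanotube basics</title>
    <topic>nanotechnology</topic>
  </article>
</articles>"""


def xml_to_triples(xml_text):
    """Turn each child element of a record into a (subject, predicate, object) triple."""
    root = ET.fromstring(xml_text)
    triples = []
    for record in root:
        subject = record.get("id")  # the record's identifier becomes the subject
        for child in record:
            triples.append((subject, child.tag, (child.text or "").strip()))
    return triples


for triple in xml_to_triples(XML):
    print(triple)
```

Once the data is in this statement form, another program can read it without knowing the original XML layout.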

Sucking documents into databases to tag them is not ideal: even within the nanotechnology niche, a quick Web search retrieves some 500,000 documents. And until RDF and its kin spread, the Web will remain a Library of Babel.