In Search of the Right Search Technology for Your Customers
Hint: The answer is not always Google. CIOs share their hard-earned lessons.
Solve the Context Problem
When you embark on an external search project, it’s important not to get overwhelmed by an early requirement: classifying all the data to be searched. One of the hardest issues for RB Search’s McCracken was bringing context into the search tool. He tried to tag the source material in the content management system to make the right information available to the search engine. But with 200 million documents and new ones being created all the time, the RB staff could not tag all the content to provide the categories that a search engine would use to find appropriate content, suggest related results and deliver related promotions. In fact, McCracken realized that perhaps only 2 percent of the content had been tagged, despite all the effort spent over a couple of years. Worse, “the tags were not consistent” among Reed’s subsidiary companies, he says.
So McCracken brought in a tool from Teragram that helped automate tagging of content after the fact, using a rules engine. Doing so meant creating the taxonomies and an ontological (conceptual categorization) dictionary of 210,000 terms, something that people must keep up to date, but this made the tagging of the 200 million documents possible, he notes. McCracken then deployed a search engine from Fast Search & Transfer that lets users navigate through the categorized results derived from the tagged content.
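To make the idea of rules-based tagging concrete, here is a minimal sketch in Python. It is not Teragram's product or RB Search's actual rules. The taxonomy, the category names and the functions build_rules and tag_document are all hypothetical, and a real dictionary would hold far more terms than this toy one.

```python
import re

# Hypothetical taxonomy: each category maps to the terms that signal it.
# A production dictionary would be much larger and maintained by people.
TAXONOMY = {
    "construction": ["concrete", "building permit", "contractor"],
    "electronics": ["semiconductor", "circuit board"],
}


def build_rules(taxonomy):
    """Compile one case-insensitive, whole-word pattern per category."""
    rules = {}
    for category, terms in taxonomy.items():
        # Longest terms first so multiword phrases win over shorter overlaps.
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        rules[category] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return rules


def tag_document(text, rules, min_hits=1):
    """Return {category: hit_count} for categories with at least min_hits matches."""
    tags = {}
    for category, pattern in rules.items():
        hits = len(pattern.findall(text))
        if hits >= min_hits:
            tags[category] = hits
    return tags


RULES = build_rules(TAXONOMY)
sample = "The contractor poured concrete after the building permit arrived."
sample_tags = tag_document(sample, RULES)
print(sample_tags)
```

Raising min_hits is one simple way to trade coverage for precision. Documents that match no category, or match many, are natural candidates for human review.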
The key to this software-assisted classification, McCracken says: You can’t depend completely on automation. Human experts must adjust the software’s rules and results. But when the tools are properly tuned to a company’s content, IT can then apply them to a vast quantity of documents, he says.
The U.S. General Services Administration took a similar approach to making public documents from multiple federal, state, and local government organizations available via the USA.gov website. It used Vivisimo’s clustering technology to contextually index the content from the multiple websites and Microsoft’s MSN to provide the search engine and index. GSA staffers now hand-tune the index and ontology as needed, and can create their own indexes quickly when the need arises, such as pulling together all Hurricane Katrina–related resources when the devastating storm struck in 2005, says John Murphy, director of USA.gov technologies.
To keep the index and ontology relevant, you’ll need to regularly analyze search queries and results to detect new user search patterns and expectations, says Ken Harris, CIO of natural products distributor Shaklee. He made this realization after replacing an old search engine with one from Visual Sciences (until recently known as WebSideStory) as part of a general Web modernization effort. The new tool came with analytics capability to help define relevance in results.
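The kind of query analysis Harris describes can be sketched in a few lines of Python. This is an illustration only, not the Visual Sciences tool. The log format (a search phrase paired with the number of results returned), the sample entries and the function summarize_queries are assumptions made for the example.

```python
from collections import Counter


def summarize_queries(log, top_n=3):
    """Summarize a search log given as (query, result_count) pairs.

    Returns the most frequent queries and the most frequent queries
    that returned no results, after normalizing case and whitespace.
    """
    normalized = [(" ".join(query.lower().split()), count) for query, count in log]
    all_counts = Counter(query for query, _ in normalized)
    zero_counts = Counter(query for query, count in normalized if count == 0)
    return {
        "top_queries": all_counts.most_common(top_n),
        "zero_result_queries": zero_counts.most_common(top_n),
    }


# Hypothetical log entries, invented for illustration.
sample_log = [
    ("Vitamin C", 12),
    ("vitamin c", 12),
    ("protein  shake", 0),
    ("protein shake", 0),
    ("vitamin c", 10),
    ("omega 3", 0),
]
report = summarize_queries(sample_log, top_n=2)
print(report)
```

Frequent queries that return nothing are a signal that the index or ontology is missing a term users expect, which is where hand-tuning tends to pay off first.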



