Categorization Software Improves Search Capabilities

By Fred Hapgood


Even companies that can afford manual tagging have reasons to look at autocategorization. Chat Joglekar, business development manager at USAToday.com, says that the major benefit of autocategorization for his company is consistency. USAToday.com had long used editors to do manual categorizing, or had avoided categorization altogether. But as the sheer mass of online material and the total number of editors kept growing and changing, the slight idiosyncrasies in how each of them categorized information steadily degraded the search function’s performance. Now the online newspaper takes advantage of a product called Concept Server from Applied Semantics. While machines may have their peculiarities, at least their biases are consistent over both time and scale of operation.

Raymond Karrenbauer, CTO of ING Americas’ Technology Management Office, reports a fourth payoff: Automatic categorization and taxonomy make it easier for a company to add uncategorized or weakly categorized material, such as e-mail messages or ING’s more than 40,000 different formats of unstructured data, to its searchable data space. He adds that categorization improves the work of internal users, allowing customer service reps, for instance, to find what they need faster.

Several trends have combined to make those new services possible. First, two relevant "natural language recognition" technologies have matured almost simultaneously. One maps the frequencies of words in a document and their positions relative to each other to generate a document profile. The software then compares that profile with the profiles of previously categorized reference documents, those of other new documents or both. The first comparison sorts new documents into established categories; the second recognizes new topical "clusters" that probably should be explicit categories. For instance, if two documents have China and ceramics within 10 words of each other, the odds that they should be in the same category go up. Autonomy’s product relies on that approach.
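
The first comparison described above, matching a new document's profile against previously categorized reference documents, can be sketched in a few lines of Python. This is only an illustration of the general idea, not Autonomy's actual method: the function names, the cosine similarity measure, and the sample reference texts are all assumptions made for the example. The 10-word window comes from the article's China and ceramics example.

```python
import math
import re
from collections import Counter

WINDOW = 10  # from the article's example: two words within 10 words of each other


def profile(text, window=WINDOW):
    """Build a sparse document profile: counts of single words plus counts
    of word pairs that occur within `window` words of each other."""
    words = re.findall(r"[a-z]+", text.lower())
    feats = Counter(words)
    for i, w in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != w:
                # Sort the pair so (china, ceramics) and (ceramics, china) match.
                feats[tuple(sorted((w, other)))] += 1
    return feats


def cosine(a, b):
    """Cosine similarity between two profiles (an assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def categorize(text, references):
    """Return the category whose reference text has the most similar profile."""
    doc = profile(text)
    return max(references, key=lambda cat: cosine(doc, profile(references[cat])))


# Hypothetical reference documents, one per established category.
references = {
    "pottery": "Ming dynasty ceramics from China are prized porcelain",
    "trade": "China exports and tariffs shaped trade policy this year",
}
print(categorize("Collectors buy antique ceramics made in China", references))
```

Comparing new documents with one another, rather than with reference documents, would reuse the same `profile` and `cosine` functions to look for the topical clusters the article mentions.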

The second technique (the one used by Applied Semantics, Inquira and others) relies on semantics. Given a document, such a program first picks out the important words, then looks up their synonyms, meanings and thematic relationships (for example, the term chair would be linked to furniture and rocking). Finally it counts the number of these relationships to decide which words are most likely to reflect the document’s major and minor themes. Theoretically such a system can figure out whether an article on chips belongs under food, gambling, computers or horses, even if none of those specific terms appears in the document.
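
A toy version of that counting step might look like the following Python sketch. The lexicon, the stopword list and the function name are invented for illustration. A commercial product would draw on a far larger semantic network, and nothing here reflects how Applied Semantics or Inquira actually implement it.

```python
import re
from collections import Counter

# Hypothetical lexicon linking words to the themes they can relate to.
RELATED_THEMES = {
    "chips": {"food", "gambling", "computers", "horses"},
    "silicon": {"computers"},
    "wafer": {"computers", "food"},
    "processor": {"computers"},
    "fabrication": {"computers"},
    "salsa": {"food"},
    "poker": {"gambling"},
    "dealer": {"gambling"},
}

# Hypothetical list of unimportant words to drop before the lookup.
STOPWORDS = {"a", "an", "and", "each", "of", "on", "the", "to"}


def themes(text):
    """Rank themes by how many of the document's important words relate to them."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    votes = Counter()
    for w in words:
        for theme in RELATED_THEMES.get(w, ()):
            votes[theme] += 1
    return votes.most_common()


article = "New chips rely on silicon wafer fabrication, and each processor ships faster."
ranked = themes(article)
print(ranked[0])  # the major theme and its relationship count
```

The sample sentence never uses the word computers, yet that theme collects the most relationships, which is the behavior the article describes for an article on chips.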

Perhaps the best news for vendors designing autocategorization products, however, has nothing to do with research breakthroughs. Today, more and more information travels with a lengthening entourage of data about itself (such as e-mail headers or meta-tags in webpages). Autocategorization software can recognize and leverage that data for its own ends. For example, iPhrase Technologies specializes in finding and harvesting, or "spidering," categorization information across many data types. "Three to four years ago, we had to code up explicit structure with every deployment," says Senior Product Manager Roy Rodenstein. "But today our clients have much richer data."
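
As a rough illustration of the kind of self-describing data mentioned above, the Python sketch below pulls a few e-mail headers and webpage meta-tags using only the standard library. It is not iPhrase's spidering technology. The function names, the choice of headers and the sample inputs are assumptions for the example.

```python
from email import message_from_string
from html.parser import HTMLParser

WANTED_HEADERS = {"from", "to", "subject", "date"}  # assumed to be the useful ones


def email_metadata(raw):
    """Return selected headers of a raw e-mail message as a dict."""
    msg = message_from_string(raw)
    return {
        name.lower(): value
        for name, value in msg.items()
        if name.lower() in WANTED_HEADERS
    }


class MetaTagCollector(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            if attributes.get("name") and attributes.get("content"):
                self.meta[attributes["name"].lower()] = attributes["content"]


def html_metadata(page):
    """Return the named meta-tags found in an HTML page as a dict."""
    collector = MetaTagCollector()
    collector.feed(page)
    return collector.meta


raw_mail = "From: alice@example.com\nSubject: Q3 ceramics shipment\n\nBody text"
page = '<html><head><meta name="keywords" content="china, ceramics"></head></html>'
print(email_metadata(raw_mail))
print(html_metadata(page))
```

A categorizer could treat the harvested values as extra evidence alongside the document text, for instance by weighting a keywords meta-tag more heavily than body words.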

