News

 

OpenPipeline Seeks to Ease Document Prep for Search

 

April 30, 2008 — IDG News Service —

Enterprise search vendor Dieselpoint is behind a new open-source project centering on a document "pipeline" -- or as the Chicago company's CEO, Chris Cleveland, puts it, "all the boring stuff you need to make enterprise search work."

Enterprise search implementations often cover an array of document sources and components; pipelines allow companies to standardize the processing of information before it gets pushed into a search-engine indexer.
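To make the idea concrete, here is a minimal sketch of what such a pipeline might look like, written in Python. It does not show OpenPipeline's actual API: the `Document` class, the two stage functions and `run_pipeline` are hypothetical names invented for illustration. The point is only that every document, whatever its source, passes through the same ordered stages before it is handed to an indexer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Sequence


@dataclass
class Document:
    """Hypothetical container for one crawled item."""
    source: str                      # e.g. "web" or "sql" (illustrative labels)
    text: str
    metadata: Dict[str, object] = field(default_factory=dict)


# A stage is any function that takes a document and returns a document.
Stage = Callable[[Document], Document]


def normalize_whitespace(doc: Document) -> Document:
    """Collapse runs of whitespace so all sources look alike downstream."""
    doc.text = " ".join(doc.text.split())
    return doc


def tag_word_count(doc: Document) -> Document:
    """Stand-in for a text-analytics step: attach a simple annotation."""
    doc.metadata["word_count"] = len(doc.text.split())
    return doc


def run_pipeline(docs: Iterable[Document], stages: Sequence[Stage]) -> List[Document]:
    """Apply every stage, in order, to every document."""
    processed = []
    for doc in docs:
        for stage in stages:
            doc = stage(doc)
        processed.append(doc)
    return processed


# Documents from two different (assumed) connectors get identical treatment.
crawled = [
    Document("web", "  Enterprise   search\nis hard. "),
    Document("sql", "Pipelines\tstandardize documents."),
]
ready_for_indexer = run_pipeline(crawled, [normalize_whitespace, tag_word_count])
```

In a real deployment the stages would be crawler output handlers, format converters and text-analytics modules from different vendors, which is where a shared, open interface between them would matter.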

"We're connecting the crawler companies to the text analytic companies to the search engine companies," Cleveland said.

Dieselpoint had been having trouble integrating its own pipeline with third-party document analyzers and content connectors, so it has open-sourced that pipeline as the basis for the project, which is dubbed OpenPipeline.

Its Web site is scheduled to open to the public on Monday, and a fully functional version of the software will be downloadable under the Apache 2.0 license. It is available under a commercial license as well, according to the site.

The software features a point-and-click user interface and provides several connectors, including Web and SQL crawlers. It also supports commercial connectors for products such as SharePoint and Exchange, as well as for various portals.

Dieselpoint is pursuing the project both to make bigger, more complex implementations easier and in hopes that it will draw some customers to its search engine.

"The single biggest barrier to adoption of enterprise search is doing integration," Cleveland said. "Of course, it means enormous consulting engagements, so it's a source of revenue for the industry, but it's a deterrent."

While major search vendors have pipelines, they are "all proprietary and all closed," he said.

Several other vendors and consultants have signed on to the effort's advisory board, including Alias-i, Applied Relevance and Raritan Technologies. Cleveland anticipates that more companies will join soon.

Conceptually, an open-source pipeline makes sense for the industry on the whole "because each component is worthless on its own," he suggested.

Guy Creese, an analyst with Burton Group, compared OpenPipeline to an existing project.

"IBM attempted to fix this issue with UIMA [Unstructured Information Management Architecture], its framework for letting multiple vendors work together on a text analytics pipeline. However, UIMA has not done especially well in the market," he said via e-mail. "It's unclear whether that's due to the complexity of UIMA or the fact that the market isn't quite there yet (I believe it's the latter)."

"In short, OpenPipeline is an interesting, open-source alternative to UIMA. However, its appeal will still remain small in the market, as many enterprises aren't at the point where they need to mix and match text analytics modules," he added.

 
 
 
 
 
 