As CIOs look for better value from their enterprise content management (ECM) solutions, they’re finding more cost-effective ways of operating with cloud-based file storage vendors. Box, Google Drive, AWS and others provide the same capabilities offered by expensive ECM solutions. In this article, ECM refers to a solution that stores unstructured data, such as documents, images, and plain text.
Traditional ECM solutions are no longer cost competitive and do not provide any additional value beyond simple indexing, storage and retrieval. Shifting the infrastructure, maintenance and operations of ECM to cloud-based file storage vendors seems unavoidable for companies that want to stay cost competitive.
Moving content to a cloud-based file storage vendor can lower operational costs. However, this is not enough to gain any real competitive advantage, because cloud-based file storage vendors do not reveal any additional insights over traditional ECM solutions. Companies are moving to big data solutions to gain better insights into their data. Yet they have had limited success in obtaining value from unstructured content in big data file stores. That success has mostly been confined to keyword proximity searches, classification and sentiment analysis on unstructured data streams like Twitter, Facebook, and LinkedIn.
Big data capabilities provide little value to company executives who are retaining terabytes or petabytes of static content. How does one make sense of all this unstructured data? There is no silver bullet for gaining optimum insights. One way to get value from your unstructured content is to bridge it with your structured content. However, the industry seems to lack an accepted overall strategy for turning unstructured data into actionable insights.
The strategy: extract, classify, contextualize
Turning unstructured data into actionable insights is no trivial task. Let’s take a look at the big picture (see Figure 1).
Raw unstructured text can’t simply be placed into the structured world and still be meaningful. There is too much data, the same elements can carry different meanings or alternate spellings, and an incredible amount of the information is irrelevant. You will need to perform some form of extraction, which will provide you with varying degrees of information value.
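To make the extraction step concrete, here is a minimal sketch in Python. It assumes two small, hypothetical lookup tables (SPELLING_VARIANTS and STOPWORDS) and a hypothetical function name, extract_terms; none of these come from a particular product, and a real system would build such tables from its own domain data.

```python
import re

# Hypothetical lookup tables, for illustration only.
SPELLING_VARIANTS = {"colour": "color", "organisation": "organization", "e-mail": "email"}
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "for", "please"}

def extract_terms(text):
    """Return normalized, de-duplicated terms from raw text, in first-seen order."""
    # Lowercase the text and keep alphanumeric words, allowing internal hyphens.
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    terms = []
    for token in tokens:
        # Map alternate spellings onto a single canonical form.
        token = SPELLING_VARIANTS.get(token, token)
        # Drop irrelevant words and terms that were already seen.
        if token in STOPWORDS or token in terms:
            continue
        terms.append(token)
    return terms

print(extract_terms("Please e-mail the Organisation about the colour of the organization logo"))
```

Even this toy version shows why extraction yields "varying degrees of information value": what survives depends entirely on which spellings and which irrelevant words you chose to account for.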
In most cases, extraction will only provide capabilities for classification and does not provide the ability to represent context for the business. Today, big data A.I. machine learning capabilities can provide classification solutions, like sentiment analysis, or spam identification. However, providing capabilities for contextual identification on unstructured data is problematic at best, because of a phenomenon known as statistical biasing. This is caused by not having large and diverse data sets to reinforce learning. Large and diverse data sets for a particular domain are difficult to come by and are tedious to build, maintain and update.
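The effect of a narrow training set can be illustrated with a deliberately simple sketch. The code below is not a real machine learning model; it only counts word overlap, and the training sentences, labels and function names are all invented for illustration. It shows how a classifier that has never seen a legitimate use of the word "claim" will lean toward the wrong label.

```python
from collections import Counter

def train(examples):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def classify(counts, text):
    """Pick the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)

# Invented training data: the only uses of "claim" are in spam.
narrow = [
    ("free prize claim now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda attached", "ham"),
]
# The same data plus two invented examples from the insurance domain.
broad = narrow + [
    ("insurance claim form attached", "ham"),
    ("process the claim for this patient", "ham"),
]

message = "please process this insurance claim"
print(classify(train(narrow), message))  # biased by the narrow data
print(classify(train(broad), message))   # corrected by more diverse data
```

With the narrow data the legitimate message is labeled as spam; adding just two domain-specific examples flips the result. Scaling that kind of coverage to a whole business domain is the tedious part.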
In “A.I. concierge services – realizing the promise of big data,” I introduced the concept of an information framework based on the W3C’s open Resource Description Framework (RDF) specification. RDF is a perfect solution for capturing and bridging unstructured and structured data. It provides a true enterprise solution for contextual mapping and protects a company from vendor lock-in. With it, you have the capability to turn your unstructured data repository into an oracle of corporate knowledge.
Achieving semantic maturity will enable you to build a knowledge management system that will transform the business. New types of capabilities can be realized, everything from automatically answering emails to adaptive and multiagent systems that process transactions. Imagine how these new capabilities will change IT’s ability to service the business. You can now tie your knowledge management solution to your business processes to provide invaluable insights.
For example, medical claims processing can use a knowledge management system to detect fraud, clinical treatment abuse and similar problems. You have now shifted your IT environment from simply processing transactions to understanding transactions.
The challenge for ECM vendors is to provide true information insights on unstructured data. In order to thrive and prosper, these vendors will require more than simple indexing, storage and retrieval of content. ECM vendors need to shift their view from data storage to knowledge management. Holding onto current capabilities will no longer be enough to stay competitive in a billion-dollar ECM marketplace.
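RDF describes data as subject-predicate-object statements called triples. The sketch below mimics that model with plain Python tuples to show how one graph can hold facts from both a structured source and an unstructured document. The example.org namespace, the predicate names and the helper functions are all hypothetical; a production system would more likely use an RDF store queried with SPARQL.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only

triples = {
    # Facts that might come from a structured source, such as a claims database row.
    (EX + "claim/1001", EX + "filedBy", EX + "patient/42"),
    (EX + "claim/1001", EX + "procedure", EX + "procedure/mri"),
    # Facts that might come from extraction on an unstructured document.
    (EX + "doc/note-7", EX + "mentions", EX + "patient/42"),
    (EX + "doc/note-7", EX + "mentions", EX + "procedure/mri"),
    (EX + "doc/memo-3", EX + "mentions", EX + "patient/99"),
}

def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

def documents_about(graph, subject):
    """Find documents that mention any entity the given subject refers to."""
    entities = {o for _, _, o in match(graph, s=subject)}
    mentions = match(graph, p=EX + "mentions")
    return sorted({s for s, _, o in mentions if o in entities})

print(documents_about(triples, EX + "claim/1001"))
```

Because the claim and the document share the same identifiers for the patient and the procedure, the structured record and the unstructured note become connected without either one being reshaped to fit the other.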
Mitch DeFelice started his career by serving six years in the U.S. Navy as part of the Naval Security Group tactical electronic support staff. Mitch’s military tours included serving with Fleet Air Reconnaissance Squadron (VQ-1) in Guam and on the support staff for Admiral Thomas B. Hayward, Commander-in-Chief, U.S. Pacific Fleet (CINCPACFLT), Honolulu, Hawaii.
Mitch is a TOGAF 9 Certified enterprise architect. Mitch’s primary focus is working with key business stakeholders and technology executive leadership to develop technology solutions that support unstructured data. This includes the areas of content management, records management, enterprise search and eDiscovery solutions. His passion lies in developing business solutions around Cognitive Computing capabilities.
Mitch is a frequent contributing author to trade magazines on unstructured data and cognitive computing related topics.
The opinions expressed in this blog are those of Mitch DeFelice and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.