4 Questions to Ask Before Starting a Big Data Initiative
Big data holds much promise for organizations seeking ways to improve business processes. Before reaching that point, though, you need to know what data you'll need, how you'll get it, how you'll use it and what you expect to learn from it. Learn how to start answering those questions here.
Tue, August 21, 2012
CIO — With the tremendous growth of electronic data that is being captured through social media, locating systems and internal enterprise systems, many industries are on the lookout for tangible benefits that they can reap from big data. In industries such as financial services, fraud detection and risk assessment rank among the tangible insights that organizations have been able to extract from their big data.
In many surveys, executives have said that big data initiatives rank high on their project lists for 2012 and 2013. That is not surprising: the promise that big data can improve and streamline presentations, yield insights into consumer purchasing habits and, in the healthcare industry, even help save lives is simply too important to ignore.
However, several key questions must be answered:
- What data should you consider?
- How is data captured?
- What tangible benefits can big data initiatives provide your organization?
- What is the ROI for a big data initiative?
While there may be more questions on many IT executives' minds, these are the four that dominate most conversations. Here are the details and answers to the above questions.
What Data Should You Consider?
Data comes in three formats: structured, semi-structured and unstructured. Structured data is organized in a way that both computers and humans can read. The most obvious example is a relational database. Semi-structured data, which includes XML, email and electronic data interchange (EDI), lacks such formal structure but nonetheless contains tags that separate semantic elements. Finally, unstructured data refers to data types, such as images, audio and video, that are not part of a database.
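To make the three formats concrete, here is a minimal Python sketch. The table name, field names and values are invented for illustration; only the standard-library sqlite3 and xml.etree.ElementTree modules are used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: a relational table with a fixed schema (in-memory SQLite).
# The "orders" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.execute(
    "INSERT INTO orders (customer, total) VALUES (?, ?)", ("Acme", 129.50)
)
structured_row = conn.execute("SELECT customer, total FROM orders").fetchone()
conn.close()

# Semi-structured: XML has no fixed schema, but tags separate the
# semantic elements, so fields can still be pulled out by name.
xml_doc = (
    "<order><customer>Acme</customer><total>129.50</total>"
    "<note>Rush delivery</note></order>"
)
root = ET.fromstring(xml_doc)
semi_structured = {child.tag: child.text for child in root}

# Unstructured: raw bytes (here, the start of a PNG file signature)
# carry no named fields that can be queried directly.
unstructured = bytes([0x89, 0x50, 0x4E, 0x47])

print(structured_row)
print(semi_structured)
print(len(unstructured), "raw bytes")
```

Note that the XML values come back as text ("129.50" rather than a number), which hints at the extra interpretation semi-structured data needs before it can be analyzed alongside database records.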
The foremost challenge is unlocking the data and gaining access to it so you can store it and use it. Once it is accessible, the information can stay in its raw format, where it can be analyzed and reported on as it streams in real time into an analytics system. For structured data, this process is fairly straightforward. When working with unstructured data, on the other hand, advanced algorithms and powerful engines are needed to process the incoming data.
How Is Data Captured?
There are countless data sources available to you, and they likewise provide countless types of data. Ultimately, it boils down to deciding which combination of data you need to collect.
One of the most commonly discussed data sources that today's companies use to gain insight into their consumers and brand following is social media. This is possible because Facebook, Twitter and the other major social media sites all offer some sort of data access through an application programming interface (API).
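As a rough sketch of what working with API-delivered social data might look like, the Python snippet below counts brand mentions in a JSON payload. The payload, its field names ("posts", "user", "text") and the brand name are all made up for illustration; each real social media API defines its own schema and access rules.

```python
import json
from collections import Counter

# Made-up sample payload standing in for an API response.
# Real social media APIs use their own schemas and authentication.
sample_response = """
{"posts": [
  {"user": "alice", "text": "Loving my new ExampleBrand phone"},
  {"user": "bob", "text": "examplebrand support was slow today"},
  {"user": "carol", "text": "Nothing to report"}
]}
"""


def count_mentions(raw_json, brand):
    """Count, per user, the posts whose text mentions the brand.

    Matching is a simple case-insensitive substring check.
    """
    posts = json.loads(raw_json)["posts"]
    needle = brand.lower()
    return Counter(p["user"] for p in posts if needle in p["text"].lower())


mentions = count_mentions(sample_response, "ExampleBrand")
print(dict(mentions))
```

A substring check is the simplest possible matcher; a production system would more likely need tokenization, language handling and sentiment analysis on top of it.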
The next significant data source pertains to location and movement patterns. As RFID, infrared and wireless technology gets smaller and more affordable, companies will have more assets, employees and customers reporting their location to the appropriate business application.
As organizations capture data from these sources and combine it with the structured and unstructured data they are storing on site and in the cloud, they must ensure that it is being used in a way that pays off.