Improving data quality

There are many ways to analyse, organise and make decisions based on data, but what good are they if the data itself is inaccurate, unreliable and misleading? In a world where data is crucial for decision-making, data quality is one area that certainly shouldn’t be overlooked.

At the Cooperative Research Centre for Infrastructure and Engineering Asset Management (CIEAM), Professor Andy Koronios and Dr Jing Gao from the University of South Australia are working on ways to help organisations that manage infrastructure and engineering assets improve their data quality.

CIEAM is a Brisbane-based industry-directed research centre that works with industry partners and participants such as the Queensland Government, Queensland Rail, QR National, Sun Water, Australian Submarine Corporation, and research scientists from Australian universities to improve the efficiency and sustainability of infrastructure and engineering asset management.

Koronios and Gao’s research investigates data quality and data management issues within individual organisations, helping them develop better information governance frameworks and tools for improving data quality.

So why focus on data quality?

“[Poor] data quality is probably the biggest problem that we found most organisations are having difficulties with,” says Koronios.

A study by Ovum, Optimising Enterprise Applications: The Data Connection, surveyed senior IT executives at 146 large enterprises with revenues of more than $100 million in Australia, North America and the United Kingdom, and found that poor data quality was one of the main reasons for poor application management and performance. As the survey reveals, poor data quality is still very much an issue for CIOs when managing their organisations’ data.

Gao agrees that poor data quality is detrimental to organisations, especially when errors occur at the point of data entry. The need to fix this problem led to the development of a mobile application that lets workers enter data remotely and directly into a system, removing the errors that can creep in between collecting data in the field and keying it into the system later.

“We understand that a lot of times data errors occur at the time the data has been entered into the system. In order to address this, the data collectors need to be trained, motivated and empowered by adequate software tools,” he says.

“So we said, ‘Can we try using mobile applications to help people like field workers who need to go out to a remote area, to the actual physical asset location to do maintenance? Can we give a mobile application for them on their tablets, or their smartphones to enable them to enter data alternatively with a better designed interface?’

“For example, in the past organisations’ field force such as the engineers and maintenance people would drive miles to water companies’ pipe stations in the remote areas so they could fix the pipe, and after the job was finished they would probably fill out a paper copy drop sheet. Then they would go back to the office and have to type the drop sheet into the system.

"We have developed a mobile application so they can enter the information directly at the source where the physical asset locates, and therefore by using the mobile internet communication the data will be directly entered into the system. So the use of mobile apps and devices actually improves the data quality from the source.”

However, improving data quality for the long term is “not a one-shot activity”, says Koronios. As much as the tools and technologies that help improve data quality are important to have in place, he says it’s important to find the root cause of the problem first.

“I think a lot of people that talk about data quality [problems] they talk about the paradigm of having a lake and it’s poisoned with a lot of different poisons. If you don’t address the rivers coming into the lake then yes you can clean the water in the lake, but if the rivers keep bringing bad water in then you’re going to still have polluted water. So it’s the same with data,” he says.

“If you don’t find out why it is wrong and fix it, then you are going to keep getting the same problem over and over again… it’s costing you money over and over.”

IT governance is crucial to improving data quality for the long term, says Gao, and so he and Koronios use their research to put in place measurements for data quality and help ‘pluck out’ the root causes of problems in the organisation.

“We consider the objectives to improve information quality; we need to consider how we measure the information quality of the organisation. As part of the CIEAM project, we conduct interviews, we conduct various case studies and we develop the information quality maturity assessment model for organisations. So basically, we’re looking at how mature the organisation currently is in terms of managing information and to achieve quality information.”
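As a purely illustrative example of the kind of measurement Gao describes, one common data quality metric is completeness: the share of required fields that are actually populated. The field names and records below are hypothetical and do not reflect CIEAM’s actual assessment model.

```python
# Illustrative completeness metric: the fraction of required fields that hold
# a non-empty value across a set of asset records.
REQUIRED_FIELDS = ["asset_id", "location", "install_date"]

records = [
    {"asset_id": "P-01", "location": "Adelaide", "install_date": "2001-03-09"},
    {"asset_id": "P-02", "location": "",         "install_date": None},
]

def completeness(records: list) -> float:
    filled = sum(
        1
        for record in records
        for field in REQUIRED_FIELDS
        if record.get(field)  # counts only non-empty, non-null values
    )
    return filled / (len(records) * len(REQUIRED_FIELDS))

print(f"{completeness(records):.0%}")  # 4 of 6 required fields filled -> 67%
```

Tracking a handful of simple scores like this over time is one way an organisation can tell whether its information management is maturing.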

Koronios adds that organisations need to also identify which data is most important to their business and focus on governing and improving quality of that, rather than trying to “grapple” with all of the data.

He says managing data quality is not only about technology. Failing to put policies, frameworks, governance and best practices in place could cost an organisation not only money but also its reputation, and could even create legal and compliance issues. He cites the example of a marketing department repeatedly sending brochures to the same person because of duplicated and inaccurate data.
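The brochure example comes down to duplicate records that differ only in formatting. The sketch below shows one simple way to catch such duplicates by normalising names and addresses before matching; the contacts and the normalisation rules are invented for illustration, and real record linkage is considerably more involved.

```python
# Hypothetical mailing list containing a duplicate entry that differs only in
# capitalisation and street-name abbreviation.
contacts = [
    {"name": "Jane Smith", "address": "12 High St, Adelaide SA 5000"},
    {"name": "JANE SMITH", "address": "12 High Street, Adelaide SA 5000"},
    {"name": "Raj Patel",  "address": "3 River Rd, Brisbane QLD 4000"},
]

def normalise(contact: dict) -> tuple:
    """Build a matching key: lowercase, collapse whitespace, expand abbreviations."""
    name = " ".join(contact["name"].lower().split())
    address = " ".join(contact["address"].lower().split())
    address = address.replace(" st,", " street,").replace(" rd,", " road,")
    return (name, address)

def deduplicate(records: list) -> list:
    """Keep the first record for each normalised key, drop the rest."""
    seen, unique = set(), []
    for record in records:
        key = normalise(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

mailing_list = deduplicate(contacts)
print(len(mailing_list))  # 2 -- Jane Smith receives one brochure, not two
```

As Koronios argues, though, deduplicating the mailing list only cleans the lake; unless the entry process that created the duplicates is fixed, they will keep flowing back in.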

“If you don’t comply with reporting standards for data accuracy you can actually be on the wrong side of the law. So not only are there reputational implications, but as well non-compliance.”

Data quality is something CIOs need to focus on now more than ever, Koronios says.

“Recurrent resources need to be allocated to data quality and information quality improvement and governance programs. It’s a very worthwhile thing to do, particularly in today’s world where data is increasingly becoming much more difficult to manage. We talk about the Big Data areas where we are not only creating massive volumes of data but we are also creating a large variety of data and that is happening at an accelerated pace.

“In my opinion, everything else is replicable. You can replicate an SAP system… you can hire people. But the data of the organisation is a unique resource.”

Follow Rebecca Merrett on Twitter: @Rebecca_Merrett


Copyright © 2012 IDG Communications, Inc.
