INDUSTRY DIRECTIONS

Five Causes of Poor Data Quality

By David Greenfield

dgreenfield@automationworld.com

Editor-In-Chief/Director of Content

If networking devices are like the veins and arteries of Industry 4.0, data is surely its lifeblood. After all, data is intrinsic to every advanced manufacturing operation. Better decision-making and process improvements can’t be accomplished without operational data.

Given data’s intrinsic value to modern manufacturing and its future, it is paramount that the data being aggregated and analyzed be as clean and correct as possible. But you can only assess the cleanliness of your data if you’re aware of the key factors that can compromise its integrity.

Michael Simms, practice director for data and analytics at Columbus Global, a supplier of digital transformation applications and related services, says there are five causes of data quality degradation:

• Human error. “People make mistakes, and when you are hand-keying data, errors are expected, such as missing details, typos, or putting the data in the wrong fields,” Simms says.

• Inconsistent data-entry standards. If your company doesn’t have standards for how data should be entered or captured, data quality will inevitably suffer. “I once worked with a company on data migration, and we began by mapping their data based on zip codes and states. After a few minutes of combing through the data, I saw an issue immediately: The data showed 253 states because there was no set way to input the state names. Each state appeared multiple times in multiple formats; for example, Nebraska, Nebr., and NE,” says Simms. A normalization step like the one sketched after this list is one way to enforce such a standard.

• Lack of an authoritative data source. An authoritative data source provides the single version of truth that reliable reporting depends on. “If there are problems with your data, but fixes aren’t happening at the source, the reporting won’t be reliable on the other end,” Simms says. “For example, with the state data problem noted above, if the fix was made in the reporting, but no one went back and ensured that data was cleaned up at the source, the company would have continued to encounter that issue, muddying up any reporting. Proper data governance and set protocols to fix data issues need to be in place.”

• Poor data integration. Data quality may vary dramatically across the systems you are pulling data from. Simms points out that duplicate records are a particular challenge, as duplicates will skew any analysis; one common way to catch them is sketched after this list.

• Not keeping data up to date. “Data gets abandoned and forgotten,” says Simms. “Make sure it doesn’t.” To mitigate this issue, Simms advises keeping data such as product names, vendor and employee information, email addresses, and other pertinent information current by ensuring there is no duplicate, incorrect, or outdated information in these areas. A simple freshness check is sketched after this list as well.
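
On the data-entry-standards point, here is a minimal Python sketch of the kind of normalization that would have prevented the 253-states problem. The alias table and function name are hypothetical illustrations, not anything Simms or Columbus Global prescribes.

```python
# Minimal sketch: collapse free-text state entries to one canonical
# two-letter code at entry time. STATE_ALIASES is illustrative; a real
# table would cover every accepted variant of every state.
STATE_ALIASES = {
    "nebraska": "NE",
    "nebr.": "NE",
    "nebr": "NE",
    "ne": "NE",
}

def normalize_state(raw: str) -> str:
    """Return the canonical code, or raise so bad input is caught at the source."""
    key = raw.strip().lower()
    if key not in STATE_ALIASES:
        raise ValueError(f"Unrecognized state value: {raw!r}")
    return STATE_ALIASES[key]

print(normalize_state("Nebr."))     # -> NE
print(normalize_state("NEBRASKA"))  # -> NE
```

Rejecting unrecognized values at capture time, rather than patching them in reports, also reflects the fix-at-the-source advice above.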
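
On the integration point, this sketch shows one common way to catch near-duplicate records when merging systems, using the pandas library; the sample vendors and column names are invented for illustration.

```python
import pandas as pd

# Invented records pulled from two source systems. The second row is the
# same vendor as the first, differing only in case and whitespace, so a
# naive comparison would miss the duplicate.
records = pd.DataFrame({
    "vendor": ["Acme Corp", "acme corp ", "Beta LLC"],
    "email": ["sales@acme.com", "SALES@ACME.COM", "info@beta.com"],
})

# Normalize the matching keys first, then drop exact duplicates on them.
records["vendor_key"] = records["vendor"].str.strip().str.lower()
records["email_key"] = records["email"].str.strip().str.lower()
deduped = records.drop_duplicates(subset=["vendor_key", "email_key"])

print(deduped[["vendor", "email"]])  # one Acme row remains
```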
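
And on keeping data current, a small freshness audit: flag any record whose last verification has aged past a review window. The field names and the one-year window are assumptions made for the sake of the example.

```python
from datetime import date, timedelta

REVIEW_WINDOW = timedelta(days=365)  # assumed review cycle

# Invented master-data rows, each with a last_verified date.
rows = [
    {"employee": "J. Smith", "email": "jsmith@example.com",
     "last_verified": date(2020, 3, 1)},
    {"employee": "A. Jones", "email": "ajones@example.com",
     "last_verified": date.today()},
]

# Anything older than the review window gets flagged for re-verification.
stale = [r for r in rows if date.today() - r["last_verified"] > REVIEW_WINDOW]

for r in stale:
    print(f"Review needed: {r['employee']} ({r['email']})")
```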

A critical step in ensuring the quality of your data is to make sure IT doesn’t own it. You read that right—Simms advises against IT owning your company’s data.

“Your IT team might be resourceful, and they may do an excellent job of making sure everything is running smoothly,” says Simms, “but they should not be the keepers of the data. Data should be managed by the people who are living it day in and day out, those who know what it means and the importance of having that information be reliable.”

To build a reliable foundation for decision-making in your company, Simms says to plan for quality data, including giving ownership to management and seeking a trusted partner for the data-quality improvement process.
