Guest Column | December 2, 2021

Addressing The Issue Of Bad Data: 3 Best Practices For IT Leaders And MSPs

By David Loo, BitTitan


What is your data costing you? As Thomas C. Redman wrote in the MIT Sloan Management Review, "Bad data is the norm." And the norm is costly.

According to Redman, incomplete, old, and inaccurate data costs most companies 15% to 25% of revenue. So, how can IT leaders and MSPs fix degraded data?

The answer is straightforward. We must ensure the data driving our business operations, finances, and customer satisfaction is up-to-date and reliable.

Data Gone Wrong 

To say that data is degraded means that it's incomplete, old, or inaccurate. There can be too little or too much. And bad data often produces bad outcomes.

Incomplete data is missing essential pieces. Maybe something's gone wrong in its collection. Our schema might change, which causes missing pieces. Imagine that we've collected weather data for the past 50 years. This year, we introduce a new field — visibility index. Because we haven't collected visibility index data for the last half-century, that field will show as zeros for all past years. That missing data gives us an inaccurate picture.
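The skew described above is easy to see in a minimal Python sketch. The records and the visibility-index field here are hypothetical, but they show how placeholder zeros for years that were never measured drag down an average, while treating those years as missing gives the honest picture.

```python
# Hypothetical weather records: "visibility_index" was only added in 2025,
# so older records default to 0 rather than being marked as missing.
records = [
    {"year": 2023, "visibility_index": 0},    # actually missing, stored as 0
    {"year": 2024, "visibility_index": 0},    # actually missing, stored as 0
    {"year": 2025, "visibility_index": 8.5},  # genuinely collected
]

# A naive average treats the zero placeholders as real measurements.
naive_avg = sum(r["visibility_index"] for r in records) / len(records)

# Restricting to years where the field was actually collected is honest.
collected = [r["visibility_index"] for r in records if r["year"] >= 2025]
honest_avg = sum(collected) / len(collected)

print(round(naive_avg, 2))   # 2.83 -- badly skewed by placeholder zeros
print(honest_avg)            # 8.5
```

The fix is not to delete the old records but to distinguish "zero" from "never collected" so downstream analysis can exclude the latter.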

Old data is just as costly. If you're trying to project which of your sales regions will need more help, but your data ended three months ago, you're not getting the insight necessary for smart, agile decisions.

Even if data is current, it must be collected and stored accurately. If you're collecting the daily high temperature for the last 50 years, you need that data to be uniform and accurate. You don't have accurate data if some of those temperatures are inputted in Fahrenheit and others in Celsius. You have a mess.
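One way to enforce that kind of uniformity is to normalize readings to a single unit at the point of storage and reject anything ambiguous. This is a minimal sketch with a hypothetical reading format, not a prescription for any particular system:

```python
def normalize_to_celsius(reading):
    """Normalize a temperature reading to Celsius.

    `reading` is a hypothetical dict like {"value": 86.0, "unit": "F"}.
    Rejecting unknown units keeps mixed-unit data out of the store.
    """
    value, unit = reading["value"], reading["unit"]
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32) * 5 / 9
    raise ValueError(f"Unknown temperature unit: {unit!r}")

mixed = [{"value": 86.0, "unit": "F"}, {"value": 30.0, "unit": "C"}]
print([round(normalize_to_celsius(r), 1) for r in mixed])  # [30.0, 30.0]
```

Because the function raises on anything it doesn't recognize, a mixed or mislabeled batch fails loudly at ingestion instead of silently polluting 50 years of history.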

Duplicate or additional data, too, can blur an otherwise clear picture. To be truly useful, data must be complete, clean, and up to date.

Data Inconsistencies 

Data gives insight into the state of a business from an IT point of view. How is my network functioning? How am I seeing and serving my customers? What should I be prioritizing? Data inconsistencies give you badly skewed answers.

The risks of relying on bad data are costly. Inaccurate data can lead to hiring the wrong people, choosing the wrong solutions to problems, and incorrectly forecasting sales. Incomplete data can cause simple yet devastating problems. A small missing zero in a revenue column can cause you to see a $10 million customer as a $1 million customer, leading to underprioritizing them. In a worst-case scenario, that customer feels uncared for, and you lose them and their business.

When you're working with inconsistent data, your priorities might be scattershot, or worse, flat-out wrong. You need clean, reliable data to strategize, prioritize, and ensure your company uses its resources effectively.

Resolving Data Inconsistencies 

Three best practices can help IT professionals resolve data inconsistencies. We should approach data accuracy by looking at collection, storage, and output.

To ensure complete data, monitor collection regularly, whether daily or hourly. If you miss a step in collection, you'll know what's missing and can address it immediately. Regular monitoring keeps data current and accurate.

With storage, you need to establish clear, strict criteria. Are we using Fahrenheit or Celsius? Everyone must be on the same page. You can also set staleness reminders: if a record hasn't been updated in four days, someone is notified and that data is refreshed.
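A staleness reminder like the one above reduces to a simple scan over last-updated timestamps. This sketch uses a hypothetical record shape (`id`, `updated_at`) and the four-day threshold from the example; in practice the result would feed a notification system rather than a print statement.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=4)  # threshold from the example above

def find_stale(records, now):
    """Return records whose `updated_at` timestamp (a hypothetical field)
    is older than the staleness threshold, so someone can be notified."""
    return [r for r in records if now - r["updated_at"] > STALE_AFTER]

now = datetime(2021, 12, 2)
records = [
    {"id": "acme", "updated_at": datetime(2021, 11, 25)},   # 7 days old -> stale
    {"id": "globex", "updated_at": datetime(2021, 12, 1)},  # 1 day old -> fresh
]
print([r["id"] for r in find_stale(records, now)])  # ['acme']
```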

Output, however, is the key to clean data.

Data Snapshotting  

The best output practice for ensuring accurate data is data snapshotting. Outputting snapshots of your data into a controlled environment lets you assess it and root out costly inconsistencies.

Snapshotting puts your data in a secure container. Here, the snapshots can be reviewed and compared with your real-time data to catch inaccuracies. This helps you identify and clean up irregularities.

You might notice a new field pop up that you need to address. Maybe you have a hundred data points for another field, but there are only 95 in the warehouse. What happened to the other five? Snapshotting your data lets you find inaccuracies before they compound into significant issues.
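The hundred-versus-95 comparison above amounts to a set difference between live record IDs and the snapshot in the warehouse. Here is a minimal sketch under that assumption; the `rec-N` identifiers are hypothetical placeholders for whatever keys your systems use.

```python
def compare_snapshot(live_ids, warehouse_ids):
    """Compare record IDs in the live system against a warehouse snapshot.

    Returns (missing_from_warehouse, unexpected_in_warehouse) so gaps --
    e.g. 100 live points but only 95 in the warehouse -- surface immediately.
    """
    live, warehouse = set(live_ids), set(warehouse_ids)
    return sorted(live - warehouse), sorted(warehouse - live)

live = [f"rec-{i}" for i in range(100)]
snapshot = live[:95]  # five records never made it to the warehouse
missing, unexpected = compare_snapshot(live, snapshot)
print(len(missing), unexpected)  # 5 []
```

Running a comparison like this on every snapshot turns "what happened to the other five?" from a forensic exercise into a routine alert.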

Push, Not Pull 

Data synchronization helps with day-to-day issues, but data inconsistencies can still arise depending on which of the two data tracking and extraction methods you apply: pull or push.

The pull approach is the most widely used in the industry. With this method, an external system pulls data from sources of your choosing. However, you must grant that system access using privileged login credentials. This can expose your vital data to external factors such as other users and programs, leaving it at risk of deletion, inaccurate changes, and other significant problems.

When snapshotting data, however, the push method offers more secure, accurate updating and monitoring. You don't need to give out credentials. There aren't any external applications or users coming into your systems. Instead, you're pushing out the data you want into the warehouse. When you push, you're ensuring the data being analyzed is the most up to date.

Analyzing Data Across Multiple Applications 

Enterprises need to collect and analyze data across multiple applications. ServiceNow, for example, focuses on IT-related work. So, if we're working on an isolated incident, we'll see individual data from the CMDB or the HR system. But as you go up the business manager ladder, you want different answers.

IT leadership wants to know if they're prioritizing service correctly. But executive leadership needs a bigger picture. How much money are we making? How much are we losing? To gain insights into these more complex questions, you must join disparate data sets in a space outside any single service.

What if you need to know which of your biggest customers has the most service problems that aren't being addressed? To answer that, you need to join multiple sets of data outside of those systems. Once you analyze data across various applications, you can decide where to concentrate your hiring push.
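That kind of cross-application question can be answered with a small join once both exports land in a neutral space. This sketch assumes two hypothetical exports, customer records from a CRM and tickets from a service desk, and ranks customers by revenue and open-ticket count; real systems would use proper identifiers and a warehouse query, not in-memory lists.

```python
# Hypothetical exports from two separate systems, joined outside either one.
customers = [
    {"id": 1, "name": "Acme", "annual_revenue": 10_000_000},
    {"id": 2, "name": "Globex", "annual_revenue": 1_000_000},
]
tickets = [
    {"customer_id": 1, "status": "open"},
    {"customer_id": 1, "status": "open"},
    {"customer_id": 2, "status": "open"},
    {"customer_id": 1, "status": "closed"},
]

# Count unresolved tickets per customer.
open_counts = {}
for t in tickets:
    if t["status"] == "open":
        open_counts[t["customer_id"]] = open_counts.get(t["customer_id"], 0) + 1

# Rank by revenue, then by open-ticket count, to find the biggest
# customer with the most unaddressed service problems.
ranked = sorted(
    customers,
    key=lambda c: (c["annual_revenue"], open_counts.get(c["id"], 0)),
    reverse=True,
)
print(ranked[0]["name"])  # Acme
```

Neither system alone can produce this answer: the CRM doesn't know about tickets, and the service desk doesn't know revenue. The join only exists in the outside space.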

We are both awash in data and beholden to it. Enterprises with current and accurate data have a tremendous advantage. Business leaders must implement reliable tracking, synchronizing, and snapshotting to prevent common and costly data inconsistencies.

About The Author 

David Loo is the Chief Product Officer for BitTitan and is responsible for driving the product organization. A 30-year veteran in systems and applications integration, David founded Perspectium in 2013; he was also a founding member of ServiceNow's development team and was instrumental in creating the foundation for integrating and extending the platform.