Extracting business value from data

Steve Daum

When I visit my doctor, there is a sign-in sheet in the waiting room. I write down my name, my doctor’s name, and my arrival time, and then I sit down. The sign-in sheet is used by the staff to look up my records and prepare for my visit with the doctor. Once I’m in the exam room, my information is scratched out on the sign-in sheet.

The sign-in sheet is a small system. The intent of the system is to manage the order of serving patients and give the reception worker information for gathering records. This system stores very few pieces of data: the date, an arrival time, a patient name, and a doctor name.

I don’t know what they do with these sheets. They might file them for historical reference. They might throw them away. One thing is certain. There is more value to be gleaned from this data. For example, they might discover which physician in their practice is the most visited doctor on Fridays. They might discover the times of day that patients are most likely to arrive on time or most likely to arrive late. They could learn about the busiest days of the week. Like many data gathering systems, this one can provide value beyond the original intent.
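As a rough illustration of the questions above, here is a minimal Python sketch. It assumes the sheets have been typed up as rows of date, arrival time, patient name, and doctor name. The sample rows and the function names are invented for the example and do not come from any real practice.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical sign-in rows: (date, arrival time, patient, doctor).
SIGN_INS = [
    ("2011-10-07", "09:05", "A. Jones", "Dr. Lee"),
    ("2011-10-07", "09:40", "B. Smith", "Dr. Lee"),
    ("2011-10-07", "13:15", "C. Brown", "Dr. Patel"),
    ("2011-10-10", "10:00", "D. White", "Dr. Patel"),
    ("2011-10-14", "08:50", "E. Green", "Dr. Lee"),
]


def weekday_name(date_text):
    """Turn a YYYY-MM-DD string into a weekday name such as 'Friday'."""
    return DAY_NAMES[datetime.strptime(date_text, "%Y-%m-%d").weekday()]


def visits_by_weekday(rows):
    """Count sign-ins for each day of the week."""
    return Counter(weekday_name(date) for date, _, _, _ in rows)


def busiest_doctor_on(rows, weekday):
    """Return the doctor with the most sign-ins on a weekday, or None."""
    counts = Counter(doctor for date, _, _, doctor in rows
                     if weekday_name(date) == weekday)
    return counts.most_common(1)[0][0] if counts else None
```

Calling `visits_by_weekday(SIGN_INS)` answers "which days are busiest," and `busiest_doctor_on(SIGN_INS, "Friday")` answers the Friday question, all from four columns that were collected for a different purpose.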

First truism of data management systems – A set of data can provide more value than its creators envision.

Second truism of data management systems – A set of data collected, gathered, and stored for one purpose will eventually be used for some other purpose. Most often this will be for questions asked of the data that follow this pattern: “I wonder how many….” or “I wonder how often…”

Beyond the sign-in sheet example, consider larger systems that are widely used. Customer relationship systems, enterprise resource planning systems, website analytics, and email management are just a few. Each of these systems has an original intent, as the sign-in sheet does, and each holds large amounts of data from which value might be extracted.

Third truism of data management systems – The more data stored in a system, the larger the opportunity to extract business value from it.

A common problem with these systems is access to the underlying data. For example, imagine that your organization uses a commercially available, off-the-shelf contact management system. Typically, this system will provide some level of reporting on the data it manages. A standard set of reports is common. Some systems even offer limited customized reports. The output is helpful, but almost always you will have questions that a fixed set of outputs cannot easily answer. You might be curious about the number of new contacts entered in the morning compared to the number entered in the afternoon. Knowing this may help you train your workers or plan staffing levels. However, getting at this answer may be difficult. The skills required to be an ordinary user of the system are not always sufficient to answer questions like this.
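To show how small the morning-versus-afternoon question is once the data is reachable, here is a hedged Python sketch. It assumes the contact system can hand over one creation timestamp per new contact in a `YYYY-MM-DD HH:MM` format. That format, the sample values, and the function name are assumptions made for illustration only.

```python
from datetime import datetime

# Hypothetical export: one creation timestamp per new contact.
CREATED_AT = [
    "2011-10-03 08:15",
    "2011-10-03 11:59",
    "2011-10-03 12:00",
    "2011-10-03 16:30",
    "2011-10-04 09:45",
]


def count_by_half_day(timestamps):
    """Count contacts created before noon versus at noon or later."""
    counts = {"morning": 0, "afternoon": 0}
    for text in timestamps:
        created = datetime.strptime(text, "%Y-%m-%d %H:%M")
        key = "morning" if created.hour < 12 else "afternoon"
        counts[key] += 1
    return counts
```

The analysis itself is a few lines. The hard part, as the paragraph above describes, is usually getting the timestamps out of the application in the first place.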

Fourth truism of data management systems – For any data management system there is some amount of friction in gaining access to the data being managed.

This friction can take many forms. Four common ones are 1) friction from a lack of required skills, 2) friction created by duplication of data, 3) friction from technical barriers built into the system, and 4) point-of-view friction.

Many organizations, especially smaller ones, simply do not have workers with the technical skills required to extract data from an application and then manipulate it for further analysis. A small firm that makes dentures will have employees skilled in the tools and techniques of denture making – but they may not have highly trained database or software development workers. Often the skills deficit can be overcome with use of contract employees or consultants, technical support provided by the system vendor, or through specialized employee training.

The common problem of data duplication works like this. Most applications have features for exporting data. Since the application does not provide all the reporting options you need, data must be exported (moved) into a different application, one that is able to do the analysis. You export the data, import it into the other application, and then do the analysis. It works great. The first few dozen times you do this, everyone is happy. However, you soon realize that the steps of moving this data around are tedious. As the requests for analysis increase, some worker spends more and more time on this data movement. The analysis also goes stale quickly. The analysis you did last month needs to be redone each month. Better yet, it needs to be done daily, or even be available all the time, in real time. This is data duplication friction.

One way to address data duplication friction is to automate as many of the steps in the process as possible. For example, you might set up a script to export the data and set up another script to import the data into the analysis application. When you need the analysis, you run these two scripts and the work is easier.
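The two-script idea might look something like the following Python sketch. It is only an illustration under stated assumptions: both the source system and the analysis application are stood in for by SQLite databases, and the `contacts` table, the `contacts_copy` table, and their columns are hypothetical names. A real application would need its own export and import steps.

```python
import csv
import io
import sqlite3


def export_contacts(source_conn, out_file):
    """Step 1: write the (hypothetical) contacts table out as CSV."""
    writer = csv.writer(out_file)
    writer.writerow(["name", "created_at"])
    writer.writerows(source_conn.execute(
        "SELECT name, created_at FROM contacts ORDER BY created_at"))


def import_contacts(in_file, analysis_conn):
    """Step 2: replace the analysis copy with the CSV contents."""
    rows = [(r["name"], r["created_at"]) for r in csv.DictReader(in_file)]
    analysis_conn.execute("DROP TABLE IF EXISTS contacts_copy")
    analysis_conn.execute(
        "CREATE TABLE contacts_copy (name TEXT, created_at TEXT)")
    analysis_conn.executemany(
        "INSERT INTO contacts_copy VALUES (?, ?)", rows)
    analysis_conn.commit()
    return len(rows)


def refresh(source_conn, analysis_conn):
    """Run both steps, using an in-memory buffer as the exchange file."""
    buffer = io.StringIO()
    export_contacts(source_conn, buffer)
    buffer.seek(0)
    return import_contacts(buffer, analysis_conn)
```

Because `import_contacts` rebuilds the copy from scratch, `refresh` can be rerun whenever fresh analysis is needed. Note that the script hard-codes the column names, which is exactly the kind of dependency that breaks when the source data changes.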

Another way to address data duplication is to build a data warehouse on a schedule, say every night at midnight. This warehouse might be a copy of all the data in the host system, transformed to make it more amenable to further analysis by other applications. This approach removes one part of the tedious work of moving data around, but not all of it.

Solutions to the data duplication problem can be brittle. One change in the source data, one change in the data requested, one change in the way the analysis software imports data, and you find yourself back at the solution that always works -- manual intervention.

Perhaps the best solution to data duplication friction is to use an application that can analyze the data in native form – bypassing the need for these brittle, duct tape solutions that string one system together with another.

This leads to the most difficult friction problem to overcome: technical barriers built into the application. There are many ways to store data. Programs store data in text files, in binary files, in different databases, and in various proprietary formats. The developers of these programs balance competing needs in deciding how to store their data. They think about speed, efficiency, accessibility, security, cost, and many other issues. In the end, as a user of the application, you often do not have a say in these decisions. But they do have an impact on your ability to extract value from the data being stored.

Fifth truism of data management systems – An application vendor and an application user often have conflicting needs for the data being stored and managed.

The vendor wants to keep the data safe from loss or damage. The user wants full and free access to the data. The vendor may need to use complex storage techniques to gain efficiency. The user wants simple, easy-to-access, and easy-to-understand data.

The ideal situation is one where the application stores data in a well-known database with meaningful, understandable table and column names. However, not all application vendors use this approach. In some cases, they use a standard database but obscure the table and column names so that outsiders are less likely to find, and therefore use, the data.

The final type of friction to understand is point-of-view friction. Imagine that you want to examine data from your customer relationship management (CRM) system in the form of a statistical process control (SPC) control chart. Control charts are used to study variation over time. The workflow of using a control chart involves looking at the chart on a regular basis and knowing that the data is current each time you view it. The developers of your CRM did not have this point of view. Instead, they had the point of view of CRM users who need to look up, edit, and report on information about their customers. The best hope for resolving this type of friction is that the CRM vendor stores data in a standard database and does not try to obscure the table and column names.
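For readers unfamiliar with control charts, the following Python sketch shows the arithmetic behind one common type, the individuals (XmR) chart. The article does not say which chart type would be used, so this choice is an assumption. The limits sit at the mean plus or minus 2.66 times the average moving range, the standard constant for this chart type. The input is imagined to be a series such as daily counts of new CRM contacts.

```python
from statistics import mean


def individuals_chart_limits(values):
    """Return (lower limit, center line, upper limit) for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two points for a moving range")
    center = mean(values)
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def out_of_control(values):
    """List the points that fall outside the control limits."""
    lower, _, upper = individuals_chart_limits(values)
    return [v for v in values if v < lower or v > upper]
```

The calculation is simple, but it is only useful if it can be rerun against current data each time the chart is viewed, which is why direct access to the CRM's underlying tables matters.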

What does all this mean as you make decisions about what applications to use and how to use them?


  1. See data as a resource--one from which business value can be extracted. Develop a culture in your organization that understands the value in data being collected. Encourage an inquisitive view of all data that is being captured.
  2. Become the owner of your data. You might be putting your data into an e-commerce system, or a resource planning system, or a customer relationship system. In any case you must see your organization as the owner of the data. If the applications you are using create friction in gaining access to this data--in other words if they have become the owner of the data--you may want to approach the vendor to resolve this problem or even consider abandoning the vendor in favor of one who allows you to get to your data easily.
  3. When you have large amounts of data flowing into known data sets on a regular basis, keep asking the question “What business value can we extract from that?”
  4. Seek applications with open and easy access to their underlying data.
  5. Seek applications able to analyze data--in place--from standard data sources.

Originally published in the October 2011 edition of Quality eLine, our free monthly newsletter.
