Master Data Mix-Ups: What do you mean by master data?

Originally published 4 November 2009

Have you ever had a conversation where you are talking at cross purposes? I came across an amusing example when a British friend of mine married an American lady.

She: “I’d really like you to wear suspenders on our wedding day.”

He: “What do you mean? Suspenders are for women! Will you be wearing some?”

She: “Of course not. I won’t be wearing any pants.”

He: “What kind of a woman am I marrying?!”

This example illustrates that effective communication between people is only possible if we can be sure we’re talking about the same thing. Because people have different backgrounds and experiences, it’s always possible that the message we intend to send is different from the message the other person receives. In the example, the lady thinks the conversation is about braces and trousers, while the man thinks they are talking about garters and underpants.

The need to agree on the meaning of terms is one of the basic drivers behind master data management initiatives. But what do we actually mean when we use the term “master data”? And more importantly, do we all mean the same thing? It matters because if we have different interpretations, any conversations we have about master data and master data management could be at cross purposes.

Interpretation 1: Master data is data about things, not events.

In this interpretation, data is divided into two broad categories known as master data and event data.

Master data is data describing the individual items about which events, figures and observations can be recorded. Such items include customers, products, locations and periods of time. Master data forms an inventory of the “things” and “concepts” that are of interest.

Meanwhile, event data is data describing events that occur. Event data can include measured figures, textual observations and information to identify the “things” that participate in the event.

The withdrawal of money from a bank account is an example of an event. In this example, the event data to be recorded includes the amount withdrawn, the account number, the location, and the date and time of the withdrawal. The “things” that participate in the event are the account, the location and the date. It is important to note that while the event data includes enough data to identify the “things,” the definitions of those things are described via master data. For instance, the event data in this example includes an account number, but information about the account holder and the account type associated with that number is stored in master data.
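To make the distinction concrete, here is a minimal Python sketch of the withdrawal example. The class names, field names and sample values (such as the account number and location code) are illustrative assumptions, not taken from any real banking system. The point it shows is that the event record carries only identifiers, while the descriptive attributes live in master data.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class Account:
    """Master data: describes a 'thing' (here, an account)."""
    account_number: str
    holder_name: str
    account_type: str


@dataclass(frozen=True)
class Withdrawal:
    """Event data: records what happened and identifies things by key only."""
    account_number: str  # identifies the account; no holder details stored here
    location_code: str   # identifies the location
    occurred_at: datetime
    amount: Decimal


# Master data: an inventory of accounts keyed by account number (made-up values).
accounts = {
    "12345678": Account("12345678", "A. Customer", "current"),
}

# Event data: a single withdrawal (made-up values).
withdrawal = Withdrawal(
    account_number="12345678",
    location_code="ATM-001",
    occurred_at=datetime(2009, 11, 4, 9, 30),
    amount=Decimal("50.00"),
)


def describe(event: Withdrawal, master: dict[str, Account]) -> str:
    """Resolve the event's account number against master data."""
    account = master[event.account_number]
    return f"{account.holder_name} ({account.account_type}) withdrew {event.amount}"


print(describe(withdrawal, accounts))
```

If the account holder's name changes, only the master data record is updated; the withdrawal event itself is left untouched.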

In his excellent article “What is Master Data?”, Malcolm Chisholm identifies six layers of data and defines master data as consisting of three of these layers. Malcolm’s definition is consistent with the definition just presented, and he highlights the fact that the three different types of master data each require their own special treatment.

Other published sources also seem to agree that this interpretation of master data is “correct,” with definitions including the following:

“Master data is the consistent and uniform set of identifiers and extended attributes that describe the core entities of the enterprise.” (Gartner)

“Master data is that persistent, non-transactional data that defines a business entity for which there is, or should be, an agreed upon view across the organisation.” (Wikipedia)

This interpretation of master data is the one that I’ve used throughout my career and there seems to be broad agreement. So why would people think anything different? Actually, it turns out that there are good reasons…

Interpretation 2: Master data is reference data with consensus.

The two broad categories of master data and event data in Interpretation 1 are often known by the alternative names “reference data” and “transactional data.” In other words, data describing things is reference data, and data describing events is transactional data. Within this section, I’ll use those alternative names.

One problem with reference data is that multiple inconsistent versions may exist throughout the business. Data about each particular “thing” may be stored in multiple databases, each of which stores different attributes, uses different identifiers and is maintained by different people. Master data management initiatives set out to solve these problems by providing a single maintenance structure and a single place to make changes. Changes are synchronised between the master data management tool and each consuming system so that the master data in all systems is consistent.
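The synchronisation idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the class names MasterDataHub and ConsumingSystem are hypothetical, changes are pushed immediately to every subscriber, and real master data management tools handle matching, conflicts and failures in ways not shown here.

```python
class ConsumingSystem:
    """A hypothetical downstream system holding its own copy of customer data."""

    def __init__(self, name):
        self.name = name
        self.customers = {}

    def apply_change(self, customer_id, attributes):
        # Store a copy so each system owns its own record.
        self.customers[customer_id] = dict(attributes)


class MasterDataHub:
    """A hypothetical single place where customer records are maintained."""

    def __init__(self):
        self.customers = {}
        self.subscribers = []

    def subscribe(self, system):
        self.subscribers.append(system)
        # Bring a newly subscribed system up to date with existing records.
        for customer_id, attributes in self.customers.items():
            system.apply_change(customer_id, attributes)

    def update_customer(self, customer_id, **changes):
        # Make the change once, centrally...
        record = self.customers.setdefault(customer_id, {})
        record.update(changes)
        # ...then synchronise it to every consuming system.
        for system in self.subscribers:
            system.apply_change(customer_id, record)


hub = MasterDataHub()
billing = ConsumingSystem("billing")
crm = ConsumingSystem("crm")
hub.subscribe(billing)
hub.subscribe(crm)

hub.update_customer("C001", name="Acme Ltd", city="Bath")
hub.update_customer("C001", city="Bristol")

# Both systems now hold the same view of customer C001.
print(billing.customers == crm.customers)
```

Because every change goes through the hub, the billing and CRM copies cannot drift apart in the way that independently maintained reference data can.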

In practice, master data management tools are hardly ever introduced in a single big bang and are rarely used to maintain all reference data. Instead, analysts and developers will begin by tackling a single type of reference data such as customer. As a consequence, at any given time there may be plenty of reference data that is not centrally maintained and for which inconsistent versions exist in different systems.

This situation introduces the need to distinguish between those items of reference data that are subject to master data management and those that are not. To vendors of master data management tools, only reference data that is stored within their tools is worthy of the title master data. In this interpretation, master data can be defined as “reference data with consensus.”

Consider for a minute what term you would choose for the data that is the subject of master data management. In the absence of a better suggestion, I think I’d call it master data too!

MORAL: When discussing master data, agree whether you’re referring to all reference data or just the data within a master data management system.

(Mis)interpretation 3: Master data is data that needs to be highly shared.

An aim of master data management is to ensure that tools and processes are in place to maintain and share common definitions. The sharing of data is an important aspect of master data management. After all, it is important that all the systems within a business have a consistent view of customers, products, locations, etc.

However, it’s important to note that the need to share data is not limited to master data. For instance, any numbers that are published by an organisation in a written document or on a web page can certainly be said to be “highly shared,” but that doesn’t mean that they are master data – instead, they are measures derived from event data.

Similarly, metadata that describes data structures and business rules may need to be highly shared. Again, this doesn’t mean it is master data.

MORAL: Master data typically does need to be shared, but this isn’t the critical factor in determining whether data should be classified as master data.

(Mis)interpretation 4: Master data is a term referring to the master copy of the data.

It is quite common for people to believe that master data refers to the master copy of data (as in “master tape”).

It is easy to see why people reach this conclusion: in everyday usage, “master” often denotes the original from which copies are made. This view can be further reinforced by the fact that discussions about master data management almost inevitably include phrases such as “golden copy of data” and “single version of the truth.”

This interpretation can cause particular confusion. It places emphasis on the place where data is stored rather than on the characteristics of data. Therefore, it can lead people to believe that master data management tools should be used to manage all data, rather than just the “critical nouns” of an organisation.

MORAL: Make sure you dispel the myth that master data refers to the master copy of data.

Conclusion

It is easy to find yourself talking at cross purposes when discussing master data. The people you talk to could have any one of several interpretations in their mind.

To avoid any confusion, it’s important to make sure all parties are using the same interpretation. To summarise, here are some key points to re-emphasise my own interpretation of master data as set out in Interpretation 1:

  • Master data is data describing the individual items about which events, figures and observations can be recorded. Such items include customers, products, locations and periods of time. Master data forms an inventory of the “things” and “concepts” that are of interest.

  • Master data is distinct from event data and metadata.

  • The need to share data doesn’t make it master data.

  • Master data is not simply a term referring to the master copy of data.

Chris Daniels

Chris Daniels is an Information Management Consultant at IPL, a leading UK IT services company specialising in the delivery of intelligent business solutions. He has over 10 years of experience in helping a wide range of high profile clients exploit the full potential of their information.

Chris has significant information management expertise gained across a variety of sectors, including manufacturing, finance, government, and telecoms. Chris has ongoing engagements in business intelligence, undertaking business analysis and the specification of information systems. He can be reached at chris.daniels@ipl.com.
