Introduction to Master Data Management by Andy Hayler - BeyeNETWORK UK



Introduction to Master Data Management

Originally published 2 September 2009

What is master data, and why should you care? Essentially, a business transaction, such as you going to a store and buying three bars of chocolate for $2 each, results in two different kinds of data. One is the data associated directly with the transaction: the number of bars of chocolate (three) and the amount you paid ($6). The transaction happened at a certain date and time, and these are unchanging facts. But the brand of chocolate that you bought, its list price, and even the store that you bought it from may appear fixed but are actually more fluid. For example, the marketing department may currently classify the chocolate that you bought as a luxury brand (but one day may change this classification), and the store you bought the chocolate from is currently in a certain sales region, which may itself change with a company reorganisation. Perhaps the price you paid was the normal list price, or maybe the bars were on promotion (which will change in time). All these other elements provide the context of the business transaction; and while fixed at any moment in time, they typically do change, if infrequently. Information such as the product, the customer (you, in this case), the location bought from, and so on are all examples of master data. Those familiar with data warehousing may have heard the term “slowly changing dimension” used to describe this kind of data.
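The split between the fixed facts of a transaction and its more fluid master-data context can be sketched in a few lines of code. This is a toy illustration only — the field names and the chocolate example are made up for this article, and the versioning scheme shown is the classic "Type 2" slowly changing dimension pattern from data warehousing, where each change to the master data creates a new row with a validity period rather than overwriting the old one.

```python
from dataclasses import dataclass
from datetime import date, datetime

# Master data: the product's context, which changes slowly over time.
# Each change creates a new version row with a validity period
# (the "Type 2 slowly changing dimension" pattern).
@dataclass
class ProductVersion:
    product_key: int      # surrogate key identifying this version
    product_code: str     # natural key, stable across versions
    brand_class: str      # e.g. "luxury" -- marketing may reclassify
    list_price: float     # may change with promotions or repricing
    valid_from: date
    valid_to: date        # the open-ended current version uses date.max

# Transaction data: unchanging facts about one event.
@dataclass
class Sale:
    product_key: int      # points at the product version in force
    quantity: int         # three bars
    amount_paid: float    # $6 -- a fact, fixed forever
    sold_at: datetime

history = [
    ProductVersion(1, "CHOC-01", "standard", 1.80, date(2008, 1, 1), date(2009, 5, 31)),
    ProductVersion(2, "CHOC-01", "luxury",   2.00, date(2009, 6, 1), date.max),
]

sale = Sale(product_key=2, quantity=3, amount_paid=6.00,
            sold_at=datetime(2009, 9, 2, 14, 30))

# The sale itself never changes, but "how was this product classified
# at the time of sale?" is answered through the versioned master data.
version = next(v for v in history if v.product_key == sale.product_key)
print(version.brand_class)  # the classification in force for this sale
```

Note the design choice: when marketing later reclassifies the brand, a third `ProductVersion` row is added and old sales still point at the version that was current when they happened, so historical reports stay correct.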

The reason for making this distinction between transaction data (volume sold, proceeds) and master data (customer, product, location, etc.) is that the way in which companies treat master data gives rise to some far-reaching problems. In a large organisation, it is natural that different departments have different perspectives on things. The marketing department cares a lot about the brand of the chocolate, its price, whether it was on a promotion etc., whereas the supply and distribution department cares about how many bars can be put on a pallet and how much they weigh, while production cares a great deal about the recipe of the chocolate bar and what components are needed for it. These different perspectives typically mean that many different computer systems have sprung up to support them. Human nature being what it is, not all of these are necessarily consistent. Hence, the product codes used in marketing may not necessarily be those used in the packaged application used in supply and distribution, or in finance. Even if there is an attempt to standardise on such codes and classifications, these things change over time, and so when marketing reorganise their product hierarchy, does this change immediately and automatically get reflected in all other company systems? The answer is, in most cases, no. My own company’s research in 2008 showed that in one survey of large companies, a median of no fewer than nine different systems were generating product master data, and six were generating customer data (and those were just the medians – some companies had literally hundreds of systems).

This inconsistency in master data leads to insidious problems. Customer addresses change and are updated in one system but not another, so marketing material, shipments and invoices go astray. Because the classifications and definitions differ, no one can agree on the summarised numbers (one company we are familiar with had 23 different definitions of “profit margin” in use in just one of their country subsidiaries). This matters when it comes to deciding where to allocate company resources and capital. One company we worked with had decided strategically to focus on the top 25% of its existing brands, and divest the rest. This was a big decision, but one that was complicated when they discovered that no one could agree on which the best performing 25% of brands were. Different business lines and countries produced different lists (e.g., in some countries corporate overhead costs were allocated down to individual brands and transactions, in others not), meaning that the relative profitability of the brands shifted from one list to the next.

Anyone who has worked in a large company will recognise these or similar issues. There have been many attempts to address this over the years. For example, data warehousing tried to bring all this disparate data together and resolve the inconsistencies, but usually struggled to keep up with the pace of business change. ERP promised to solve the problem by replacing all the transaction systems with one giant application, but in practice most companies ended up with many separate implementations of their ERP system (or indeed competing ones), and this only covered part of the functionality a company needed (a large multinational will have many hundreds of applications, even after it has implemented “wall to wall” ERP). Moreover, even if by some miracle the data definitions could be made perfectly consistent by waving a magic wand, you still have the issue of the quality of the actual data (in most companies: don’t ask). What happens when a company takes over another and tries to merge its shiny data definitions with those of the acquired company? If this process takes months or years, which it usually does, there goes your consistency again.

Master data management, then, is not a new thing. The difference is that there is an acknowledgement that technology alone cannot solve the problem – that the business needs to take ownership of its data, and put in place the structures and processes to resolve competing definitions of data across its life cycle (this is called “data governance”).

Technology has sprung up to support this goal in a variety of guises. Such products typically support data governance and workflow, provide a hub in which master data can be stored and managed, offer tools to assist with the cleansing and standardisation of the data, and include mechanisms to publish the shiny new “golden copy” data out to other systems and consumers.
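The core of what such a hub does — standardise incoming records, spot duplicates, and pick a surviving “golden” record — can be sketched crudely. This is a minimal illustration under invented names (`standardise`, `golden_copy`, the sample records are all made up); real MDM products use far richer matching rules, probabilistic scoring, and configurable survivorship policies.

```python
import re
from collections import defaultdict

def standardise(name: str) -> str:
    """Crude standardisation: lower-case, strip punctuation,
    collapse whitespace. Real MDM tools go much further."""
    name = re.sub(r"[^\w\s]", "", name.lower())
    return re.sub(r"\s+", " ", name).strip()

def golden_copy(records: list[dict]) -> list[dict]:
    """Group records whose standardised names match, then keep the
    most recently updated one as the survivor ("golden") record."""
    groups = defaultdict(list)
    for rec in records:
        groups[standardise(rec["name"])].append(rec)
    survivors = []
    for key, recs in groups.items():
        best = max(recs, key=lambda r: r["updated"])  # survivorship rule
        survivors.append({**best, "match_key": key})
    return survivors

# The same customer held inconsistently by two source systems,
# plus one unrelated customer from a third.
records = [
    {"name": "ACME Ltd.",  "city": "London", "updated": "2009-01-10"},
    {"name": "acme  ltd",  "city": "Londn",  "updated": "2008-07-02"},
    {"name": "Bolt & Co.", "city": "Leeds",  "updated": "2009-03-01"},
]
merged = golden_copy(records)
print(len(merged))  # the two ACME rows collapse into one golden record
```

The golden records produced this way would then be published back out to the source systems — which is exactly the step that requires the data governance described above, since someone has to own the rule that decided which ACME record survived.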

Master data management is the combination of organisation, processes and technology in support of the goal of managing master data on a consistent basis across the organisation. In future articles, we will explore the many issues associated with this, what works and what does not work, how the technology fits, and best practices in terms of how to implement such a program of work.


  • Andy Hayler

    Andy Hayler is one of the world’s foremost experts on master data management. Andy started his career with Esso as a database administrator and, among other things, invented a “decompiler” for ADF, enabling a dramatic improvement in support efforts in this area. He became the youngest ever IT manager for Esso Exploration before moving to Shell. As Technology Planning Manager of Shell UK he conducted strategy studies that resulted in significant savings for the company. Andy then became Principal Technology Consultant for Shell International, engaging in significant software evaluation and procurement projects at the enterprise level. He then set up a global information management consultancy business which he grew from scratch to 300 staff. Andy was architect of a global master data and data warehouse project for Shell downstream which attained USD 140M of annual business benefits.

    Andy founded Kalido, which under his leadership was the fastest growing business intelligence vendor in the world in 2001.  Andy was the only European named in Red Herring’s “Top 10 Innovators of 2002”.  Kalido was a pioneer in modern data warehousing and master data management.

    He is now founder and CEO of The Information Difference, a boutique analyst and market research firm, advising corporations, venture capital firms and software companies.   He is a regular keynote speaker at international conferences on master data management, data governance and data quality. He is also a respected restaurant critic and author (www.andyhayler.com).  Andy has an award-winning blog www.andyonsoftware.com.  He can be contacted at Andy.hayler@informationdifference.com.



Posted 4 January 2010 by Anonymous

This is a poorly written and hard to follow article. Here are two examples: "One is the data associated directly with the transaction [...] the *price* you paid [...] are unchanging facts. But [...] the *price* you paid [...] may appear fixed but are actually more fluid." So, is the price "unchanging" or "more fluid"? Probably in the first mention, "price" should be replaced with "amount". "[...] no less than nine different systems, on *average*, were generating product master data [...] (these were just the *median* values some companies had literally hundreds of systems)." So, are these numbers average or median values? Average value and median value are two different things... I created an account on this website just to post this comment. I do not recommend reading this article, it is a waste of time.
