Data Modelling is Not Just for DBMSs, Part 1

Originally published 1 July 2009

“It’s data modelling Jim, but not as we know it”

Could these be the words of chief (data) engineer Scotty on the Starship Enterprise? Enterprise data modelling that is!

The problem is that “data modelling” has, in too many companies, received a lot of bad press. Have you heard any of the following statements?

“It gets in the way.”

“It takes too much time.”

“What’s the point of it?”

“It’s not relevant to today’s systems landscape.”

“I don’t need to do modelling, the package has it all.”

And these are just the polite comments!

Yet when data modelling first came onto the radar in the 1970s, the potential was enormous. We’d get benefits such as:
  • A single consistent definition of data

  • Master data records of reference

  • Reduced development time

  • Improved data quality

  • Impact analysis

Would corporations want to realise these benefits? You bet – it’s a no-brainer.

So then, why is it that now, 30+ years later, we see in many organisations that the benefits of data modelling still need to be “sold” and in others the big benefits simply fail to be delivered? What’s happened? What needs to change?

As with most things, a look back into the past is a good place to start.

Background and History

Looking back into the history of data management, we see a number of key eras.

1950 – 1970: IT was starting to enter the world of commerce, and later in this period we saw the introduction of the first database management systems, such as IMS, IDMS and TOTAL. Who can remember the DBMS that could be implemented entirely on tapes? (It was IMS HISAM if you really want to know.) The cost of disc storage at the time was exceptionally high. The concept of “database” operations came into being, and the early mentions of “corporate central databases” appeared.

1970 – 1990: Data was “discovered.” Early mentions of managing data “as an asset” were seen and the concepts of data requirements analysis and data modelling were introduced.

1990 – 2000: The “enterprise” became flavour of the decade. We saw enterprise data management coordination, enterprise data integration, enterprise data stewardship and enterprise data use. An important change began to happen in this period, in that there was a dawning realisation that “technology” wasn’t the answer to many of the data issues. Data governance started to be talked about seriously.

2000 and beyond: Data quality, data modelling as a service, security and compliance, service-oriented architecture (SOA), governance (still) and alignment with the business were/are the data management challenges of this period.

And all of this has to be undertaken in these rapidly changing times when we have a “new” view of Information: Web 2.0, blogs, mashups – anyone can create data! At the same time, we have a greater dependence on “packaged” or COTS applications such as the major ERPs. Also, there’s more and more use of SOA, XML, business intelligence and less traditional “bespoke” development.

Notice I snuck in “mashups” (or web application hybrids) there? There are many powerful facilities available now that enable you to create your own mashups. Make no mistake, these are now becoming the “cottage industry” IT applications of this decade. Remember the homegrown departmental Excel macros of the ‘90s and onward that became “critical” to parts of the business? Well, mashups are doing the same thing now. But just who is looking at the data definitions, standards, applicability, etc.? Certainly not the data management group – because they don’t know these things are being built in departmental silos, and anyway the “data team” is pigeonholed as being only involved in DBMS development.

So that leads us on to examine the belief that many people have (too many unfortunately) that data modelling is only for DBMS development. Why is that?

Modelling for DBMS Development

In its early days, data modelling WAS primarily aimed at DBMS development. We’ll have a look at the two main techniques in a moment.

Just to illustrate, we can look at 4 typical roles:

The enterprise data customer: This might be someone at director or CxO level. The accuracy of data is critical to them; they are users of reports, and the data “products” we as data professionals produce are key to serving the needs of this level of user.

The data architect: This person knows the business and its rules. He/she manages data knowledge and defines the conceptual direction and requirements for the capture of data.

The DBA: This person is production oriented and manages data storage and the performance of databases. He/she also plans and manages data movement strategies and plays a major part in data architecture by working with architects to help optimise and implement their designs in databases.

The developer DBA: This role works closely with the development teams and is focused on DBMS development. They frequently move and transform data, often writing scripts and ETL to accomplish this.

Data models (more accurately the metadata) were (and are) seen as the glue or the lingua franca for integrating IT roles through the DBMS development lifecycle. All of the roles depend on metadata from at least one of the other roles.

What then are the steps for developing DBMSs using models? This could be the subject of a huge paper, but I’ll try to summarise it simply here:

There are two “main” approaches to creating DBMSs from models: One is the “top down” or “to-be” approach, and the other is termed the “bottom-up” or “as-is” approach.

Top Down (to-be) Approach

Step 1: Document the business requirement and agree on high-level scope. The output is typically some form of Business Requirements Document (BRD).

Step 2: Create a more detailed business requirements document with subscriber data requirements, business processes and business rules.

Step 3: Understand and document the business keys, attributes and definitions from business subject matter experts. From this, create and continually refine a logical data model.

Step 4: Verify the logical data model with the stakeholders. Walk a number of major use cases through the model. Then apply technical design rules and known volumetric and performance criteria, and create a first-cut physical data model.

Step 5: Refine the physical design with DBA support and implement the DBMS using the refined physical model.

This approach has the great advantage that the “new” or “to-be” business and data requirements are foremost. However, it doesn’t take account of any existing systems.
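To make Step 4 more concrete, here is a minimal, hypothetical sketch of “applying technical design rules” to turn a logical entity into a first-cut physical table. The entity, the snake_case naming rule and the type mapping are all illustrative assumptions, not a real method or standard; in practice these rules come from your organisation’s design standards and your target DBMS.

```python
# Hypothetical logical entity: business name, attributes as
# (name, logical type) pairs, and the business key.
logical_entity = {
    "name": "Customer Account",
    "attributes": [
        ("Account Number", "identifier"),
        ("Account Name", "text"),
        ("Opened Date", "date"),
        ("Credit Limit", "money"),
    ],
    "business_key": ["Account Number"],
}

# Illustrative design rules: snake_case physical names and a
# simple logical-to-physical type mapping.
TYPE_MAP = {
    "identifier": "VARCHAR(20)",
    "text": "VARCHAR(255)",
    "date": "DATE",
    "money": "DECIMAL(12,2)",
}

def to_physical_name(name: str) -> str:
    """Apply the (assumed) naming rule: lower case, underscores."""
    return name.lower().replace(" ", "_")

def first_cut_ddl(entity: dict) -> str:
    """Generate a first-cut CREATE TABLE from the logical entity."""
    table = to_physical_name(entity["name"])
    cols = [
        f"    {to_physical_name(n)} {TYPE_MAP[t]}"
        for n, t in entity["attributes"]
    ]
    pk = ", ".join(to_physical_name(k) for k in entity["business_key"])
    cols.append(f"    PRIMARY KEY ({pk})")
    return f"CREATE TABLE {table} (\n" + ",\n".join(cols) + "\n);"

print(first_cut_ddl(logical_entity))
```

The first cut is only a starting point – Step 5’s refinement with the DBA (indexes, denormalisation, partitioning) is where the real physical design work happens.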

Bottom-Up (as-is) Approach

The primary purpose of the bottom-up or as-is approach is to create a model of an existing system into which the new requirements can be added. Frequently, the bottom-up approach is used because a model of the as-is system simply doesn’t exist – often because it’s evolved and/or the design staff has moved on.

The steps in this approach are:

Step 1: Reverse engineer the database schema from the system that is already implemented. From this, you will have the database catalog, table, column, index names, etc. Of course, these will all be in “IT” language without any business definitions.

Step 2: Profile the real data by browsing and analyzing the data from the tables. Scan through the ETLs to find any hidden relationships and constraints.

Step 3: Find foreign key relationships between tables from IT subject matter experts and verify the findings. The typical output here is a refined physical model.

Step 4: Document the meanings of columns and tables from IT subject matter experts.

Step 5: Try to understand the business meanings of probable attributes and entities that may be candidates for a logical data model. From here, the result is a “near logical” model.
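Steps 1 to 3 above can be sketched in a few lines of code. The example below uses Python’s built-in sqlite3 module purely for illustration – on a production DBMS you would query its catalog (for example, the information schema) instead – and the table, column names and data are made up. Note how the profiling flags every unique, non-null column as a candidate key; that is exactly why Step 3 verifies the findings with subject matter experts.

```python
import sqlite3

# A made-up "as-is" system: cryptic IT names, no business definitions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUST (CST_ID INTEGER, CST_NM TEXT, RGN_CD TEXT);
    INSERT INTO CUST VALUES (1, 'Acme', 'EU'), (2, 'Globex', NULL),
                            (3, 'Initech', 'EU');
""")

# Step 1: reverse engineer the schema from the catalog. All we get
# back are "IT" language names and types.
schema = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(CUST)")}

# Step 2: profile the real data. Distinct counts and null counts hint
# at candidate keys (unique, never null) and optional columns.
total = conn.execute("SELECT COUNT(*) FROM CUST").fetchone()[0]
profile = {}
for col in schema:
    distinct, nulls = conn.execute(
        f"SELECT COUNT(DISTINCT {col}), SUM({col} IS NULL) FROM CUST"
    ).fetchone()
    profile[col] = {
        "distinct": distinct,
        "nulls": nulls,
        "candidate_key": distinct == total and nulls == 0,
    }

print(schema)
print(profile)
```

On this toy data both CST_ID and CST_NM look like candidate keys – a reminder that profiling alone produces hypotheses, which the IT subject matter experts then confirm or reject in Steps 3 and 4.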

A third way is a hybrid of these two approaches, frequently called the “middle out” approach.

All of the uses of models described so far, the history of data modelling and, if we’re honest, until very recently much of the literature from the data modelling tool vendors leave us with the assumption that data modelling is just for DBMSs.


What Needs to Change?

The uses and benefits of data modelling are considerably greater than the “one-trick pony” that the current press would suggest. To make data modelling relevant for today’s IT landscape, we must show that it is relevant for the “new” technologies such as:
  • ERP packages

  • SOA and XML

  • Business intelligence

  • Data lineage

We also need to break away from the “you must read my detailed data model” mentality and make the information available in a format users can readily understand. This, for example, means that data architects need to recognize the different motivations of their users and repurpose for the audience: Don’t show a business user a data model!

Information should be updated instantaneously, and we must make it easy for users to give feedback – after all, you’ll achieve common definitions more quickly that way.

We need to recognize the real-world commercial climate that we’re working in and break away from arcane academic arguments about notations, methodologies and the like. If we want to have data modelling play a real part in our business, then it’s up to us to demonstrate and communicate the real benefits that are realized. Remember, data modelling isn’t a belief system – just because you “get it,” don’t assume that the next person does.

So, How Can We Accomplish This?

In Part 2, we’ll be discussing:

  • Modelling for the “new” technologies

  • Demonstrating benefits

  • The greatest change required

  • What needs to stay the same?
Chris Bradley

    Christopher Bradley has spent almost 30 years in the data management field, working for several blue-chip organisations in data management strategy, master data management, metadata management, data warehouse and business intelligence implementations.  His career includes Volvo as lead database architect, Thorn EMI as Head of Data Management, Reader's Digest Inc as European CIO, and Coopers and Lybrand’s Management Consultancy where he established and ran the International Data Management specialist practice. During this time, he worked upon and led many major international assignments including data management strategies, data warehouse implementations and establishment of data governance structures and the largest data management strategy undertaken in Europe. 

    Currently, Chris heads the Business Consultancy practice at IPL, a UK based consultancy and has been working for several years with many clients including a British HQ’d super major energy company.  Within their Enterprise Architecture group, he has established data modelling as a service and has been developing a group-wide data management strategy to ensure that common business practices and use of master data and models are promoted throughout the group.  These have involved establishing a data management framework, evangelising the message to management worldwide, developing governance and new business processes for data management and developing and delivering training. He is also advising other commercial and public sector clients on information asset management.

    Chris is a member of the Meta Data Professionals Organisation (MPO) and DAMA, and has achieved Certified Data Management Professional Master status (CDMP Master). He has recently co-authored a book, Data Modelling For The Business – A Handbook for Aligning the Business with IT Using High-Level Data Models. You can reach him at
