Case Study: Launching a Corporate Glossary by Bonnie O'Neil - BeyeNETWORK UK

Case Study: Launching a Corporate Glossary

Originally published 1 July 2005

Published in TDAN.com July 2005

Introduction and Project Background

Several events at my current place of business have created the need for an explicit, common corporate glossary. Perhaps the most notable was a migration to a new Commercial Off-the-Shelf (COTS) software system that essentially manages the heart of the business. As you would expect, this system has its own terminology, and this has forced the entire business to learn a new vocabulary. Terms that are important to the business are important enough to justify spending time and resources on tracking them and making sure everyone in the enterprise is "speaking the same language", literally. We have therefore identified the need for a corporate glossary.

A corporate glossary also sets the stage for "the semantic web", intelligent search, and real knowledge management: The establishment of a corporate, shared knowledge base where different groups within the organization can learn from the successes of others.

This article will share a case study in launching a corporate glossary. In the process, we will explore the Lifecycle of a Term and discuss the beginning of a Governance Process for Business Terms.

In the last issue, I wrote about how to create a definition. In this series of articles, I will be exploring our journey in the creation of a knowledge base. The next step after a glossary is to expand it and build it into an encyclopedia of sorts. Along the way, we will be looking at infrastructure, lifecycle, architecture, and governance. We have discovered some exciting things in our adventures, and I look forward to sharing these nuggets of knowledge with you.

Purpose and Function of the Glossary

The purpose of the glossary is to provide anyone with the definition of any term or data element name, no matter where they are or what software they are using.

In order to provide this capability, the glossary must be ubiquitous: available from anywhere in the organization, at any time, and always accessible by everyone. It should be easy to use and require as few keystrokes as possible. It should also be editable not just by the Subject Matter Experts (SMEs) but by anyone in the company.

Our portal is the perfect place for the glossary to reside, because it is available from anywhere in the company. It comes up when you click the Internet Explorer icon in the tray.

"Governance Lite"

If anyone is allowed to create or update an entry, some control is needed to reconcile terms cross-functionally, with other groups in the organization who may have a different usage of a term. This creates the need for governance.

We have all experienced data dictionary initiatives that have been burdened with too much governance. The governance gets in the way of flexibility: the business needs to be able to change a definition when the business itself changes. The result is definitions that may have been correct at one point in time but fall out of sync with the business as things change, because it takes an Act of Congress to change the dictionary. We have therefore created what we call "Governance Lite": a flexible structure that accommodates both the need for governance and the need to stay constantly in sync with the business.

Governance Lite works as follows: Anyone in the organization can create a Business Term entry in the glossary. When a new term is created, it has a state of "Candidate". Anyone can see all candidate terms, and the state will be shown to users of the Glossary.

We have a "Terms Team" whose job it is to rationalize and normalize terms. The Terms Team is electronically notified when a new term is entered into the glossary. The team then checks the term against the terms already in the glossary and makes sure there are no conflicts. The team makes sure the wording of the definition is accurate, and that the definition follows the format mentioned in my previous article. Sometimes this work may require researching reference documents or contacting line of business SMEs directly. When the definition has been successfully researched, its state is changed to "Authorized".

The business is alerted to its use of a Candidate term, since the term is displayed in the glossary along with its state. The business should take steps to clarify the meaning of Candidate terms whenever they are used, since it is possible that a candidate can have more than one meaning.
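The Governance Lite flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our portal implementation: the `Glossary` class, the `notify_terms_team` hook, and the sample term "Policy" are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Term:
    name: str
    definition: str
    state: str = "Candidate"  # every new entry starts life as a Candidate


@dataclass
class Glossary:
    # Hypothetical hook standing in for the electronic notification
    # that the Terms Team receives when a term is entered.
    notify_terms_team: Callable[[Term], None]
    terms: List[Term] = field(default_factory=list)

    def submit(self, name: str, definition: str = "") -> Term:
        """Anyone can create an entry; the Terms Team is told about it."""
        term = Term(name, definition)
        self.terms.append(term)
        self.notify_terms_team(term)
        return term

    def authorize(self, term: Term) -> None:
        """Called by the Terms Team once the definition has been researched."""
        term.state = "Authorized"

    def display(self, term: Term) -> str:
        """The state is always shown alongside the definition."""
        return f"{term.name} [{term.state}]: {term.definition}"


inbox: List[Term] = []  # stands in for the Terms Team's notifications
glossary = Glossary(notify_terms_team=inbox.append)

term = glossary.submit("Policy", "A contract of insurance.")
before = glossary.display(term)
glossary.authorize(term)
after = glossary.display(term)
```

Showing the state in `display` is the point of the sketch: a reader who sees "[Candidate]" knows the meaning has not yet been reconciled across the business.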

"Wiki" Updating

"Wikipedia" is an open source encyclopedia on the internet. What's cool about it is it's the "People's encyclopedia": anyone can update an existing entry or add a new one. In this way, everyone can participate in it and in a real sense, "own" it. On the downside, it can be very chaotic, because it lacks governance. We are trying to strike a balance and enable everyone to feel like they can contribute, and therefore only apply minimal governance. Where it gets interesting is when someone wants to update an existing entry; especially someone else's entry. This is when the governance is really needed.

How We are "Faking" Wiki

Our portal software supports two types of development: custom and straight out of the box. When you use the latter, you can implement things very quickly. The tradeoff is that you cannot do any custom development. Since we wanted to launch our glossary quickly, we chose the straight out of the box software. We then had to get creative in how we implemented it. Since the "plain vanilla" implementation does not offer versioning, we decided that anyone who wants to edit an existing term can click the "edit" button (even on a term submitted by someone else) and edit it. Behind the scenes, the system creates a new term using a database trigger. So if they are editing an Authorized term, the dictionary essentially holds both: the older version, which is the Authorized one, and the proposed new one, which is a Candidate. The "new" one contains the edited version of the definition. So what we are doing is not a true wiki, because we need the entries for both before and after in order to apply governance and resolve conflicts.
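To make the "edit creates a new entry" idea concrete, here is a minimal sketch. Our real implementation relies on a database trigger inside the portal; the example swaps that for a plain Python function, and the `Entry` fields, the `edit` function, and the sample definitions are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from itertools import count
from typing import Dict

_next_id = count(1)  # hypothetical Term ID generator


@dataclass(frozen=True)
class Entry:
    term_id: int
    name: str
    definition: str
    state: str


def edit(store: Dict[int, Entry], term_id: int, new_definition: str) -> Entry:
    """Leave the original row untouched and add the edit as a new Candidate."""
    original = store[term_id]
    proposed = replace(
        original,
        term_id=next(_next_id),
        definition=new_definition,
        state="Candidate",
    )
    store[proposed.term_id] = proposed
    return proposed


store: Dict[int, Entry] = {}
first = Entry(next(_next_id), "Customer", "A party that buys our products.", "Authorized")
store[first.term_id] = first

proposed = edit(store, first.term_id, "A party that buys or leases our products.")
```

After the call, the store holds both rows: the Authorized original and the Candidate proposal. That is exactly the "before and after" pair the Terms Team needs in order to apply governance.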

Resolving Conflict

Suppose a business term has been entered more than once in the glossary, with two different meanings. There are several ways this can be resolved:

  1. Add a modifier to the term to create a new term: there may be a general definition of customer, but Marketing uses the term differently, so you can have a "Marketing Customer."
  2. Use word senses. In semantics, there is a notion of "word sense": a different "sense" of the word. For example, the word "mole" has different meanings, such as:
     • a small furry mammal,
     • a small dark spot on the skin,
     • or a spy.

     In our glossary, we will be using what I call "enumerated definitions", meaning a term will have numbered definitions. We have not decided whether the order of the definitions will run from the most to the least commonly used. I think for now it will be a "first in" type queue: the first definition received will be the first shown. Therefore, every term will have only one "Authorized" entry.
  3. Resolution is needed; see next section.

Replacing and Merging Terms

The Terms Team examines two or more existing definitions for the same term, arising either from separate submissions by different people or from the modification of an existing term. Here are the resolution scenarios:

  • One of the definitions looks better than the others because it is more complete, its language is more precise, and so on. In this case, that definition will replace the others.
  • Elements of all the candidate definitions can be merged together to create a new definition.
  • One definition may be incorrect, or may apply to a different term.

In all of these cases, one definition will replace the others. The replaced term will have a status of "Replaced", and a data element, Replaced By, will contain the Term ID of the new term.
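The Replaced By bookkeeping can be sketched as follows. This is an illustration only: the field names, the `resolve` function, and the sample definitions are assumptions, and the sketch deliberately leaves the surviving entry's own state alone, since authorizing it is a separate Terms Team decision.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    term_id: int
    name: str
    definition: str
    state: str = "Candidate"
    replaced_by: Optional[int] = None  # Term ID of the entry that superseded this one


def resolve(survivor: Entry, superseded: List[Entry]) -> None:
    """Mark every superseded entry as Replaced and point it at the survivor."""
    for entry in superseded:
        entry.state = "Replaced"
        entry.replaced_by = survivor.term_id


# Merge scenario: elements of two submissions are combined into a new entry.
a = Entry(1, "Customer", "Someone who buys from us.")
b = Entry(2, "Customer", "A party with an active account.")
merged = Entry(3, "Customer", "A party with an active account who buys from us.")

resolve(merged, [a, b])
```

Because each replaced entry keeps a pointer to its successor, the history of a definition can still be traced after the conflict is resolved.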

Lifecycle of a Term

The states of a term are:

  1. Candidate
  2. Authorized
  3. Replaced
  4. Retired

Sometimes the business will determine that a term is no longer useful and is no longer part of the common business language. In this case, its status will be Retired. We are still setting policies concerning retired terms. Right now, our intention is that if the Terms Team uncovers a term that should be retired, it will set the status to Retired, and if no one comments on this or edits the definition, the term will cease to be an active part of the glossary and will no longer be displayed to users.
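One way to picture the lifecycle is as a small table of allowed state changes. The four states come from the list above; the particular set of transitions below is an assumption made for illustration, since our policies (especially around Retired terms) are still being set.

```python
from enum import Enum


class State(Enum):
    CANDIDATE = "Candidate"
    AUTHORIZED = "Authorized"
    REPLACED = "Replaced"
    RETIRED = "Retired"


# Assumed transition table: Replaced and Retired are treated as end states.
ALLOWED = {
    State.CANDIDATE: {State.AUTHORIZED, State.REPLACED, State.RETIRED},
    State.AUTHORIZED: {State.REPLACED, State.RETIRED},
    State.REPLACED: set(),
    State.RETIRED: set(),
}


def transition(current: State, target: State) -> State:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"{current.value} -> {target.value} is not allowed")
    return target


state = transition(State.CANDIDATE, State.AUTHORIZED)
state = transition(state, State.RETIRED)
```

Keeping the rules in one table makes it easy to adjust them as the policies firm up, without touching the code that applies them.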

Using the Glossary: How Search Works

The first step in a glossary search is accessing the main portal page by launching the browser. There are two portlets for the glossary, and both are shown on the main portal page: one for search and one for submit. The search portlet has only one entry area, for the term name, along with the search button. The results window displays one definition, with an arrow indicating that more rows were found, if any exist. The results are sorted by State, so if there is an Authorized definition it will show first. The results are sorted secondarily by date entered, so if there are multiple candidates, the most recent will show first.
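The result ordering described above (Authorized first, then most recent first) can be sketched like this. The `Entry` fields, the `search` function, the case-insensitive match, and the sample data are all assumptions for illustration; the portal's own search is not shown here.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Entry:
    name: str
    definition: str
    state: str
    entered: date


def search(entries: List[Entry], name: str) -> List[Entry]:
    """Return matching entries: Authorized first, then newest first."""
    # Case-insensitive exact match on the term name (an assumption).
    matches = [e for e in entries if e.name.lower() == name.lower()]
    # Python's sort is stable, so sort on the secondary key first
    # (date entered, newest first) and then on the primary key (state).
    matches.sort(key=lambda e: e.entered, reverse=True)
    matches.sort(key=lambda e: e.state != "Authorized")
    return matches


entries = [
    Entry("Premium", "older candidate", "Candidate", date(2005, 3, 1)),
    Entry("Premium", "authorized", "Authorized", date(2005, 1, 15)),
    Entry("Premium", "newer candidate", "Candidate", date(2005, 5, 20)),
    Entry("Claim", "unrelated term", "Authorized", date(2005, 2, 2)),
]

results = search(entries, "premium")
first_shown = results[0]          # the one definition the results window displays
more_rows_found = len(results) > 1  # drives the "more rows" arrow
```

Here the Authorized definition is shown first even though it is the oldest entry, and the two candidates follow with the newer one ahead of the older one.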

The submit portlet allows the user to enter a term name and a definition if desired. We want users to be able to submit a term even if they don't know the definition.


One of the most immediate goals of our project is getting the business to be familiar with the glossary and to use it. We are in the process of launching a publicity campaign to generate excitement and awareness of the glossary. The publicity campaign includes a contest: Every time someone enters a term, their name will be entered into the drawing to win a gift certificate for company merchandise branded with its logo and trademarks, like polo shirts, tote bags, etc. There will be an article on the corporate portal, as well as an article in the e-magazine and posters in each regional office announcing the contest.

The purpose of the contest is twofold:

  • To collect as many terms from actual business people as we can, and
  • To get people familiar with the glossary so they can start using it in their daily work.

The Future of the Glossary: Semantic Web and Encyclopedia

The Terms Team, in the background, will quietly be collecting synonyms. Eventually we want to display synonyms to the user when the definition is shown. This will require customization so it will be implemented in a later release. It is our goal at this time to have the display limited to the term name, the definition and its status. Synonyms and related terms lead to ontological information: terms that can be related and can therefore supercharge searching.

The stage is being set for semantics to empower search. Although the tools are not here yet, it helps to be aware of the emerging standards such as RDF and OWL which will enable related information to be tagged for easy navigation. We see that our dictionary work is preparing us for semantically enabled search capability when it arrives. At the same time, another group is gearing up for the creation of an in-house taxonomy for corporate wide searches. Our glossary will help define the base terms used in this taxonomy.

Our next evolutionary step for our "wiki-glossary" will be a corporate "wikipedia", like its namesake on the web. We want to enable business people to capture the largely unstructured data that they uncover every day that helps them perform their jobs better. We want to create a way for people who discover knowledge that helps them to enter it in the wikipedia as an entry (sort of like a business term entry but more detailed) so that others can search on it. We already have a document library, but it contains only formal documentation on business processes. We would like to capture all the informal knowledge embedded in "carbon-based life forms" so others can benefit. Everyone knows that most business rules live informally, with the large majority in people's heads. If we can create a very simple tool that makes it easy to capture this informal bed of knowledge, then the corporation will benefit greatly. Our next step, along with this, is the creation of a "wiki-ontology": a classification of corporate knowledge that can be grown from the business people themselves.

Generating Business Value

Our glossary is making the business community aware of the meanings of the words they use. We see confusion every day in word usage which often translates into misunderstood business metrics, which can have a drastic impact on decision-making. It is our hope that we can clarify terms so systems and metrics can be better understood, which will eventually have an impact on the bottom line.

  • Bonnie O'Neil

    Bonnie is President of Westridge Consulting, and is an internationally recognized expert on data warehousing and business rules. She is a regular speaker at the Meta Data/DAMA Conference, Software Development, Database World, Guide, and the Business Rules Forum; she was the keynote speaker at an international conference on Data Quality in South Africa. She is a founding member of the Guide Business Rules Group (a standards group for business rules) and also the ODTUG Business Rules Summit. She has been involved in data warehousing projects in both Fortune 500 companies and government agencies, and her expertise includes specialized skills such as data quality, profiling, data integration and migration. She is the author of two database books including Oracle Data Warehousing Unleashed, as well as over 40 articles and technical white papers. She is a Certified CIF Architect by Bill Inmon, the father of data warehousing.


