The Demise of the Data Integration Toolbox and the Rise of the Data Management Platform

Originally published 21 March 2011

Today’s organizations have incredible amounts of data to be managed, and in many cases it is quickly spiraling out of control. Increased channels and routes to market; business globalization; expansion from traditional data repositories such as databases to unstructured data such as e-mails, blogs and networking sites; and a sharper focus on regulatory compliance have all contributed to this exponential increase of data that is captured, processed, analyzed and archived.

To address the issues around managing, governing and utilizing data, organizations have acquired quite a toolbox of data integration (DI) tools and technologies over the past decade. A core driver for these DI tools and technologies, and their subsequent assembly in a “toolbox,” has been the ever-evolving world of the data warehouse.

A Look Inside the Toolbox

A current DI toolbox to support data warehouses typically contains software for the following tasks:
  • ETL to support core processes of extraction, transformation and loading typically associated with data warehousing.

  • Data quality that supports the process of cleansing the data so that it is “fit for purpose.”

  • Data profiling to support the examination of data and metadata in a repository to collect statistics and information about that data.

  • Data federation to dynamically aggregate data from multiple sources into a single virtual view and expose that data through SQL queries or a service.

  • Data exploration for identifying where data resides within the organization, followed by the categorization and documentation of that data.

  • Metadata management capabilities to store, search, report on, link to, categorize and govern information about data.

  • Master data management (MDM) for: defining and maintaining consistent definitions of reference or master data; storing and sharing that data enterprisewide across IT systems and groups; and ensuring master data files remain in a standardized, controlled format as they are accessed and updated.
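
To make one of the tasks above concrete, here is a minimal sketch of data profiling: it collects row, null and distinct-value counts for a single column. The function name `profile_column` and the sample `customers` records are hypothetical and used only for illustration; a real profiling tool would read from a repository and gather far more statistics.

```python
from collections import Counter


def profile_column(rows, column):
    """Collect basic statistics about one column in a list of dict records."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing values (an assumption of this sketch).
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical sample records standing in for a source system.
customers = [
    {"id": 1, "country": "UK"},
    {"id": 2, "country": "UK"},
    {"id": 3, "country": ""},
    {"id": 4, "country": "DE"},
]

profile = profile_column(customers, "country")
print(profile)
```

A report like this is what a business user would examine to see where source data falls short of a requirement such as "every customer has a country."
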

Organizations have typically assembled this toolbox over time, often acquiring tools and technologies from different vendors. As a result, most now run a multitude of tools on diverse infrastructure. The toolbox is generally poorly integrated, and the differing interfaces stifle collaboration and the consistent application of processes and standards. As projects become increasingly complex, they require many technologies from the toolbox to be deployed together, and a significant proportion of overall project time is spent resolving technology integration issues. That time could be better applied to the DI requirements that support the business.

Sounding the Death Knell for DI Toolboxes

Data-related projects continue to grow in importance, and organizations are shifting from a focus on integrating data mainly for decision support to how they can better manage organizational data. This evolving area is known as data management because it refers to the management and governance of all data in the organization. As the area of data management has emerged, new projects have been identified faster than organizations can find staff to support them.

Organizations need tools and technologies that address new requirements and enable employees to focus on the job at hand instead of spending time constantly integrating disparate technologies in the toolbox. The need for a single, integrated data management platform that can address all aspects of DI, data quality and master data management could be sounding the death knell for the data integration toolbox. These key areas will be underpinned by adapters and a federation capability, and will share technical and business metadata that aids in collaboration. Ultimately, a single user interface should surface all the platform capabilities rather than a disparate set of user interfaces.

Having a single platform and a single interface provides the additional benefit of a consistent methodology spanning the different aspects of the data management life cycle, an approach that proves difficult or impossible with a toolbox. With a single platform for all data management initiatives and a consistent interface, a methodology can be incorporated into the platform to guide the business analyst or developer through the required project phases and tasks. This approach not only reduces the learning curve, but also reduces risk and accelerates project delivery.

The Value of a Unitary Platform

To be successful, the data management platform will need to be based on a single, shared architecture. Depending on their role, platform users launch modules within a consistent user interface that provide data exploration, profiling, business rule management, ETL/ELT and many other capabilities. A business user might be defining business rules or examining a profile report to identify where data in a source system doesn’t comply with defined business requirements. A technical user will access the same data management platform, launch the data integration module, see any notes and comments, and utilize the business rules defined by the business user. On top of a single, shared architecture, anything captured or developed by one user is available to be consumed by another, enhancing collaborative efforts.
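
The sharing described above can be sketched in a few lines: a business user publishes a rule, with notes, to a shared repository, and a technical user's integration step looks up that same rule instead of re-implementing it. This is only an illustrative sketch; `SharedRepository`, `BusinessRule`, `load_valid` and the sample records are hypothetical names, not the API of any particular platform.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class BusinessRule:
    """A named rule plus the notes a business user attached to it."""
    name: str
    check: Callable[[dict], bool]
    notes: List[str] = field(default_factory=list)


class SharedRepository:
    """Hypothetical shared store that every platform module reads and writes."""

    def __init__(self) -> None:
        self._rules: Dict[str, BusinessRule] = {}

    def publish(self, rule: BusinessRule) -> None:
        self._rules[rule.name] = rule

    def get(self, name: str) -> BusinessRule:
        return self._rules[name]


repo = SharedRepository()

# Business user: defines the rule once and records a comment for colleagues.
repo.publish(BusinessRule(
    name="country_required",
    check=lambda record: bool(record.get("country")),
    notes=["Source records sometimes leave country blank."],
))


# Technical user: the integration step consumes the shared rule by name.
def load_valid(rows: List[dict], repository: SharedRepository,
               rule_name: str) -> Tuple[List[dict], List[dict]]:
    rule = repository.get(rule_name)
    accepted = [r for r in rows if rule.check(r)]
    rejected = [r for r in rows if not rule.check(r)]
    return accepted, rejected


accepted, rejected = load_valid(
    [{"id": 1, "country": "UK"}, {"id": 2, "country": ""}],
    repo,
    "country_required",
)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Because both users work against one repository, a change to the rule is picked up by the integration step without any copying between tools.
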

Once organizations realize and experience the value of a single data management platform, it is easy to conclude that the DI toolbox is no longer viable.



  • David Barkaway
    David has more than 15 years of experience in the software industry, working mainly for software vendors. In the last 10 years David has focused specifically on data management technologies, working for organizations such as Evolutionary Technologies, Business Objects’ Enterprise Information Management Division, BEA’s European Product Specialist’s Group and GoldenGate. He has gained significant practical and market experience in the data integration and data warehousing world and more recently in emerging technologies such as data services, data virtualisation, SOA architectures, transactional data management and master data management (MDM).

    As a member of the SAS EMEA Technology Practice, David is responsible for SAS data management technologies. He spends the majority of his time discussing data management with customers and prospects and advising them on the successful application of data integration, data quality and MDM technologies. He can be reached at david.barkaway@sas.com.


 
