How Data Virtualization Helps Data Integration Strategies

Originally published 6 December 2011

Many different approaches are now available for data integration, yet far and away the most popular approach is still extract, transform and load (ETL).

However, the pace of business change and the requirement for agility demand that organizations support multiple styles of data integration. Three major styles stand out, and the differences among them are described below.

  1. Physical Movement and Consolidation

    Probably the most commonly used approach is physical data movement. This is used to replicate data from one database to another. There are two major genres of physical data movement: extract, transform and load (ETL) and change data capture (CDC). ETL is typically run according to a schedule and is used for bulk data movement, usually in batch. CDC is event driven and delivers real-time incremental replication. Example products in these areas are Informatica (ETL) and Oracle GoldenGate (CDC).


  2. Message-Based Synchronization and Propagation

    Whilst ETL and CDC are database-to-database integration approaches, the next approach, message-based synchronization and data propagation, is used for application-to-application integration. Once again there are two main genres: enterprise application integration (EAI) and enterprise service bus (ESB). Both are used primarily for event-driven business process automation. A leading product example in this area is the ESB from TIBCO.

  3. Abstraction / Virtual Consolidation (aka Federation)

    The third major style of integration is data virtualization (DV). Here the data source (usually a database) and the target or consuming application (usually a business application) are isolated from each other. The information is delivered on demand to the business application when the user needs it. The consuming business application can consume the data as a database table, a star schema, an XML message or in many other forms, so the form of the underlying source data is hidden from the consuming application. The rationale for data virtualization within an overall data integration strategy is to overcome complexity, increase agility and reduce cost. A leading product example in this area is Composite Software.
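To make the contrast among the three styles concrete, here is a minimal Python sketch. It is an illustration only and does not reflect how any of the products named above work. Every name in it (`source_orders`, `etl_batch`, `apply_change`, `MessageBus`, `virtual_orders`, the `order.created` topic and the sample rows) is hypothetical, and plain in-memory lists and dictionaries stand in for real databases and applications.

```python
from typing import Callable

# Hypothetical operational source: a list of rows standing in for a database table.
source_orders = [
    {"id": 1, "customer": "acme", "amount_pence": 1250},
    {"id": 2, "customer": "globex", "amount_pence": 990},
]


def to_consumer_shape(row: dict) -> dict:
    """Transformation shared by the sketches: tidy names, pence to pounds."""
    return {
        "id": row["id"],
        "customer": row["customer"].upper(),
        "amount": row["amount_pence"] / 100,
    }


# Style 1a: ETL-like batch movement. The caller stores the result as a second copy.
def etl_batch(source: list[dict]) -> list[dict]:
    return [to_consumer_shape(row) for row in source]


# Style 1b: CDC-like replication. Each change event updates the replica incrementally.
def apply_change(replica: dict[int, dict], event: dict) -> None:
    row = event["row"]
    if event["op"] == "delete":
        replica.pop(row["id"], None)
    else:  # "insert" or "update"
        replica[row["id"]] = dict(row)


# Style 2: message-based propagation. Applications react to published events.
class MessageBus:
    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[dict], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: dict) -> None:
        for handler in self._handlers.get(topic, []):
            handler(message)


# Style 3: virtual consolidation. Same shaping as the batch job, but computed
# on each request and never stored, so the consumer always sees current data.
def virtual_orders(source: list[dict]) -> list[dict]:
    return [to_consumer_shape(row) for row in source]


# --- Walk-through ---
warehouse = etl_batch(source_orders)  # scheduled bulk copy, taken now
replica = {row["id"]: dict(row) for row in source_orders}  # initial CDC replica

billing_inbox: list[dict] = []  # a second application's inbox
bus = MessageBus()
bus.subscribe("order.created", billing_inbox.append)

# A new order arrives in the operational system.
new_order = {"id": 3, "customer": "initech", "amount_pence": 4200}
source_orders.append(new_order)
apply_change(replica, {"op": "insert", "row": new_order})  # change event
bus.publish("order.created", new_order)  # application event

print(len(warehouse))  # batch copy is stale until the next scheduled run
print(len(replica))  # replica already reflects the change
print(len(billing_inbox))  # billing application was notified
print(len(virtual_orders(source_orders)))  # virtual view reads the source as it is now
```

After the new order arrives, the batch copy still holds two rows while the replica and the virtual view both show three. In this sketch the batch job and the virtual view apply the same transformation; the difference is only whether the result is stored as a second copy or computed at the moment the consumer asks for it.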

Extract, Transform and Load or Data Virtualization?

The suitability of data integration approaches needs to be considered for each case. Here are six key considerations to ponder:

  1. Will the data be replicated in both the data warehouse (DW) and the operational system?

    1. Will data need to be updated in one or both locations?

    2. If data is physically in two locations, beware of regulatory and compliance issues associated with having additional copies of the data (e.g., SOX, HIPAA, Basel II, FDA).

  2. Data governance

    1. Is the data only to be managed in the originating operational system?

    2. How certain is it that the data warehouse will be used for reporting only (rather than as an operational DW)?

  3. Currency of the data (i.e., does it need to be up to the minute?)

    1. How up to date does the data in the data warehouse need to be?

    2. Is there a need to see the operational data?

  4. Time to solution (i.e., how quickly is the solution required?)

    1. Immediate requirement?

    2. Confirmed users and usage?

  5. What is the life expectancy of source system(s)?

    1. Are any of the source systems likely to be retired?

    2. Will new systems be commissioned?

    3. Are new sources of data likely to be required?

  6. Need for historical / summary / aggregate data

    1. How much historical data is required in the DW solution?

    2. How much aggregated / summary data is required in the DW solution?

Leading analyst firms like Gartner are recommending that you add data virtualization to your integration tool kit, and that you should use the right style of data integration for the job for optimal results.

  • Chris Bradley

    Christopher Bradley has spent almost 30 years in the data management field, working for several blue-chip organisations in data management strategy, master data management, metadata management, data warehouse and business intelligence implementations. His career includes Volvo as lead database architect, Thorn EMI as Head of Data Management, Reader's Digest Inc as European CIO, and Coopers and Lybrand’s Management Consultancy, where he established and ran the International Data Management specialist practice. During this time, he worked on and led many major international assignments, including data management strategies, data warehouse implementations, the establishment of data governance structures and the largest data management strategy undertaken in Europe.

    Currently, Chris heads the Business Consultancy practice at IPL, a UK-based consultancy, and has been working for several years with many clients, including a British-headquartered supermajor energy company. Within their Enterprise Architecture group, he has established data modelling as a service and has been developing a group-wide data management strategy to ensure that common business practices and use of master data and models are promoted throughout the group. This work has involved establishing a data management framework, evangelising the message to management worldwide, developing governance and new business processes for data management, and developing and delivering training. He is also advising other commercial and public sector clients on information asset management.

    Chris is a member of the Meta Data Professionals Organisation (MPO) and DAMA, and has achieved Certified Data Management Professional Master status (CDMP Master). He has recently co-authored a book, Data Modelling For The Business: A Handbook for Aligning the Business with IT Using High-Level Data Models. You can reach him at Chris.Bradley@ipl.com.
