So How Do You Want Yours Served?

Originally published 6 January 2010

Most organisations today have a business intelligence (BI) and data warehouse (DW) solution of some sort. BI and data warehousing have matured from departmental reporting, through operational BI, to what I now see in many corporations: enterprise business intelligence.

The availability of a new generation of tools and BI solutions that easily integrate with ERP systems has undoubtedly provided real benefit in reducing overall time to solution.

However, the information explosion, the plethora of tool options, and the demands of information regulation and compliance present us with more challenges, including:

  • Data migration and the discussion on ETL vs EII (or both?).

  • Does open source BI have a place?

  • Historical reporting or predictive analytics?

  • Information management with regards to BI.

Time does not allow me to cover all of these in this article, so I’m going to highlight the first two.

Data Migration and the Debate on ETL Versus EII (or Both?)

By now most of us are familiar with the purpose of extract, transform and load (ETL) tools. Less well known, however, are the capabilities of the data virtualisation or enterprise information integration (EII) tools such as Composite or MetaMatrix.

Broadly speaking, these provide the capability to access data from a very wide variety of sources without having to move it from the source system. They have extremely rich caching and aggregation capabilities and, in my experience, can dramatically reduce the time taken to provide rich access to data. I once heard them described as “views on steroids”.
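
To make the idea concrete, here is a minimal sketch in Python of a federated, cached “virtual view”. The in-memory SQLite databases, the table names and the simple time-to-live cache are all illustrative stand-ins, not how any particular EII product works:

```python
import sqlite3
import time

# Two in-memory databases stand in for real source systems (say, a CRM
# and an ERP) so the sketch is self-contained and runnable.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE sales (customer_id INT, revenue REAL)")
crm.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 100.0), (2, 250.0)])

erp = sqlite3.connect(":memory:")
erp.execute("CREATE TABLE invoices (customer_id INT, revenue REAL)")
erp.executemany("INSERT INTO invoices VALUES (?, ?)", [(1, 40.0), (3, 90.0)])

class VirtualView:
    """A federated, cached view: the data stays in the sources."""
    def __init__(self, sources, ttl_seconds=300):
        self.sources = sources                 # name -> (connection, query)
        self.ttl = ttl_seconds                 # how long cached rows stay fresh
        self._cache, self._cached_at = None, 0.0

    def fetch(self):
        # Serve from the cache while it is fresh; otherwise re-query every source.
        if self._cache is not None and time.time() - self._cached_at < self.ttl:
            return self._cache
        rows = []
        for name, (conn, query) in self.sources.items():
            rows += [(name, *row) for row in conn.execute(query)]
        self._cache, self._cached_at = rows, time.time()
        return rows

view = VirtualView({
    "crm": (crm, "SELECT customer_id, revenue FROM sales"),
    "erp": (erp, "SELECT customer_id, revenue FROM invoices"),
})
print(view.fetch())   # one federated result set; nothing was migrated
```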

Can EII/Data Virtualisation Add Value to Data Warehousing?

The use of EII technology in enterprise data warehousing and for data take-on is something that demands serious consideration. There are several ways in which EII can add value to DW solutions; here are just three to consider:

Prototyping data warehouse development. During data warehouse development, the time taken for schema changes, adding new data sources and providing data federation is often considerable. Using data virtualisation to prototype a development environment means you can rapidly build a virtual data warehouse rather than a physical one. Reports, dashboards and so on can be built on the virtual data warehouse. After prototyping, the physical data warehouse can be introduced.
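
As a sketch of that swap (illustrative names, stubbed queries), the pattern below codes the reports against a warehouse interface, so the virtual prototype can later be replaced by the physical warehouse without rewriting them:

```python
from typing import List, Protocol, Tuple

class Warehouse(Protocol):
    """What a report needs; both implementations satisfy it."""
    def sales_by_region(self) -> List[Tuple[str, float]]: ...

class VirtualWarehouse:
    """Prototype: would delegate to federated views over the source systems."""
    def sales_by_region(self):
        return [("EMEA", 1400.0), ("APAC", 900.0)]   # stubbed federated query

class PhysicalWarehouse:
    """Production: would read the conformed star schema once it is built."""
    def sales_by_region(self):
        return [("EMEA", 1400.0), ("APAC", 900.0)]   # stubbed warehouse query

def region_report(dw: Warehouse) -> str:
    # The report never knows which implementation it has been given.
    return "\n".join(f"{region}: {total:,.0f}" for region, total in dw.sales_by_region())

print(region_report(VirtualWarehouse()))    # during prototyping
print(region_report(PhysicalWarehouse()))   # after the physical DW is introduced
```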

Enriching the ETL process. Frequently new data sources, particularly from ERPs, are required in the data warehouse, yet all too often the ETL tool lacks the data access capabilities to reach these complex sources. Tight processing windows may also require access, aggregation and federation activities to be performed before the ETL process runs. EII’s powerful data access and federation capabilities can present virtual views to the ETL process, which then continues as though it were using a simpler data source.
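
A minimal sketch of that hand-off, with a stubbed federation layer and invented table names: the ETL job extracts from one simple virtual view and never sees the underlying complexity.

```python
def virtual_erp_view():
    """Stands in for an EII layer: federation and aggregation done upstream."""
    gl = [("cost-centre-1", 120.0), ("cost-centre-1", 80.0), ("cost-centre-2", 50.0)]
    totals = {}
    for centre, amount in gl:
        totals[centre] = totals.get(centre, 0.0) + amount
    return totals.items()   # looks like a plain, simple source to the ETL

def etl_load(target):
    # Classic extract -> transform -> load, unaware of the ERP's complexity.
    for centre, total in virtual_erp_view():              # extract
        target.append({"cost_centre": centre.upper(),     # transform
                       "monthly_total": round(total, 2)})
    return target                                         # load (stubbed)

warehouse_table = []
print(etl_load(warehouse_table))
```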

Federating data warehouses. How many organisations have more than one data warehouse? Is the information in each completely discrete? I don’t think so. Data virtualisation provides powerful options for federating multiple DWs by creating an integrated view across them. This has particular relevance in providing rapid cross-warehouse views following a merger or acquisition.
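
The sketch below illustrates the principle with two tiny in-memory “warehouses” and a hypothetical key-mapping table; a data virtualisation product would express the same integrated view declaratively, but the mechanics are similar:

```python
# Two warehouse extracts from merged companies, with different customer keys.
dw_company_a = [("CUST-001", "Acme Ltd", 5000.0)]
dw_company_b = [("9001", "ACME LIMITED", 3200.0)]

# Cross-reference table reconciling customer keys between the two estates.
key_map = {"9001": "CUST-001"}

def federated_customer_revenue():
    revenue, names = {}, {}
    for cust, name, rev in dw_company_a:
        revenue[cust] = revenue.get(cust, 0.0) + rev
        names.setdefault(cust, name)
    for cust, name, rev in dw_company_b:
        canonical = key_map.get(cust, cust)    # map to the surviving key
        revenue[canonical] = revenue.get(canonical, 0.0) + rev
        names.setdefault(canonical, name)
    return [(c, names[c], total) for c, total in revenue.items()]

print(federated_customer_revenue())   # one cross-warehouse view, no reload
```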

Considerations for ETL or EII?

When providing data into a data warehouse, the use of ETL or EII (or both) needs care. Some of the key considerations are shown in Figure 1:

Figure 1: ETL or EII?

Open Source Business Intelligence

Open source BI has been one of the most talked-about technology trends this year, and the principle of open source is undeniably attractive. In the commercial world, however, remember that “commercial” open source doesn’t mean free: the business model usually relies on support, training, consulting and so on as the supplier’s income streams. That said, these solutions are well worth considering.

For some organisations, management reporting and BI have been provided by spreadsheet-based reports and graphs. These have evolved from a few departmental reports into an intertwined set of Excel reports, each feeding the next.

If this spreadsheet life cycle sounds familiar, you are not alone. In fact, it is so familiar that many people simply try to live with it. Managing data in this way will eventually lead to poor quality data in reports. Depending on the audience of the reports, the implications of poor data quality may be poor business decisions, loss of credibility, legal compliance issues and possible financial or legal penalties for breaching regulations. If these issues lead to an investigation of data management practices, the spreadsheet daisy chain is going to be hard to defend.

This doesn’t just lead to data quality issues; it also creates a very inefficient chain of data propagation. Everybody in the chain is dependent upon those before them, and any issue identified has to be passed back along the chain.

Furthermore, by keeping all the raw data in your spreadsheet, you store far more data locally than you need. With the continuous stream of information security failings reported in the press, can you defend why your laptop holds a spreadsheet containing all the low-level data, when all you needed to publish were some high-level key performance indicators (KPIs)?
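
As a small illustration of the alternative (table and column names invented for the example), the aggregation can be done where the data lives, so that only the headline KPI ever travels:

```python
import sqlite3

# An in-memory database stands in for the server where the data lives.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, value REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("North", 120.0), ("North", 80.0), ("South", 45.0)])

# Only the KPI leaves the server; row-level detail never reaches a laptop.
kpis = conn.execute(
    "SELECT region, COUNT(*), SUM(value) FROM orders GROUP BY region"
).fetchall()
print(kpis)   # e.g. [('North', 2, 200.0), ('South', 1, 45.0)]
```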

Over time, departments become more and more dependent upon spreadsheets. Before long you have little departmental “cottage industries” producing spreadsheet applications, often completely outside the governance of corporate application development strategies. These spreadsheet applications will inevitably need support and enhancement, and you may end up with applications whose total cost of ownership was never budgeted for.

For these scenarios, open source BI represents a quantum leap and is highly recommended.

Not Only Departmental Solutions

Open source BI is also suitable for larger-scale opportunities; however, before taking the plunge, a few questions need to be considered:

  • Is the developer interface and capability suitable? Open source tools are often not as refined and capable as their (more expensive) proprietary rivals.

  • Is the benefit of the open source “community” relevant in my sector? The community and its development of industry-specific solutions is a powerful argument for open source – but are there solutions for your industry?

  • Less expensive versus best? Fitness for purpose, capability and support must always be considered, not simply the headline price.

  • Prototype solution? Open source tools provide a very cost-effective way of developing a prototype, with the potential to take it further.

  • Head to head with Microsoft? At the less expensive end of the BI tools scale, Microsoft SQL Server, SSIS and SSRS solutions are very competitively priced. Open source tools, however, are less costly than other well-known mainstream rivals.

  • Appropriate for end-user development? Not really. However, in several organisations the reality is that it is actually the IT folks who build the Business Objects or Cognos reports too.

The opportunity is thus to see where open source business intelligence can be beneficial in your overall solution portfolio. It really has now come of age.

  • Chris Bradley

    Christopher Bradley has spent almost 30 years in the data management field, working for several blue-chip organisations on data management strategy, master data management, metadata management, data warehouse and business intelligence implementations. His career includes Volvo as lead database architect, Thorn EMI as Head of Data Management, Reader's Digest Inc as European CIO, and Coopers and Lybrand’s Management Consultancy, where he established and ran the international data management specialist practice. During this time he worked on and led many major international assignments, including data management strategies, data warehouse implementations, the establishment of data governance structures and the largest data management strategy undertaken in Europe.

    Currently, Chris heads the Business Consultancy practice at IPL, a UK-based consultancy, and has been working for several years with many clients, including a British-headquartered supermajor energy company. Within their Enterprise Architecture group, he has established data modelling as a service and has been developing a group-wide data management strategy to ensure that common business practices and the use of master data and models are promoted throughout the group. This has involved establishing a data management framework, evangelising the message to management worldwide, developing governance and new business processes for data management, and developing and delivering training. He is also advising other commercial and public sector clients on information asset management.

    Chris is a member of the Meta Data Professionals Organisation (MPO) and DAMA, and has achieved Certified Data Management Professional Master status (CDMP Master). He has recently co-authored the book Data Modelling For The Business – A Handbook for Aligning the Business with IT Using High-Level Data Models. You can reach him at Chris.Bradley@ipl.com.
