Originally published 13 July 2011
A recent (June 2011) IDC Digital Universe study found that the world's data is doubling every two years – faster than Moore's Law. It reckoned that 1.8 zettabytes (1.8 trillion gigabytes) will be created and replicated in 2011, that enterprises will manage 50 times more data, and that the number of files will grow 75-fold over the next decade. (Do you have any idea how much data 1.8 zettabytes really is? It's about the same amount of data as if every person in the world sent twenty tweets an hour for the next 1200 years!)
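As a rough sanity check on that analogy, the arithmetic can be sketched in a few lines. The population and per-tweet figures below are assumptions, not from the article: a 2011 world population of about 7 billion, and roughly 1.2 KB of storage per tweet once metadata is counted alongside the 140 characters of text.

```python
# Back-of-envelope check of the "twenty tweets an hour for 1200 years" analogy.
# Assumed figures (not from the article): ~7 billion people in 2011, and
# ~1.2 KB stored per tweet (140 chars of text plus metadata overhead).
WORLD_POPULATION = 7e9
TWEETS_PER_HOUR = 20
HOURS_PER_YEAR = 24 * 365
YEARS = 1200
BYTES_PER_TWEET = 1200  # assumed average, including metadata

total_tweets = WORLD_POPULATION * TWEETS_PER_HOUR * HOURS_PER_YEAR * YEARS
total_bytes = total_tweets * BYTES_PER_TWEET
zettabytes = total_bytes / 1e21
print(f"{zettabytes:.2f} ZB")  # roughly 1.8 ZB
```

Under those assumptions the total comes out at just under 1.8 zettabytes, which is consistent with the comparison the study makes.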
The “big data” phenomenon is driving transformational technological, scientific, and economic change, while “information taming” technologies are driving down the cost of creating, capturing, managing and storing information.
We've all seen how organisations have an insatiable desire for more data, believing this information will radically change their businesses. They are right, but data by itself is useless. Only by effectively exploiting this vast mountain of data – using business intelligence to convert it into helpful information, knowledge and applied decision making – will organisations realise its true potential.
The problem is that big data analytics pushes the limits of traditional data management. The most complex big data problems start with huge volumes of highly volatile data held in disparate stores. And big data problems aren't just about volume: there is also the volatility of the data sources and their rate of change, the variety of the data formats, and the complexity of the individual data types themselves. So is pulling all this data into yet another location for analysis always the most appropriate route?
Unfortunately, many organisations are constrained by traditional data integration approaches that can slow the adoption of big data analytics. The winners will be approaches that provide high-performance data integration to overcome data complexity and data silos, and that bring the major types of “big data” source into the enterprise.
Fortunately, approaches such as data federation and data virtualisation are stepping up to meet this challenge.
Finally, and of utmost importance, is managing the quality of the data. What use is this vast resource if its quality and trustworthiness are questionable? Driving your data quality capability up the maturity levels, as shown in Figure 1, is therefore key.
Figure 1: Data Quality Maturity – 5 levels of maturity (© IPL)
SOURCE: Big Data – Same Problems?