Originally published 3 April 2008
In the transaction processing world, data archiving is a system design feature. Yet when the same data moves to your data warehouse, an archival strategy is rarely treated as a compelling need. How much of the data in your data warehouse do you (or does your business) really use every day? I bet it is less than 1%. Why, then, do you need to keep all this data online in a data warehouse? Single version of truth, compliance, audit, fraud detection or business needs – whatever your driver, the value of old data should be evaluated.
Old data is a boat anchor, whether you are a members-only wholesale club with a high volume of transactions or a wireless company with call detail record (CDR) data. The information stored in the data warehouse is valuable in its current state only for a period of time, and in a summarized state for a further period.
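As a rough illustration of keeping recent data in its current state and older data only in a summarized state, here is a minimal Python sketch. The record fields (subscriber, call_date, minutes), the cutoff date and the monthly roll-up are all assumptions made for the example, not something the article prescribes.

```python
from collections import defaultdict
from datetime import date


def split_and_summarize(records, cutoff):
    """Keep records dated on or after `cutoff` in detail;
    roll older records up by (subscriber, month)."""
    detail = []
    summary = defaultdict(lambda: {"calls": 0, "minutes": 0})
    for rec in records:
        if rec["call_date"] >= cutoff:
            detail.append(rec)  # still valuable in its current state
        else:
            key = (rec["subscriber"], rec["call_date"].strftime("%Y-%m"))
            summary[key]["calls"] += 1
            summary[key]["minutes"] += rec["minutes"]
    return detail, dict(summary)


# Hypothetical CDR-like rows, invented for illustration only.
cdrs = [
    {"subscriber": "A", "call_date": date(2007, 1, 5), "minutes": 10},
    {"subscriber": "A", "call_date": date(2007, 1, 20), "minutes": 5},
    {"subscriber": "B", "call_date": date(2008, 3, 1), "minutes": 7},
]
detail, summary = split_and_summarize(cdrs, cutoff=date(2008, 1, 1))
```

In this sketch the two 2007 calls for subscriber A collapse into a single monthly row, while the 2008 call stays in detail. Where the cutoff falls is a business decision, not a technical one.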
Most data warehouses that are in use today were built to satisfy certain data and reporting requirements. If those requirements are no longer valid, how do you really understand the business value of this data, and how do you manage the life cycle of this data?
Data life cycle value determination:
Data retention requirements and storage strategies:
Implement an archival program.
Online/Offline Storage – If the legacy data needs to be accessed readily, then online storage of the data is essential. If legacy data does not need immediate access, then offline storage of the data is advised.
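The online/offline choice above can be sketched as a simple rule. This is only an illustration: the function name, the use of a last-access date as a proxy for "needs to be accessed readily", and the 365-day threshold are assumptions, and a real policy would also weigh compliance and audit requirements.

```python
from datetime import date


def choose_storage(last_accessed, today, ready_access_days=365):
    """Return "online" if the data was accessed recently enough to need
    ready access, otherwise "offline".

    `ready_access_days` is a hypothetical threshold chosen for this example.
    """
    days_idle = (today - last_accessed).days
    return "online" if days_idle <= ready_access_days else "offline"


today = date(2008, 4, 3)
recent = choose_storage(date(2008, 3, 1), today)  # accessed last month
stale = choose_storage(date(2005, 1, 1), today)   # untouched for years
```

Here `recent` comes out as "online" and `stale` as "offline". The threshold is left as a parameter because each business sets it differently.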
Traditional offline storage is becoming more expensive, both to implement and to restore data from. One emerging trend in data management is to consider the data warehouse appliance as an alternative storage platform. Next month’s article will cover this topic.
In conclusion, understanding the value of the data in the data warehouse and managing its life cycle are the first steps in managing the health of your data warehouse.
Recent articles by Krish Krishnan