Active Data Warehouse is repository of any form of captured transactional data so that they can be used for the purpose of finding trends and patterns to be used for future decision making.
According to Bill Inmon, a prominent data warehousing practitioner, data warehouse defined in terms of subject-oriented, time variant, non-volatile and integrated.
Subject-oriented means that the data captured is organized to have similar data linked together. Time-variant data changes are recorded and tracked so that a change patterns can be determined over time. Non-volatile means that when data is stored and committed, it can be read only and never deleted for comparison with newer data.
An active data warehouse has a feature that can integrate data changes while maintaining batch or scheduled cycle refreshes.
Companies use an active data warehouse in drawing an image of the company in statistical manner. For example, companies may be able to determine which months during a year employees have most absences and which branch has the most and least absences in a given period within a year. Among others, companies will be able to tell sales pattern like what particular products sell the most during a certain month and which countries have the most sales. With these patterns being found, the company can then formulate strategies on how to best optimize sales and generate revenues.
Data warehousing has existed in the late 80s as a type of computer database. It was developed to overcome the pattern spotting limitations of operational systems which could not handle intensive processing load for company wide reporting.
The early data warehouses were stored in separate computer databases designed specifically for the purpose of management information and analysis. Data came from several sources including mainframe computers, mini computer and personal computers. These data were integrated in one place for faster processing and user friendly software applications were developed to present statistical reports from the integrated data.
As technology evolved, data warehousing methods improved along with greater demands from company users. Data warehouses had several stages of evolution. At the early stage, data is copied from an operational system database into an offline database server where processing requirements do not affect the performance of the operational system. The offline data warehouse regularly updates data from the operational systems and store the data in an integrated data structure.
Real time data warehouse updates data on during actual transaction time in the operational system. Integrated data warehouse generate transaction events which are given back to operational systems for worker’s daily use.
Online transaction processing (OLTP) is the storage system often used for active data warehousing. OLTP is a relational database design that breaks down complex information into simple data tables. It is very efficient in analyzing and reporting billions of captured transactional data into user friendly format. It can also be tuned up to maximize computing power although data warehousing professionals recommend having a separate reporting database in other computer given the fact that millions of data may be processed by the OLTP database every second.
Active data warehouse professionals are often called Data Warehouse Architects. They are primarily top-notched database administrators who are tasked to handle a huge amount of complex data from different sources sometimes coming from different countries around the world.
An active data warehouse is often associated with Business Intelligence Systems. In the past, it was also referred to as Decision Support System (DSS) and Management Information System (MIS).