Highly Summarized Data

In Bill Inmon's widely cited definition, a data warehouse is "a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision making process".

As a brief background, a data warehouse is a rich repository of corporate data. It contains not only historical corporate information that is tightly integrated with the operational database, but also the current transactional data that are the lifeblood of the company's day-to-day operation.

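Highly summarized data are summarized data obtained by removing many data characteristics from the primary key of the data focus. For example, detail keyed by date, store, product and customer might be summarized to a key of just month and region. Highly summarized data are evaluational data, and working with them falls squarely within the domain of a data warehouse implementation. Because each summary row stands for many detail rows, highly summarized data have coarse granularity.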
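The effect of removing characteristics from the key can be sketched in a few lines of Python. This is only an illustration under assumed data: the rows, the field names (date, region, store, product, amount) and the helper functions summarize and with_month are hypothetical and do not come from any particular warehouse product.

```python
from collections import defaultdict

# Hypothetical detail rows; the detail key is (date, region, store, product).
detail_rows = [
    {"date": "2024-01-05", "region": "East", "store": "S1", "product": "P1", "amount": 120.0},
    {"date": "2024-01-17", "region": "East", "store": "S2", "product": "P2", "amount": 80.0},
    {"date": "2024-01-20", "region": "West", "store": "S3", "product": "P1", "amount": 50.0},
    {"date": "2024-02-02", "region": "East", "store": "S1", "product": "P1", "amount": 200.0},
]


def with_month(rows):
    """Add a 'month' field (YYYY-MM) derived from each row's date."""
    return [{**row, "month": row["date"][:7]} for row in rows]


def summarize(rows, key_fields):
    """Sum 'amount' over all rows that share the same values for key_fields."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        totals[key] += row["amount"]
    return dict(totals)


# Lightly summarized: store and product are dropped from the key.
monthly_by_region = summarize(with_month(detail_rows), ["month", "region"])

# Highly summarized: only region is left in the key.
by_region = summarize(detail_rows, ["region"])

print(monthly_by_region)
print(by_region)
```

The fewer fields that remain in the key, the fewer rows the summary has and the coarser its granularity: four detail rows become three monthly rows per region, and then two rows per region.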

Since the data handled in a data warehouse are of very high volume, there needs to be an effective mechanism for spreading the load. Assigning all of the load to one central database can be very risky for the system and for the business organization in general. It is therefore common data warehouse practice to store the data in the warehouse in different databases for day-to-day operations.

A data warehouse should be able to present "point in time" information, that is, the data as they stood at a given moment, even long after that moment has passed and whenever a data consumer wishes to access it.

For instance, if a sales territory is being realigned, the data warehouse should still be able to show the historical data in the context of both the original territory and the newly realigned one. The data going into the data warehouse is static, dependable and reliable.
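One common way to support both views is to keep territory assignments with validity dates rather than overwriting them. The sketch below assumes this approach. The store and territory names, the dates, and the functions territory_as_of and totals_by_territory are all hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical territory assignments with validity ranges (end is exclusive).
# Store S1 moves from North to Central on 2024-01-01.
assignments = [
    {"store": "S1", "territory": "North", "start": date(2023, 1, 1), "end": date(2024, 1, 1)},
    {"store": "S1", "territory": "Central", "start": date(2024, 1, 1), "end": date.max},
    {"store": "S2", "territory": "North", "start": date(2023, 1, 1), "end": date.max},
]

# Hypothetical sales facts; they are never modified after loading.
sales = [
    {"store": "S1", "day": date(2023, 6, 1), "amount": 100.0},
    {"store": "S1", "day": date(2024, 3, 1), "amount": 40.0},
    {"store": "S2", "day": date(2023, 6, 1), "amount": 70.0},
]


def territory_as_of(store, as_of):
    """Return the territory a store belonged to on the given date."""
    for a in assignments:
        if a["store"] == store and a["start"] <= as_of < a["end"]:
            return a["territory"]
    raise LookupError(f"no territory for {store} on {as_of}")


def totals_by_territory(as_of=None):
    """Total sales per territory.

    With as_of=None each sale uses the territory in effect on its own date
    (the original context). With a date, every sale is restated under the
    alignment in effect on that date (the realigned context).
    """
    totals = {}
    for s in sales:
        territory = territory_as_of(s["store"], as_of or s["day"])
        totals[territory] = totals.get(territory, 0.0) + s["amount"]
    return totals


original_view = totals_by_territory()
realigned_view = totals_by_territory(as_of=date(2024, 6, 1))

print(original_view)
print(realigned_view)
```

Because the sales rows themselves never change, the same history can be reported under either alignment. Only the lookup date differs.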

Of course, the biggest underlying reason for implementing a data warehouse despite its high cost is the demand from company executives for information about how the company is performing, and about how that performance relates to the wider world and to the competition in its industry. A data warehouse must therefore be able to produce highly summarized data. After all, what is the point of gathering millions of disparate data items if there is no way of extracting meaningful and relevant information from them?

Most highly summarized data are taken from evaluational databases. Evaluational databases are kept separate from the operational databases to overcome the problems that arise from the differences between OLAP and OLTP.

Some time ago, most databases were designed to handle on-line transaction processing (OLTP), and their design was chiefly concerned with avoiding redundancy and with the speed of entering data. On-line analytical processing (OLAP) came later and was designed to retrieve and display data as quickly and in as visually accessible a way as possible.

While both OLTP and OLAP contribute substantially to a company's success, a problem arises when both reside on the same system and use the same database design. Since it is not possible to discard the requirements of OLTP and reengineer everything for OLAP, the solution adopted was to create, manage and support a dual database system. Thus were born the operational database and the evaluational database.

The evaluational database is the data store that supports the OLAP function. It is also from this database that highly summarized data are produced: selected chunks of data are taken, many characteristics are stripped from the primary key, and the result is turned into relevant information for use by the company or by any data consumer requesting it.
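The dual database arrangement can be sketched with two separate SQLite databases, one standing in for the operational store and one for the evaluational store. This is a minimal sketch under assumptions: the orders and sales_summary tables, their columns and the refresh_summary function are hypothetical, and a real warehouse would normally use a scheduled load process rather than an in-memory copy.

```python
import sqlite3

# Operational database (OLTP): one row per transaction.
operational = sqlite3.connect(":memory:")
operational.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, order_date TEXT, "
    "region TEXT, product TEXT, amount REAL)"
)
operational.executemany(
    "INSERT INTO orders (order_date, region, product, amount) VALUES (?, ?, ?, ?)",
    [
        ("2024-01-05", "East", "P1", 120.0),
        ("2024-01-17", "East", "P2", 80.0),
        ("2024-01-20", "West", "P1", 50.0),
        ("2024-02-02", "East", "P1", 200.0),
    ],
)
operational.commit()

# Evaluational database (OLAP): a separate store holding summaries only.
evaluational = sqlite3.connect(":memory:")
evaluational.execute(
    "CREATE TABLE sales_summary ("
    "month TEXT, region TEXT, order_count INTEGER, total_amount REAL, "
    "PRIMARY KEY (month, region))"
)


def refresh_summary(source, target):
    """Rebuild the summary table from the operational orders."""
    rows = source.execute(
        "SELECT substr(order_date, 1, 7), region, COUNT(*), SUM(amount) "
        "FROM orders GROUP BY substr(order_date, 1, 7), region"
    ).fetchall()
    with target:  # commits on success, rolls back on error
        target.execute("DELETE FROM sales_summary")
        target.executemany("INSERT INTO sales_summary VALUES (?, ?, ?, ?)", rows)


refresh_summary(operational, evaluational)

summary = evaluational.execute(
    "SELECT month, region, order_count, total_amount "
    "FROM sales_summary ORDER BY month, region"
).fetchall()
print(summary)
```

Analytical queries then read only sales_summary, so they do not compete with transaction entry on the operational side.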

Highly summarized data are very useful for quick meeting reports and presentations. They are equally useful during hectic periods, such as times of stiff competition, when relevant, high-quality information is needed immediately as a basis for wise decisions under pressure.

Editorial Team at Geekinterview is a team of HR and Career Advice members led by Chandra Vennapoosa.
