Large data warehouses, such as those run by businesses with many branches around the world and diversified products and services, receive a flood of many kinds of data every day.
These data may come from other warehouse data sources, be freshly entered by staff in various departments, or arrive through company subscriptions.
The data will very likely arrive in different formats. The purpose of a data warehouse, however, is to give the company clear statistical reports of industry trends and patterns, so the warehouse needs mechanisms that support analysis and reporting tools.
For a business to have an intelligent system that relies on the data supplied by the data warehouse, a well-defined business data architecture is a very important consideration. Building a house is a useful comparison: to keep construction flowing smoothly and to make sure that the materials, interior setup, design and other specifications all fit together, a good plan or blueprint is essential, so that carpenters, masons, electricians and other building professionals with different areas of specialization can agree on one standard and the house does not fall into disarray.
The same is true within a business organization. Different companies can have different perspectives on the word transaction. For instance, for an organization offering flower shop services, a transaction means something entirely different than it does for an organization offering computer services.
Even in a homogeneous company, differing interpretations of the same business activity can still exist without a written, well-defined data architecture. In non-automated systems, one clerk may record a transaction differently from another clerk recording a transaction of the same nature. A software application can overcome this problem, but software applications cannot function without data. A Common Data Structure is therefore important for software applications to function properly.
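The clerk example can be made concrete with a minimal sketch. It assumes two hypothetical entry styles (the field names `cust`, `total`, `customer_name`, `cents` and `when` are invented for illustration and do not come from any real system) and shows how both can be mapped onto one common `Transaction` structure so that the same sale is recognized as the same record.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    """Hypothetical common structure that every entry is converted into."""
    customer: str
    amount: Decimal
    sold_on: date


def from_clerk_a(entry: dict) -> Transaction:
    # Assumed style of clerk A: amount as a decimal string, ISO date.
    return Transaction(
        customer=entry["cust"].strip().title(),
        amount=Decimal(entry["total"]),
        sold_on=date.fromisoformat(entry["date"]),
    )


def from_clerk_b(entry: dict) -> Transaction:
    # Assumed style of clerk B: amount in cents, date as day/month/year.
    day, month, year = (int(part) for part in entry["when"].split("/"))
    return Transaction(
        customer=entry["customer_name"].strip().title(),
        amount=Decimal(entry["cents"]) / 100,
        sold_on=date(year, month, day),
    )


a = from_clerk_a({"cust": " jane doe ", "total": "19.50", "date": "2024-03-05"})
b = from_clerk_b({"customer_name": "JANE DOE", "cents": 1950, "when": "05/03/2024"})
print(a == b)  # True: both entries describe the same transaction
```

Once both entries are expressed in the common structure, comparison, storage and reporting no longer depend on which clerk made the entry.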
In the design of software programs, the choice of Data Structures is a top design consideration. The experience of many IT professionals in building large systems shows that the difficulty of implementation and the performance and quality of the final output depend heavily on choosing the best Data Structure. For that reason, considerable time is given to defining the Data Structure as early as the planning stage.
The Data Structure is the technical interpretation of real-life business activities. In real life, businesses involve entities such as persons, products and kinds of services. These are translated into Data Structures so that a database or any other software application knows how to store them.
For instance, a person’s name may be stored as a string of characters while the age may be stored as an integer. If very specific Data Structures are followed in this way, the system can save storage space on disk and, at the same time, the processing algorithms can be optimized for speed and a lighter load on the computer.
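As a rough illustration of the name-as-string and age-as-integer example, the sketch below defines a hypothetical `Person` record and packs it into a fixed-width binary layout. The layout (a 32-byte name field followed by a 1-byte unsigned age) is an assumption chosen only to show how a specific structure fixes the storage cost of each record in advance.

```python
import struct
from dataclasses import dataclass

# Assumed layout: 32-byte name field, then a 1-byte unsigned age (0-255).
PERSON_FORMAT = "<32sB"


@dataclass
class Person:
    name: str  # stored as a string of characters
    age: int   # stored as an integer

    def pack(self) -> bytes:
        # struct pads short names with zero bytes and truncates long ones.
        return struct.pack(PERSON_FORMAT, self.name.encode("utf-8"), self.age)

    @classmethod
    def unpack(cls, raw: bytes) -> "Person":
        name_bytes, age = struct.unpack(PERSON_FORMAT, raw)
        return cls(name_bytes.rstrip(b"\x00").decode("utf-8"), age)


record = Person("Ada Lovelace", 36).pack()
print(len(record))            # 33 bytes for every record
print(Person.unpack(record))  # Person(name='Ada Lovelace', age=36)
```

Because every record has the same known size, a program can jump directly to the n-th record on disk, which is one way a specific structure helps both storage and processing speed. The trade-off in this sketch is that names longer than 32 bytes are truncated.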
In today’s data warehouses, where distributed systems are common, a Common Data Structure makes it easy to share information between servers. Distributed systems are composed of many computer servers, each processing business events and sharing its results so that they can be aggregated and used in statistical reports for the company.
If a Common Data Structure exists, problems with cross-compatibility and portability are greatly reduced, because disparate systems share the same view of data that follows a common structure.
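The sharing and aggregation described above can be sketched as follows. The example assumes a hypothetical shared schema in which every server reports records with a `region` string and a `sales` integer, exchanged as JSON; the schema, the field names and the helper functions are illustrative assumptions and not part of any particular warehouse product.

```python
import json
from collections import Counter

# Assumed shared schema: every server reports {"region": str, "sales": int}.
REQUIRED_FIELDS = {"region": str, "sales": int}


def validate(record: dict) -> dict:
    """Reject records that do not follow the common structure."""
    for field, kind in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), kind):
            raise ValueError(f"field {field!r} missing or not {kind.__name__}")
    return record


def aggregate(payloads: list[str]) -> dict[str, int]:
    """Sum sales per region across the JSON payloads of all servers."""
    totals = Counter()
    for payload in payloads:
        for record in json.loads(payload):
            validate(record)
            totals[record["region"]] += record["sales"]
    return dict(totals)


# Two hypothetical servers report their results in the same structure.
server_1 = json.dumps([{"region": "EU", "sales": 120}, {"region": "US", "sales": 80}])
server_2 = json.dumps([{"region": "EU", "sales": 30}])
print(aggregate([server_1, server_2]))  # {'EU': 150, 'US': 80}
```

Because both servers follow the same structure, the aggregating side needs only one validation rule and one summing loop, regardless of how many servers contribute.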