Disparate Metadata Cycle

The Disparate Metadata Cycle is a cyclic process in which disparate metadata is produced rapidly. Data warehouse implementations have become ubiquitous because businesses can no longer function without a supply of relevant data, and the use of metadata has grown in importance along with them. As a short background, metadata is data that describes other data, with the goal of facilitating the use, management and understanding of data in a large data warehouse.

For instance, in a library system the metadata will include descriptions of book contents, authors, dates of publication and the physical location of books in the library. In photography, the metadata describes the camera, its model and type, the photographer, the date the photograph was taken, the location where it was taken and many other things.
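As a minimal sketch of the library example, the record below holds the descriptive fields just listed. The class name `BookMetadata`, its field names and the sample values are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class BookMetadata:
    """Hypothetical metadata record describing one book in a library."""
    title: str
    authors: Tuple[str, ...]
    description: str          # short summary of the book's contents
    publication_date: date
    shelf_location: str       # physical location in the library


# Sample values are placeholders, not real catalogue data.
record = BookMetadata(
    title="Example Title",
    authors=("A. Author",),
    description="Placeholder summary of the contents.",
    publication_date=date(2001, 1, 1),
    shelf_location="Floor 2, Shelf 14",
)

# The metadata describes the book; it is not the book's content itself.
print(asdict(record)["shelf_location"])
```

The same pattern would apply to the photography case, with fields for the camera, the photographer, the date and the location.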

But managing metadata is not as simple as creating it. Management becomes all the more complex when dealing with large data warehouses that get data from various data sources powered by disparate databases.

The problem of disparate metadata has been a big challenge in many areas of information technology, including data warehousing, enterprise resource planning (ERP), supply chain management and other applications that deal with transactional systems holding disparate data as well as duplication and redundancy. In the case of disparate metadata, the most common problems are missing metadata relationships, costly implementation and maintenance, and poor choice of technology platforms.

A metadata management project is more of a systems-integration problem than an application-development effort. Although it may be possible for an organization to simply build its own metadata management tools, many opt to buy a collection of vendor-provided products that work together to move the data from source applications into the warehouse environment.

Each of these metadata management tools can support several sets of data warehouse functions, which use or create subsets of the data warehouse metadata taken from various disparate databases.

In general practice, many tools require access to the same metadata, but in many instances each product is self-contained and provides its own metadata management facilities. Consequently, the definition of metadata may be redundant across the data warehouse tool suite, and this can result in disparate metadata. But as part of the cycle, the metadata must be integrated for it to be useful to the users of the data warehouse metadata.

Integrating metadata is in fact similar to general data warehouse data integration. In the same manner that the data warehouse system integrates data from disparate databases, the warehouse metadata management environment must integrate disparate metadata.

The system is simply configured to collect the disparate metadata from the source tools and then, after integration, disburse it back to any tools that use it. It is wise, though, to determine which tool is the most appropriate source for each metadata object, as various data warehouse tools may be capturing the same metadata at any given time.
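The collect, integrate and disburse steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the tool names, the metadata object names, their definitions and the `source_of_record` mapping are all hypothetical and do not refer to any real product.

```python
from typing import Dict

# Hypothetical tools and metadata objects, for illustration only.
# Each tool reports its own definition of the objects it captures.
tool_metadata: Dict[str, Dict[str, str]] = {
    "etl_tool": {"customer_id": "integer key from CRM", "order_date": "load timestamp"},
    "modeling_tool": {"customer_id": "surrogate key, INT", "order_date": "DATE, order placed"},
    "reporting_tool": {"order_date": "date shown on reports"},
}

# Assumed rule: one tool is designated the most appropriate source per object.
source_of_record: Dict[str, str] = {
    "customer_id": "modeling_tool",
    "order_date": "modeling_tool",
}


def integrate(collected: Dict[str, Dict[str, str]], owners: Dict[str, str]) -> Dict[str, str]:
    """Keep, for each metadata object, the definition from its designated tool."""
    integrated = {}
    for obj_name, owner in owners.items():
        definition = collected.get(owner, {}).get(obj_name)
        if definition is not None:
            integrated[obj_name] = definition
    return integrated


def distribute(integrated: Dict[str, str], collected: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Return, per tool, the integrated definitions for the objects that tool uses."""
    return {
        tool: {name: integrated[name] for name in objects if name in integrated}
        for tool, objects in collected.items()
    }


integrated = integrate(tool_metadata, source_of_record)
redistributed = distribute(integrated, tool_metadata)
print(redistributed["reporting_tool"])
```

Choosing a single source per object keeps the integration step deterministic; a real environment would also need a policy for objects that no designated tool reports.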

Managing a disparate metadata cycle can pose a considerable challenge. This is mainly because the only common attribute for identifying different versions of the same metadata object is its name.
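A small sketch can show why relying on names alone is fragile. The tool names, the object names and the normalization rule in `normalize` are assumptions made for illustration only.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Assumed normalization: lowercase and drop spaces, hyphens and underscores."""
    return "".join(ch for ch in name.lower() if ch not in " -_")


# Hypothetical (tool, object name) pairs as different tools might record them.
captured = [
    ("etl_tool", "Customer_ID"),
    ("modeling_tool", "customer id"),
    ("reporting_tool", "CustID"),
]

# Group the tools by the normalized name of the object they captured.
versions = defaultdict(list)
for tool, name in captured:
    versions[normalize(name)].append(tool)

# "Customer_ID" and "customer id" land in one group, but "CustID" does not,
# even though all three may describe the same metadata object.
print(dict(versions))
```

Name matching catches differences in case and punctuation, but an abbreviation is enough to split one object into two apparent objects.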

If different tools have been used to record the same metadata attribute, rules must be established regarding which tool maintains the master version in order to maintain consistency. So despite all these tools, handling disparate metadata has to be a repetitive process until such integration can take place. And the same goes for disparate data: the cycle will continue as long as the data warehouse is still in use.

Editorial Team at Geekinterview is a team of HR and Career Advice members led by Chandra Vennapoosa.


