A company database can contain millions of atomic data items. Atomic data are pieces of information that cannot be broken down any further. For example, a product name is atomic because it can no longer be broken down, whereas a product's raw material can be broken down further into raw components, depending on the good. The sales figure for an individual product is another example of atomic data.
But business organizations are interested not only in the minute details but also in the bigger picture. So atomic data are combined and aggregated. Once this is done, the company can determine regional or total sales, total cost of goods, selling, general and administrative expenses, operating income, receivables, inventories, depreciation, amortization, debt, taxes and other figures.
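The aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `sales` records, their field names, and the `aggregate_by` helper are hypothetical and stand in for rows that would normally come from the company database.

```python
from collections import defaultdict

# Hypothetical atomic records: one row per individual product sale.
sales = [
    {"product": "Widget", "region": "North", "amount": 120.0},
    {"product": "Widget", "region": "South", "amount": 80.0},
    {"product": "Gadget", "region": "North", "amount": 200.0},
    {"product": "Gadget", "region": "South", "amount": 50.0},
]

def aggregate_by(records, key, measure="amount"):
    """Sum the chosen measure over all records that share the same key value."""
    totals = defaultdict(float)
    for row in records:
        totals[row[key]] += row[measure]
    return dict(totals)

# Roll the atomic rows up to regional figures, then to one company-wide total.
regional_sales = aggregate_by(sales, "region")
total_sales = sum(regional_sales.values())

print(regional_sales)
print(total_sales)
```

The same helper could roll the rows up by product instead of region by passing `"product"` as the key, which is the sense in which one set of atomic data supports many aggregated views.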
Data Mining, the process of drawing data from the vast repository of a data warehouse, uses combined data intensively. Software applications working in conjunction with a good relational database management system have been developed to provide efficient ways to store and access data gathered over time and space for statistical analysis.
Data Mining is technically described as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" and "the science of extracting useful information from large data sets or databases". It is a process in which large amounts of data are sorted to pick out relevant information from potentially irrelevant sources.
One of the biggest problems with Data Mining is the level of data aggregation. For example, in an online survey by a private organization on the smoking trends of one region, one data set may contain records of those who currently smoke, another records of those who have quit smoking, and a third records of those who have never smoked at all.
The collection within each data set grows continuously as data from other sources keep coming in. Traditionally, these data are combined either with an ad hoc method or by fitting each data set to a certain model and then combining the results.
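As a rough illustration of the ad hoc approach, the sketch below pools three separate survey data sets into one table by tagging each record with the data set it came from. The record fields, the sample values, and the `combine` helper are all hypothetical; a real survey would carry many more attributes and would need duplicate and consistency checks.

```python
from collections import Counter

# Hypothetical survey data sets, kept separately by smoking status.
current_smokers = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
former_smokers = [{"id": 3, "age": 45}]
never_smoked = [{"id": 4, "age": 29}, {"id": 5, "age": 62}, {"id": 6, "age": 38}]

def combine(labelled_sets):
    """Pool several data sets, recording each record's origin in a 'status' field."""
    combined = []
    for status, records in labelled_sets.items():
        for rec in records:
            combined.append({**rec, "status": status})
    return combined

survey = combine({
    "current": current_smokers,
    "former": former_smokers,
    "never": never_smoked,
})

# Aggregate the pooled table: respondents per status and their share of the total.
counts = Counter(rec["status"] for rec in survey)
shares = {status: n / len(survey) for status, n in counts.items()}

print(counts)
print(shares)
```

Because new records keep arriving, a pooled table like this would have to be rebuilt or appended to regularly, which is one reason ad hoc combination becomes hard to maintain.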
Newer methods have been developed to combine data from various sources efficiently. Data coming from several tables and databases can now be combined into a single information table. One method used is a likelihood procedure, which provides an estimation technique to address identifiable problems with aggregated data from some tables that are related to other tables.
Companies find technologies that offer Business Intelligence a valuable investment. Business Intelligence combines the vast repository of a business data warehouse with software systems that analyze and report on the gathered business data.
An example of a Business Intelligence technology is Online Analytical Processing, or OLAP. OLAP can quickly provide answers to analytic queries that are multidimensional in nature. It can combine data from different sources and generate reports for sales, marketing, financial forecasting, budgeting, and other related aspects of business.
OLAP can run complex ad hoc and analytical queries on a database configured for OLAP use, and execution is very fast, which matters because a server needs to answer many users at a time from different geographical locations. OLAP combines data into a matrix output format, with the dimensions forming the rows and columns and the cells holding the values or measures.
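The matrix output described above can be imitated with a small pivot routine. This is a minimal sketch in plain Python, not how an OLAP server is implemented: the `facts` rows, the choice of region and quarter as dimensions, and the `pivot` function are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, quarter, sales amount).
facts = [
    ("North", "Q1", 100),
    ("North", "Q2", 150),
    ("South", "Q1", 70),
    ("South", "Q2", 90),
    ("North", "Q1", 30),
]

def pivot(fact_rows):
    """Sum the measure for each (row dimension, column dimension) pair."""
    cells = defaultdict(int)
    for row_dim, col_dim, measure in fact_rows:
        cells[(row_dim, col_dim)] += measure
    row_labels = sorted({r for r, _, _ in fact_rows})
    col_labels = sorted({c for _, c, _ in fact_rows})
    # Missing combinations read as 0 because cells is a defaultdict.
    matrix = [[cells[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, matrix

rows, cols, matrix = pivot(facts)

# Print the matrix with dimension labels on both axes.
print("      " + "  ".join(f"{c:>5}" for c in cols))
for label, values in zip(rows, matrix):
    print(f"{label:<6}" + "  ".join(f"{v:>5}" for v in values))
```

Here the two dimensions label the rows and columns, and each cell holds the aggregated measure for that combination, which mirrors the layout of an OLAP report on a small scale.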
Combined data is also used heavily in Data Farming, a process in which high-performance computers or computing grids run simulations billions of times across a large parameter and value space to produce a landscape of output data that can be analyzed for trends, insights and anomalies across many dimensions. It can be compared to a real plant farm, where a harvest of data comes after some time.
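On a very small scale, the idea of data farming can be sketched as a loop over a parameter grid with repeated simulation runs at each grid point. Everything in this example is an assumption chosen for illustration: the toy queue model in `simulate`, the parameter names, the grid values, and the number of replications. Real data farming runs far larger experiments on dedicated hardware.

```python
import itertools
import random
import statistics

def simulate(arrival_rate, service_rate, rng, steps=200):
    """Toy single-queue model: return the average queue length over the run."""
    queue = 0
    total = 0
    for _ in range(steps):
        if rng.random() < arrival_rate:
            queue += 1          # a new item arrives
        if queue and rng.random() < service_rate:
            queue -= 1          # one waiting item is served
        total += queue
    return total / steps

def farm(arrival_rates, service_rates, replications=20, seed=0):
    """Run the model many times at every grid point and keep the mean output."""
    landscape = {}
    for a, s in itertools.product(arrival_rates, service_rates):
        # A per-point seed keeps the whole experiment reproducible.
        rng = random.Random(f"{seed}-{a}-{s}")
        runs = [simulate(a, s, rng) for _ in range(replications)]
        landscape[(a, s)] = statistics.mean(runs)
    return landscape

landscape = farm([0.2, 0.5, 0.8], [0.3, 0.6, 0.9])

# The "harvest": one aggregated output value per parameter combination.
for (a, s), avg_queue in sorted(landscape.items()):
    print(f"arrival={a} service={s} avg_queue={avg_queue:.2f}")
```

The resulting dictionary is a tiny output landscape: each key is a point in the parameter space and each value combines many simulation runs, so trends and anomalies can be looked for across the grid.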