A distributed database is a database that consists of two or more data files stored at different sites within a computer network. Because the database is distributed, many end users can access it at the same time without interfering with one another. For the same reason, there must be a mechanism that synchronizes the scattered databases so that they all contain consistent data.
A central database management system (DBMS) provides the mechanism for managing, controlling and synchronizing the distributed databases, whose storage devices are not all controlled by a common central processing unit.
The data collections held in database tables are distributed among multiple physical storage devices in different locations and placed in separate partitions, or fragments. To make sure that data remains available, each fragment or partition of the distributed database is replicated, for example through redundant fail-over copies or RAID storage.
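The fragmentation and replication described above can be illustrated with a minimal sketch. The site names, the modulo partitioning rule, the round-robin placement and the replication factor of two are all assumptions made for the example; the article does not prescribe any of them.

```python
from collections import defaultdict

# Hypothetical site names; the article does not name any.
SITES = ["site_a", "site_b", "site_c"]
REPLICAS_PER_FRAGMENT = 2  # assumed replication factor


def fragment_of(key, fragment_count):
    """Map an integer row key to a fragment number (modulo partitioning)."""
    return key % fragment_count


def sites_for_fragment(fragment):
    """Pick the sites that hold copies of a fragment (round-robin placement)."""
    return [SITES[(fragment + i) % len(SITES)] for i in range(REPLICAS_PER_FRAGMENT)]


def place_rows(rows):
    """Store every row on each site that replicates the row's fragment."""
    storage = defaultdict(dict)
    for key, value in rows.items():
        fragment = fragment_of(key, len(SITES))
        for site in sites_for_fragment(fragment):
            storage[site][key] = value
    return dict(storage)


def read_row(storage, key, failed_sites=frozenset()):
    """Read a row from the first replica whose site is still reachable."""
    fragment = fragment_of(key, len(SITES))
    for site in sites_for_fragment(fragment):
        if site not in failed_sites:
            return storage[site][key]
    raise LookupError(f"no live replica for key {key}")
```

With three rows keyed 1, 2 and 3, each site ends up holding two of the three fragments, so a read still succeeds when any single site is marked as failed.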
Aside from replication and fragmentation, there are various other distributed database methods and implementation techniques. These include local autonomy and synchronous and asynchronous distributed database technologies. The choice among them depends on, among other things, the business needs and the sensitivity or confidentiality of the data that will be stored in the database.
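The difference between synchronous and asynchronous operation is easiest to see in how a write is propagated to the copies. The following is a minimal in-memory sketch; the `Primary` and `Replica` classes and the `flush` method are hypothetical names introduced only for illustration, and a real system would send updates over the network rather than call methods directly.

```python
from collections import deque


class Replica:
    """A hypothetical in-memory copy of the data held at one site."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    """Accepts writes and forwards them to replicas in one of two modes."""

    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = deque()  # updates not yet delivered (asynchronous mode)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: every replica is updated before the write returns.
            for replica in self.replicas:
                replica.apply(key, value)
        else:
            # Asynchronous: the write returns at once; replicas catch up later.
            self.pending.append((key, value))

    def flush(self):
        """Deliver queued updates, as a background process would."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.apply(key, value)
```

In synchronous mode the copies never disagree after a write returns, at the cost of waiting for every replica. In asynchronous mode the write is faster, but a replica may serve stale data until the queued updates are delivered.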
The basic structure of a distributed database is a client-server structure. This is a common model in network technology. Common applications that use the client-server model include the File Transfer Protocol (FTP) and the World Wide Web. In a client-server system, each computer acts as a node, so that in a distributed database each database server is a node that can act as a client, a server, or both.
On closer examination, this is similar to how the web works: servers deliver web pages, and the computers used by viewers are the clients. But when somebody browses a website from a server computer, that computer is both a client and a server at the moment it accesses the web page.
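The dual role of a node, acting as a server for requests it receives and as a client for data it does not hold, can be sketched as follows. The `Node` class and its `serve` and `lookup` methods are hypothetical names for this example, and direct method calls stand in for network requests.

```python
class Node:
    """A hypothetical database node that can both serve and issue requests."""

    def __init__(self, name, local_data):
        self.name = name
        self.local_data = dict(local_data)
        self.peers = []

    def connect(self, peer):
        """Register another node that this node may query."""
        self.peers.append(peer)

    def serve(self, key):
        """Server role: answer from local data, or return None if absent."""
        return self.local_data.get(key)

    def lookup(self, key):
        """Client role: try local data first, then ask each peer in turn."""
        value = self.serve(key)
        if value is not None:
            return value, self.name
        for peer in self.peers:
            value = peer.serve(key)
            if value is not None:
                return value, peer.name
        raise KeyError(key)
```

A call such as `a.lookup("k2")` returns both the value and the name of the node that supplied it, which makes it visible when node `a` acted as a client of node `b` rather than answering from its own data.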
Because a distributed database is, in general, a network technology, it should be implemented with the utmost care, as data is exposed while servers and clients constantly communicate with each other. Care should also be taken to make the data distribution transparent, so that the end user interacts with the system as if it were one homogeneous structure, with performance kept as high as possible.
Transactions should also be transparent, with database integrity maintained across the whole distributed system. Transactions should be divided into sub-transactions to better distribute the load and speed up processing, but the nodes must then communicate with one another to synchronize data and keep it consistent.
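One way to keep sub-transactions consistent across nodes is a two-phase commit. The article does not name a protocol, so two-phase commit is an assumption here, chosen because it is a well-known option. The sketch below is a simplified in-memory version: the `Participant` class, the account names and the rule that a balance may not go negative are all hypothetical, and each node is assumed to take part in at most one sub-transaction per transaction.

```python
class Participant:
    """A hypothetical node that executes one sub-transaction."""

    def __init__(self, name, balances):
        self.name = name
        self.balances = dict(balances)
        self.staged = None  # change prepared but not yet committed

    def prepare(self, account, delta):
        """Phase 1: stage the change and vote on whether it can be applied."""
        new_balance = self.balances.get(account, 0) + delta
        if new_balance < 0:
            return False  # vote to abort: the change would overdraw the account
        self.staged = (account, new_balance)
        return True

    def commit(self):
        """Phase 2: make the staged change permanent."""
        if self.staged is not None:
            account, new_balance = self.staged
            self.balances[account] = new_balance
            self.staged = None

    def rollback(self):
        """Phase 2, abort path: discard the staged change."""
        self.staged = None


def run_transaction(sub_transactions):
    """Apply (participant, account, delta) sub-transactions all or not at all."""
    votes = [node.prepare(account, delta) for node, account, delta in sub_transactions]
    if all(votes):
        for node, _, _ in sub_transactions:
            node.commit()
        return True
    for node, _, _ in sub_transactions:
        node.rollback()
    return False
```

A transfer between two nodes becomes two sub-transactions, one debit and one credit. If either node votes against the change during the prepare phase, neither balance is modified, which is how integrity is preserved across the whole system.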
A distributed database system offers many advantages over a centralized one. It can reflect the organizational structure, since the database is fragmented and the fragments are located in different areas of the company. This can make data management easier, and problems can be traced more easily because of the system's modular nature.
Each department can also have local autonomy over its database, so control of the data is efficient and each department can have a sense of ownership, making sure that its fragment is looked after in terms of integrity and security.
A distributed system may give the impression of being very costly, because it employs many computers as well as network devices and requires network maintenance. But in the long run it can cost less, because of the benefits derived from its speed and efficiency.