This article has multiple issues. Please help improve it or discuss these issues on the talk page. (Learn how and when to remove these template messages)(Learn how and when to remove this template message)
A distributed database is a database in which not all storage devices are attached to a common processor. It may be stored in multiple computers, located in the same physical location; or may be dispersed over a network of interconnected computers. Unlike parallel systems, in which the processors are tightly coupled and constitute a single database system, a distributed database system consists of loosely coupled sites that share no physical components.
System administrators can distribute collections of data (e.g. in a database) across multiple physical locations. A distributed database can reside on organized network servers or decentralized independent computers on the Internet, on corporate intranets or extranets, or on other organization networks. Because distributed databases store data across multiple computers, distributed databases may improve performance at end-user worksites by allowing transactions to be processed on many machines, instead of being limited to one.
Both replication and duplication can keep the data current in all distributive locations.
Besides distributed database replication and fragmentation, there are many other distributed database design technologies. For example, local autonomy, synchronous and asynchronous distributed database technologies. These technologies' implementations can and do depend on the needs of the business and the sensitivity/confidentiality of the data stored in the database, and the price the business is willing to spend on ensuring data security, consistency and integrity.
When discussing access to distributed databases, Microsoft favors the term distributed query, which it defines in protocol-specific manner as "[a]ny SELECT, INSERT, UPDATE, or DELETE statement that references tables and rowsets from one or more external OLE DB data sources". Oracle provides a more language-centric view in which distributed queries and distributed transactions form part of distributed SQL.
Today the distributed DBMS market is evolving dramatically, with new, innovative entrants and incumbents supporting the growing use of unstructured data and NoSQL DBMS engines, as well as XML databases and NewSQL databases. These databases are increasingly supporting distributed database architecture that provides high availability and fault tolerance through replication and scale out ability. Some examples are Aerospike, Cassandra, ArangoDB, Clusterpoint, ClustrixDB, Couchbase, Druid (open-source data store), FoundationDB, NuoDB, Riak, MongoDB and OrientDB. The blockchain technology popularised by bitcoin is an implementation of a distributed database.
A database user accesses the distributed database through:
A homogeneous distributed database has identical software and hardware running all databases instances, and may appear through a single interface as if it were a single database. A heterogeneous distributed database may have different hardware, operating systems, database management systems, and even data models for different databases.
In homogeneous distributed database, all sites have identical software and are aware of each other and agree to cooperate in processing user requests. Each site surrenders part of its autonomy in terms of right to change schema or software. A homogeneous DBMS appears to the user as a single system. The homogeneous system is much easier to design and manage. The following conditions must be satisfied for homogeneous database:
In a heterogeneous distributed database, different sites may use different schema and software. Difference in schema is a major problem for query processing and transaction processing. Sites may not be aware of each other and may provide only limited facilities for cooperation in transaction processing. In heterogeneous systems, different nodes may have different hardware & software and data structures at various nodes or locations are also incompatible. Different computers and operating systems, database applications or data models may be used at each of the locations. For example, one location may have the latest relational database management technology, while another location may store data using conventional files or old version of database management system. Similarly, one location may have the Windows operating system, while another may have UNIX. Heterogeneous systems are usually used when individual sites use their own hardware and software. On heterogeneous system, translations are required to allow communication between different sites (or DBMS). In this system, the users must be able to make requests in a database language at their local sites. Usually the SQL database language is used for this purpose. If the hardware is different, then the translation is straightforward, in which computer codes and word-length is changed. The heterogeneous system is often not technically or economically feasible. In this system, a user at one location may be able to read but not update the data at another location.
Care with a distributed database must be taken to ensure the following:
There are two principal approaches to store a relation r in a distributed database system:
A) Replication: In replication, the system maintains several identical replicas of the same relation r in different sites.
B) Fragmentation: The relation r is fragmented into several relations r1, r2, r3....rn in such a way that the actual relation could be reconstructed from the fragments and then the fragments are scattered to different locations. There are basically two schemes of fragmentation:
The Merge Replication Method is popularly used to consolidate the data between databases.
distributed query[:] Any SELECT, INSERT, UPDATE, or DELETE statement that references tables and rowsets from one or more external OLE DB data sources.
Distributed SQL synchronously accesses and updates data distributed among multiple databases. [...] Distributed SQL includes distributed queries and distributed transactions.
This article includes a list of references, but its sources remain unclear because it has insufficient inline citations. (April 2013) (Learn how and when to remove this template message)
None of the audio/visual content is hosted on this site. All media is embedded from other sites such as GoogleVideo, Wikipedia, YouTube etc. Therefore, this site has no control over the copyright issues of the streaming media.
All issues concerning copyright violations should be aimed at the sites hosting the material. This site does not host any of the streaming media and the owner has not uploaded any of the material to the video hosting servers. Anyone can find the same content on Google Video or YouTube by themselves.
The owner of this site cannot know which documentaries are in public domain, which has been uploaded to e.g. YouTube by the owner and which has been uploaded without permission. The copyright owner must contact the source if he wants his material off the Internet completely.