Berkeley DB

From Wikipedia, the free encyclopedia

Berkeley DB (BDB) is a computer software library that provides a high-performance embedded database, with bindings for C, C++, Java, Perl, Python, Ruby, Tcl, Smalltalk, and many other programming languages. BDB stores arbitrary key/data pairs as byte arrays, and supports multiple data items for a single key. BDB can support thousands of simultaneous threads of control or concurrent processes manipulating databases as large as 256 terabytes. It runs on a wide variety of operating systems, including most Unix-like systems, Windows, and real-time operating systems.

Origin

Berkeley DB was first developed at the University of California, Berkeley as part of the transition from 4.3BSD to 4.4BSD and the effort to remove AT&T-encumbered code. In 1996 Netscape asked the authors of Berkeley DB to improve and extend the library, then at version 1.85, to suit its requirements for an LDAP server[1] and for use in the Netscape browser. That request led to the creation of Sleepycat Software, which Oracle Corporation acquired in February 2006.

Berkeley DB is redistributed under the Sleepycat Public License, an OSI- and FSF-approved license. The product ships with complete source code, build tools, a test suite, and documentation. Its code quality and general utility, along with the free software/open source license, have led to its use in a multitude of free software/open source programs. Those who do not wish to abide by the terms of the Sleepycat Public License can instead purchase a proprietary license for redistribution from Oracle Corporation. This arrangement is called dual licensing.

Berkeley DB includes compatibility interfaces for some historic UNIX database libraries: dbm, ndbm and hsearch.
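For readers unfamiliar with the dbm family, the following is a minimal sketch of what a dbm-style interface looks like: open a file, store a byte string under a byte-string key, and fetch it back. It uses `dbm.dumb` from the Python standard library, which is Python's own portable implementation and not Berkeley DB or its compatibility layer; the file name and key are made up for illustration.

```python
import dbm.dumb
import os
import tempfile


def dbm_round_trip(directory):
    """Store and fetch one byte-string pair through a dbm-style interface."""
    # Hypothetical database name; dbm.dumb creates its own files from it.
    path = os.path.join(directory, "example")

    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.dumb.open(path, "c") as db:
        db[b"user:42"] = b"\x00\x01 arbitrary bytes"

    # Reopen read-only to show that the pair was persisted to disk.
    with dbm.dumb.open(path, "r") as db:
        return db[b"user:42"]


with tempfile.TemporaryDirectory() as tmp:
    fetched = dbm_round_trip(tmp)
```

The point of the sketch is only the shape of the interface: keys and values are opaque bytes, and the store is a file opened directly by the program.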

Architecture

Berkeley DB is notable for having a simple architecture compared with other database systems such as Microsoft SQL Server and Oracle Database. For example, it does not provide support for network access: programs access the database using in-process API calls. It does not support SQL or any other query language, nor does it support table schemas or columns. A program accessing the database is free to decide how the data is to be stored in a record, and Berkeley DB puts no constraints on the record's data. The record and its key can both be up to four gigabytes long.
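Because the database treats each record as an opaque byte array, the record layout is entirely the application's responsibility. The sketch below illustrates that idea in Python under stated assumptions: the layout (a 4-byte id, an 8-byte float, then a UTF-8 name) is hypothetical, and a plain `dict` stands in for the key/value store. It does not use the Berkeley DB API.

```python
import struct

# Hypothetical record layout chosen by the application, not by the database:
# a little-endian 4-byte unsigned id, an 8-byte float balance, then a UTF-8 name.
HEADER = struct.Struct("<Id")


def encode_record(user_id, balance, name):
    """Serialize the fields into the byte array that would be stored."""
    return HEADER.pack(user_id, balance) + name.encode("utf-8")


def decode_record(blob):
    """Recover the fields from a stored byte array."""
    user_id, balance = HEADER.unpack_from(blob)
    name = blob[HEADER.size:].decode("utf-8")
    return user_id, balance, name


# A plain dict stands in for the key/value store; keys and values are bytes.
store = {}
store[b"user:7"] = encode_record(7, 12.5, "Ada")
record = decode_record(store[b"user:7"])
```

The store never inspects the value, so changing the layout later is a matter for the application alone, which must then handle records written in the old format.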

Despite having a simple architecture, Berkeley DB supports many advanced database features such as ACID transactions, fine-grained locking, an XA interface, hot backups and replication.

Editions

Berkeley DB comes in three different editions:

  • Berkeley DB
  • Berkeley DB Java Edition
  • Berkeley DB XML

These are three separate database libraries despite the common branding. The first is the traditional Berkeley DB, written in C.

Berkeley DB Java Edition is a pure Java database. It is similar, but not identical, in design to Berkeley DB, and it does not offer all the features of traditional Berkeley DB. Its advantage is that it is written in pure Java and requires no native code. It also has a different architecture, which gives it different performance and concurrency characteristics; these may be an advantage or a disadvantage depending on the application. It provides two APIs: one based on the Java Collections Framework (an object persistence approach) and one based on the traditional Berkeley DB API. Traditional Berkeley DB also supports a Java API, but it does so via JNI and thus requires the native library to be installed.

Berkeley DB XML is a database specialised for the storage of XML documents, supporting XQuery queries. It is implemented as an additional layer on top of Berkeley DB. It supports multiple language bindings, including C and Java. The Java binding uses JNI, so it is not a pure Java solution.

Programs that use Berkeley DB

Berkeley DB is the underlying storage system of several LDAP servers, database systems, and many other proprietary and free/open source applications.

Licensing

Versions 2.0 and higher of Berkeley DB are available under a dual license. Versions earlier than 2.0 are available under a BSD-like license that has an unusual additional clause similar to the GNU GPL version 2's Section 3.

The Sleepycat Public License requires that software using the Berkeley DB code be free/open source software, released under an OSI-approved license, if it is to be redistributed. If the application using the Sleepycat-licensed code is not redistributed (see the GPL's definition of 'redistribution'), the license terms are not violated. A publisher that redistributes an application using Berkeley DB under a non-free/closed source license is in violation of the Sleepycat Public License and must negotiate a separate license granting those rights from the copyright holder, in this case Oracle. Oracle offers a standard commercial Berkeley DB license agreement for this purpose.

References

  1. ^ Brunelli, Mark. "A Berkeley DB primer", Enterprise Linux News, March 28, 2005. Accessed October 18, 2007.