Berkeley DB
From Wikipedia, the free encyclopedia
Berkeley DB (BDB) is a computer software library that provides a high-performance embedded database, with bindings in C, C++, Java, Perl, Python, Ruby, Tcl, Smalltalk, and many other programming languages. BDB stores arbitrary key/data pairs as byte arrays, and supports multiple data items for a single key. BDB can support thousands of simultaneous threads of control or concurrent processes manipulating databases as large as 256 terabytes, on a wide variety of operating systems including most Unix-like and Windows systems, and real-time operating systems.
Origin
Berkeley DB was first developed at the University of California, Berkeley as part of the transition from BSD 4.3 to 4.4 and the effort to remove AT&T-encumbered code. In 1996 Netscape asked the authors of Berkeley DB to improve and extend the library, then at version 1.85, to suit its requirements for an LDAP server[1] and for use in the Netscape browser. That request led to the creation of Sleepycat Software, which was acquired by Oracle Corporation in February 2006. Berkeley DB is distributed under the Sleepycat Public License, an OSI- and FSF-approved license. The product ships with complete source code, build tools, test suite, and documentation. Its code quality and general utility, along with the free software/open source license, have led to its use in a multitude of free software/open source programs. Those who do not wish to abide by the terms of the Sleepycat Public License can purchase a proprietary license for redistribution from Oracle Corporation. This arrangement is called dual licensing.
Berkeley DB includes compatibility interfaces for some historic UNIX database libraries: dbm, ndbm and hsearch.
Architecture
Berkeley DB is notable for having a simple architecture compared with other database systems such as Microsoft SQL Server and Oracle. For example, it does not provide support for network access; programs access the database using in-process API calls. It does not support SQL or any other query language, nor does it support table schemas or columns. A program accessing the database is free to decide how the data is to be stored in a record, and Berkeley DB puts no constraints on the record's data. The record and its key can each be up to four gigabytes long.
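The schema-free, in-process model described above can be illustrated with a minimal Python sketch. It does not use Berkeley DB itself: Python's standard-library dbm.dumb module serves as a stand-in byte-oriented key/value store, and the record layout, the key "acct:42", and the function names are hypothetical choices made only for this illustration. The point is that the application, not the store, decides how a record's bytes are laid out.

```python
import dbm.dumb
import os
import struct
import tempfile

# Hypothetical record layout chosen by the application, not by the store:
# a 4-byte unsigned id and an 8-byte float balance, followed by a UTF-8 name.
HEADER = struct.Struct("<Id")


def encode_account(account_id, balance, name):
    """Pack the fields into the opaque byte string that gets stored."""
    return HEADER.pack(account_id, balance) + name.encode("utf-8")


def decode_account(raw):
    """Reverse encode_account; the store itself never interprets these bytes."""
    account_id, balance = HEADER.unpack_from(raw)
    name = raw[HEADER.size:].decode("utf-8")
    return account_id, balance, name


def roundtrip(path):
    """Store one record through in-process calls and read it back."""
    db = dbm.dumb.open(path, "c")  # stand-in store; "c" creates it if missing
    try:
        db[b"acct:42"] = encode_account(42, 99.5, "Ada")
        return decode_account(db[b"acct:42"])
    finally:
        db.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(roundtrip(os.path.join(tmp, "accounts")))
```

Because the store sees only bytes, changing the record layout is purely an application concern; the trade-off is that any querying beyond lookup by key must also be implemented by the application.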
Despite having a simple architecture, Berkeley DB supports many advanced database features such as ACID transactions, fine-grained locking, an XA interface, hot backups and replication.
Editions
Berkeley DB comes in three different editions:
- Berkeley DB
- Berkeley DB Java Edition
- Berkeley XML DB
These are three separate database libraries despite the common branding. The first is the traditional Berkeley DB, written in C.
Berkeley DB Java Edition is a pure Java database. It is similar, but not identical, in design to Berkeley DB, and it does not offer all the features of traditional Berkeley DB. However, it has the advantage of being written in pure Java, so it requires no native code. It also has a different architecture, which gives it different performance and concurrency characteristics; these may be advantageous or disadvantageous depending on the application. It provides two APIs: one based on the Java Collections Framework (an object persistence approach) and one based on the traditional Berkeley DB API. Note that traditional Berkeley DB also supports a Java API, but it does so via JNI and thus requires the native library to be installed.
Berkeley XML DB is a database specialised for the storage of XML documents, supporting XQuery queries. It is implemented as an additional layer on top of Berkeley DB. It supports multiple language bindings, including C and Java. The Java binding uses JNI, however, so it is not a pure Java solution.
Programs that use Berkeley DB
Berkeley DB is the underlying storage system of several LDAP servers, database systems, and many other proprietary and free/open source applications. Below is a list of notable programs that use Berkeley DB for data storage.
- Bogofilter: A free/open source spam filter that saves its wordlists using Berkeley DB.
- Caravel CMS: A free/open source content management system originally designed for the 2,000+ organizations of the Mennonite Church.
- Carbonado: An open source relational database access layer.
- Cfengine: A free/open source configuration management system, developed by Mark Burgess of Oslo University College.
- Citadel: A free/open source groupware platform that keeps all of its data stores, including the message base, in Berkeley DB.
- Jabberd2: A Jabber server.
- KDevelop: An IDE for Linux and other Unix-like operating systems.
- KLibido: A free/open source newsgroup reader tailored for binary downloads.
- Movable Type (until version 4.0): A proprietary weblog publishing system developed by California-based Six Apart.
- MySQL database system: A multithreaded, multi-user, SQL (Structured Query Language) database management system (DBMS) with an estimated six million installations. BDB is one of several data storage backends available for MySQL; others include MyISAM and InnoDB. Support for BDB was removed in version 5.1.
- OpenLDAP: A free/open source implementation of the Lightweight Directory Access Protocol (LDAP).
- Postfix: A fast, secure, easy to administer MTA for Linux/Unix systems.
- Redland: An RDF application framework that can use BDB for persistent storage (triplestore).
- RPM: The RPM Package Manager, which uses Berkeley DB to retain its internal database of packages installed on the system.
- Spamassassin: An anti-spam application.
- SquidGuard: A filtering proxy program (see Squid cache).
- Subversion: A version control system designed specifically to replace CVS.
- Sun Grid Engine: A free/open source distributed resource management system; the most popular batch-queueing job scheduler for server farms.
Licensing
Versions 2.0 and higher of Berkeley DB are available under a dual license. Versions earlier than 2.0 are available under a BSD-like license that has an unusual additional clause similar to the GNU GPL version 2's Section 3.
The Sleepycat Public License requires that software that uses the Berkeley DB code be free/open source software (under an OSI-approved license) if it is to be redistributed. If the application using the Sleepycat-licensed code is not redistributed (see the GPL's definition of 'redistribution'), the Sleepycat license terms are not broken. If an application uses Berkeley DB and is redistributed under a non-free/closed source license, its publisher is in violation of the Sleepycat Public License and must negotiate a new license allowing those rights from the copyright holder (in this case Oracle). Oracle offers a standard commercial Berkeley DB license agreement for this purpose.
References
- [1] Brunelli, Mark. "A Berkeley DB primer", Enterprise Linux News, March 28, 2005. Accessed October 18, 2007.