IDMS
IDMS (Integrated Database Management System) is a network-model (CODASYL) database management system first developed at B.F. Goodrich and later marketed by Cullinane Database Systems (renamed Cullinet in 1983). Since 1989 the product has been owned by Computer Associates, which renamed it CA-IDMS.
History
The roots of IDMS go back to Dr. Charles Bachman's IDS (Integrated Data Store), an early database engine developed at General Electric.
In the early 1960s IDS was taken from its original form by the Computer Group of the B.F. Goodrich Chemical Division and rewritten in a language called ISL (Intermediate System Language). ISL was designed as a portable system programming language able to produce code for a variety of target machines. Because ISL was itself written in ISL, it could be ported to other machine architectures with relative ease and could then produce code that would execute on them.
The Chemical Division computer group had given some thought to selling copies of IDMS to other companies, but was told by management that they were not in the software products business. Eventually a deal was struck with John Cullinane to buy the rights and market the product.
Because Cullinane was required to remit royalties to B.F. Goodrich, all add-on products were listed and billed as separate products, even if they were mandatory for the core IDMS product to work. This sometimes confused customers.
The original platforms were the GE 235 computer and the GE Datanet 30 message-switching computer; later the product was ported to IBM mainframes and to DEC and ICL hardware.
The IBM-ported version ran on IBM mainframe systems (System/360, System/370, System/390, zSeries, System z9). In the mid-1980s, it was claimed that some 2,500 IDMS licenses had been sold. Users included the Strategic Air Command, Ford of Canada, Royal Insurance, Manulife, Hudson's Bay Company, Cleveland Clinic, Bank of Canada and BT in the UK.
A version for use on the DECSYSTEM series of computers was sold to DEC and was marketed as DBMS10 and later DBMS20.
In 1976 the source code was sold to International Computers Ltd (ICL), which ported the software to run on its 2900 series mainframes, and subsequently also on the older 1900 range. ICL continued development of the software independently of Cullinane, selling the original ported product under the original name IDMS and an enhanced version as IDMSX. In this form it was used by many large UK users, an example being the Pay-As-You-Earn system operated by the UK Inland Revenue. Many of these systems were still running in 2005.
In the early to mid 1980s, relational database management systems started to become more popular, encouraged by increasing hardware power and the move to minicomputers and client-server architecture. Relational databases offered improved development productivity over CODASYL systems, and the traditional objections based on poor performance were steadily diminishing.
Cullinet attempted to compete against IBM's DB2 and other relational databases by developing a relational front-end and a range of productivity tools. These included Automatic System Facility (ASF), which made use of a pre-existing IDMS feature called LRF (Logical Record Facility). ASF was a fill-in-the-blanks database generator that would also develop a mini-application to maintain the tables.
It is difficult to judge whether such features succeeded in extending the selling life of the product, but they made little impact in the long term. Those users who stayed with IDMS were primarily interested in its high performance, not in its relational capabilities. It was widely recognized (helped by a high-profile campaign by E. F. Codd, the father of the relational model) that there was a significant difference between a true relational database and a traditional database with a relational veneer.
IDMS legacy systems are still being run today. Few customers have migrated to Cullinet's other database offering, IDMS/R.
Integrated Data Dictionary
One of the sophisticated features of IDMS was its built-in Integrated Data Dictionary (IDD). The IDD was primarily developed to maintain database definitions. It was itself an IDMS database.
DBAs (database administrators) and other users interfaced with the IDD using a language called Data Dictionary Definition Language (DDDL).
IDD was also used to store definitions and code for other products in the IDMS family such as ADS/Online and IDMS-DC.
IDD's power was that it was extensible and could be used to create definitions of just about anything. Some companies used it to develop in-house documentation.
Overview
Logical Data Model
The data model offered to users is the CODASYL network model. The main structuring concepts in this model are records and sets. Records essentially follow the COBOL pattern, consisting of fields of different types: this allows complex internal structure such as repeating items and repeating groups.
The most distinctive structuring concept in the CODASYL model is the set. Not to be confused with a mathematical set, a CODASYL set represents a one-to-many relationship between records: one owner, many members. The fact that a record can be a member of many different sets is the key factor that distinguishes the network model from the earlier hierarchical model. As with records, each set belongs to a named set type (different set types model different logical relationships). Sets are ordered, and the sequence of records in a set can be used to convey information. A record can participate as an owner and as a member of any number of sets.
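The record and set concepts can be illustrated with a minimal Python sketch. It is not IDMS code: the class names (`Record`, `SetOccurrence`), the set type names (`DEPT-EMP`, `PROJ-EMP`) and the sample data are hypothetical, chosen only to show one owner with ordered members and a record that is a member of two different set types.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Record:
    """A record occurrence: a named record type plus its field values."""
    record_type: str
    fields: dict


@dataclass
class SetOccurrence:
    """One occurrence of a set type: a single owner and an ordered member list."""
    set_type: str
    owner: Record
    members: list = field(default_factory=list)

    def connect(self, member: Record) -> None:
        # A record may appear at most once in a given set occurrence.
        if any(m is member for m in self.members):
            raise ValueError("record is already a member of this set")
        self.members.append(member)  # insertion order is preserved


# Hypothetical data: one employee is a member of two different set types.
dept = Record("DEPARTMENT", {"name": "Research"})
proj = Record("PROJECT", {"name": "Migration"})
alice = Record("EMPLOYEE", {"name": "Alice"})
bob = Record("EMPLOYEE", {"name": "Bob"})

dept_emp = SetOccurrence("DEPT-EMP", owner=dept)
proj_emp = SetOccurrence("PROJ-EMP", owner=proj)

dept_emp.connect(alice)
dept_emp.connect(bob)
proj_emp.connect(alice)  # same record, second set: the network-model case

print([m.fields["name"] for m in dept_emp.members])
print([m.fields["name"] for m in proj_emp.members])
```

In a strict hierarchy `alice` could have only one parent; here the same record hangs off both a department and a project.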
Records have identity, the identity being represented by a value known as a database key. In IDMS, as in most other CODASYL implementations, the database key is directly related to the physical address of the record on disk. Database keys are also used as pointers to implement sets in the form of linked lists and trees. This close correspondence between the logical model and the physical implementation (which is not a strictly necessary part of the CODASYL model, but was a characteristic of all successful implementations) is responsible for the efficiency of database retrieval, but also makes operations such as database loading and restructuring rather expensive.
Records can be accessed directly by database key, by following set relationships, or by direct access using key values. Initially the only direct access was through hashing, a mechanism known in the CODASYL model as CALC access. In IDMS, CALC access is implemented through an internal set, linking all records that share the same hash value to an owner record that occupies the first few bytes of every disk page.
In subsequent years, some versions of IDMS added the ability to access records using B-tree-like indexes.
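As a rough illustration of how a database key can encode a physical location, the following Python sketch packs a page number and a line (slot) number into one integer. The 24-bit/8-bit split is an assumption made for the example, not something stated in this article, and the function names are hypothetical.

```python
# Assumed layout for illustration only: 24-bit page number, 8-bit line number.
PAGE_BITS = 24
LINE_BITS = 8


def make_db_key(page: int, line: int) -> int:
    """Pack a page number and a line number into a single integer key."""
    if not 0 <= page < (1 << PAGE_BITS):
        raise ValueError("page number out of range")
    if not 0 <= line < (1 << LINE_BITS):
        raise ValueError("line number out of range")
    return (page << LINE_BITS) | line


def split_db_key(db_key: int) -> tuple:
    """Recover (page, line) from a packed key."""
    return db_key >> LINE_BITS, db_key & ((1 << LINE_BITS) - 1)


key = make_db_key(page=1001, line=3)
print(key, split_db_key(key))
```

Because the key names a page directly, following a pointer costs at most one page read; the same property is what makes moving records during a restructure expensive, since every pointer to a moved record has to be rewritten.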
Storage
IDMS organizes its databases as a series of files. These files are mapped and pre-formatted into so-called areas. The areas are subdivided into pages which correspond to physical blocks on the disk. The database records are stored within these blocks.
The DBA allocates a fixed number of pages in a file for each area. The DBA then defines which records are to be stored in each area, and details of how they are to be stored.
IDMS intersperses special space-allocation pages throughout the database. These pages are used to keep track of the free space available in each page in the database. To reduce I/O requirements, the free space is only tracked for pages where the used space is less than 30%.
Three methods are available for storing records in an IDMS database: Sequential, CALC, and VIA.
Sequential placement (not to be confused with indexed sequential) simply places each new record at the end of the area. This option is rarely used.
CALC uses a hashing algorithm to decide where to place the record; the hash key then provides efficient retrieval of the record. Every page in the CALC area is preformatted with a header containing a special CALC "owner" record. The hashing algorithm determines a page number (from which the physical disk address can be determined), and the record is then stored on this page, or as near as possible to it, and is linked to the owner record on that page through the CALC set, which is implemented as a singly linked list of pointers. The CALC owner in the page header thus owns the set of all records that hash to its page, whether they are stored on that page or, in the case of overflow, on another page.
CALC provides extremely efficient storage and retrieval: IDMS can retrieve a CALC record in an average of about 1.1 I/O operations. However, the method does not cope well with changes to the value of the primary key, and expensive reorganization is needed if the number of pages needs to be expanded.
VIA placement attempts to store a record near its owner in a particular set. Usually the records are clustered on the same physical page as the owner. This leads to efficient navigation when the record is accessed by following that set relationship. (VIA allows records to be stored in a different IDMS area so that they can be stored separately from the owner, yet remain clustered together for efficiency.)
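The CALC mechanism described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the IDMS implementation: the class `CalcArea`, the use of CRC-32 as the hash function, and the "next page with free space" overflow rule are all hypothetical choices made for the example.

```python
import zlib


class CalcArea:
    """Toy model of an area whose records are placed by hashing (CALC)."""

    def __init__(self, n_pages: int, page_capacity: int):
        self.n_pages = n_pages
        self.page_capacity = page_capacity
        # Records physically stored on each page.
        self.pages = [[] for _ in range(n_pages)]
        # Per-page CALC chain: (page, slot) keys of every record that
        # hashes to that page, wherever it is physically stored.
        self.calc_chain = [[] for _ in range(n_pages)]

    def target_page(self, key: str) -> int:
        # Assumed hash function; the real algorithm is not described here.
        return zlib.crc32(key.encode("utf-8")) % self.n_pages

    def store(self, key: str, data) -> tuple:
        target = self.target_page(key)
        # Try the target page first, then overflow to following pages.
        for offset in range(self.n_pages):
            page = (target + offset) % self.n_pages
            if len(self.pages[page]) < self.page_capacity:
                self.pages[page].append((key, data))
                db_key = (page, len(self.pages[page]) - 1)
                self.calc_chain[target].append(db_key)
                return db_key
        raise RuntimeError("area is full")

    def find(self, key: str) -> tuple:
        """Return (data, pages_read); data is None if the key is absent."""
        target = self.target_page(key)
        pages_read = {target}
        for page, slot in self.calc_chain[target]:
            pages_read.add(page)
            stored_key, data = self.pages[page][slot]
            if stored_key == key:
                return data, len(pages_read)
        return None, len(pages_read)


area = CalcArea(n_pages=8, page_capacity=4)
area.store("CUST-0001", {"name": "Acme"})
print(area.find("CUST-0001"))
```

When a record sits on its target page, `find` touches one page; only overflowed records cost extra page reads. The sketch also shows why growing the area is expensive: changing `n_pages` changes every record's target page.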
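A minimal Python sketch of the clustering idea behind VIA placement is shown below. The function name `via_page` and the search order (owner's page first, then pages at increasing distance, preferring the following page over the preceding one) are assumptions for illustration; the article does not describe the exact search IDMS performs.

```python
def via_page(free_slots, owner_page):
    """Pick the page nearest to the owner's page that still has a free slot.

    free_slots[i] is the number of free record slots on page i.
    """
    n = len(free_slots)
    for distance in range(n):
        # Distance 0 checks the owner's own page; after that, look outward.
        for page in (owner_page + distance, owner_page - distance):
            if 0 <= page < n and free_slots[page] > 0:
                return page
    raise RuntimeError("no free space in area")


# Owner lives on page 2, which is full, so a neighbouring page is chosen.
print(via_page([0, 1, 0, 2], owner_page=2))
```

Keeping members on or next to the owner's page means that walking the set usually stays within pages that have already been read.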
Sets are generally maintained as linked lists, using the database key as a pointer. Every record includes a forwards link to the next record; the database designer can choose whether to include owner pointers and prior pointers (if not provided, navigation in those directions will be slower).
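The cost of omitting optional pointers can be seen in a small Python sketch. It assumes the chain is circular, with the last member pointing back to the owner; that ring layout, and the names `Node`, `connect_last` and `find_owner`, are assumptions made for the example rather than details taken from this article.

```python
class Node:
    """A record in a set chain, with a mandatory next pointer."""

    def __init__(self, name, is_owner=False):
        self.name = name
        self.is_owner = is_owner
        self.next = self    # forward pointer; an empty ring points to itself
        self.owner = None   # optional owner pointer


def connect_last(owner, member, with_owner_pointers=False):
    """Append member at the end of the ring that starts at owner."""
    last = owner
    while last.next is not owner:
        last = last.next
    last.next = member
    member.next = owner
    if with_owner_pointers:
        member.owner = owner


def find_owner(member):
    """Return (owner, hops), where hops counts pointers followed."""
    if member.owner is not None:
        return member.owner, 1
    hops = 0
    node = member
    while not node.is_owner:
        node = node.next
        hops += 1
    return node, hops


order = Node("ORDER-17", is_owner=True)
lines = [Node("LINE-%d" % i) for i in range(1, 4)]
for line in lines:
    connect_last(order, line)

owner, hops = find_owner(lines[0])
print(owner.name, hops)
```

Without owner pointers, reaching the owner from the first member means following the rest of the chain; with them it is a single hop, at the cost of one extra pointer stored in every member record.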
Some versions of IDMS subsequently included the ability to define indexes: either record indexes, allowing records to be located from knowledge of a secondary key, or set indexes, allowing the members of a set to be retrieved by key value.
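The idea of a record index can be sketched in Python as a sorted list of secondary key values, each paired with a database key. This is an illustration only: the class `RecordIndex` is hypothetical, and a sorted list searched with `bisect` stands in for whatever tree structure a real implementation would use.

```python
import bisect


class RecordIndex:
    """Toy secondary index: sorted key values mapped to database keys."""

    def __init__(self):
        self._keys = []     # secondary key values, kept sorted
        self._db_keys = []  # database key stored at the same position

    def add(self, key, db_key):
        pos = bisect.bisect_right(self._keys, key)
        self._keys.insert(pos, key)
        self._db_keys.insert(pos, db_key)

    def lookup(self, key):
        """Return the database keys of all records with this key value."""
        lo = bisect.bisect_left(self._keys, key)
        hi = bisect.bisect_right(self._keys, key)
        return self._db_keys[lo:hi]

    def key_range(self, low, high):
        """Return database keys for key values in [low, high], in key order."""
        lo = bisect.bisect_left(self._keys, low)
        hi = bisect.bisect_right(self._keys, high)
        return self._db_keys[lo:hi]


index = RecordIndex()
index.add("SMITH", (12, 1))
index.add("JONES", (40, 3))
index.add("SMITH", (77, 0))
print(index.lookup("SMITH"))
print(index.key_range("A", "M"))
```

Unlike CALC access, an ordered index of this kind supports range retrieval as well as exact-match lookup.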