Database tuning

Database tuning describes a group of activities used to optimize and homogenize the performance of a database. It usually overlaps with query tuning, but refers to the design of the database files, the selection of the database management system (DBMS) application, and the configuration of the database's environment (operating system, CPU, etc.).

Database tuning aims to maximize use of system resources to perform work as efficiently and rapidly as possible. Most systems are designed to manage their use of system resources, but there is still much room to improve their efficiency by customizing their settings and configuration for the database and the DBMS.

I/O tuning

The hardware and software configuration of the disk subsystem is examined: RAID levels and configuration,[1] block and stripe size allocation, and the configuration of disks, controller cards, storage cabinets, and external storage systems such as SANs. Transaction logs and temporary spaces are heavy consumers of I/O and affect performance for all users of the database, so placing them appropriately is crucial.

Frequently joined tables and their indexes are placed on separate disks so that, when they are requested from file storage, they can be retrieved in parallel. Frequently accessed tables and indexes are likewise spread across separate disks to balance I/O and prevent read queuing.
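
As an illustration of such placement, the following sketch assumes a PostgreSQL server, the psycopg2 driver, and mount points such as /mnt/disk1 and /mnt/disk2 standing in for separate physical disks; all object names and paths are placeholders, and other DBMSs expose the same idea through different mechanisms (file groups, ASM disk groups, and so on).

# Illustrative sketch: place a heavily used table and one of its indexes on
# separate disks using PostgreSQL tablespaces. Connection details, paths,
# and object names are placeholders; the directories must already exist and
# be owned by the database server's operating system user.
import psycopg2

conn = psycopg2.connect("dbname=sales user=dba")
conn.autocommit = True  # CREATE TABLESPACE cannot run inside a transaction
cur = conn.cursor()

# One tablespace per physical disk.
cur.execute("CREATE TABLESPACE data_disk LOCATION '/mnt/disk1/pgdata'")
cur.execute("CREATE TABLESPACE index_disk LOCATION '/mnt/disk2/pgdata'")

# Keep the table and its most frequently used index on different disks so
# reads of the two can proceed in parallel.
cur.execute("""
    CREATE TABLE orders (
        order_id    bigint PRIMARY KEY,
        customer_id bigint,
        placed_at   timestamptz
    ) TABLESPACE data_disk
""")
cur.execute("CREATE INDEX orders_customer_idx ON orders (customer_id)"
            " TABLESPACE index_disk")

cur.close()
conn.close()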

DBMS tuning

DBMS tuning refers to tuning of the DBMS itself and the configuration of the memory and processing resources of the computer running it. This is typically done by configuring the DBMS, but the resources involved are shared with the host system.

Tuning the DBMS can involve setting the recovery interval (the time needed to restore the state of the data to a particular point in time), assigning parallelism (the breaking up of work from a single query into tasks assigned to different processing resources), and choosing the network protocols used to communicate with database consumers.
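
The mechanism for such settings is DBMS-specific. The sketch below assumes PostgreSQL and the psycopg2 driver; checkpoint_timeout (which bounds how much work crash recovery must redo) and max_parallel_workers_per_gather are PostgreSQL parameter names, and the values shown are arbitrary examples rather than recommendations.

# Illustrative sketch: adjust the checkpoint interval and per-query
# parallelism on a PostgreSQL server. Values are examples only.
import psycopg2

conn = psycopg2.connect("dbname=sales user=dba")
conn.autocommit = True  # ALTER SYSTEM cannot run inside a transaction
cur = conn.cursor()

cur.execute("ALTER SYSTEM SET checkpoint_timeout = '15min'")         # recovery-time trade-off
cur.execute("ALTER SYSTEM SET max_parallel_workers_per_gather = 4")  # workers per parallel query step
cur.execute("SELECT pg_reload_conf()")                               # apply settings that do not need a restart

cur.close()
conn.close()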

Memory is allocated for data, execution plans, the procedure cache, and work space. Accessing data in memory is much faster than accessing data on storage, so maintaining a sizable data cache makes activities perform faster; the same consideration applies to work space. Caching execution plans and procedures means that they are reused instead of recompiled when needed. It is important to allocate as much memory as possible to the DBMS while leaving enough for other processes and the OS to use without excessive paging of memory to storage.
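
How much memory to give each area is usually expressed as a handful of configuration parameters. The following self-contained sketch (Linux only) derives example values from the machine's total RAM; the 25% and 5% fractions and the PostgreSQL-style parameter names are illustrative assumptions, not fixed rules.

# Illustrative sizing sketch: compute example memory settings from total RAM.
import os

page_size  = os.sysconf("SC_PAGE_SIZE")
page_count = os.sysconf("SC_PHYS_PAGES")
total_ram_mb = page_size * page_count // (1024 * 1024)

data_cache_mb = total_ram_mb // 4   # ~25% of RAM for the shared data cache
work_space_mb = total_ram_mb // 20  # ~5% of RAM for sort/hash work space
                                    # (per operation in PostgreSQL, so real values are usually smaller)

# Print the statements rather than apply them; shared_buffers, for example,
# only takes effect after a server restart.
print(f"ALTER SYSTEM SET shared_buffers = '{data_cache_mb}MB';")
print(f"ALTER SYSTEM SET work_mem = '{work_space_mb}MB';")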

Processing resources are sometimes assigned to specific activities to improve concurrency. On a server with eight processors, for example, six could be reserved for the DBMS to maximize the processing resources available to the database.
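
One way to express such a reservation at the operating-system level is CPU affinity. The sketch below is Linux-only and assumes the DBMS server's process ID is recorded in a pid file at a placeholder path.

# Illustrative sketch: pin the DBMS server process to six of eight logical
# CPUs (0-5), leaving CPUs 6 and 7 for the operating system and other work.
import os

dbms_pid = int(open("/var/run/dbms.pid").read())    # placeholder pid file path
os.sched_setaffinity(dbms_pid, {0, 1, 2, 3, 4, 5})  # reserve CPUs 0-5 for the DBMS

Many DBMSs also expose their own affinity or worker-count settings, which achieve the same effect without touching the operating system.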

Database maintenance

Database maintenance includes backups, column statistics updates, and defragmentation of data inside the database files.[2]

On a heavily used database, the transaction log grows rapidly. Transaction log entries must be removed from the log to make room for future entries; in many systems it is the act of backing up the log that allows the backed-up entries to be truncated. Frequent transaction log backups are smaller, so they interrupt database activity for shorter periods of time.
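
As a rough sketch of what a frequent log-backup job can look like, the following assumes Microsoft SQL Server, the pyodbc driver, a database named sales in the FULL recovery model, and a 15-minute interval; all names, paths, and the interval are placeholders, and a real installation would use the DBMS's own job scheduler or cron rather than a Python loop.

# Illustrative sketch: take small, frequent transaction log backups.
import time
import pyodbc

conn = pyodbc.connect("DSN=sales;UID=dba;PWD=secret", autocommit=True)
cur = conn.cursor()

while True:
    stamp = time.strftime("%Y%m%d_%H%M%S")
    cur.execute(f"BACKUP LOG sales TO DISK = N'/backups/sales_log_{stamp}.trn'")
    while cur.nextset():   # consume informational messages so the backup finishes
        pass
    time.sleep(15 * 60)    # frequent, small backups instead of rare, large ones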

A DBMS uses statistics histograms to find data in a range within a table or index. Statistics updates should be scheduled frequently and should sample as much of the underlying data as possible. Accurate, up-to-date statistics allow the query engine to make good decisions about execution plans and to locate data efficiently.
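
The maintenance pattern is similar across systems. The self-contained sketch below uses SQLite's ANALYZE command through Python's standard sqlite3 module (SQLite keeps simpler per-index statistics rather than full histograms, but the idea of refreshing optimizer statistics after bulk changes is the same); the table and column names are made up for the example.

# Self-contained sketch: refresh optimizer statistics after bulk changes.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.execute("CREATE INDEX orders_customer_idx ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id) VALUES (?)",
    [(i % 100,) for i in range(10_000)],
)
conn.commit()

conn.execute("ANALYZE")                               # gather statistics for the query planner
for row in conn.execute("SELECT * FROM sqlite_stat1"):
    print(row)                                        # per-index statistics the planner will use
conn.close()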

Defragmentation of table and index data increases the efficiency of data access. The amount of fragmentation depends on the nature of the data, how it changes over time, and the amount of free space left in database pages to absorb inserts without creating additional pages.
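
In SQLite, for example, the REINDEX and VACUUM commands rebuild indexes and rewrite the database file to reclaim free pages; other DBMSs offer comparable operations (index rebuilds, reorganize commands, and so on). The file name below is a placeholder.

# Illustrative sketch: rebuild indexes and compact an SQLite database file.
import sqlite3

conn = sqlite3.connect("sales.db")
conn.execute("REINDEX")  # rebuild every index in the main database
conn.execute("VACUUM")   # rewrite the file, removing free pages and fragmentation
conn.close()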

References