Database tuning
From Wikipedia, the free encyclopedia
Database tuning describes a group of activities used to optimize and homogenize the performance of a database. It usually overlaps with query tuning, but refers to configuration of the database files, the database management system (DBMS), and the operating system and hardware the DBMS runs on.
The goal is to maximize use of system resources to perform work as efficiently and rapidly as possible. Most systems are designed to manage work efficiently, but it is possible to greatly improve performance by customizing settings and the configuration for the database and the DBMS being tuned.
[edit] I/O tuning
I/O tuning is placing database transaction logs, files associated with temporary work spaces, and table and index file storage to optimize and balance reads and writes against these files. I/O is generally the most expensive operation in database work, and is typically the first bottleneck in database performance encountered.
Hardware and software configuration of disk subsystems are examined: RAID levels and configuration, block and stripe size allocation, and the configuration of disks, controller cards, storage cabinets, and external storage systems such as a SAN. Transaction logs and temporary spaces are heavy consumers of I/O, and affect performance for all users of the database. Placing them appropriately is crucial.
Frequently joined tables and indexes are placed so that as they are requested from file storage, they can be retrieved in parallel from separate disks simultaneously. Frequently accessed tables and indexes are placed on separate disks to balance I/O and prevent read queuing.
[edit] DBMS tuning
DBMS tuning refers to tuning of the DBMS and the configuration of the memory and processing resources of the computer running the DBMS. This is typically done through configuring the DBMS, but the resources involved are shared with the host system.
Tuning the DBMS can involve setting the recovery interval (time needed to restore the state of data to a particular point in time), assigning parallelism (the breaking up of work from a single query into tasks assigned to different processing resources), and network protocols used to communicate with database consumers.
Memory is allocated for data, execution plans, procedure cache, and work space. It is much faster to access data in memory than data on storage, so maintaining a sizable cache of data makes activities perform faster. The same consideration is given to work space. Caching execution plans and procedures means that they are reused instead of recompiled when needed. It is important to take as much memory as possible, while leaving enough for other processes and the OS to use without excessive paging of memory to storage.
Processing resources are sometimes assigned to specific activities to improve concurrency. On a server with eight processors, six could be reserved for the DBMS to maximize available processing resources for the database.
[edit] Database maintenance
Database maintenance includes backups, column statistics updates, and defragmentation of data inside the database files.
On a heavily used database, the transaction log grows rapidly. Transaction log entries must be removed from the log to make room for future entries. Frequent transaction log backups are smaller, so they interrupt database activity for shorter periods of time.
DBMS use statistic histograms to find data in a range against a table or index. Statistics updates should be scheduled frequently and sample as much of the underlying data as possible. Accurate and updated statistics allow query engines to make good decisions about execution plans, as well as efficiently locate data.
Defragmentation of table and index data increases efficiency in accessing data. The amount of fragmentation depends on the nature of the data, how it is changed over time, and the amount of free space in database pages to accept inserts of data without creating additional pages.