Mean time between failures

From Wikipedia, the free encyclopedia

Mean time between failures (MTBF) is the mean (average) time between failures of a system. It is often attributed to the "useful life" of the device, i.e. not including "infant mortality" or "end of life" if the device is not repairable. Calculations of MTBF assume that a system is "renewed", i.e. fixed, after each failure and then immediately returned to service. The average time between failing and being returned to service is termed mean down time (MDT) or mean time to repair (MTTR).

Overview

[Figure: Time between failures]

\text{Mean time between failures} = \text{MTBF} = \frac{\sum (\text{downtime} - \text{uptime})}{\text{number of failures}}.

Note: For each observation, downtime is the instant at which the system went down, which is later than (i.e. greater than) uptime, the instant at which it went up. The difference (downtime - uptime) is the length of time the system was operating between these two events.
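
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation: the function name, the (uptime, downtime) pair representation and the sample timestamps are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def mean_time_between_failures(periods: Sequence[Tuple[float, float]]) -> float:
    """Average operating time per failure.

    Each element of `periods` is an (uptime, downtime) pair: the instant the
    system went up and the later instant it went down again.
    """
    if not periods:
        raise ValueError("at least one failure is required")
    total_operating = 0.0
    for up, down in periods:
        if down < up:
            raise ValueError("downtime must not precede uptime")
        total_operating += down - up  # time spent operating in this period
    return total_operating / len(periods)


# Hypothetical observations, in hours since monitoring began.
observations = [(0.0, 100.0), (110.0, 250.0), (255.0, 400.0)]
mtbf_hours = mean_time_between_failures(observations)
print(mtbf_hours)  # (100 + 140 + 145) / 3 hours
```

The gaps between one downtime and the next uptime (10 and 5 hours here) are repair time and do not enter the sum.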

Formal definition of MTBF

Mathematically, the MTBF is the sum of the MTTF (mean time to failure) and MTTR (mean time to repair). For a constant failure rate \lambda, the MTTF is simply the reciprocal of the failure rate,

\text{MTTF} = \frac{1}{\lambda}.

The MTTF is often denoted by the symbol \theta, or

\text{MTTF} = \theta.

Since failure rate and MTTF are simply reciprocals, both notations are found in the literature, depending on which notation is most convenient for the application.

The MTTF can be defined in terms of the expected value of the failure density function f(t):

\text{MTTF} = \int_{0}^{\infty} t f(t)\, dt

with

\int_{0}^{\infty} f(t)\, dt = 1.

The MTTR can be similarly derived from the repair rate.
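
The expected-value definition of the MTTF can be checked numerically. The sketch below assumes an exponential failure density f(t) = \lambda e^{-\lambda t} and an illustrative failure rate of 0.002 failures per hour; the function names, the truncation point of the integral and the step count are choices made for this example only.

```python
import math


def exponential_density(t: float, failure_rate: float) -> float:
    """Failure density f(t) = lambda * exp(-lambda * t) (assumed distribution)."""
    return failure_rate * math.exp(-failure_rate * t)


def mttf_numeric(failure_rate: float, upper: float, steps: int) -> float:
    """Trapezoidal estimate of the integral of t * f(t) from 0 to `upper`."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at both ends
        total += weight * t * exponential_density(t, failure_rate)
    return total * h


failure_rate = 0.002  # assumed value, failures per hour
# Truncate the infinite upper limit at 50 mean lifetimes; the tail is negligible.
estimate = mttf_numeric(failure_rate, upper=50.0 / failure_rate, steps=200_000)
print(estimate, 1.0 / failure_rate)
```

Under these assumptions the numerical estimate agrees closely with the reciprocal 1/\lambda = 500 hours.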

A common misconception about the MTBF is that it specifies the time (on average) when the probability of failure equals the probability of not having a failure. This is only true for certain symmetric distributions. In many cases, such as the (non-symmetric) exponential distribution, it does not hold. In particular, for an exponential failure distribution, the probability that an item will have failed by a time equal to the MTBF is approximately 0.63. For typical distributions with some variance, MTBF is only a top-level aggregate statistic and is not suitable for predicting a specific time to failure; the uncertainty arises from the variability in the time-to-failure distribution.
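
For an exponential failure distribution, the figure of about 0.63 is the probability of failure by time t = MTBF, which is 1 - e^{-1}. The sketch below computes it and, for comparison, the median life, which is the time at which failure and survival are equally likely. The MTBF of 1000 hours and the function name are assumptions for illustration.

```python
import math

mtbf = 1000.0              # assumed MTBF in hours
failure_rate = 1.0 / mtbf  # constant failure rate of the exponential model


def probability_failed_by(t: float, failure_rate: float) -> float:
    """Exponential CDF: probability that an item has failed by time t."""
    return 1.0 - math.exp(-failure_rate * t)


p_at_mtbf = probability_failed_by(mtbf, failure_rate)
median_life = math.log(2.0) / failure_rate  # time at which the CDF equals 0.5

print(round(p_at_mtbf, 3))    # 0.632
print(round(median_life, 1))  # 693.1
```

In this model the even-odds point falls at roughly 69% of the MTBF, not at the MTBF itself.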

In commercial product descriptions, the "MTTF lifetime" is the amount of time the product should last, assuming that it is used properly.

Variations of MTBF

There are many variations of MTBF, such as mean time between system aborts (MTBSA), mean time between critical failures (MTBCF) and mean time between unit replacement (MTBUR). Such nomenclature is used when it is desirable to differentiate among types of failures, such as critical and non-critical failures. For example, in an automobile, the failure of the FM radio does not prevent the primary operation of the vehicle. Mean time to failure (MTTF) is sometimes used instead of MTBF in cases where a system is replaced after a failure, since MTBF denotes time between failures in a system which is repaired.
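
The distinction between failure types can be illustrated with a small sketch that derives both an MTBF and an MTBCF from the same failure log. It assumes each measure is the total operating time divided by the number of failures of the relevant kind; the record structure, the operating hours and the logged failures are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Failure:
    description: str
    critical: bool  # True if the failure prevents the primary operation


def mean_time_between(operating_hours: float, failure_count: int) -> float:
    """Operating time per failure; undefined when no failures were observed."""
    if failure_count == 0:
        raise ValueError("no failures observed; the mean is undefined")
    return operating_hours / failure_count


# Hypothetical log for one vehicle over 6000 operating hours.
operating_hours = 6000.0
failures: List[Failure] = [
    Failure("FM radio stopped working", critical=False),
    Failure("engine would not start", critical=True),
    Failure("interior light failed", critical=False),
]

mtbf = mean_time_between(operating_hours, len(failures))
mtbcf = mean_time_between(operating_hours, sum(f.critical for f in failures))
print(mtbf, mtbcf)  # 2000.0 6000.0
```

Counting only critical failures gives a longer interval than counting every failure, which is why the two figures are reported under different names.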

Problems with MTBF

As of 1995, the use of MTBF in the aeronautical industry (and others) has been called into question due to the inaccuracy of its application to real systems and the nature of the culture which it engenders. Many component MTBFs are given in databases, and often these values are very inaccurate.

This has led to the negative exponential distribution being used much more than it should have been. Some estimates say that only 40% of components have failure rates described by this distribution. MTBF has also been corrupted into the notion of an "acceptable" level of failures, which removes the desire to get to the root cause of a problem and take measures to eliminate it. The British Royal Air Force is looking at other methods to describe reliability, such as maintenance-free operating period (MFOP).
