A highly accelerated life test (HALT) is a stress testing methodology for improving product reliability during the engineering development process. It is commonly applied to electronic equipment and is performed to identify, and thus help resolve, design weaknesses in newly developed equipment. It thereby greatly reduces the probability of in-service failures (i.e., it increases the product's reliability). Progressively more severe environmental stresses are applied, building to a level significantly beyond what the equipment will see in service. By this method weaknesses can be identified using a small number of samples (sometimes one or two, but preferably at least five) in the shortest possible time and at the least expense. A second function of HALT is that it characterises the equipment under test and identifies the equipment's safe operating limits and design margins. Data from a HALT is therefore used as a basis for the design of an optimal "HASS" or "ESS" test, which is used to screen every piece of production equipment for latent manufacturing defects and defective components. HASS, or "highly accelerated stress screening", is an extension of HALT, but is applied during production.
Individual components, populated printed circuit boards, and whole electronic systems can be subjected to HALT. The size of the test sample is governed by many factors including the number of samples available, cost, type of stresses applied, and physical size. For example, component manufacturers can typically test thousands of individual components at one time, whereas it is often not economically feasible to write off more than a few items of very expensive equipment because the production quantities or the application do not justify the cost. A general principle is that while HALT can and should be conducted at unit level, it is very desirable to conduct it at sub-assembly and piece-part level as well.
Temperature cycling, random vibration, power margining and power cycling are the most common forms of failure acceleration for electronic equipment. HALT does not measure or determine equipment reliability, but it does serve to improve the reliability of a product. It is an empirical method used across industry to identify the limiting failure modes of a product and the stresses at which these failures occur.
A significant advantage of accelerated life testing is that it can be conducted during the development phase of a product to weed out design problems and marginal components. Thus a consumer products company can achieve better customer satisfaction because fewer products have to be returned for repair, and can also save money on warranty returns, or an aerospace manufacturer can avoid catastrophic failures in aircraft or space vehicles. Another major advantage is that the design team can be moved on to designing new products rather than becoming occupied with problems in older products.
On military design and development programs HALT is conducted before qualification testing. By so doing, significant cost savings can be accrued, because the formal qualification of the equipment and subsequent customer acceptance will proceed more rapidly and at lower cost, and the need for multiple redesigns and repeat testing (regression tests) will be greatly reduced or eliminated.
Several standards and test methods are available for a HALT test. Different stresses are applied with different failures occurring during each. The types of stress typically employed are:
In HALT these stresses are applied in a controlled, incremental fashion while the unit under test is continuously monitored for failures. Once the weaknesses of the product are uncovered and corrective actions taken, the limits of the product are clearly understood and the operating margins have been extended as far as possible. The result is that a much more mature product can be introduced much more quickly with a higher degree of reliability.
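The controlled, incremental application of stress described above can be sketched as a simple step-stress loop. This is a minimal illustration only, not a prescribed HALT procedure: the function names, the step size, and the hypothetical unit that stops working above 95 degrees Celsius are all assumptions made for the example.

```python
from typing import Callable, Optional


def step_stress(
    start: float,
    step: float,
    max_level: float,
    passes_at: Callable[[float], bool],
) -> Optional[float]:
    """Raise the stress level in fixed steps until the unit fails.

    Returns the first level at which a failure is observed, or None if
    the unit survives every level up to and including max_level.
    """
    level = start
    while level <= max_level:
        # The unit is monitored at every step, as in the text above.
        if not passes_at(level):
            return level
        level += step
    return None


def hypothetical_unit_ok(temp_c: float) -> bool:
    """Hypothetical unit under test: works up to 95 degrees C (assumed)."""
    return temp_c <= 95.0


first_failure = step_stress(
    start=20.0, step=10.0, max_level=150.0, passes_at=hypothetical_unit_ok
)
print(first_failure)
```

In practice the stress level that first produces a failure would be recorded, the failure analysed, and the stepping resumed once a corrective action or workaround is in place.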
When HALT is applied during the design process, it can produce a very robust product without undue cost, because improvements are targeted only where they are needed. As failure modes are discovered, understood and corrected, the product life can increase significantly. This makes the product more robust and drastically reduces the risk of failure.
Individual components such as resistors, capacitors, and diodes, printed circuit boards, and whole electronic products such as cell phones, PDAs, and televisions eventually fail at different rates under different end-user stress levels. For example, a typical consumer-owned television is unlikely to be operated at temperatures outside the range of normal living accommodations, or subjected to mechanical stress by being repeatedly dropped. A cell phone, on the other hand, may be dropped from a height of 3 or 4 feet fairly often, and subjected to a varying range of vibrations. A commercial telephone switch may be required to operate in remote installations ranging from Barrow, Alaska to Phoenix, Arizona, over an ambient temperature range of minus 50 to plus 120 degrees Fahrenheit. Components used in military and aerospace applications may be subjected to even more severe operating temperature requirements as well as high G-forces and ionizing radiation, sometimes simultaneously, to meet specific MIL-SPEC standards.
Therefore, failure rate data used to select any device in a product must correlate with the stress levels in the product or application. To accomplish this, the required lifetime and operating conditions for the product into which the components are designed must first be determined. For instance, in the above examples, the television set may only be required to operate through its warranty period, whereas the telephone switch may be required to operate without being serviced for ten years or more. Components used in a missile may only be required to operate for a few hours of testing and a few minutes of actual use, but their failure rate will be expected to be zero during that period. Devices used in satellites or space vehicles where replacement is not possible are expected to have a zero failure rate for the lifetime of the vehicle. Knowing the required failure rate as determined by the application, components can be selected based on the failure rate data supplied by the manufacturer as described above.
Typically, components used in consumer devices are chosen by finding the least expensive component which will meet the requirement for the warranty period. At the other end of the scale, components used in aerospace applications are more likely to be chosen for maximum reliability independent of cost. Moreover, due to cost, warranty failures are usually not expected to be zero during the warranty period, but rather to not exceed a level which might subject the manufacturer to unwanted publicity or legal action. Additionally, the failure rate for critical components may be required to be lower than for other components in a system. For instance, components in an automobile which may cause so-called "walk-home failures" are usually subject to higher reliability requirements than are components in the automobile's entertainment or security systems.
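The selection rule described above for consumer devices (the least expensive component that still meets the required failure rate) can be sketched as follows. The part names, costs, and failure rates are invented purely for illustration and do not come from any manufacturer's data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    unit_cost: float
    # Predicted fraction of parts failing within the required period.
    failure_rate: float


def cheapest_acceptable(
    candidates: Sequence[Candidate], max_failure_rate: float
) -> Optional[Candidate]:
    """Return the lowest-cost candidate meeting the failure-rate limit.

    Returns None when no candidate meets the requirement.
    """
    acceptable = [c for c in candidates if c.failure_rate <= max_failure_rate]
    return min(acceptable, key=lambda c: c.unit_cost, default=None)


# Hypothetical catalogue (all values are assumptions).
catalogue = [
    Candidate("part_a", unit_cost=0.10, failure_rate=0.02),
    Candidate("part_b", unit_cost=0.25, failure_rate=0.005),
    Candidate("part_c", unit_cost=0.90, failure_rate=0.0001),
]

choice = cheapest_acceptable(catalogue, max_failure_rate=0.01)
print(choice.name if choice else "no acceptable part")
```

An aerospace-style selection, by contrast, would rank the same candidates by failure rate rather than by cost.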
Once the product or device is deployed or sold into the marketplace, proper quality control procedure requires that the "quality loop" be closed by retrieving all components which fail in the field during the predicted lifetime, analyzing them to determine why they failed before they were predicted to fail, and determining where the reliability failure prediction was in error. Information from these analyses should then be used for appropriate corrective action in the reliability failure prediction methods.
An environmental chamber designed for HALT is required for satisfactory and successful testing. A temperature ramp rate of at least 45 degrees Celsius per minute is required, and some HALT chambers can achieve closer to 60 degrees Celsius per minute. To achieve these high ramp rates, cooling by means of the evaporation of liquid nitrogen is required. Chambers equipped with one or two compressors for cooling, such as those typically employed during qualification testing, are not adequate. Such chambers are typically only capable of 15–18 degrees Celsius per minute, and are subject to frequent breakdowns if always pushed to their limit.
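To give a feel for what these ramp rates mean, the short calculation below compares the time needed to traverse one temperature excursion at different rates. The excursion of minus 50 to plus 100 degrees Celsius is an assumed example, and the calculation is idealised: it uses the chamber air ramp rate and ignores the thermal lag of the product itself.

```python
def ramp_time_minutes(t_low_c: float, t_high_c: float, rate_c_per_min: float) -> float:
    """Idealised time to ramp between two temperatures at a constant rate."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(t_high_c - t_low_c) / rate_c_per_min


# Assumed excursion for illustration: -50 C to +100 C (150 C span).
for rate in (15.0, 45.0, 60.0):
    minutes = ramp_time_minutes(-50.0, 100.0, rate)
    print(f"{rate:.0f} C/min: {minutes:.1f} min per ramp")
```

Under these assumptions a compressor-cooled chamber at 15 degrees Celsius per minute needs 10 minutes per ramp, against roughly 3.3 minutes at 45 degrees Celsius per minute, so many more cycles fit into the same test time.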
A suitable chamber also has to be capable of applying random vibration with a suitable spectral density profile in relation to frequency. It has to do this at the same time as it applies temperature cycling, so a combined chamber is essential rather than separate chambers for vibration and temperature cycling. Whereas chambers used for qualification testing often use electro-dynamic vibrators (like a giant moving-coil loudspeaker), chambers designed for HALT use pneumatic hammers arranged to strike the baseplate upon which the equipment is mounted. The hammers produce six-degree-of-freedom random vibration: simultaneous vibration along each principal axis and rotation about each axis. This compares with the single axis of vibration provided by electro-dynamic vibrators.
Test fixture design for HALT requires that the random vibration be transmitted to the item under test such that all areas and components are stressed as nearly as possible to the same degree. Thus a successful fixture is one that has been designed or adjusted to minimise resonances at specific frequencies. At the same time, a successful fixture is one that is adequately open in design or uses forced air circulation so that ramp rates are maximised.
During a HALT the equipment under test has to be operated and its operation monitored, so that if the equipment fails whilst being stressed the failure is detected. The failure may only be present whilst the stress is applied and may not cause permanent degradation that would be apparent after the stress is removed. It can be a challenge to adequately monitor and diagnose failure conditions. All failures during HALT are subject to failure analysis and root cause analysis. Not all failures result in redesign. Some failures are accepted as operational limits: at some level of high stress it may be decided that, because the stress level sufficiently exceeds the operational requirement, an adequate design margin exists and no further design improvement is necessary or economically justified.
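The decision to accept a failure as an operational limit can be illustrated with a simple margin check. The sketch below is only an assumption about how such a rule might be written down: the numbers, the function names, and the idea of a single minimum-margin threshold are hypothetical, and real programmes set their own acceptance criteria.

```python
def design_margin(observed_limit: float, required_limit: float) -> float:
    """Margin between the stress at which failure occurred and the requirement."""
    return observed_limit - required_limit


def margin_is_adequate(
    observed_limit: float, required_limit: float, min_margin: float
) -> bool:
    """True when the observed limit exceeds the requirement by min_margin or more."""
    return design_margin(observed_limit, required_limit) >= min_margin


# Hypothetical example: the unit must operate up to 70 C, failed at 110 C,
# and the programme (by assumption) wants at least 30 C of margin.
print(design_margin(110.0, 70.0))
print(margin_is_adequate(110.0, 70.0, min_margin=30.0))
```

A failure that leaves an inadequate margin would instead be a candidate for redesign once its root cause is understood.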
"HALT, HASS & HASA Explained, Accelerated Reliability Techniques, Revised Edition" by Harry W.McLean, ASQ ISBN 978-0-87389-766-2.
"Management & Technical Guidelines for the ESS Process" IEST-RP-PR001.1, published by the Institute of Environmental Sciences and Technology.
"Accelerated testing" a practitioner’s guide to accelerated and reliability testing, by Bryan Dodson and Harry Schwab.
"Accelerated Reliability Engineering", by Gregg Hobbs. ISBN 0-615-12833-5.