Worst-case execution time

The worst-case execution time (WCET) of a computational task is the maximum length of time the task could take to execute on a specific hardware platform.

What WCET is used for

Worst case execution time is typically used in reliable real-time systems, where understanding the worst case timing behaviour of software is important for reliability or correct functional behaviour.

As an example, a computer system that controls the behaviour of an engine in a vehicle might need to respond to inputs within a specific amount of time. One component that makes up the response time is the time spent executing the software – hence if the software worst case execution time can be determined, then the designer of the system can use this with other techniques such as schedulability analysis to ensure that the system responds fast enough.

While WCET is potentially applicable to many real-time systems, in practice the an assurance of WCET is mainly used by real-time systems that are related to high reliability or safety. For example in airborne software, some attention to software is required by DO178B section 6.3.4. The increasing use of software in automotive systems is also driving the need to use WCET analysis of software.

In the design of some systems, WCET is often used as an input to schedulability analysis, although a much more common use of WCET in critical systems is to ensure that the pre-allocated timing budgets in a partition-scheduled system such as ARINC 653 are not violated.

Finding WCET

Since the early days of embedded computing, embedded software developers have either used:

end-to-end measurements of code, for example performed by setting an I/O pin on the device to high at the start of the task, and to low at the end of the task and using a logic analyzer to measure the longest pulse width, or by measuring within the software itself using the processor clock or instruction count.
manual static analysis techniques such as counting assembler instructions for each function, loop etc. and then combining them.

Both of these techniques have limitations. End to end measurements place a high burden on software testing to achieve the longest path; counting instructions is only applicable to simple software and hardware. In both cases a margin for error is often used to account for untested code, hardware performance approximations or mistakes. A margin of 20% is often used, although there is very little justification used for this figure, save for historical confidence (“it worked last time”).

As software and hardware have increased in complexity, it has driven the need for tool support. Complexity is increasingly becoming an issue in both static analysis and measurements. It is difficult to judge how wide the error margin should be and how well tested the software system is. System safety arguments based on a high-water mark achieved during testing are widely used, but become harder to justify as the software and hardware are less predicable.

In the future, it is likely that a requirement for safety critical systems is that they are analyzed using both static and measurement-based approaches.

Considerations when calculating WCET

The problem of finding WCET by analysis is equivalent to the halting problem and is therefore insoluble in the general case. Fortunately for the kind of systems that engineers typically want to find WCET for, the software is typically well structured, will always terminate and analyzable.

Most methods for finding a WCET involve approximations (usually a rounding upwards when there are uncertainties) and hence in practice the exact WCET itself is often regarded as unobtainable. Instead, different techniques for finding the WCET produce estimates for the WCET. Those estimates are typically pessimistic, meaning that the estimated WCET is known to be higher than the real WCET (which is usually what is desired). Much work on WCET analysis is on reducing the pessimism in analysis so that the estimated value is low enough to be valuable to the system designer.

WCET analysis usually refers to the execution time of single thread, task or process. However, of course on modern hardware, especially multi-core, other tasks in the system will impact the WCET of a given task if they share cache, memory lines and other hardware features. Further, task scheduling events such as blocking or to be interruptions should be considered in WCET analysis if they can occur in a particular system. Therefore, it is important to consider the context in which WCET analysis is applied.

Automated Approaches to finding WCET

There are many automated approaches to calculating WCET beyond the manual techniques above. These include:

analytical techniques to improve test cases to increase confidence in end to end measurements
static analysis of the software (“static” meaning without executing the software).
combined approaches, often referred to as “hybrid” analysis, being a combination of measurements and structural analysis

Static Analysis Techniques

A static WCET tool attempts to provide WCET estimations by examining the computer software without executing it directly on the hardware. Static analysis techniques have dominated research in the area since the late 1980s, although in an industrial setting end to end measurements approaches were standard practice.

Static analysis tools work at a high-level to determine the structure of a program's task, working either on a piece of source code or disassembled binary executable. They also work at a low-level, using timing information about the real hardware that the task will execute on, with all its specific features. By combining those two kinds of analysis, the tool attempts to give an upper bound on the time required to execute a given task on a given hardware platform.

At the low-level, static WCET analysis is complicated by the presence of architectural features that improve the average-case performance of the processor: instruction/data caches, branch prediction and instruction pipelines for example. It is possible, but increasingly difficult, to determine tight WCET bounds if these modern architectural features are taken into account in the timing model used by the analysis.

Certification authorities such as the European Aviation Safety Agency, therefore, rely on model validation suites.

Static analysis has resulted in good results for simpler hardware, however a possible limitation of static analysis is that the hardware (the CPU in particular) has reached a complexity which is extremely hard to model. In particular, the modelling process can introduce errors from several sources: errors in chip design, lack of documentation, errors in documentation, errors in model creation; all leading to cases where the model predicts a different behavior to that observed on real hardware. Typically, where it is not possible to accurately predict a behavior, a pessimistic result is used, which can lead to the WCET estimate being much larger than anything achieved at run-time.

Multi-core hardware is particularly difficult to obtain good static WCET estimations for.

There are a number of commercial and academic tools that implement various forms of static analysis.

Measurement and Hybrid Techniques

Measurement-based and hybrid approaches usually try to measure the execution times of short code segments on the real hardware, which are then combined in a higher level analysis. Tools take into account the structure of the software (e.g. loops, branches), to produce an estimate of the WCET of the larger program. The rationale is that it's hard to test the longest path in complex software, but it is easier to test the longest path in many smaller components of it. A worst case effect needs only to be seen once during testing for the analysis to be able to combine it with other worst case events in its analysis.

Typically the small sections of software can be measured automatically using techniques such as instrumentation (adding markers to the software) or with hardware support such as debuggers, and CPU hardware tracing modules. These markers result in a trace of execution, which includes both the path taken through the program and the time at which different points were executed. The trace is then analyzed to determine the maximum time that each part of the program has ever taken to execute, what the maximum observed iteration time of each loop is and whether there are any parts of the software that are untested (Code coverage).

Measurement-based WCET analysis has resulted in good results for both simple and complex hardware, although like static analysis it can suffer excessive pessimism in multi-core situations, where the impact of one core on another is hard to define. A limitation of measurement is that it relies on observing the worst-case effects during testing (although not necessarily at the same time). It can be hard to determine if the worst case effects have necessarily been tested.

There are a number of commercial and academic tools that implement various forms of measurement-based analysis.

Research

The most active research groups are in Sweden (Mälardalen, Linköping), Germany (Saarbrücken, Dortmund, Braunschweig), France (Toulouse, Saclay, Rennes), Austria (Vienna), UK (University of York and Rapita Systems Ltd), Italy (Bologna), Spain (Cantabria, Valencia), and Switzerland (Zurich). Recently, the topic of code-level timing analysis has found more attention outside of Europe by research groups in the US (North Carolina, Florida), Canada, Australia, Bangladesh(MBI LAB), Kingdom of Saudi Arabia-UQU(HISE LAB) and Singapore.

WCET Tool Challenge

The first international WCET Tool Challenge took place during the autumn of 2006. It was organized by the University of Mälardalen and sponsored by the ARTIST2 Network of Excellence on Embedded Systems Design. The aim of the Challenge was to inspect and to compare different approaches in analyzing the worst-case execution time. All available tools and prototypes able to determine safe upper bounds for the WCET of tasks have participated. The final results were presented in November 2006 at the ISoLA 2006 International Symposium in Paphos, Cyprus.

A second Challenge took place in 2008 .