Vienna Development Method

From Wikipedia, the free encyclopedia

The Vienna Development Method (VDM) is one of the longest-established formal methods for the development of computer-based systems. Originating in work done at IBM's Vienna Laboratory in the 1970s, it has grown to include a group of techniques and tools based on a formal specification language, the VDM Specification Language (VDM-SL). It has an extended form, VDM++, which supports the modeling of object-oriented and concurrent systems. Support for VDM includes commercial and academic tools for analyzing models, including support for testing and proving properties of models and generating program code from validated VDM models. VDM and its tools have a history of industrial use, and a growing body of research in the formalism has led to notable contributions to the engineering of critical systems, compilers and concurrent systems, and to logic for computer science.

Philosophy

Computing systems may be modeled in VDM-SL at a higher level of abstraction than is achievable using programming languages, allowing the analysis of designs and identification of key features, including defects, at an early stage of system development. Models that have been validated can be transformed into detailed system designs through a refinement process. The language has a formal semantics, enabling proof of the properties of models to a high level of assurance. It also has an executable subset, so that models may be analyzed by testing and executed through graphical user interfaces, allowing them to be evaluated by experts who are not necessarily familiar with the modeling language itself.

History

The origins of VDM-SL lie in the IBM Laboratory in Vienna where the first version of the language, then called Meta-IV (Bjørner et al. 1978), was used to describe the PL/I programming language. Other programming languages described, or partially described, using Meta-IV and VDM-SL include the BASIC programming language, FORTRAN, the APL programming language, ALGOL 60, the Ada programming language and the Pascal programming language. Meta-IV evolved into several variants, generally described as the Danish, English and Irish Schools.

The "English School" derived from work by Cliff Jones on the aspects of VDM not specifically related to language definition and compiler design (Jones 1980, 1990). It stresses modelling persistent state through the use of data types constructed from a rich collection of base types. Functionality is typically described through operations which may have side-effects on the state and which are mostly specified implicitly using a precondition and postcondition. The "Danish School" (Bjørner et al. 1982) has tended to stress a constructive approach with explicit operational specification used to a greater extent. Work in the Danish school led to the first European validated Ada compiler.

An ISO Standard for the language was released in 1996 (ISO, 1996).

VDM Features

The VDM-SL and VDM++ syntax and semantics are described at length in the VDMTools language manuals and in the available texts. The ISO Standard contains a formal definition of the language’s semantics. In the remainder of this article, the ISO-defined interchange (ASCII) syntax is used. Some texts prefer a more concise mathematical syntax.

A VDM-SL model is a system description given in terms of the functionality performed on data. It consists of a series of definitions of data types and functions or operations performed upon them.

Basic Types: numeric, character, token and quote types

VDM-SL includes basic types modelling numbers and characters as follows:

Basic Types
Type     Description                                Values
bool     Boolean datatype                           false, true
nat      natural numbers (including zero)           0, 1, 2, 3, ...
nat1     natural numbers (excluding zero)           1, 2, 3, 4, ...
int      integers                                   ..., -3, -2, -1, 0, 1, 2, 3, ...
rat      rational numbers                           a/b, where a and b are integers, b is not 0
real     real numbers                               ...
char     characters                                 A, B, C, ...
token    structureless tokens                       ...
<A>      the quote type containing the value <A>    ...

Data types are defined to represent the main data of the modelled system. Each type definition introduces a new type name and gives a representation in terms of the basic types or in terms of types already introduced. For example, a type modelling user identifiers for a log-in management system might be defined as follows:

types
UserId = nat

For manipulating values belonging to data types, operators are defined on the values. Thus, natural number addition, subtraction etc. are provided, as are Boolean operators such as equality and inequality. The language does not fix a maximum or minimum representable number or a precision for real numbers. Such constraints are defined where they are required in each model by means of data type invariants -- Boolean expressions denoting conditions that must be respected by all elements of the defined type. For example a requirement that user identifiers must be no greater than 9999 would be expressed as follows (where <= is the “less than or equal to” Boolean operator on natural numbers):

UserId = nat
inv uid == uid <= 9999

Since invariants can be arbitrarily complex logical expressions, and membership of a defined type is limited to only those values satisfying the invariant, type correctness in VDM-SL is not automatically decidable in all situations.
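The way an invariant restricts membership of a defined type can be sketched outside VDM-SL. The Python fragment below is only an illustration under stated assumptions, not part of any VDM tool: the helper names is_nat, inv_UserId and is_UserId are hypothetical, and membership is checked dynamically as "belongs to the representation type and satisfies the invariant".

```python
def is_nat(v):
    """Membership test for nat: a non-negative integer (Python bools excluded)."""
    return isinstance(v, int) and not isinstance(v, bool) and v >= 0


def inv_UserId(uid):
    """The invariant from the type definition: uid <= 9999."""
    return uid <= 9999


def is_UserId(v):
    """A value belongs to UserId only if it is a nat that satisfies the invariant."""
    return is_nat(v) and inv_UserId(v)
```

A check of this kind can only be run value by value; it says nothing about whether every expression in a model stays inside the type, which is why type correctness in general requires proof.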

The other basic types include char for characters. In some cases, the representation of a type is not relevant to the model’s purpose and would only add complexity. In such cases, the members of the type may be represented as structureless tokens. Values of token types can only be compared for equality – no other operators are defined on them. Where specific named values are required, these are introduced as quote types. Each quote type consists of one named value of the same name as the type itself. Values of quote types (known as quote literals) may only be compared for equality.

For example, in modelling a traffic signal controller, it may be convenient to define values to represent the colours of the traffic signal as quote types:

<Red>, <Amber>, <FlashingAmber>, <Green>

Type Constructors: Union, Product and Composite Types

The basic types alone are of limited value. New, more structured data types are built using type constructors.

Basic Type Constructors
Constructor             Description
T1 | T2 | ... | Tn      Union of types T1,...,Tn
T1*T2*...*Tn            Cartesian product of types T1,...,Tn
T :: f1:T1 ... fn:Tn    Composite (Record) type

The most basic type constructor forms the union of two predefined types. The type (A|B) contains all elements of the type A and all of the type B. In the traffic signal controller example, the type modelling the colour of a traffic signal could be defined as follows:

SignalColour =  <Red> | <Amber> | <FlashingAmber> | <Green>

Enumerated types in VDM-SL are defined as shown above as unions on quote types.

Cartesian product types may also be defined in VDM-SL. The type (A1*…*An) is the type composed of all tuples of values, the first element of which is from the type A1, the second from the type A2, and so on. The composite or record type is a Cartesian product with labels for the fields. The type

T :: f1:A1
     f2:A2
     ...
     fn:An 

is the Cartesian product with fields labelled f1,…,fn. An element of type T can be composed from its constituent parts by a constructor, written mk_T. Conversely, given an element of type T, the field names can be used to select the named component. For example, the type

Date :: day:nat1
        month:nat1
        year:nat
inv mk_Date(d,m,y) == d <= 31 and m <= 12

models a simple date type. The value mk_Date(1,4,2001) corresponds to 1 April 2001. Given a date d, the expression d.month is a natural number representing the month. Restrictions on days per month and leap years could be incorporated into the invariant if desired. Combining these:

mk_Date(1,4,2001).month = 4
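As a rough analogue of the record type above, the following Python sketch pairs a named tuple with a checking constructor. It is an illustration under assumptions rather than a translation produced by any VDM tool: inv_Date and the Python mk_Date are hypothetical helpers, and the field types nat1 and nat are approximated by range checks on Python integers.

```python
from collections import namedtuple

# Record with the three labelled fields of the Date type.
Date = namedtuple("Date", ["day", "month", "year"])


def inv_Date(day, month, year):
    """Field types (nat1, nat1, nat) plus the invariant on day and month."""
    in_types = day >= 1 and month >= 1 and year >= 0
    return in_types and day <= 31 and month <= 12


def mk_Date(day, month, year):
    """Constructor that refuses values lying outside the type."""
    if not inv_Date(day, month, year):
        raise ValueError("not a member of Date")
    return Date(day, month, year)


d = mk_Date(1, 4, 2001)  # plays the role of mk_Date(1,4,2001)
```

Field selection then reads much as in VDM-SL: d.month yields the month component.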

Collections: Sets, Mappings and Sequences

Collection types model groups of values. Sets are finite unordered collections in which duplication between values is suppressed. Sequences are finite ordered collections (lists) in which duplication may occur and mappings represent finite correspondences between two sets of values.

The set type constructor (written set of T where T is a predefined type) constructs the type composed of all finite sets of values drawn from the type T. For example, the type definition

UGroup = set of UserId 

defines a type UGroup composed of all finite sets of UserId values. Various operators are defined on sets for constructing their union and intersection, determining proper and non-strict subset relationships, etc.

Main Operators on Sets (s, s1, s2 are sets)
Expression            Meaning
{a, b, c}             Set enumeration: the set of elements a, b and c
{ x | x:T & P(x)}     Set comprehension: the set of x from type T such that P(x)
{i, ..., j}           The set of integers in the range i to j
e in set s            e is an element of set s
e not in set s        e is not an element of set s
s1 union s2           Union of sets s1 and s2
s1 inter s2           Intersection of sets s1 and s2
s1 \ s2               Set difference of sets s1 and s2
dunion s              Distributed union of set of sets s
s1 psubset s2         s1 is a (proper) subset of s2
s1 subset s2          s1 is a (weak) subset of s2
card s                The cardinality of set s
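For readers more used to a programming language, most of the set operators listed above have close counterparts on Python's built-in sets. The correspondence below is an informal sketch; the variable names are illustrative only, and frozenset is used so that sets can themselves be elements of a set, as the distributed union requires.

```python
s1 = frozenset({1, 2, 3})
s2 = frozenset({3, 4})

union = s1 | s2                         # s1 union s2
inter = s1 & s2                         # s1 inter s2
diff = s1 - s2                          # s1 \ s2
dunion = frozenset().union(*{s1, s2})   # dunion {s1, s2}
evens = {x for x in range(1, 11) if x % 2 == 0}  # {x | x in set {1,...,10} & x mod 2 = 0}

is_member = 1 in s1                     # 1 in set s1
is_proper_subset = s1 < union           # s1 psubset (s1 union s2)
is_subset = s1 <= s1                    # s1 subset s1
cardinality = len(union)                # card (s1 union s2)
```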

The finite sequence type constructor (written seq of T where T is a predefined type) constructs the type composed of all finite lists of values drawn from the type T. For example, the type definition

String = seq of char

defines a type String composed of all finite strings of characters. Various operators are defined on sequences for constructing concatenation, selection of elements and subsequences etc. Many of these operators are partial in the sense that they are not defined for certain applications. For example, selecting the 5th element of a sequence that contains only three elements is undefined.

The order and repetition of items in a sequence is significant, so [a, b] is not equal to [b, a], and [a] is not equal to [a, a].

Main Operators on Sequences (s, s1, s2 are sequences)
Expression             Meaning
[a, b, c]              Sequence enumeration: the sequence of elements a, b and c
[f(x) | x:T & P(x)]    Sequence comprehension: sequence of expressions f(x) for each x of (numeric) type T such that P(x) holds (x values taken in numeric order)
hd s                   The head (first element) of s
tl s                   The tail of s (the sequence that remains after the head is removed)
len s                  The length of s
elems s                The set of elements of s
s(i)                   The ith element of s
s1^s2                  The sequence formed by concatenating sequences s1 and s2
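The partiality of the sequence operators can be made concrete with a small Python sketch. The helpers hd, tl, elems and index are hypothetical names chosen to echo the VDM-SL operators; the sketch assumes that an undefined application is reported by raising an exception, whereas in VDM-SL it simply has no defined value. Indexing is 1-based, as in VDM-SL.

```python
def hd(s):
    """Head: the first element; undefined on the empty sequence."""
    if not s:
        raise ValueError("hd is undefined on the empty sequence")
    return s[0]


def tl(s):
    """Tail: the sequence without its head; undefined on the empty sequence."""
    if not s:
        raise ValueError("tl is undefined on the empty sequence")
    return s[1:]


def elems(s):
    """The set of elements occurring in s."""
    return set(s)


def index(s, i):
    """s(i) with 1-based indexing; undefined outside 1..len s."""
    if not 1 <= i <= len(s):
        raise ValueError("index outside inds s")
    return s[i - 1]
```

With these definitions hd([1, 2, 3]) is 1 and tl([1, 2, 3]) is [2, 3], while index([1, 2, 3], 5) raises an error, mirroring the undefined selection of a 5th element from a three-element sequence.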

A finite mapping is a correspondence between two sets, the domain and range, with the domain indexing elements of the range. It is therefore similar to a finite function. The mapping type constructor in VDM-SL (written map T1 to T2, where T1 and T2 are predefined types) constructs the type composed of all finite mappings from sets of T1 values to sets of T2 values. For example, the type definition

Birthdays = map String to Date

defines a type Birthdays which maps character strings to Date. Again, operators are defined on mappings for indexing into the mapping, merging mappings, overwriting and extracting sub-mappings.

Main Operators on Mappings
Expression                    Meaning
{a |-> r, b |-> s}            Mapping enumeration: a maps to r, b maps to s
{x |-> f(x) | x:T & P(x)}     Mapping comprehension: x maps to f(x) for all x of type T such that P(x)
dom m                         The domain of m
rng m                         The range of m
m(x)                          m applied to x
m1 munion m2                  Union of mappings m1 and m2 (m1, m2 must be consistent where they overlap)
m1 ++ m2                      m1 overwritten by m2
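The difference between merging and overwriting is easy to miss, so the following Python sketch models both on dictionaries. The function names compatible, munion and override are hypothetical, and raising an exception stands in for an undefined result when the two maps disagree on a shared key.

```python
def compatible(m1, m2):
    """True if m1 and m2 agree on every key they share."""
    return all(m1[k] == m2[k] for k in m1.keys() & m2.keys())


def munion(m1, m2):
    """m1 munion m2: defined only when the maps are consistent where they overlap."""
    if not compatible(m1, m2):
        raise ValueError("maps disagree on a shared key")
    return {**m1, **m2}


def override(m1, m2):
    """m1 ++ m2: entries of m2 take precedence over those of m1."""
    return {**m1, **m2}


m = override({"a": 1, "b": 2}, {"b": 3})
dom_m = set(m)            # dom m
rng_m = set(m.values())   # rng m
```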

Structuring

The main difference between the VDM-SL and VDM++ notations is the way in which structuring is dealt with. In VDM-SL there is a conventional modular extension whereas VDM++ has a traditional object-oriented structuring mechanism with classes and inheritance.

Structuring in VDM-SL

In the ISO standard for VDM-SL there is an informative annex that contains different structuring principles. These all follow traditional information hiding principles with modules and they can be explained as:

  • Module naming: Each module is syntactically started with the keyword module followed by the name of the module. At the end of a module the keyword end is written followed again by the name of the module.
  • Importing: It is possible to import definitions that have been exported from other modules. This is done in an imports section that starts with the keyword imports and is followed by a sequence of imports from different modules. Each of these module imports starts with the keyword from followed by the name of the module and a module signature. The module signature can either simply be the keyword all, indicating the import of all definitions exported from that module, or it can be a sequence of import signatures. The import signatures are specific to types, values, functions and operations, and each of these starts with the corresponding keyword. These import signatures name the constructs to which access is required. Optional type information can also be present, and each construct can be renamed upon import. For types, the keyword struct must also be used if access to the internal structure of a particular type is required.
  • Exporting: The definitions from a module that other modules should have access to are exported using the keyword exports followed by an exports module signature. The exports module signature can either simply consist of the keyword all or be a sequence of export signatures. Such export signatures are specific to types, values, functions and operations, and each of these starts with the corresponding keyword. To export the internal structure of a type, the keyword struct must be used.
  • More exotic features: In earlier versions of the VDM-SL tools there was also support for parameterized modules and instantiations of such modules. However, these features were taken out of VDMTools around 2000 because they were hardly ever used in industrial applications and they posed a substantial number of challenges for the tools.

Structuring in VDM++

In VDM++ structuring is done using classes and multiple inheritance. The key concepts are:

  • Class: Each class is syntactically started with the keyword class followed by the name of the class. At the end of a class the keyword end is written followed again by the name of the class.
  • Inheritance: If a class inherits constructs from other classes, the class name in the class heading can be followed by the keywords is subclass of followed by a comma-separated list of names of superclasses.
  • Access modifiers: Information hiding in VDM++ is done in the same way as in most object-oriented languages, using access modifiers. In VDM++ definitions are private by default, but any definition can be preceded by one of the access modifier keywords: private, public and protected.

Modelling Functionality

Functional Modelling

In VDM-SL, functions are defined over the data types defined in a model. Support for abstraction requires that it should be possible to characterize the result that a function should compute without having to say how it should be computed. The main mechanism for doing this is the implicit function definition in which, instead of a formula computing a result, a logical predicate over the input and result variables, termed a postcondition, gives the result's properties. For example, a function SQRT for calculating a square root of a natural number might be defined as follows:

SQRT(x:nat)r:real
post r*r = x

Here the postcondition does not define a method for calculating the result r but states what properties can be assumed to hold of it. Note that this defines a function that returns a valid square root; there is no requirement that it should be the positive or negative root. The specification above would be satisfied, for example, by a function that returned the negative root of 4 but the positive root of all other valid inputs. Note that functions in VDM-SL are required to be deterministic so that a function satisfying the example specification above must always return the same result for the same input.

A more constrained function specification is arrived at by strengthening the postcondition. For example the following definition constrains the function to return the positive root.

SQRT(x:nat)r:real
post r*r = x and r>=0 

All function specifications may be restricted by preconditions which are logical predicates over the input variables only and which describe constraints that are assumed to be satisfied when the function is executed. For example, a square root calculating function that works only on non-negative real numbers might be specified as follows:

SQRTP(x:real)r:real
pre x >=0
post r*r = x and r>=0 

The precondition and postcondition together form a contract that is to be satisfied by any program claiming to implement the function. The precondition records the assumptions under which the function guarantees to return a result satisfying the postcondition. If a function is called on inputs that do not satisfy its precondition, the outcome is undefined (indeed, termination is not even guaranteed).
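The square root specifications above can be read as predicates that accept or reject a candidate result. The Python sketch below writes them out in that form; the names pre_SQRTP, post_SQRT and post_SQRTP are hypothetical, and exact floating-point equality is used purely for illustration, so the checks are only meaningful for inputs such as 4 whose roots are exactly representable.

```python
def pre_SQRTP(x):
    """Precondition of SQRTP: the input is non-negative."""
    return x >= 0


def post_SQRT(x, r):
    """Weak postcondition: r is some square root of x."""
    return r * r == x


def post_SQRTP(x, r):
    """Strengthened postcondition: r is the non-negative square root of x."""
    return r * r == x and r >= 0
```

Both 2.0 and -2.0 are acceptable results for the input 4 under the weak postcondition, whereas only 2.0 satisfies the strengthened one; an input of -1 violates the precondition, so the contract promises nothing about the result.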

VDM-SL also supports the definition of executable functions in the manner of a functional programming language. In an explicit function definition, the result is defined by means of an expression over the inputs. For example, a function that produces a list of the squares of a list of numbers might be defined as follows:

SqList: seq of nat -> seq of nat
SqList(s) == if s = [] then [] else [(hd s)**2] ^ SqList(tl s)

This recursive definition consists of a function signature giving the types of the input and result and a function body. An implicit definition of the same function might take the following form:

SqListImp(s:seq of nat)r:seq of nat
post len r = len s and 
     forall i in set inds s & r(i) = s(i)**2

The explicit definition is in a simple sense an implementation of the implicitly specified function. The correctness of an explicit function definition with respect to an implicit specification may be defined as follows.

Given an implicit specification:

f(p:T_p)r:T_r
pre pre-f(p)
post post-f(p, r)

and an explicit function:

f:T_p -> T_r

we say it satisfies the specification iff:

forall p:T_p & pre-f(p) => f(p):T_r and post-f(p, f(p))

So, "f is a correct implementation" should be interpreted as "f satisfies the specification".
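The satisfaction condition can be exercised, though not proved, by running an explicit definition against the postcondition of the implicit one. The Python sketch below does this for the list-of-squares example; sq_list, post_sq_list_imp and satisfies_on are hypothetical names, the precondition is taken to be true, and checking a handful of sample inputs is an assumption-laden substitute for the universally quantified proof obligation.

```python
def sq_list(s):
    """Explicit definition, mirroring the recursion of SqList."""
    return [] if s == [] else [s[0] ** 2] + sq_list(s[1:])


def post_sq_list_imp(s, r):
    """Postcondition of SqListImp: equal lengths and r(i) = s(i)**2 at every index."""
    return len(r) == len(s) and all(r[i] == s[i] ** 2 for i in range(len(s)))


def satisfies_on(samples):
    """Check post-f(p, f(p)) for each sample input p."""
    return all(post_sq_list_imp(s, sq_list(s)) for s in samples)
```

A failing sample would refute correctness, but passing samples only build confidence; establishing the obligation for all inputs still requires a proof, for instance by induction on the length of the sequence.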

State-based Modelling

In VDM-SL, functions do not have side-effects such as changing the state of a persistent global variable. Since this ability is useful in many programming languages, VDM-SL provides a similar concept: operations, rather than functions, are used to change state variables (also known as globals).

For example, if we have a state consisting of a single variable someStateRegister : nat, we could define this in VDM-SL as:

state Register of 
  someStateRegister : nat 
end 

In VDM++ this would instead be defined as:

instance variables
  someStateRegister : nat 

An operation to load a value into this variable might be specified as:

LOAD(i:nat) 
ext wr someStateRegister:nat
post someStateRegister = i

The externals clause (ext) specifies which parts of the state can be accessed by the operation; rd indicating read-only access and wr being read/write access.

Sometimes it is important to refer to the value of a state before it was modified; for example, an operation to add a value to the variable may be specified as:

ADD(i:nat)
ext wr someStateRegister : nat
post someStateRegister = someStateRegister~ + i 

The ~ symbol on the state variable in the postcondition denotes the value of the state variable before execution of the operation.
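The role of the old-state value can be illustrated with a postcondition that takes both the value before and the value after an operation. The Python sketch below is one possible reading of LOAD and ADD: the Register class, its attribute name and the predicates post_LOAD and post_ADD are all hypothetical, and the methods are just one implementation that happens to satisfy the postconditions.

```python
def post_LOAD(new, i):
    """Postcondition of LOAD: the register holds i afterwards."""
    return new == i


def post_ADD(old, new, i):
    """Postcondition of ADD: new value equals old value (someStateRegister~) plus i."""
    return new == old + i


class Register:
    """A state with a single natural-number variable."""

    def __init__(self, value=0):
        self.some_state_register = value

    def load(self, i):
        self.some_state_register = i
        assert post_LOAD(self.some_state_register, i)

    def add(self, i):
        old = self.some_state_register  # value before the operation
        self.some_state_register = old + i
        assert post_ADD(old, self.some_state_register, i)


r = Register()
r.load(5)
r.add(3)
```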

Examples

The max function

This is an example of an implicit function definition. The postcondition characterizes the result rather than defining an algorithm for obtaining it. The function returns the largest element of a non-empty set of natural numbers; the precondition excludes the empty set, for which no result could satisfy the postcondition:

max(s:set of nat)r:nat
pre s <> {}
post r in set s and 
     forall r' in set s & r' <= r

Natural number multiplication

multp(i,j:nat)r:nat
pre true 
post r = i*j 

Applying the proof obligation forall p:T_p & pre-f(p) => f(p):T_r and post-f(p, f(p)) to an explicit definition of multp:

multp(i,j) == 
 if i=0 
 then 0 
 else if is-even(i) 
      then 2*multp(i/2,j)
      else j+multp(i-1,j)

Then the proof obligation becomes:

forall i, j : nat & multp(i,j):nat and multp(i, j) = i*j

This can be shown correct by:

  1. Proving that the recursion ends (this in turn requires proving that the numbers become smaller at each step)
  2. Mathematical induction
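Before attempting the proof, the obligation can be sanity-checked on a finite range of inputs. The Python transcription below follows the explicit definition of multp clause by clause, with integer division standing in for i/2 on even i; the function name and the tested range are assumptions of this sketch, and agreement on the range is evidence rather than proof.

```python
def multp(i, j):
    """Multiplication of natural numbers by halving and decrementing."""
    if i == 0:
        return 0
    if i % 2 == 0:
        # i is even and non-zero: i // 2 is strictly smaller than i
        return 2 * multp(i // 2, j)
    # i is odd: i - 1 is strictly smaller than i
    return j + multp(i - 1, j)


agrees = all(multp(i, j) == i * j for i in range(50) for j in range(50))
```

The comments mark the point needed for step 1 of the argument: in both recursive calls the first argument strictly decreases and stays non-negative, so the recursion must reach the base case.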

Queue abstract data type

This is a classical example illustrating the use of implicit operation specification in a state-based model of a well-known data structure. The queue is modelled as a sequence composed of elements of a type Qelt. The representation of Qelt is immaterial and so is defined as a token type.

types

Qelt = token;
Queue = seq of Qelt;

state TheQueue of 
  q : Queue
end 

operations

ENQUEUE(e:Qelt)
ext wr q:Queue
post q = q~ ^ [e]; 

DEQUEUE()e:Qelt
ext wr q:Queue
pre q <> [] 
post q~ = [e]^q;

IS-EMPTY()r:bool
ext rd q:Queue
post r <=> (len q = 0)
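The queue postconditions relate the state before and after each operation without saying how the new state is computed. The Python sketch below spells the predicates out on lists and gives one candidate implementation of DEQUEUE; the function names are hypothetical, and the old state q~ is passed explicitly as q_old.

```python
def post_ENQUEUE(q_old, q, e):
    """q = q~ ^ [e]: the new queue is the old one with e appended."""
    return q == q_old + [e]


def pre_DEQUEUE(q):
    """DEQUEUE may only be applied to a non-empty queue."""
    return q != []


def post_DEQUEUE(q_old, q, e):
    """q~ = [e] ^ q: the old queue is the result followed by the new queue."""
    return q_old == [e] + q


def dequeue(q):
    """One implementation: returns the removed element and the new queue."""
    if not pre_DEQUEUE(q):
        raise ValueError("precondition of DEQUEUE violated")
    return q[0], q[1:]


e, rest = dequeue(["a", "b", "c"])
```

Although DEQUEUE is specified implicitly, its postcondition leaves no freedom: for a given old queue only one pair of result and new queue satisfies it.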

Bank system example

As a very simple example of a VDM-SL model, consider a system for maintaining details of customer bank accounts. Customers are modelled by customer numbers (CustNum), accounts are modelled by account numbers (AccNum). The representations of customer and account numbers are held to be immaterial and so are modelled by a token type. Balances and overdrafts are modelled by numeric types.

AccNum = token;
CustNum = token;
Balance = int; 
Overdraft = nat;

AccData :: owner : CustNum
           balance : Balance

state Bank of 
  accountMap : map AccNum to AccData
  overdraftMap : map CustNum to Overdraft
inv mk_Bank(accountMap,overdraftMap) == forall a in set rng accountMap & a.owner in set dom overdraftMap and 
                                        a.balance >= -overdraftMap(a.owner)
end

With operations: NEWC allocates a new customer number:

operations 
NEWC(od : Overdraft)r : CustNum
ext wr overdraftMap : map CustNum to Overdraft
post r not in set dom overdraftMap~ and overdraftMap = overdraftMap~ ++ { r |-> od};

NEWAC allocates a new account number and sets the balance to zero:

NEWAC(cu : CustNum)r : AccNum
ext wr accountMap : map AccNum to AccData
    rd overdraftMap : map CustNum to Overdraft
pre cu in set dom overdraftMap
post r not in set dom accountMap~ and accountMap = accountMap~ ++ {r|-> mk_AccData(cu,0)}

ACINF returns all the balances of all the accounts of a customer, as a map of account number to balance:

ACINF(cu : CustNum)r : map AccNum to Balance 
ext rd accountMap : map AccNum to AccData
post r = {an |-> accountMap(an).balance | an in set dom accountMap & accountMap(an).owner = cu}
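The state invariant and the ACINF postcondition of the bank model can be mirrored on Python dictionaries. In the sketch below each account record is assumed to be an (owner, balance) pair, and inv_Bank, acinf and the sample data are hypothetical names and values introduced only for illustration.

```python
def inv_Bank(account_map, overdraft_map):
    """Every owner is a known customer and no balance goes below minus the overdraft limit."""
    return all(
        owner in overdraft_map and balance >= -overdraft_map[owner]
        for owner, balance in account_map.values()
    )


def acinf(account_map, cu):
    """ACINF: map of account number to balance for the accounts owned by cu."""
    return {an: bal for an, (owner, bal) in account_map.items() if owner == cu}


overdrafts = {"c1": 100, "c2": 0}
accounts = {"a1": ("c1", -50), "a2": ("c1", 20), "a3": ("c2", 5)}
```

The comprehension in acinf has the same shape as the mapping comprehension in the postcondition, which is a common pattern when an implicit specification happens to determine its result uniquely.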

Tool Support

A number of different tools support VDM:

  • VDMTools are the leading commercial tools for VDM and VDM++, owned, marketed, maintained and developed by CSK Systems, building on earlier versions developed by the Danish company IFAD. The manuals and a practical tutorial are available. All licenses are available, free of charge, for the full version of the tool. The full version includes automatic code generation for Java and C++, dynamic link library and CORBA support.
  • Overture is a community-based open source initiative aimed at providing freely available tool support for VDM++ on top of the Eclipse platform. Its aim is to develop a framework for interoperable tools that will be useful for industrial application, research and education.
  • SpecBox from Adelard provides syntax checking, some simple semantic checking, and generation of a LaTeX file enabling specifications to be printed in mathematical notation. This tool is freely available but it is not being further maintained.
  • LaTeX and LaTeX2e macros are available to support the presentation of VDM models in the ISO Standard Language's mathematical syntax. They have been developed and are maintained by the National Physical Laboratory in the UK. Documentation and the macros are available online.

Industrial Experience

VDM has been applied widely in a variety of application domains. The most well-known of these applications are:

  • Ada and Chill compilers: The first European validated Ada compiler was developed by DDC-International using VDM. Likewise the semantics of Chill and Modula-2 were described in their standards using VDM.
  • ConForm: An experiment at British Aerospace comparing the conventional development of a trusted gateway with a development using VDM.
  • Dust-Expert: A project carried out by Adelard in the UK for a safety-related application that determines whether the safety is appropriate in the layout of industrial plants.
  • The development of VDMTools: Most components of the VDMTools tool suite are themselves developed using VDM. This development was carried out at IFAD in Denmark and CSK in Japan.[1]
  • TradeOne: Certain key components of the TradeOne back-office system developed by CSK systems for the Japanese stock exchange were developed using VDM. Comparative measurements exist for developer productivity and defect density of the VDM-developed components versus the conventionally developed code.
  • FeliCa Networks have reported the development of an operating system for an integrated circuit for cellular telephone applications.

Refinement

Use of VDM starts with a very abstract model and develops this into an implementation. Each step involves Data Reification, then Operation Decomposition.

Data reification develops the abstract data types into more concrete data structures, while operation decomposition develops the (abstract) implicit specifications of operations and functions into algorithms that can be directly implemented in a computer language of choice.

Specification                                         Implementation
Abstract data type    ––– Data reification →          Data structure
Operations            ––– Operation decomposition →   Algorithms

Data reification

Data reification (stepwise refinement) involves finding a more concrete representation of the abstract data types used in a specification. There may be several steps before an implementation is reached. Each reification step for an abstract data representation ABS_REP involves proposing a new representation NEW_REP. In order to show that the new representation is accurate, a retrieve function is defined that relates NEW_REP to ABS_REP, i.e. retr : NEW_REP -> ABS_REP. The correctness of a data reification depends on proving adequacy, i.e.

forall a:ABS_REP & exists r:NEW_REP & a = retr(r)

Since the data representation has changed, it is necessary to update the operations and functions so that they operate on NEW_REP. The new operations and functions should be shown to preserve any data type invariants on the new representation. In order to prove that the new operations and functions model those found in the original specification, it is necessary to discharge two proof obligations:

  • Domain rule:
forall r: NEW_REP & pre-OPA(retr(r)) => pre-OPR(r)
  • Modelling rule:
forall ~r,r:NEW_REP & pre-OPA(retr(~r)) and post-OPR(~r,r) => post-OPA(retr(~r), retr(r))

Example data reification

In a business security system, workers are given ID cards; these are fed into card readers on entry to and exit from the factory. Operations required:

  • INIT() initialises the system, assumes that the factory is empty
  • ENTER(p : Person) records that a worker is entering the factory (the worker's details are read from the ID card)
  • EXIT(p : Person) records that a worker is exiting the factory
  • IS-PRESENT(p : Person) r : bool checks to see if a specified worker is in the factory or not

Formally, this would be:

types

Person = token;
Workers = set of Person;

state AWCCS of 
  pres: Workers
end 

operations

INIT()
ext wr pres: Workers
post pres = {}; 

ENTER(p : Person)
ext wr pres : Workers
pre p not in set pres 
post pres = pres~ union {p};

EXIT(p : Person)
ext wr pres : Workers
pre p in set pres
post pres = pres~\{p};

IS-PRESENT(p : Person) r : bool 
ext rd pres : Workers
post r <=> p in set pres

As most programming languages have a concept comparable to a sequence (often in the form of an array), the first step from the specification is to represent the data in terms of a sequence. These sequences must not allow repetition, as we do not want the same worker to appear twice, so we must add an invariant to the new data type. In this case, ordering is not important, so [a,b] and [b,a] represent the same set of workers.
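This reification step can be explored on a small finite universe of workers. The Python sketch below is an assumption-laden illustration, not a proof: inv_new_rep, retr, enter_concrete, adequate and enter_models_abstract are hypothetical names, sequences are Python lists, and the quantifiers of the adequacy and modelling obligations are checked by enumeration over the given universe only.

```python
from itertools import combinations, permutations


def inv_new_rep(seq):
    """Invariant on the sequence representation: no worker appears twice."""
    return len(seq) == len(set(seq))


def retr(seq):
    """Retrieve function: the abstract set of workers represented by seq."""
    return frozenset(seq)


def enter_concrete(seq, p):
    """ENTER on the concrete representation (precondition: p not already present)."""
    if p in seq:
        raise ValueError("precondition of ENTER violated")
    return seq + [p]


def subsets(universe):
    items = list(universe)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]


def sequences(universe):
    """All repetition-free sequences over the universe, i.e. the values of the new type."""
    items = list(universe)
    return [list(p) for k in range(len(items) + 1) for p in permutations(items, k)]


def adequate(universe):
    """Adequacy: every abstract set has at least one concrete representative."""
    reps = [r for r in sequences(universe) if inv_new_rep(r)]
    return all(any(retr(r) == a for r in reps) for a in subsets(universe))


def enter_models_abstract(universe):
    """Modelling rule for ENTER: the concrete result retrieves to pres~ union {p}."""
    return all(
        retr(enter_concrete(r, p)) == retr(r) | {p}
        for r in sequences(universe)
        for p in universe
        if p not in r
    )
```

Because retr discards order, several sequences retrieve to the same set, which is why adequacy only asks for the existence of some representative rather than a unique one.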

The Vienna Development Method is valuable for model-based systems. It is not appropriate if the system is time-based. For such cases, the calculus of communicating systems (CCS) is more useful.

References

  1. Peter Gorm Larsen, "Ten Years of Historical Development 'Bootstrapping' VDMTools", Journal of Universal Computer Science, 7(8), 2001.

This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.
