Formal semantics of programming languages

From Wikipedia, the free encyclopedia

In theoretical computer science, formal semantics is the field concerned with the rigorous mathematical study of the meaning of programming languages and models of computation.

The formal semantics of a language is given by a mathematical model that describes the possible computations described by the language.

There are many approaches to formal semantics; these approaches belong to three major classes:

  • Denotational semantics, whereby each phrase in the language is translated into a denotation, i.e. a phrase in some other language. Denotational semantics loosely corresponds to compilation, although the "target language" is usually a mathematical formalism rather than another computer language. For example, denotational semantics of functional languages often translates the language into domain theory;
  • Operational semantics, whereby the execution of the language is described directly (rather than by translation). Operational semantics loosely corresponds to interpretation, although again the "implementation language" of the interpreter is generally a mathematical formalism. Operational semantics may define an abstract machine (such as the SECD machine), and give meaning to phrases by describing the transitions they induce on states of the machine. Alternatively, as with the pure lambda calculus, operational semantics can be defined via syntactic transformations on phrases of the language itself;
  • Axiomatic semantics, whereby one gives meaning to phrases by describing the logical axioms that apply to them. Axiomatic semantics makes no distinction between a phrase's meaning and the logical formulas that describe it; its meaning is exactly what can be proven about it in some logic. The canonical example of axiomatic semantics is Hoare logic.

The distinctions between the three broad classes of approaches can sometimes be blurry, but all known approaches to formal semantics use the above techniques, or some combination thereof.

Apart from the choice between denotational, operational, or axiomatic approaches, most variation in formal semantic systems arises from the choice of supporting mathematical formalism.

Some variations of formal semantics include the following:

  • Action semantics is an approach that tries to modularize denotational semantics, splitting the formalization process in two layers (macro and microsemantics) and predefining three semantic entities (actions, data and yielders) to simplify the specification;
  • Attribute grammars define systems that systematically compute "metadata" (called attributes) for the various cases of the language's syntax. Attribute grammars can be understood as a denotational semantics where the target language is simply the original language enriched with attribute annotations. Aside from formal semantics, attribute grammars have also been used for code generation in compilers, and to augment regular or context-free grammars with context-sensitive conditions;
  • Categorical (or "functorial") semantics uses category theory as the core mathematical formalism;
  • Concurrency semantics is a catch-all term for any formal semantics that describes concurrent computations. Historically important concurrent formalisms have included the Actor model and process calculi;
  • Game semantics uses a metaphor inspired by game theory.

For a variety of reasons, one might wish to describe the relationships between different formal semantics. For example:

  • One might wish to prove that a particular operational semantics for a language satisfies the logical formulas of an axiomatic semantics for that language. Such a proof demonstrates that it is "sound" to reason about a particular (operational) interpretation strategy using a particular (axiomatic) proof system.
  • Given a single language, one might define a "high-level" abstract machine and a "low-level" abstract machine for the language, such that the latter contains more primitive operations than the former. One might then wish to prove that an operational semantics over the high-level machine is related by a bisimulation with the semantics over the low-level machine. Such a proof demonstrates that the low-level machine "faithfully implements" the high-level machine.

One can sometimes relate multiple semantics through abstractions via the theory of abstract interpretation.

The field of formal semantics encompasses all of the following:

  • the definition of semantic models,
  • the relations between different semantic models,
  • the relations between different approaches to meaning, and
  • the relation between computation and the underlying mathematical structures from fields such as logic, set theory, model theory, category theory, etc.

It has close links with other areas of computer science such as programming language design, type theory, compilers and interpreters, program verification and model checking.

[edit] External links

[edit] References

  • Robert Harper. Practical Foundations for Programming Languages. Working draft, 2006. (online, as PDF)
  • Shriram Krishnamurthi. Programming Languages: Application and Interpretation. (online, as PDF)
  • John C. Reynolds. Theories of Programming Languages. Cambridge University Press, 1998. (ISBN 0-521-59414-6)
  • Carl Gunter. Semantics of Programming Languages. MIT Press, 1992. (ISBN 0-262-07143-6)
  • Glynn Winskel. The Formal Semantics of Programming Languages: An Introduction. MIT Press, 1993 (paperback ISBN 0-262-73103-7)