Anti-unification (computer science)

Anti-unification is the process of constructing a generalization common to two given symbolic expressions. As in unification, several frameworks are distinguished depending on which expressions (also called terms) are allowed, and which expressions are considered equal. If variables representing functions are allowed in an expression, the process is called higher-order anti-unification, otherwise first-order anti-unification. If the generalization is required to have an instance literally equal to each input expression, the process is called syntactical anti-unification, otherwise E-anti-unification, or anti-unification modulo theory.

An anti-unification algorithm should compute, for given expressions, a complete and minimal generalization set, that is, a set that covers all generalizations and contains no redundant members. Depending on the framework, a complete and minimal generalization set may have one, finitely many, or possibly infinitely many members, or may not exist at all;[note 1] it cannot be empty, since a trivial generalization exists in any case. For first-order syntactical anti-unification, Gordon Plotkin[1][2] gave an algorithm that computes a complete and minimal singleton generalization set containing the so-called least general generalization (lgg).

Anti-unification should not be confused with dis-unification. The latter is the process of solving systems of inequations, that is, of finding values for the variables such that all given inequations are satisfied.[note 2] This task is quite different from finding generalizations.

Prerequisites

Formally, an anti-unification approach presupposes

First-order term

Main article: Term (logic)

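Given a set V of variable symbols, a set C of constant symbols and sets F_n of n-ary function symbols, also called operator symbols, for each natural number n \geq 1, the set of (unsorted first-order) terms T is recursively defined to be the smallest set with the following properties:[3]

  1. every variable symbol is a term: V \subseteq T,
  2. every constant symbol is a term: C \subseteq T,
  3. from every n terms t_1, \ldots, t_n and every n-ary function symbol f \in F_n, a larger term f(t_1, \ldots, t_n) can be built.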

For example, if x \in V is a variable symbol, 1 \in C is a constant symbol, and \textit{add} \in F_2 is a binary function symbol, then x \in T, 1 \in T, and (hence) \textit{add}(x,1) \in T by the first, second, and third term building rule, respectively. The latter term is usually written as x+1, using infix notation and the more common operator symbol + for convenience.
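
The three kinds of first-order term can be mirrored by a small data type. The following is only a minimal sketch: the class names Var, Const and App and the helper show are hypothetical names chosen for illustration and do not come from the cited literature.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A variable symbol from V (hypothetical class name)."""
    name: str


@dataclass(frozen=True)
class Const:
    """A constant symbol from C (hypothetical class name)."""
    name: str


@dataclass(frozen=True)
class App:
    """An n-ary function symbol from F_n applied to n argument terms."""
    symbol: str
    args: Tuple["Term", ...]


Term = Union[Var, Const, App]


def show(t: Term) -> str:
    """Render a term in the prefix notation used in the text."""
    if isinstance(t, (Var, Const)):
        return t.name
    return f"{t.symbol}({','.join(show(a) for a in t.args)})"


x = Var("x")                 # a variable symbol is a term
one = Const("1")             # a constant symbol is a term
term = App("add", (x, one))  # a function symbol applied to terms is a term
print(show(term))            # add(x,1)
```

Frozen dataclasses are used here so that terms compare by structure and can serve as dictionary keys.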

Higher-order term

Main article: Lambda calculus

Substitution

Main article: Substitution (logic)

A substitution is a mapping \sigma: V \longrightarrow T from variables to terms; the notation \{x_1 \mapsto t_1, \ldots, x_k \mapsto t_k \} refers to a substitution mapping each variable x_i to the term t_i, for i=1,\ldots,k, and every other variable to itself. Applying that substitution to a term t is written in postfix notation as t \{x_1 \mapsto t_1, \ldots, x_k \mapsto t_k \}; it means to (simultaneously) replace every occurrence of each variable x_i in the term t by t_i. The result t \sigma of applying a substitution \sigma to a term t is called an instance of that term t. As a first-order example, applying the substitution \{x \mapsto h(a,y), z \mapsto b\} to the term

f( x ,a,g( z ),y) yields
f( h(a,y) ,a,g( b ),y) .
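
Applying a substitution can be sketched as a recursive traversal of the term. In the sketch below, the term classes Var, Const and App and the function name apply are hypothetical, and a substitution is assumed to be a dictionary from variable names to terms; variables missing from the dictionary are mapped to themselves.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    symbol: str
    args: Tuple[object, ...]


def apply(t, sigma):
    """Return the instance t sigma, replacing all variables simultaneously."""
    if isinstance(t, Var):
        return sigma.get(t.name, t)   # unmapped variables stay unchanged
    if isinstance(t, Const):
        return t
    return App(t.symbol, tuple(apply(a, sigma) for a in t.args))


def show(t):
    if isinstance(t, (Var, Const)):
        return t.name
    return f"{t.symbol}({','.join(show(a) for a in t.args)})"


a, b = Const("a"), Const("b")
x, y, z = Var("x"), Var("y"), Var("z")

term = App("f", (x, a, App("g", (z,)), y))
sigma = {"x": App("h", (a, y)), "z": b}

print(show(apply(term, sigma)))  # f(h(a,y),a,g(b),y)
```

Because the replaced subterms are not traversed again, the y inside h(a,y) is left alone even if the substitution also mapped y, which is what "simultaneously" refers to.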

Generalization, specialization

If a term t has an instance equivalent to a term u, that is, if t \sigma \equiv u for some substitution \sigma, then t is called more general than u, and u is called more special than, or subsumed by, t. For example, x \oplus a is more general than a \oplus b if \oplus is commutative, since then (x \oplus a)\{x \mapsto b\} = b \oplus a \equiv a \oplus b.

If \equiv is literal (syntactic) identity of terms, a term may be both more general and more special than another one only if both terms differ just in their variable names, not in their syntactic structure; such terms are called variants, or renamings of each other. For example, f(x_1,a,g(z_1),y_1) is a variant of f(x_2,a,g(z_2),y_2), since f(x_1,a,g(z_1),y_1) \{ x_1 \mapsto x_2, y_1 \mapsto y_2, z_1 \mapsto z_2\} = f(x_2,a,g(z_2),y_2) and f(x_2,a,g(z_2),y_2) \{x_2 \mapsto x_1, y_2 \mapsto y_1, z_2 \mapsto z_1\} = f(x_1,a,g(z_1),y_1). However, f(x_1,a,g(z_1),y_1) is not a variant of f(x_2,a,g(x_2),x_2), since no substitution can transform the latter term into the former one, although \{x_1 \mapsto x_2, z_1 \mapsto x_2, y_1 \mapsto x_2 \} achieves the reverse direction. The latter term is hence properly more special than the former one.
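
For syntactic identity, "more general than" can be decided by one-sided matching: search for a substitution that turns the first term into the second. The sketch below assumes hypothetical term classes Var, Const and App and hypothetical function names match and is_variant; it illustrates the idea and is not taken from the cited literature.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    symbol: str
    args: Tuple[object, ...]


def match(general, special, sigma=None):
    """Return a substitution sigma with (general sigma) == special, or None."""
    sigma = {} if sigma is None else sigma
    if isinstance(general, Var):
        if general.name not in sigma:
            sigma[general.name] = special
            return sigma
        # A repeated variable must be mapped to the same term every time.
        return sigma if sigma[general.name] == special else None
    if isinstance(general, Const):
        return sigma if general == special else None
    if (not isinstance(special, App) or general.symbol != special.symbol
            or len(general.args) != len(special.args)):
        return None
    for g, s in zip(general.args, special.args):
        if match(g, s, sigma) is None:
            return None
    return sigma


def is_variant(s, t):
    """Two terms are variants if each is more general than the other."""
    return match(s, t) is not None and match(t, s) is not None


a = Const("a")
x1, y1, z1 = Var("x1"), Var("y1"), Var("z1")
x2, y2, z2 = Var("x2"), Var("y2"), Var("z2")

t1 = App("f", (x1, a, App("g", (z1,)), y1))
t2 = App("f", (x2, a, App("g", (z2,)), y2))
t3 = App("f", (x2, a, App("g", (x2,)), x2))

print(is_variant(t1, t2))          # True
print(match(t1, t3) is not None)   # True: t1 is more general than t3
print(match(t3, t1) is not None)   # False: t3 is properly more special
```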

A substitution \sigma is more special than, or subsumed by, a substitution \tau if x \sigma is more special than x \tau for each variable x. For example, \{ x \mapsto f(u), y \mapsto f(f(u)) \} is more special than \{ x \mapsto z, y \mapsto f(z) \}, since f(u) and f(f(u)) are more special than z and f(z), respectively.

Anti-unification problem, generalization set

An anti-unification problem is a pair \langle t_1, t_2 \rangle of terms. A term t is a common generalization, or anti-unifier, of t_1 and t_2 if t \sigma_1 \equiv t_1 and t \sigma_2 \equiv t_2 for some substitutions \sigma_1, \sigma_2. For a given anti-unification problem, a set S of anti-unifiers is called complete if each generalization subsumes some term t \in S; the set S is called minimal if none of its members subsumes another one.
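
For syntactic equality, these definitions can be checked directly on small examples. The sketch below reuses one-sided matching to test whether a term is an anti-unifier of the problem \langle 0*0, 4*4 \rangle and whether the singleton set \{x*x\} is complete with respect to a handful of candidate generalizations. The term classes and function names are hypothetical, and only the listed candidates are checked, not all generalizations.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    symbol: str
    args: Tuple[object, ...]


def match(general, special, sigma=None):
    """Return sigma with (general sigma) == special, or None if none exists."""
    sigma = {} if sigma is None else sigma
    if isinstance(general, Var):
        if general.name not in sigma:
            sigma[general.name] = special
            return sigma
        return sigma if sigma[general.name] == special else None
    if isinstance(general, Const):
        return sigma if general == special else None
    if (not isinstance(special, App) or general.symbol != special.symbol
            or len(general.args) != len(special.args)):
        return None
    for g, s in zip(general.args, special.args):
        if match(g, s, sigma) is None:
            return None
    return sigma


def subsumes(t, u):
    """True if t is more general than u (syntactic case)."""
    return match(t, u) is not None


def is_anti_unifier(t, t1, t2):
    """True if t has one instance equal to t1 and another equal to t2."""
    return subsumes(t, t1) and subsumes(t, t2)


zero, four = Const("0"), Const("4")
x, y, z = Var("x"), Var("y"), Var("z")
t1 = App("*", (zero, zero))           # 0*0
t2 = App("*", (four, four))           # 4*4

xx = App("*", (x, x))                 # x*x
xy = App("*", (x, y))                 # x*y
candidates = [xx, xy, z]              # z is the trivial generalization

print([is_anti_unifier(c, t1, t2) for c in candidates])    # [True, True, True]
print(is_anti_unifier(App("*", (zero, x)), t1, t2))        # False
print(all(subsumes(c, xx) for c in candidates))            # True
```

The last line checks the completeness condition for \{x*x\} on the listed candidates only: each of them subsumes x*x.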

First-order syntactical anti-unification

The framework of first-order syntactical anti-unification is based on T being the set of first-order terms (over some given set V of variables, C of constants and F_n of n-ary function symbols) and on \equiv being syntactic equality. In this framework, each anti-unification problem \langle t_1, t_2 \rangle has a complete, and obviously minimal, singleton solution set \{t\}. Its member t is called the least general generalization (lgg) of the problem; it has an instance syntactically equal to t_1 and another one syntactically equal to t_2. Any common generalization of t_1 and t_2 subsumes t. The lgg is unique up to variants: if S_1 and S_2 are both complete and minimal solution sets of the same syntactical anti-unification problem, then S_1 = \{ s_1\} and S_2 = \{ s_2 \} for some terms s_1 and s_2 which are renamings of each other.

Plotkin[1][2] has given an algorithm to compute the lgg of two given terms. It presupposes an injective mapping \phi: T \times T \longrightarrow V, that is, a mapping assigning each pair s,t of terms its own variable \phi(s,t), such that no two pairs share the same variable.[note 4] The algorithm consists of two rules:

 f(s_1,\ldots,s_n) \sqcup f(t_1,\ldots,t_n)  \rightsquigarrow  f( s_1 \sqcup t_1,  \ldots, s_n \sqcup t_n )
 s \sqcup t  \rightsquigarrow  \phi(s,t) if the previous rule is not applicable

For example, (0*0) \sqcup (4*4) \rightsquigarrow (0 \sqcup 4)*(0 \sqcup 4) \rightsquigarrow \phi(0,4) * \phi(0,4) \rightsquigarrow x*x; this least general generalization reflects the common property of both inputs of being square numbers.
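
The two rules can be sketched as a short recursive function. The sketch below is a minimal illustration under stated assumptions, not Plotkin's original formulation: the term classes and the function name lgg are hypothetical, \phi is built up on demand in a dictionary (as suggested in note 4), identical constants are treated as the n = 0 case of the first rule, and the generated names v0, v1, ... are assumed not to occur in the input terms.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    symbol: str
    args: Tuple[object, ...]


def lgg(s, t, phi=None):
    """Least general generalization of s and t (first-order, syntactic)."""
    phi = {} if phi is None else phi   # table for the injective mapping phi
    # Assumption: equal constants generalize to themselves (rule 1 with n = 0).
    if isinstance(s, Const) and s == t:
        return s
    # Rule 1: same function symbol and arity, so descend into the arguments.
    if (isinstance(s, App) and isinstance(t, App)
            and s.symbol == t.symbol and len(s.args) == len(t.args)):
        return App(s.symbol,
                   tuple(lgg(a, b, phi) for a, b in zip(s.args, t.args)))
    # Rule 2: otherwise use the variable phi(s, t), created on first use.
    if (s, t) not in phi:
        phi[(s, t)] = Var(f"v{len(phi)}")
    return phi[(s, t)]


def show(t):
    if isinstance(t, (Var, Const)):
        return t.name
    return f"{t.symbol}({','.join(show(a) for a in t.args)})"


zero, four = Const("0"), Const("4")

squares = lgg(App("*", (zero, zero)), App("*", (four, four)))
print(show(squares))   # *(v0,v0)

mixed = lgg(App("*", (zero, four)), App("*", (four, zero)))
print(show(mixed))     # *(v0,v1)
```

Sharing one table across the whole recursion is what makes the pair (0,4) map to the same variable in both argument positions, so the first result corresponds to x*x rather than x*y.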

Plotkin used his algorithm to compute the "relative least general generalization (rlgg)" of two clause sets in first-order logic, which was the basis of the Golem approach to inductive logic programming.

First-order anti-unification modulo theory

Equational theories

First-order sorted anti-unification

Applications

Anti-unification of trees and linguistic applications

Higher-order anti-unification

Notes

  1. Complete generalization sets always exist, but it may be the case that every complete generalization set is non-minimal.
  2. Comon referred in 1986 to inequation-solving as "anti-unification", which nowadays has become quite unusual. Comon, Hubert (1986). "Sufficient Completeness, Term Rewriting Systems and 'Anti-Unification'". Proc. 8th International Conference on Automated Deduction. LNCS 230. Springer. pp. 128–140.
  3. E.g.  a \oplus (b \oplus f(x)) \equiv a \oplus (f(x) \oplus b) \equiv (b \oplus f(x)) \oplus a \equiv (f(x) \oplus b) \oplus a
  4. From a theoretical viewpoint, such a mapping exists, since both V and T \times T are countably infinite sets; for practical purposes, \phi can be built up as needed, remembering assigned mappings \langle s,t,\phi(s,t) \rangle in a hash table.

References

  1. Plotkin, Gordon D. (1970). Meltzer, B.; Michie, D., eds. "A Note on Inductive Generalization". Machine Intelligence (Edinburgh University Press) 5: 153–163.
  2. Plotkin, Gordon D. (1971). Meltzer, B.; Michie, D., eds. "A Further Note on Inductive Generalization". Machine Intelligence (Edinburgh University Press) 6: 101–124.
  3. C.C. Chang, H. Jerome Keisler (1977). A. Heyting and H.J. Keisler and A. Mostowski and A. Robinson and P. Suppes, ed. Model Theory. Studies in Logic and the Foundation of Mathematics 73. North Holland.; here: Sect.1.3
  4. Boris Galitsky, Josep Lluís de la Rosa, Gabor Dobrocsi (2011). "Mapping Syntactic to Semantic Generalizations of Linguistic Parse Trees". FLAIRS Conference.
  5. Boris Galitsky, Kuznetsov SO, Usikov DA (2013). "Parse Thicket Representation for Multi-sentence Search". Lecture Notes in Computer Science 7735: 1072–1091. doi:10.1007/978-3-642-35786-2_12.
  6. Boris Galitsky, Gabor Dobrocsi, Josep Lluís de la Rosa, Sergei O. Kuznetsov (2010). "From Generalization of Syntactic Parse Trees to Conceptual Graphs". Lecture Notes in Computer Science 6208: 185–190. doi:10.1007/978-3-642-14197-3_19.
  7. Boris Galitsky, de la Rosa JL, Dobrocsi G. (2012). "Inferring the semantic properties of sentences by mining syntactic parse trees". Data & Knowledge Engineering. 81-82: 21–45. doi:10.1016/j.datak.2012.07.003.