User:LajujKej
From Wikipedia, the free encyclopedia
In phrase structure grammars, such as generalised phrase structure grammar, head-driven phrase structure grammar and lexical functional grammar, a feature structure is essentially a set of attribute-value pairs. For example the attribute named number might have the value singular. The value of an attribute may be either atomic, e.g. the symbol singular, or complex (most commonly a feature structure, but also a list or a set).
A feature structure can be represented as a directed acyclic graph (DAG), with the nodes corresponding to the variable values and the paths to the variable names. Operations defined on feature structures, e.g. unification, are used extensively in phrase structure grammars. In most theories (e.g. HPSG), operations are strictly speaking defined over equations describing feature structures and not over feature structures themselves, though feature structures are usually used in informal exposition.
Often, feature structures are written like this:
Here we have the two features category and agreement. Category has the value noun phrase whereas the value of agreement is indicated by another feature structure with the features number and person being singular and third.
This particular notation is called attribute value matrix (AVM)
The matrix has two columns. One for the feature names and the other for the values. In this sense a feature structure is a list of key-value pairs. The value might be atomic or another feature structure. This leads to another notation for feature structures - the use of trees. In fact some systems use S-expression to represent feature structures.
Contents |
[edit] Connections to Modal Logic
We know that feature structures can be represented as directed acyclic graphs. Blackburn (2003) makes the crucial insight that these directed acyclic graphs are Krikpe frames and unification implies modal satisfaction.
[edit] From the Feature Language F to the Modal Model M
Let be a language where A is a set of A-indexed propositional letters, i.e., atomic values. L is an L-indexed set of modal operators, i.e., attributes. A feature structure of signature L, A is a triple where:
- N is a non-empty set of nodes.
- is a binary partial function on N for all . Notice that is an L-indexed multiframe. Call it L.
- is a unary relation on N for all . Notice that is an interpretation of propositional letters.
With a frame and a valuation we have a modal model. Let be any arbitrary feature structure.
[edit] Syntax
Using some truth-functionally adequate collection of Boolean connectives, parentheses and the members of , build wffs of F as follows:
- All propositional variables are wffs.
- If φ,ψ are wffs, then and are wffs.
- If f is a modality and φ is a wff, then is a wff.
- Nothing is a wff unless constructed in accordance with 1-3.
We can define . Let for all .
[edit] An Example of a wff
This formula is precisely the AVM given above. Notice that conjunction encodes stacking.
[edit] Semantics
Remember that M is an arbitrary feature structure with any frame and any valuation . For all :
- iff for all .
- iff
- iff or
- iff there is some n' Rf-related to n and for all
[edit] Unification and Satisfiability
We know that unification is the combination of the information in two feature structures, yielding a new feature structure. This new feature structure must be a wff, so we can capture unification in the modal language with the principle:
- Two wffs φ and ψ are unifiable iff their conjunction is satisfiable.
The following proofs about satisfiability are therefore also proofs about feature unification.
[edit] Soundness and Completeness
The model logic defined by AVMs is sound and strongly complete with respect to the class of all functional frames. The completeness result can be extended to all point generated and acyclic feature structures.
[edit] KAV , the Minimal Axiomatization of M
The minimal axiomatization of M contains all propositional logic tautologies, the K-axiom and . The rules of proof are modus ponens and necessitation. Call this axiom schema KAV.
[edit] Soundness of KAV
Show . Proof by induction on the complexity of derivations.
Base case of complexity m = 1, i.e., the sentence letters. There are three cases:
- φ is an tautology of modal logic. We know that any tautology of modal logic can be obtained from a tautology of propositional logic by uniformly substituting modal formulas for propositional symbols, so .
- φ is an instance of the K axiom. because for all modal structures M and all worlds , .
- Suppose φ is an instance of . We will show that φ is true in all frames generated from , which by definition is a partial funcion. Suppose for some and , i.e., there is some such that nRfn' and . Now suppose for contradiction that there is some such that nRfn'' and . But then nRfn' and nRfn'' and , which contradicts the claim that is a function. Assuming that produces a contraction, so is valid on all frames generated from .
Induction Hypothesis: Assume for all derivations of complexity < m that . We can construct a derivation of complexity m by:
- Adding an axiom (in which case the proof is exactly the same as in the base case).
- Modus ponens (in which case the proof is exactly the same as for propositional logic).
- φ follows from necessitation, i.e., φ = [f]ψ for arbitrary . By the definition of necessitation, ψ is a theorem, i.e., . Since we are constructing a derivation of length m, it must be the case that the derivation of ψ from the empty set is of complexity < m, thus by the induction hypothesis. But if , then for all , , so for all nodes that are Rf-related to n, , so .
The induction is complete.
[edit] Strong Completeness of KAV
Show . We will use the Henkin method.
Let be the canonical model on the Henkin multiframe of signature . is a triple where:
- is the set of all maximal KAV consistent sets.
- is the binary relation on defined by iff for all and for all φ, implies . Call the canonical relation.
- iff for all and . Call the canonical or natural valuation.
Henkin Lemma: . Proof by induction on the complexity of formulae.
Base Case (i.e., formula of complexity n = 1): Immediate from the definition of the canonical valuation.
Induction Hypothesis: Assume for all formula of complexity < n that . We now show for all formula of complexity n. There are two cases.
- The Boolean cases follows by the definition of maximal consistent set.
- φ is of the form [f]ψ for some . Suppose for some . We need to show that , which means that for all that are -related to n, . Hence by the induction hypothesis. Suppose for contradiction there is a such that and . Hence . By the definition of the canonical relation it must be the case that . But this is a contradiction because and .
- Suppose for all and . We will show that . To do this we must construct a , such that and . That is, . means that is consistent. We just have to extend to a maximal consistent set, which we know is possible by Lindenbaum's lemma.
The induction is complete.
[edit] Extending the Completeness of KAV
Because of the axiom , we already know that the completeness result holds with respect to the class of functional frames, but KAV is also complete with respect to the class of point generated and acyclic feature structures. KAV completeness can also be extended with the addition of other common axioms for doing computational linguistics.
[edit] Completeness and Point Generated Feature Structures
Extend a set Γ to a maximal KAV consistent set Γ + . The submodel generated by Γ + is a feature structure and is by definition point generated. Because modal truth transfers to generated submodels, satisfies Γ + .
[edit] Completeness and Other Axioms
- The axiom eliminates constant-constant clashes. When we add it to the Henkin model, it makes KAV complete with respect to the class of all features structure frames such that no node is labeled with two distinct pieces of atomic information.
- The axiom assures that atomic information is only inserted at terminal nodes. When we add it to the Henkin model, it makes KAV complete with respect to the class of all features structures frames such that only terminal nodes are labeled with atomic information.
- The axiom , where are metavariables over sequences of modalities and ν is a metavariable over atomic variables, assures that each piece of atomic information is associated with a unique node. When we add this axiom to the Henkin model, it makes KAV complete with respect to the class of all feature structure frames such that each node is labelled with unique atomic information. In the canonical model, it means that each piece of atomic information can occur in at most one maximal KAV consistent set.
[edit] KAV, the Finite Model Property, and Decidability
KAV has the finite model property, which will also assure its decidability. It follows from Post's Theorem that every recursively axiomatized modal logic with the finite model property is decidable. Suppose for any formula φ that φ is satisfied in a model at an arbitrary node n. Build a finite model out of that also satisfies φ at n as follows:
- Let the natural number k be the maximum depth of nested modalities in φ.
- Let Lφ be the subset of L that contains the indices of the modalities that occur in φ.
- Let be the structure generated from n, but only with relations and only generated in k steps.
Notice that must be finite. We now show that satisfies φ at n. By definition, is a generated submodel of . Therefore because modal truth holds across generated submodels.
[edit] Conclusions
The fact that the feature structure formalisms used by many linguists are essentially modal logics has far reaching consequences. Not only does it allow for feature structures to be tied to other modal and first order logics, but it permits proofs about unification and the computational properties of feature structures. The connection also provides a means to see the implications of supplementing feature structure formalisms with new theoretical machinery. The addition of boxlabels, for example, requires a different modal structure, namely hybrid logic.