Ontology (information science)
From Wikipedia, the free encyclopedia
In both computer science and information science, an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain.
Ontologies are used in artificial intelligence, the Semantic Web, software engineering, biomedical informatics, library science, and information architecture as a form of knowledge representation about the world or some part of it. Common components of ontologies include:
- Individuals: instances or objects (the basic or "ground level" objects)
- Classes: sets, collections, concepts or types of objects[1]
- Attributes: properties, features, characteristics, or parameters that objects (and classes) can have
- Relations: ways that classes and objects can be related to one another
- Function terms: complex structures formed from certain relations that can be used in place of an individual term in a statement
- Restrictions: formally stated descriptions of what must be true in order for some assertion to be accepted as input
- Rules: statements in the form of an if-then (antecedent-consequent) sentence that describe the logical inferences that can be drawn from an assertion in a particular form
- Axioms: assertions (including rules) in a logical form that together comprise the overall theory that the ontology describes in its domain of application. This definition differs from that of "axioms" in generative grammar and formal logic. In these disciplines, axioms include only statements asserted as a priori knowledge. As used here, "axioms" also include the theory derived from axiomatic statements.
- Events: the changing of attributes or relations
Ontologies are commonly encoded using ontology languages.
Elements
Contemporary ontologies share many structural similarities, regardless of the language in which they are expressed. As mentioned above, most ontologies describe individuals (instances), classes (concepts), attributes, and relations. In this section each of these components is discussed in turn.
Individuals
Individuals (instances) are the basic, "ground level" components of an ontology. The individuals in an ontology may include concrete objects such as people, animals, tables, automobiles, molecules, and planets, as well as abstract individuals such as numbers and words. Strictly speaking, an ontology need not include any individuals, but one of the general purposes of an ontology is to provide a means of classifying individuals, even if those individuals are not explicitly part of the ontology.
Classes
Classes – concepts that are also called type, sort, category, and kind – are abstract groups, sets, or collections of objects. They may contain individuals, other classes, or a combination of both. Some examples of classes:[2]
- Person, the class of all people
- Vehicle, the class of all vehicles
- Car, the class of all cars
- Class, representing the class of all classes
- Thing, representing the class of all things
Ontologies vary on whether classes can contain other classes, whether a class can belong to itself, whether there is a universal class (that is, a class containing everything), etc. Sometimes restrictions along these lines are made in order to avoid certain well-known paradoxes.
The classes of an ontology may be extensional or intensional in nature. A class is extensional if and only if it is characterized solely by its membership. More precisely, a class C is extensional if and only if for any class C', if C' has exactly the same members as C, then C and C' are identical. If a class does not satisfy this condition, then it is intensional. While extensional classes are better behaved and better understood mathematically, as well as less problematic philosophically, they do not permit the fine-grained distinctions that ontologies often need to make. For example, an ontology may want to distinguish between the class of all creatures with a kidney and the class of all creatures with a heart, even if these classes happen to have exactly the same members. In the upper ontologies discussed later in this article, the classes are defined intensionally. Intensionally defined classes usually have necessary conditions associated with membership in each class. Some classes may also have sufficient conditions, and in those cases the combination of necessary and sufficient conditions makes that class a fully defined class.
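The extensional/intensional distinction can be sketched in Python. This is only an illustration under stated assumptions: the name `IntensionalClass`, its `extension` method, and the sample creatures are hypothetical and do not come from any ontology library. An intensional class is identified by its name and membership condition, so two such classes stay distinct even when the conditions pick out the same members.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Set

# Hypothetical sample data: each creature and the organs it has.
CREATURES: Dict[str, Set[str]] = {
    "dog": {"heart", "kidney"},
    "frog": {"heart", "kidney"},
}

@dataclass(frozen=True)
class IntensionalClass:
    """A class identified by its name and condition, not by its members."""
    name: str
    condition: Callable[[Set[str]], bool]

    def extension(self) -> FrozenSet[str]:
        # The members that the condition currently picks out.
        return frozenset(
            creature for creature, organs in CREATURES.items()
            if self.condition(organs)
        )

with_heart = IntensionalClass("CreatureWithHeart", lambda organs: "heart" in organs)
with_kidney = IntensionalClass("CreatureWithKidney", lambda organs: "kidney" in organs)

# Extensionally the two classes coincide (same members) ...
same_members = with_heart.extension() == with_kidney.extension()
# ... but as intensional classes they remain different.
same_class = with_heart == with_kidney
```

Under a purely extensional reading, only `same_members` would matter and the two classes would be identical; the intensional reading keeps them apart.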
Importantly, a class can subsume or be subsumed by other classes; a class subsumed by another is called a subclass of the subsuming class. For example, Vehicle subsumes Car, since (necessarily) anything that is a member of the latter class is a member of the former. The subsumption relation is used to create a hierarchy of classes, typically with a maximally general class like Thing at the top, and very specific classes like 2002 Ford Explorer at the bottom. The critically important consequence of the subsumption relation is the inheritance of properties from the parent (subsuming) class to the child (subsumed) class. Thus, anything that is necessarily true of a parent class is also necessarily true of all of its subsumed child classes. In some ontologies, a class is only allowed to have one parent (single inheritance), but in most ontologies, classes are allowed to have any number of parents (multiple inheritance), and in the latter case all necessary properties of each parent are inherited by the subsumed child class. Thus a particular class of animal (HouseCat) may be a child of the class Cat and also a child of the class Pet.
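Subsumption and property inheritance can be sketched as a small Python example. The hierarchy below, the property names, and the helper functions `ancestors`, `subsumes`, and `inherited_properties` are assumptions made for illustration, not part of any particular ontology language. The sketch allows multiple parents, as in the HouseCat example.

```python
from typing import Dict, Set

# Hypothetical hierarchy: each class maps to its direct parents.
PARENTS: Dict[str, Set[str]] = {
    "Thing": set(),
    "Animal": {"Thing"},
    "Cat": {"Animal"},
    "Pet": {"Thing"},
    "HouseCat": {"Cat", "Pet"},  # multiple inheritance
}

# Hypothetical necessary properties asserted directly on each class.
OWN_PROPERTIES: Dict[str, Set[str]] = {
    "Animal": {"is-alive"},
    "Cat": {"has-whiskers"},
    "Pet": {"has-owner"},
}

def ancestors(cls: str) -> Set[str]:
    """All classes that subsume cls, found by following parent links."""
    found: Set[str] = set()
    stack = list(PARENTS[cls])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS[parent])
    return found

def subsumes(general: str, specific: str) -> bool:
    """True if every member of specific is necessarily a member of general."""
    return general == specific or general in ancestors(specific)

def inherited_properties(cls: str) -> Set[str]:
    """Properties asserted on cls itself plus those of all its ancestors."""
    props = set(OWN_PROPERTIES.get(cls, set()))
    for parent in ancestors(cls):
        props |= OWN_PROPERTIES.get(parent, set())
    return props
```

Here `inherited_properties("HouseCat")` collects the properties of both Cat and Pet, as well as those of Animal further up the hierarchy.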
A partition is a set of related classes and associated rules that allow objects to be placed into the appropriate class. For example, an ontology may partition the Car class into the classes 2-Wheel Drive and 4-Wheel Drive. The partition rule determines whether a particular car is placed in the 2-Wheel Drive or the 4-Wheel Drive class.
If the partition rule(s) guarantee that a single Car cannot be in both classes, then the partition is called a disjoint partition. If the partition rules ensure that every concrete object in the super-class is an instance of at least one of the partition classes, then the partition is called an exhaustive partition.
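The two partition properties can be checked mechanically. The following Python sketch assumes a partition is given as a mapping from class name to its set of members; the function names and the sample cars are hypothetical.

```python
from typing import Dict, Set

def is_disjoint(partition: Dict[str, Set[str]]) -> bool:
    """True if no object appears in more than one partition class."""
    seen: Set[str] = set()
    for members in partition.values():
        if seen & members:
            return False
        seen |= members
    return True

def is_exhaustive(superclass: Set[str], partition: Dict[str, Set[str]]) -> bool:
    """True if every object of the superclass is in at least one partition class."""
    covered: Set[str] = set().union(*partition.values())
    return superclass <= covered

# Hypothetical members of the Car class and a partition of it.
cars = {"explorer", "mustang", "focus"}
by_drive = {
    "2-Wheel Drive": {"mustang", "focus"},
    "4-Wheel Drive": {"explorer"},
}

disjoint = is_disjoint(by_drive)
exhaustive = is_exhaustive(cars, by_drive)
```

A partition that is both disjoint and exhaustive places every car in exactly one of the partition classes.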
Attributes
Objects in the ontology can be described by assigning attributes to them. Each attribute has at least a name and a value, and is used to store information that is specific to the object it is attached to. For example, the Ford Explorer object has attributes such as:
- Name: Ford Explorer
- Number-of-doors: 4
- Engine: {4.0L, 4.6L}
- Transmission: 6-speed
The value of an attribute can be a complex data type; in this example, the value of the attribute called Engine is a list of values, not just a single value.
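As a minimal sketch, the attributes listed above could be stored as a plain Python mapping. The variable name `ford_explorer` and the helper `attribute_values` are assumptions for illustration; real ontology languages have their own ways of typing attribute values.

```python
from typing import Any, Dict, List

# The attribute values listed above, stored as a plain mapping.
ford_explorer: Dict[str, Any] = {
    "Name": "Ford Explorer",
    "Number-of-doors": 4,
    "Engine": ["4.0L", "4.6L"],  # a multi-valued attribute
    "Transmission": "6-speed",
}

def attribute_values(obj: Dict[str, Any], name: str) -> List[Any]:
    """Return an attribute's values as a list, whether it holds one value or many."""
    value = obj[name]
    return list(value) if isinstance(value, (list, tuple, set)) else [value]
```

Normalizing every attribute to a list lets calling code treat single-valued and multi-valued attributes uniformly.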
Without attributes defined for its concepts, a structure would be either a taxonomy (if hyponym relationships exist between concepts) or a controlled vocabulary. These are useful, but are not considered true ontologies.
Relationships
An important use of attributes is to describe the relationships (also known as relations) between objects in the ontology. Typically a relation is an attribute whose value is another object in the ontology. For example, in the ontology that contains the Ford Explorer and the Ford Bronco, the Ford Bronco object might have the following attribute:
- Successor: Ford Explorer
This tells us that the Explorer is the model that replaced the Bronco. Much of the power of ontologies comes from the ability to describe these relations. Together, the set of relations describes the semantics of the domain.
The most important type of relation is the subsumption relation (is-superclass-of, the converse of is-a, is-subtype-of or is-subclass-of). This defines which objects are members of classes of objects. For example, the Ford Explorer is-a 4-Wheel Drive, which in turn is-a Car.
The addition of the is-a relationships has created a hierarchical taxonomy: a tree-like structure (or, more generally, a partially ordered set) that clearly depicts how objects relate to one another. In such a structure, each object is the 'child' of a 'parent class'. (Some languages restrict the is-a relationship to one parent for all nodes, but many do not.)
Another common type of relation is the meronymy relation, written as part-of, that represents how objects combine to form composite objects. For example, if we extended our example ontology to include objects like Steering Wheel, we would say that "Steering Wheel is-part-of Ford Explorer" since a steering wheel is one of the components of a Ford Explorer. If we introduce meronymy relationships to our ontology, we find that this simple and elegant tree structure quickly becomes complex and significantly more difficult to interpret manually. It is not difficult to understand why: an entity that is described as 'part of' another entity might also be 'part of' a third entity. Consequently, entities may have more than one parent. The structure that emerges is known as a directed acyclic graph (DAG).
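The way part-of links turn a tree into a DAG can be sketched in Python. The `PART_OF` data below is hypothetical (in particular, placing Steering Wheel in both the Explorer and the Bronco is an assumption for illustration), as are the helper names. The acyclicity check relies on the standard-library `graphlib` module, available from Python 3.9.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, Set

# Hypothetical part-of links: each part maps to the wholes it belongs to.
PART_OF: Dict[str, Set[str]] = {
    "Steering Wheel": {"Ford Explorer", "Ford Bronco"},  # two parents
    "Engine": {"Ford Explorer"},
    "Piston": {"Engine"},
}

def is_acyclic(edges: Dict[str, Set[str]]) -> bool:
    """True if the part-of graph has no cycles, that is, if it is a DAG."""
    try:
        # static_order() raises CycleError once iterated if a cycle exists.
        list(TopologicalSorter(edges).static_order())
        return True
    except CycleError:
        return False

def wholes_of(part: str) -> Set[str]:
    """Every composite object that contains part, directly or indirectly."""
    found: Set[str] = set()
    stack = list(PART_OF.get(part, set()))
    while stack:
        whole = stack.pop()
        if whole not in found:
            found.add(whole)
            stack.extend(PART_OF.get(whole, set()))
    return found
```

Because Steering Wheel has two parents, the structure is no longer a tree, but it remains a DAG as long as nothing is, directly or indirectly, part of itself.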
As well as the standard is-a and part-of relations, ontologies often include additional types of relation that further refine the semantics they model. These relations are often domain-specific and are used to answer particular types of question.
For example, in the domain of automobiles, we might define a made-in relationship which tells us where each car is built. So the Ford Explorer is made-in Louisville. The ontology may also know that Louisville is-in Kentucky and Kentucky is-a state of the USA. Software using this ontology could now answer a question like "which cars are made in America?"
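The kind of query described above can be sketched in Python over subject-relation-object triples. This is an illustration under stated assumptions: the statement that Kentucky is a state of the USA is simplified to an is-in link, the second car and its locations are invented placeholders, and the function names are hypothetical.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

TRIPLES: Set[Triple] = {
    ("Ford Explorer", "made-in", "Louisville"),
    ("Louisville", "is-in", "Kentucky"),
    # Assumption: "is-a state of the USA" is modeled as a plain is-in link.
    ("Kentucky", "is-in", "USA"),
    # Hypothetical placeholder data for contrast.
    ("Example Car", "made-in", "Example City"),
    ("Example City", "is-in", "Elsewhere"),
}

def located_in(place: str, region: str) -> bool:
    """Follow is-in links transitively from place to see whether region is reached."""
    frontier, seen = [place], set()
    while frontier:
        current = frontier.pop()
        if current == region:
            return True
        if current in seen:
            continue
        seen.add(current)
        frontier.extend(o for s, p, o in TRIPLES if s == current and p == "is-in")
    return False

def cars_made_in(region: str) -> List[str]:
    """All subjects with a made-in link to a place inside region."""
    return sorted(
        s for s, p, o in TRIPLES
        if p == "made-in" and located_in(o, region)
    )
```

With this data, `cars_made_in("USA")` finds the Ford Explorer even though no triple states that fact directly; the answer follows from chaining made-in with the is-in links.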
Domain ontologies and upper ontologies
A domain ontology (or domain-specific ontology) models a specific domain, or part of the world. It represents the particular meanings of terms as they apply to that domain. For example, the word card has many different meanings. An ontology about the domain of poker would model the "playing card" meaning of the word, while an ontology about the domain of computer hardware would model the "punch card" and "video card" meanings.
An upper ontology (or foundation ontology) is a model of the common objects that are generally applicable across a wide range of domain ontologies. It contains a core glossary in whose terms objects in a set of domains can be described. There are several standardized upper ontologies available for use, including Dublin Core, GFO, OpenCyc/ResearchCyc, SUMO, and DOLCE. WordNet, while considered an upper ontology by some, is not an ontology: it is a unique combination of a taxonomy and a controlled vocabulary (see above, under Attributes)[citation needed].
The Gellish ontology is an example of a combination of an upper and a domain ontology.
Since domain ontologies represent concepts in very specific and often eclectic ways, they are often incompatible. As systems that rely on domain ontologies expand, they often need to merge domain ontologies into a more general representation. This presents a challenge to the ontology designer. Different ontologies in the same domain can also arise due to different perceptions of the domain based on cultural background, education, ideology, or because a different representation language was chosen.
At present, merging ontologies is a largely manual process and therefore time-consuming and expensive. Using a foundation ontology to provide a common definition of core terms can make this process manageable. There are studies on generalized techniques for merging ontologies, but this area of research is still largely theoretical.
Ontology languages
An ontology language is a formal language used to encode the ontology. There are a number of such languages for ontologies, both proprietary and standards-based:
- OWL is a language for making ontological statements, developed as a follow-on from RDF and RDFS, as well as earlier ontology language projects including OIL, DAML and DAML+OIL. OWL is intended to be used over the World Wide Web, and all its elements (classes, properties and individuals) are defined as RDF resources, and identified by URIs.
- KIF is a syntax for first-order logic that is based on S-expressions.
- The Cyc project has its own ontology language called CycL, based on first-order predicate calculus with some higher-order extensions.
- Rule Interchange Format (RIF) and F-Logic combine ontologies and rules.
- The Gellish language includes rules for its own extension and thus integrates an ontology with an ontology language.
Relation to the philosophical term
The term ontology has its origin in philosophy, where it is the name of one fundamental branch of metaphysics, concerned with analyzing various types or modes of existence, often with special attention to the relations between particulars and universals, between intrinsic and extrinsic properties, and between essence and existence. According to Tom Gruber at Stanford University, the meaning of ontology in the context of computer science is “a description of the concepts and relationships that can exist for an agent or a community of agents.” He goes on to specify that an ontology is generally written “as a set of definitions of formal vocabulary.” [3]
What ontology in computer science and ontology in philosophy have in common is the representation of entities, ideas, and events, along with their properties and relations, according to a system of categories. In both fields, one finds considerable work on problems of ontological relativity (e.g. Quine and Kripke in philosophy, Sowa and Guarino in computer science; see Sowa, John F., "Top-level ontological categories", International Journal of Human-Computer Studies, v. 43 (November/December 1995), pp. 669-85), and debates concerning whether a normative ontology is viable (e.g. debates over foundationalism in philosophy, debates over the Cyc project in AI).
Differences between the two are largely matters of focus. Philosophers are less concerned with establishing fixed, controlled vocabularies than are researchers in computer science, while computer scientists are less involved in discussions of first principles (such as debating whether there are such things as fixed essences, or whether entities must be ontologically more primary than processes). During the second half of the 20th century, philosophers extensively debated the possible methods or approaches to building ontologies, without actually building any very elaborate ontologies themselves. By contrast, computer scientists were building some large and robust ontologies (such as WordNet and Cyc) with comparatively little debate over how they were built.
In the early years of the 21st century, the interdisciplinary project of cognitive science has been bringing the two circles of scholars closer together. For example, there is talk of a "computational turn in philosophy" which includes philosophers analyzing the formal ontologies of computer science (sometimes even working directly with the software), while researchers in computer science have been making more references to those philosophers who work on ontology (sometimes with direct consequences for their methods). Still, many scholars in both fields are uninvolved in this trend of cognitive science, and continue to work independently of one another, pursuing separately their different concerns.
Resources
Examples of published ontologies
- Dublin Core, a simple ontology for documents and publishing.
- Cyc for formal representation of the universe of discourse.
- Suggested Upper Merged Ontology, which is a formal upper ontology
- Basic Formal Ontology (BFO), a formal upper ontology designed to support scientific research
- Gellish English dictionary, an ontology comprising a dictionary and a taxonomy, with an upper ontology and a lower ontology that focuses on industrial and business applications in engineering, technology and procurement.
- Generalized Upper Model, a linguistically-motivated ontology for mediating between client systems and natural language technology
- WordNet, a lexical reference system
- OBO Foundry: a suite of interoperable reference ontologies in biomedicine.
- The Ontology for Biomedical Investigations is an open access, integrated ontology for the description of biological and clinical investigations.
- COSMO: An OWL ontology that is a merger of the basic elements of the OpenCyc and SUMO ontologies, with additional elements.
- Gene Ontology for genomics
- PRO, the Protein Ontology of the Protein Information Resource, Georgetown University.
- Protein Ontology for proteomics
- Foundational Model of Anatomy for human anatomy
- SBO, the Systems Biology Ontology, for computational models in biology
- Plant Ontology for plant structures and growth/development stages, etc.
- CIDOC CRM (Conceptual Reference Model) - an ontology for "cultural heritage information".
- GOLD (General Ontology for Linguistic Description)
- Linkbase A formal representation of the biomedical domain, founded upon Basic Formal Ontology (BFO).
- Foundational, Core and Linguistic Ontologies
- ThoughtTreasure ontology
- LPL Lawson Pattern Language
- TIME-ITEM Topics for Indexing Medical Education
- POPE Purdue Ontology for Pharmaceutical Engineering
- IDEAS Group: a formal ontology for enterprise architecture being developed by the Australian, Canadian, UK and U.S. defence departments.
- program abstraction taxonomy
- SWEET Semantic Web for Earth and Environmental Terminology
- CCO The Cell-Cycle Ontology is an application ontology that represents the cell cycle
Ontology libraries
The development of ontologies for the Web has led to the emergence of services providing lists or directories of ontologies with search facilities. Such directories have been called ontology libraries.
The following are static libraries of human-selected ontologies.
- The DAML Ontology Library maintains a legacy of ontologies in DAML.
- SchemaWeb is a directory of RDF schemata expressed in RDFS, OWL and DAML+OIL.
The following are both directories and search engines. They include crawlers searching the Web for well-formed ontologies.
- Swoogle is a directory and search engine for all RDF resources available on the Web, including ontologies.
- The OntoSelect Ontology Library offers similar services for RDF/S, DAML and OWL ontologies.
- Ontaria is a "searchable and browsable directory of semantic web data", with a focus on RDF vocabularies with OWL ontologies.
See also
- Commonsense knowledge bases
- Controlled vocabulary
- Formal concept analysis
- Lattice
- Ontology alignment
- Ontology editor
- Ontology learning
- Open Biomedical Ontologies
- Soft ontology
- Terminology extraction
- Weak ontology
Related philosophical concepts
References
- ^ See Class (set theory), Class (computer science), and Class (philosophy), each of which is relevant but not identical to the notion of a "class" here.
- ^ Note that the names given to the classes mentioned here are entirely a matter of convention.
- ^ What is an Ontology?
External links
- What is an ontology?
- Introduction to Description Logics DL course by Enrico Franconi, Faculty of Computer Science, Free University of Bolzano, Italy
- What are the differences between a vocabulary, a taxonomy, a thesaurus, an ontology, and a meta-model?
- Metadata? Thesauri? Taxonomies? Topic Maps! - Making sense of it all
- Clay Shirky: Ontology is Overrated
- Ontolog (a.k.a. Ontolog Forum) - An open, international, virtual community of practice working on the application and adoption of ontological engineering and semantic technologies.
- Barry Smith's Ontology Page
- John Bateman's Ontology Portal
- Buffalo Ontology Site
- National Center for Ontological Research
- National Center for Biomedical Ontology
- The Ontology and Taxonomy Coordinating Working Group
- Bremen Ontology Research Group
- The OBO Foundry
- The Laboratory for Applied Ontology (LOA)
- ekoss.org - Expert Knowledge Ontology-based Semantic Search
- Streaming video: "How to Build an Ontology", by Barry Smith.
- Jena – A Semantic Web Framework for Java
- Soft ontologies
- The IDEAS Group Website
- InMoBio: Integration and Modularization of Bio-ontologies
- A “relativity” between Ontology and Epistemology (see part 3)