Introduction to ontologies

An ontology is a list of formal terms, i.e. identifiers for categories of objects, plus machine-understandable statements that give these terms a formal meaning: some of their characteristics, their inter-relations, and rules or constraints on their use. These statements permit some semantic checking when new statements are entered manually, and some logical inferencing to generate new statements, e.g. for information retrieval purposes.
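The two uses named above, semantic checking and inferencing, can be illustrated with a minimal Python sketch. Everything in it is a hypothetical assumption made for illustration (the category names, the "eats" relation, its signature and the function names); it is not the representation used by any particular system.

```python
# Hypothetical toy ontology: every term and relation here is an assumption.
SUBTYPE_OF = {
    "cat": "mammal",
    "dog": "mammal",
    "mammal": "animal",
    "animal": "physical_entity",
    "table": "physical_entity",
}

# Relation signatures: (required subject category, required object category).
SIGNATURES = {"eats": ("animal", "physical_entity")}


def supertypes(term):
    """Return the term followed by all of its direct and inherited supertypes."""
    chain = [term]
    while chain[-1] in SUBTYPE_OF:
        chain.append(SUBTYPE_OF[chain[-1]])
    return chain


def is_a(term, category):
    """Inference: true if `term` is `category` or one of its specializations."""
    return category in supertypes(term)


def check_statement(subject, relation, obj):
    """Semantic check: accept a statement only if it respects the relation signature."""
    if relation not in SIGNATURES:
        return False
    subject_type, object_type = SIGNATURES[relation]
    return is_a(subject, subject_type) and is_a(obj, object_type)


print(supertypes("cat"))                        # inferred generalizations of "cat"
print(check_statement("cat", "eats", "table"))  # respects the signature
print(check_statement("table", "eats", "cat"))  # violates it: a table is not an animal
```

Here the statement "a table eats a cat" is rejected because nothing in the ontology allows inferring that a table is an animal, while the fact that a cat is a physical entity is never stated directly but is derived from the chain of "subtype of" statements.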

Relational database schemas may be viewed as simple forms of ontologies but, although some dependency relations are represented, semantic relations between the terms (column names) are most often not made explicit, and hence knowledge representation and inferencing are very limited. In most knowledge-based systems (KBSs), the users are allowed to add new categories, relations, rules or constraints. This dynamic modification of the ontology, and the (potential) complexity of the knowledge representations and inferencing, may be the reasons why so few large-scale KBSs have been built (the notable exception being PARKA-DB). However, in less than two years, we have implemented WebKB-2, a fast and large-scale multi-user-oriented KBS permitting its users to store and retrieve very complex knowledge representations, including procedural knowledge (backward and forward chaining of rules is not yet implemented).

Some ontologies are about mathematical entities (e.g. sets, functions, numbers, sequences), about relationships from/to physical dimensions (e.g. space, time and matter), or about a particular domain (e.g. elevators and chemical elements). They are often called theories, are generally small, and may include, generalize, specialize or compete with other theories. Since 1993, the Ontolingua server has hosted a library of such ontologies and permitted Web users to add new theories or combine theories to create knowledge bases.

Some ontologies classify all the concepts of a natural language or a particular domain, via links such as "subtype of", "instance of" and "part of". They are often called lexical ontologies and may be large. For example, WordNet is a "lexical database for English" that was Web-accessible as early as 1990, and now connects about 337,200 words to about 109,400 concept types, and organizes these types via various kinds of links, e.g. "specialization", "exclusion", "similar", "member", "part" and "substance".
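As a rough illustration of this structure, the following Python sketch connects a few words to concept types and organizes the types with typed links. The data and function names are hypothetical assumptions; this is neither WordNet's actual content nor its API.

```python
# Hypothetical miniature lexical ontology; all entries are illustrative assumptions.
# One word may name several concept types, and several words may share one type.
WORD_TO_CONCEPTS = {
    "bank": ["financial_institution", "river_bank"],
    "car": ["car"],
    "automobile": ["car"],
    "wheel": ["wheel"],
}

# Typed links between concept types, stored as (source, link kind, target).
LINKS = [
    ("car", "subtype of", "vehicle"),
    ("truck", "subtype of", "vehicle"),
    ("vehicle", "subtype of", "artifact"),
    ("wheel", "part of", "car"),
    ("financial_institution", "subtype of", "organization"),
]


def concepts_for(word):
    """Return the concept types that a word may refer to."""
    return WORD_TO_CONCEPTS.get(word, [])


def sources(concept, kind):
    """Return the concept types linked to `concept` by a link of the given kind."""
    return [s for (s, k, t) in LINKS if t == concept and k == kind]


def all_subtypes(concept):
    """Collect direct and indirect subtypes by following "subtype of" links."""
    found, stack = [], [concept]
    while stack:
        current = stack.pop()
        for sub in sources(current, "subtype of"):
            if sub not in found:
                found.append(sub)
                stack.append(sub)
    return found


print(concepts_for("bank"))             # an ambiguous word maps to two concept types
print(sorted(all_subtypes("artifact"))) # retrieval through the specialization links
print(sources("car", "part of"))        # parts recorded for the concept type "car"
```

Keeping the word-to-type mapping separate from the links between types is what lets synonyms ("car", "automobile") share one concept type while an ambiguous word ("bank") points to several.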

Some ontologies classify relation types (e.g. spatial/temporal/thematic relation types) and/or very general concept types (e.g. the notions of situation, state, process, spatial entity, physical_entity) mainly via "subtype of" links. They are often called top-level ontologies. Examples are John Sowa's ontologies (1984 and 2000) and the Generalized Upper Model (1994).

Top-level ontologies may be used for structuring the top layers of lexical ontologies. For example, Sensus was created in 1994 by semi-automatically merging WordNet, LDOCE (the Longman Dictionary of Contemporary English) and two top-level ontologies: the Generalized Upper Model and Ontos. Similarly, in 1995, we used Sowa's first top-level ontology to structure the top layers of WordNet and hence permit semantic checking on the use of WordNet categories. In 1998, HPKB upper was created by combining the Sensus top-level ontology with the CYC top-level ontology.

Categories of lexical ontologies may be used as generalizations for the categories in theories (which are generally much more precisely defined) and hence permit the retrieval and comparison of these categories and theories. This was also a goal for our work in 1995.
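One way to picture such a comparison is sketched below in Python: categories from separate theories are anchored to lexical categories, and two theory categories are compared through their closest common generalization. The theory names, anchors and hierarchy are hypothetical assumptions, and this simple chain walk is only an illustration, not the method used in the work mentioned above.

```python
# Hypothetical lexical hierarchy (category -> direct supertype); assumptions only.
LEXICAL_SUPERTYPE = {
    "elevator": "lifting_device",
    "crane": "lifting_device",
    "lifting_device": "device",
    "device": "artifact",
    "chemical_element": "substance",
}

# Hypothetical anchoring of theory categories to lexical categories.
THEORY_ANCHOR = {
    "elevator_theory#Lift": "elevator",
    "crane_theory#Hoist": "crane",
    "chemistry#Element": "chemical_element",
}


def generalizations(category):
    """Return the lexical generalizations of a theory or lexical category."""
    lexical = THEORY_ANCHOR.get(category, category)
    chain = [lexical]
    while chain[-1] in LEXICAL_SUPERTYPE:
        chain.append(LEXICAL_SUPERTYPE[chain[-1]])
    return chain


def closest_common_generalization(a, b):
    """Return the most specific lexical category generalizing both, or None."""
    b_chain = set(generalizations(b))
    for candidate in generalizations(a):
        if candidate in b_chain:
            return candidate
    return None


print(closest_common_generalization("elevator_theory#Lift", "crane_theory#Hoist"))
print(closest_common_generalization("elevator_theory#Lift", "chemistry#Element"))
```

In this toy hierarchy the two lifting categories meet at "lifting_device", whereas the elevator and chemistry categories share no generalization, which signals that the corresponding theories are not comparable through these anchors.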

Related links: ontology in WebKB-2, a list of ontologies, a list of projects on ontologies.

Martin Ph. & Eklund P. (2002). Manageable Approaches to the Semantic Web.
"Practice & Experience" track of WWW 2002, 11th International World Wide Web Conference,
Honolulu, Hawaii, USA, May 7-11, 2002.