comparabilityArticle.html

Ontology Design Rules Based On Comparability Via Particular Relations ____________________________________________________________________________________________

1st author: Dr Ph. A. Martin, ^{[0000-0002-6793-8760]}
2nd author (for the writing of this article and its SPARQL+OWL related parts): Dr O. Corby ^{[0000-0001-6610-0969]},
3rd author (for the Semantics 2019 related version of this article and its slides):
Dr Catherine Faron Zucker ^{[0000-0001-5959-5561]}

This article is an extended (and HTML based) version of our article published in the proceedings of Semantics 2019 (where it is restricted to 15 pages). The parts in dark green (like this sentence) are the parts that are not included in the submitted version but cited to be retrieved here. The parts in olive color (like this sentence) are the parts that were included in the submitted version but removed from the final version to make it simpler (making it simpler was asked by the reviewers).

Another article (in preparation) applies and strongly extends some ideas of this present article by focusing on the notion of ontology completeness, and more generally, by using a new viewpoint: allowing the evaluation of connectedness (between objects) based on single important relations (plus arbitrary combinations of them) instead of on comparability between objects. Indeed, the way it is defined in this present aricle, comparability is a stronger notion than connectedness: this is convenient for this but not generic enough for evaluating an ontology completeness. More precisely, comparability is defined as i) testing the identity or equivalence relations, and then ii) testing the connectedness of objects via a given set of relation kinds. The connectedness notion does not make the first step mandatory: it allows the testing of the identity or equivalence relations only if they are included in the given set of relations kinds.

Abstract. The difficulty of representing and organizing knowledge, especially in reasonably complete and not “implicitly redundant” ways, raises at least two research questions: “how to check that particular relations are systematically used not simply whenever this is possible but whenever this is relevant for the knowledge providers?” and “how to extend best practices, ontology patterns or methodologies that advocate the systematic use of particular relations, and ease the checking of the compliance with these methods?”. As an answer to these research questions, this article proposes a generic “ontology design rule” (ODR). A first general formulation of this generic ODR is: in a given knowledge base (KB), for each pair of KB objects (types or individuals) of a given set chosen by the user of this ODR, there should be either statements connecting these objects by relations of particular given types or statements negating such relations, i.e. expressing that these relations do not or cannot occur in the given KB. This article further specifies this ODR and shows its interests for, respectively, i) subtype relations, ii) specialization relations between statements or other individuals, and iii) other relations, e.g. part relations and specialization relations with genus & differentia. This article proposes a generic implementation of this ODR in a higher-order logic based language. It also shows how this ODR can be implemented via OWL and SPARQL, at least for common or simple cases.

Keywords: ontology design pattern, knowledge organization, ontology completeness, OWL, SPARQL.

Topics (from the Semantics 2019 list): i) Data Quality Management and Assurance; ii) Terminology, Thesaurus & Ontology Management; iii) Web Semantics & Linked (Open) Data.

Table of Contents

1. Introduction 2. Comparability of Types Via Subtype Relations 2.1. Representation Via OWL 2.2. Checking Via SPARQL 2.3. Advantages For Reducing Implicit Redundancies, Detecting Inconsistencies and Increasing Knowledge Querying Possibilities 3. Other Interesting Cases of Comparability 3.1. Comparability of Statements Or Individuals Via Specialization Relations 3.2. Comparability Via "Definition Element" Relations 3.3. Comparability Via Other Transitive Relations, Especially Part Relations 3.4. Comparability Via Transitive Relations Plus Minimal Differentia 3.5. Comparability Via Non-transitive Relations 4. Comparisons With Other Works and Conclusion 5. Appendix: One Logic-based Function For Checking Comparability 6. References

1. Introduction

Representing and organizing knowledge within or across knowledge bases (KBs) is a fundamental and difficult task for knowledge sharing and inferencing, and thereby for knowledge retrieval and exploitation. At least three kinds of research avenues (relevant to refer to in this article) guide this task. The first are ontologies made for reuse purposes (with methodologies implicitly or explicitly based on these ontologies, e.g. the Ontoclean methodology): foundational ontologies such as DOLCE, Onto-UML and BFO; task-oriented ones such as OWL-S; general ones such as DBpedia and Schema.org; domain-oriented ones such as those from BioPortal. The second are catalogs of best practices [PlanetData D2.1: Mendes et al., 2012; pp. 26-29] [Farias et al., 2017: W3C Recommendation] and ontology patterns [ODP catalog: Presutti & Gangemi, 2008] [Dodds & Davis, 2012] or anti-patterns [Ruy et al., 2017] [Roussey et al., 2007]. The third are ontology/KB evaluation criteria and measures [Zaveri et al., 2016], e.g. for knowledge interlinking/connectedness, accuracy/precision, consistency, conciseness and completeness. The results of these three kinds of research avenues are especially helpful for building reusable ontologies.

These three kinds of research avenues advocate the use of relations of particular types between objects of particular types. In the RDF terminology, one would say that these three kinds of research avenues advocate the use of properties to connect resources of particular classes – e.g. the use of subClassOf or equivalentClass relations between classes or other objects, whenever this is relevant. However, often, only a knowledge provider knows when it is relevant to use a particular relation type. This limits the possibilities of checking or guiding the use of the advocated properties. Furthermore, it may also be useful that the knowledge provider represents when the advocated properties do not or cannot occur. E.g., representing disjointWith or complementOf relations between classes to express that subClassOf or equivalentClass relations cannot occur between these classes has many advantages that Section 2 illustrates. Using all these relations is especially useful between top-level classes since many inference engines can exploit the combination of these relations, e.g. via inheritance mechanisms. Finally, checking that particular relations are represented as either existing or forbidden can be done automatically. Thus, as an answer to the research questions “how to check that particular relations are systematically used not simply whenever this is possible but whenever this is relevant for the knowledge providers?” and “how to extend best practices, ontology patterns or methodologies that advocate the systematic use of particular relations, and make the application of these methods easier to check?”, this article proposes the following generic “ontology design rule” (ODR). A first general formulation of this generic ODR is: in a given KB, for each pair of objects of a given set chosen by the user of this ODR, there should be either statements connecting these objects by relations of particular given types or statements negating such relations, i.e. expressing that these relations do not or cannot occur in the given KB. A negated relation can be expressed directly via a negated statement or indirectly, e.g. via a disjointWith relation that forbids the existence of such a relation.

In its more precise version given further below, we call this ODR the “comparability via particular relation types” ODR, or simply the “comparability ODR”. We call it an ODR, not a pattern nor a KB evaluation criteria/measure because this is something in between. As above explained, it is always automatically checkable. It is also reusable for evaluating a KB for example by applying it to all its objects and dividing the number of successul cases by the number of objects. An example of KB evaluation criteria that can be generalized by a reuse of this ODR is the “schema-based coverage” criteria of [PlanetData D2.1: Mendes et al., 2012; pp. 26-29] which measures the percentage of objects using the relations that they should or could use according to schemas or relation signatures. Examples of methodologies, best practices or ontology patterns that can be generalized via the use of this ODR are those advocating the use of tree structures or of genus & differentia when organizing or defining types. (Section 2.3 and Section 3.4 detail this last point.)

Before formulating this ODR more precisely, it seems interesting to further detail its application to the OWL properties subClassOf or equivalentClass – along with the properties that negate or exclude them, e.g. disjointWith and complementOf. Using all these properties whenever relevant, as this applied ODR encourages, will for example lead the authors of a KB to organize the direct subtypes of each class – or at least each top-level class – into “complete sets of exclusive subtypes” (each of such set being a subtype partition, or in other words, a disjoint union of subtypes equivalent to the subtyped class), and/or “incomplete sets of exclusive subtypes”, and/or “(in-)complete sets of subtypes that are not exclusive but still different and not relatable by subClassOf relations”, etc. The more systematic the organization, the more a test of whether a class is subClassOf_or_equivalent (i.e. is subClassOf, equivalentClass or sameAs) another class will lead to a true/false result, not an “unknown” result. In other words, the more such a test will lead to a true/false result without the use of “negation as failure” (e.g. via the “closed-world assumption” with which any statement not represented in the KB is considered to be false) or the use of the “unique name assumption” (with which different identifiers are supposed to refer to different things). Since most inferences are based on such subClassOf_or_equivalent tests, the more systematic the organization, the more inferences will be possible without having to use negation as failure. This is interesting since using negation as failure implies making an assumption about the content of a whole KB whereas adding subClassOf or disjointWith relations is adding information to a KB.

The next two sections illustrate some of the numerous advantages of the more systematic organization resulting from the application of this ODR: for inferencing, for querying, for avoiding what could have otherwise been implicit redundancies or inconsistencies and, more generally, for improving the completeness, consistency and precision of a KB. These advantages are not restricted to subClassOf_or_equivalent relations. They apply to all specializationOf_or_equivalent relations, i.e. specializationOf relations (which, as defined below, generalize subClassOf relations), equivalence relations or sameAs relations. As we shall see, these advantages also apply – although to a lesser extent – to other transitive relations such as partOf_or_equivalent (i.e. isSubPartOf, equivalentClass or sameAs). We call “specialization of an object” any other object that represents or refers to more information on the same referred object. This covers all subtype relations as well as specialization relations between statements or other individuals, e.g. between the statements “some cars are red” and “John's car is dark red” or between the cities “Seattle” and “Seattle-between-2010-and-2015”. We call “statement” a relation or a set of connected relations.

We adopt the following “comparability” related definitions. Two objects are “comparable via a relation of a particular type” (or, more concisely, “comparable via a particular relation type”) if they are either identical (sameAs), equivalent (by intension, not extension) or connected by a relation of this type. Two objects are “uncomparable via a relation of a particular relation type” (or, more concisely, “uncomparable via a particular property”) if they are different and if some statement in the KB forbids a relation of this type between these two objects. Given these definitions, the comparability ODR can be defined as testing whether “each object (in the KB or a part of the KB selected by the user) is defined as either comparable or uncomparable to each other object (or at least some object, if the user prefers) via each of the tested relation types”. In a nutshell, the comparability ODR checks that between particular selected objects there is “either a comparability or an uncomparability via particular relations”. This ODR does not rely on particular kinds of KBs or inference engines but some engines may be more relevant to use for checking a KB if they infer more relations.

Stronger versions of this ODR can be used. E.g., for a more organized KB, some users may wish to have “either comparability or strong uncomparability” via relations of particular types between any two objects. Two objects are “strongly uncomparable via a relation of a particular type” if they are different and some statement in the KB forbids the existence of a relation of this type between the two objects as well as between their specializations. E.g., disjoint classes are strongly uncomparable since they cannot have shared instances or shared subclasses (except for owl:Nothing).

A more general version of this ODR could also be defined by using “equivalence by intension or extension” instead of simply “equivalence by intension”. In this article, “equivalence” means “equivalence by intension” and “specialization” is also “specialization by intension”. This article also assumes that equivalence or specialization relations (or their negations) which are automatically detectable by the used inference engine are made explicit by KB authors and thus can be exploited via SPARQL queries. In a description logics based KB, this can be achieved by performing type classification and individual categorization before checking the ODR.

Figure 1 shows a simple graphic user interface for selecting various options or variants for this ODR. With the shown selected items (cf. the blue items in Figure 1 and the words in italics in the rest of this sentence), this interface generates a function call or query to check that each object (in the default KB) which is instance of owl:Thing is either comparable or uncomparable via specialization relations and part relations to each other object in the default KB. Figure 1 shows a function call. After the conversion of its last three parameters into formal types, a similar call can be made to a generic function. This function is generic in the sense that it exploits formal types and that most of what its specification – hence most of what it achieves – comes from these formal types. In other words, by specializing the expected parameter types, the person using this function can specify or adapt most of its behaviour. Section 5 (Appendix) defines this generic function and the relations types it exploits, and thus shows how its parameters can be taken into account. To achieve this, these definitions are written in a higher-order logic based language. An OWL-focused variant of this interface, i.e. a less generic interface but one adapted to what can be represented with OWL, has also been created.

With the comparability_or_uncomparability option (hence with the comparability ODR), equivalence or sameAs relations are always exploited in addition to the specified relations. When it is not relevant to also exploit equivalence or sameAs relations, the connectability_or_un-connectability option shown in Figure 1 should be selected.

The next two sections show the interests of this ODR for, respectively, i) subtype relations, and ii) other relations, e.g. partOf relations and specializationOf relations with genus & differentia. When relevant, these sections present type definitions in OWL, as well as SPARQL queries or update operations and SHACL constraints, to illustrate how this ODR can be implemented with these languages. Section 4 provides more comparisons with other works and concludes.

Figure 1. A simple interface for object comparability evaluation

2. Comparability of Types Via Subtype Relations

2.1. Representation Via OWL

In this document, OWL refers to OWL-2 (OWL-2 DL or OWL-2 Full) [Hitzler et al., 2012] and OWL entities are prefixed by “owl:”. All the types that we propose in this article are declared or defined in http://www.webkb.org/kb/it/o_KR/d_odr_content/sub/ and the “sub” namespace is here used to abbreviate this URL. Unless otherwise specified, the syntax used for defining these types is Turtle, and the syntax used for defining queries or update operations is SPARQL. SPARQL uses Turtle for representing relations. For clarity purposes, identifiers for relation types have a lowercase initial while other identifiers have an uppercase initial. Relation types identifiers that do not include “Of” could have been – but are not – prefixed by “has_”, e.g. “sub:has_part” could have also been used instead of “sub:part”.

To illustrate the interest of representing exclusion relations between classes whenever possible – and, more generally, to give a first concrete example of the interest of making types “uncomparable via subClassOf relations” whenever possible – here is an example in two parts. The first part is composed of the following RDF+OWL/Turtle statements. They do not represent any exclusion relation. They represent a few relations from WordNet 1.3 (not the current one, WordNet 3.1). According to these relations, Waterloo is both a battle and a town, any battle is a (military) action, any town is a district, and any district is a location.

wn:Waterloo rdf:type wn:Battle, wn:Town. wn:Battle rdfs:subClassOf wn:Military_action. wn:Military_action rdfs:subClassOf wn:Action. wn:Town rdfs:subClassOf wn:District. wn:District rdfs:subClassOf wn:Location.

Now, as a second part of the example, a disjointWith relation is added between two top-level classes: the one for actions and the one for locations. Such an exclusion relation between actions and locations has not been made explicit in WordNet but is at least compatible with the informal definitions associated to categories in WordNet. Given all these relations, an OWL inference engine (that handles disjointWith relations) detects that the categorization of Waterloo as both a battle and a town is inconsistent. As illustrated in Section 2.3, many other possible problems in WordNet 1.3 were similarly detected. Most of them do not exist anymore in the current WordNet.

wn:Action owl:disjointWith wn:Location. #-> exclusion relations between actions and locations

OWL DL is sufficient for representing statements implying that particular classes are “comparable via subClassOf (relations)” or “strongly uncomparable via subClassOf”. For this second case, which amounts to state that two classes are disjoint, the properties owl:AllDisjointClasses, owl:complementOf, owl:disjointWith and owl:disjointUnionOf can be used. OWL Full [Hitzler et al., 2012] is necessary for setting owl:differentFrom relations between classes, and hence, as shown below, for defining the property sub:different_and_not_subClassOf as a subproperty of owl:differentFrom. In turn, this property is necessary for representing statements implying that particular classes are weakly uncomparable, i.e. uncomparable but not strongly uncomparable (hence not disjointWith nor complementOf). OWL Full is also necessary for defining the properties sub:different_and_not_equivalentClass and sub:proper-subClassOf (alias, sub:subClassOf_and_not-equivalentClass). With all the above cited types, it is possible for KB authors to express any relationship of “comparability or uncomparability via subClassOf”.

OWL inference engines generally cannot exploit OWL Full and hence do not enforce nor exploit the semantics of definitions requiring OWL Full. When inference engines do not accept OWL Full definitions, the above cited “sub:” properties have to be solely declared (as being properties) instead of being defined via relations (hence by a logic formula). However, when inference engines do not accept or do not exploit OWL Full definitions, the loss of inferencing possibilities due to the non-exploitation of the above cited “sub:” properties is often small. When the goal is simply to detect whether the comparability ODR is followed, if the SPARQL query proposed in the next subsection is used to achieve that goal, it does not matter whether the above cited “sub:” properties are declared or defined.

Making every pair of classes in a KB comparable or uncomparable via subClassOf is cumbersome without the use of properties that create (in-)complete sets of (exclusive) subclasses. We propose such properties, e.g. sub:complete_set_of_uncomparable-subClasses, sub:incomplete_set_of_uncomparable-subClasses and sub:proper-superClassOf_uncomparable_with_its_siblings. Such complex properties cannot be defined in OWL. However, as illustrated below, SPARQL update operations can be written to replace the use of these complex properties by the use of simpler properties that OWL inference engines can exploit.

sub:proper-subClassOf rdfs:subPropertyOf rdfs:subClassOf; owl:propertyDisjointWith owl:equivalentClass . #a "proper subclass" is a "strict subclass" (a direct or indirect one) sub:proper-subPropertyOf rdfs:subPropertyOf rdfs:subPropertyOf; owl:propertyDisjointWith owl:equivalentProperty . sub:different_and_not_subClassOf rdfs:subPropertyOf owl:differentFrom; owl:propertyDisjointWith rdfs:subClassOf . sub:different_and_not_equivalentClass rdfs:subPropertyOf owl:differentFrom; owl:propertyDisjointWith owl:equivalentClass . sub:proper-superPropertyOf owl:inverseOf sub:proper-subPropertyOf . sub:proper-superClassOf owl:inverseOf sub:proper-subClassOf . sub:proper-superClassOf_uncomparable_with_its_siblings #partial definition since OWL does not rdfs:subPropertyOf sub:proper-superClassOf . # support the definition of the # "uncomparable_with_its_siblings" part #Example of a SPARQL update operation to replace the use of # sub:proper-superClassOf_uncomparable_with_its_siblings relations by simpler relations: DELETE { ?c sub:proper-superClassOf_uncomparable_with_its_siblings ?sc1, ?sc2 } INSERT{ ?c sub:proper-superClassOf ?sc1, ?sc2 . ?sc1 sub:different_and_not_subClassOf ?sc2 . ?sc2 sub:different_and_not_subClassOf ?sc1 } WHERE { ?c sub:proper-superClassOf_uncomparable_with_its_siblings ?sc1, ?sc2 . FILTER (?sc1 != ?sc2) }

Similarly, to state that particular properties are (strongly or at least weakly) “uncomparable via rdfs:subPropertyOf relations”, OWL DL is sufficient. For strong uncomparability, owl:propertyDisjointWith relations can be used. Defining that particular properties are only weakly uncomparable, i.e. uncomparable but not strongly uncomparable, is possible in OWL Full, exactly as for subClassOf relations: to define these properties, it is sufficient to replace every occurence of “class” by “property” in the above code. As for classes too, if these “sub:” properties are only declared instead of being defined, the loss of inferencing possibilities is small.

2.2. Checking Via SPARQL

Using SPARQL (1.1) [Harris & Seaborne, 2013] to check the “comparability of classes via subClassOf relations” means finding each class that does not follow this ODR, i.e. that each class that is neither comparable nor uncomparable via subClassOf relations to each/some other class in selected KBs (“each/some” depending on what the user wishes to test).

Below is a SPARQL query for the “each other class” choice, followed by a SPARQL query for the “some other class” choice. In any case, if instead of the “comparability_or_uncomparability” option (the default option selected in Figure 1), the user prefers the “comparability_or_strong-uncomparability” option, the two lines about sub:different_and_not_subClassOf relations should be removed. For the “connectability_or_un-connectability” option, the line about owl:equivalentClass and owl:sameAs relations should instead be removed.

SELECT distinct ?c1 ?c2 WHERE #query for the "each other class" choice { ?c1 a owl:Class. ?c2 a owl:Class. #for each pair of classes ?c1 and ?c2 #skip comparable objects (here, classes ?c2 comparable to ?c1): FILTER NOT EXISTS{ ?c1 rdfs:subClassOf ?c2 } FILTER NOT EXISTS{ ?c2 rdfs:subClassOf ?c1 } FILTER NOT EXISTS{ ?c1 owl:equivalentClass ?c2 } FILTER NOT EXISTS{ ?c1 owl:sameAs ?c2 } #skip strongly uncomparable objects: FILTER NOT EXISTS{ ?c1 owl:complementOf ?c2 } FILTER NOT EXISTS{ ?c1 owl:disjointWith ?c2 } FILTER NOT EXISTS{ [] a owl:AllDisjointClasses; owl:members/rdf:rest*/rdf:first ?c1,?c2 } FILTER NOT EXISTS{ [] owl:disjointUnionOf/rdf:rest*/rdf:first ?c1,?c2 } #skip remaining uncomparable objects that are only weakly uncomparable: FILTER NOT EXISTS{ ?c1 owl:differentFrom ?c2 } } #no need to use sub:different_and_not_subClassOf here since, # at this point, subClassOf relations have already been filtered out SELECT distinct ?c1 WHERE #query for the "some other class" choice { ?c1 a owl:Class. #for each class ?c1 #skip comparable objects FILTER NOT EXISTS { ?c1 owl:sameAs|rdfs:subClassOf|owl:equivalentClass ?c2 FILTER ((?c1!=?c2) && (?c2!=owl:Nothing)) } #skip strongly uncomparable objects FILTER NOT EXISTS { ?c1 owl:complementOf|owl:disjointWith ?c2 FILTER ((?c1!=?c2) && (?c2!=owl:Nothing)) } FILTER NOT EXISTS{ [] a owl:AllDisjointClasses; owl:members/rdf:rest*/rdf:first ?c1,?c2 } FILTER NOT EXISTS{ [] owl:disjointUnionOf/rdf:rest*/rdf:first ?c1,?c2 } #skip remaining uncomparable objects that are only weakly uncomparable: FILTER NOT EXISTS{ ?c1 owl:differentFrom ?c2 } }

This query can be reused to create a constraint in SHACL (Shapes Constraint Language, a language ontology – such as OWL – proposed by the W3C to enable the definition of constraints in RDF) [Knublauch & Kontokostas, 2017]:

sub:Shape_for_the_comparability_of_each_class_to_each_other_class_via_subClass_relations a sh:NodeShape; sh:targetClass owl:Class; sh:sparql [ sh:message "This class '?c1' is neither comparable nor uncomparable to the class '?c2' " ; sh:prefixes sub: ; sh:select """ SELECT distinct $this ?value WHERE { #the body of the above SPARQL query should be copied here # with (every occurrence of) "?c1" replaced by "$this" and "?c2" replaced by "?value" } """ ; ].

For the “some other class” choice, it is possible to use SHACL Core, instead of SHACL-SPARQL (i.e., it is possible not to use an embedded SPARQL query):

sub:Shape_for_the_comparability_of_each_class_to_some_other_class_via_subClass_relations a sh:NodeShape; sh:targetClass owl:Class; sh:or (#comparable objects: [sh:path rdfs:subClassOf; sh:minCount 1; sh:class owl:Class] [sh:path sub:superClassOf; sh:minCount 1; sh:class owl:Class] [sh:path owl:equivalentClass; sh:minCount 1; sh:class owl:Class] [sh:path owl:sameAs; sh:minCount 1; sh:class owl:Class] #strongly uncomparable objects: [sh:path owl:complementOf; sh:minCount 1; sh:class owl:Class] [sh:path owl:disjointWith; sh:minCount 1; sh:class owl:Class] [sh:path ( [sh:inversePath rdf:first] [sh:zeroOrMorePath rdf:rest] owl:members ); sh:minCount 1; sh:class owl:AllDisjointClasses ] [sh:path ( [sh:inversePath rdf:first] [sh:zeroOrMorePath rdf:rest] owl:members ); sh:minCount 1; sh:class owl:owl:disjointUnionOf ] #only weakly uncomparable: [sh:path sub:different_and_not_subClassOf; sh:minCount 1; sh:class owl:Class] [sh:path [sh:inversePath sub:different_and_not_subClassOf]; sh:minCount 1; sh:class owl:Class] ).

Checking the “comparability of properties via subPropertyOf relations” is similar to checking the “comparability of classes via subClassOf relations”. The above presented SPARQL query can easily be adapted. The first adaptation to make is to replace every occurence of “class” by “property” and to replace “disjointWith” by “propertyDisjointWith”. The second adaptation to make is to remove the lines about “complementOf”, “AllDisjointClasses” and “disjointUnionOf” since in OWL these types do not apply to properties and have no counterpart for properties.

Dealing with several datasets. A KB may reuse objects defined in other KBs; object identifiers may be URIs which refer to KBs where more definitions on these objects can be found. We abbreviate this by saying that these other KBs or definitions are reachable from the original KB. Similarly, from this other KB, yet other KB can be reached. One feature proposed in Figure 1 is to check all objects “in the KB and those reachable from the KB”. Since comparability checking supports the detection of particular inconsistencies and redundancies (cf. next subsection and next section), the above cited feature leads to the checking that a KB does not have particular inconsistencies or redundancies with the KBs reachable from it. This feature does not imply fully checking these other KBs. The above presented SPARQL query does not support this feature since it checks classes in the dataset of a single SPARQL endpoint. Implementing this feature via SPARQL while still benefiting from OWL inferences unfortunately requires the SPARQL engine and the exploited OWL inference engine to work on a merge of all datasets reachable from the originally queried dataset. For small datasets, one way to achieve this could be to perform such a merge beforehand via SPARQL insert operations. However, when it is not problematic to give up OWL inferences based on knowledge from other datasets, an alternative is to use a SPARQL query where i) “SPARQL services” are used for accessing objects in other datasets, and ii) transitive properties such as rdfs:subClassOf are replaced by property path expressions such as “rdfs:subClassOf+”.

2.3. Advantages For Reducing Implicit Redundancies, Detecting Inconsistencies and Increasing Knowledge Querying Possibilities

Within or across KBs, hierarchies of types (classes or properties) may be at least partially redundant, i.e. they could be at least partially derived from one another if particular type definitions or transformation rules were given. Implicitly redundant type hierarchies, i.e. non-automatically detectable redundancies between type hierarchies, are reduced and easier to merge (manually or automatically) when types are related by subtypeOf_or_equivalent relations, e.g. subClassOf, subPropertyOf, equivalentClass or equivalentProperty relations. Using such relations is also a cheap and efficient way of specifying the semantics of types.

Relating types by not_subtypeOf-or-equivalent relations – e.g. disjointWith or complementOf relations – permits the detection or prevention of incorrect uses of such relations and of instanceOf relations. These incorrect uses are generally due to someone not knowing some particular semantics of a type, because this someone forgot this semantics or because this semantics was never made explicit. The two-point list below gives some examples extracted from [Martin, 2003]. In this article, the author – who is also the first author of the present article – reports on the way he converted the noun related part of WordNet 1.3 into an ontology. Unlike for other such conversions, the goal was to avoid modifying the meanings the conceptual categories of WordNet as specified by their associated informal definitions and informal terms. The author reports that, after adding disjointWith relations between top-level conceptual categories which according to their informal definitions seemed exclusive, his tool automatically detected 230 violations of these exclusions by lower-level categories. In the case of WordNet, what these violations mean is debatable since it is not an ontology. However, like all such violations, they can at least be seen as heuristics for bringing more precision and structure when building a KB. The authors of WordNet 1.3 were sent the list of the 230 detected possible problems. Due to this sending or not, most of these possible problems do not occur anymore in the current WordNet (3.1).

Many of the 230 possible problems were detected via the added exclusion relations between the top-level category for actions and other top-level categories which seemed exclusive with it, based on their names, their informal definitions and those of their specializations. Via the expression “informal definition” we refer to the description in natural language that each WordNet category has. Via the expression “categorized as” we refer to the generalization relations that an object has in WordNet. The above mentioned added exclusion relations led to the discovery of categories – e.g. those for some of the meanings of the words “epilogue” and “interpretation” – which were i) categorized and informally defined as action results/attributes/descriptions, ii) seemingly exclusive with actions (given how they were informally defined and given they were not also informally defined as actions), and iii) (rather surprisingly) also categorized as actions. Given these last three points, [Martin, 2003] removed the “categorization as action” of these action result/attribute/description categories. Based on the content of WordNet 3.1, it appears that the authors of WordNet then also made this removal.
Other causes for the 230 violations detected via the added exclusion relations between top-level categories came from the fact that WordNet uses generalization relations between categories instead of other relations. E.g., instead of location/place relations: in WordNet 1.3, many categories informally defined as battles were classified as both battles and cities/regions (this is no more the case in WordNet 3.1). E.g., instead of member relations: in WordNet, the classification of species is often intertwined with the classification of genus of species.

Several research works in knowledge acquisition, model-driven engineering or ontology engineering, e.g. [Marino, Rechenmann & Uvietta, 1990] [Bachimont, Isaac & Troncy, 2002] [Dromey, 2006] [Rector et al., 2012], have advocated the use of tree structures when designing a subtype hierarchy, hence the use of i) single inheritance only, and ii) multiple tree structures, e.g. one per view or viewpoint. They argue that each object of the KB has a unique place in such trees and thus that such trees can be used as decision trees or ways to avoid redundancies (in the same sense as in the previous paragraphs), normalize KBs and ease KB searching/handling. This is true but the same advantages can be obtained by creating subtypes solely sets of disjoint (direct) subtypes. Indeed, to keep these advantages, it is sufficient (and necessary) that whenever two types are disjoint, this disjointness is specified. With tree structures, there are no explicit disjointWith relations but the disjointness is still (implicitly) specified. Compared to the use of multiple tree structures, the use of disjoint subtypes and multiple inheritance has the advantages of i) not requiring a special inference engine to handle “tree structures with bridges between them” (e.g. those of [Marino, Rechenmann & Uvietta, 1990] [Djakhdjakha, Hemam & Boufaïda, 2014]) instead of a classic ontology, and ii) generally requiring less work for knowledge providers than creating and managing many tree structures with bridges between them. Furthermore, when subtype partitions can be used, the completeness of these sets supports additional inferences for checking or reasoning purposes. The various above rationale do not imply that views or tree structures are not interesting to use, they only imply that sets of disjoint (direct) subtypes are good alternatives when they can be used instead.

Methods or patterns to fix (particular kinds of) detected conflicts are not within the scope of this article. Such methods are for example studied in the belief set/base revision/contraction as well as in KB debugging. [Corman, Aussenac-Gilles & Vieu, 2015] proposes an adaptation of base revision/debugging for OWL-like KBs. The authors of [Djedidi & Aufaure, 2009] have created ontology design patterns that propose systematic ways to resolve some particular kinds of inconsistencies, especially the violation of exclusion relations.

As illustrated in Section 2.1, the OWL properties usable to express that some types are “comparable or uncomparable via subtypeOf” – e.g. subClassOf, subPropertyOf, equivalentClass, equivalentProperty, disjointWith and complementOf relations – can be combined to define or declare properties for creating (un-)complete sets of (non-)disjoint subtypes or, more generally, for creating more precise relations which better support the detection of inconsistencies or redundancies. E.g., sub:proper-subclassOf can be defined and used to prevent unintended subClassOf cycles.

Advantages For Knowledge Querying. Alone, subtypeOf_or_equivalent relations only support the search for specializations (or generalizations) of a query statement, i.e. the search for objects comparable (via subtype relations) to the query parameter. The search for objects “not uncomparable via specialization” to the query parameter – i.e. objects that are or could be specializations or generalizations of this parameter – is more general and sometimes useful.

Assume that a KB user is searching for lodging descriptions in a KB where sports halls are not categorized as lodgings but are not exclusive with them either, based on the fact that they are not regular lodgings but that they can be used as such when natural disasters occurs. Also assume that the user intuitively shares such views on lodgings and sports halls. Then, querying the KB for (specializations of) “lodgings” will not retrieve sports halls. On the other hand, querying for objects not uncomparable to “lodgings” will return sports halls; furthermore, if lodgings have been defined as covered areas, such a query will not return uncovered areas such as open stadiums. Thus, assuming that the term "lodging" in this previous querying has been used because the author of the query was looking for covered areas only, this person will only get potentially relevant results.
More generally, when a person does not know which exact type to use in a query or does not know what kind of query to use – e.g. a query for the specializations or the generalizations of the query parameter – a query for objects “not uncomparable” to the query parameter may well collect all and only the objects the person is interested in, if in the KB all or most types are either comparable or uncomparable via subtype relations.
As another example, querying a KB about hotels for (instances of) “a room less than 100$ per night” will not retrieve rooms for which the price has not been registered and hence that may be less than 100$ per night. On the other hand, a query for objects “not uncomparable” to “a room less than 100$ per night” will also retrieve rooms with unregistered prices.

The more systematically the types of a KB are comparable via subtype relations, the more the statements of the KB – as well as other if they have a definition – will be retrievable via comparability or uncomparability based queries.

3. Other Interesting Cases of Comparability

3.1. Comparability of Statements Or Individuals Via Specialization Relations

The previous section was about the comparability of types via subtype relations. This subsection generalizes the approach to individuals, and hence studies the comparability of individuals via specialization relations, even though OWL does not provide special properties to support this generalization. These special properties cannot be defined in OWL but can still be declared, and many graph-based inference engines can exploit them. This generalization is particularly exploited in our knowledge server WebKB-2 [Martin, 2011], as illustrated below.

Everything that is not a type is an individual. Thus, ontologically speaking, a statement (e.g., an RDF triple, a set of connected relations) is an individual. However, in RDF and OWL, given particular expressiveness restrictions, the term “individuals” does not also refer to “statements”. Thus, we keep that distinction and use the expression “statements or individuals”. As noted in the introduction, statements can be connected by specialization relations (as for the statements “some cars are red” and “John's car is dark red”) and individuals too (as for the individuals “Seattle” and “Seattle-between-2010-and-2015”). Whenever there exists a specialization between statements describing individuals, there is a specialization between these individuals.

More generally, like types, statements or individuals can be connected by comparability relations such as specialization, equivalence, identity, difference and exclusion (“exclusion” is a synonym of “disjointness”). These relations can be manually set. For statements, these relations can also be inferred. For individuals, these relations can also be inferred if the individuals have definitions and if all the objects in these definitions can be compared via specialization relations. Between two existential conjunctive statements, a generalization relation is equivalent to a logical implication [Chein & Mugnier, 2008].

“Comparability via specialization” between statements or individuals has the same advantages as between types but these advantages are less well-known, probably for the next two points. One is that the comparability relations can often be automatically inferred. Another is that, nowadays, in most KBs, there are few opportunities to relate individuals by specialization relations. However, as illustrated by the next three paragraphs, there are cases where individuals are related by specialization relations.

One case to this second point is when individuals are used when types could or should rather be used, like when types of molecules are represented via individuals in chemistry ontologies.

A second case is when, for indexation purposes, i.e. for knowledge retrieval performance purposes, a specialization hierarchy on parts of the definitions of the individuals is automatically generated (e.g., via a method akin to Formal Concept Analysis): this hierarchy of generated individuals is generally large and very well organized.

A third case is when the KB has a shared KB edition protocol which, when a statement addition by one person would lead to an inconsistency or a partial/full redundancy with a statement from another person, ask the updater for more information in order to i) alert this author about the partial/full problem and ii) lead this author to relate the two statements by a special “comparability via specialization” relation which explains why the update is made, avoids the entering of any genuine inconsistency in the KB, and permits other persons to choose between the two statements if a choice is necessary. The knowledge server WebKB-2 supports such a shared KB edition protocol [Martin, 2011] and the main types of special “comparability via specialization” relations it provides are: corrective_specialization, corrective_generalization, corrective_reformulation, corrective_exclusion, corrective_alternative and statement_instantiation. The rationale for these names comes from the following fact illustrated by the next two examples below: when the statement (or “belief”) that the updater wants to add is fully/mostly “comparable by specialization” to an already stored statement – something that WebKB-2 mainly detects by graph matching – this means that i) the updater (at least partially) disagree with this stored statement and hence corrects it using one of the five above cited “corrective relations”, or ii) wants to enter a full/partial instantiation of it, e.g. an example of it.

For a first example, assume that one author has entered that (he believes that) “every bird flies”, or a more formal version of this belief. This one is erroneous since there are birds that do not fly. Also assume that another author wants to enter that (he believes that) “every healthy adult carinate bird can fly”, where “carinate” refers to “birds that can fly”. If WebKB-2 detects that these two beliefs are “comparable by specialization”, WebKB-2 will alert the second author and this one will be led to enter that (he believes that) ` “every healthy adult carinate bird can fly” is a corrective_specialization of “every bird flies” ´. This has several advantages, as summarized by the next two points.

First, even for an inference engine that does not handle beliefs or that does not handle contexts (i.e. meta-statements specifying when or for whom the statement is true), the whole phrase entered by the second author is not technically inconsistent with “every bird flies” since the whole phrase does not assert the statement “every healthy adult carinate bird can fly”. Thus, such beliefs can be still be accepted and can be exploited by most inference engines for supporting knowledge retrieval like purposes. E.g., the answers to queries such as “do birds fly?” (or `what are the specializations of “every bird flies” ´) can provide the corrective specialization/generalization hierarchy of the answers instead of a sequential list of answers, something which is not as easy to exploit mentally when there are many answers. Furthermore, if there are still too many answers, the user may want to see only the statements that are “un-corrected”, i.e. that do not have one of the five above cited corrective relations from them. To that end, the new query can for example be `give me the un-corrected statements about “every bird flies” that have been authored by persons having a degree in ornithology´. In other words, the approach enforce knowledge organization and hence support knowledge filtering on this organization. This approach supports “semantically structured debates”, i.e. the cooperative building of well-organized argumentation structures on a subject. Since this knowledge organization approach does not require the statement to be formal, it can be used in semantic wikis or argumentation tools, e.g. to avoid edit wars or discussion loops.
Second, when a choice between conflicting beliefs is required for some deeper inferencing, this choice can be made automatically, given the preference of each user. E.g., one user may specifies that, in case of conflicts, only the un-corrected statements should be exploited for inferences, or only the most specialized ones. When such filters are not sufficient to eliminate conflicts, the inference engines can alert the user.

As another example, if one author has entered that “there is a bird that flies” and another author wants to enter that “Tweety is a bird that flies”, and if WebKB-2 detects that these two statements are “comparable by specialization”, WebKB-2 will alert the second author and this one will be led to enter that ` “Tweety is a bird that flies” is a statement_instantiation of “there is a bird that flies” ´. For knowledge retrieval, synthesis or argumentation purposes, the advantages are the same as introduced above. For knowledge exploitation purposes, “Tweety is a bird that flies” is also automatically asserted separately since there is explicitly no belief conflict here.

3.2. Comparability Via “Definition Element” Relations

In this article, an object definition is a logic formula that all specializations of the object must satisfy. A full definition specifies necessary and sufficient conditions that the specializations must satisfy. In OWL, a full definition of a class is made by relating this class to a class expression via an owl:equivalentClass relation. Specifying only necessary conditions – e.g. using rdfs:subClassOf instead of owl:equivalentClass – means making only a partial definition. An “element of a definition” is any target domain object which is member of that definition, except for objects of the used language (e.g. quantifiers and logical operators). A “definition element” relation is one that connects the defined object to an element of the definition. E.g., if a Triangle is defined as a “Polygon that has as part 3 Edges and 3 Vertices”, Triangle has as definition elements the types Polygon, Edge, Vertex and part as well as the value 3. The property sub:definition_element – one of the types that we propose – is the type of all “definition element” relations that can occur with OWL-based definitions. We have fully defined sub:definition_element below, based on the various ways definitions can be made in OWL; one of its subtypes is sub:proper-subClassOf. This subsection generalizes Section 2 since a definition may specify other relations than subClassOf relations, as illustrated by the above definition of Triangle. A “definition-element exclusion” relation is one that connects an object O to another one that could not be used for defining O. This property can be defined based on the “definition element” relation type. E.g.:

sub:definition-element_exclusion #reminder: "has_" is implicit rdfs:subPropertyOf owl:differentFrom ; owl:propertyDisjointWith sub:definition_element ; owl:propertyDisjointWith [owl:inverseOf sub:definition_element] .

As explained in Section 2.3, ensuring that types in a KB are either comparable or uncomparable via subtype relations reduce implicit redundancies between type hierarchies. As illustrated by the paragraph titled “Example of implicit potential redundancies” below, this checking is not sufficient for finding every implicit potential redundancy resulting from a lack of definition, hence for finding every specialization hierarchy that could be derived from another one in the KB if particular definitions were given. However, this new goal can be achieved by generalizing the previous approach since this goal implies that for every pair of objects (in the KB or a selected KB subset), either one of these objects is defined using the other or none can be defined using the other. In other words, this goal means checking that for every pair of objects in the selected set, these two objects are either comparable or uncomparable via “definition element” relations. This is a generalization of the previous approach since subtypeOf relations are “definition element” relations. To express that two objects are strongly uncomparable in this way – and hence not potentially redundant – “definition-element exclusion” relations can be used. Here is a query that detects potential redundancies.

SELECT distinct ?subC1 ?subC2 ?x WHERE { ?subC1 sub:definition_element ?r1 ?c2; rdfs:subClassOf ?c1. ?subC2 sub:definition_element ?r2 ?c2; rdfs:subClassOf ?c1. FILTER( (?c1 != ?c2) && ((?r1 = ?r2)||(EXISTS{ ?r1 rdfs:subClassOf ?r2 }) ) #the next lines check that ?c2 is semantically a destination of ?r1 and ?r2 FILTER( (EXISTS {?r1 rdfs:subClassOf sub:definition_element}) ||(EXISTS {?subC1 ?pr1 [rdf:type owl:Restriction; owl:onProperty ?r1; owl:someValuesFrom ?c2]})) FILTER( (EXISTS {?r2 rdfs:subClassOf sub:definition_element}) ||(EXISTS {?subC2 ?pr2 [rdf:type owl:Restriction; owl:onProperty ?r2; owl:someValuesFrom ?c2]})) }

The above cited new goal implies that, from every object, every other object in the KB is made comparable or uncomparable via “definition element” relations. This is an enormous job for a KB author and very few current KBs would satisfy this ODR. However, given the reasons and techniques given in the two points below, a KB contributor/evaluator may choose to assume that for avoiding a good enough amount of implicit potential redundancies between type hierarchies, it is sufficient to check that from every object, at least one other object in the KB is made comparable or uncomparable via “definition element” relations (thus, using the “some other object” option given in Figure 1, instead of the “every other object” option). Here are the underlying ideas:

When an object O has been given at least one definition (hence possibly more than one), or at least one full definition if the user of this ODR wants to be slightly more cautious, this user may choose to assume that, for his goals, the definitions of O can be exploited via SPARQL queries to find enough potential redundancies between type hierarchies in the KB or in the tested set of objects. For instance, imagine a KB where i) a class A is defined wrt. a class B, ii) A has a subclass A' that only differs from A by the fact its instances are defined to have one more attribute C, e.g., the color blue, and iii) B has a subclass B' that only differs from B by the fact its instances are defined to have the attribute C. Then, there is a type hierarchy potential redundancy in the KB – since A' could be generated from B' instead of being manually declared, if the user prefers not to have this potential redundancy – and it is detected via the above cited SPARQL queries.
From the definitions of an object O, it is possible – e.g., via SPARQL, when a KB is loading – to i) collect the list of the definition elements of O (the direct ones and the indirect ones since the direct ones may themselves be defined), and ii) automatically generate a “definition-element exclusion” relation between O and each object in the KB that is not in this list. Here is such a SPARQL query:
INSERT { ?o sub:definition-element_exclusion ?x } WHERE { ?o sub:definition_element+ ?o_def_elem . ?x rdf:type owl:Thing . FILTER (?x != ?o_def_elem) }
The user of this ODR may choose to assume that such temporarily generated relations are correct, hence that the objects they connect are not potentially redundant. This means that, if we take cases like those of the last above example except that now the class A is not defined wrt. the class B, the user assumes that A' is not potentially redundant with B'. E.g., if the class Tree is not defined wrt. the class Sea, similarities between their subtyping would not indicate potential redundancies. This assumption and its associated generation of “definition-element exclusion” relations save KB authors a lot of work or permit the evaluation of KBs. They may also avoid generating a large number of “definition-element exclusion” relations.

Example of implicit potential redundancies. It is often tempting to specialize particular types of processes or types of physical entities according to particular types of attributes, without explicitly declaring these types of attributes and organizing them by specialization relations. E.g., at first thought, it may sound reasonable to declare a process type Fair_process without relating it to an attribute type Fairness (or Fair) via a definition such as “any Fair_process has as attribute a Fairness”. However, Fair_process may then be specialized by types such as Process_fair_for_utilitarianism, Fair_process_for_prioritarianism, Fair_process_wrt_Pareto_efficiency, Fair_process_wrt_monotonicity_on_utility_profiles, Fair_bargaining, Fair_distribution, Fair_distribution_wrt_utilitarianism, Fair_distribution_for_prioritarianism, Fair_distribution_wrt_Pareto_efficiency, etc. It soon becomes apparent that this approach is not relevant since i) every process type can be specialized wrt. a particular attribute type or any combination of particular attribute types, and ii) similar specializations can also be made for function types (e.g. starting from Fair_function) and attribute types (starting from Fairness). Even if the KB is not a large KB shared by many persons, many beginnings of such parallel categorizations may happen, without them being related via definitions. Indeed, the above example with process types and attribute relations to attributes types can be replicated with any type and any relation type, e.g. with process types and agent/object/instrument/time relation types or with physical entity types and mass/color/age/place relation types.

Ensuring that objects are either comparable or uncomparable via “definition element” relations is a way to prevent such (beginnings of) implicitly potentially redundant type hierarchies: all/most/many of them depending on the chosen option and assumption. As with disjointWith relations, the most useful “definition-element exclusion” relations are those between some top-level types, e.g. between the types “Attribute” and “Process”. To normalize definitions in the KB, e.g. to ease logical inferencing, a KB owner may also use “definition-element exclusion” relations to forbid particular kinds of definitions, e.g. forbid processes to be defined wrt. attributes or physical entities. Each definition for a type T sets “definition element” relations to other types, and these relations also apply to the subtypes of T. A special “definition element” relation type may also be used to reach not just the above cited other types but their subtypes too. Otherwise, most types would need to be defined if few “definition-element exclusion” relations are set between top-level types.

The property sub:definition_element is the type of all “definition element” relations that can occur with OWL-based definitions. Below, sub:definition_element is defined by its subtypes and their definitions. Each subtype corresponds to a way a definition – or a portion of a definition – can made in OWL. Complex definitions are combinations of such portions. In other words, all these subtypes may be seen as a kind of meta-ontology of OWL, with each subtype corresponding to a relation in the chain of relations that can occur between a type and a “definition element”. The type sub:proper-subClassOf is specified as a subtype of sub:definition_element but the types rdfs:subClassOf and owl:equivalentClass are not specified as a subtype of sub:definition_element because this would allow a class to be a sub:definition_element of itself. However, definitions via rdfs:subClassOf and owl:equivalentClass can still taken into account: see the subtypes defined below as chains (cf. owl:propertyChainAxiom) of rdfs:subClassOf property and another property. Only rdfs:subClassOf needs to be used for specifying such chains, not owl:equivalentClass, because rdfs:subClassOf is its supertype. More precisely, rdfs:subClassOf is a disjoint union of owl:equivalentClass and sub:proper-subClassOf.

sub:definition_element rdfs:subPropertyOf owl:differentFrom ; a owl:TransitiveProperty . sub:class-definition_element rdfs:subPropertyOf sub:definition_element . sub:list-definition_element rdfs:subPropertyOf sub:definition_element . sub:datatype-definition_element rdfs:subPropertyOf sub:definition_element . sub:property-definition_element rdfs:subPropertyOf sub:definition_element . sub:instance-definition_element rdfs:subPropertyOf sub:definition_element . sub:proper-subClassOf sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_union_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_intersection_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_onProperty_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_onClass_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_someValuesFrom_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_allValuesFrom_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_hasValue_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_hasSelf_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_oneOf_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_cardinality_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_qCardinality_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_maxCard_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_minCard_part sub:proper-subPropertyOf sub:class-definition_element . sub:class-def_union_part owl:propertyChainAxiom ( rdfs:subClassOf owl:unionOf ). sub:class-def_intersection_part owl:propertyChainAxiom ( rdfs:subClassOf owl:intersectionOf ). sub:class-def_onProperty_part owl:propertyChainAxiom ( rdfs:subClassOf owl:onProperty ). sub:class-def_onClass_part owl:propertyChainAxiom ( rdfs:subClassOf owl:onClass ). sub:class-def_someValuesFrom_part owl:propertyChainAxiom ( rdfs:subClassOf owl:someValuesFrom ). sub:class-def_allValuesFrom_part owl:propertyChainAxiom ( rdfs:subClassOf owl:allValuesFrom ). sub:class-def_hasValue_part owl:propertyChainAxiom ( rdfs:subClassOf owl:hasValue ). sub:class-def_hasSelf_part owl:propertyChainAxiom ( rdfs:subClassOf owl:hasSelf ). sub:class-def_oneOf_part owl:propertyChainAxiom ( rdfs:subClassOf owl:oneOf ). sub:class-def_cardinality_part owl:propertyChainAxiom ( rdfs:subClassOf owl:cardinality ). sub:class-def_qCardinality_part owl:propertyChainAxiom (rdfs:subClassOf owl:qualifiedCardinality). sub:class-def_maxCard_part owl:propertyChainAxiom (rdfs:subClassOf owl:maxQualifiedCardinality). sub:class-def_minCard_part owl:propertyChainAxiom (rdfs:subClassOf owl:minQualifiedCardinality). sub:list_member1 sub:proper-subPropertyOf sub:list-definition_element . sub:list_member2 sub:proper-subPropertyOf sub:list-definition_element . sub:list_member3 sub:proper-subPropertyOf sub:list-definition_element .#and so on, as many as needed sub:list_member1 owl:propertyChainAxiom ( owl:propertyChainAxiom rdf:first ). sub:list_member2 owl:propertyChainAxiom ( owl:propertyChainAxiom rdf:rest rdf:first ). sub:list_member3 owl:propertyChainAxiom ( owl:propertyChainAxiom rdf:rest rdf:rest rdf:first ). sub:datatype-def_minRestr_part sub:proper-subPropertyOf sub:datatype-definition_element . sub:datatype-def_maxRestr_part sub:proper-subPropertyOf sub:datatype-definition_element . sub:datatype-def_minRestr_part owl:propertyChainAxiom ( owl:withRestrictions xsd:minInclusive ). sub:datatype-def_maxRestr_part owl:propertyChainAxiom ( owl:withRestrictions xsd:maxInclusive ). sub:chain_member1 sub:proper-subPropertyOf sub:property-definition_element . sub:chain_member2 sub:proper-subPropertyOf sub:property-definition_element . sub:chain_member3 sub:proper-subPropertyOf sub:property-definition_element . #and so on, as many as needed sub:chain_member1 owl:propertyChainAxiom ( owl:propertyChainAxiom rdf:first ). sub:chain_member2 owl:propertyChainAxiom ( owl:propertyChainAxiom rdf:rest rdf:first ). sub:chain_member3 owl:propertyChainAxiom ( owl:propertyChainAxiom rdf:rest rdf:rest rdf:first ). sub:instance sub:proper-subPropertyOf sub:instance-definition_element; owl:inverseOf rdf:type. owl:hasKey sub:proper-subPropertyOf sub:instance-definition_element .

For checking the “comparability of types via sub:definition_element relations”, the SPARQL query given in Section 2.2 can be adapted. Here is this adapted query for the “each other class” choice, with the new parts in bold. The adaptation to make for the "some other object" choice has been indicated in Section 2.2.

SELECT distinct ?c1 ?c2 WHERE { ?c1 a owl:Class. ?c2 a owl:Class. FILTER ( ?c1 != ?c2 ) #skip comparable objects: FILTER NOT EXISTS { ?c1 rdfs:subClassOf ?c2 } FILTER NOT EXISTS { ?c2 rdfs:subClassOf ?c1 } FILTER NOT EXISTS { ?c1 owl:equivalentClass ?c2 } FILTER NOT EXISTS { ?c1 owl:sameAs ?c2 } FILTER NOT EXISTS { ?c1 sub:definition_element ?c2 } #skip strongly uncomparable objects (there is no weak uncomparability via sub:definition_element): FILTER NOT EXISTS { ?c1 sub:definition-element_exclusion ?c2 } }

3.3. Comparability Via Other Transitive Relations, Especially Part Relations

Ensuring that objects are either comparable or uncomparable via specialization relations has many advantages which were illustrated in Section 2.3 and Section 3.1. Similar advantages exist with all transitive relations, not just specialization relations, although to a lesser extent since less inferences – and hence less error detection – can be made with other transitive relations. There are no uncomparabilities via total-order relations. E.g., an object is not a predecessor-process of another object if it is its successor-process or if it is not a process.

Part properties – e.g. for spatial parts, temporal parts or sub-processes – are partial-order properties that are often exploited. Unlike subtype relations, they connect individuals. Nevertheless, for checking the “comparability of individuals via part relations (let us assume sub:part relations)”, the SPARQL query given in Section 2.2 can be adapted. Below is this adapted query for the “each other object” choice, followed by a SPARQL query for the “some other class” choice. Two objects that are “comparable via part relations” if one is fully part of the other (or if they are identical). They are “strongly uncomparable via part relations” if they do not share any part (and hence the respective parts of these two objects do not have shared parts either). Two objects that are “weakly uncomparable via part relations” share some parts but none is fully part of the other.

SELECT distinct ?i1 ?i2 WHERE { ?i1 a ?c1. FILTER NOT EXISTS { ?c1 rdfs:subClassOf owl:Class } #?i1 is an individual ?i2 a ?c2. FILTER NOT EXISTS { ?c2 rdfs:subClassOf owl:Class } #?i2 is an individual #the following is shared with the "some other class" choice: #skip comparable objects: FILTER NOT EXISTS {?i1 owl:sameAs|sub:part+|(^sub:part)+ ?i2} #skip strongly uncomparable objects: FILTER NOT EXISTS {?i1 sub:part_exclusion ?i2} #skip remaining uncomparable objects that are only weakly uncomparable: FILTER NOT EXISTS {?i1 owl:differentFrom ?i2} #same note as in Section 2.2 } #with: sub:part rdfs:subPropertyOf owl:differentFrom ; # a owl:TransitiveProperty . # sub:part_exclusion rdfs:subPropertyOf owl:differentFrom ; # owl:propertyDisjointWith sub:part . SELECT ?i1 WHERE #query for the "some other class" choice { ?i1 a ?c1. FILTER NOT EXISTS { ?c1 rdfs:subClassOf owl:Class } #?i1 is an individual FILTER EXISTS { ?i2 a ?c2. FILTER NOT EXISTS { ?c2 rdfs:subClassOf owl:Class } #?i2 is an individual FILTER (?c1!=?c2) ... #the part shared with the "each other class" choice } }

This query can be reused to create a constraint in SHACL, in the way ilustrated for the query in Section 2.2.

3.4. Comparability Via Transitive Relations Plus Minimal Differentia

When defining a type, a good practice is to specify i) its similarities and differences with each of its direct supertypes (e.g., as in the genus & differentia design pattern), and ii) its similarities and differences with each of its siblings for these supertypes. This is an often advocated best practice to improve the understandability of a type, as well as enabling more inferences. E.g., this is the “Differential Semantics” methodology of [Bachimont, Isaac & Troncy, 2002]. Several ODRs can be derived from this best practice, depending on how “difference” is defined. In this article, the term “minimal-differentia” refers to a difference of at least one (inferred or not) relation in the compared type definitions: one more relation, one less or one with a type or destination that is different (semantically, not just syntactically). Furthermore, to check that a class is different from each of its superclasses (i.e. to extend the genus & differentia method), an rdfs:subClassOf relation between the two classes does not count as “differing relation”.

For the “comparability relation type”, Figure 1 proposes the option “comparability-or-uncomparability_with-minimal-differentia”. In the modular or generic way used by Section 5 (Appendix) to formalize and implement comparability ODRs, this option is dealt with via the definition of the relation type sub:comparable-or-uncomparable_to_each_other_object_with-minimal-differentia_via-and-for. For supporting this option when checking “comparability via subClassOf relations” between any pair of classes in a KB, the code of the SPARQL query of Section 2.2 can be adapted by adding some lines before the filters testing whether the classes are comparable or uncomparable: below, see the FILTER block from the 3rd line to the “...”. This block checks that there is a “minimal-differentia” between the tested classes. The retrieval of automatically inferred relations relies on the use of a relevant entailment regime.

SELECT distinct ?c1 ?c2 WHERE { ?c1 a owl:Class. ?c2 a owl:Class. FILTER ( ?c1 != ?c2 ) FILTER (! #keep no(!) class satisfying the following conditions: ( (EXISTS { ?c1 ?p1 ?v1 . #?c1 has at least one property FILTER(?p1!=rdfs:subClassOf) # that is not rdfs:subClassOf FILTER #and ( NOT EXISTS { ?c2 ?p1 ?v2 } # ?p1 is not in ?c2 || EXISTS { ?c2 ?p1 ?v2 #or ?p1 ?v1 is not in ?c2 FILTER (?v1 != ?v2) } || EXISTS { ?c2 ?p2 ?v2 #or ?p2 ?v2 is not in ?c1 FILTER NOT EXISTS { ?c1 ?p2 ?v2 } } ) }) || ((NOT EXISTS #or { ?c1 ?p1 ?v1 # ?c1 has no property, except may be FILTER((?p1!=rdfs:subClassOf) # an rdfs:subClassOf property && (?v1 != ?c2))} # to ?c2 ) && EXISTS{?c2 ?p2 ?v2}) # and ?c2 has (other) properties )) ... #same filtering for (un-)comparable objects as in the query of Section 2.2 }

When relevant, this ODR can be generalized to use other transitive relations between named objects, e.g. partOf relations. In the same way that an rdfs:subClassOf relation between two compared classes does not count as “differing relation”, if this ODR is applied to any generalization relation, a generalization relation between two compared objects does not count as “differing relation”.

3.5. Comparability/Connectability Via Non-transitive Relations

As explained in the introduction, our “comparability ODR” permits someone to check that particular properties are systematically used not simply whenever this is possible (e.g. when the signatures associated to these properties allow their use) but whenever this is relevant for the knowledge providers, i.e. whenever they think their use are true. Thus, whether the properties are transitive or not, comparability ODRs are one way to ensure “schema-based coverage” [PlanetData D2.1: Mendes et al., 2012; pp. 26-29], i.e. the completeness of the KB with respect to particular sets of relations or “knowledge entering constraints” (like database schemas or SHACL constraints).

Our “comparability ODR via particular relations” also systematically checks that objects in a KB are comparable or uncomparable via equivalence or sameAs relations. When this is not relevant, checking the “connectability or un-connectability” of the objects by the selected relations is sufficient. This is why Figure 1 proposes this option and Section 5 represents it. To use it with the above proposed SPARQL queries, it is sufficient to remove the lines related to sameAs or equivalence relations (equivalentClass and equivalentProperty relations) and differentFrom relations (e.g. disjointWith and complementOf relations).

4. Other Comparisons With Other Works and Conclusion

As previously illustrated, the “comparability ODR” generalizes – or permits one to generalize – some best practices, ontology patterns or methodologies that advocate the use of particular relations between particular objects, and supports an automated checking of the compliance with these practices, patterns or methodologies. This leads to the representation of knowledge that is more connected and precise, or with less redundancies. Since the comparability ODR can be used for evaluating a KB – e.g. by applying it to all its objects and dividing the number of successful cases by the number of objects – it can also be used to create KB evaluation criteria/measures, typically for measuring the (degree of) completeness of a KB, with respect to some criteria.

As noted in [Zaveri et al., 2016], a survey on quality assessment for Linked Data, (dataset) completeness commonly refers to a degree to which the “information required to satisfy some given criteria or a given query” are present in the considered dataset. To complement this very general definition, we distinguish two kinds of (dataset) completeness.

Constraint-based completeness measures the percentage of elements in a dataset that satisfy explicit representations of what – or how – information must be represented in the dataset. These representations are constraints such as integrity constraints or, more generally, constraints expressed by database schemas, structured document schemas, or schemas enforcing ontology design patterns. E.g., in a particular dataset, the constraint that at least one movie must be associated to each movie actor, or the constraint that all relations must be binary.
Real-world-based completeness measures the degree to which particular kinds of real-world information are represented in the dataset. E.g., regarding movies associated to an actor, calculating this completeness may consist in dividing “the number of movies associated to this actor in the dataset” by “the number of movies he actually played in, i.e. in the real world”. Either the missing information are found in a gold standard dataset or the degree is estimated via completeness oracles [Galárraga & Razniewski, 2017], i.e. rules or queries estimating what is missing in the dataset to answer a given query correctly. Tools such as SWIQA and Sieve help perform measures for this kind of completeness.

All the completeness criteria/measures collected by [Zaveri et al., 2016] – schema/property/population/interlinking completeness – “assume that a gold standard dataset is available”. Hence, they are all subkinds of real-world based completeness. However, constraint-based completeness is equally interesting and, for its subkinds, categories named schema/property/population/interlinking completeness could also be used or have been used [PlanetData D2.1: Mendes et al., 2012; pp. 26-29] [Dodds & Davis, 2012]. What the comparability ODR can be reused for to ease the measure of completeness is about constraint-based completeness. As illustrated in this article, checking such a completeness may lead the KB authors to represent information that increase the KB precision and then enable the finding of yet-undetected problems. Increasing such a completeness does not mean increasing inferencing performance. Missing information found by checking constraint-based completeness might automatically be findable in some other datasets; some techniques to find such information can be borrowed from research related to real-world-based completeness.

This article showed how SPARQL queries could be used for implementing comparability ODRs. More generally, most transformation languages or systems that exploit KRs could be similarly reused. [Zamazal & Svátek, 2015] and [Corby & Faron-Zucker, 2015] present such systems. The proposed SPARQL queries have been validated experimentally (using Corese [Corby & Faron-Zucker, 2015], a tool which includes an OWL-2 inference engine and a SPARQL engine). Unsurprisingly, in the tested existing ontologies, many objects were not compliant with the ODRs.

5. Appendix: One Logic-based Function For Checking Comparability

Figure 1 lists ODR options and shows a function call with selected ODR options as parameters. After the conversion of its last three parameters into formal types, a similar call can be made to a function that exploits these formal types. Below are specifications for such a function – sub:object_comparability-or-uncomparability_to_other_objects – and for the types that it exploits. These specifications use FL, a concise higher-order logic based knowledge representation language. All types below are in the “sub” namespace.

object_comparability-or-uncomparability_to_other_objects //function type .(?objType, ?relTypes, ?comparability_relation, ?kb) //input variables -: ?a //result variable :=> //":=>" means "has for definition by necessary conditions defined by the next statement: [...]" [any ^(?objType attr: ?a, member of: ?kb), //any instance of ?objType that has for attribute ?a and ?comparability_relation: // that is member of ?kb is in ?comparability_relation .{?relTypes,?kb,?objType,?kb} ], // with this set of variables \. //"\." means "has for subtype(s)"; here there is only one subtype (cf. next line) (object_comparability-or-uncomparability_to_each_other_object .(relTypes, ?objectType, //and again, this new function type has these input variables comparable-or-uncomparable_to_each_other_object_via-and-for ?comparability_relation, ?kb) -: ?a //and this result variable := //":=>" means "has for definition by necessary and sufficient conditions" //the next line calls the first function defined above (which returns an attribute type) object_comparability-or-uncomparability_to_other_objects _(?relTypes,?objectType, ?comparability_relation,?kb), \. (object_comparability-or-uncomparability_to_each_other_object_via_subClass_in_the_KB ?a := [?a = object_comparability-or-uncomparability_to_other_objects _(rdfs#subClassOf, //in Turtle, this would be "rdfs:subClassOf" rdfs#Class,comparable-or-uncomparable_to_each_other_object_via-and-for, (the KB ?kb member: (the object i_attr: ?a)) )] ) ). comparable-or-uncomparable_to_each_other_object_via-and-for .(?x, .{?relTypes, ?kb, ?objectType, ?inKb} ) := [ [^y member of: ?kb, != ?x, type: ?objectType] //"^y": universally quantified variable => [?x comparable-or-uncomparable_via: .{?y,?relTypes} ] ]_[description_content-or-instrument-or-container: ?inKb], \. (connected_or_un-connectable_to_each_other_object_via-and-for .(?x, .{?relTypes, ?kb, ?objectType, ?inKb}) := [ [^y member of: ?kb, != ?x, type: ?objectType] => [?x directly-connected_or_un-connectable_via-and-for: .{?y,?relTypes}] ]_[description_content-or-instrument-or-container: ?inKb] ) (comparable-or-uncomparable_to_each_other_object_with-minimal-differentia_via-and-for .(?x, .{?trRelTypes, ?kb, ?objectType, ?inKb}) := [ [^y member of: ?kb, != ?x, type: ?objectType] => [?x comparable-or-uncomparable_with-minimal-differentia_via-and-for: .{?y, ?trRelTypes} ] ]_[//"_[...] is a meta-statement on the previous statement ([...]) description_content-or-instrument-or-container: ?inKb] ). comparable-or-uncomparable_via-and-for .(?x, .{?y, ?relTypes}) := [ [^r member of: ?relTypes] => [?x comparable-or-uncomparable_via: .{?y, ^r}] ], \. (connected_or_un-connectable_via-and-for .(?x, .{?y, ?relTypes}) := [ [^r member of: ?relTypes] => [?x connected_or_un-connectable_via: .{?y, ^r}] ] ) (comparable-or-uncomparable_with-minimal-differentia_via-and-for .(?x, .{?y, ?relTypes}) := [ [^r member of: ?relTypes] => [?x comparable-or-uncomparable_with-minimal-differentia_via: .{?y,^r}] ] ). comparable-or-uncomparable_via .(?x, .{?y, ?r}) //at least weakly (un-)comparable := [ OR_{ [?x equal_or_equivalent: ?y], //e.g.: ?x = ?y, ?x <=> ?y [?x connected_or_un-connectable_via: .{?y, ?r}] } ], \. (connected_or_un-connectable_via .(?x, .{?y, ?r}) := OR_{ [?x ?r: ?y], //e.g.: ?x \. ?y, ?x part: ?y, ?x |. ?y [?x ?r of: ?y], //e.g.: ?x /^ ?y, ?x part of: ?y, ?x |^ ?y [?x !?r: ?y], [?x !?r of: ?y] } ) //reminder: `!' is open-world_not (strongly_comparable-or-uncomparable_via .(?x, .{?y, ?r}) := [ [?x comparable-or-uncomparable_via: .{?y, ?r}], [ [?x !equal_or_equivalent: ?y] => OR_ //warning: with gSpec, next line ok (!ok for cSpec_excl exactly) { [ [^ix type: (^sx ?r of: (?x type: Type)), //e.g.: ?x cSpec_excl: ?y != (^iy type: (^sy ?r of: ?y)) ] => [ [?ix != ?iy] [?sx != ?sy] ] ] [ [^ix cSpec of: (^sx ?r of: (?x !type: Type)), //e.g.: ?x part_excl: ?y != (^iy cSpec of: (^sy ?r of: ?y)) ] => [ [?ix != ?iy] [?sx != ?sy] ] ] } ] ] ) (comparable-or-uncomparable_with-minimal-differentia_via .(?x, .{?y, ?r}) //generalization of genus & partial-differentia via subtype relations := "?x is comparable-or-not_to ?y and either ?x = ?y or they differ by at least 1 relation of a type != ?r" [ [?x comparable-or-uncomparable_via: .{?y, ?r}] [ [?x !equal_or_equivalent: ?y] => OR_{ [ [?x (?r2!=?r): a ?z] [?x not ?r2: a ?z] ] [ [?y (?r2!=?r): a ?z] [?x not ?r2: a ?z] ] } ] ] ).

6. References

Bachimont B., Isaac A., Troncy R. (2002). Semantic Commitment for Designing Ontologies: A Proposal. In: EKAW 2002, Knowledge Engineering and Knowledge Management: Ontologies and the Semantic Web, LNCS, volume 2473, pp. 114–121, Springer Berlin, Siguenza, Spain.
Chein M., Mugnier M. (2008). The BG Family: Facts, Rules and Constraints. Graph-based Knowledge Representation - Computational Foundations of Conceptual Graphs. Chapter 11 (pp. 311–334), Springer-Verlag London, 428p.
Corby O., Faron-Zucker C. (2015). STTL: A SPARQL-based Transformation Language for RDF. In: WEBIST 2015, 11th International Conference on Web Information Systems and Technologies, Lisbon, Portugal.
Corman J., Aussenac-Gilles N., Vieu L. (2015). Prioritized Base Debugging in Description Logics. In: JOWO@IJCAI 2015.
Djakhdjakha L., Mounir H., Boufaïda Z. (2014). Towards a representation for multi-viewpoints ontology alignments. In: IJMSO, International Journal of Metadata, Semantics and Ontologies, 9(2), pp. 91–102, Inderscience Publishers, Geneva.
Djedidi R. & Aufaure M. (2009). Ontology Change Management. In: I-SEMANTICS 2009, pp. 611–621
Dodds L., Davis I. (2012). Linked Data Patterns – A pattern catalogue for modelling, publishing, and consuming Linked Data, http://patterns.dataincubator.org/book/, 56 pages, 2012-05-31.
Dromey R.G. (2006). Scaleable Formalization of Imperfect Knowledge. In: AWCVS 2006, 1st Asian Working Conference on Verified Software, pp. 29–31, Macao SAR, China.
Farias Lóscio B., Burle C., Calegari N. (2017). Data on the Web Best Practices. W3C Recommendation 31 January 2017. Web document: https://www.w3.org/TR/dwbp/
Galárraga L., Hose K., Razniewski S. (2017). Enabling completeness-aware querying in SPARQL. In: WebDB 2017, pp. 19–22, Chicago, IL, USA.
Harris S., Seaborne A. (2013). SPARQL 1.1 Overview. W3C Recommendation 21 March 2013. Web document: https://www.w3.org/TR/2013/REC-sparql11-overview-20130321/
Hitzler P., Krötzsch M., Parsia B., Patel-Schneider P.F., Rudolph S. (2012). OWL 2 Web Ontology Language Primer (Second Edition). W3C Recommendation 11 December 2012. Web document: https://www.w3.org/TR/owl2-primer/
Knublauch H., Kontokostas D. (2017). Shapes Constraint Language (SHACL). W3C Recommendation 20 July 2017. Web document: https://www.w3.org/TR/shacl/
Marino O., Rechenmann F., Uvietta P. (1990). Multiple Perspectives and Classification Mechanism in Object-Oriented Representation. In: ECAI 1990, pp. 425–430, Pitman Publishing London, Stockholm, Sweden.
Martin Ph. (2003). Correction and Extension of WordNet 1.7. In: ICCS 2003 (Springer, LNAI 2746, pp. 160–173), Dresden, Germany, July 21-25, 2003.
Martin Ph. (2011). Collaborative knowledge sharing and editing. International Journal on Computer Science and Information Systems (IJCSIS; ISSN: 1646-3692), Volume 6, Issue 1, pp. 14–29, 2011.
Martin Ph. (2018). Evaluating Ontology Completeness via SPARQL and Relations-between-classes based Constraints. In: IEEE QUATIC 2018 (pp. 255–263), 11th International Conference on the Quality of Information and Communications Technology, Coimbra, Portugal, September 4-7, 2018.
Mendes P.N., Bizer C., Miklos Z., Calbimonte J.P., Moraru A., Flouri G. (2012). D2.1 Conceptual model and best practices for high-quality metadata publishing. Delivery 2.1 of PlanetData, FP7 project 257641
Presutti V., Gangemi A. (2008). Content Ontology Design Patterns as practical building blocks for web ontologies. In: ER 2008, Spain (2008). See http://ontologydesignpatterns.org
Rector A., Brandt S., Drummond N., Horridge M., Pulestin C., Stevens R. (2012). Engineering use cases for modular development of ontologies in OWL. Applied Ontology, 7(2), pp. 113–132, IOS Press.
Roussey C., Corcho Ó, Vilches Blázquez L.M. (2009). A catalogue of OWL ontology antipatterns. In: K-CAP 2009, Redondo Beach, CA, USA, pp. 205–206.
Ruy F.B., Guizzardi G., Falbo R.A., Reginato C.C., Santos V.A. (2017). From reference ontologies to ontology patterns and back. Data & Knowledge Engineering, Volume 109, Issue C, May 2017, pp. 41–69, DOI: https://doi.org/10.1016/j.datak.2017.03.004.
Zamazal O., Svátek V. (2015). PatOMat – Versatile Framework for Pattern-Based Ontology Transformation. Computing and Informatics, 34(2), pp. 305–336.
Zaveri A., Rula A., Maurino A., Pietrobon R., Lehmann J., Auer S. (2016). Quality assessment for linked data: A survey. Semantic Web, 7(1), pp. 63–93.