Sharing and Comparing Information about Knowledge Engineering

Dr Philippe Martin
Griffith University
School of ICT
PMB 50 Gold Coast MC, QLD 9726            
AUSTRALIA
wseas@phmartin.info
http://www.phmartin.info
Dr Michel Eboueya
University of La Rochelle
Laboratoire Informatique Image et Interaction
Avenue Michel Crépeau, 17042 La Rochelle Cedex 1
FRANCE
mike@univ-lr.fr
http://www.univ-lr.fr/labo/l3i/site_statique/


Abstract: - Nowadays, researchers and developers in knowledge engineering do not add information about their ideas and tools into a shared semantic network. They use documents (articles, emails, documentation, etc.). Therefore, finding and comparing tools or techniques for learning purposes or for solving a problem is a lengthy process (most often with sub-optimal results) that involves reading many documents that are partly redundant with each other. Our knowledge server WebKB-2 supports the collaborative building of a formal or semi-formal semantic network, and we have begun creating such a network to permit a scalable sharing of information about knowledge engineering. This article illustrates this work, its principles, and an approach to ease the representation and comparison of tools or techniques.

Key-Words: - Knowledge engineering, Knowledge sharing, Knowledge retrieval, Ontology, CSCW



Introduction

Nowadays, as in any other domain, publishing information about knowledge engineering (KE) most often involves writing sentences in a document. This is a lengthy process that involves summarizing or describing ideas or facts that have already been presented by countless other persons, and making rather arbitrary choices and compromises about which information to describe, at which level of detail, in which order, etc. Furthermore, the result of this exercise only adds to the volume of poorly structured and heavily redundant data that the author and other persons later have to sift through to find information.

The problem is that information about KE is currently not structured into a semantic network of techniques or ideas that a Web user could (i) navigate to get a synthetic view of a subject or quickly find her way to relevant information, as in a decision tree, and (ii) easily update to publish a new idea (or the explanation of an idea at a new level of detail) and link it to other ideas via semantic relations. Various small steps toward that goal can be observed.

The best known is that Wikipedia has a page about KE and many pages about KE-related objects. However, using Wikipedia (in connection with other wikis, since the content of Wikipedia is meant to remain of "encyclopaedic" nature, that is, not too technical) is not a scalable approach. Indeed, current wikis, even semantic wikis such as Semantic MediaWiki, do not provide the minimal support needed for the collaborative building of a large well organized semantic network: no initial large lexical ontology, no intuitive expressive notation, no structural and ontological guidelines, no editing/sharing protocols, and extremely limited knowledge checking, querying and browsing features. Thus, current semantic wikis remain mostly informal and poorly structured. For example, the knowledge representation language (KRL) of Semantic MediaWiki does not permit expressing quantifiers, collections or meta-information (not even the author of a statement, a kind of information that is essential to support editing/sharing protocols and filtering mechanisms), and it only permits representing relations within hyperlinks, with the object of the page as their source (hence, for example, to represent the semantic content of a table, a user would have to create as many pages as there are columns or rows in the table).

The same restricted approach (with a similar KRL within hyperlinks) was used in the well-publicized KA2 project [1][Benjamins & al., 1998], which re-used Ontobroker and aimed to let Knowledge Acquisition (KA) researchers index their KA resources within their Web pages. (The pages of the registered researchers were loaded from time to time into Ontobroker and the various bits of knowledge were then aggregated when this was possible.) Furthermore, the provided ontology was extremely small (only 37 domain names) and could not be directly updated by users. Thus, this approach was extremely limiting, was not followed by many KA researchers, and could not support the representation or indexation of research ideas.

Finally, Fact Guru (the commercial successor of CODE4 [11][Skuce and Lethbridge, 1995]), a knowledge base (KB) server with a semi-formal English-like syntax supporting minimal knowledge processing, once let users access and complement a small KB on Object-Oriented Software Engineering. There are many informal states of the art about KE, some home pages gathering information about projects related to KE (e.g., [2][Clark, 2005]) and also surveys about tools (e.g., [3][Denny, 2004]), but we found no KB server (nor static ontology) about KE research ideas, techniques or tools.

[10][Martin & al., 2006] showed how our KB server WebKB-2 provides the above cited minimal support for the collaborative building of a large well organized KB or semantic network (with formal or informal nodes) and how the approach advantageously compares with less structured ones (e.g., [14][Stutt and Motta, 2004]) for knowledge retrieval and comparison, or for supporting learning and research. [10][Martin & al., 2006] used examples from our representation of teaching materials. In this article, after a short summary of WebKB-2's approach, we illustrate the ontology that we have begun building to permit a scalable sharing of information about KE. More precisely, we illustrate each of the sections which, to support readability, search, checking and systematic input, we used to modularise the input files that we created for this ontology. These sections have names such as "Domains and Theories", "Tasks and Methodologies", "Structures and Languages", "Tools", "Journals, Conferences and Mailing Lists", "Articles, Books and other Documents" and "People: Researchers, Specialists, Teams/Projects, ...". The input files [9][classif] have names such as "Fields of study", "Systems of logic", "Information Sciences", "Knowledge Management", "Conceptual Graph" and "Formal Concept Analysis" (the last three files specialize the others). Finally, we show how tables can be generated to ease the representation and comparison of tools or techniques.



Summary of WebKB-2's approach

[5][Martin, 2002] introduces three notations used by WebKB-2 - FL (For-links), Formalized English (FE) and FCG (Frame-CG) - derived from the Conceptual Graph linear form (CGLF) [13][Sowa, 1984] to improve on its readability, expressivity and "normalizing" characteristics (their combination is what made Conceptual Graphs famous). Their expressivities are respectively similar to those of RDF+OWL, CGLF and KIF. FL is adapted to the case of "links" (simple relations between categories or statements) and permits representing a large volume of knowledge in a structured way and in a small amount of space, which is important for browsing a large KB. In the three notations, the connected objects can be formal statements (written in FE or FCG) as well as informal statements (mere strings of characters), thus permitting the users to choose the level of detail that suits their goals and to refine their representations incrementally (if and when they wish to).

The example below is needed for the understanding of later examples. It shows translations of English (E) sentences into FL (note: "<" means "subtype of" and ">" means "has for subtype"). The first example uses informal terms. The second example shows the creator of each formal term and relation. For example, "wn#body" is an identifier for the WordNet concept that has for names "body", "organic_structure" and "physical_structure". Hence, another identifier for this concept is "wn#body__organic_structure__physical_structure". Since a name (an informal term) can have many meanings, it can be shared by many categories (concepts or relations). The KB of WebKB-2 was created by transforming WordNet 1.7 into a genuine lexical ontology and extending it with several top-level ontologies and domain-related ontologies [7][Martin, 2003b]. In WebKB-2, the "wn" creator may be left implicit (it will be omitted in all other examples).

E: Any human_body is a body and has at most 2 arms, 2 legs and 1 head.
   Any arm, leg and head belongs to at most 1 human body.
   Male_body and female_body are exclusive subtypes of human_body
   and so are juvenile_body and adult_body.
FL: human_body  < body,
      part: arm (any->0..2, 0..1<-any)  leg (any->0..2, 0..1<-any)
            head (any->1, 1<-any),
      > {male_body female_body}(exclusive)
        {juvenile_body adult_body}(exclusive);

E:  According to Jun Jo (who has for user id "jj"), 
    a  body (as understood in WordNet 1.7) may have for
    part (as  understood by "pm") a leg or two ("leg", as defined by "fg")
    and has for part exactly 1 head (as understood by "oc").
FL:  wn#body  pm#part:  fg#leg (jj,any->0..2)  oc#head (jj,any->1);
FCG: [any wn#body, pm#part: {0 to 2 fg#leg,  1 oc#head}](jj);
FE:  `A wn#body has for pm#part 0 to 2 fg#leg and for pm#part 1 oc#head'(jj).
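To make the structure of such a link more concrete, here is a minimal Python sketch of how an identifier and an FL link with its creator and cardinalities might be held in memory. It is only an illustration under our own assumptions: the names split_identifier, Link and allows are hypothetical and are not part of WebKB-2.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


def split_identifier(identifier: str) -> Tuple[str, List[str]]:
    """Split "creator#name1__name2" into (creator, [name1, name2]).

    A missing or empty creator defaults to "wn", as in the examples above.
    """
    creator, _, names = identifier.rpartition("#")
    return (creator or "wn", names.split("__"))


@dataclass(frozen=True)
class Link:
    """Hypothetical in-memory form of one FL link with its meta-information."""
    source: str
    relation: str
    destination: str
    creator: str               # user id of the link creator, e.g. "jj"
    min_count: int             # lower bound of "any->min..max"
    max_count: Optional[int]   # upper bound; None would mean "no upper bound"

    def allows(self, count: int) -> bool:
        """True if one source instance may have `count` such destinations."""
        if count < self.min_count:
            return False
        return self.max_count is None or count <= self.max_count


# FL: wn#body  pm#part:  fg#leg (jj,any->0..2)  oc#head (jj,any->1);
leg_link = Link("wn#body", "pm#part", "fg#leg", "jj", 0, 2)
head_link = Link("wn#body", "pm#part", "oc#head", "jj", 1, 1)
```

Keeping the creator on the link itself (rather than on the connected categories) mirrors the fact that, in the FL statement, "jj" qualifies each link and not the terms "fg#leg" or "oc#head".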

The FL example below shows two small extracts from a "structured discussion" about the use of XML for knowledge representation, a topic that leads to recurrent debates on many KE-related mailing lists. Parentheses are used for two purposes: (i) allowing the direct representation of links from the destination of a link, and (ii) representing meta-information on a link, such as its creator (for example, the user registered as "pm") or a link on this link (e.g., an objection by "pm" on the use of an objection link by "fg", without stating anything about the destination of this link). The content of the sentences and their indentation should permit the understanding of these two different uses. The use of dashes to list joint arguments/objections (e.g., a rule and its premise) should also be self-explanatory. The use of specialization links between informal statements may seem odd but several argumentation systems use such links: they are needed for modularising purposes and for checking the updates of argumentation structures, and hence guiding or exploiting these updates (e.g., the (counter-)arguments for a statement also apply to its specializations and the (counter-)arguments of the specializations are (counter-)examples for their generalizations). Few argumentation systems allow links on links (ArguMed is one of the exceptions) and hence most of these systems force incorrect representations of discussions. Even fewer provide a textual notation that is not XML-based, hence a notation readable and usable without an XML editor or a graphical interface. All our structured discussions are in [9][classif].

"knowledge_representation_or_exchange_with_XML is useless"
   argument: ("the use_of_XML_tools_by_KBSs is a useless additional task"
                 argument: "the internal_use_of_XML_by_a_KBS is useless" (pm,
                   objection: "knowledge_representation_or_exchange_with_XML is possible" (fg,
                     objection: "knowledge_representation_or_exchange_with_non-XML-languages
                                 is possible" (pm),
                     objection: "knowledge_representation_in_a_KBS_with_a_non-XML_language 
                                 is necessary" (pm)))
             )(pm);

"knowledge_representation_or_exchange_with_XML is possible"
   argument: - "the re-use_of_a_classic_XML_tool (parser, XSLT, ...) is permitted by
                the use_of_an_XML_notation" (pm)
             - "the re-use_of_a_classic_XML_tool is possible even when a graph-based
                model is used" (pm),
   argument of: ("a KR_language should have at least one XML_notation for input/output format"
                   specialization: "the Semantic_Web_KRL should have an XML_notation" (pm),
                   specialization of: `a KR_language can have for notation an XML_notation' (pm)
                )(pm);
The last sentence in the above example is in FE. The other sentences are informal but all the terms that include an underscore can automatically be associated to formal terms such as km#use_of_XML_tools_by_a_KBS which, given its definition in FCG below, could be retrieved by conceptual navigation/query via the informal terms "use" and/or "KBS" and/or "XML" and/or "tool" and/or any recorded synonym for them. (Note: spaces after a backslash within a term are ignored.) Sentences using formal terms are retrievable via them. Furthermore, writing sentences by beginning with their main object (generally, a term for a process) considerably reduces the number of ways a sentence can be written, helps make it non-contextual (i.e., leads to explicit details) and eases its comparison to other related sentences.
km#use_of_XML_tools_by_a_KBS = [a wn#use, agent: a km#KBS, object: several km#XML_tool];
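
The rule mentioned earlier for specialization links between statements (the arguments and objections on a statement also apply to its specializations) can be illustrated by the following Python sketch. It is a simplified assumption of ours, not the mechanism implemented in WebKB-2: the function names are hypothetical and links on links are not modelled.

```python
from collections import defaultdict

# statement -> set of its direct generalizations
generalizations = defaultdict(set)
# statement -> list of (kind, text, creator) links pointing at it
arguments = defaultdict(list)


def add_specialization(general, special):
    """Record that `special` is a specialization of `general`."""
    generalizations[special].add(general)


def add_argument(target, kind, text, creator):
    """Attach an argument or objection (with its creator) to `target`."""
    arguments[target].append((kind, text, creator))


def applicable_arguments(statement):
    """Links on `statement` itself or on any of its generalizations."""
    seen, stack, found = set(), [statement], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        found.extend(arguments[current])
        stack.extend(generalizations[current])
    return found


general = "a KR_language should have at least one XML_notation for input/output format"
special = "the Semantic_Web_KRL should have an XML_notation"
add_specialization(general, special)
add_argument(general, "argument",
             "knowledge_representation_or_exchange_with_XML is possible", "pm")
```

With these data, the argument entered by "pm" on the general statement is also returned for its specialization, which is the kind of inheritance that makes specialization links useful for modularising a discussion.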

The approach of WebKB-2, which is based on a KB shared by all its users, supports and encourages knowledge re-use, precision and connectivity, more than any other current approach [6][Martin, 2003a]. Here is a summary of its principles.

Each category has an associated creator who is also represented by a category and thus may have associated statements. Each statement also has an associated creator and hence, if it is not a definition, may be considered as a belief. Any object (category or statement) may be re-used by any user within her statements. Only the creator of an object may remove it but any user may "correct" a belief by connecting it to another belief via a "corrective relation" (e.g., pm#corrective_restriction). (Definitions cannot be corrected since they cannot be false; similarly, definitions from different users cannot be inconsistent with each other, they simply define different categories/meanings.) If entering a new belief introduces a redundancy or an inconsistency that is detected by the system, it is rejected. The user may either modify her belief or re-enter it connected by a "corrective relation" to each belief it is redundant or inconsistent with: this makes explicit the disagreement of one user with (her interpretation of) the belief of another user. Knowledge filters exploiting those relations and details about the creators may then be specified by a user for an application or to ease browsing. For example, a user may specify that during her browsing of the KB, she does not want to see statements that have been corrected nor those from people belonging to certain organizations.
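
The editing protocol above can be summarized by a small Python sketch. It is a deliberately simplified illustration under our own assumptions, not WebKB-2's implementation: the class and method names are hypothetical, and the `conflicts` parameter merely stands in for the redundancy/inconsistency detection that the system performs itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Belief:
    ident: int
    creator: str
    text: str
    corrected_by: List[int] = field(default_factory=list)


class SharedKB:
    """Toy shared KB: beliefs, corrective relations, creator-based filters."""

    def __init__(self) -> None:
        self.beliefs: Dict[int, Belief] = {}
        self._next_id = 1

    def add(self, creator, text, conflicts=(), corrects=()):
        """Add a belief; reject it if a detected conflict is not corrected.

        `conflicts` is an assumption: ids of beliefs that the system would
        have found redundant or inconsistent with the new one.
        """
        unresolved = set(conflicts) - set(corrects)
        if unresolved:
            raise ValueError(f"rejected: conflicts with {sorted(unresolved)}")
        targets = [self.beliefs[t] for t in corrects]  # KeyError if unknown
        belief = Belief(self._next_id, creator, text)
        self._next_id += 1
        self.beliefs[belief.ident] = belief
        for target in targets:
            target.corrected_by.append(belief.ident)
        return belief.ident

    def remove(self, user, ident):
        """Only the creator of a belief may remove it."""
        if self.beliefs[ident].creator != user:
            raise PermissionError("only the creator may remove a belief")
        del self.beliefs[ident]
        for other in self.beliefs.values():
            if ident in other.corrected_by:
                other.corrected_by.remove(ident)

    def visible(self, hide_corrected=True, hidden_creators=frozenset()):
        """Filter: skip corrected beliefs and beliefs of unwanted creators."""
        return [b for b in self.beliefs.values()
                if b.creator not in hidden_creators
                and not (hide_corrected and b.corrected_by)]
```

In this sketch a corrected belief is never deleted: it is only hidden by the default filter, so that another user may still choose to see it by calling visible(hide_corrected=False).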

Finally, in order to encourage users to enter precise and original statements, in [10][Martin & al., 2006] we proposed an algorithm to evaluate the popularity and originality of each contribution and contributor based on votes on statements and argumentation relations from them. Ideally, this algorithm is used with parameters given by each user to specify her own view about which statements or users are interesting to view, and hence better filter the KB during her browsing.

The notations, protocols and large ontology proposed by WebKB-2 are necessary to ease and normalize the cooperative construction of a KB but are insufficient: an initial ontology for the targeted domain is also necessary for people to know how to represent their pieces of information so that the KB remains well organized. The next sections discuss this initial ontology for KE.



Domains and Theories

Names used for domains ("fields of study") are very often also names for tasks. Task categories are more convenient for representing knowledge than domain categories because (i) organizing them is easier and less arbitrary, and (ii) many relations (e.g., case relations) can then be used. Since for normalization purposes a choice must be made, whenever suitable we have represented tasks instead of domains. When names are shared by domain categories and task categories (in WebKB-2, categories can share names but not identifiers), we advise the use of the task categories for indexing or representing resources.

When studying how to represent and relate document subjects/topics (e.g., technical domains), [15][Welty and Jenkins, 1999] concluded that representing them as types was not semantically correct but that mereo-topological relations between individuals were appropriate. Our own analysis confirmed this and we opted for (i) an interpretation of theories and fields of study as large "propositions" composed of many sub-propositions (this seems the simplest, most precise and most flexible way to represent these notions), and (ii) a particular part relation that we named ">part" (instead of "subdomain") for several reasons: to be generic, to remind that it can be used in WebKB-2 as if it were a specialization relation (one of the advantages is that the destination category need not be already declared) and to make clear that our replacement of WordNet hyponym relations between synsets about fields of study by ">part" relations refines WordNet without contradicting it. Our file on "Fields of study" [9][classif] details these choices. Our file on "Systems of logic" [9][classif] illustrates how for some categories the represented field of study is a theory (not a reference to it), thus simplifying and normalizing the categorization. Below is an example of relations from the WordNet category #computer_science, followed by an example about logical domains/theories. When introducing general categories in Information Sciences and Knowledge Management, and links that do not come from WordNet, we used the "generic users" "is" and "km" (anyone can add knowledge for these users).

#computer_science__computational_science
  annotation: "engineering science that ...",
  >part:    #artificial_intelligence,
  >part: is#software_engineering_science (is),
  >part: is#database_management_science (is),
  >part of: #engineering_science,
  part:    #information_theory,
  part of: #information_science;
km#substructural_logic
 annotation: "system of ...",
 >part of: km#intuitionist_logic,
 >part: km#relevance_logic
        km#linear_logic;
km#CG_domain__Conceptual_Graphs
 >part of: km#knowledge_management_science,
 object: km#CG_task  km#CG_structure 
         km#CG_tool  km#CG_mailing_list,
 url: http://www.jfsowa.com/cg/;
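
Since ">part" links can be followed like specialization links, resources indexed under a field of study can be collected from all of its sub-fields. The Python sketch below illustrates this idea on a few of the categories above; it is an assumption-laden illustration (the dictionaries SUBPARTS and INDEXED and the function resources_under are hypothetical), not a description of how WebKB-2 answers such queries.

```python
# ">part" links taken from the FL extract above (a field -> its sub-fields)
SUBPARTS = {
    "#engineering_science": ["#computer_science"],
    "#computer_science": ["#artificial_intelligence",
                          "is#software_engineering_science",
                          "is#database_management_science"],
    "km#knowledge_management_science": ["km#CG_domain"],
}

# resources attached to a field (here, only the "url" link of km#CG_domain)
INDEXED = {
    "km#CG_domain": ["http://www.jfsowa.com/cg/"],
}


def resources_under(domain):
    """Resources indexed by `domain` or by any of its direct/indirect >parts."""
    found = list(INDEXED.get(domain, []))
    for sub in SUBPARTS.get(domain, []):
        found.extend(resources_under(sub))
    return found
```

For example, resources_under("km#knowledge_management_science") returns the URL attached to km#CG_domain even though no resource is attached to the more general field itself.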

To provide a core ontology that will guide the sharing, indexation or representation of techniques in Knowledge Management, hundreds of categories will need to be represented. We have only begun this work. In the KA2 project [1][Benjamins & al., 1998], the ontology was predefined and a good part of it was a hierarchy of 37 Knowledge Acquisition (KA) domains, the names of which also allude to tasks, structures, methods (PSMs) and experiments. E.g., this hierarchy included:
reuse_in_KA > ontologies PSMs;
PSMs > Sysiphus-III_experiment;



Tasks and Methodologies

In most model libraries for KA (e.g., the library of KADS), each non-primitive task is linked to techniques that can be used for achieving this task, and conversely, each technique combines the results of more primitive tasks. We tried this organization but at the level of generality of our current modelling it turned out to be inadequate: it led (i) to arbitrary choices between representing something as a task (a kind of process) or as a technique (a kind of process description), or (ii) to the representation of both notions and thus to the introduction of categories with names such as KA_by_classification_from_people; both cases are problematic for readability and normalization. Similarly, instead of representing methodologies directly, that is, as another kind of process description, it seems better to represent the tasks advocated by a methodology (including their uppermost supertask: following the methodology). Furthermore, with tasks, many relations can then be used directly: similar relations do not have to be introduced for techniques or methodologies (the relation hierarchy should be kept small, if only for normalization purposes). Hence, we represented all these things as tasks and used multi-inheritance. This considerably simplified the ontology and the source files. Below are some extracts. (Note: in FL, FE and FCG, relation names may be used instead of relation identifiers when there is no ambiguity; in this example, the curly brackets enclose an open subtype partition of exclusive subtypes.)
km#KM_task__knowledge_management_task
 < is#information_sciences_task,
 > km#knowledge_representation
   km#knowledge_extraction_and_modelling  
   km#knowledge_comparison 
   km#knowledge_retrieval_task 
   km#knowledge_creation  km#classification 
   km#KB_sharing_management 
   km#mapping/merging/federation_of_KBs 
   km#knowledge_translation
   km#knowledge_validation
   {km#monotonic_reasoning
    km#non_monotonic_reasoning}
   {km#consistent_inferencing
    km#inconsistent_inferencing}
   {km#complete_inferencing 
    km#incomplete_inferencing}
   {km#structure-only_based_inferencing
    km#rule_based_inferencing}
   km#language/structure_specific_task
   km#teaching_a_KM_related_subject
   km#KM_methodology_task,
 object of: km#knowledge_management_science,
 object: km#KM_structure; //Note: the relation "object" has
  //different meanings depending on the connected categories
   km#knowledge_retrieval_task  < is#IR_task,
    > {km#specialization_retrieval
       km#generalization_retrieval}
      km#analogy_retrieval 
      km#structure_only_based_retrieval 
      {km#complete_retrieval 
       km#incomplete_retrieval}
      {km#consistent_retrieval
       km#inconsistent_retrieval};
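
Because multi-inheritance is combined with partitions of exclusive subtypes, a new category must not specialize two members of the same partition. The following Python sketch shows one simple way such a check could be made; it is only an illustration under our own assumptions (the "ex#" categories are hypothetical, and the function names are not those of WebKB-2).

```python
# category -> direct supertypes (a subset of the extract above, plus two
# hypothetical user categories prefixed by "ex#")
SUPERTYPES = {
    "km#knowledge_retrieval_task": {"km#KM_task", "is#IR_task"},
    "km#specialization_retrieval": {"km#knowledge_retrieval_task"},
    "km#generalization_retrieval": {"km#knowledge_retrieval_task"},
    "km#complete_retrieval": {"km#knowledge_retrieval_task"},
    "km#incomplete_retrieval": {"km#knowledge_retrieval_task"},
    "ex#retrieval_a": {"km#complete_retrieval", "km#specialization_retrieval"},
    "ex#retrieval_b": {"km#complete_retrieval", "km#incomplete_retrieval"},
}

# each set lists subtypes declared exclusive by curly brackets in FL
EXCLUSIVE = [
    frozenset({"km#specialization_retrieval", "km#generalization_retrieval"}),
    frozenset({"km#complete_retrieval", "km#incomplete_retrieval"}),
]


def ancestors_or_self(category):
    """The category plus all its direct and indirect supertypes."""
    seen, stack = set(), [category]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUPERTYPES.get(current, ()))
    return seen


def violated_partitions(category):
    """Exclusive sets of which `category` specializes more than one member."""
    reached = ancestors_or_self(category)
    return [p for p in EXCLUSIVE if len(reached & p) > 1]
```

Here ex#retrieval_a is acceptable (it inherits from one member of each partition) whereas ex#retrieval_b would be reported since it specializes both km#complete_retrieval and km#incomplete_retrieval.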



Structures and Languages

In WebKB-2's top-level ontology [7][Martin, 2003b], pm#description_medium (top supertype of concept types for languages, data structures, ...) and pm#description_content (top supertype for fields of study, theories, document contents, software, ...) have for supertype pm#description because (i) such a general type grouping both notions is needed for the signatures of many basic relations, and (ii) classifying WordNet categories according to the two notions would have often led to arbitrary choices. We chose to represent the default ontology of WebKB-2 as being "a part of" WebKB-2 and hence we allowed pieces of information to be related by part relations. To further ease knowledge entry, WebKB-2 allows the use of generic relations such as part, object and support when the intended more precise relations (e.g., pm#subtask or pm#physical_part) can be automatically found.

For similar reasons, to represent "sub-versions" of ontologies, software, and more generally, documents, we use types connected by subtype relations. Thus, for example, km#WebKB-2 is a type (not an individual) and hence can be used with quantifiers.

km#KM_structure  < is#symbolic_structure,
 > {km#base_of_facts/beliefs  km#ontology
    km#KB_category  km#KB_statement}
   km#KB  km#KA_model  km#KR_language
   km#language_specific_structure;

  km#ontology
   > km#domain_ontology km#top_level_ontology 
     km#lexical_ontology km#language_ontology 
     km#concept_ontology km#relation_ontology
     km#multi_source_ontology__MSO,
     part: 1..* km#KB_category
           1..* km#category_definition;
  km#KR_language__KRL__KR_model_or_notation
   > {km#KR_model/structure  km#KR_notation}
     km#frame_oriented_language
     km#predicate_logic_oriented_language
     km#graph_oriented_language
     km#KR_language_with_query_commands
     km#KR_language_with_scripting_features,
   attribute: km#semantics;

  km#language_specific_structure > km#CG_structure;

   km#CG_structure > km#CG_statement  km#CG_language;



Tools

We first illustrate some specialization relations between tools. Then, we use the FCG notation to give some details on WebKB-2 and Ontolingua (the FL notation does not yet permit entering such details).
km#CG_related_tool
 < km#language/structure_specific_tool,
 > km#CG-based_KBMS km#CG_graphical_editor
   km#NL_parser_with_CG_output;

   km#CG-based_KBMS < km#KBMS,
    > {km#CGWorld  km#PROLOG\+CG
       km#CoGITaNT  km#Notio  km#WebKB};

      km#WebKB > {km#WebKB-1  km#WebKB-2},
               url: http://www.webkb.org;

km#input_language (*x,*y) =
 [*x, may be support of: (a km#parsing,
         input: (a statement, formalism: *y))];

[any pm#WebKB-2,              //", part:": has for part
  part:(a is#user_interface,  //"a ": existential quantifier
         part:{a is#API, a is#HTML_based_interface, 
               a is#CGI-accessible_command_interface,
               no is#graph_visualization_interface}),
  part: {a is#FastDB, a km#default_MSO_of_WebKB-2},
  input_language: a km#FCG,
  output_language: {a km#FCG, a km#RDF},
  support of: a is#regular_expression_based_search,
  support of: a km#specialization_structural_retrieval,
  support of: a km#generalization_structural_retrieval,
  support of: (a km#specialization_structural_retrieval,
    kind: {km#complete_inferencing, km#consistent_inferencing},
    input: (a km#query, expressivity: km#PCEF_logic),
    object: (several km#statement, expressivity: km#PCEF_logic)
              )];       //"PCEF": positive conjunctive existential formula

[any km#Ontolingua, 
  part: {a is#HTML_based_interface,
         no is#graph_visualization_interface,
         no DBMS, a km#ontolingua_library},
  input_language: a km#KIF,
  output_language:{a km#KIF, no km#RDF},
  support of: a is#lexical_search];

To permit the comparison of tools, many more details should be entered and similar structures or relations should be used by the various contributors, for example when expressing what the input languages of a tool can be. To that end, we re-used basic relations as much as possible (we did not introduce relations with names such as "re-used_DBMS" or "default_ontology"). The above examples show that for many features a simple normalized form can be found. However, for many other features this is more difficult. For example, consider the special features of WebKB-2 that support the storage, search and exploitation of relations between categories and their creators or various names. We have not yet found a satisfactory way to represent these features, nor the fact that Ontolingua only offers syntactic support for them: Ontolingua permits representing the above cited relations but the user has to define them in KIF and then define their exploitation in Lisp. Representing such information in detail is not only time consuming: the representations from different persons are also unlikely to be matchable and will be very difficult to use for comparing the tools via a generated table (as illustrated in the section "Example of comparison of two ontology-related tools"). Hence, less detailed descriptions using normalised simple relations should (instead or in addition) be provided. For the above cited features, a short FCG representation could be [any WebKB-2, special_support: a support_for_link_from_category_to_names] even though this would lead to the introduction of many categories for such "supports" in the ontology: from other viewpoints, it would have been preferable to re-use existing relations such as km#category_name.



Conferences, Journals, Publishers and Mailing Lists

Here are a few examples.

km#CG_mailing_list < km#KM_mailing_list,
 url: majordomo@cs.uah.edu;

km#ICCS__International_Conference_on_Conceptual_Structures
 instance: km#ICCS_2001 km#ICCS_2002 km#ICCS_2003 km#ICCS_2004 km#ICCS_2005;

is#publisher_in_IS  < #publishing_house,
 instance: is#Springer_Verlag  is#AAAI/MIT_Press  is#Cambridge_University_Press,
 object of: #information_science;



Articles and other Documents

The example below shows a simple document indexation using Dublin Core relations. We have done this work for all the articles of ICCS 2002 (see [5][Martin, 2002]). Representing ideas from the articles would be more valuable. Examples of representations of conferences, publishers, mailing lists, researchers and research teams are in [9][classif].
[an #article,
 dc#Coverage: km#knowledge_representation,
 pm#title: "What is a Representation?",
 dc#Creator: "Randall Davis, Howard E.
              Shrobe and Peter Szolovits",
 pm#object of: (a #publishing, pm#time:1993,
                pm#place:(the #object_section"14:1 p17-33",
                          pm#part of: is#AI_Magazine)),
 pm#url:medg.lcs.mit.edu/ftp/psz/k-rep.html];



Example of comparison of two ontology-related tools

For representing certain comparisons of objects, such as the comparison of the features of certain techniques or tools, it is useful to use tables as a presentation format. Such tables can be formal or semi-formal and can be used as input or output. Manually creating detailed tool comparison tables is often a presentation challenge and depends on a person's knowledge of which features are difficult or important and which are not. Furthermore, it would be too restricting to use predefined tables for easing the entering of tool features and then compare them. Hence, generating tables from the KB is needed. Then, modifying the tables should lead to a modification of the KB.

Fact Guru [11][Skuce and Lethbridge, 1995] is one of the rare KB servers that generate comparison tables. More precisely, it permits the comparison of two objects by generating a table with the object identifiers as column headers, the identifiers of all their attributes as row headers, and for each cell either a mark to signal that the attribute does not exist for this object or a description of the destination object. The common generalizations of the two objects are also given. However, Fact Guru's approach is not structured enough to be scalable: the list of features/relations from the compared objects is not structured and the cells are allowed to be informal descriptions of the destinations of the relations. A more scalable approach is to organize the features of the compared objects into a specialization hierarchy and to use the cells only for indicating whether each compared object has or has not (or will have and when) each feature. Below is an example of table generation query, followed by its result and then by the FL and FCG statements used for generating the result. In the cells, '+' means "yes" (the tool has the feature), '-' means "no", and '.' means that the information has not been represented. Each of the two entries within parentheses refers to a set of features that has not yet been named (i.e., no category has yet been entered to represent this particular set) but that is generated to permit the comparison of the tools. The prefixes for the relations are left implicit because this does not lead to any ambiguity, that is, WebKB-2 can find the correct relations.

compare pm#WebKB-2 km#Ontolingua on 
    (support of: a is#IR_task, output_language: a km#KR_notation,
     part: a is#user_interface), maxdepth 5

                                           WebKB-2  Ontolingua
support of:
is#IR_task                                    +         +
  is#lexical_search                           +         +
    is#regular_expression_based_search        +         .
  km#knowledge_retrieval_task                 +         .
    km#specialization_structural_retrieval    +         .
      (kind: {km#complete_inferencing,
              km#consistent_inferencing},
       input: (a km#query,
               expressivity: km#PCEF_logic),
       object: (several statement,
                expressivity: km#PCEF_logic)) +         .
    km#generalization_structural_retrieval    +         .

output_language:
km#KR_notation                                +         +
  (expressivity: km#FOL)                      +         +
    km#FCG                                    +         .
    km#KIF                                    .         +
  km#XML-based_notation                       +         .
    km#RDF                                    +         -

part:
is#user_interface                             +         +
  is#HTML_based_interface                     +         +
  is#CGI-accessible_command_interface         +         .
  is#OKBC_interface                           .         .
  is#API                                      +         .
  is#graph_visualization_interface            -         -


km#CG_related_tool  < km#language/structure_specific_tool,
 > km#CG-based_KBMS  km#CG_graphical_editor  km#NL_parser_with_CG_output;

   km#CG-based_KBMS < km#KBMS,
    > {km#CGWorld  km#PROLOG\+CG  km#CoGITaNT  km#Notio  km#KSx};

      km#KSx  > {km#KSx1  km#KSx2},  url: http://www.ksx.org;

km#input_language (*x,*y) = [*x, may be support of: (a km#parsing,
                                       input: (a statement, formalism: *y))];
[any ph#KSx2,
  part: (a is#user_interface, part: {a is#API, a is#HTML_based_interface, 
                                     a is#CGI-accessible_command_interface,
                                     no is#graph_visualization_interface}),
  part: {a is#FastDB, a km#default_MSO_of_KSx2},
  input_language: a km#NxCG,   output_language: {a km#NxCG, a km#RDF},
  support of: a is#regular_expression_based_search,
  support of: a km#specialization_structural_retrieval,
  support of: a km#generalization_structural_retrieval,
  support of: (a km#specialization_structural_retrieval,
                  kind: {km#complete_inferencing, km#consistent_inferencing},
                  input: (a km#query, expressivity: km#PCEF_logic),
                  object: (several km#statement, expressivity: km#PCEF_logic)
              )];          //"PCEF": positive conjunctive existential formula

[any km#Ontolingua, 
  part: {a is#HTML_based_interface, no is#graph_visualization_interface},
  input_language: a km#KIF,  output_language: a km#KIF,
  part: {a km#ontolingua_library, no DBMS}, support of: a is#lexical_search];

In the general case, the above approach, where the descriptions are put in the rows and organized in a hierarchy, is likely to be more readable, more scalable and easier to specify via a command than Fact Guru's approach, where the descriptions are put in the cells. However, for simple cases, putting descriptions into cells may be envisaged as a shortcut, for example to display {FCG, KIF} instead of '+' for the output_language relation.
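The row-oriented layout discussed above can be illustrated with a minimal Python sketch. It is not WebKB-2's implementation: the dictionaries `SUBTYPES` and `ASSERTED`, the functions `has_feature` and `compare`, and the reading of `max_depth` as a limit counted from the root row are assumptions made for illustration, and the data is only a hand-copied subset of the table and statements above. The sketch assumes that a tool having a feature also has every generalization of that feature, and that an explicit "no" on a feature extends to its specializations.

```python
from typing import Dict, List

# Illustrative subset of the specialization hierarchy (feature -> direct subtypes).
SUBTYPES: Dict[str, List[str]] = {
    "is#IR_task": ["is#lexical_search", "km#knowledge_retrieval_task"],
    "is#lexical_search": ["is#regular_expression_based_search"],
    "km#knowledge_retrieval_task": [
        "km#specialization_structural_retrieval",
        "km#generalization_structural_retrieval",
    ],
    "is#user_interface": [
        "is#HTML_based_interface",
        "is#graph_visualization_interface",
    ],
}

# Explicit assertions per tool: True = "a <feature>", False = "no <feature>".
ASSERTED: Dict[str, Dict[str, bool]] = {
    "WebKB-2": {
        "is#regular_expression_based_search": True,
        "km#specialization_structural_retrieval": True,
        "km#generalization_structural_retrieval": True,
        "is#HTML_based_interface": True,
        "is#graph_visualization_interface": False,
    },
    "Ontolingua": {
        "is#lexical_search": True,
        "is#HTML_based_interface": True,
        "is#graph_visualization_interface": False,
    },
}


def has_feature(tool: str, feature: str) -> bool:
    """True if the tool is asserted to have the feature or one of its specializations."""
    if ASSERTED[tool].get(feature) is True:
        return True
    return any(has_feature(tool, sub) for sub in SUBTYPES.get(feature, []))


def compare(root: str, tools: List[str], max_depth: int = 5) -> List[str]:
    """Return one text row per feature under `root`, indented by depth."""
    rows: List[str] = []

    def visit(feature: str, depth: int, negated: Dict[str, bool]) -> None:
        if depth > max_depth:
            return
        marks: List[str] = []
        now_negated: Dict[str, bool] = {}
        for tool in tools:
            # An explicit "no" is inherited by all specializations.
            is_negated = negated[tool] or ASSERTED[tool].get(feature) is False
            now_negated[tool] = is_negated
            if is_negated:
                marks.append("-")
            elif has_feature(tool, feature):
                marks.append("+")
            else:
                marks.append(".")  # nothing represented either way
        label = "  " * depth + feature
        rows.append(f"{label:<44}" + "".join(f"{mark:^10}" for mark in marks))
        for sub in SUBTYPES.get(feature, []):
            visit(sub, depth + 1, now_negated)

    visit(root, 0, {tool: False for tool in tools})
    return rows


if __name__ == "__main__":
    for root in ("is#IR_task", "is#user_interface"):
        for row in compare(root, ["WebKB-2", "Ontolingua"]):
            print(row)
```

Because each cell holds only one of three marks, adding a tool adds a column and adding a feature adds a row at its place in the hierarchy, without any free-text cell to reconcile.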

In addition to generalization relations, "part" relations could also be used, at least the ">part" relation. For example, assume that a third entry in the above table is a tool that has a complete and consistent structure-based and rule-based mechanism to retrieve the specializations of a simple CG in a base of simple CGs and rules using simple CGs. Then, we would expect the entry ending with km#PCEF_logic to be specialized by an entry ending with km#PCEF_and_rules_logic.



Conclusion

In his description of a "Digital Aristotle", [4][Hillis, 2004] describes a "Knowledge Web" in which researchers could add ideas or explanations of ideas "at the right place" (that is, without introducing redundancies), and suggests that this Knowledge Web should "include the mechanisms for credit assignment, usage tracking, and annotation that the Web lacks", thus supporting a much better re-use and evaluation of the work of a researcher than the system of article publishing and reviewing does. [4][Hillis, 2004] did not give any indication about such mechanisms, but WebKB-2's approach seems to provide a template for them. However, in addition to the guidance provided by the large general ontology, checking mechanisms, editing protocols, notations and knowledge entry forms, our experiments showed that an initial domain-specific ontology is also required to guide and normalize the cooperative construction of a knowledge repository in a domain such as KE.

This article illustrated the principles of our modelling and what this entails for an ontology of KE. Directly representing sentences from documents would not lead to an organised KB: categorising the underlying objects and their relationships is necessary. The approach of dividing each input file into sections corresponding to one major conceptual category eases the search, cross-checking and systematic input of knowledge. This is a scalable scheme: whenever a section grows too big it can be further divided according to subcategories.

The demand for comparing the dozens of existing ontology editing tools cannot be satisfied with informal superficial surveys such as [3][Denny, 2004]. In [8][toolInformalComp] we categorized 7 CG-related tools according to 160 criteria organized by subtype relations and grouped into six sections and tables. This categorization is stored in a wiki [8][toolInformalComp]. We plan to extend it to 50 ontology tools and 250 features, and then formalize it. Besides supporting conceptual browsing, this will permit us to answer conceptual queries about these tools, generate tables to compare them, and ease knowledge entry, as detailed in the previous section. Once this work is done, we shall invite KE researchers to represent or index their research tools or ideas in WebKB-2.

Similarly, in our structured discussions [9][classif], we are gathering and representing ideas on hotly debated topics, from various sources such as Wikipedia and Wikireason [12][Wikireason]. When the content of these structured discussions is detailed and normalised enough to guide people into entering new ideas "at the right place" (that is, "in a scalable way" and hence, at least ideally, "without introducing redundancies"), and when the interface is easy enough to use for browsing and complementing these structured discussions, we shall add hyperlinks to them in pages of Wikipedia and Wikireason in order to invite their users to organise, compare and evaluate their ideas, without fear of their additions being deleted by other users. This is not possible in current wikis, hypertext or argumentation systems and knowledge servers (other than WebKB-2), due to the lack of meta-information on each object (category or statement) and of cooperation-supporting procedures exploiting such meta-information (source, source interpreter, semantic relations, votes on features such as originality and veracity, etc.).



References

[1] Benjamins V.R., Fensel D., Gomez-Perez A., Decker S., Erdmann M., Motta E. and Musen M. Knowledge Annotation Initiative of the Knowledge Acquisition Community: (KA)2. Proceedings of KAW98, Banff, Canada, April 1998.

[2] Clark P. Some Ongoing KBS/Ontology Projects and Groups. http://www.cs.utexas.edu/users/mfkb/related.html

[3] Denny M. Ontology Tools Survey, Revisited. http://www.xml.com/pub/a/2004/07/14/onto.html, July 14, 2004.

[4] Hillis W.D. "Aristotle" (The Knowledge Web). Edge Foundation, No 138, May 2004.

[5] Martin P. Knowledge representation in CGLF, CGIF, KIF, Frame-CG and Formalized-English. Proceedings of ICCS 2002, 10th International Conference on Conceptual Structures (Springer Verlag, LNAI 2393, pp. 77-91), Borovets, Bulgaria, July 15-19, 2002.

[6] Martin P. Knowledge Representation, Sharing and Retrieval on the Web. Chapter of a book titled "Web Intelligence", (Eds.: N. Zhong, J. Liu, Y. Yao; Springer-Verlag, pp. 263-297), Jan. 2003.

[7] Martin P. Correction and Extension of WordNet 1.7. Proceedings of ICCS 2003 (Springer Verlag, LNAI 2746, pp. 160-173), Dresden, Germany, July 2003.

[8] Martin P. CG tools. http://www.webkb.org/kb/it/fs/CG_tools.html

[9] Martin P. Semantic classification of some resources. http://www.webkb.org/kb/it/

[10] Martin P., Eboueya M., Blumenstein M. and Deer P. A Network of Semantically Structured Wikipedia to Bind Information. Proceedings of E-learn 2006, (pp. 1684-1702), AACE Conference on E-learning in Corporate, Government, Healthcare and Higher Education, Honolulu, Hawaii, October 13-17, 2006.

[11] Skuce D. and Lethbridge T.C. CODE4: A Unified System for Managing Conceptual Knowledge. Int. Journal of Human-Computer Studies (42), pp. 413-451, 1995.

[12] Retchless A. Wikireason: Meet, Debate, Decide. http://wikireason.net/wiki/Forum_Entrance

[13] Sowa J.F. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, MA, 1984.

[14] Stutt A. and Motta E. Semantic Learning Webs. Journal of Interactive Media in Education, Special Issue on the Educational Semantic Web, 10, 2004.

[15] Welty C.A. and Jenkins J. Formal Ontology for Subject. Journal of Knowledge and Data Engineering, 31(2), pp. 155-182, September 1999.