Documentation of the "Category Search" tool

The general documentation is assumed to have been read. Hyperlinked terms point to their definitions in this documentation.

Selection options

In most cases, you will simply enter a word (name) in the first text field and submit, in order to see the categories representing the various meanings of this word (i.e. the categories that have this word as one of their names).
Note: names are case-sensitive. Use the normal English spellings.

If you are unsure of the name or spelling, you may use wildcards: '?' for one character, '*' for any number of characters. If you already know the creator of the categories you want to retrieve, you may specify it, as in "rdf#*" or "pm#*entity*".
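The wildcard rules above can be sketched in Python as follows. This is only an illustration of the matching semantics ('?' for one character, '*' for any number of characters, case-sensitive), not the tool's actual implementation. The category names in the list are hypothetical. One caveat of this sketch: the standard fnmatchcase function also gives '[' a special meaning, which the tool does not document.

```python
from fnmatch import fnmatchcase


def match_names(pattern, names):
    """Return the names matching a wildcard pattern.

    '?' stands for exactly one character and '*' for any number of
    characters. Matching is case-sensitive, as in the search form.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical category names, used only for illustration.
names = ["pm#entity", "pm#spatial_entity", "rdf#Property", "wn#cat"]

print(match_names("rdf#*", names))        # categories created by "rdf"
print(match_names("pm#*entity*", names))  # "pm" categories containing "entity"
print(match_names("wn#c?t", names))       # one unknown character
print(match_names("RDF#*", names))        # no match: case matters
```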

You may be interested in all categories that have a link of a certain kind. In that case, leave the first text field empty, select the appropriate link and optionally specify a destination.

These selection options may be combined. If you already know the identifier of the category you are interested in, you do not need these selection options: just enter the identifier in the first text field, adjust the display options and submit. Programs are more likely to use identifiers. To help developers, the text field next to the submit button shows how the parameters should be encoded if the HTTP GET method is used. It therefore also shows that any category in the knowledge base may be referred to via a URL. Some associated knowledge may also be referred to in this way.
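For developers, here is a minimal Python sketch of how such a GET request URL might be built. The base URL and the parameter names ("name", "link", "destination") are assumptions made for illustration. The actual encoding is the one displayed in the text field next to the submit button. What the sketch does show reliably is that the '#' in an identifier must be percent-encoded, since an unencoded '#' starts the fragment part of a URL and would not be sent to the server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint: replace it with the URL shown by the tool.
BASE_URL = "http://example.org/categorySearch"


def search_url(name=None, link=None, destination=None):
    """Build a GET URL for a search (parameter names are hypothetical)."""
    params = {}
    if name:
        params["name"] = name
    if link:
        params["link"] = link
    if destination:
        params["destination"] = destination
    # urlencode percent-encodes reserved characters such as '#'.
    return BASE_URL + "?" + urlencode(params)


def decode(url):
    """Recover the parameters from a URL, to check the round trip."""
    return parse_qs(urlsplit(url).query)


by_name = search_url(name="pm#*entity*")
by_link = search_url(link=">", destination="pm#entity")
print(by_name)
print(decode(by_name))
print(decode(by_link))
```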

Display options

You have the choice between seeing only the categories directly linked to the specified categories, or also all the categories indirectly connected to them via a link of a certain kind. In the second case, links that have already been printed are not displayed again, except in the "minimal, user-friendly" format when an exploration depth has been given. In both cases, some statements that use the retrieved categories may also be displayed. This happens when the category is an individual (e.g. "Paris", "Venus") or when the statements use the category with a universal quantifier (e.g. "any", "most" and "75%").

If you put some restrictions on the creators by filling the next text field, the specified categories that do not match the restrictions are not displayed (nor are the categories connected to them). During the presentation of indirectly connected categories, those that do not match the restrictions are skipped, but the level of indentation is nevertheless increased to show where and how many categories have not been displayed. Categories directly connected to the specified categories that match the restrictions are, however, always presented, so that the user is not misled into thinking these links do not exist.
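The display rule above can be sketched as a small Python traversal. This reflects one reading of the rule, under several assumptions: the link table and category names are hypothetical, the creator is taken to be the part of an identifier before '#', the links are assumed to form a tree (no cycles), and the restriction is reduced to a plain set of allowed creators.

```python
def creator(category):
    """Return the creator part of an identifier such as 'pm#entity'."""
    return category.split("#", 1)[0]


def render(specified, links, allowed_creators):
    """Return indented display lines for the specified categories."""
    lines = []

    def visit(category, depth):
        # Directly connected categories (depth 1) are always shown;
        # deeper ones are shown only if their creator is allowed.
        if depth <= 1 or creator(category) in allowed_creators:
            lines.append("  " * depth + category)
        # The depth grows even when a category is skipped, so the extra
        # indentation reveals how many categories were not displayed.
        for child in links.get(category, []):
            visit(child, depth + 1)

    for category in specified:
        # A specified category that does not match is dropped together
        # with everything connected to it.
        if creator(category) in allowed_creators:
            visit(category, 0)
    return lines


# Hypothetical links, used only for illustration.
links = {
    "pm#a": ["fm#b", "rdf#c"],
    "fm#b": ["fm#d"],
    "fm#d": ["pm#e"],
}
result = render(["pm#a", "fm#x"], links, {"pm", "rdf"})
print("\n".join(result))
```

In this example "fm#d" is skipped, so "pm#e" appears three levels deep, directly under "fm#b", which is one level deep.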

This filtering mechanism is intended to let users focus on the work of particular users when they need to. An example is given (click on "Example" to test it):
  Only if created by: rdf, M pm#KVO_group
  and not created by: fm, ^ #Aussie

This example means that all (and only) the categories created by the user rdf or by users who are members of pm#KVO_group should be shown, except the categories created by the user fm (user id of "Francois Modave") and by users who are Australian (instances of #Australian). In practice, this mainly restricts the displayed categories to those created by "rdf" and "pm" (user id of "Philippe Martin").
Queries using such filters may display few categories yet still take a few seconds to run if the whole ontology (currently 105,000 links) is explored and the server machine is busy, so please do not abort these queries.
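The logic of the example filter can be sketched in Python as a predicate on the creator of a category. The group membership and type data below are hypothetical, as is the user "oz". Only the user ids and category names that appear in the example come from this documentation, and the tool itself may evaluate the filter differently.

```python
# Hypothetical data: members of the group and types of each user.
GROUP_MEMBERS = {"pm#KVO_group": {"pm", "fm", "oz"}}
USER_TYPES = {"oz": {"#Aussie"}}


def is_shown(user):
    """Apply the example filter to the creator of a category.

    Only if created by:  rdf, M pm#KVO_group
    and not created by:  fm, ^ #Aussie
    """
    included = user == "rdf" or user in GROUP_MEMBERS["pm#KVO_group"]
    excluded = user == "fm" or "#Aussie" in USER_TYPES.get(user, set())
    return included and not excluded


for user in ["rdf", "pm", "fm", "oz", "xy"]:
    print(user, is_shown(user))
```

With this data only "rdf" and "pm" pass the filter: "fm" is excluded by name, "oz" is excluded as an instance of the excluded type, and "xy" was never included.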

Several notations are proposed for displaying the results. If you are a human being, you will probably prefer the default format. More information is provided with the second format, which uses the FS notation. The third format, which uses RDF/XML, is quite unreadable for people. In this format, statements are not provided because translating expressive statements into a language as poor, ill-defined and difficult to read as RDF/XML is a complex task. For instance, in RDF/XML, there is no standard way to represent universal quantification, various kinds of contexts and various kinds of sets. We have proposed conventions for the representation of expressive statements but, unless these conventions are widely adopted or alternative conventions emerge, the RDF/XML statements that we could generate would be ad hoc and therefore probably not comparable with statements from other knowledge providers.

Finally, the categories may be presented hyperlinked or not. Clicking on a hyperlinked category is like asking for its direct links and all its supertypes. To permit the hypertextual exploration of the ontology along any kind of link (not just the supertype link), some links are hyperlinked too. For example, clicking on a hyperlinked '>' will show all direct and indirect subtypes of the category to which this link applies.

Comparison with similar tools

The WordNet Web site offers online access to the WordNet database and a WordNet browser to install on a local machine. Both tools offer few search options (e.g. no search via identifiers, wildcarded names or attached links) and their categories are not hyperlinked. These tools are for human consumption only (no formal notation, no category identifier provided or presented). They do not permit users to make updates (and are not complemented by other tools permitting that).

Dan Brickley also implemented a server to provide the supertypes of a given WordNet category in RDF format. However, this server does not distinguish between identifiers and names. The provided information is therefore incorrect (in Dan Brickley's words: "the current demo conflates 'word senses' with the words associated with those senses"). No form-based interface, search option, or format other than RDF is provided.

Philippe A. MARTIN
Last modified: Wed Sep 27 00:08:12 PDT 2000