Message 8365 of the SUO list

Subject: Re: CG: Architectures for Intelligent Systems
Date: Tue, 16 Apr 2002 15:49:29 +0200
From: Jean-Luc Delatre <jld@club-internet.fr>
Follow-Up: msg08367 by Philippe Martin


Philippe,

Quoting you from: http://mars.virtual-earth.de/pipermail/cg/2002q2/004284.html

> > The solution that, to me, seems feasible is to look for a match *only* on
> > the "leaves" ...
> 
> I do not see why you want to reduce your chances of finding some matches.

Actually, it is to INCREASE the chances of finding some matches.

The way I expect this to pay off is this:

It will always be problematic to reach an agreement between a large number
of users about which characteristics of a concept are significant and
suffice to characterize the concept uniquely.

I expect it to be much less problematic to reach an agreement on very simple
concepts like 'color', 'left', 'right' etc...
That agreement would cover JUST THE FACT that the name of the color concept
is 'color', NOT what the concept means within any given ontology, which
would again be problematic.
Then these "basic" concepts (and attributes/properties of the same simplicity)
will have to be part of a common "core" ontology, possibly split between
application domains for technical terms and with NO specific meanings attached.

Each ontology builder will ultimately define all of its concepts by some "tree" 
of concepts linked by attributes/properties and the matching of concepts from
different ontologies will proceed bottom up from the "leaf" concepts,
for which there exists an agreement at the lexical level.

Given that a *single* truly specific attribute ('pars pro toto') is enough
to recognize a concept, the "recognition" capability will "percolate" up the 
tree even when the properties attached to a concept are different between 
two ontologies (provided that a specific attribute is agreed upon; see about
"REAL ontology builders" in http://mars.virtual-earth.de/pipermail/cg/2002q2/004265.html).
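
The bottom-up matching described above could be sketched roughly as follows.
This is only an illustration under stated assumptions, not an implementation
from this thread: the Concept class, the CORE set of agreed names, the
"specific" flag on attributes and the matches() function are all hypothetical
names introduced for the example.

```python
from dataclasses import dataclass, field

# Hypothetical shared "core": names agreed upon at the lexical level only,
# with no specific meaning attached.
CORE = {"engine", "color", "amount"}

@dataclass
class Concept:
    name: str  # local name, free to differ between ontologies
    # attribute name -> (child concept, True if the attribute is "specific")
    attrs: dict = field(default_factory=dict)

def matches(a, b, core):
    """Bottom-up match: leaves match on agreed names, inner nodes match
    as soon as ONE shared specific attribute leads to matching children."""
    if not a.attrs and not b.attrs:
        # Leaf concepts: only the lexical agreement on the name counts.
        return a.name == b.name and a.name in core
    for attr, (child_a, specific_a) in a.attrs.items():
        if attr in b.attrs:
            child_b, specific_b = b.attrs[attr]
            # A single specific attribute is enough ('pars pro toto').
            if specific_a and specific_b and matches(child_a, child_b, core):
                return True
    return False

# Two views of a car: inventory asset versus factory product.
car_inventory = Concept("asset_car", {
    "powered_by": (Concept("engine"), True),
    "book_value": (Concept("amount"), False),
})
car_factory = Concept("product_car", {
    "powered_by": (Concept("engine"), True),
    "painted_in": (Concept("color"), False),
})
boat = Concept("boat", {"powered_by": (Concept("sail"), True)})

# Recognition "percolates" one level up through another specific attribute.
fleet_a = Concept("fleet", {"contains": (car_inventory, True)})
fleet_b = Concept("vehicle_pool", {"contains": (car_factory, True)})

print(matches(car_inventory, car_factory, CORE))
print(matches(car_inventory, boat, CORE))
print(matches(fleet_a, fleet_b, CORE))
```

In this sketch the local names of the two car concepts and their other
properties differ, yet the match succeeds because the one specific attribute
both sides agree on ends in a core leaf. Who decides that an attribute is
"specific" is left open here, as it is in the discussion.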

The important point is to recognize that a 'car' is a 'car' whether it is
viewed as an inventory asset, a factory product, a sociological status 
symbol etc...

> > I also acknowledge that the "most prominent" attributes or qualities of
> > a same and given concept appearing in ontologies with an interest in
> > totally different fields may have *nothing* in common.
> > This is just an extension of the case above, one or both of the ontology
> > owners will have to add some *commonly recognised* attribute to his
> > concept definition in order to communicate.
> 
> Does this mean that each author of an ontology (and only him/her) will have
> to do this difficult manual work (hand-crafting) for all the unmatched concepts
> in all the ontologies s/he wants to connect to?
> If so, this is a business to business model. Enormous waste of time,
> precision, completeness and re-use+retrieval possibilities for everyone.
> Compare that with users inserting their categories (with or without definitions)
> into a shared ontology (with some automated checking and guidance). It is not
> more difficult and it is optimal: each user benefits from all the links set by
> other users without having to re-create them.

May I remind you that even in France, where the "Academie Francaise" is supposed
to care for such definitions, the *real* language is grown by the community of
its users, and in spite of a few misunderstandings here and there this community
creates and defines new words and their meanings quite well.
I propose to devise a mechanism that will allow the "computers" (actually the
ontology maintainers) to interact in similar ways for the same purpose.

It seems to me that *you* are proposing something quite close with your
"Cooperatively-built large heterogeneous KBs". We certainly have different
visions about the means and requirements for such a purpose, but the
ultimate goal is about the same: having a large *consistent* knowledge base.

I see mine as more "distributed" and more flexible with respect to
point-to-point interactions, but two main characteristics seem identical:

- having a minimal "core" of primitives.
- having the content built by (consistency checked) consensual interaction.

> > > Manual ontology merging is not far better: although humans have the advantage of
> > > having a huge background knowledge and the ability to understand informal
> > > descriptions, they too make dubious connections or are not able to make
> > > connections if they do not have enough information to exploit.
> >
> > THANK YOU VERY MUCH for this remark!
> >
> > I have been bitterly arguing with naysayers on the SUO list about feasibility
> > of my approach, but it is NO MORE INFEASIBLE than a hand-crafted approach.
> 
> Except that humans, especially the authors of the ontologies, have more information
> about the meaning of the(ir) categories. They can exchange information (this is the
> inefficient and short-term way) or, when they see that their categories have been
> mis-used (and hence mis-interpreted), they can specialize the definitions of
> their categories for these mis-uses to be disallowed automatically. (Hence, there
> may be the need for manually or automatically "cloning" the mis-used categories
> when/before they are refined; see my ICCS'01 paper for details:
> http://webkb.org/doc/papers/iccs01/).
> 
> > It will just go at COMPUTER SPEED instead of COMMITTEE SPEED!
> 
> Computers won't do anything since they do not have enough information.
> Committees may do something, but are indeed inefficient.

When I said "computer" I meant computers interacting under the supervision
of their ontology maintainers. The speed comes from the fact that there will
be immediate feedback from any remote ontology when queried about a new
"proposed" concept.

> The only solution I can see is to permit the users to cooperatively build
> a shared ontology/KB, consistent and optimally connected (at least as far as
> automatic procedures can detect) and still permit users to disagree with
> each other, and above all, permit them not to wait for other users to change
> their definitions. That sounds impossible, isn't it? Yet, this is what WebKB-2
> permits with a few simple tricks. Details in http://webkb.org/doc/papers/iccs01/
> or in http://www.webkb.org/doc/papers/wi02/
> or on the WebKB-2 site (www.webkb.org/webkbShared.html).

I think we agree on this kind of approach, if not (maybe) on the means.

> Regarding your approach, I think Matthew West asked all the right questions.
> I do not think you gave an "OVERLY detailed point". 

Well, no, I still maintain that this was "OVERLY detailed" with respect
to the CURRENT STAGE OF THE DISCUSSION.

This has to do with what I call "the right timing to answer questions
along a design process": trying to detail a specific point too early
will most likely lead to useless considerations.

> More of such points (and more detailed, with more examples) are what I guess 
> everyone hopes for.
> More importantly, researchers and end-users won't adopt any solution (yours,
> mine, ...) unless they have a ready-to-use efficient tool for it. I can only
> hope that you will design a tool and that it will be successful.

Good point. But not only must there be tools, there must be *users* too,
having an *actual* problem to solve and *willing* to invest their time
in a possible solution.

This is very close to the chicken-and-egg problem, because they need
to *trust* the proposed solution beforehand and, in order to devise
a "trustworthy" solution, you must have user feedback; otherwise it
is a "shot in the dark".

See:  http://slashdot.org/askslashdot/01/03/21/0739222.shtml#349489
Also: http://www.joelonsoftware.com/articles/fog0000000017.html
      http://www.joelonsoftware.com/articles/fog0000000054.html

Cheers.

-- Jean-Luc Delatre