Subject: Semi-automatic and collaborative
knowledge-based indexing, structuring and sharing
of textual or multimedia information for learning or information retrieval
within a knowledge server or in a semantic grid.
Core idea. This PhD will explore how textual or multimedia information can and
should be structured to ease information sharing, retrieval and comparison for various
related purposes: learning, research and decision-making. Traditional solutions
are based on document indexing and hence are limited to document retrieval.
Knowledge-based solutions rely on machine-understandable representations
of parts of the semantic content of the information sources, from very lightweight
representations to very formal and organised ones, from very distributed and
independently created representations to those created within classic knowledge base
management systems (KBMSs), from automatically created representations for information
retrieval purposes only to those carefully crafted for problem-solving purposes.
This thesis will begin by comparing the characteristics, advantages and rationales of
the well-known software and techniques of these traditional and knowledge-based solutions.
To be precise and to genuinely permit the comparison of software and techniques,
this state of the art (concepts, statements, and semantic or argumentation relations
between them) will itself be structured in a semi-formal way and made usable
for information retrieval and learning via a knowledge server.
This exercise will provide a starting point and the first test material
for the main goals of this thesis: devising methods to enhance current approaches for
representing, sharing, retrieving and comparing information. To that end, various
research avenues will be explored:
1) the design and use of various more or less expressive (and hence more or less intuitive) textual or multimedia notations/interfaces for displaying, representing, indexing or querying textual or multimedia information,
2) the re-use and alignment of various existing ontologies (e.g., SUMO, DOLCE, OpenCYC),
3) the initialisation of a knowledge base (KB) about a subject via the automatic extraction of basic conceptual relations between concepts or statements from textual documents (Wikipedia, course materials, research articles; examples of basic relations: generalisationOf, subprocessOf, physicalPartOf, agentOf, purposeOf, annotationOf, argumentationOf and objectionOf),
4) protocols permitting people to cooperatively build/edit the same KB (and hence annotate or correct information that they think incorrect) while (i) avoiding lexical and semantic conflicts, (ii) not forcing them to agree with each other, and (iii) encouraging knowledge re-use and structuring,
5) the adaptation of these protocols to permit people to cooperate in a semantic grid, assuming that the existing KBs (e.g., the KBs of each user or the various KBs about the same domain) feed from each other,
6) knowledge valuation mechanisms allowing any user to attribute values to certain characteristics (e.g., originality and accuracy) of any statement, and then select or filter out information based on these values (according to statistical functions possibly defined by the user), the creators of these values, and the content of the information itself,
7) user valuation mechanisms allowing any user to attribute values to certain characteristics of other users (e.g., originality and accuracy in a certain domain) based on their votes and the information they entered.
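
To make avenue 6 more concrete, here is a minimal Python sketch of how statements might be selected according to user-assigned values. It is only an illustration under stated assumptions: the class and function names, the 0-to-1 value scale and the sample statements are hypothetical and are not taken from WebKB-2. The aggregation function (mean, median, ...) and the set of raters taken into account are both chosen by the user doing the filtering.

```python
from dataclasses import dataclass, field
from statistics import mean, median
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Valuation:
    """A value given by one user to one characteristic of a statement."""
    rater: str
    characteristic: str  # e.g. "accuracy" or "originality"
    value: float         # assumed scale: 0.0 (lowest) to 1.0 (highest)


@dataclass
class Statement:
    text: str
    creator: str
    valuations: List[Valuation] = field(default_factory=list)


def aggregate(stmt: Statement, characteristic: str,
              stat: Callable = mean,
              raters: Optional[Set[str]] = None) -> Optional[float]:
    """Combine the values of one characteristic with a user-chosen statistic.

    If `raters` is given, only values created by those users are counted.
    Returns None when no matching value exists.
    """
    values = [v.value for v in stmt.valuations
              if v.characteristic == characteristic
              and (raters is None or v.rater in raters)]
    return stat(values) if values else None


def select(statements: List[Statement], characteristic: str,
           threshold: float, stat: Callable = mean,
           raters: Optional[Set[str]] = None) -> List[Statement]:
    """Keep the statements whose aggregated value reaches the threshold."""
    kept = []
    for stmt in statements:
        score = aggregate(stmt, characteristic, stat, raters)
        if score is not None and score >= threshold:
            kept.append(stmt)
    return kept


# Hypothetical sample data.
s1 = Statement("A cat is a mammal", "alice",
               [Valuation("bob", "accuracy", 1.0),
                Valuation("carol", "accuracy", 0.8)])
s2 = Statement("A whale is a fish", "dave",
               [Valuation("bob", "accuracy", 0.1),
                Valuation("dave", "accuracy", 1.0)])
kb = [s1, s2]

# Keep statements with a mean accuracy of at least 0.6, over all raters.
accurate = select(kb, "accuracy", 0.6)
# Same filter, but with the median and counting only values created by "bob".
accurate_for_bob = select(kb, "accuracy", 0.6, stat=median, raters={"bob"})
```

In this sketch, restricting the raters removes the value that the creator of the second statement gave to their own statement. User valuation (avenue 7) could be layered on top, for instance by deriving the set of raters from the values attributed to the users themselves; that part is left out here.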
Dr. Martin has done some preliminary work on each of these points except the third, and has designed a knowledge server based on initial ideas for the first, second and fourth points (an implementation for the last two points is also in progress). This server, named WebKB-2, will therefore be used and extended during this thesis.
The details and references are accessible from
- http://www.webkb.org/doc/papers/kmo06/mike.html (in French)
- http://www.webkb.org/kb/it/ (each of the files)