Use of Semantic Networks as Learning Material
and Evaluation of the Approach by Students

Dr Philippe Martin

Abstract. This article first summarizes reasons why current approaches supporting Open Learning and Distance Education need to be complemented by tools permitting lecturers, researchers and students to cooperatively organize the semantic content of Learning related materials (courses, discussions, etc.) into a fine-grained shared semantic network. This first part of the article also quickly describes the approach adopted to permit such a collaborative work. Then, examples of such semantic networks are presented. Finally, an evaluation of the approach by students is provided and analyzed.

Keywords. knowledge sharing, knowledge evaluation, e-learning



I.  Introduction

Most Semantic Learning projects [1]-[2] and most Learning Object related standards or practices [3]-[4]-[5]-[6] rely on a rather coarse-grained indexation of informal data (natural language sentences, images, etc.) by simple meta-data. Conceptual categories or even mere keywords are manually or automatically associated with relatively big chunks of informal data (typically a whole document, and almost always more than one sentence). In fine-grained approaches, the data (e.g., learning materials or very detailed user models) are represented and organized into a formal or semi-formal semantic network without redundancies. To do so, the content of the source materials is decomposed into a set of data units which ideally are all irreducible units (i.e., conceptual categories or formal/informal stand-alone statements) and inter-related by semantic relations (e.g., relations of specialization, argumentation, instrumentation, correction, authorship, spatial/temporal location and modality). Some of these networks are fully formal, very difficult to create, and difficult to read or search directly. This is typically the case for tutor systems, for example Halo [7], which can solve some chemistry test questions automatically. Most other semantic networks, especially those of projects using Concept Maps [8,9] or Topic Maps [10,11], are mostly informal and difficult to re-use for information retrieval and comparison.

In [12] I showed that none of the current main knowledge sharing/retrieval approaches permit lecturers, researchers and students to collaboratively build a "fine-grained semi-formal normalized source/creator-keeping readable semantic network" for representing the content of research/teaching fields. I showed that such a network is a needed complement to traditional approaches since it is required by the following tasks which are crucial for education and research:

Section II summarizes the reasons why current approaches do not permit the creation of such a network, and introduces the approach I designed to permit the collaborative creation of such a network. In Open Learning and Distance Education, providing the students with unambiguous, well organized and not overly restricted learning materials is particularly important, if only because obtaining or providing feedback is more difficult and takes much more time than in face-to-face courses. Even a course-related newsgroup (where problems or information related to a particular course are discussed) would be advantageously replaced by a cooperatively-built semantic network. Examples of rationales for this claim are: (i) a separate document (or the course itself) would not have to be updated by the lecturer for answering requests-for-clarifications from the students or for keeping an up-to-date easy-to-search repository of important information with associated rationales (announcements, problem resolutions, etc.), (ii) recurring discussions - or repetitions of previously stated information - would be avoided, and (iii) since the statements would have to be much more precise and argued, in case of conflicts the resulting "structured discussion" would permit people (the involved parties or external parties) to better compare and evaluate the respective positions than when they are scattered in the messages of a newsgroup.

Although all this is true from a theoretical perspective and thereby sheds light on several topics of this conference (e.g., "E-learning and pedagogical challenges", "Impact of e-learning on social change", "Future trends", "Methods of combining traditional learning and e-learning" and "Web-based learning"), implementing this approach raises some problems. The techniques I created to solve these problems are not yet sufficient for genuine scalability, especially from a social viewpoint (usability and adoption by a great number of lecturers and researchers). This is summarized in Section II. Furthermore, not all these techniques are fully implemented in my knowledge base (KB) server WebKB-2 [13]-[14] (a "KB server" permits Web users to update one or several shared knowledge bases). The point of this article is not to discuss these techniques and their theoretical advantages (I did this in [12] and [13]; their summaries in this article are original albeit concentrating many ideas in few sentences; hence, a careful reading is encouraged; an easier-to-read but extended version can be made for a journal article). Instead, the point is to present an experiment in using these techniques as a complement to traditional learning materials for distance and face-to-face teaching purposes. Section III presents extracts of semantic networks that I created for three courses and that were complemented by students. Section IV gives an evaluation of this experiment by some students, which shows that much work remains to be done for the approach to be viable but that there are grounds for optimism.



II.  Background

A.  Insufficiency of Other Approaches

Approaches based on the indexation of resources are not scalable. My first argument about this point in [12] is: "The more statements a resource contains, and the more resources there are, the more these resources contain similar and/or complementary pieces of information, and hence the less the meta-data for each resource can be useful. Indeed, queries on the meta-data return lists of resources that are partially redundant or complementary with each other and that need to be manually searched, compared or aggregated by each user". Approaches based on fully formal resources are also insufficient since semi-formal or informal objects (conceptual categories or statements) are unavoidable at certain levels of generality for organizing and presenting a KB, and for supporting an incremental refinement of its content. In the general case, approaches based on mostly informal resources do not make it possible to manually represent, or automatically extract and normalize, the meaning of informally described objects and relationships, and hence do not allow them to be exploited for information retrieval/comparison purposes. Similarly, ontology matching techniques [15] are intrinsically limited by the lack of information contained in the source KBs (even knowledge engineers often cannot second-guess the knowledge providers and establish precise semantic relations between objects from different KBs). Hence, usable approaches based on formal resources that were mostly independently created are not scalable either.

The insufficiency of these approaches explains why many projects nowadays try to allow the cooperative creation of large KBs (like Wikipedia but much more formal and organized), e.g., Ontowiki (ontowiki.net) and Freebase (freebase.org). However, they do not yet provide protocols that genuinely support cooperation between people: only two KB servers - Co4 [16] (not available anymore) and WebKB-2 (usable at http://www.webkb.org/) - seem to have knowledge editing/voting/evaluation protocols that support loss-less knowledge integration (i.e., an integration that does not force the users or managers of a KB to choose between inconsistent statements and hence lose knowledge; such choices can often only be made in the context of particular applications). Knowledge integration methodologies (e.g., Diligent, Dogma, HCome, Methontology) or knowledge integration servers on the Web (e.g., Knowledge Zone) or in peer-to-peer networks (e.g., SomeWhere, CoAKTinG) impose choices during knowledge integration and hence are oriented towards the creation of applications rather than towards the cooperative creation of knowledge repositories.


B.  Quick Overview of the Adopted Approach

In WebKB-2, every object (word, conceptual category, formal/informal statement, relation between objects) has one or several associated origins or believers (which are recorded objects too and hence can be used in statements and queries): 1) the user who created the object, 2) the source (e.g., a person, a language, a document) where the user read and hence interpreted the object (word or statement), and 3) other users who also believe in that object (statement). Lexical conflicts are avoided by prefixing category identifiers with the identifier of their creators (e.g., wn#bird refers to the most common concept proposed by WordNet for the word "bird"). For each new KB, WebKB-2 proposes a large general default ontology which is a loss-less integration of (i) many top-level ontologies and (ii) my conversion of WordNet into a "genuine lexical ontology with intuitive identifiers" [17]. WebKB-2 also proposes various complementary notations (FCG, Formalized-English and FL) [18] that I derived from the Conceptual Graph Linear Form (CGLF) to further improve on what made its success: its readability, expressiveness and normalizing aspect (i.e., the fact that this notation helps people to represent statements in ways that ease the automatic finding of logical relations between these statements).
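The origin-keeping scheme described above can be illustrated by a minimal Python sketch. It is not WebKB-2's actual data model: the class `KBObject`, its field names, the helper `split_identifier` and the category `pm#bird` are hypothetical and introduced only for illustration. Only the `creator#name` identifier convention and the three kinds of origins come from the text.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class KBObject:
    """Hypothetical record for one KB object; not WebKB-2's actual data model."""
    creator: str                   # 1) identifier of the user who created the object
    name: str                      # local name chosen by that creator
    source: Optional[str] = None   # 2) where the creator read the object, if anywhere
    believers: Set[str] = field(default_factory=set)  # 3) other users who believe in it

    @property
    def identifier(self) -> str:
        # Prefixing the name with the creator's identifier avoids lexical conflicts.
        return f"{self.creator}#{self.name}"


def split_identifier(identifier: str) -> Tuple[str, str]:
    """Split 'creator#name' into its two parts."""
    creator, _, name = identifier.partition("#")
    return creator, name


wn_bird = KBObject(creator="wn", name="bird", source="WordNet")
pm_bird = KBObject(creator="pm", name="bird")  # hypothetical second category for the same word
wn_bird.believers.add("pm")                    # user "pm" also believes in wn#bird
```

With this representation, two users can each define a category named "bird" without clashing, since the resulting identifiers (`wn#bird` and `pm#bird`) differ.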

For redundancies or inconsistencies to be made explicit (or, from a logic-oriented viewpoint, for removing semantic conflicts) and for keeping a minimal organization in the KB, before being added to the shared KB, each new object must be connected to at least one already existing object by a "corrective" relation (to state that the new object corrects an already existing object) or a "generalization" relation. (Between simple statements, a generalization relation corresponds to logical implication. However, I also defined extended generalization relations not only to take into account constructs such as numerical quantifiers, sets or contexts but also to relate formal objects with informal objects, e.g., conceptual categories with words). Using graph-matching techniques, WebKB-2 can detect many partial/complete redundancies and inconsistencies between a new statement and those already existing in the KB, and thus can ask the author of the new statement to refine it or add a corrective/generalization relation. For example, assuming that John has already entered in a semi-formal way that "all birds fly" and that Joe wants to enter that "most healthy French birds are able to fly", here is the semi-formal statement that Joe has to write if he uses the Formalized-English (FE) [18] notation currently usable in WebKB-2: `any bird is agent of a flight'(John) has for corrective_restriction `most healthy French birds are able to be agent of a flight'(Joe). In other words: Joe states that his belief is a correction and restriction of John’s belief. WebKB-2 also proposes a system to evaluate contributions and contributors based on votes and the way statements have been argued for or against [14]. In the future, this evaluation system will be adaptable by each user, FE will be made more readable, and heuristics will be used for discovering semantic conflicts between statements even when they are informal.
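The connection rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not WebKB-2's protocol: the classes `Statement` and `SharedKB`, the method `add` and the set `LINKING_RELATIONS` are hypothetical, statements are compared as plain strings, and no graph matching is performed. The sketch only shows that an addition is refused unless it is linked to an existing statement by a corrective or generalization relation.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Assumed relation names; only "corrective_restriction" appears in the text above.
LINKING_RELATIONS = {"generalization", "correction", "corrective_restriction"}


@dataclass(frozen=True)
class Statement:
    text: str
    author: str


class SharedKB:
    """Toy shared KB that refuses statements not linked to an existing one."""

    def __init__(self, initial: Iterable[Statement]):
        self.statements: Dict[str, Statement] = {s.text: s for s in initial}
        # Each link reads: existing_text "has for" relation new_text.
        self.links: List[Tuple[str, str, str]] = []

    def add(self, new: Statement, relation: str, existing_text: str) -> None:
        if relation not in LINKING_RELATIONS:
            raise ValueError(f"unsupported linking relation: {relation}")
        if existing_text not in self.statements:
            raise ValueError("the new statement must be linked to an existing one")
        # Nothing is overwritten: both beliefs are kept, with their authors.
        self.statements[new.text] = new
        self.links.append((existing_text, relation, new.text))


kb = SharedKB([Statement("any bird is agent of a flight", "John")])
kb.add(
    Statement("most healthy French birds are able to be agent of a flight", "Joe"),
    "corrective_restriction",
    "any bird is agent of a flight",
)
```

After the call to `add`, John's statement and Joe's correction coexist in the KB, and the link between them records that Joe's belief is a corrective restriction of John's.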

This approach, along with the related induced use of precise formal categories (e.g., pm#Paris_in_1951 which specializes pm#Paris_between_1950_and_1960, which itself specializes pm#Paris), makes it possible to put every imaginable belief into the same organized KB and to avoid the problems related to integrations that are not loss-less (see Section II-A) and hence problems related to version control or truth-maintenance. When choices between conflicting beliefs have to be made, which is typically the case for applications but rarely for information retrieval, a selection of the knowledge to use can be made using queries and according to the characteristics of each application. For example, one application designer may extract a consistent subset of a KB by selecting the most specialized and voted-for formal statements, and/or those that satisfy certain expressiveness constraints.
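The kind of application-specific selection mentioned above can be sketched as a simple filter. This is a hypothetical Python illustration, not a WebKB-2 query: the class `Stmt`, its fields (`votes`, `formal`, `refines`), the function `select_for_application` and the vote counts are all assumptions, and the filter does not check logical consistency. It merely keeps the formal statements that have enough votes and that are not refined by another retained candidate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stmt:
    text: str
    votes: int                     # assumed net number of votes for the statement
    formal: bool                   # True if the statement is formally represented
    refines: Optional[str] = None  # text of the statement it specializes or corrects


def select_for_application(stmts: List[Stmt], min_votes: int = 1) -> List[Stmt]:
    """Keep formal, sufficiently voted-for statements that no other candidate refines."""
    candidates = [s for s in stmts if s.formal and s.votes >= min_votes]
    refined = {s.refines for s in candidates if s.refines is not None}
    return [s for s in candidates if s.text not in refined]


kb = [
    Stmt("any bird is agent of a flight", votes=1, formal=True),
    Stmt("most healthy French birds are able to be agent of a flight",
         votes=3, formal=True, refines="any bird is agent of a flight"),
    Stmt("birds are fascinating", votes=5, formal=False),
]
selected = select_for_application(kb)
```

Here `selected` contains only the more specialized statement: the general statement is dropped because a retained candidate refines it, and the informal statement is dropped whatever its votes. The shared KB itself is left untouched, so another application may apply different criteria.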

Finally, I also proposed the use of replication mechanisms between competing or partially competing knowledge servers (on the Web or within peer-to-peer networks) in such a way that it does not matter which server a user updates or queries first: the advantages of distribution and centralization are thus combined and there is only one "virtual" network [14].


C.  Adoption of the Approach by Users and Tool Providers

There are various reasons why no other system currently uses the approach adopted in WebKB-2 despite the advantages of this approach and the direct impact that its use would have on education. First, such a tool required much research and implementation work. Second, there currently exists a lot of informal legacy data but very little well-organized explicit knowledge. Third, the above described approach suffers from two problems common to all precision-oriented knowledge acquisition/retrieval approaches, i.e., approaches where the semantic network has to be (semi-)formal and displayed to the users: (i) people need to learn how to read such networks, and (ii) entering knowledge representations requires much more intellectual rigor than writing informal sentences. The unwillingness of most people to learn new notations (e.g., musical/mathematical notations and programming languages) is well known. Furthermore, most people have not heard about knowledge representation languages nor about the usefulness of learning one.

Yet, I believe that my approach (combined with more traditional ones) has some future with researchers, teachers and students since (i) the need to use very small learning objects is now well recognized by the e-learning research community [3]-[5], (ii) the economy of time and resources brought by the use of truly re-usable learning objects will be understood by more and more e-learning/university teachers and administrators, (iii) more and more teachers are involved in e-learning, (iv) it is part of the roles of teachers and researchers to (re-)present knowledge in explicit and detailed ways, (v) my approach permits a better evaluation of the knowledge and analytic skill of the students than less precision-oriented approaches, and (vi) providing the semantic organization of the content of teaching materials (instead of or in addition to these materials) helps students find, compare and memorize the information scattered in these materials. As Section IV indicates, this last point was recognized by many of my students after they had learned how to read the semantic networks I prepared for them.



III.  Presentation of the Created Semantic Networks

A.  Content

During my e-learning fellowship [19], I represented the content of three courses given at Griffith University (Australia). These courses were "Introduction to Multimedia Development" (Multimedia; 13 sets of slides), "Systems Analysis & Design" (S.A.; 437 slides) and "Workflow Management" (WFM; on-line course based on a book and supplementary materials). Most slides and most of their statements were represented into a semantic network of tasks, data structures, properties, definitions, etc. For example, 350 out of the 437 slides of the S.A. course were represented. Only 350 were represented because (i) the examples made via figures and tables were not represented, (ii) the redundancies were eliminated, and (iii) information solely related to tutorials and examinations was eliminated too. For the WFM course, at first, the network only included the most important WFM concepts and relationships introduced in the book. This network was particularly helpful since in this book the descriptions of those important concepts and relationships were scattered amongst hundreds of sentences, and sometimes these descriptions were very general and fuzzy. Thus, without the network it was extremely difficult to remember and correlate all these descriptions.

For evaluating the students of these courses and the proposed approach, each student was asked to add at least twenty relations to the network. Then, I evaluated these additions (did they make sense? were they interesting? etc.). The WFM students had to do this exercise three times, as a replacement for a traditional learning journal.

For each of the three courses, a relatively small list of relation types happened to be necessary for representing the content of the slides and the relations between the important concepts. Most of these types were: subtype, instance, specialization, part (physical_part or subtask), technique, tool, definition, annotation, use, purpose, rationale, role, origin, example, advantage, disadvantage, argument, objection, requirement, agent, object, input, output, parameter, attribute, characteristic, support and url. (This list is ordered topically, not by frequency of occurrence.) This list is small compared to all the basic relations that can be found in top-level ontologies or that would potentially be needed if general natural language documents had to be represented.

B.  Presentation

The input files containing the initial knowledge representations for these courses are accessible at http://www.webkb.org/kb/it/. These input files were loaded into (i.e., executed by) WebKB-2 and hence their formal objects (conceptual categories or statements) became part of the unique global semantic network that can be queried, browsed and complemented by any Web user via WebKB-2 (http://www.webkb.org). The students were given the URL of WebKB-2 and the URLs of the input files for their courses.

Fig. 1 shows an extract of an input file for the WFM course. Fig. 2 shows the result of a very simple command. Fig. 3 and Fig. 4 show extracts of input files for the Multimedia course. Within each input file the formal representations are included in sections and indented. This indentation most often reflects the specialization relations existing between the represented objects. The representations are enclosed within special tags (e.g., ) to isolate them from the informal elements. HTML tags may be used within representations for presentation or hyperlinking purposes (other HTML tags are ignored by WebKB-2; indexing document elements with representations is done in another way). Figs. 1, 3 and 4 show that knowledge from different topics can be represented, normalized and organized in very similar ways.

These input files use FL [17] because this notation was designed to be the most structured and concise possible formal notation that is as expressive as RDF+OWL-Full [20]. (RDF+OWL is the knowledge representation language which has been recommended by the W3C and which has become the de-facto standard for the Semantic Web [21] and hence the Semantic Learning Web too [1]-[2]). FL is similar to N3 [22] but has a more regular structure. (N3 is a readable notation often used in W3C documents to avoid using the XML-based notation for RDF+OWL.) FL is much more concise than other notations, especially graphic notations, and hence reduces the need for scrolling or browsing. This permits people to see many relations between the formal objects, and hence better compare and understand these objects. This also eases the integration and exploitation of knowledge representations within textual documents or their interconnections with textual elements [23]. An original aspect of the research work on WebKB-2 is its focus on handy textual interfaces (notations, commands, and automatically generated textual interfaces). In the future, traditional solutions based on applets may also be used.

In the following examples (Figs. 1 to 4), no cardinalities are explicitly associated with the relations between the objects. Thus, most statements in these figures follow the generic schema "CONCEPT1 RELATION1: CONCEPT2 CONCEPT3, RELATION2: CONCEPT4, ...;". Such a statement should be read: "any CONCEPT1 may have for RELATION1 one or many CONCEPT2, and may have for RELATION1 one or many CONCEPT3, and may have for RELATION2 one or many CONCEPT4, ...". Some comments within the figures explain how the creators of each object are made explicit. As examples of additions made by students, please note the relations created by the student "s162557" in Figs. 1, 2 and 4.
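The reading rule given above for the generic schema can be made concrete with a small Python sketch. It is not an FL parser: the functions `expand` and `reading` are hypothetical, and the sketch assumes that concept and relation names contain no whitespace, comma or colon, and that the statement carries no creator annotation, cardinality or comment. It only handles the generic schema quoted in the text.

```python
from typing import List, Tuple


def expand(statement: str) -> List[Tuple[str, str, str]]:
    """Expand 'C1 R1: C2 C3, R2: C4;' into (concept, relation, concept) triples."""
    body = statement.strip().rstrip(";").strip()
    subject, _, rest = body.partition(" ")
    triples = []
    for clause in rest.split(","):
        relation, _, targets = clause.partition(":")
        for target in targets.split():
            triples.append((subject, relation.strip(), target))
    return triples


def reading(statement: str) -> str:
    """Produce the default reading used when no cardinality is given."""
    triples = expand(statement)
    subject = triples[0][0]
    parts = [f"may have for {rel} one or many {obj}" for _, rel, obj in triples]
    return f"any {subject} " + ", and ".join(parts)


example = "CONCEPT1 RELATION1: CONCEPT2 CONCEPT3, RELATION2: CONCEPT4;"
triples = expand(example)
sentence = reading(example)
```

For the generic schema, `triples` holds three relations from CONCEPT1 (two via RELATION1 and one via RELATION2), and `sentence` reproduces the reading quoted above without its trailing ", ...".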

Fig. 1.  Extract from a file representing statements from a book in Workflow Management (here referred to by the variable $book)

Fig. 2.  Command to display the specializations of a type, followed by its first result: wfm#workflow_management (here, this type is displayed along with some of its related objects using an informal format looking like FL)

Fig. 3.  Organization of guidelines about the creation of Web pages