Within a research project funded by the "Training and Mobility of Researchers" program, we are seeking to apply a new approach, based on the semantic analysis of icons, to the field of CALL (Computer Assisted Language Learning).
The application prototype currently under development, GLOTTAI 1, aims to allow learners of German as a second language to practice communicating in that language at home or in tutorial classes. The users first tell the computer what they intend to express by pointing to icons. The system interprets these icons semantically and proposes its own formulation (a) in the form of a conceptual graph, and (b) as a full German sentence. The users can then "play" with the graph to discover how to express variations or refinements.
In this report, we describe the ideas underlying this system in more detail. In Section 2, we indicate how GLOTTAI may be situated in terms of theoretical choices and practical approach. In Section 3, we give an overview of the specifications of the system and present its architecture and its different modules. In Section 4, we mention the lessons which may be drawn from the experiment at this stage.
The system described addresses the situation of adults, or young adults, who are learning German as a second language (L2) for communicative purposes. This can include people who live in a German-speaking country and have an immediate need for, and use of, the ability to communicate. It may also serve as additional practice for people who already take German lessons at institutions such as the Goethe Institute or the German Volkshochschulen.
The system aims to bridge the gap that beginners in a second language experience between their immediate desire to communicate and their limited linguistic competence.
Our approach is guided by two observations:
In this situation, the main question learners frequently face (how do I express my ideas in L2?) obviously needs to be answered with the help of a native speaker of L2. As a native speaker is not always at hand to answer such questions, the computer could profitably be used for that purpose.
It would allow a teacher in the language lab to concentrate on one learner at a time while the others explore the system, thus multiplying the opportunities for practice available to the many students of the same teacher.
It would also allow the learners to practice alone, outside the language class, without necessarily harassing native speakers nearby.
In this context, is the computer likely to be readily accepted? Studies on the correlation between the use of computers and learners' motivation suggest so.
A recent study (Warschauer, 1996) on experiments using electronic mail in courses of English as a Second Language showed a strongly positive reaction from all types of students. The main motivation emerging from the use of the computer in this study was, admittedly, linked to the communicative aspect of e-mail (as a factorial analysis showed); but it was not the only factor: the feeling of empowerment given by the computer (ease of manipulating knowledge) also scored highly.
Another recent study was conducted among older students who were less familiar with the computer world (Rézeau, 1999): some of them were studying modern languages, others musicology or art history. This study showed a reserved evaluation of computer use at the beginning of the term, but in all cases a positive progression of this evaluation by the end of the term (as well as a positive reevaluation of the interest of language learning itself!). It also showed an interesting result: while students who have chosen to specialize in modern languages often have a very good auditory sense, students in other branches tend to have a more visually oriented, analytical mind. For them, and for other people who share the same disposition, it is easier to study linguistic structures by visualizing them than to absorb them from a flow of auditory input.
These data suggest that an appropriate use of the computer may not only be useful, but may also foster greater interest in the language learning process.
Jimmy Backer (1995) has presented a good typology of CALL systems, taking into account important criteria that previous theoretical works (esp. Kemmis, Higgins) had already partially pointed out. Following Kemmis' terms, he distinguishes instructional, revelatory, conjectural and emancipatory CALL (we will keep these terms, although some are not self-explanatory, since we do not want to encourage terminological proliferation).
Instructional CALLS (CALL Systems) give rules, and propose sets of exercises to assimilate them. Revelatory CALLS allow learners to use their notions of L2 to manipulate virtual worlds simulating the real world. Conjectural CALLS offer trial and error tasks during which learners play with language. Emancipatory CALLS simply facilitate the use of L2 by relieving the learners of other, more tedious tasks 2.
Instructional CALL is hence based on explicit transmission of linguistic knowledge: it relies on the principle of deductive learning (from rules to facts), and uses drill and practice to reinforce the acquired knowledge.
Revelatory and conjectural CALL (leaving "emancipatory" CALL aside) are based instead on implicit transmission of linguistic knowledge. Confronted with a mass of linguistic data, the learner eventually internalizes the rules: this is inductive learning (from facts to rules).
Instructional CALLS tend to lie in the metaphorical field of "computer as magister", whereas the others lie in the field of "computer as pedagogue" (following Higgins' terminology, quoted by Backer [ibid.]).
What are the judgements implicitly lying behind these distinctions, and how do they relate to the current trends in CALL? It should be noted that, to this day, instructional CALLS are at the same time (a) the most widespread in real use (in Second Language Teaching), and (b) the most widely criticized by specialists, who point to their lack of interest.
I believe that instructional CALL is held in low esteem simply because it is basically dull. However, its critique in the scientific literature is generally based on a misunderstanding. It has in fact been criticized for being behaviourist by researchers who favour a mentalist theory of knowledge (mainly within mainstream Chomskyan linguistics).
It is known that "behaviourist" is probably the worst possible insult in the mind of classical cognitivists; however, if we keep to the technical meaning of the term (reinforcement of a stimulus-response association), it seems that any language learning process, whether "instructional", "revelatory" or "conjectural" CALL, as well as any L1 or L2 acquisition, spontaneous or directed, at home or at school, must fit this description. I would even like to add: how can implicit and inductive approaches possibly be less behaviourist than the explicit and deductive approach on which instructional tutoring is based? The whole judgement actually rests on a prejudice based on a reductive and simplistic view of behaviourism: behaviourism is rat training, it consists in always repeating the same things, it is dull and boring; in contrast, mentalism represents intelligence.
But actually inductive approaches demand as much reinforcement work, and perhaps more, to be efficient.
On the whole, this behaviourist/mentalist opposition applied to discriminate Second Language Acquisition (SLA) strategies seems to be an irrelevant line of debate. A sounder theoretical background to approach this question can be found in the newer distinction between objectivist and constructivist learning (Wendt, 1996), which argues that knowledge structures are subjective, individually built representations.
To conclude on this point, we think it is likely that "communicative" CALLS (revelatory or conjectural), which place the learners in a communication situation instead of making them drill grammar, will seem more interesting to the users, and hence stand a better chance of success, because they are centered on the learners' interests and arouse much more motivation than traditional systems. We therefore adopt most of Underwood's (1984) "premises for communicative CALL", although we do not adhere to their theoretical justification (critique of behaviourism), but merely find them based on sound principles of adult psychology:
- Acquisition rather than learning. Focus on communication rather than form. No drills.
- Implicit grammar rather than explicit grammar. Explanations will be optional.
- Allow and encourage original use of language, not merely manipulate prefabricated language.
- Computer will not evaluate everything. The students will evaluate their own work.
- No use of the "Wrong. Try Again." format of feedback. Either model the correct usage or give gentle hints.
- No "reward" with message, graphics, or sound. The achievement of the goal will be sufficient.
- No cuteness is needed. (example: inserting student's name)
- Use the target language exclusively.
- Must be flexible enough for more than one response only.
- Allow student exploration and discovery. No one right answer. Possibly no answers will be given.
- Create natural environment off the screen as well as on it. Generate interaction among users as well as between users and the computers.
- Never use a computer for something a book can do better. Do not create electronic workbooks.
- CALL must be fun, optional, and supplementary to regular classwork. There should be no record keeping of a student's activities or progress.
Underwood (1984) quoted in (Backer, 1995).
Research in the field of CALL in the mainstream, classical, cognitivist AI has put emphasis on:
These approaches generally tend to favour the metaphorical approach of the "computer as a pedagogue".
However, in the last five years a great number of projects have essentially been exploring the possibility of applying expanding new technologies, generally related to what is called the "Information Society", to the field of CALL (Rüschoff & Wollf, 1998). These new technologies fall into two broad domains: (a) the Internet, and (b) Multimedia.
Software integrating these new technologies is generally found to arouse interest among learners, even if it does not implement particularly new pedagogic approaches: the mere attractive power of computer-mediated communication on the one hand, and of sounds and images on the other, tends to increase motivation in itself. Besides, it can be noticed, not surprisingly, that such software continues the main previous pedagogical trends, i.e. most of it still lies in the "instructional CALL" category: it is organized around lessons and exercises, even if those lessons and exercises comprise multimedia data.
Interesting exceptions are for example The Rosetta Stone (inductive learning organized around visual stimulation), or COMPAL's Storyboard (based on the storyboard concept--i.e. rather in the "revelatory" category--, but with a comprehensive multimedia context).
From these theoretical reflections and from past experience, the following can be retained:
Adult learners have to become aware of the language structures: they have at least partially lost the ability to learn only from exposure to an input flow (as in the case of L1 acquisition).
(This favours the conjectural CALL approach, where the learner can bring the structures to the fore by "watching them work" [Zock, 1994]).
Learners learn implicitly from the input they are exposed to, not from their own output (which can help them become aware of their deficiencies, and gain confidence in practicing, but can in no case give "positive reinforcement").
(This is somewhat contradictory with the communicative motivation, and somehow a bridge has to be built over this gap between the desire to express oneself and the ability to do so in L2).
Hence the ideal communicative learning system in this perspective would:
This approach is not so common, since it calls for a variety of cognitive meta-knowledge. It has been advocated by Zock (1992), who has proposed a generative system called SWIM, and by Wilks (1992).
The key point here is point (1) mentioned above: how can the system become aware of what the learner wants to express? Imaginative answers have to be envisioned in this context: iconic entry, robust parsing, graph editing.
We propose to use a methodology we have already defined and used in the context of an Augmentative and Alternative Communication (AAC) system for speech-impaired people (Vaillant, 1998): semantic analysis of icon sequences.
The GLOTTAI system is built on the following ground principles:
These three main functions correspond to three different stages of message processing: respectively, semantic analysis, sentence generation, and lexical choice. The order in which these three stages are presented here corresponds to the order in which they are a priori first performed. This does not imply that they must necessarily be sequentially ordered: for example, after a stage of lexical refinement, the system performs sentence generation again. The user can also decide to go back to icon selection/deselection and have the sequence reanalyzed. The three modules cooperate, operating on a pivot data structure: the graph.
This order also does not correspond to a logical processing order, which in this case would be: semantic analysis (operating on the icon sequence, producing the graph), lexical choice (operating on the graph), generation (operating on the graph, producing the sentence). At the first run however, the generation stage can operate directly on the raw graph produced by the analysis, before any further lexical refinement has been made; this explains the order above.
To implement these global functions, the system is organized in two modules: a linguistic module, performing the linguistic operations (semantic analysis of icons, lexical choice, generation) with access to specialized databases (icon lexicon, TELEX database of sememes, lexico-syntactic database of German); and a Human-Machine Interface module allowing the user to input the icons and manipulate the linguistic structures.
The linguistic module is implemented by a PROLOG process, and the Human-Machine interface by a JAVA program. The two processes are independent. Formally, the JAVA program acts as a client to the PROLOG server, which processes information and yields results only on request. The two processes communicate through sockets (managed by a local patch of code in the case of the PROLOG process), which allows the interface program, if needed, to be downloaded over the Internet onto any computer with JAVA capability, and then to communicate with the server through a TCP/IP connection.
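The request/reply relationship between the two processes can be illustrated with a small sketch. It is written in Python rather than JAVA and PROLOG, and the wire format (one JSON object per line) as well as the names `serve_one_request`, `request_analysis` and `echo_analysis` are assumptions made for the example; the report does not specify the actual protocol. A connected socket pair stands in for the TCP/IP connection.

```python
import json
import socket
import threading


def serve_one_request(conn, analyse):
    """Server side: wait for one request, answer it, and close."""
    with conn, conn.makefile("rw", encoding="utf-8", newline="\n") as stream:
        request = json.loads(stream.readline())
        stream.write(json.dumps(analyse(request["icons"])) + "\n")
        stream.flush()


def request_analysis(conn, icons):
    """Client side: send the icon sequence and block until the reply arrives."""
    with conn, conn.makefile("rw", encoding="utf-8", newline="\n") as stream:
        stream.write(json.dumps({"icons": icons}) + "\n")
        stream.flush()
        return json.loads(stream.readline())


def echo_analysis(icons):
    """Hypothetical stand-in for the linguistic module: no real analysis."""
    return {"icon_count": len(icons), "icons": icons}


# A connected pair of sockets replaces the TCP/IP link for this demonstration.
server_end, client_end = socket.socketpair()
worker = threading.Thread(target=serve_one_request, args=(server_end, echo_analysis))
worker.start()
reply = request_analysis(client_end, ["child", "drink", "milk"])
worker.join()
print(reply)
```

The server does nothing until a request arrives, which mirrors the "only on request" behaviour described above; everything else in the sketch is illustrative.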
Three different databases are needed at the three different functional stages of the system: one icon database for the basic concepts represented iconically, one lexical database containing a formal description of the nuances between sememes of the target language (here German), and one lexico-syntactic database used for the construction of the sentence.
For the icon database, we have chosen to follow the theme/rubric classification proposed in (Rohrer, 1991). The icons are not isomorphic to the words of the German language, but as far as precise concepts are concerned, the classification given in the Grundwortschatz provides a good starting point for selecting the most frequent and useful concepts. The information concerning the lists of the 2000/4000 most frequent words of the German language has been collated with the information contained in the CELEX database (Centre for Lexical Information, University of Nijmegen, 1993) to get the most significant results.
Examples of theme/rubric: "Der Mensch/Positive und neutrale Gefühle", "Handlungen und Aktivitäten/Umgang mit Dingen und Lebewesen", "Alltagswelt/Obst und Gemüse", "Öffentliches Leben/Staat und Politik".
Inside every rubric, the concepts are grouped into smaller categories of functionally equivalent items, which, following the differential semantics tradition of (Rastier, 1987), we label "taxemes". These taxemes provide the base-level at which icons are distinguished during the semantic analysis.
At the rubric level, as well as at the taxeme level, semantic feature structures model the essential semantic contents common to the icons in the category described.
Individually, the icons have the following properties: a unique identifier string, an image file, and a list of specific semantic features which define them in contrast to other icons of the same taxeme.
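As an illustration of this three-level organization, the following Python sketch models rubrics, taxemes and icons as plain records. It is not the format of the actual icon database: the class names, the flat feature dictionaries, the rule that more specific levels override more general ones, and all feature values are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Rubric:
    name: str
    features: dict = field(default_factory=dict)  # shared by the whole rubric


@dataclass
class Taxeme:
    name: str
    rubric: Rubric
    features: dict = field(default_factory=dict)  # shared by the taxeme


@dataclass
class Icon:
    identifier: str          # unique identifier string
    image_file: str          # path of the picture shown to the user
    taxeme: Taxeme
    specific_features: dict = field(default_factory=dict)

    def all_features(self):
        """Merge rubric, taxeme and icon features; the most specific level wins."""
        merged = dict(self.taxeme.rubric.features)
        merged.update(self.taxeme.features)
        merged.update(self.specific_features)
        return merged


# Hypothetical entries; the feature names are invented for illustration.
food = Rubric("Alltagswelt/Obst und Gemüse", {"concrete": True, "edible": True})
fruit = Taxeme("fruit", food, {"kind": "fruit"})
apple = Icon("apple", "apple.png", fruit, {"shape": "round"})
pear = Icon("pear", "pear.png", fruit, {"shape": "elongated"})

print(apple.all_features())
```

In this sketch the two icons of the same taxeme differ only by their specific features, which is the contrast the semantic analysis relies on.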
The TELEX database is a lexical database based on work on the functional structuring of verb fields (Kunze 1991, 1993). It provides a differential description of verbs which makes it possible to distinguish items of the same taxeme by the perspective from which they linguistically realize their basic meaning (Grundform).
Verb fields in general are defined by a minimal Grundform, possibly refined by complementary descriptions at the taxeme-specific level, e.g. to distinguish buy, rent and borrow from mere take. The Grundform consists of a generic decomposition of the common meaning into predicates with uninstantiated variables. For example, for the class of verbs expressing a property transfer:
(somebody (1) sees to it that somebody (2) gets something (3) while somebody (4) does not have something (5) anymore).
To distinguish, in a generative perspective, the different sememes that relate to the same Grundform, TELEX mentions the specifications of the generic structure which may occur by:
Moreover, TELEX also specifies the implicit communicative "directionality" internally conveyed by each sememe, that is, on which parts of its internal description the semantic emphasis (semantische Emphase) lies. This makes it possible, for example, to distinguish nuances such as that between give and give away.
Naturally, not all the combinatory possibilities are realized in the language. However, it is important to distinguish them, because they are precisely what makes linguistic sememes different from generic concepts. Learning to understand which possibilities the target language L2 offers in this respect, and how far it differs from L1, is an important step towards a real mastery of L2.
The GLOTTAI approach aims at understanding what learners want to say, or at least getting a first interpretation of it, without demanding previous knowledge of the grammar and vocabulary of L2. Hence approaches based on conventional parsing are not suitable. We need a robust semantic parser, based on semantic knowledge rather than on morphological information (absent in icons) or syntactic information (which the learner is not supposed to have acquired, and which is in any case weakly constraining in German).
We perform a semantic analysis of the input sequence of icons using an algorithm similar to the one described in (Vaillant, 1998): when an icon in the input sequence has a "predicative" structure (it may become the head of at least one dependency relation to another node, labeled "actant"), the other icons around it are checked for compatibility. Compatibility is measured as a unification score between two sets of feature structures: the intrinsic semantic features of the candidate actant, and the "extrinsic" semantic features of the predicative icon attached to a particular semantic role (i.e. the properties "expected" from, say, the agent of kiss, the direct object of drink, or the concept qualified by the adjective fierce).
The result yielded by the semantic parser is the graph that maximizes the sum of the compatibilities of all its dependency relations. It constitutes, with no particular contextual expectations, and given the state of world knowledge stored in the iconic database in the form of semantic features, the "best" interpretation of the users' input.
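The scoring and maximization just described can be sketched as follows. This is a minimal Python illustration, not the algorithm of (Vaillant, 1998) nor the PROLOG implementation: the scoring rule (+1 for each expected feature that matches, -1 for each clash, 0 when the candidate leaves the feature unspecified), the feature names, and the exhaustive search around a single predicative icon are all assumptions made for the example.

```python
from itertools import permutations


def compatibility(expected, intrinsic):
    """Score a candidate actant against the features a role expects."""
    score = 0
    for feature, value in expected.items():
        if feature in intrinsic:
            score += 1 if intrinsic[feature] == value else -1
    return score


def best_graph(roles, candidates):
    """Return (assignment, score) maximizing the summed compatibilities.

    roles:      role name -> expected ("extrinsic") features
    candidates: icon id   -> intrinsic features
    Returns (None, None) when there are fewer candidates than roles.
    """
    role_names = list(roles)
    best_assignment, best_score = None, None
    for chosen in permutations(candidates, len(role_names)):
        score = sum(
            compatibility(roles[role], candidates[icon])
            for role, icon in zip(role_names, chosen)
        )
        if best_score is None or score > best_score:
            best_assignment, best_score = dict(zip(role_names, chosen)), score
    return best_assignment, best_score


# Hypothetical lexicon entries for the predicative icon "drink".
drink_roles = {"agent": {"animate": True}, "object": {"liquid": True}}
other_icons = {
    "child": {"animate": True, "liquid": False},
    "milk": {"animate": False, "liquid": True},
}

print(best_graph(drink_roles, other_icons))
```

With these invented features, the reading in which the child is the agent and the milk the object scores 2, while the reverse assignment scores -2, so the former is kept as the "best" interpretation.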
Yet that best interpretation may have to be refined by the user, who may have meant something else with the same concepts, may want to express another nuance of meaning, or may simply want to "play" with the possible lexical variations (our approach here is strongly inspired by the Meaning-Text Theory and its notion of semantic synthesis [Polguère, 1998]).
To this end, the lexical choice module allows the users:
or, having selected a particular node:
The goal of the generation process is to transform the graph into one or more German sentences.
The first sentence is generated during a one-shot depth-first scan of the graph, following the case relations in a forward direction, and beginning from the first predicative node in the sequence (i.e. the first in the order defined by the communicative structure).
Every node, as a lexical item, has an entry in the lexico-syntactic database. For every entry in this database, syntactic trees which represent canonical ways of putting the concept and its case-fillers into a German phrase are stored. The syntactic trees are elementary trees of the TAG formalism: the lexical entry they are stored under appears as their anchor, and the possible case-fillers as places for substitution--an indexical structure, stored in parallel, gives the mapping between the places for substitution and the cases to which they correspond.
After the first sentence has been generated with the maximum number of nodes possibly taken into account, the generation algorithm seeks, in order of priority: (a) to adjoin remaining information into the current sentence using tree-adjunction operations; (b) if necessary, to build new sentences to convey the remaining information.
After the tree has been generated, unification is performed on the morphosyntactic feature structures to determine the surface form of the sentence.
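The depth-first scan with substitution of case-fillers can be illustrated by a deliberately simplified Python sketch. String templates stand in for TAG elementary trees, with the named slots playing the part of substitution sites; adjunction, the handling of remaining nodes and morphosyntactic unification are left out, and the surface forms are hard-coded. The names `LEXICON`, `GRAPH` and `realize` are hypothetical.

```python
# Each entry is anchored by a lexical item; the named slots are the places
# where the realizations of the case-fillers are substituted.
LEXICON = {
    "drink": "{agent} trinkt {object}",
    "child": "das Kind",
    "milk": "die Milch",
}

# Semantic graph: node -> {case: dependent node}.
GRAPH = {
    "drink": {"agent": "child", "object": "milk"},
    "child": {},
    "milk": {},
}


def realize(node, graph, lexicon):
    """Depth-first scan: realize the case-fillers first, then substitute them."""
    fillers = {
        case: realize(dependent, graph, lexicon)
        for case, dependent in graph[node].items()
    }
    return lexicon[node].format(**fillers)


print(realize("drink", GRAPH, LEXICON))
```

Starting from the predicative node, the scan yields "das Kind trinkt die Milch". In a fuller treatment the inflected forms would come out of the unification step rather than being stored in the templates.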
The client application, written in JAVA, mainly serves as the Human-Machine Interface (HMI) for the PROLOG process (the server application). It could possibly perform additional functions, such as storing semantic information on previous sessions at the client site; this remains to be studied.
The HMI aims at giving the learner direct access to the semantic structures defined and processed by the PROLOG module. It is structured around a main window, which contains the following parts:
In addition to this main window, an Icon lexicon window allows the selection of icons. It is opened at the beginning, and can be re-opened at any time to select an icon to be added or replaced, or to compose a new message. On opening, the Icon lexicon displays generic icons, which correspond to rubrics. When the user clicks on a generic icon, the display is refreshed to present the actual icons belonging to that rubric.
The Icon lexicon window can always be closed to go back to the main HMI window.
At the beginning of the project, a workplan was defined along the following lines:
WP1: Preliminary Study & Specification; WP2: Development; WP3: Evaluation; WP4: Extension & Improvements; WP5: Final Report & Prospects Study.
These work packages were themselves divided into the following tasks:
WP1: Task 1.1: Review of partially related works; Task 1.2: Definition of the subset of the German language covered by the system, definition of the restricted subset covered by the first release of the system; Task 1.3: Definition of the iconic sememes; Task 1.4: Complete specifications of the system.
WP2: Task 2.1: Implementation of the icon lexicon in a format homogeneous with the TELEX dictionary; Task 2.2: Implementation and Unit Test of the semantic analysis module; Task 2.3: Implementation and Unit Test of the generation module; Task 2.4: Implementation and Unit Test of the graph editing interface; Task 2.5: Integration and Test of the first release.
WP3: Task 3: Evaluation.
WP4: Task 4.1: Support to users during evaluation; Task 4.2: Improvements, bug correction; Task 4.3: Extension of the initial system's lexicon. Task 4.4: Integration and test of the second release.
WP5: Task 5: Final Report and Prospects Study
At the present date, WP1 has been completed and development is proceeding on WP2. None of the tasks in WP2 has been completed yet.
During the first year of this project, I have familiarized myself with scientific domains (CALL), traditions (German linguistics) and techniques that were previously unknown to me. The objectives of the second year of the project should include, as a priority, the publication of the analyses and results emerging from the present research.
1 Standing for German Lexicon Opened Teaching Tongues by the Analysis of Icons.
2 Word processors, electronic mail, or the World Wide Web, for example, could be counted in this category. In general, this emancipatory CALL category covers any computer tool used for the purpose of language learning, even if it was not designed for it.
