Learning to Communicate in German Language through an Iconic Input Interface

Pascal Vaillant

June 21st, 1999

Presentation of the GLOTTAI project

Pascal Vaillant
Humboldt-Universität zu Berlin
Institut für deutsche Sprache und Linguistik
Lehrstuhl Computerlinguistik
Jägerstraße 10/11
10117 Berlin
Germany

E-mail: vaillant@compling.hu-berlin.de


1. Introduction

Within a research project funded by the "Training and Mobility of Researchers" program, we seek to apply a new approach, based on the semantic analysis of icons, to the field of CALL (Computer Assisted Language Learning).

The application prototype that is currently being developed, GLOTTAI 1, aims at allowing learners of German as a second language to practice communication in that language at home or in tutorial classes. The users first tell the computer what they intend to express by pointing to icons. The system interprets these icons semantically, and proposes its formulation (a) in the form of a conceptual graph, and (b) as a full German sentence. The users are then allowed to "play" with the graph to discover how to express variations or refinements.

In this report, we describe the ideas underlying this system in more detail. In Section 2, we situate GLOTTAI in terms of theoretical choices and practical approach. In Section 3, we give an overview of the system's specifications and present its architecture and its modules. In Section 4, we mention the lessons that may be drawn from the experiment at this stage.

2. Context and Situation

2.1. Language Learning and its Motivation

The system described addresses adults, or young adults, who are learning German as a second language (L2) for communicative purposes. This can include people who live in a German-speaking country and have an immediate need for, and use of, the ability to communicate. It may also serve as additional practice for people who already take German lessons at institutions such as the Goethe Institute or the German Volkshochschulen.

The system aims at bridging the gap that beginners in a second language experience between their immediate desire to communicate and their limited linguistic competence.

Our approach is guided by two observations:

  1. Those adult or young adult learners are not in the situation of children acquiring their first language (who can slowly acquire a great quantity of implicit knowledge of the language by being exposed to a constant input flow, without being expected to show an immediate functional performance). They need to understand the rules and structures of L2, since they already are conscious of language structures, and are at the same time subject to the risk of spontaneously transferring linguistic structures from their mother tongue (L1).
  2. However they want to be able to immediately begin their practical use of these structures; hence they do not wish to engage in a long-term process of acquiring explicit linguistic knowledge before they can begin to make use of it. They are guided by a strong motivation: being able to express their thoughts as soon as possible.

2.2. Evidence for the Use of Computer

In this situation, the main question learners frequently face (how do I express my ideas in L2?) obviously calls for the help of a native speaker of L2. As a native speaker is not always at hand to answer such questions, the computer could profitably be used for that purpose.

It would allow a teacher in the language lab to concentrate on one learner at a time, while the others explore the system--thus multiplying the opportunities for practice among the many students of one and the same teacher.

It would also allow the learners to practice alone, outside the language class, without necessarily harassing native speakers nearby.

In this context, is the computer likely to be readily accepted? Studies on the correlation between the use of computers and learners' motivation suggest so.

A recent study (Warschauer, 1996) on experiments using electronic mail in English as a Second Language courses showed a strong positive reaction from all types of students. The main motivation emerging from the use of the computer in this study was, admittedly, linked to the communicative aspect of e-mail (as a factorial analysis showed); but it was not the only factor: the feeling of empowerment given by the computer (the ease of manipulating knowledge) also scored highly.

Another recent study was conducted among older students who were less familiar with the computer world (Rézeau, 1999): some of them were studying modern languages, others musicology or art history. This study showed a reserved evaluation of computer use at the beginning of the term, but in all cases a more positive evaluation at the end of the term (as well as a positive reevaluation of the interest of language learning itself!). It also showed an interesting result: while students who have chosen to specialize in modern languages often have a very good ear, students in other branches tend to have a more visually oriented, analytical mind. For them, and for other people who share the same disposition, it is easier to study linguistic structures by visualizing them than to absorb them from a flow of auditory input.

This data suggests that a relevant use of the computer may not only be useful, but foster even more interest in the language learning process.

2.3. State of the Art in CALL

2.3.1. Theoretical discussion

Jimmy Backer (1995) has presented a good typology of CALL systems, taking into account important criteria that previous theoretical works (esp. Kemmis, Higgins) had already partially pointed out. Following Kemmis' terms, he distinguishes instructional, revelatory, conjectural and emancipatory CALL (we will keep the terms, although some do not seem self-explanatory, since we do not want to encourage terminological proliferation).

Instructional CALLS (CALL Systems) give rules, and propose sets of exercises to assimilate them. Revelatory CALLS allow learners to use their notions of L2 to manipulate virtual worlds simulating the real world. Conjectural CALLS offer trial-and-error tasks during which learners play with language. Emancipatory CALLS simply facilitate the use of L2 by relieving the learners of other, more tedious tasks 2.

Instructional CALL is hence based on explicit transmission of linguistic knowledge: it relies on the principle of deductive learning (from rules to facts), and uses drill and practice to reinforce the acquired knowledge.

Revelatory and conjectural CALL (let us keep "emancipatory" CALL aside), rather, are based on implicit transmission of linguistic knowledge. Confronted with a mass of linguistic data, the learner finally internalizes the rules: this is inductive learning (from facts to rules).

Instructional CALLS tend to lie in the metaphorical field of "computer as magister", whereas the others lie in the field of "computer as pedagogue" (following Higgins' terminology, quoted by Backer [ibid.]).

What are the judgements implicitly lying behind these distinctions, and how do they relate to the current trends in CALL? It should be noted that, to this day, instructional CALLS are at the same time: (a) the most widespread in real use (in Second Language Teaching), and (b) the most widely criticized by specialists, who point out how uninteresting they are.

I believe that instructional CALL is held in low esteem simply because it is basically dull. However, its critique in the scientific literature is generally based on a misunderstanding. It has in fact been criticized for being behaviourist, by researchers who favour a mentalist theory of knowledge (mainly in mainstream Chomskyan linguistics).

It is known that "behaviourist" is probably the worst possible insult in the mind of classical cognitivists; however, if we keep to the technical meaning of the term (reinforcement of a stimulus-response association), it seems that any language learning process, "instructional", "revelatory" or "conjectural" CALL, as well as any L1 or L2 acquisition, spontaneous or directed, at home or at school, must fit this description. I would even like to add: how can implicit and inductive approaches possibly be less behaviourist than the explicit and deductive approach on which instructional tutoring is based? The whole judgement actually rests on a prejudice based on a reductive and simplistic view of behaviourism: behaviourism is rat training, it consists in always repeating the same things, it is dull and boring; in contrast, mentalism represents intelligence.

But actually inductive approaches demand as much reinforcement work, and perhaps more, to be efficient.

On the whole, this behaviourist/mentalist opposition applied to discriminate Second Language Acquisition (SLA) strategies seems to be an irrelevant line of debate. A sounder theoretical background to approach this question can be found in the newer distinction between objectivist and constructivist learning (Wendt, 1996), which argues that knowledge structures are subjective, individually built representations.

To conclude on this point, we think it is likely that "communicative" CALLS (revelatory or conjectural), which place the learners in a communication situation instead of making them drill grammar, will seem more interesting to the users--and hence have a better chance of success--because they are centered on the learners' interests and arouse much more motivation than traditional systems. We will therefore adopt most of Underwood's (1984) "premises for communicative CALL"--although we do not adhere to their theoretical justification (critique of behaviourism), but merely find them based on sound principles of adult psychology:

  1. Acquisition rather than learning. Focus on communication rather than form. No drills.
  2. Implicit grammar rather than explicit grammar. Explanations will be optional.
  3. Allow and encourage original use of language, not merely manipulate prefabricated language.
  4. Computer will not evaluate everything. The students will evaluate their own work.
  5. No use of the "Wrong. Try Again." format of feedback. Either model the correct usage or give gentle hints.
  6. No "reward" with message, graphics, or sound. The achievement of the goal will be sufficient.
  7. No cuteness is needed. (example: inserting student's name)
  8. Use the target language exclusively.
  9. Must be flexible enough for more than one response only.
  10. Allow student exploration and discovery. No one right answer. Possibly no answers will be given.
  11. Create natural environment off the screen as well as on it. Generate interaction among users as well as between users and the computers.
  12. Never use a computer for something a book can do better. Do not create electronic workbooks.
  13. CALL must be fun, optional, and supplementary to regular classwork. There should be no record keeping of a student's activities or progress.

Underwood (1984) quoted in (Backer, 1995).

2.3.2. Current directions of research

Research in the field of CALL in the mainstream, classical, cognitivist AI has put emphasis on:

These approaches generally tend to favour the metaphorical approach of the "computer as a pedagogue".

However, in the last five years a great number of projects have mainly tried to explore the possibility of applying expanding new technologies, generally related to what is called the "Information Society", to the field of CALL (Rüschoff & Wolff, 1998). These new technologies fall into two broad domains: (a) the Internet, and (b) Multimedia.

Software integrating these new technologies is generally found to arouse interest among learners, even if it does not implement particularly new pedagogic approaches--the mere attractive power of computer-mediated communication on the one hand, and of sounds and images on the other, tends to increase motivation in itself. Besides, it can be noticed, not surprisingly, that such software continues the main previous pedagogical trends, i.e. most of it still lies in the "instructional CALL" category: it is organized around lessons and exercises, even if those lessons and exercises include multimedia data.

Interesting exceptions are, for example, The Rosetta Stone (inductive learning organized around visual stimulation), or COMPAL's Storyboard (based on the storyboard concept, i.e. rather in the "revelatory" category, but with a comprehensive multimedia context).

2.4. Justification of the Present Approach

From these theoretical reflections and from past experience, the following can be retained:

Yet:

And:

Hence the ideal communicative learning system in this perspective would:

  1. know what the learner wants to say, although he himself does not know how to say it;
  2. be able to express it correctly;
  3. allow the learner to play with the structures.

This approach is not so common, since it calls for a variety of cognitive meta-knowledge. It has been advocated by Zock (1992), who has proposed a generative system called SWIM, and by Wilks (1992).

The key point here is point (1) above: how can the system become aware of what the learner wants to express? Imaginative answers have to be envisioned in this context: iconic entry, robust parsing, graph editing.

We propose to use a methodology that we have already defined and used within an Augmentative and Alternative Communication (AAC) system for speech-impaired people (Vaillant, 1998): semantic analysis of icon sequences.

3. System Description

3.1. Overview of the system

3.1.1. Overall Functional Description

The GLOTTAI system is built on the following basic principles:

These three main functions correspond to three different stages of message processing: respectively, semantic analysis, sentence generation, and lexical choice. The order in which these three stages are presented here is the order in which they are first performed by default. This does not imply that they must be strictly sequential: for example, after a stage of lexical refinement, the system performs sentence generation again. The user can also decide to go back to icon selection/deselection and to reanalyze the sequence. The three modules cooperate, operating on a pivot data structure: the graph.

This order also does not correspond to a logical processing order, which in this case would be: semantic analysis (operating on the icon sequence, producing the graph), lexical choice (operating on the graph), generation (operating on the graph, producing the sentence). At the first run however, the generation stage can operate directly on the raw graph produced by the analysis, before any further lexical refinement has been made; this explains the order above.

3.1.2. Architecture

To implement these global functions, the system is organized in two modules: a linguistic module, performing the linguistic operations (semantic analysis of icons, lexical choice, generation) with access to specialized databases (icon lexicon, TELEX database of sememes, lexico-syntactic database of German); and a Human-Machine Interface module allowing the user to input the icons and manipulate the linguistic structures.

Fig. 1: Overall architecture of the GLOTTAI system
Communication between the prolog module and the interface module, passing elements of type: icon sequence, graph, NL sentence

The linguistic module is implemented by a PROLOG process, and the Human-Machine interface by a JAVA program. The two processes are independent. Formally, the JAVA program acts as a client to the PROLOG server, which processes information and yields results only when it is requested to. The two processes communicate through sockets (managed by a local patch of code in the case of the PROLOG process), which allows the interface program, if needed, to be downloaded on any computer with JAVA capability over the Internet, and then to communicate with the server through a TCP/IP connection.
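The request/response exchange described above can be sketched as follows. This is only an illustration written in Python rather than in the JAVA and PROLOG actually used; the newline-terminated message format, the function names and the stub "analysis" are all assumptions, not the GLOTTAI protocol.

```python
import socket
import threading


def handle_request(line: str) -> str:
    """Stand-in for the linguistic server: wraps the icon names in a fake graph term."""
    icons = line.split()
    return "graph(" + ",".join(icons) + ")"


def serve_once(server_sock: socket.socket) -> None:
    """Accept one client, answer one newline-terminated request, then close."""
    conn, _ = server_sock.accept()
    with conn:
        request = conn.makefile("r", encoding="utf-8").readline().strip()
        conn.sendall((handle_request(request) + "\n").encode("utf-8"))


def ask_server(port: int, icon_sequence: str) -> str:
    """Client side: send an icon sequence and wait for the server's reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=5.0) as sock:
        sock.sendall((icon_sequence + "\n").encode("utf-8"))
        return sock.makefile("r", encoding="utf-8").readline().strip()


def demo() -> str:
    """Run server and client in one process over a local TCP connection."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.settimeout(5.0)          # do not wait forever for a client
    server_sock.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()
    try:
        return ask_server(port, "child drink milk")
    finally:
        worker.join()
        server_sock.close()
```

The server only answers when asked, which mirrors the client/server roles described above; because the exchange goes through TCP, the same client code would work unchanged against a remote host.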

3.2. Prolog module

3.2.1. Databases

Three different databases are needed at the three different functional stages of the system: one icon database for the basic concepts represented iconically, one lexical database containing a formal description of the nuances between sememes of the target language (here German), and one lexico-syntactic database used for the construction of the sentence.

3.2.2. Semantic Analysis

The GLOTTAI approach aims at understanding what learners want to say, or at least getting a first interpretation of it, without requiring prior knowledge of the grammar and vocabulary of L2. Hence approaches based on syntactic parsing are not suitable. We need a robust semantic parser, based on semantic knowledge rather than on morphological information (absent in icons) or syntactic information (which the learner is not supposed to have acquired, and which is in any case weakly constraining in German).

We perform a semantic analysis of the input sequence of icons using an algorithm similar to the one described in (Vaillant, 1998): when an icon in the input sequence has a "predicative" structure (it may become the head of at least one dependency relation to another node, labeled "actant"), the other icons around it are checked for compatibility. Compatibility is measured as a unification score between two sets of feature structures: the intrinsic semantic features of the candidate actant, and the "extrinsic" semantic features of the predicative icon attached to a particular semantic role (i.e. the properties "expected" from, say, the agent of kiss, the direct object of drink, or the concept qualified by the adjective fierce).

The result yielded by the semantic parser is the graph that maximizes the sum of the compatibilities of all its dependency relations. It constitutes, with no particular contextual expectations, and given the state of world knowledge stored in the iconic database in the form of semantic features, the "best" interpretation of the users' input.
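The principle can be illustrated with a deliberately simplified sketch. It assumes a single predicative icon, replaces feature-structure unification with a plain overlap ratio between feature sets, and uses a tiny invented lexicon; the feature names, the scoring formula and the exhaustive search are assumptions made for illustration, not the algorithm of (Vaillant, 1998).

```python
from itertools import permutations

# Hypothetical intrinsic features of non-predicative icons.
INTRINSIC = {
    "child": {"animate", "human"},
    "milk": {"liquid", "edible"},
}

# Hypothetical extrinsic features: what a predicative icon expects of each role.
EXPECTED = {
    "drink": {"agent": {"animate"}, "object": {"liquid", "edible"}},
}


def compatibility(expected: set, intrinsic: set) -> float:
    """Fraction of the expected features that the candidate actually carries."""
    if not expected:
        return 1.0
    return len(expected & intrinsic) / len(expected)


def best_graph(predicate: str, candidates: list) -> tuple:
    """Try every assignment of candidates to roles; keep the highest total score."""
    roles = list(EXPECTED[predicate])
    best_score, best_assignment = -1.0, {}
    for ordering in permutations(candidates, len(roles)):
        score = sum(
            compatibility(EXPECTED[predicate][role], INTRINSIC[icon])
            for role, icon in zip(roles, ordering)
        )
        if score > best_score:
            best_score, best_assignment = score, dict(zip(roles, ordering))
    return best_assignment, best_score
```

With these toy data, `best_graph("drink", ["milk", "child"])` assigns "child" to the agent role and "milk" to the object role whatever the input order, since that assignment maximizes the summed compatibility.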

3.2.3. Lexical Choice

Yet that best interpretation may have to be refined by the user, who may have meant something else with the same concepts, or wants to express another nuance of meaning, or possibly simply wants to "play" with the possible lexical variations (our approach here is strongly inspired by the Meaning-Text Theory and its notion of semantic synthesis [Polguère, 1998]).

To this end, the lexical choice module allows the users:

or, having selected a particular node:

3.2.4. Generation

The goal of the generation process is to transform the graph into one or more German sentences.

The first sentence is generated during a one-shot depth-first scan of the graph, following the case relations in a forward direction, and beginning from the first predicative node in the sequence (i.e. the first in the order defined by the communicative structure).
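A depth-first scan of this kind can be sketched as follows. The graph encoding (a dictionary mapping each node to its outgoing labeled relations) and the relation labels are assumptions chosen for illustration.

```python
# Hypothetical semantic graph: node -> list of (relation label, target node).
GRAPH = {
    "drink": [("agent", "child"), ("object", "milk")],
    "child": [("qualifier", "small")],
    "milk": [],
    "small": [],
}


def depth_first(graph: dict, start: str) -> list:
    """Visit each reachable node once, following relations forward from `start`."""
    visited, order = set(), []

    def visit(node: str) -> None:
        if node in visited:
            return
        visited.add(node)
        order.append(node)
        for _relation, target in graph.get(node, []):
            visit(target)

    visit(start)
    return order
```

Starting from the predicative node, `depth_first(GRAPH, "drink")` yields the nodes in the order drink, child, small, milk: each dependent is fully explored before the next relation of its head is followed.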

Every node, as a lexical item, has an entry in the lexico-syntactic database. For every entry in this database, syntactic trees which represent canonical ways of putting the concept and its case-fillers into a German phrase are stored. The syntactic trees are elementary trees of the TAG formalism: the lexical entry they are stored under appears as their anchor, and the possible case-fillers as places for substitution--an indexical structure, stored in parallel, gives the mapping between the places for substitution and the cases to which they correspond.
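As a rough illustration of substitution into an anchored elementary tree, the sketch below stores one tree for a hypothetical entry "trinken" together with its index structure. The tuple encoding, the site names and the use of ready-made strings as fillers (instead of trees) are simplifying assumptions; the inflected anchor is written directly into the tree here, whereas the system determines surface forms later.

```python
# A node is (label, payload); the payload is either a list of child nodes,
# a lexical item, or the name of a substitution site.
TRINKEN_TREE = (
    "S",
    [
        ("NP", "site:0"),
        ("VP", [("V", "trinkt"), ("NP", "site:1")]),
    ],
)

# Index structure stored in parallel: substitution site -> case.
TRINKEN_SITES = {"site:0": "agent", "site:1": "object"}


def realize(tree: tuple, site_to_case: dict, fillers: dict) -> list:
    """Return the words of the tree, left to right, after filling every site."""
    _label, payload = tree
    if isinstance(payload, list):
        words = []
        for child in payload:
            words.extend(realize(child, site_to_case, fillers))
        return words
    if payload in site_to_case:
        # Substitution: plug in the phrase that fills the corresponding case.
        return fillers[site_to_case[payload]].split()
    return [payload]  # lexical leaf, e.g. the anchor
```

For example, `" ".join(realize(TRINKEN_TREE, TRINKEN_SITES, {"agent": "das Kind", "object": "Milch"}))` gives "das Kind trinkt Milch".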

After the first sentence has been generated with the maximum number of nodes possibly taken into account, the generation algorithm seeks, in order of priority: (a) to adjoin remaining information into the current sentence using tree-adjunction operations; (b) if necessary, to build new sentences to convey the remaining information.

After the tree is generated, appropriate unification is performed on the morphosyntactical feature structures to determine the surface form of the sentence.
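The role of unification at this stage can be sketched with flat feature structures. The feature names, the two-entry paradigm table and the `inflect` helper are hypothetical; real morphosyntactic feature structures are nested and considerably richer.

```python
def unify(a: dict, b: dict):
    """Merge two flat feature structures; return None if any value clashes."""
    result = dict(a)
    for key, value in b.items():
        if key in result and result[key] != value:
            return None
        result[key] = value
    return result


# Hypothetical paradigm: (lemma, person, number) -> surface form.
FORMS = {
    ("trinken", "3", "sg"): "trinkt",
    ("trinken", "3", "pl"): "trinken",
}


def inflect(lemma: str, subject_features: dict, verb_features: dict):
    """Pick the verb form that agrees with the subject, or None on failure."""
    agreed = unify(subject_features, verb_features)
    if agreed is None:
        return None
    return FORMS.get((lemma, agreed.get("person"), agreed.get("number")))
```

A third-person singular subject unified with an underspecified verb selects "trinkt", while a subject and a verb that disagree in number fail to unify and yield no form.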

3.3. Interface

The client application, written in the JAVA language, mainly serves as the Human-Machine Interface (HMI) for the PROLOG process (server application). It could possibly perform additional functions, such as storing semantic information on previous sessions at the client site--this remains to be studied.

The HMI aims at giving the learner direct access to the semantic structures defined and processed by the PROLOG module. It is structured around a main window, which contains the following parts:

Fig. 2: Human-Machine Interface for GLOTTAI: Main Window
Left: Icon Toolbox; Graph Toolbox. Right: Sememe Selection Panel; General Commands. Center Up: Icon Ruler; Center Middle: Graph Display; Center Down: Text Display.

In addition to this main window, an Icon lexicon window allows the selection of icons. It is opened at the beginning, and can be re-opened at any time to select an icon to be added or replaced, or to compose a new message. On opening, the Icon lexicon displays generic icons, which correspond to rubrics. When the user clicks on a generic icon, the display is refreshed to present the actual icons belonging to that rubric.

The Icon lexicon window can always be closed to go back to the main HMI window.
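The two-level navigation of the Icon lexicon can be modeled by a small state holder, sketched below without any graphical code. The rubric names, the icon names and the class itself are invented for illustration and say nothing about the actual JAVA implementation.

```python
# Hypothetical icon lexicon: rubric (generic icon) -> actual icons.
ICON_LEXICON = {
    "food": ["milk", "bread"],
    "people": ["child", "teacher"],
}


class IconLexiconWindow:
    """Tracks what the lexicon window shows: rubrics first, then one rubric's icons."""

    def __init__(self, lexicon: dict) -> None:
        self.lexicon = lexicon
        self.current_rubric = None  # None means the generic icons are displayed

    def displayed(self) -> list:
        """Return the icons currently on display."""
        if self.current_rubric is None:
            return sorted(self.lexicon)
        return list(self.lexicon[self.current_rubric])

    def click(self, name: str):
        """Open a rubric, or return the selected icon once inside a rubric."""
        if name not in self.displayed():
            raise ValueError(f"{name!r} is not currently displayed")
        if self.current_rubric is None:
            self.current_rubric = name  # refresh the display with that rubric
            return None
        return name
```

Clicking "food" switches the display from the rubrics to the icons "milk" and "bread"; a second click on "milk" returns that icon as the selection.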

4. Results and Conclusions

4.1. Advancement

At the beginning of the project, a workplan had been defined along the following lines:

WP1: Preliminary Study & Specification; WP2: Development; WP3: Evaluation; WP4: Extension & Improvements; WP5: Final Report & Prospects Study.

These work packages were themselves divided into the following tasks:

WP1: Task 1.1: Review of partially related works; Task 1.2: Definition of the subset of the German language covered by the system, definition of the restricted subset covered by the first release of the system; Task 1.3: Definition of the iconic sememes; Task 1.4: Complete specifications of the system.

WP2: Task 2.1: Implementation of the icon lexicon in a format homogeneous with the TELEX dictionary; Task 2.2: Implementation and Unit Test of the semantic analysis module; Task 2.3: Implementation and Unit Test of the generation module; Task 2.4: Implementation and Unit Test of the graph editing interface; Task 2.5: Integration and Test of the first release.

WP3: Task 3: Evaluation.

WP4: Task 4.1: Support to users during evaluation; Task 4.2: Improvements, bug correction; Task 4.3: Extension of the initial system's lexicon. Task 4.4: Integration and test of the second release.

WP5: Task 5: Final Report and Prospects Study

At the present date, WP1 has been completed, and development is under way on WP2. None of the tasks in WP2 has been completed yet.

4.2. Training results and objectives

During the first year of this project, I have familiarized myself with scientific domains (CALL), traditions (German linguistics) and techniques that were unknown to me. The objectives of the second year of the project should include, as a priority, the publication of the analyses and results emerging from the present research.

Notes

1 Standing for German Lexicon Opened - Teaching Tongues by the Analysis of Icons.

2 Word processors, electronic mail, or the World Wide Web, for example, could be counted in this category. In general, the emancipatory CALL category can account for any computer tool used for the purpose of language learning, even if it was not designed for that purpose.

References

BACKER, Jimmy, 1995. Teaching Grammar with Call: Survey of Theoretical Literature. Jerusalem, Israel: Hebrew University of Jerusalem. Retrieved [June 1999] from the URL: http://www1.huji.ac.il/snunit/English/gramcall.htm

KILGER, Anne & BEDERSDORFER, Jochen, 1994. Grammatik und Lexikon für VM-GEN. Saarbrücken, Germany: DFKI (Deutsches Forschungszentrum für künstliche Intelligenz). Retrieved [June 1999] from the URL: http://www.dfki.de/verbmobil/tp1/tp9/publications/ghead.ps.gz

KUNZE, Jürgen, 1991. Kasusrelationen und semantische Emphase. Studia Grammatica, XXXII. Berlin, Germany: Akademie Verlag.

KUNZE, Jürgen, 1993. Sememstrukturen und Feldstrukturen. Studia Grammatica, XXXVI. Berlin, Germany: Akademie Verlag.

POLGUÈRE, Alain, 1998. "La Théorie Sens-Texte". Dialangue, 8/9, 9-30. Université du Québec à Chicoutimi, Canada. Retrieved [June 1999] from the URL: http://www.fas.umontreal.ca/ling/olst/FrEng/PolgIntroTST.ps

RASTIER, François, 1987. Sémantique interprétative. Formes sémiotiques. Paris: Presses Universitaires de France.

ROHRER, Hans-Heinrich, 1991. Vocabulaire allemand de base. Französische Ausgabe des Langenscheidt Grundwortschatz Deutsch. Übersetzung von Micheline Funke. Berlin/Munich, Germany: Langenscheidt.

RÉZEAU, Joseph, 1999. "Profils d'apprentissage et représentations dans l'apprentissage des langues en environnement multimédia". In ALSIC (Apprentissage des Langues et Systèmes d'Information et de Communication), 2 (1): 27-49. Besançon, France: Université de Franche-Comté, June 1999. Retrieved [June 1999] from the URL: http://alsic.univ-fcomte.fr

RÜSCHOFF, Bernd & WOLFF, Dieter, 1998. Fremdsprachenlernen in der Wissensgesellschaft. Munich, Germany: Max Hueber.

SWARTZ, Merryanna L, 1992. "Introduction" to M. L. Swartz, M. Yazdani (Eds.), Intelligent Tutoring Systems for Foreign Language Learning (pp. 1-6). NATO ASI Series. Berlin, Germany: Springer.

UNDERWOOD, John, 1984. Linguistics, computers and the language teacher: a communicative approach. Rowley, MA, U.S.A.: Newbury House.

VAILLANT, Pascal, 1998. "Interpretation of iconic utterances based on contents representation: semantic analysis in the PVI system". Natural Language Engineering, 4 (1): 17-40. Cambridge, United Kingdom: Cambridge University Press.

VIJAY-SHANKER, Krishnamurti & JOSHI, Aravind K., 1988. "Feature structure based tree adjoining grammars". In Proceedings of the 12th International Conference on Computational Linguistics (COLING). Budapest, Hungary.

WARSCHAUER, Mark, 1996. "Motivational aspects of using computers for writing and communication". In Mark Warschauer (Ed.), Telecollaboration in foreign language learning: Proceedings of the Hawai'i symposium (pp. 29-46). Honolulu, Hawai'i: University of Hawai'i, Second Language Teaching & Curriculum Center. Retrieved [June 1999] from the URL: http://www.lll.hawaii.edu/nflrc/NetWorks/NW1/

WENDT, Michael, 1996. Konstruktivistische Fremdsprachendidaktik. Tübingen, Germany: Gunter Narr.

WILKS, Yorick, 1992. "Building an intelligent second-language tutoring system from whatever bits you happen to have lying around". In M. L. Swartz, M. Yazdani (Eds.), Intelligent Tutoring Systems for Foreign Language Learning. NATO ASI Series. Berlin, Germany: Springer.

ZOCK, Michael, 1992. "SWIM or Sink: the problem of communicating thought". In M. L. Swartz, M. Yazdani (Eds.), Intelligent Tutoring Systems for Foreign Language Learning. NATO ASI Series. Berlin, Germany: Springer.

ZOCK, Michael, 1994. "Language in action, or learning a language by watching it work". In Proceedings of the Twente Workshop on Language Technology: Computer Assisted Language Learning. Universiteit Twente, Enschede, The Netherlands.


Pascal Vaillant, 21/06/1999