Workshop on Natural Language Processing methods and Corpora in Translation, Lexicography, and Language Learning

In conjunction with RANLP-2009
(The International Conference on Recent Advances in Natural Language Processing)

Borovets, Bulgaria
September, 2009



Corpora are now indispensable tools in research and everyday practice for translators, lexicographers, and second language learners. Specialists in these areas share a general goal in using corpora in their work: corpora make it possible to find and analyse linguistic patterns characteristic of various kinds of language users, monitor language change, and reveal important similarities and divergences across different languages. For professional translators, corpora are invaluable linguistic and cultural awareness tools. For language learners, they serve as a means to gain insights into the specifics of competent language use as well as to analyse typical errors of fellow learners. For lexicographers, corpora are key for monitoring the development of the vocabularies of languages, making informed decisions about the lexicographic relevance of the lexical material, and for general verification of all varieties of lexicographic data.

While simple corpus analysis tools such as concordancers have long been in use in these specialist areas, the past decade has seen important developments in Natural Language Processing (NLP) technologies: it has become much easier to construct corpora, and powerful NLP methods have become available that can analyse corpora not only at the surface level, but also at the syntactic, and even semantic, pragmatic, and stylistic levels.

This workshop aims to bring together the developers and the users of NLP technologies for the purposes of translation, translation studies, lexicography, terminology, and language learning in order to present their research and discuss new possibilities and challenges in these fields.

Submissions are invited for the following topics of interest to the workshop:

  • NLP methodologies for processing parallel and comparable corpora
  • Context-sensitive dictionary look-up
  • Corpus-based study and identification of cognates and false friends
  • Compilation and use of corpora in translation studies
  • Corpus-based study of properties of translated text:
    • translation universals
    • phraseology
    • lexical and grammatical patterns
  • Corpora in translator training
  • Translation of terms and collocations using corpora
  • Bilingual concordancing in translation applications
  • NLP methods for Computer-Aided Translation
  • Compilation of specialised terminologies
  • Compilation of corpora for bilingual lexicography
  • Detection of gaps in bilingual dictionaries
  • Corpus-based estimation of lexicographic relevance
  • Term and collocation extraction
  • Discovery of illustrative examples and definitions of words and word senses in corpora
  • Reading and writing aid applications for language learners
  • Automated text glossing in Computer-Aided Language Learning (CALL)
  • Corpus-based design of assessment materials in CALL
  • Error detection and error analysis in CALL
  • Detection of first-language interference in learner corpora


Important dates
Extended paper submission deadline: July 10, 2009
Workshop paper acceptance notification: August 10, 2009
Camera-ready papers for workshop proceedings due: August 24, 2009
Workshop date: September 17, 2009


Submission instructions
Papers must be submitted in PDF format as e-mail attachments to iustina dot ilisei at gmail-com. The e-mail should use the subject header "RANLP-2009 workshop".

Authors are invited to submit full papers on original, unpublished work in the topic area of this workshop. Papers (in PDF format conforming to the RANLP 2009 stylefiles) should not exceed 8 pages. The RANLP 2009 stylefiles are available at:

As reviewing will be blind, the papers should not include the authors' names and affiliations. Furthermore, self-references that reveal the authors' identities should be avoided. Papers that do not conform to these requirements will be rejected without review.

Each submission will be reviewed by at least two members of the Programme Committee. Reviewers will be asked to provide detailed comments, and to score submitted papers on the following factors:

  • Relevance to the workshop
  • Significance and originality
  • Technical/methodological accuracy
  • References to related work
  • Presentation (clarity, organisation, English)

Accepted papers policy
Accepted papers will be published in the workshop proceedings. By submitting a paper to the workshop, the authors agree that, if the paper is accepted for publication, at least one of the authors will attend the workshop; all workshop participants are expected to pay the RANLP-2009 workshop registration fee.


Preliminary Programme
  • 11.30 - 12.30 : Preslav Nakov (invited talk)
  • 12.30 - 14.00 : Lunch break
  • 14.00 - 14.30 : Dimitar Kazakov and Ahmad Shahid: "Unsupervised Construction of a Multilingual WordNet from Parallel Corpora"
  • 14.30 - 15.00 : Veronica Pastor and Amparo Alcina: "Search techniques in corpora for the training of translators"
  • 15.00 - 16.00 : Gloria Corpas and Ruslan Mitkov: "Translation universals: experiments on simplification, convergence and transfer" (invited talk)
  • 16.00 - 16.30 : Break
  • 16.30 - 17.00 : Joerg Tiedemann: "Evidence-Based Word Alignment"
  • 17.00 - 17.30 : Joerg Tiedemann and Gideon Kotze: "A Discriminative Approach to Tree Alignment"
  • 17.30 - Closing session


Programme Committee
Marco Baroni, University of Trento
Jill Burstein, Educational Testing Service
Michael Carl, Copenhagen Business School
Gloria Corpas Pastor, University of Malaga
Le An Ha, University of Wolverhampton
Patrick Hanks, Masaryk University
Federico Gaspari, University of Bologna
Adam Kilgarriff, Lexical Computing
Marie-Claude L'Homme, Université de Montréal
Ruslan Mitkov, University of Wolverhampton
Roberto Navigli, University of Rome 'La Sapienza'
Miriam Seghiri, University of Malaga
Pete Whitelock, Oxford University Press
Richard Xiao, Edge Hill University
Federico Zanettin, University of Perugia


Organising Committee
Iustina Ilisei
University of Wolverhampton, United Kingdom

Viktor Pekar
Oxford University Press, United Kingdom

Silvia Bernardini
University of Bologna, Italy

Contact information: For any questions, please contact Iustina Ilisei: iustina dot ilisei at gmail-com