

A Workshop on Acquisition and Management
of Multilingual Lexicons


In conjunction with RANLP-2007
(The International Conference on Recent Advances in Natural Language Processing)

Sponsored by Expert System Inc.

Endorsed by the ACL Special Interest Group on the Lexicon (ACL-SIGLEX)

Borovetz, Bulgaria
September 30, 2007




AIMS

Information exchange on the Internet is becoming ever more multilingual, a trend that has stimulated important new developments in the field of multilingual Natural Language Processing (NLP). The present workshop is concerned with the problem of automatic management of lexical resources, which lie at the heart of many multilingual technologies.

Recent years have seen increased interest within NLP research in the automated discovery of equivalent expressions in different languages. Interesting new directions have sprung up, such as the use of the Web and comparable corpora for this task, and new kinds of lexical phenomena, such as multiword expressions and named entities, have come into focus.

This workshop will bring together researchers working on a broad range of problems related to the management of multilingual lexical resources - their acquisition, maintenance, customization, and re-use. The scope of the workshop includes evaluation of multilingual lexicons intended for a broad range of applications: from personalized glossaries for a translation project to large-scale machine-readable dictionaries and databases.

Specific topics of interest for the workshop include:

  • Acquisition of lexical knowledge from parallel and comparable corpora, and from the Web
  • Porting, merging and domain customisation of existing lexical resources using NLP technologies
  • Acquisition of multilingual domain terminology
  • Acquisition of translations for multi-word expressions
  • Acquisition of translations for polysemous words
  • Acquisition of multilingual lexical taxonomies
  • Named Entity transliteration
  • Acquisition of cognates and loanwords
  • Extraction of equivalent free word combinations from comparable corpora
  • Applications of multilingual lexicons and their evaluation within:
    • Statistical Machine Translation
    • Computer-Aided Translation
    • Information Retrieval (Question Answering, Text Retrieval, Text Classification and Clustering)
    • Knowledge Management


WORKSHOP PROGRAMME

10.00 - 10.10 Opening remarks
10.10 - 11.00 Invited talk by Bruno Pouliquen (Language Technology Group, JRC). Acquisition and Use of Multilingual Name Dictionaries.
11.00 - 11.30 Ahmed Hassan, Haytham Fahmy and Hany Hassan. Improving Named Entity Translation by Exploiting Comparable and Parallel Corpora.
11.30 - 11.45 Coffee break
11.45 - 12.15 Shane Bergsma and Grzegorz Kondrak. Multilingual Cognate Identification using Integer Linear Programming.
12.15 - 12.45 Svetlin Nakov, Preslav Nakov and Elena Paskaleva. Cognate or False Friend? Ask the Web!
12.45 - 14.00 Lunch break
14.00 - 14.30 Andrea Mulloni, Viktor Pekar, Ruslan Mitkov, Dimitar Blagoev. Semantic Evidence for Automatic Identification of Cognates.
14.30 - 15.00 Veronique Hoste, Klaar Vanopstal and Els Lefever. The Automatic Detection of Scientific Terms in Patient Information.
15.00 - 15.15 Coffee break
15.15 - 15.45 Michael Carl, Oliver Culo and Sandrine Garnier. Compiling and Managing a Bilingual Lexicon in METIS-II.
15.45 - 16.15 Ismail Fahmi, Gosse Bouma and Lonneke van der Plas. Using Multilingual Terms for Biomedical Term Extraction.
16.15 - 16.30 Closing remarks


SUBMISSION INSTRUCTIONS

Format. Authors are invited to submit full papers on original, unpublished work in the topic area of this workshop. Papers should be submitted in PDF format, follow the RANLP 2007 style files, and not exceed 8 pages. The RANLP 2007 style files are available at:

http://lml.bas.bg/ranlp2007/submissions.htm

As reviewing will be blind, the papers should not include the authors' names and affiliations. Furthermore, self-references that reveal the authors' identities should be avoided. Papers that do not conform to these requirements will be rejected without review.

Submission procedure. Please submit your paper at: http://quad.softconf.com/ranlp/amml07/submit.html.

Reviewing. Each submission will be reviewed by at least two members of the Program Committee. Reviewers will be asked to provide detailed comments and to score submitted papers on the following factors:

  • Relevance to the workshop
  • Significance and originality
  • Technical/methodological accuracy
  • References to related work
  • Presentation (clarity, organisation, English)

Accepted papers policy. Accepted papers will be published in the workshop proceedings. By submitting a paper to the workshop, the authors agree that, if the paper is accepted for publication, at least one of the authors will attend the workshop. All workshop participants are expected to pay the RANLP-2007 workshop registration fee.


IMPORTANT DATES

Call for papers: March 15, 2007
Workshop paper submission deadline: July 6, 2007 (five days after the notification of the main RANLP-2007 conference)
Workshop paper acceptance notification: August 6, 2007
Camera-ready papers for workshop proceedings due: August 31, 2007
Workshop date: September 30, 2007


PROGRAM COMMITTEE

Eneko Agirre (Basque Country University, Spain)
Enrique Alfonseca (Google Inc.)
Marco Baroni (University of Trento, Italy)
Paul Buitelaar (DFKI, Germany)
Michael Carl (IAI, Germany)
Gloria Corpas (University of Malaga, Spain)
Dan Cristea (University "Al. I. Cuza" Iasi, Romania)
Gael Dias (University of Beira Interior, Portugal)
Diana Inkpen (University of Ottawa, Canada)
Dorothy Kenny (Dublin City University, Ireland)
Adam Kilgarriff (Lexicography Masterclass, UK)
Greg Kondrak (University of Alberta, Canada)
Ruslan Mitkov (University of Wolverhampton, UK)
Sebastian Pado (University of Saarland, Germany)
Reinhard Rapp (Johannes Gutenberg-Universitat Mainz, Germany)
Fatiha Sadat (University of Ottawa, Canada)
Violeta Seretan (University of Geneva, Switzerland)
Michel Simard (National Research Council of Canada)
Jörg Tiedemann (Rijksuniversiteit Groningen, Netherlands)
Takehito Utsuro (University of Tsukuba, Japan)
Piek Vossen (Vrije Universiteit Amsterdam, Netherlands)
Michael Zock (LIF-CNRS, France)


KEYNOTE SPEECH

Acquisition and Use of Multilingual Name Dictionaries
by Bruno Pouliquen (Language Technology Group, JRC)

The speaker will present work on automatically acquiring multilingual name dictionaries from news texts in 19 languages and on using such dictionaries and gazetteers to link related news over time and across languages. The purpose of this work is to allow users to navigate and explore news collections, compare news about the same subject across languages, and collect information such as name variants and other name attributes about people and other entities from large multilingual news collections. The technology and the resulting lexical resources are fully integrated into the news aggregation, analysis and exploration system NewsExplorer, which is publicly accessible at http://press.jrc.it/NewsExplorer. NewsExplorer groups news about the same topic or event, displays the names of persons, organisations and locations for each news cluster, and links these clusters over time and across languages. It also collects name variants and name attributes from multiple languages for what are currently 630,000 entities and displays them on dedicated person pages.

Bruno Pouliquen is a senior researcher in the Language Technology group of the European Commission’s Joint Research Centre in Ispra, Italy. He specialises in information extraction, document clustering and classification, techniques for crossing the language barrier, and the visualisation of automatically extracted information. He has held a Ph.D. in Computer Science since 2002.


ORGANIZING COMMITTEE

Viktor Pekar
University of Wolverhampton, UK

Diana Inkpen
University of Ottawa, Canada

Andrea Mulloni
Expert System Inc., Italy


Contact information: For questions or comments, please contact Andrea Mulloni (amulloni at expertsystem-it).