2nd CfP: 2nd Workshop on Natural Language Processing for Translation Memories (NLP4TM 2016) at LREC 2016

Orasan, Constantin C.Orasan at wlv.ac.uk
Thu Jan 28 15:19:36 CET 2016


(apologies for cross-posting)

2nd Workshop on Natural Language Processing for Translation Memories
(NLP4TM 2016)
http://rgcl.wlv.ac.uk/nlp4tm2016/

to be held at LREC 2016 (Portorož, Slovenia), May 28, 2016

Submission deadline: February 10, 2016

1. Call For Papers

Translation Memories (TMs) are among the tools most widely used by
professional translators, if not the most used. The underlying idea of
TMs is that a translator should benefit as much as possible from
previous translations by being able to retrieve how a similar sentence
was translated before. Moreover, the usage of TMs aims at guaranteeing
that new translations follow the client’s specified style and
terminology. Despite the fact that the core idea of these systems
relies on comparing segments (typically of sentence length) from the
document to be translated with segments from previous translations,
most of the existing TM systems hardly use any language processing for
this. Instead of addressing this issue, most of the work on translation
memories has focused on improving the user experience, for example by
supporting a variety of document formats and providing intuitive user
interfaces.
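
To make the segment comparison described above concrete, here is a
minimal sketch of surface-level fuzzy matching, assuming a word-level
edit distance and a similarity threshold of 0.7. The function names, the
threshold and the sample segments are illustrative assumptions and are
not taken from any particular TM tool. No linguistic processing is
involved, which is exactly the limitation this workshop addresses.

```python
from typing import List, Optional, Tuple


def word_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance over word tokens (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # delete wa
                            curr[j - 1] + 1,      # insert wb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_score(source: str, candidate: str) -> float:
    """Similarity in [0, 1]: 1 - distance / length of the longer segment."""
    a, b = source.lower().split(), candidate.lower().split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - word_edit_distance(a, b) / longest


def best_match(
    source: str,
    memory: List[Tuple[str, str]],
    threshold: float = 0.7,  # assumed cut-off, purely illustrative
) -> Optional[Tuple[float, str, str]]:
    """Return (score, stored source, stored target) for the closest segment,
    or None if no stored segment reaches the threshold."""
    best = None
    for stored_source, stored_target in memory:
        score = fuzzy_score(source, stored_source)
        if score >= threshold and (best is None or score > best[0]):
            best = (score, stored_source, stored_target)
    return best


# Hypothetical English-Italian memory, for illustration only.
memory = [
    ("Press the red button to start the engine",
     "Premere il pulsante rosso per avviare il motore"),
    ("Close the door", "Chiudere la porta"),
]
print(best_match("Press the green button to start the engine", memory))
```

Because the comparison works on surface tokens only, a paraphrase or a
morphological variant of a stored segment is scored as a mismatch.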

The term "second generation translation memories" has been around for
more than ten years and it promises translation memory software that
integrates linguistic processing in order to improve the translation
process. This linguistic processing can involve tasks such as matching
of subsentential chunks, edit distance operations between syntactic
trees, and incorporation of semantic and discourse information in the
matching process. This workshop invites papers presenting second
generation translation memories and related initiatives.

Terminologies, glossaries and ontologies are also very useful for
translation memories, by facilitating the task of the translator and
ensuring a consistent translation. The field of Natural Language
Processing (NLP) has proposed numerous methods for terminology
extraction and ontology extraction. Researchers are encouraged to
submit papers to the workshop which show how these methods are being
successfully applied to Translation Memories. In addition, papers
discussing the integration of Machine Translation and Translation
Memories or studies about automatic building of translation memories
from corpora are also welcome.

2. Topics of interest

This workshop invites original papers which show how language
processing can help translation memories. Topics of interest include
but are not limited to:

- Improving matching and retrieval of segments by using morphological,
syntactic, semantic and discourse information
- Automatic extraction of terminologies and ontologies for translation
memories
- Integration of named entity recognition and terminologies in matching
and retrieval
- Using natural language processing for automatic construction of
translation memories
- Extracting and aligning TM segments from a parallel or comparable
corpus
- Construction of translation memories using the Internet
- Corpus based studies about the usefulness of TM for specific domains
- Development of hybrid TM and MT translation systems
- Study of NLP techniques used by TM tools available in the market
- Automatic methods for TM cleaning and maintenance

3. Shared task

A shared task on cleaning translation memories will be organised. A
training set will be distributed so that participants can develop and
train their systems. Testing will be carried out on 500 segments
distributed during the testing phase.

TASK: Automatically clean translation memories

TRAINING SET: 1,500 TM segments annotated with information on whether
they are a valid translation of each other

TEST SET: 500 TM segments

LANGUAGE PAIRS:
- English-Italian
- English-German
- English-Spanish

RELEASE OF THE TRAINING DATA: first week of February 2016

Participants are encouraged to submit working notes of their systems to
be presented during the workshop. More details, including the shared
task schedule, will be announced soon in a dedicated Call for
Participation.

4. Submission information

We invite contributions of either long papers (8 pages + 2 references)
which present unpublished original research or short papers/demos of
systems which present work in progress or working systems (4 pages + 2
references). The submissions do not need to be anonymised.

All the papers will have to be submitted in PDF format via the START
system. More information about submission is available at 
http://rgcl.wlv.ac.uk/nlp4tm2016/submission-information/

5. Identify, Describe and Share your LRs

As scientific work requires accurate citations of referenced work so as
to allow the community to understand the whole context and also
replicate the experiments conducted by other researchers, LREC 2016
endorses the need to uniquely Identify LRs through the use of the
International Standard Language Resource Number (ISLRN, www.islrn.org),
a Persistent Unique Identifier to be assigned to each Language
Resource. The assignment of ISLRNs to LRs cited in LREC papers will be
offered at submission time.

6. Important dates

Submission deadline: 10th February 2016
Acceptance notification: 7th March 2016
Camera-ready versions: 31st March 2016
Workshop date: 28th May 2016

7. Organising committee

For the workshop

- Constantin Orasan, University of Wolverhampton, UK
- Carla Parra, Hermes, Spain
- Eduard Barbu, Translated, Italy
- Marcello Federico, FBK, Italy

For the shared task

- Eduard Barbu, Translated, Italy
- Carla Parra, Hermes, Spain
- Luca Mastrostefano, Translated, Italy
- Matteo Negri, FBK, Italy
- Marco Turchi, FBK, Italy
- Luisa Bentivogli, FBK, Italy
- Constantin Orasan, University of Wolverhampton, UK

The organisers can be contacted by sending an email to nlp4tm2016 <at>
gmail.com.

8. Program Committee

- Juanjo Arevalillo, Hermes, Spain
- Yves Champollion, WordFast, France
- Gloria Corpas, University of Malaga, Spain
- Maud Ehrmann, EPFL, Switzerland
- Kevin Flanagan, Swansea University, UK
- Corina Forascu, University “Al. I. Cuza”, Romania
- Gabriela Gonzalez, eTrad, Argentina
- Rohit Gupta, University of Wolverhampton, UK
- Manuel Herranz, Pangeanic, Spain
- Samuel Läubli, Autodesk, Switzerland
- Liangyou Li, DCU, Ireland
- Qun Liu, DCU, Ireland
- Ruslan Mitkov, University of Wolverhampton, UK
- Aleksandros Poulis, Lionbridge, Sweden
- Gabor Proszeky, Morphologic, Hungary
- Uwe Reinke, Cologne University of Applied Sciences, Germany
- Michel Simard, NRC, Canada
- Mark Shuttleworth, UCL, UK
- Masao Utiyama, NICT, Japan
- Mihaela Vela, Saarland University, Germany
- Andy Way, DCU, Ireland
- Jörn Wübker, Lilt, USA
- Marcos Zampieri, Saarland University and DFKI, Germany


-- 
Dr. Constantin Orasan <C.Orasan at wlv.ac.uk>
Reader in Computational Linguistics
Coordinator of the EXPERT project
Research Group in Computational Linguistics
http://www.wlv.ac.uk/~in6093/
University of Wolverhampton



More information about the IFI-CI-Event mailing list