MT-Evaluation with DiET
Judith Klein / Sabine Lehmann

 

The DiET project (Diagnostic and Evaluation Tools for Natural Language Applications, http://dylan.ucd.ie/DiET) is developing a comprehensive software package for the construction, annotation and maintenance of structured reference data for the evaluation of NLP applications. The DiET system is implemented in a configurable, open client/server architecture: a central database system manages the data, a client integrates construction, display and editing facilities, and several servers support customisation procedures. The project will provide a substantial amount of structured test data, since it builds on the TSNLP test suites for English, French and German and will integrate and extend them to cover further syntactic phenomena as well as morphological data and some discourse phenomena.

The annotation schema comprises not only linguistic annotations but also application-specific, corpus-related and evaluation-specific attributes. Linguistic annotations are mainly used to classify the data, making them as transparent as possible and supporting access to and retrieval of relevant sets of test data.
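
As a rough illustration of how such a schema could support retrieval, the following Python sketch groups annotations into the four kinds named above and selects test items by their linguistic classification. The class name, the attribute names and the `select` helper are assumptions made for this example only; they are not taken from the DiET system.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedItem:
    """A test item with one attribute group per annotation kind (hypothetical layout)."""
    text: str
    linguistic: dict = field(default_factory=dict)    # e.g. phenomenon, language
    application: dict = field(default_factory=dict)   # application-specific attributes
    corpus: dict = field(default_factory=dict)        # corpus-related attributes
    evaluation: dict = field(default_factory=dict)    # evaluation-specific attributes


def select(items, **linguistic_filters):
    """Return the items whose linguistic annotations match every given filter."""
    return [
        item for item in items
        if all(item.linguistic.get(key) == value
               for key, value in linguistic_filters.items())
    ]


# Invented example data, not drawn from the TSNLP test suites.
items = [
    AnnotatedItem("The manager sleeps.",
                  linguistic={"phenomenon": "agreement", "language": "en"}),
    AnnotatedItem("Der Manager schläft.",
                  linguistic={"phenomenon": "agreement", "language": "de"}),
    AnnotatedItem("Who did she see?",
                  linguistic={"phenomenon": "wh-question", "language": "en"}),
]

english_agreement = select(items, phenomenon="agreement", language="en")
```

Keeping the linguistic attributes in their own group means that retrieval by phenomenon or language does not depend on whatever application-specific or evaluation-specific attributes a given user has defined.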

Both system developers and professional evaluators can use the DiET tool package to support diagnosis, performance-based system evaluation and comparative evaluation. The DiET client lets users construct and annotate their own test data easily, either by choosing the desired annotation types from a given annotation-type hierarchy or by using the configuration mechanism to define new annotation schemata. Various customisation facilities allow users to adapt the DiET system and the data to new domains. Through a process of text profiling, links can be established between the structured test items in the database and related phenomena occurring in domain-specific annotated corpora. Lexical replacement functions allow the user to adapt the vocabulary of the test items to a specific terminology domain. Finally, the tools and database allow the user to set up evaluation scenarios and to record the results of test cycles.
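
The idea behind lexical replacement can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the DiET implementation: the function name `replace_vocabulary` and the terminology mapping are hypothetical, and the sketch matches whole words only, ignoring inflection and capitalisation.

```python
import re


def replace_vocabulary(sentence, mapping):
    """Replace whole words in `sentence` according to `mapping` (generic -> domain term)."""
    if not mapping:
        return sentence
    # One alternation over all source words; \b restricts matches to whole words.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in mapping) + r")\b"
    )
    return pattern.sub(lambda match: mapping[match.group(1)], sentence)


# Hypothetical mapping from general vocabulary to a medical terminology domain.
mapping = {"manager": "surgeon", "report": "diagnosis"}
adapted = replace_vocabulary("The manager reads the report.", mapping)
```

Because only the vocabulary changes, the syntactic structure of the test item, and therefore its linguistic annotation, stays the same while the wording moves into the target domain.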

A possible annotation schema for MT evaluation will be presented. In addition, an evaluation scenario will be designed to demonstrate which steps of a concrete evaluation experiment the DiET system can support.