Language Technology at LTH
Our research activities focus on semantic technologies and their applications. Our main achievement so far is a line of high-performance, multilingual semantic parsers.
We have also designed new methods and algorithms for dependency parsing, the conversion of constituents into dependency graphs, coreference resolution, and the ordering of temporal relations.
Carsim, a system to convert written texts into animated 3D images, is the main application our group has implemented.
Semantic parsing
We have developed a set of high-performance semantic parsers for unrestricted text in English, Chinese, German, and four other languages. They adopt the PropBank/NomBank predicates as well as the FrameNet paradigm.
Our parsers were evaluated in the CoNLL 2008 shared task on the joint parsing of syntactic and semantic dependencies in English and in the CoNLL 2009 shared task on seven languages. We also participated in the SemEval-2007 task on Frame-semantic Structure Extraction (semantic role labeling). In 2007 and 2008, our parsers obtained the best results. In 2009, we obtained the second best semantic score averaged over the seven languages, with the best score for the Chinese and German data and the second best one for English.
- Online demonstration! Test the parser here: http://barbar.cs.lth.se:8081/.
- New! We have released the code as part of the Mate tools. Download it here.
- See the CoNLL-2009 evaluation figures here.
- See the CoNLL-2008 evaluation figures here.
- Download the PropBank/NomBank program: The system that participated in the CoNLL-2008 shared task.
- Download the FrameNet program: A re-engineered version of LTH's system that participated in SemEval-2007.
- More on the algorithms and our results in SemEval here: [pdf]
- In November 2007, our semantic parser obtained the award of the Best university project at Microsoft's TechFest in Copenhagen [pdf]
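To make the notion of a PropBank-style analysis concrete, here is a minimal sketch of how one predicate and its arguments could be represented. The class name `Predicate`, the helper `describe`, the example sentence, and the sense label are illustrative assumptions and are not taken from the parsers' code or output format.

```python
from dataclasses import dataclass, field


@dataclass
class Predicate:
    """One predicate with its arguments (hypothetical structure for illustration)."""
    token_index: int   # position of the predicate word in the sentence
    sense: str         # PropBank-style sense label, e.g. "hit.01"
    arguments: dict = field(default_factory=dict)  # role label -> head token index


def describe(pred, tokens):
    """Render a predicate-argument structure as a readable string."""
    parts = [f"{role}={tokens[i]}" for role, i in sorted(pred.arguments.items())]
    return f"{pred.sense}({', '.join(parts)})"


# Illustrative sentence and analysis, written by hand rather than produced by a parser.
tokens = ["The", "truck", "hit", "the", "car"]
hit = Predicate(token_index=2, sense="hit.01", arguments={"A0": 1, "A1": 4})
print(describe(hit, tokens))
```

Each argument points at the head word of its phrase, which is the convention of dependency-based semantic role labeling as evaluated in the CoNLL shared tasks.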
The Carsim project
We have developed a system that automatically converts textual descriptions of accidents into 3D scenes, Carsim.
Carsim combines language processing and visualization techniques. It takes a written report describing an accident as input. A first module analyzes the report using information extraction techniques and produces a representation of it. A visualization module constructs a 3D scene from it and replays the accident symbolically.
- A demonstration of Carsim
- More on Carsim: [pdf]
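The two-module organization described above (text analysis producing an intermediate representation, followed by a module that replays it) can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the word lists, the `Scene` structure, and the regular-expression matching stand in for Carsim's actual information extraction and 3D rendering, which are not described on this page.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy lexicons; a real system would use far richer resources.
VEHICLES = ("car", "truck", "bus", "motorcycle")
COLLISION_VERBS = ("hit", "struck", "collided with")


@dataclass
class Scene:
    """Intermediate representation passed from text analysis to visualization."""
    actors: list = field(default_factory=list)
    events: list = field(default_factory=list)


def extract_scene(report: str) -> Scene:
    """First module: pull vehicles and collision events out of the text."""
    scene = Scene()
    text = report.lower()
    vehicle_pat = "|".join(VEHICLES)
    verb_pat = "|".join(COLLISION_VERBS)
    # Record each vehicle once, in order of first mention.
    for m in re.finditer(rf"\b({vehicle_pat})\b", text):
        if m.group(1) not in scene.actors:
            scene.actors.append(m.group(1))
    # Match "<vehicle> <collision verb> a/an/the <vehicle>".
    pattern = rf"\b({vehicle_pat})\b\s+(?:{verb_pat})\s+(?:a|an|the)\s+({vehicle_pat})\b"
    for m in re.finditer(pattern, text):
        scene.events.append(("collision", m.group(1), m.group(2)))
    return scene


def replay(scene: Scene) -> list:
    """Second module: a textual stand-in for the symbolic 3D replay."""
    return [f"{kind}: {a} -> {b}" for kind, a, b in scene.events]


scene = extract_scene("The truck hit the car at the crossing.")
print(scene.actors)
print(replay(scene))
```

Keeping the intermediate `Scene` object separate from both modules mirrors the split in the description: the analysis side and the visualization side only need to agree on the representation.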
The Carsim project was funded by the Vinnova program: Språkteknologi
Constituent-to-dependency converter
The constituent-to-dependency converter automatically translates the constituent format used in the Penn Treebank into dependency trees. The tool was used to prepare the English data sets in the CoNLL Shared Tasks of 2007, 2008, and 2009.
- Download the program
- More on the new format and its benefits here: [pdf]
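As a rough illustration of what converting constituents to dependencies involves, the sketch below uses the classic head-percolation idea: pick a head child in every constituent, then attach the head words of the other children to it. The tiny `HEAD_RULES` table, the tuple-based tree encoding, and the function name `convert` are assumptions for this example; they are not the converter's actual rule set or data format, and the sketch produces unlabeled dependencies only.

```python
# Toy head rules: for each phrase label, child labels in priority order.
HEAD_RULES = {
    "S": ["VP", "NP"],
    "VP": ["VBD", "VB", "VP"],
    "NP": ["NN", "NNS", "NP"],
}


def convert(tree, deps):
    """Return the head word index of `tree`; append (dependent, head) pairs to `deps`.

    A tree is (label, children); a preterminal is (POS tag, word index).
    """
    label, children = tree
    if isinstance(children, int):  # preterminal: its head is the word itself
        return children
    child_heads = [convert(child, deps) for child in children]
    child_labels = [child[0] for child in children]
    head_pos = 0  # fall back to the leftmost child if no rule applies
    for candidate in HEAD_RULES.get(label, []):
        if candidate in child_labels:
            head_pos = child_labels.index(candidate)
            break
    head = child_heads[head_pos]
    # Every non-head child attaches its head word to the constituent's head word.
    for i, child_head in enumerate(child_heads):
        if i != head_pos:
            deps.append((child_head, head))
    return head


words = ["The", "truck", "hit", "the", "car"]
tree = ("S", [
    ("NP", [("DT", 0), ("NN", 1)]),
    ("VP", [("VBD", 2), ("NP", [("DT", 3), ("NN", 4)])]),
])
deps = []
root = convert(tree, deps)
for dependent, head in sorted(deps):
    print(f"{words[dependent]} <- {words[head]}")
print("root:", words[root])
```

In this example the verb becomes the root of the sentence, and each noun governs its determiner.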