Workshop to be held in conjunction with the NODALIDA-2007 conference in Tartu, May 24, 2007
Data annotated with role-semantic information are becoming an ever more important resource for many semantic systems. They are the core element for developing large-coverage, high-performance, and reusable semantic parsers and classifiers, as well as applications that include lexicography, term and information extraction, semantic processing of the web, text-to-scene conversion systems, etc.
Existing examples of role-annotated corpora and resources include FrameNet, PropBank, and VerbNet for English; Salsa for German; and Spanish FrameNet for Spanish. However, the two main initiatives outside English take FrameNet as a semantic pivot and attempt to derive or adapt frames to the target language using manual work or semiautomatic systems.
As frequently observed, the itemization of frames and lexical units and their manual annotation in a corpus is an expensive task that requires a relatively long-term and dedicated commitment. Such an effort is currently beyond the reach of most research teams in the Nordic area, which could impair the quality, and possibly the existence, of future semantic applications in these languages. This makes the construction of a role-semantic annotated corpus and the design of automatic or semiautomatic transfer methods a challenge as well as an opportunity.
This workshop is intended as a forum for the research community to review current initiatives and methods, as well as ideas for starting the construction of a role-annotated corpus and possibly sharing it across a family of related languages. Topics of interest include (but are not limited to):
- Transfer and adaptation of frames and role annotation from English and German
- Pilot annotation/projection studies
- Lexical inheritance mechanisms: crosslingual similarities and differences, protoframe design, variation and variant annotation
- Frame alignment and projections from English/German and across target languages
- Transfer of collocations, subcategorization, and grammatical functions to role annotation
- General methods for multilingual corpus alignment and structure transfer/projection
- "Upgrade" of existing resources: dependency annotation, small corpora, and bootstrapping using nonsemantically annotated data
- Shared annotation platforms
- Experience from other initiatives: Salsa, Romance FrameNet
- Applications of frame semantics
Deadline for submitting extended abstract: February 19
Notification of acceptance: March 1
Deadline for submitting final papers (short, max. 4 pages, or full, max. 8 pages): April 20
Workshop: May 24
Participants are invited to submit a one- to two-page extended abstract of their research or a position statement using the NODALIDA styles. Submissions must include name, affiliation, and contact address. The workshop submissions will be reviewed by the organizing committee.
Authors of accepted abstracts will be invited to submit, at their own choice, a short paper (max. 4 pages) or a full paper (max. 8 pages), which will be published in the workshop proceedings.
The workshop will take place at the NODALIDA 2007 conference and will include a round table where participants will be able to discuss and try to propose an agenda for building frame semantic resources for the Scandinavian and Baltic languages. Active participation and/or commitments from the presenters or the audience to set up this agenda will be welcome.
The workshop is intended for everyone interested in building frame semantic resources and frame semantic analyzers for the Scandinavian and Baltic languages.
Pierre Nugues, Lund University,
Richard Johansson, Lund University, richard AT cs DOT lth DOT se.