The LTH System for Frame-Semantic Structure Extraction
This is a re-engineered version of LTH's system that participated in the SemEval-2007 task on Frame-semantic Structure Extraction. In short, it performs semantic analysis of English text in the FrameNet paradigm.
Prerequisites
Presently, the system needs about 2 GB of memory to run the tests. We are working on ways to compress the model files to consume less memory.
Make sure that you have a Java system (version 1.5 or newer) installed on your computer.
You also need to install FrameNet. Specifically, you need two files: frames.xml and frRelation.xml.
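The prerequisites above can be checked with a short script before installing. This is a minimal sketch: the FRAMENET_DIR variable and its default value are hypothetical placeholders, not part of the distribution, so point it at wherever your FrameNet files actually live.

```shell
# Hypothetical location of the FrameNet data; adjust to your installation.
FRAMENET_DIR="${FRAMENET_DIR:-$HOME/framenet}"

# Report whether a required file is present.
check_file() {
    if [ -f "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Print the first line of the Java version banner (should be 1.5 or newer).
if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | head -n 1
else
    echo "java not found on PATH"
fi

# The two FrameNet files named above.
check_file "$FRAMENET_DIR/frames.xml"
check_file "$FRAMENET_DIR/frRelation.xml"
```

Any line starting with "missing:" names a file whose path you will need to locate before filling in the command file.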
Installing
Download this archive and decompress it.
Test Run with Charniak's Parser
Make sure that Charniak's parser is installed on your computer.
Download this model file.
Download the test text package and decompress it, then download the command file.
Then open the command file in an editor. Fill in the paths to the following files:
- The FrameNet files: frame-file and frame-rel-file
- The morphological database (lemma-file) and, optionally, a list of lexical units not listed in FrameNet (extra-lu-file)
- The model file (model)
- Charniak's parser and parsing data directory. For instance, if the parser executable is /usr/local/charniak/parser05Aug16/parseIt and the parsing data directory is /usr/local/charniak/parser05Aug16/DATA/EN, then the args string in parsing-service should be /usr/local/charniak/parser05Aug16/parseIt -K -l300 /usr/local/charniak/parser05Aug16/DATA/EN/
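If your parser lives somewhere else, the args string can be assembled from the installation directory. The sketch below assumes the same layout as the example (parseIt and DATA/EN directly under one directory). CHARNIAK_HOME and build_parser_args are illustrative names, not part of the system. The flags -K and -l300 are copied verbatim from the example; consult the parser's own documentation for their meaning.

```shell
# Hypothetical variable naming the parser's installation directory.
CHARNIAK_HOME="${CHARNIAK_HOME:-/usr/local/charniak/parser05Aug16}"

# Build the args string for parsing-service from an installation directory,
# assuming the parseIt executable and DATA/EN sit directly beneath it.
build_parser_args() {
    printf '%s/parseIt -K -l300 %s/DATA/EN/\n' "$1" "$1"
}

# Print the string to paste into the command file.
build_parser_args "$CHARNIAK_HOME"
```

With the default value of CHARNIAK_HOME, this prints the same args string as in the example.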
After editing the command file, execute the script to run the semantic analyzer:
sh run_lth_labeler.sh testrun_charniak.xml
The system will now perform semantic analysis of two files: first, a file in the FrameNet corpus file format (testtexts/pb1.xml); second, a tokenized raw text file (testtexts/test.txt). The output is in FrameNet corpus file format and ends up in the testtexts/out directory.
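To confirm that the test run produced output, you can run the analyzer and then count the files in the output directory. This is a rough sketch that assumes you are in the directory containing run_lth_labeler.sh and testrun_charniak.xml. The count_outputs helper is a hypothetical convenience, and the names of the output files are not assumed here.

```shell
# Output directory used by the test run described above.
OUT_DIR="${OUT_DIR:-testtexts/out}"

# Count regular files in a directory; prints 0 if the directory is missing.
count_outputs() {
    if [ -d "$1" ]; then
        find "$1" -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Run the analyzer only when the script and command file are present here.
if [ -f run_lth_labeler.sh ] && [ -f testrun_charniak.xml ]; then
    sh run_lth_labeler.sh testrun_charniak.xml
fi

echo "output files in $OUT_DIR: $(count_outputs "$OUT_DIR")"
```

A count of 0 after the run suggests that a path in the command file is wrong or that the parser could not be started.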
Test Run with LTH's Dependency Parser
Coming soon.