This is a re-engineered version of LTH's system that participated in the SemEval-2007 task on Frame-semantic Structure Extraction. In short, it performs semantic analysis of English text in the FrameNet paradigm.
Presently, the system needs about 2 GB of memory to run the tests. We are working on ways to compress the model files to consume less memory.
Make sure that you have a Java system (version 1.5 or newer) installed on your computer.
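The Java requirement above can be checked from the command line. The following is a minimal sketch, not part of the distribution: the function name java_version_ok is hypothetical, and the script assumes that the installed java prints its version in the usual quoted form (for example, java version "1.5.0_22") when called with -version.

```sh
#!/bin/sh
# Hypothetical helper: succeeds if the given Java version string is 1.5 or newer.
java_version_ok() {
    version=$1
    major=${version%%.*}                # text before the first dot
    case $major in
        ''|*[!0-9]*) return 2 ;;        # not a number: cannot decide
    esac
    if [ "$major" -gt 1 ]; then
        return 0                        # newer numbering scheme (9, 11, 17, ...)
    fi
    rest=${version#*.}                  # text after the first dot
    minor=${rest%%[!0-9]*}              # leading digits of the second field
    [ -n "$minor" ] && [ "$major" -eq 1 ] && [ "$minor" -ge 5 ]
}

# "java -version" writes to standard error, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
    installed=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    if java_version_ok "$installed"; then
        echo "Java $installed is new enough"
    else
        echo "Java $installed is too old or could not be parsed"
    fi
else
    echo "No java executable found on PATH"
fi
```

If the script reports that no version could be parsed, run java -version by hand and compare the output with the requirement.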
You also need to install FrameNet. Specifically, you need two files:
Download this archive and decompress it.
Test Run with Charniak's Parser
Make sure that Charniak's parser is installed on your computer.
Download this model file.
Then open the command file in an editor. Fill in the paths to the following files:
- The FrameNet files:
- The morphological database (
lemma-file) and (optional) a list of lexical units not listed in FrameNet (
- The model file (
- Charniak's parser and parsing data directory. For instance, if the
parser executable is
/usr/local/charniak/parser05Aug16/parseIt and the parsing data directory is
/usr/local/charniak/parser05Aug16/DATA/EN, then the corresponding entry in the command file should be
/usr/local/charniak/parser05Aug16/parseIt -K -l300 /usr/local/charniak/parser05Aug16/DATA/EN/
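A wrong path in the parser entry is easy to overlook, so it may help to build the command string with a small script that first checks that both paths exist. This is only a sketch: the function name charniak_command is hypothetical, and the options -K and -l300 are copied unchanged from the example above.

```sh
#!/bin/sh
# Hypothetical helper: print the parser command for the command file,
# after checking that the executable and the data directory exist.
charniak_command() {
    parser=$1
    data_dir=${2%/}                     # drop one trailing slash so it is not doubled
    if [ ! -x "$parser" ]; then
        echo "not an executable file: $parser" >&2
        return 1
    fi
    if [ ! -d "$data_dir" ]; then
        echo "no such directory: $data_dir" >&2
        return 1
    fi
    # Same layout as in the example: executable, options, data directory with trailing slash.
    printf '%s -K -l300 %s/\n' "$parser" "$data_dir"
}

# Example call with the paths used in the text; adjust them to your installation.
charniak_command /usr/local/charniak/parser05Aug16/parseIt \
                 /usr/local/charniak/parser05Aug16/DATA/EN \
    || echo "Adjust the two paths to match your installation." >&2
```

The printed line can then be pasted into the command file.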
After editing the command file, execute the script to run the test:
sh run_lth_labeler.sh testrun_charniak.xml
The system will now perform semantic analysis of two files: first,
a file in the FrameNet corpus file format (
secondly, a tokenized raw text file (testtexts/test.txt). The output is in FrameNet corpus
file format and ends up in the
Test Run with LTH's Dependency Parser