
Download the program here. Unzip the package.
Enter the lth_srl directory.
First, you need an input file in the CoNLL-2008 format. The package
includes a script, scripts/preprocess.sh, that tokenizes the text,
adds part-of-speech tags, and finds lemmas. For instance, you can
download this text file and apply the preprocessing script:
sh scripts/preprocess.sh < test.txt > test.tokens
If you prefer to use your own tokenizer or part-of-speech tagger, you have to prepare the CoNLL-2008 format on your own. In this case, don't forget to set the lemma column, at least for the predicates you are interested in.
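If you build the input yourself, a quick check for missing lemmas can save a failed run. The following is a minimal sketch, not part of the package: check_lemmas is a hypothetical helper, and it assumes a tab-separated file with ID, FORM, and LEMMA as the first three columns and "_" as the empty-value placeholder, as in the CoNLL-2008 shared task layout. Verify this against the output of scripts/preprocess.sh before relying on it.

```shell
# Hypothetical helper (not part of lth_srl): report tokens without a lemma.
# Assumption: tab-separated columns, with ID, FORM, LEMMA as fields 1-3,
# and "_" used as the placeholder for an empty value.
check_lemmas() {
    awk -F'\t' '
        # Skip the blank lines that separate sentences.
        NF > 0 && ($3 == "" || $3 == "_") {
            printf "line %d: token \"%s\" has no lemma\n", NR, $2
            missing++
        }
        # Exit status 1 if at least one lemma is missing, 0 otherwise.
        END { exit (missing > 0) }
    ' "$1"
}
```

Calling check_lemmas on your own file prints one line per token without a lemma and returns a non-zero status if any are found. Tokens that are not predicates of interest may legitimately be reported.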
To run the full syntactic–semantic analyzer, use the script
scripts/run.sh:
sh scripts/run.sh < test.tokens > test.output
You might need to increase the heap size declared in
run.sh if you use a 64-bit machine.
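To find the heap setting, you can search the script for it. The sketch below assumes that run.sh starts a Java virtual machine and sets the maximum heap with the standard -Xmx option; the page does not say so explicitly, so check the script itself. show_heap_setting is a hypothetical helper name.

```shell
# Hypothetical helper: print the lines of a launcher script that set
# the maximum JVM heap size.
# Assumption: the script passes the standard -Xmx option to java.
show_heap_setting() {
    grep -n -e '-Xmx[0-9][0-9]*[kKmMgG]*' "$1"
}
```

Running show_heap_setting scripts/run.sh from the lth_srl directory lists the matching lines with their line numbers. You can then edit the value after -Xmx by hand.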
The script scripts/run.sh runs the full system: the second-order
dependency parser, linguistic constraints, semantic reranking, and
syntactic–semantic integration. To save time and memory, the
system can also be run in simpler configurations by using one of
the following scripts:
scripts/run_constraints.sh runs a simplified system:
the second-order dependency parser, linguistic constraints, but no
reranking, and no syntactic–semantic integration.
scripts/run_greedy.sh is even simpler:
the second-order dependency parser only, with no constraints, no
reranking, and no syntactic–semantic integration.
scripts/run_constraints_fast.sh and
scripts/run_greedy_fast.sh use a first-order parser
(which runs in O(n^3))
instead of the second-order parser (O(n^4)).
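To compare the configurations on the same input, you can run them one after another. This is a minimal sketch: run_all_configs is a hypothetical helper, and the output naming scheme (test.run.output, test.run_greedy.output, and so on) is an illustrative choice, not something the package prescribes. It assumes every script reads standard input and writes standard output, as scripts/run.sh does.

```shell
# Hypothetical helper: run every configuration on the same input file
# and write one output file per configuration.
#   $1  input file in CoNLL-2008 format, e.g. test.tokens
#   $2  directory containing the scripts (default: scripts)
run_all_configs() {
    input=$1
    scripts_dir=${2:-scripts}
    for cfg in run run_constraints run_greedy \
               run_constraints_fast run_greedy_fast; do
        # test.tokens -> test.<configuration>.output
        sh "$scripts_dir/$cfg.sh" < "$input" \
            > "${input%.tokens}.$cfg.output" || return 1
    done
}
```

For example, run_all_configs test.tokens from the lth_srl directory produces five output files. The full system is the slowest of the five, so it can be worth trying the simpler configurations first.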
The output of all these scripts is in the CoNLL-2008 format.
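Because the output is column-based, standard text tools are enough for a first look at the result. The sketch below lists each predicate with its word form. It assumes the CoNLL-2008 shared task column layout, with tab-separated fields, FORM in column 2, PRED in column 11, and "_" for tokens that are not predicates. list_predicates is a hypothetical helper, so check the column positions against your own output file.

```shell
# Hypothetical helper: print the word form and predicate label of every
# predicate in a CoNLL-2008 file.
# Assumption: tab-separated columns, FORM in field 2, PRED in field 11,
# and "_" marking tokens that are not predicates.
list_predicates() {
    awk -F'\t' 'NF >= 11 && $11 != "_" { print $2 "\t" $11 }' "$1"
}
```

Running list_predicates test.output prints one line per predicate, with the form and the predicate label separated by a tab.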
Page Manager: Pierre Nugues
Last updated: 2009-09-09