Thursday, May 23, 2013

An easy way to evaluate machine translation using multiple metrics

Kenneth Heafield's scripts make it easy to score machine translation output with several metrics at once: NIST's BLEU and NIST, TER, and METEOR.

Prerequisites: bash, ruby, java
CAUTION: you have to run "export LC_ALL=" (i.e. set LC_ALL to empty) for the scripts to work; "export LC_ALL=C" will make them crash.
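
A small guard like the following can catch the locale problem before scoring starts. This is only a sketch: check_locale is a hypothetical helper name, not part of Heafield's scripts, and it assumes that an unset LC_ALL behaves the same as an empty one.

```shell
# check_locale (hypothetical helper, not part of the scoring scripts):
# returns 0 if LC_ALL is empty or unset, 1 otherwise.
# Assumption: an unset LC_ALL is as safe as an empty one.
check_locale() {
  if [ -n "${LC_ALL:-}" ]; then
    echo "LC_ALL is set to '${LC_ALL}'; clear it with: export LC_ALL=" >&2
    return 1
  fi
  return 0
}

# Clear LC_ALL as the caution above requires, then verify.
export LC_ALL=
check_locale && echo "locale ok"
```

Calling check_locale at the top of a wrapper script makes it stop with a clear message instead of crashing somewhere inside the scorers.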

Setup: run the ./setup.sh script, which automatically downloads the necessary parts from the Internet.

Running:
./score.rb --print --print-header --ref tokenized-reference.txt --hyp-detok tokenized-system-outputs.txt