On Jones, TnT is installed in /usr/local/bin (with the rest of the local programs). The models and manual are in /usr/local/share/tnt.
tnt-para training.tt
This trains a model from training.tt. It produces two files: training.lex (the lexicon) and training.123 (the n-gram counts).
tnt training test.t >test-results
This tags all the items in test.t using the model named training (that is, training.lex and training.123) and writes the result to test-results.
- tnt-diff ? ? (I haven't tried this yet)
You may need to set the environment variable TNT_MODELS. Do this by adding the following line to the .profile in your home directory:
export TNT_MODELS=/usr/local/share/tnt/models
(This is bash syntax.) Then type source ~/.profile to reload it.
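If you would rather not edit the file by hand, the following is a minimal sketch that appends the export line only when it is not already present. The function name add_tnt_models is made up for this note and is not part of TnT; it assumes the models live at the path given above.

```shell
# add_tnt_models is a hypothetical helper, not part of TnT.
# It appends the TNT_MODELS export line to a profile file
# (default: ~/.profile) unless that exact line is already there.
add_tnt_models() {
    profile="${1:-$HOME/.profile}"
    line='export TNT_MODELS=/usr/local/share/tnt/models'
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}
```

Run add_tnt_models with no arguments and then reload the profile as described above. Running it twice does not add a second copy of the line.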
The standard extensions are .tt for training data and .t for test data.
A Worked Example
- Copy ~ncsander/Documents/ci_data/basephone.tt and ~ncsander/Documents/ci_data/sgb20phone.t to a directory you own.
- tnt-para basephone.tt # used for training
- This produces basephone.lex and basephone.123
- tnt basephone sgb20phone.t >sgb20out.txt # used for tagging
- This produces sgb20out.txt, which contains the test data with associated phonemes underneath.
The results aren't very good because the test data contains many items that aren't present in the training data. TODO: The reverse is probably more interesting and more in keeping with what you'd expect TnT to do: emit phones as tags for phonemes.
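One way to try the reverse direction is to swap the two columns of the training file so that the current tags become the tokens. This is an untested sketch, not something that has been run on this data. It assumes that each data line of basephone.tt is a token followed by whitespace and a tag, and that comment lines start with %%. The helper name swap_columns and the output file name phonebase.tt are made up for this note.

```shell
# swap_columns is a hypothetical helper, not part of TnT.
# Assumes each data line is "token<whitespace>tag". Lines starting
# with %% (assumed to be comments) and lines that do not have exactly
# two fields are passed through unchanged.
swap_columns() {
    awk '/^%%/ || NF != 2 { print; next } { print $2 "\t" $1 }' "$@"
}
```

For example, swap_columns basephone.tt >phonebase.tt followed by tnt-para phonebase.tt would train the reversed model. Tagging with it would also need a test file containing the other symbol set, which the worked example does not provide.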
See also: ACOPOST for a tagger with a less restrictive licence.
