A Bigram Hidden Markov Model Part-of-Speech Tagger Using Viterbi Decoding

The HMM was trained on the WSJ corpus, so sentences far outside the WSJ domain may produce garbage output. The tokenizer splits the input on whitespace only, so punctuation attached to a word (e.g. "barks.") may not be tagged properly either.
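
For reference, below is a minimal sketch of what bigram Viterbi decoding looks like. The table names (`log_trans`, `log_emit`, `log_init`) and the use of log probabilities are assumptions for illustration, not this repo's actual API, and unseen events simply fall back to `-inf` where a real tagger would apply smoothing.

```python
def viterbi(words, tags, log_trans, log_emit, log_init):
    """Decode the most likely tag sequence for `words`.

    log_trans[prev][cur] -- log P(cur tag | prev tag)  (bigram transitions)
    log_emit[tag][word]  -- log P(word | tag)          (emissions)
    log_init[tag]        -- log P(tag at sentence start)

    Hypothetical table names; unseen events get -inf (no smoothing).
    """
    NEG_INF = float("-inf")
    # viterbi_prob[i][t]: log prob of the best path ending in tag t at word i
    viterbi_prob = [{t: NEG_INF for t in tags} for _ in words]
    backpointer = [{t: None for t in tags} for _ in words]

    # Initialization: the first word uses the start distribution.
    for t in tags:
        viterbi_prob[0][t] = (log_init.get(t, NEG_INF)
                              + log_emit.get(t, {}).get(words[0], NEG_INF))

    # Recursion: extend the best path from each previous tag.
    for i in range(1, len(words)):
        for t in tags:
            emit = log_emit.get(t, {}).get(words[i], NEG_INF)
            best_prev, best_score = None, NEG_INF
            for prev in tags:
                score = (viterbi_prob[i - 1][prev]
                         + log_trans.get(prev, {}).get(t, NEG_INF))
                if score > best_score:
                    best_prev, best_score = prev, score
            viterbi_prob[i][t] = best_score + emit
            backpointer[i][t] = best_prev

    # Termination and backtrace.
    last = max(tags, key=lambda t: viterbi_prob[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(backpointer[i][path[-1]])
    return list(reversed(path))
```

Note that `words` here would come from the naive whitespace split described above, so a token like "barks." (period attached) would reach the decoder as a single, likely unseen, word.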