# Hidden Markov Models, how to train using Baum-Welch?

TL;DR: My uni is messing up my life with a horribly disorganized AI class with unhelpful teachers. Just want some confirmation that I’m on the right track. See my questions at the end.

I’ve been really busy with school lately because my AI class is wrapping up. The class is a clusterf*ck. The online resources are lacking at best and some of the teachers are the worst kind. One of them was asked a question about a project assignment and responded with “I wrote that project description and frankly I’m offended by that question.” and then ignored the student. The same teacher forced me to come in to school on a day I had no classes just so he would believe that I didn’t get an email with some instructions.

Our final assignment is to write an automatic player for a small game they designed. In addition to being vastly undocumented and using very confusing concepts (objects move in directions but don’t have positions, what?), the entire assignment reeks of “use this screwdriver to hammer in this nail”: it could be solved much more easily using plain Markov chains, but no, we have to use Hidden Markov Models. So here I am, sitting with a completely undocumented HMM implementation they’ve supplied, a half-described game concept, and a VERY high-level explanation of HMMs and the Baum-Welch algorithm. It’s not entirely about game development, but since this is technically a game AI and I REALLY just want this class to be over, I’m asking here anyway.

The entire problem is taking in a stream of emissions/observations and predicting the next emission. The underlying states are irrelevant. A correct guess ends the game and gives you 1 point, while every wrong guess makes you lose 1 point (you don’t have to guess if you’re not certain). So I basically have a stream of emissions that I want to train my HMM on, and then compute a probability distribution over the next emission. On top of that, the code skeleton we’re forced to use isn’t compatible with a debugger, hell, not even with a frigging IDE, so debugging is a huge mess. I have two questions:

• Baum-Welch seems to suck. If I repeatedly train my HMM on the entire sequence of emissions/observations, it eventually degenerates into an HMM filled with 0s. The scarce information I’ve found says this is due to overfitting. The Baum-Welch implementation was supplied by them, and despite other people complaining about the same problem, the teachers insist that we’re “using it wrong” (again, completely undocumented). What is the right way to train an HMM given a stream of data?
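(For what it’s worth: an HMM “filling up with 0s” is at least as often numerical underflow as it is overfitting — the raw forward/backward probabilities of a long sequence shrink toward zero exponentially fast, and an implementation without per-step scaling (or log-space math) will silently collapse. Here’s a minimal sketch of a *scaled* forward pass, the standard fix; the 2-state transition matrix `A`, emission matrix `B`, and initial distribution `pi` are made-up toy values, not anything from your skeleton:)

```python
import numpy as np

# Hypothetical toy HMM: 2 hidden states, 2 emission symbols (made-up numbers).
A = np.array([[0.7, 0.3],      # A[i, j] = P(state j | state i)
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],      # B[i, k] = P(emission k | state i)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])      # initial state distribution

def forward_scaled(obs):
    """Forward pass with per-step scaling.

    Returns the scaled alpha matrix (each row sums to 1) and the scaling
    factors c. log P(obs) = sum(log(c)) — finite even for long sequences,
    where the unscaled alphas would underflow to exact zeros.
    """
    alpha = pi * B[:, obs[0]]
    c = [alpha.sum()]
    alpha = alpha / c[0]
    alphas = [alpha]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # one unscaled forward step
        c.append(alpha.sum())           # scaling factor for this step
        alpha = alpha / c[-1]           # renormalize so it never underflows
        alphas.append(alpha)
    return np.array(alphas), np.array(c)

# 200 observations: long enough that an unscaled forward pass hits ~1e-200.
obs = [0, 0, 1, 0] * 50
alphas, c = forward_scaled(obs)
log_prob = np.log(c).sum()
print(log_prob)
```

The same scaling factors plug straight into the backward pass and the Baum-Welch re-estimation formulas, so if the supplied implementation skips them, that would explain the zeros regardless of how you call it.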

• Given the sequence of emissions [a, b, c], to calculate the probability of each possible next emission I compute the probability of the sequence [a, b, c] occurring under the current HMM, and then the probability of [a, b, c, X] occurring, where X is each possible emission. The conditional probability of each emission X is then
p([a, b, c, X]) / p([a, b, c])
Is this the correct way of calculating the probability of the next emission?