Most tutorial materials on Hidden Markov models adopt the
``urns and balls'' model of the underlying process
(see for example Krenn and Samuelsson; Rabiner; and Huang, Ariki and Jack).
In the urn and ball model, we assume that there
are N large urns in a room. Each urn contains
a number of coloured balls. There are M different
colours, and the relative proportion of each colour may
differ from urn to urn. A genie randomly jumps from urn to urn,
randomly choosing a ball from each of the urns that it visits.
It shouts out the colour of the ball, then replaces it in the urn.
The observer hears the shouts but cannot see where the genie is, so
does not know which urn the genie is currently in. This is what makes
the HMM hidden. A graphical presentation of this view of an HMM
is found in figure 10.1
state       | a   | b   |
initial:    | 1.0 | 0.0 |
emission:   | 0.7 | 0.5 | emits '0'
            | 0.3 | 0.5 | emits '1'
transition: | 0.4 | 1.0 | goes to a
            | 0.6 | 0.0 | goes to b
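Since the urn-and-ball process is generative, it is easy to simulate. The following sketch (the function names and overall shape are my own illustration, not taken from any of the cited texts) encodes the parameters of the table above and lets the genie produce an observation sequence:

```python
import random

# Parameters of the two-state HMM from table 10.1.
initial    = {'a': 1.0, 'b': 0.0}           # P(first state)
transition = {'a': {'a': 0.4, 'b': 0.6},    # P(next state | state)
              'b': {'a': 1.0, 'b': 0.0}}
emission   = {'a': {'0': 0.7, '1': 0.3},    # P(symbol | state)
              'b': {'0': 0.5, '1': 0.5}}

def pick(dist):
    """Draw one key from a {key: probability} distribution."""
    return random.choices(list(dist), weights=list(dist.values()))[0]

def generate(n):
    """The genie visits n urns, shouting one ball colour per visit."""
    state, shouts = pick(initial), []
    for _ in range(n):
        shouts.append(pick(emission[state]))  # draw a ball, shout, replace it
        state = pick(transition[state])       # jump to the next urn
    return ''.join(shouts)

print(generate(10))  # e.g. '0100110100'; only the shouts are observable
```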
The formal version of this approach is to describe
an HMM as a tuple (S, K, Π, A, B) where S is the set of N states
(the urns), K is the output alphabet of M symbols (the ball colours),
Π gives the probability of starting in each state, A gives the
probability of each state-to-state transition, and B gives the
probability of each state emitting each symbol.
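As a rough illustration (the field names below are my own, not from the text), the tuple maps directly onto a record whose fields hold the quantities of table 10.1:

```python
from typing import NamedTuple

class HMM(NamedTuple):
    # Field names are illustrative; they mirror the tuple's components.
    states: tuple      # S: the states (the urns)
    alphabet: tuple    # K: the output symbols (the ball colours)
    initial: dict      # Pi: initial state probabilities
    transition: dict   # A: state transition probabilities
    emission: dict     # B: symbol emission probabilities

table_10_1 = HMM(
    states=('a', 'b'),
    alphabet=('0', '1'),
    initial={'a': 1.0, 'b': 0.0},
    transition={'a': {'a': 0.4, 'b': 0.6}, 'b': {'a': 1.0, 'b': 0.0}},
    emission={'a': {'0': 0.7, '1': 0.3}, 'b': {'0': 0.5, '1': 0.5}},
)
```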
Charniak doesn't use this model, preferring the arguably simpler model of finite state transducers, in which each arc carries both a symbol and a probability, so that a transition and an emission happen together. There is nothing wrong with this approach, but since it is non-standard, it can be hard to see how Charniak's discussion of the training and decoding algorithms maps onto the more conventional approach. The Charniak presentation is shown in figure 10.2 and table 10.2.
state       | a                | b               | next state | symbol
transition: | 0.4 * 0.7 = 0.28 | 1.0 * 0.5 = 0.5 | a          | '0'
            | 0.6 * 0.7 = 0.42 | 0.0 * 0.5 = 0.0 | b          | '0'
            | 0.4 * 0.3 = 0.12 | 1.0 * 0.5 = 0.5 | a          | '1'
            | 0.6 * 0.3 = 0.18 | 0.0 * 0.5 = 0.0 | b          | '1'