Package cc.mallet.fst
Class HMM
java.lang.Object
  cc.mallet.fst.Transducer
    cc.mallet.fst.HMM
All Implemented Interfaces:
java.io.Serializable
public class HMM extends Transducer implements java.io.Serializable
A Hidden Markov Model.
See Also:
Serialized Form
Nested Class Summary
Modifier and Type        Class
class                    HMM.Incrementor
static class             HMM.State
protected static class   HMM.TransitionIterator
class                    HMM.WeightedIncrementor
Field Summary
-
Fields inherited from class cc.mallet.fst.Transducer
CERTAIN_WEIGHT, IMPOSSIBLE_WEIGHT, inputPipe, outputPipe
-
-
Method Summary
Modifier and Type, Method, and Description
void addFullyConnectedStates(java.lang.String[] stateNames)
    Add a group of states that are fully connected with each other, with parameters equal to zero, and with the labels on their outgoing arcs the same as the names of their destination states.
void addFullyConnectedStatesForBiLabels()
void addFullyConnectedStatesForLabels()
void addFullyConnectedStatesForThreeQuarterLabels(InstanceList trainingSet)
void addFullyConnectedStatesForTriLabels()
java.lang.String addOrderNStates(InstanceList trainingSet, int[] orders, boolean[] defaults, java.lang.String start, java.util.regex.Pattern forbidden, java.util.regex.Pattern allowed, boolean fullyConnected)
    Assumes that the HMM's output alphabet contains Strings.
void addSelfTransitioningStateForAllLabels(java.lang.String name)
void addState(java.lang.String name, double initialWeight, double finalWeight, java.lang.String[] destinationNames, java.lang.String[] labelNames)
void addState(java.lang.String name, java.lang.String[] destinationNames)
    Add a state with parameters equal to zero, and with the labels on its outgoing arcs the same as the names of their destination states.
void addStatesForBiLabelsConnectedAsIn(InstanceList trainingSet)
    Add states to create a second-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
void addStatesForHalfLabelsConnectedAsIn(InstanceList trainingSet)
    Add as many states as there are labels, but don't create separate weights for each source-destination pair of states.
void addStatesForLabelsConnectedAsIn(InstanceList trainingSet)
    Add states to create a first-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
void addStatesForThreeQuarterLabelsConnectedAsIn(InstanceList trainingSet)
    Add as many states as there are labels, but don't create separate observational-test-weights for each source-destination pair of states; instead, have all the incoming transitions to a state share the same observational-feature-test weights.
void estimate()
Multinomial[] getEmissionMultinomial()
Multinomial getInitialMultinomial()
Alphabet getInputAlphabet()
Alphabet getOutputAlphabet()
Transducer.State getState(int index)
HMM.State getState(java.lang.String name)
Multinomial[] getTransitionMultinomial()
void initEmissions(java.util.Random random, double noise)
java.util.Iterator initialStateIterator()
void initTransitions(java.util.Random random, double noise)
    Separate initialization of initial/transitions and emissions.
boolean isTrainable()
int numStates()
void print()
void reset()
    Deprecated.
boolean train(InstanceList ilist)
    Trains an HMM without validation and evaluation.
boolean train(InstanceList ilist, InstanceList validation, InstanceList testing)
    Trains an HMM with evaluator set to null.
boolean train(InstanceList ilist, InstanceList validation, InstanceList testing, TransducerEvaluator eval)
void write(java.io.File f)
-
Methods inherited from class cc.mallet.fst.Transducer
averageTokenAccuracy, canIterateAllTransitions, generatePath, getInputPipe, getMaxLatticeFactory, getOutputPipe, getSumLatticeFactory, isGenerative, label, less_efficient_sumLogProb, no_longer_needed_sumNegLogProb, setMaxLatticeFactory, setSumLatticeFactory, stateIndexOfString, sumLogProb, transduce, transduce
-
-
-
-
Method Detail
-
getInputAlphabet
public Alphabet getInputAlphabet()
-
getOutputAlphabet
public Alphabet getOutputAlphabet()
-
getTransitionMultinomial
public Multinomial[] getTransitionMultinomial()
-
getEmissionMultinomial
public Multinomial[] getEmissionMultinomial()
-
getInitialMultinomial
public Multinomial getInitialMultinomial()
-
print
public void print()
Overrides: print in class Transducer
-
addState
public void addState(java.lang.String name, double initialWeight, double finalWeight, java.lang.String[] destinationNames, java.lang.String[] labelNames)
-
addState
public void addState(java.lang.String name, java.lang.String[] destinationNames)
Add a state with parameters equal to zero, and with the labels on its outgoing arcs the same as the names of their destination states.
-
addFullyConnectedStates
public void addFullyConnectedStates(java.lang.String[] stateNames)
Add a group of states that are fully connected with each other, with parameters equal to zero, and with the labels on their outgoing arcs the same as the names of their destination states.
-
addFullyConnectedStatesForLabels
public void addFullyConnectedStatesForLabels()
-
addStatesForLabelsConnectedAsIn
public void addStatesForLabelsConnectedAsIn(InstanceList trainingSet)
Add states to create a first-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
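As an illustration of "only those transitions that occur in the given trainingSet", the following self-contained sketch collects, for each label, the labels that directly follow it in some sequence. It does not use MALLET: the class `ObservedTransitions`, the method `collect`, and the plain `List<List<String>>` input are hypothetical stand-ins for an InstanceList of label sequences, and this is not the library's implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of MALLET.
class ObservedTransitions {
    // Maps each label (one state per label) to the set of labels that
    // directly follow it in at least one training sequence.
    static Map<String, Set<String>> collect(List<List<String>> labelSequences) {
        Map<String, Set<String>> next = new TreeMap<>();
        for (List<String> seq : labelSequences) {
            for (int i = 0; i < seq.size(); i++) {
                // Every label seen becomes a state, even with no outgoing arcs.
                next.computeIfAbsent(seq.get(i), k -> new TreeSet<>());
                if (i + 1 < seq.size()) {
                    next.get(seq.get(i)).add(seq.get(i + 1));
                }
            }
        }
        return next;
    }
}
```

Under these assumptions, each map key corresponds to a state name and each value to the destination names that would be passed to a call such as addState(name, destinationNames).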
-
addStatesForHalfLabelsConnectedAsIn
public void addStatesForHalfLabelsConnectedAsIn(InstanceList trainingSet)
Add as many states as there are labels, but don't create separate weights for each source-destination pair of states. Instead have all the incoming transitions to a state share the same weights.
-
addStatesForThreeQuarterLabelsConnectedAsIn
public void addStatesForThreeQuarterLabelsConnectedAsIn(InstanceList trainingSet)
Add as many states as there are labels, but don't create separate observational-test-weights for each source-destination pair of states; instead, have all the incoming transitions to a state share the same observational-feature-test weights. However, do create a separate default feature for each transition (which acts as an HMM-style transition probability).
-
addFullyConnectedStatesForThreeQuarterLabels
public void addFullyConnectedStatesForThreeQuarterLabels(InstanceList trainingSet)
-
addFullyConnectedStatesForBiLabels
public void addFullyConnectedStatesForBiLabels()
-
addStatesForBiLabelsConnectedAsIn
public void addStatesForBiLabelsConnectedAsIn(InstanceList trainingSet)
Add states to create a second-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
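A second-order model on labels can be pictured as a first-order model whose states are label pairs. The sketch below is one plausible reading, under stated assumptions: a pair state "a,b" is created for every observed bigram a -> b, and it is linked to "b,c" whenever the bigram b -> c was also observed. The comma-joined state names, the class `BiLabelStates`, and the method `build` are hypothetical; this page does not say whether HMM constructs its states exactly this way.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of MALLET.
class BiLabelStates {
    // Returns a map from pair-state name "a,b" to its destination pair states "b,c".
    static Map<String, Set<String>> build(List<List<String>> labelSequences) {
        // Step 1: record which label follows which (observed bigrams).
        Map<String, Set<String>> follows = new TreeMap<>();
        for (List<String> seq : labelSequences) {
            for (int i = 0; i + 1 < seq.size(); i++) {
                follows.computeIfAbsent(seq.get(i), k -> new TreeSet<>())
                       .add(seq.get(i + 1));
            }
        }
        // Step 2: one state per observed bigram; connect "a,b" to "b,c"
        // only if b -> c was itself observed.
        Map<String, Set<String>> states = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : follows.entrySet()) {
            for (String b : e.getValue()) {
                Set<String> dests = new TreeSet<>();
                for (String c : follows.getOrDefault(b, Collections.emptySet())) {
                    dests.add(b + "," + c);
                }
                states.put(e.getKey() + "," + b, dests);
            }
        }
        return states;
    }
}
```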
-
addFullyConnectedStatesForTriLabels
public void addFullyConnectedStatesForTriLabels()
-
addSelfTransitioningStateForAllLabels
public void addSelfTransitioningStateForAllLabels(java.lang.String name)
-
addOrderNStates
public java.lang.String addOrderNStates(InstanceList trainingSet, int[] orders, boolean[] defaults, java.lang.String start, java.util.regex.Pattern forbidden, java.util.regex.Pattern allowed, boolean fullyConnected)
Assumes that the HMM's output alphabet contains Strings. Creates an order-n HMM with input predicates and output labels given by trainingSet and order, connectivity, and weights given by the remaining arguments.
Parameters:
trainingSet - the training instances
orders - an array of increasing non-negative numbers giving the orders of the features for this HMM. The largest number n is the Markov order of the HMM. States are n-tuples of output labels. Each of the other numbers k in orders represents a weight set shared by all destination states whose last (most recent) k labels agree. If orders is null, an order-0 HMM is built.
defaults - If non-null, it must be the same length as orders, with true positions indicating that the weight set for the corresponding order contains only the weight for a default feature; otherwise, the weight set has weights for all features built from input predicates.
start - The label that represents the context of the start of a sequence. It may also be used for sequence labels.
forbidden - If non-null, specifies what pairs of successive labels are not allowed, both for constructing order-n states and for transitions. A label pair (u,v) is not allowed if u + "," + v matches forbidden.
allowed - If non-null, specifies what pairs of successive labels are allowed, both for constructing order-n states and for transitions. A label pair (u,v) is allowed only if u + "," + v matches allowed.
fullyConnected - Whether to include all allowed transitions, even those not occurring in trainingSet.
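The forbidden and allowed rules can be sketched with plain java.util.regex. The example below assumes that "matches" means a full-string match (Matcher.matches()) on the string u + "," + v; the class `LabelPairFilter` and the method `isAllowed` are hypothetical names for illustration and are not part of the HMM API.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of MALLET.
class LabelPairFilter {
    // Returns true if the successive label pair (u, v) passes both filters.
    // A null pattern means that filter is not applied.
    static boolean isAllowed(String u, String v, Pattern forbidden, Pattern allowed) {
        String pair = u + "," + v;
        // Rejected if the pair matches the forbidden pattern.
        if (forbidden != null && forbidden.matcher(pair).matches()) {
            return false;
        }
        // Rejected if an allowed pattern is given and the pair does not match it.
        if (allowed != null && !allowed.matcher(pair).matches()) {
            return false;
        }
        return true;
    }
}
```

For example, with a forbidden pattern of "O,I-.*" the pair ("O", "I-PER") would be rejected under these assumptions, while ("B-PER", "I-PER") would pass.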
-
getState
public HMM.State getState(java.lang.String name)
-
numStates
public int numStates()
Specified by: numStates in class Transducer
-
getState
public Transducer.State getState(int index)
Specified by: getState in class Transducer
-
initialStateIterator
public java.util.Iterator initialStateIterator()
Specified by: initialStateIterator in class Transducer
-
isTrainable
public boolean isTrainable()
-
reset
@Deprecated public void reset()
Deprecated.
-
initTransitions
public void initTransitions(java.util.Random random, double noise)
Separate initialization of initial/transitions and emissions. All probabilities are proportional to (1+Uniform[0,1])^noise.
Parameters:
random - Random object (if null, use a uniform distribution)
noise - Noise exponent to use. If zero, the distribution is uniform.
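The rule "proportional to (1+Uniform[0,1])^noise" can be sketched as follows. This is a minimal illustration, not the library's code: the class `NoisyInit` and the method `noisyDistribution` are hypothetical, and Random.nextDouble() draws from [0,1) rather than the closed interval. A null Random or a zero exponent both give a uniform distribution, as the parameter descriptions state.

```java
import java.util.Random;

// Hypothetical helper, not part of MALLET.
class NoisyInit {
    // Draws n weights (1 + U)^noise with U ~ Uniform[0,1) and
    // normalizes them so that they sum to 1.
    static double[] noisyDistribution(int n, Random random, double noise) {
        double[] p = new double[n];
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            // With no Random, U is fixed at 0, so every weight equals 1.
            double u = (random == null) ? 0.0 : random.nextDouble();
            p[i] = Math.pow(1.0 + u, noise);
            sum += p[i];
        }
        for (int i = 0; i < n; i++) {
            p[i] /= sum;
        }
        return p;
    }
}
```

Because each unnormalized weight lies in [1, 2^noise) for positive noise, a larger exponent lets the entries differ more from one another, while noise = 0 makes every weight exactly 1.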
-
initEmissions
public void initEmissions(java.util.Random random, double noise)
-
estimate
public void estimate()
-
train
public boolean train(InstanceList ilist)
Trains an HMM without validation and evaluation.
-
train
public boolean train(InstanceList ilist, InstanceList validation, InstanceList testing)
Trains an HMM with evaluator set to null.
-
train
public boolean train(InstanceList ilist, InstanceList validation, InstanceList testing, TransducerEvaluator eval)
-
write
public void write(java.io.File f)
-
-