Package cc.mallet.fst
Class HMM
- java.lang.Object
  - cc.mallet.fst.Transducer
    - cc.mallet.fst.HMM
- All Implemented Interfaces:
java.io.Serializable
public class HMM extends Transducer implements java.io.Serializable
A Hidden Markov Model.
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class)
- class HMM.Incrementor
- static class HMM.State
- protected static class HMM.TransitionIterator
- class HMM.WeightedIncrementor
-
Field Summary
-
Fields inherited from class cc.mallet.fst.Transducer
CERTAIN_WEIGHT, IMPOSSIBLE_WEIGHT, inputPipe, outputPipe
-
-
Method Summary
Methods (Modifier and Type, Method, Description)
- void addFullyConnectedStates(java.lang.String[] stateNames): Add a group of states that are fully connected with each other, with parameters equal to zero and with the labels on their out-going arcs the same as their destination state names.
- void addFullyConnectedStatesForBiLabels()
- void addFullyConnectedStatesForLabels()
- void addFullyConnectedStatesForThreeQuarterLabels(InstanceList trainingSet)
- void addFullyConnectedStatesForTriLabels()
- java.lang.String addOrderNStates(InstanceList trainingSet, int[] orders, boolean[] defaults, java.lang.String start, java.util.regex.Pattern forbidden, java.util.regex.Pattern allowed, boolean fullyConnected): Assumes that the HMM's output alphabet contains Strings.
- void addSelfTransitioningStateForAllLabels(java.lang.String name)
- void addState(java.lang.String name, double initialWeight, double finalWeight, java.lang.String[] destinationNames, java.lang.String[] labelNames)
- void addState(java.lang.String name, java.lang.String[] destinationNames): Add a state with parameters equal to zero and with the labels on its out-going arcs the same as their destination state names.
- void addStatesForBiLabelsConnectedAsIn(InstanceList trainingSet): Add states to create a second-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
- void addStatesForHalfLabelsConnectedAsIn(InstanceList trainingSet): Add as many states as there are labels, but don't create separate weights for each source-destination pair of states.
- void addStatesForLabelsConnectedAsIn(InstanceList trainingSet): Add states to create a first-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
- void addStatesForThreeQuarterLabelsConnectedAsIn(InstanceList trainingSet): Add as many states as there are labels, but don't create separate observational-test weights for each source-destination pair of states; instead, have all the incoming transitions to a state share the same observational-feature-test weights.
- void estimate()
- Multinomial[] getEmissionMultinomial()
- Multinomial getInitialMultinomial()
- Alphabet getInputAlphabet()
- Alphabet getOutputAlphabet()
- Transducer.State getState(int index)
- HMM.State getState(java.lang.String name)
- Multinomial[] getTransitionMultinomial()
- void initEmissions(java.util.Random random, double noise)
- java.util.Iterator initialStateIterator()
- void initTransitions(java.util.Random random, double noise): Separate initialization of initial/transitions and emissions.
- boolean isTrainable()
- int numStates()
- void print()
- void reset(): Deprecated.
- boolean train(InstanceList ilist): Trains an HMM without validation and evaluation.
- boolean train(InstanceList ilist, InstanceList validation, InstanceList testing): Trains an HMM with evaluator set to null.
- boolean train(InstanceList ilist, InstanceList validation, InstanceList testing, TransducerEvaluator eval)
- void write(java.io.File f)
-
Methods inherited from class cc.mallet.fst.Transducer
averageTokenAccuracy, canIterateAllTransitions, generatePath, getInputPipe, getMaxLatticeFactory, getOutputPipe, getSumLatticeFactory, isGenerative, label, less_efficient_sumLogProb, no_longer_needed_sumNegLogProb, setMaxLatticeFactory, setSumLatticeFactory, stateIndexOfString, sumLogProb, transduce, transduce
-
-
-
-
Method Detail
-
getInputAlphabet
public Alphabet getInputAlphabet()
-
getOutputAlphabet
public Alphabet getOutputAlphabet()
-
getTransitionMultinomial
public Multinomial[] getTransitionMultinomial()
-
getEmissionMultinomial
public Multinomial[] getEmissionMultinomial()
-
getInitialMultinomial
public Multinomial getInitialMultinomial()
-
print
public void print()
- Overrides:
print in class Transducer
-
addState
public void addState(java.lang.String name, double initialWeight, double finalWeight, java.lang.String[] destinationNames, java.lang.String[] labelNames)
-
addState
public void addState(java.lang.String name, java.lang.String[] destinationNames)
Add a state with parameters equal to zero and with the labels on its out-going arcs the same as their destination state names.
-
addFullyConnectedStates
public void addFullyConnectedStates(java.lang.String[] stateNames)
Add a group of states that are fully connected with each other, with parameters equal to zero and with the labels on their out-going arcs the same as their destination state names.
-
addFullyConnectedStatesForLabels
public void addFullyConnectedStatesForLabels()
-
addStatesForLabelsConnectedAsIn
public void addStatesForLabelsConnectedAsIn(InstanceList trainingSet)
Add states to create a first-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
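The idea of keeping only the transitions seen in the training data can be sketched without MALLET. The class below is a hypothetical illustration, not the library's implementation: plain lists of label strings stand in for an InstanceList, and the name ObservedTransitions is invented for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: plain collections stand in for MALLET's InstanceList and HMM.State.
final class ObservedTransitions {
    // Maps each label to the set of labels that directly follow it in at least one sequence.
    static Map<String, Set<String>> collect(List<List<String>> labelSequences) {
        Map<String, Set<String>> successors = new TreeMap<>();
        for (List<String> sequence : labelSequences) {
            for (int i = 0; i < sequence.size(); i++) {
                // Every label becomes a state, even if it never has an observed successor.
                Set<String> next = successors.computeIfAbsent(sequence.get(i), k -> new TreeSet<>());
                if (i + 1 < sequence.size()) {
                    next.add(sequence.get(i + 1));
                }
            }
        }
        return successors;
    }
}
```

Each key of the returned map corresponds to one state, and its value lists the destination states that state would be connected to. A fully connected model would instead map every label to the full label set.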
-
addStatesForHalfLabelsConnectedAsIn
public void addStatesForHalfLabelsConnectedAsIn(InstanceList trainingSet)
Add as many states as there are labels, but don't create separate weights for each source-destination pair of states. Instead have all the incoming transitions to a state share the same weights.
-
addStatesForThreeQuarterLabelsConnectedAsIn
public void addStatesForThreeQuarterLabelsConnectedAsIn(InstanceList trainingSet)
Add as many states as there are labels, but don't create separate observational-test weights for each source-destination pair of states; instead, have all the incoming transitions to a state share the same observational-feature-test weights. However, do create a separate default feature for each transition, which acts as an HMM-style transition probability.
-
addFullyConnectedStatesForThreeQuarterLabels
public void addFullyConnectedStatesForThreeQuarterLabels(InstanceList trainingSet)
-
addFullyConnectedStatesForBiLabels
public void addFullyConnectedStatesForBiLabels()
-
addStatesForBiLabelsConnectedAsIn
public void addStatesForBiLabelsConnectedAsIn(InstanceList trainingSet)
Add states to create a second-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
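In a second-order model each state remembers the two most recent labels, so a transition exists only where a label trigram was observed. The following is a minimal sketch under stated assumptions: the class BiLabelTransitions is hypothetical, lists of strings stand in for an InstanceList, and joining the two labels with "," to form a state name is an assumption made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: not MALLET's implementation.
final class BiLabelTransitions {
    // State name for a pair of consecutive labels; the "," separator is an assumption.
    static String stateName(String previous, String current) {
        return previous + "," + current;
    }

    // For every observed label trigram (a, b, c), records a transition from state "a,b" to state "b,c".
    static Map<String, Set<String>> collect(List<List<String>> labelSequences) {
        Map<String, Set<String>> successors = new TreeMap<>();
        for (List<String> sequence : labelSequences) {
            for (int i = 0; i + 2 < sequence.size(); i++) {
                String source = stateName(sequence.get(i), sequence.get(i + 1));
                String destination = stateName(sequence.get(i + 1), sequence.get(i + 2));
                successors.computeIfAbsent(source, k -> new TreeSet<>()).add(destination);
            }
        }
        return successors;
    }
}
```

Note that the destination state always begins with the label the source state ends with, which is what makes the chain second-order over labels while remaining first-order over states.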
-
addFullyConnectedStatesForTriLabels
public void addFullyConnectedStatesForTriLabels()
-
addSelfTransitioningStateForAllLabels
public void addSelfTransitioningStateForAllLabels(java.lang.String name)
-
addOrderNStates
public java.lang.String addOrderNStates(InstanceList trainingSet, int[] orders, boolean[] defaults, java.lang.String start, java.util.regex.Pattern forbidden, java.util.regex.Pattern allowed, boolean fullyConnected)
Assumes that the HMM's output alphabet contains Strings. Creates an order-n HMM with input predicates and output labels given by trainingSet and order, connectivity, and weights given by the remaining arguments.
- Parameters:
trainingSet - the training instances
orders - an array of increasing non-negative numbers giving the orders of the features for this HMM. The largest number n is the Markov order of the HMM. States are n-tuples of output labels. Each of the other numbers k in orders represents a weight set shared by all destination states whose last (most recent) k labels agree. If orders is null, an order-0 HMM is built.
defaults - If non-null, it must be the same length as orders, with true positions indicating that the weight set for the corresponding order contains only the weight for a default feature; otherwise, the weight set has weights for all features built from input predicates.
start - The label that represents the context of the start of a sequence. It may also be used for sequence labels.
forbidden - If non-null, specifies what pairs of successive labels are not allowed, both for constructing n-order states and for transitions. A label pair (u,v) is not allowed if u + "," + v matches forbidden.
allowed - If non-null, specifies what pairs of successive labels are allowed, both for constructing n-order states and for transitions. A label pair (u,v) is allowed only if u + "," + v matches allowed.
fullyConnected - Whether to include all allowed transitions, even those not occurring in trainingSet,
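The forbidden and allowed rules above can be read as a single predicate over label pairs. The sketch below is a hypothetical helper (LabelPairFilter is not part of MALLET) and assumes that "matches" means a full-string match, as performed by java.util.regex.Matcher.matches().

```java
import java.util.regex.Pattern;

// Illustrative only: mirrors the documented rule, not MALLET's source code.
final class LabelPairFilter {
    // Returns true if label v may follow label u under the two optional patterns.
    static boolean isAllowed(String u, String v, Pattern forbidden, Pattern allowed) {
        String pair = u + "," + v;
        // A non-null forbidden pattern vetoes any pair it matches in full.
        if (forbidden != null && forbidden.matcher(pair).matches()) {
            return false;
        }
        // A non-null allowed pattern must match in full for the pair to pass.
        if (allowed != null && !allowed.matcher(pair).matches()) {
            return false;
        }
        return true;
    }
}
```

For example, in a BIO tagging scheme one might pass Pattern.compile("O,I") as forbidden so that an inside tag can never directly follow an outside tag. With both patterns null, every pair is permitted.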
-
getState
public HMM.State getState(java.lang.String name)
-
numStates
public int numStates()
- Specified by:
numStates in class Transducer
-
getState
public Transducer.State getState(int index)
- Specified by:
getState in class Transducer
-
initialStateIterator
public java.util.Iterator initialStateIterator()
- Specified by:
initialStateIterator in class Transducer
-
isTrainable
public boolean isTrainable()
-
reset
@Deprecated public void reset()
Deprecated.
-
initTransitions
public void initTransitions(java.util.Random random, double noise)
Separate initialization of initial/transitions and emissions. All probabilities are proportional to (1+Uniform[0,1])^noise.
- Parameters:
random - Random object (if null use uniform distribution)
noise - Noise exponent to use. If zero, then uniform distribution.
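The initialization rule can be illustrated with a small standalone sketch. NoisyMultinomialInit is a hypothetical class written for this example; it draws one weight per outcome as (1 + U)^noise with U uniform on [0, 1), then normalizes. Treating a null Random as U = 0 is an assumption chosen so that the result is uniform, as the parameter description states.

```java
import java.util.Random;

// Illustrative only: a reading of the documented formula, not MALLET's source code.
final class NoisyMultinomialInit {
    // Returns `size` probabilities proportional to (1 + U)^noise, normalized to sum to 1.
    static double[] init(int size, Random random, double noise) {
        double[] probabilities = new double[size];
        double sum = 0.0;
        for (int i = 0; i < size; i++) {
            // Assumption: a null Random means no perturbation, which yields a uniform result.
            double u = (random == null) ? 0.0 : random.nextDouble();
            probabilities[i] = Math.pow(1.0 + u, noise);
            sum += probabilities[i];
        }
        for (int i = 0; i < size; i++) {
            probabilities[i] /= sum;
        }
        return probabilities;
    }
}
```

Because each unnormalized weight lies between 1 and 2^noise, the noise exponent bounds how far any two probabilities can differ: with noise = 1 no outcome is more than twice as likely as another, and with noise = 0 all outcomes are equally likely.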
-
initEmissions
public void initEmissions(java.util.Random random, double noise)
-
estimate
public void estimate()
-
train
public boolean train(InstanceList ilist)
Trains an HMM without validation and evaluation.
-
train
public boolean train(InstanceList ilist, InstanceList validation, InstanceList testing)
Trains an HMM with evaluator set to null.
-
train
public boolean train(InstanceList ilist, InstanceList validation, InstanceList testing, TransducerEvaluator eval)
-
write
public void write(java.io.File f)
-
-