java.lang.Object
- cc.mallet.fst.TransducerTrainer
- - cc.mallet.fst.CRFTrainerByLabelLikelihood

All Implemented Interfaces:

TransducerTrainer.ByOptimization

Direct Known Subclasses:

CRFTrainerByL1LabelLikelihood
```
public class CRFTrainerByLabelLikelihood
extends TransducerTrainer
implements TransducerTrainer.ByOptimization
```
Unlike a ClassifierTrainer, a TransducerTrainer is not "stateless" between calls to train. A TransducerTrainer is constructed with a specific Transducer and can train only that Transducer. CRF stores, and has methods for, FeatureSelection and weight freezing. CRFTrainer stores, and has methods for determining, the contents, dimensions, sparsity, and FeatureInduction of the CRF's weights as determined by the training data.
Note: In the future this class may go away in favor of some default version of CRFTrainerByValueGradients.

Nested Class Summary
- Nested classes/interfaces inherited from class cc.mallet.fst.TransducerTrainer
  TransducerTrainer.ByIncrements, TransducerTrainer.ByInstanceIncrements, TransducerTrainer.ByOptimization

Field Summary

Fields
Modifier and Type	Field	Description
`boolean`	`printGradient`

Constructor Summary

Constructors
Constructor	Description
`CRFTrainerByLabelLikelihood(CRF crf)`

Method Summary

Modifier and Type	Method	Description
`CRF`	`getCRF()`
`double`	`getGaussianPriorVariance()`
`int`	`getIteration()`
`CRFOptimizableByLabelLikelihood`	`getOptimizableCRF(InstanceList trainingSet)`
`Optimizer`	`getOptimizer()`
`Optimizer`	`getOptimizer(InstanceList trainingSet)`
`Transducer`	`getTransducer()`
`double`	`getUseHyperbolicPriorSharpness()`
`double`	`getUseHyperbolicPriorSlope()`
`boolean`	`getUseSparseWeights()`
`boolean`	`isConverged()`
`boolean`	`isFinishedTraining()`
`void`	`setAddNoFactors(boolean flag)`	Use this method to specify whether or not factors are added to the CRF by this trainer.
`void`	`setGaussianPriorVariance(double p)`
`void`	`setHyperbolicPriorSharpness(double p)`
`void`	`setHyperbolicPriorSlope(double p)`
`void`	`setUseHyperbolicPrior(boolean f)`
`void`	`setUseSomeUnsupportedTrick(boolean b)`	Sets whether to use the 'some unsupported trick.' This trick is, if training a CRF where some training has been done and sparse weights are used, to add a few weights for features that do not occur in the training data.
`void`	`setUseSparseWeights(boolean b)`
`boolean`	`train(InstanceList trainingSet, int numIterations)`	Train the transducer associated with this TransducerTrainer.
`boolean`	`train(InstanceList training, int numIterationsPerProportion, double[] trainingProportions)`	Train a CRF on various-sized subsets of the data.
`boolean`	`trainIncremental(InstanceList training)`
`boolean`	`trainWithFeatureInduction(InstanceList trainingData, InstanceList validationData, InstanceList testingData, TransducerEvaluator eval, int numIterations, int numIterationsBetweenFeatureInductions, int numFeatureInductions, int numFeaturesPerFeatureInduction, double trueLabelProbThreshold, boolean clusteredFeatureInduction, double[] trainingProportions)`
`boolean`	`trainWithFeatureInduction(InstanceList trainingData, InstanceList validationData, InstanceList testingData, TransducerEvaluator eval, int numIterations, int numIterationsBetweenFeatureInductions, int numFeatureInductions, int numFeaturesPerFeatureInduction, double trueLabelProbThreshold, boolean clusteredFeatureInduction, double[] trainingProportions, java.lang.String gainName)`	Train a CRF using feature induction to generate conjunctions of features.

Methods inherited from class cc.mallet.fst.TransducerTrainer
addEvaluator, addEvaluators, removeEvaluator, runEvaluators, train

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - printGradient
```
public boolean printGradient
```
- Constructor Detail
  - CRFTrainerByLabelLikelihood
```
public CRFTrainerByLabelLikelihood(CRF crf)
```
- Method Detail
  - getTransducer
```
public Transducer getTransducer()
```
    Specified by:
    
    getTransducer in class TransducerTrainer
  - getCRF
```
public CRF getCRF()
```
  - getOptimizer
```
public Optimizer getOptimizer()
```
    Specified by:
    
    getOptimizer in interface TransducerTrainer.ByOptimization
  - isConverged
```
public boolean isConverged()
```
  - isFinishedTraining
```
public boolean isFinishedTraining()
```
    Specified by:
    
    isFinishedTraining in class TransducerTrainer
  - getIteration
```
public int getIteration()
```
    Specified by:
    
    getIteration in class TransducerTrainer
  - setAddNoFactors
```
public void setAddNoFactors(boolean flag)
```
    Use this method to specify whether or not factors are added to the CRF by this trainer. If you have already setup the factors in your CRF, you may not want the trainer to add additional factors.
    
    Parameters:
    
    flag - If true, this trainer adds no factors to the CRF.
  - getOptimizableCRF
```
public CRFOptimizableByLabelLikelihood getOptimizableCRF(InstanceList trainingSet)
```
  - getOptimizer
```
public Optimizer getOptimizer(InstanceList trainingSet)
```
  - trainIncremental
```
public boolean trainIncremental(InstanceList training)
```
  - train
```
public boolean train(InstanceList trainingSet,
                     int numIterations)
```
    Description copied from class: TransducerTrainer
    
    Train the transducer associated with this TransducerTrainer. You should be able to call this method with different trainingSet objects. Whether this causes the TransducerTrainer to combine both trainingSets or to view the second as a new alternative is at the discretion of the particular TransducerTrainer subclass involved.
    
    Specified by:
    
    train in class TransducerTrainer
  - train
```
public boolean train(InstanceList training,
                     int numIterationsPerProportion,
                     double[] trainingProportions)
```
    Train a CRF on various-sized subsets of the data. This method is typically used to accelerate training by quickly getting to reasonable parameters on only a subset of the data first, then on progressively more data.
    
    Parameters:
    
    training - The training Instances.
    
    numIterationsPerProportion - Maximum number of Maximizer iterations per training proportion.
    
    trainingProportions - Train on increasingly larger portions of the data, e.g. new double[] {0.2, 0.5, 1.0}. This can sometimes speed up convergence, similar to SGD. Be sure to end in 1.0 if you want to train on all the data in the end.
    
    Returns:
    
    True if training has converged.
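
    As a rough illustration of the trainingProportions schedule described above, the sketch below computes how many instances each stage would see. The class ProportionSchedule and the method subsetSizes are hypothetical names introduced only for this example, and rounding to the nearest integer is an assumption; this page does not say how the trainer actually selects or sizes each subset.

```java
import java.util.Arrays;

public class ProportionSchedule {

    // Hypothetical helper (not part of MALLET): for each proportion in the
    // schedule, return the number of instances that stage would train on.
    // Rounding to the nearest integer is an assumption made for illustration.
    public static int[] subsetSizes(int numInstances, double[] trainingProportions) {
        int[] sizes = new int[trainingProportions.length];
        for (int i = 0; i < trainingProportions.length; i++) {
            double p = trainingProportions[i];
            if (p <= 0.0 || p > 1.0) {
                throw new IllegalArgumentException("proportion must be in (0, 1]: " + p);
            }
            sizes[i] = (int) Math.round(p * numInstances);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // The schedule from the parameter description; it ends in 1.0 so the
        // final stage uses all of the data.
        double[] schedule = new double[] {0.2, 0.5, 1.0};
        System.out.println(Arrays.toString(subsetSizes(1000, schedule)));
        // prints [200, 500, 1000]
    }
}
```

    With numIterationsPerProportion set to n, a three-stage schedule like this one allows at most 3 * n optimizer iterations in total.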
  - trainWithFeatureInduction
```
public boolean trainWithFeatureInduction(InstanceList trainingData,
                                         InstanceList validationData,
                                         InstanceList testingData,
                                         TransducerEvaluator eval,
                                         int numIterations,
                                         int numIterationsBetweenFeatureInductions,
                                         int numFeatureInductions,
                                         int numFeaturesPerFeatureInduction,
                                         double trueLabelProbThreshold,
                                         boolean clusteredFeatureInduction,
                                         double[] trainingProportions)
```
  - trainWithFeatureInduction
```
public boolean trainWithFeatureInduction(InstanceList trainingData,
                                         InstanceList validationData,
                                         InstanceList testingData,
                                         TransducerEvaluator eval,
                                         int numIterations,
                                         int numIterationsBetweenFeatureInductions,
                                         int numFeatureInductions,
                                         int numFeaturesPerFeatureInduction,
                                         double trueLabelProbThreshold,
                                         boolean clusteredFeatureInduction,
                                         double[] trainingProportions,
                                         java.lang.String gainName)
```
    Train a CRF using feature induction to generate conjunctions of features. Feature induction is run periodically during training. The features are added to improve performance on the mislabeled instances, with the specific scoring criterion given by the FeatureInducer specified by gainName.
    
    Parameters:
    
    trainingData - The training Instances.
    
    validationData - The validation Instances.
    
    testingData - The testing Instances.
    
    eval - For evaluation during training.
    
    numIterations - Maximum number of Maximizer iterations.
    
    numIterationsBetweenFeatureInductions - Number of maximizer iterations between each call to the Feature Inducer.
    
    numFeatureInductions - Maximum number of rounds of feature induction.
    
    numFeaturesPerFeatureInduction - Maximum number of features to induce at each round of induction.
    
    trueLabelProbThreshold - If the model's probability of the true Label of an Instance is less than this value, it is added as an error instance to the FeatureInducer.
    
    clusteredFeatureInduction - If true, a separate FeatureInducer is constructed for each label pair. This can avoid inducing a disproportionate number of features for a single label.
    
    trainingProportions - If non-null, train on increasingly larger portions of the data (e.g. [0.2, 0.5, 1.0]). This can sometimes speed up convergence.
    
    gainName - The type of FeatureInducer to use. One of "exp", "grad", or "info" for ExpGain, GradientGain, or InfoGain.
    
    Returns:
    
    True if training has converged.
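
    The role of trueLabelProbThreshold described above can be sketched as a simple filter. The class ErrorInstanceFilter, the method errorInstanceIndices, and the array of per-instance probabilities are hypothetical stand-ins for illustration; in the real trainer the probabilities come from the CRF and the selected instances are handed to a FeatureInducer, neither of which is modeled here.

```java
import java.util.ArrayList;
import java.util.List;

public class ErrorInstanceFilter {

    // Hypothetical helper (not part of MALLET): given the model's probability
    // of the true label for each instance, return the indices of instances
    // that would count as error instances under the given threshold.
    public static List<Integer> errorInstanceIndices(double[] trueLabelProbs,
                                                     double trueLabelProbThreshold) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < trueLabelProbs.length; i++) {
            // Strictly "less than", matching the parameter description.
            if (trueLabelProbs[i] < trueLabelProbThreshold) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        // Made-up probabilities, for illustration only.
        double[] probs = {0.95, 0.40, 0.72, 0.10};
        System.out.println(errorInstanceIndices(probs, 0.5));
        // prints [1, 3]
    }
}
```

    Raising the threshold marks more instances as errors, which gives the inducer more evidence to work with but also includes instances the model already labels fairly confidently.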
  - setUseHyperbolicPrior
```
public void setUseHyperbolicPrior(boolean f)
```
  - setHyperbolicPriorSlope
```
public void setHyperbolicPriorSlope(double p)
```
  - setHyperbolicPriorSharpness
```
public void setHyperbolicPriorSharpness(double p)
```
  - getUseHyperbolicPriorSlope
```
public double getUseHyperbolicPriorSlope()
```
  - getUseHyperbolicPriorSharpness
```
public double getUseHyperbolicPriorSharpness()
```
  - setGaussianPriorVariance
```
public void setGaussianPriorVariance(double p)
```
  - getGaussianPriorVariance
```
public double getGaussianPriorVariance()
```
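
    This page does not describe what the Gaussian prior variance controls. As a hedged sketch, the code below assumes the standard zero-mean Gaussian (L2) penalty, the sum of w^2 / (2 * variance) over all weights, which is subtracted from the log-likelihood being maximized. Whether the trainer uses exactly this form is an assumption, and GaussianPriorSketch and penalty are hypothetical names.

```java
public class GaussianPriorSketch {

    // Hypothetical helper (not part of MALLET): the standard zero-mean
    // Gaussian prior penalty, sum_i w_i^2 / (2 * variance). This form is
    // assumed for illustration; it is not taken from this page.
    public static double penalty(double[] weights, double gaussianPriorVariance) {
        if (gaussianPriorVariance <= 0.0) {
            throw new IllegalArgumentException("variance must be positive");
        }
        double sumOfSquares = 0.0;
        for (double w : weights) {
            sumOfSquares += w * w;
        }
        return sumOfSquares / (2.0 * gaussianPriorVariance);
    }

    public static void main(String[] args) {
        double[] weights = {1.0, -2.0, 0.5};
        // A larger variance gives a weaker penalty (less regularization).
        System.out.println("variance 10.0: " + penalty(weights, 10.0));
        System.out.println("variance 1.0:  " + penalty(weights, 1.0));
    }
}
```

    Under this assumed form, a small variance pulls weights strongly toward zero, while a large variance lets the training data dominate.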
  - setUseSparseWeights
```
public void setUseSparseWeights(boolean b)
```
  - getUseSparseWeights
```
public boolean getUseSparseWeights()
```
  - setUseSomeUnsupportedTrick
```
public void setUseSomeUnsupportedTrick(boolean b)
```
    Sets whether to use the 'some unsupported trick.' This trick is, if training a CRF where some training has been done and sparse weights are used, to add a few weights for features that do not occur in the training data.
    This generally leads to better accuracy at only a small memory cost.
    
    Parameters:
    
    b - Whether to use the trick
