Class NaiveBayesTrainer
- java.lang.Object
- cc.mallet.classify.ClassifierTrainer<NaiveBayes>
- cc.mallet.classify.NaiveBayesTrainer
- All Implemented Interfaces:
Boostable, ClassifierTrainer.ByIncrements<NaiveBayes>, ClassifierTrainer.ByInstanceIncrements<NaiveBayes>, AlphabetCarrying, java.io.Serializable
public class NaiveBayesTrainer extends ClassifierTrainer<NaiveBayes> implements ClassifierTrainer.ByInstanceIncrements<NaiveBayes>, Boostable, AlphabetCarrying, java.io.Serializable
Class used to generate a NaiveBayes classifier from a set of training data. In a Bayes classifier,
p(Classification|Data) = p(Data|Classification) p(Classification) / p(Data)
To compute the likelihood:
p(Data|Classification) = p(d1,d2,...dn | Classification)
Naive Bayes makes the assumption that all of the data are conditionally independent given the Classification:
p(d1,d2,...dn | Classification) = p(d1|Classification) p(d2|Classification) ... p(dn|Classification)
As with other classifiers in Mallet, NaiveBayes is implemented as two classes: a trainer and a classifier. The NaiveBayesTrainer produces estimates of the various p(dn|Classification) and constructs a NaiveBayes classifier with those estimates.
A call to train() or trainIncremental() produces a NaiveBayes classifier that can be used to classify instances. A call to trainIncremental() does not throw away the internal state of the trainer; subsequent calls to trainIncremental() train by extending the previous training set.
A NaiveBayesTrainer can be persisted using serialization.
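The formulas above can be illustrated with a minimal, self-contained sketch. It is not MALLET code: the class name NaiveBayesPosteriorSketch and its array-based inputs are assumptions made for illustration, and the computation is done in log space purely as a numerical convenience.

```java
import java.util.Arrays;

/** Illustrative only; a hypothetical helper, not part of MALLET. */
final class NaiveBayesPosteriorSketch {

    /**
     * Computes p(Classification | Data) for every class.
     *
     * @param priors       priors[c] = p(Classification c)
     * @param featureProbs featureProbs[c][f] = p(feature f | Classification c)
     * @param observed     indices of the features observed in the instance (d1..dn)
     */
    static double[] posterior(double[] priors, double[][] featureProbs, int[] observed) {
        double[] logScores = new double[priors.length];
        for (int c = 0; c < priors.length; c++) {
            // log p(Classification) + sum_i log p(d_i | Classification):
            // the sum is valid only under the conditional independence assumption.
            double score = Math.log(priors[c]);
            for (int f : observed) {
                score += Math.log(featureProbs[c][f]);
            }
            logScores[c] = score;
        }
        // Dividing by p(Data) is the same as normalizing over classes.
        // Subtracting the maximum first avoids underflow in exp().
        double max = Arrays.stream(logScores).max().getAsDouble();
        double[] post = new double[priors.length];
        double sum = 0.0;
        for (int c = 0; c < post.length; c++) {
            post[c] = Math.exp(logScores[c] - max);
            sum += post[c];
        }
        for (int c = 0; c < post.length; c++) {
            post[c] /= sum;
        }
        return post;
    }
}
```

With priors {0.5, 0.5}, feature probabilities {{0.8, 0.2}, {0.4, 0.6}} and feature 0 observed twice, the unnormalized scores are 0.32 and 0.08, so the posterior is {0.8, 0.2}.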
- Author:
- Andrew McCallum mccallum@cs.umass.edu
- See Also:
NaiveBayes, Serialized Form
-
-
Nested Class Summary
static class NaiveBayesTrainer.Factory
Nested classes/interfaces inherited from class cc.mallet.classify.ClassifierTrainer
ClassifierTrainer.ByActiveLearning<C extends Classifier>, ClassifierTrainer.ByIncrements<C extends Classifier>, ClassifierTrainer.ByInstanceIncrements<C extends Classifier>, ClassifierTrainer.ByOptimization<C extends Classifier>
-
-
Field Summary
-
Fields inherited from class cc.mallet.classify.ClassifierTrainer
finishedTraining, validationSet
-
-
Constructor Summary
NaiveBayesTrainer()
NaiveBayesTrainer(NaiveBayes initialClassifier)
NaiveBayesTrainer(Pipe instancePipe)
-
Method Summary
boolean alphabetsMatch(AlphabetCarrying object)
Alphabet getAlphabet()
Alphabet[] getAlphabets()
NaiveBayes getClassifier()
double getDocLengthNormalization()
Multinomial.Estimator getFeatureMultinomialEstimator()
  Get the MultinomialEstimator instance used to specify the type of estimator for features.
Multinomial.Estimator getPriorMultinomialEstimator()
  Get the MultinomialEstimator instance used to specify the type of estimator for priors.
NaiveBayesTrainer setDocLengthNormalization(double d)
NaiveBayesTrainer setFeatureMultinomialEstimator(Multinomial.Estimator me)
  Set the Multinomial Estimator used for features.
NaiveBayesTrainer setPriorMultinomialEstimator(Multinomial.Estimator me)
  Set the Multinomial Estimator used for priors.
java.lang.String toString()
  Create a NaiveBayes classifier from a set of training data and the previous state of the trainer.
NaiveBayes train(InstanceList trainingList)
  Create a NaiveBayes classifier from a set of training data.
NaiveBayes trainIncremental(Instance instance)
NaiveBayes trainIncremental(InstanceList trainingInstancesToAdd)
Methods inherited from class cc.mallet.classify.ClassifierTrainer
getValidationInstances, isFinishedTraining, setValidationInstances
-
Constructor Detail
-
NaiveBayesTrainer
public NaiveBayesTrainer(NaiveBayes initialClassifier)
-
NaiveBayesTrainer
public NaiveBayesTrainer(Pipe instancePipe)
-
NaiveBayesTrainer
public NaiveBayesTrainer()
-
-
Method Detail
-
getClassifier
public NaiveBayes getClassifier()
- Specified by:
getClassifier in class ClassifierTrainer<NaiveBayes>
-
setDocLengthNormalization
public NaiveBayesTrainer setDocLengthNormalization(double d)
-
getDocLengthNormalization
public double getDocLengthNormalization()
-
getFeatureMultinomialEstimator
public Multinomial.Estimator getFeatureMultinomialEstimator()
Get the Multinomial.Estimator instance used to specify the type of estimator for features.
- Returns:
- estimator to be cloned on next call to train() or first call to trainIncremental()
-
setFeatureMultinomialEstimator
public NaiveBayesTrainer setFeatureMultinomialEstimator(Multinomial.Estimator me)
Set the Multinomial.Estimator used for features. The Multinomial.Estimator is internally cloned, and the clone maintains the counts used to generate probability estimates the next time train() or an initial trainIncremental() is run. Defaults to a Multinomial.LaplaceEstimator.
- Parameters:
me - to be cloned on next call to train() or first call to trainIncremental()
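For readers unfamiliar with Laplace estimation, the following sketch shows the textbook add-one form of it. It assumes, without confirmation from this page, that the default estimator behaves like standard add-one smoothing; LaplaceSketch is a hypothetical name and the code does not use or reproduce MALLET's Multinomial.LaplaceEstimator.

```java
/** Illustrative only; a hypothetical helper, not MALLET's LaplaceEstimator. */
final class LaplaceSketch {

    /**
     * Turns raw event counts into add-one smoothed probabilities:
     * p(i) = (counts[i] + 1) / (total + numberOfEvents).
     * Unseen events (count 0) therefore keep a small nonzero probability.
     */
    static double[] estimate(double[] counts) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        double[] probs = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            probs[i] = (counts[i] + 1.0) / (total + counts.length);
        }
        return probs;
    }
}
```

For counts {3, 0, 1} the estimates are 4/7, 1/7 and 2/7. The zero-count event still receives probability 1/7, which keeps a single unseen feature from driving a class likelihood to zero.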
-
getPriorMultinomialEstimator
public Multinomial.Estimator getPriorMultinomialEstimator()
Get the Multinomial.Estimator instance used to specify the type of estimator for priors.
- Returns:
- estimator to be cloned on next call to train() or first call to trainIncremental()
-
setPriorMultinomialEstimator
public NaiveBayesTrainer setPriorMultinomialEstimator(Multinomial.Estimator me)
Set the Multinomial.Estimator used for priors. The Multinomial.Estimator is internally cloned, and the clone maintains the counts used to generate probability estimates the next time train() or an initial trainIncremental() is run. Defaults to a Multinomial.LaplaceEstimator.
- Parameters:
me - to be cloned on next call to train() or first call to trainIncremental()
-
train
public NaiveBayes train(InstanceList trainingList)
Create a NaiveBayes classifier from a set of training data. The trainer uses counts of each feature in an instance's feature vector to provide an estimate of p(feature | Labeling). The internal state of the trainer is thrown away (by a call to reset()) when train() returns. Each call to train() is completely independent of any other.
- Specified by:
train in class ClassifierTrainer<NaiveBayes>
- Parameters:
trainingList - The InstanceList to be used to train the classifier. Within each instance the data slot is an instance of FeatureVector and the target slot is an instance of Labeling
validationList - Currently unused
testSet - Currently unused
evaluator - Currently unused
initialClassifier - Currently unused
- Returns:
- The NaiveBayes classifier as trained on the trainingList
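The count-based estimation described for train() might look roughly like the sketch below. This is an assumption-laden illustration, not the trainer's actual code: FeatureCountSketch is a hypothetical name, instances are plain count arrays rather than FeatureVector objects, labels are class indices rather than Labeling objects, and add-one smoothing stands in for whatever estimator is configured.

```java
/** Illustrative only; a hypothetical helper, not part of MALLET. */
final class FeatureCountSketch {

    /**
     * Accumulates per-class feature counts and converts them to probabilities.
     *
     * @param docs       docs[i][f] = count of feature f in instance i
     * @param labels     labels[i] = class index of instance i
     * @param numClasses number of distinct classes
     * @return probs[c][f], an add-one smoothed estimate of p(feature f | class c)
     */
    static double[][] estimate(double[][] docs, int[] labels, int numClasses) {
        int numFeatures = docs[0].length;
        double[][] counts = new double[numClasses][numFeatures];
        // Step 1: add each instance's feature counts to its class's totals.
        for (int i = 0; i < docs.length; i++) {
            for (int f = 0; f < numFeatures; f++) {
                counts[labels[i]][f] += docs[i][f];
            }
        }
        // Step 2: normalize each class's counts with add-one smoothing.
        double[][] probs = new double[numClasses][numFeatures];
        for (int c = 0; c < numClasses; c++) {
            double total = 0.0;
            for (double v : counts[c]) {
                total += v;
            }
            for (int f = 0; f < numFeatures; f++) {
                probs[c][f] = (counts[c][f] + 1.0) / (total + numFeatures);
            }
        }
        return probs;
    }
}
```

Given instances {2, 0} and {1, 1} labeled class 0 and {0, 3} labeled class 1, class 0 accumulates counts {3, 1} and class 1 accumulates {0, 3}, giving estimates {4/6, 2/6} and {1/5, 4/5}.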
-
trainIncremental
public NaiveBayes trainIncremental(InstanceList trainingInstancesToAdd)
- Specified by:
trainIncremental in interface ClassifierTrainer.ByIncrements<NaiveBayes>
-
trainIncremental
public NaiveBayes trainIncremental(Instance instance)
- Specified by:
trainIncremental in interface ClassifierTrainer.ByInstanceIncrements<NaiveBayes>
-
toString
public java.lang.String toString()
Create a NaiveBayes classifier from a set of training data and the previous state of the trainer. Subsequent calls to trainIncremental() add to the state of the trainer. An incremental training session should consist only of calls to trainIncremental() and have no calls to train().
- Overrides:
toString in class java.lang.Object
- Parameters:
trainingList - The InstanceList to be used to train the classifier. Within each instance the data slot is an instance of FeatureVector and the target slot is an instance of Labeling
validationList - Currently unused
testSet - Currently unused
evaluator - Currently unused
initialClassifier - Currently unused
- Returns:
- The NaiveBayes classifier as trained on the trainingList and the previous trainingLists passed to trainIncremental()
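The difference between batch and incremental training can be sketched with a toy trainer that tracks only class counts. IncrementalCountsSketch is a hypothetical class written for this illustration; it mirrors the documented behavior (train() discards state, trainIncremental() keeps it) but is not how MALLET implements it, and it omits smoothing and the empty-input case.

```java
import java.util.Arrays;

/** Illustrative only; a hypothetical toy trainer, not part of MALLET. */
final class IncrementalCountsSketch {
    private final double[] classCounts;

    IncrementalCountsSketch(int numClasses) {
        classCounts = new double[numClasses];
    }

    /** Batch training: the result depends only on the labels passed to this call. */
    double[] train(int[] labels) {
        Arrays.fill(classCounts, 0.0);      // ignore anything seen earlier
        double[] priors = trainIncremental(labels);
        Arrays.fill(classCounts, 0.0);      // discard state before returning
        return priors;
    }

    /** Incremental training: counts persist, so each call extends the training set. */
    double[] trainIncremental(int[] labels) {
        for (int y : labels) {
            classCounts[y] += 1.0;
        }
        double total = 0.0;
        for (double c : classCounts) {
            total += c;
        }
        double[] priors = new double[classCounts.length];
        for (int c = 0; c < priors.length; c++) {
            priors[c] = classCounts[c] / total; // unsmoothed relative frequency
        }
        return priors;
    }
}
```

Calling trainIncremental(new int[] {0, 0}) and then trainIncremental(new int[] {1, 1}) on the same object yields priors {1.0, 0.0} and then {0.5, 0.5}, because the second call builds on the first. A later train(new int[] {1}) returns {0.0, 1.0} and leaves no state behind, which is why mixing the two styles within one incremental session is discouraged.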
-
alphabetsMatch
public boolean alphabetsMatch(AlphabetCarrying object)
-
getAlphabet
public Alphabet getAlphabet()
- Specified by:
getAlphabet in interface AlphabetCarrying
-
getAlphabets
public Alphabet[] getAlphabets()
- Specified by:
getAlphabets in interface AlphabetCarrying
-
-