Class NaiveBayes

  • All Implemented Interfaces:
    AlphabetCarrying, java.io.Serializable

    public class NaiveBayes
    extends Classifier
    implements java.io.Serializable
    A classifier that classifies instances according to the Naive Bayes method. In a Bayes classifier, p(Classification|Data) = p(Data|Classification) p(Classification) / p(Data)

    To compute the likelihood:
    p(Data|Classification) = p(d1, d2, ..., dn | Classification)
    Naive Bayes makes the assumption that all of the data are conditionally independent given the Classification:
    p(d1, d2, ..., dn | Classification) = p(d1|Classification) p(d2|Classification) ... p(dn|Classification)
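
The factorization above can be sketched in plain Java. This is an illustrative sketch only, not Mallet's implementation: the class name NaiveBayesSketch, the method posterior, and the array layout (logPrior[c] for log p(c), logFeatureProb[c][f] for log p(feature f | c), counts[f] for how often feature f occurs in the instance) are all assumptions made for the example. It sums logarithms instead of multiplying probabilities, then normalizes, which plays the role of dividing by p(Data).

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; none of these names are part of Mallet.
final class NaiveBayesSketch {

    /**
     * Returns p(class | data) for every class.
     * logPrior[c]          = log p(class c)
     * logFeatureProb[c][f] = log p(feature f | class c)
     * counts[f]            = number of occurrences of feature f in the instance
     */
    static double[] posterior(double[] logPrior, double[][] logFeatureProb, int[] counts) {
        int numClasses = logPrior.length;
        double[] scores = new double[numClasses];

        // log p(c) + sum over features of count * log p(feature | c)
        for (int c = 0; c < numClasses; c++) {
            double score = logPrior[c];
            for (int f = 0; f < counts.length; f++) {
                score += counts[f] * logFeatureProb[c][f];
            }
            scores[c] = score;
        }

        // Subtract the maximum before exponentiating to keep the values in range,
        // then normalize so the results sum to 1 (the division by p(Data)).
        double max = Arrays.stream(scores).max().getAsDouble();
        double sum = 0.0;
        for (int c = 0; c < numClasses; c++) {
            scores[c] = Math.exp(scores[c] - max);
            sum += scores[c];
        }
        for (int c = 0; c < numClasses; c++) {
            scores[c] /= sum;
        }
        return scores;
    }

    public static void main(String[] args) {
        double[] logPrior = {Math.log(0.5), Math.log(0.5)};
        double[][] logFeatureProb = {
            {Math.log(0.8), Math.log(0.2)},
            {Math.log(0.3), Math.log(0.7)}
        };
        int[] counts = {2, 1};
        System.out.println(Arrays.toString(posterior(logPrior, logFeatureProb, counts)));
    }
}
```

Treating a feature that occurs twice as two independent data items is one common reading of d1, ..., dn for count-valued feature vectors; other event models are possible.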

    As with other classifiers in Mallet, NaiveBayes is implemented as two classes: a trainer and a classifier. The NaiveBayesTrainer produces estimates of the various p(dn|Classification) and constructs this class with those estimates.

    Instances are assumed to be FeatureVectors.

    As with other Mallet classifiers, classification may only be performed on instances processed with the pipe associated with this classifier, i.e. naiveBayes.getInstancePipe(). The NaiveBayesTrainer sets this pipe to the pipe used to process the training instances.

    A NaiveBayes classifier can be persisted and reused using serialization.
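
A minimal sketch of the persist-and-reuse round trip with standard Java serialization follows. ToyModel is a hypothetical stand-in for a trained classifier, and SerializationSketch, save, and load are names invented for this example; only java.io classes are used. Any Serializable object, including a trained classifier, could presumably be passed through the same two helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical stand-alone sketch; none of these names are part of Mallet.
final class SerializationSketch {

    /** Stand-in for a trained model; it only holds log prior estimates. */
    static final class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final double[] logPrior;

        ToyModel(double[] logPrior) {
            this.logPrior = logPrior;
        }
    }

    /** Serializes the model to a byte array (a FileOutputStream would work the same way). */
    static byte[] save(Serializable model) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        }
        return bytes.toByteArray();
    }

    /** Reads back whatever object was written by save. */
    static Object load(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ToyModel original = new ToyModel(new double[] {Math.log(0.5), Math.log(0.5)});
        ToyModel copy = (ToyModel) load(save(original));
        System.out.println(Arrays.equals(original.logPrior, copy.logPrior));
    }
}
```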

    Author:
    Andrew McCallum mccallum@cs.umass.edu
    See Also:
    NaiveBayesTrainer, FeatureVector, Serialized Form
    • Constructor Detail

      • NaiveBayes

        public NaiveBayes​(Pipe instancePipe,
                          Multinomial.Logged prior,
                          Multinomial.Logged[] classIndex2FeatureProb)
        Construct a NaiveBayes classifier from a pipe, prior estimates for each Classification, and feature estimates of each Classification. A NaiveBayes classifier is generally generated from a NaiveBayesTrainer, not constructed directly by users. Probability estimates are converted and saved as logarithms internally.
        Parameters:
        instancePipe - Used to check that the feature vector dictionary of each instance is the same as the one associated with the pipe. A null value suppresses the check.
        prior - Multinomial that gives an estimate of the prior probability for each Classification
        classIndex2FeatureProb - An array of multinomials, one per Classification, each giving an estimate of the probability of each feature given that Classification.
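
One plausible reason for keeping estimates as logarithms is numerical: multiplying many small probabilities underflows a double, while adding their logarithms does not. The sketch below illustrates that point only. LogSpaceSketch, product, and logProduct are hypothetical names, and the documentation above does not state that this is Mallet's motivation.

```java
// Hypothetical stand-alone sketch; none of these names are part of Mallet.
final class LogSpaceSketch {

    /** Multiplies p by itself n times in ordinary probability space. */
    static double product(double p, int n) {
        double result = 1.0;
        for (int i = 0; i < n; i++) {
            result *= p;
        }
        return result;
    }

    /** Computes the logarithm of the same product by summing log p, n times. */
    static double logProduct(double p, int n) {
        double logP = Math.log(p);
        double result = 0.0;
        for (int i = 0; i < n; i++) {
            result += logP;
        }
        return result;
    }

    public static void main(String[] args) {
        // 0.01^200 is far below the smallest positive double, so the direct product
        // collapses to zero, while the log-space value stays finite.
        System.out.println(product(0.01, 200));
        System.out.println(logProduct(0.01, 200));
    }
}
```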
      • NaiveBayes

        public NaiveBayes​(Pipe dataPipe,
                          Multinomial prior,
                          Multinomial[] classIndex2FeatureProb)
        Construct a NaiveBayes classifier from a pipe, prior estimates for each Classification, and feature estimates of each Classification. A NaiveBayes classifier is generally generated from a NaiveBayesTrainer, not constructed directly by users.
        Parameters:
        dataPipe - Used to check that the feature vector dictionary of each instance is the same as the one associated with the pipe. A null value suppresses the check.
        prior - Multinomial that gives an estimate of the prior probability for each Classification
        classIndex2FeatureProb - An array of multinomials, one per Classification, each giving an estimate of the probability of each feature given that Classification.
    • Method Detail

      • printWords

        public void printWords​(int numToPrint)
      • classify

        public Classification classify​(Instance instance)
        Classify an instance using NaiveBayes according to the trained data. The alphabet of the FeatureVector of the instance must match the alphabet of the pipe used to train the classifier.
        Specified by:
        classify in class Classifier
        Parameters:
        instance - The instance to be classified. Its data field must be a FeatureVector.
        Returns:
        Classification containing the labeling of the instance
      • dataLogLikelihood

        public double dataLogLikelihood​(InstanceList ilist)
      • labelLogLikelihood

        public double labelLogLikelihood​(InstanceList ilist)