Package cc.mallet.topics
Class TopicInferencer
- java.lang.Object
  - cc.mallet.topics.TopicInferencer
- All Implemented Interfaces:
java.io.Serializable
- Direct Known Subclasses:
DMRInferencer
public class TopicInferencer extends java.lang.Object implements java.io.Serializable
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type      Field
protected double[]     alpha
protected Alphabet     alphabet
protected double       beta
protected double       betaSum
protected double[]     cachedCoefficients
protected int          numTopics
protected int          numTypes
protected Randoms      random
protected double       smoothingOnlyMass
protected int[]        tokensPerTopic
protected int          topicBits
protected int          topicMask
protected int[][]      typeTopicCounts
-
Constructor Summary
Constructors
Constructor
TopicInferencer()
TopicInferencer(int[][] typeTopicCounts, int[] tokensPerTopic, Alphabet alphabet, double[] alpha, double beta, double betaSum)
-
Method Summary
Modifier and Type         Method and Description
double[]                  getSampledDistribution(Instance instance, int numIterations, int thinning, int burnIn)
                          Use Gibbs sampling to infer a topic distribution.
static TopicInferencer    read(java.io.File f)
void                      setRandomSeed(int seed)
void                      writeInferredDistributions(InstanceList instances, java.io.File distributionsFile, int numIterations, int thinning, int burnIn, double threshold, int max)
                          Infer topics for the provided instances and write distributions to the provided file.
-
-
-
Field Detail
-
numTopics
protected int numTopics
-
topicMask
protected int topicMask
-
topicBits
protected int topicBits
-
numTypes
protected int numTypes
-
alpha
protected double[] alpha
-
beta
protected double beta
-
betaSum
protected double betaSum
-
typeTopicCounts
protected int[][] typeTopicCounts
-
tokensPerTopic
protected int[] tokensPerTopic
-
alphabet
protected Alphabet alphabet
-
random
protected Randoms random
-
smoothingOnlyMass
protected double smoothingOnlyMass
-
cachedCoefficients
protected double[] cachedCoefficients
-
-
Constructor Detail
-
TopicInferencer
public TopicInferencer(int[][] typeTopicCounts, int[] tokensPerTopic, Alphabet alphabet, double[] alpha, double beta, double betaSum)
-
TopicInferencer
public TopicInferencer()
-
-
Method Detail
-
setRandomSeed
public void setRandomSeed(int seed)
-
getSampledDistribution
public double[] getSampledDistribution(Instance instance, int numIterations, int thinning, int burnIn)
Use Gibbs sampling to infer a topic distribution for the given instance. Each token's topic is initialized to the most probable topic for that token (or to one of them, if several are tied). With zero iterations, the method returns exactly this initial topic distribution. Inference does not adjust the type-topic counts: P(w|t) is held fixed (clamped).
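The interaction of numIterations, thinning, and burnIn can be illustrated with a small standalone sketch. It is not MALLET code: the class SampleSchedule and its method are hypothetical, and the rule used here (iterations numbered from 1, a sample saved at every multiple of thinning after burnIn has passed) is an assumption about how the parameters combine, not a statement of the library's exact behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of MALLET: lists the iterations at which a sample would be saved. */
final class SampleSchedule {

    /**
     * Assumption: iterations are numbered 1..numIterations, and a sample is
     * saved at each iteration past burnIn whose offset from burnIn is a
     * multiple of thinning.
     */
    static List<Integer> savedIterations(int numIterations, int thinning, int burnIn) {
        if (thinning < 1) {
            throw new IllegalArgumentException("thinning must be >= 1");
        }
        List<Integer> saved = new ArrayList<>();
        for (int iteration = 1; iteration <= numIterations; iteration++) {
            if (iteration > burnIn && (iteration - burnIn) % thinning == 0) {
                saved.add(iteration);
            }
        }
        return saved;
    }

    public static void main(String[] args) {
        // 100 iterations, one sample every 10 iterations, after 50 burn-in iterations.
        System.out.println(savedIterations(100, 10, 50)); // prints [60, 70, 80, 90, 100]
        // Zero iterations: no samples are saved at all.
        System.out.println(savedIterations(0, 10, 50));   // prints []
    }
}
```

Under this reading, a call with zero iterations saves no samples, which fits the documented behavior of returning the initial topic assignment unchanged.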
-
writeInferredDistributions
public void writeInferredDistributions(InstanceList instances, java.io.File distributionsFile, int numIterations, int thinning, int burnIn, double threshold, int max) throws java.io.IOException
Infer topics for the provided instances and write distributions to the provided file.
- Parameters:
instances -
distributionsFile -
numIterations - The total number of iterations of sampling per document
thinning - The number of iterations between saved samples
burnIn - The number of iterations before the first saved sample
threshold - The minimum proportion of a given topic that will be written
max - The total number of topics to report per document
- Throws:
java.io.IOException
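The effect of the threshold and max parameters on what gets reported for one document can be sketched as follows. This is a standalone illustration, not MALLET code: TopicReportFilter and topicsToReport are hypothetical names, and treating a negative max as "no limit" is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper, not part of MALLET. */
final class TopicReportFilter {

    /**
     * Returns topic indices ordered by descending proportion, keeping only
     * topics whose proportion is at least threshold, and at most max of them.
     * Assumption: a negative max means "no limit".
     */
    static List<Integer> topicsToReport(double[] distribution, double threshold, int max) {
        List<Integer> topics = new ArrayList<>();
        for (int topic = 0; topic < distribution.length; topic++) {
            if (distribution[topic] >= threshold) {
                topics.add(topic);
            }
        }
        // Highest proportion first; ties keep their original index order (stable sort).
        topics.sort(Comparator.comparingDouble((Integer t) -> distribution[t]).reversed());
        if (max >= 0 && topics.size() > max) {
            return new ArrayList<>(topics.subList(0, max));
        }
        return topics;
    }

    public static void main(String[] args) {
        double[] distribution = {0.05, 0.60, 0.10, 0.25};
        // Topic 0 falls below the threshold; of the rest, only the top two are kept.
        System.out.println(topicsToReport(distribution, 0.1, 2)); // prints [1, 3]
    }
}
```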
-
read
public static TopicInferencer read(java.io.File f) throws java.lang.Exception
- Throws:
java.lang.Exception
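Because the class implements java.io.Serializable and exposes a static read(java.io.File) method, a saved inferencer is presumably restored through standard Java object serialization. The sketch below shows that general pattern with a stand-in class. StandInModel and all of its members are hypothetical, and nothing here is taken from the MALLET implementation.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in for a serializable model; not a MALLET class. */
final class StandInModel implements Serializable {
    private static final long serialVersionUID = 1L;

    final int numTopics;
    final double beta;

    StandInModel(int numTopics, double beta) {
        this.numTopics = numTopics;
        this.beta = beta;
    }

    /** Writes this object to a file using standard Java serialization. */
    void write(File f) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(this);
        }
    }

    /** Reads an object back, mirroring the shape of a static read(File) factory. */
    static StandInModel read(File f) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            return (StandInModel) in.readObject();
        }
    }

    /** Convenience for the example: write to a temporary file, then read it back. */
    static StandInModel roundTrip(StandInModel model) {
        try {
            File f = File.createTempFile("stand-in-model", ".ser");
            f.deleteOnExit();
            model.write(f);
            return read(f);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A factory of this shape can fail with an I/O error or a class-resolution error, which is consistent with read being declared to throw java.lang.Exception.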
-
-