Package cc.mallet.topics
Class TopicInferencer
- java.lang.Object
-
- cc.mallet.topics.TopicInferencer
-
- All Implemented Interfaces:
java.io.Serializable
- Direct Known Subclasses:
DMRInferencer
public class TopicInferencer extends java.lang.Object implements java.io.Serializable
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type     Field
protected double[]    alpha
protected Alphabet    alphabet
protected double      beta
protected double      betaSum
protected double[]    cachedCoefficients
protected int         numTopics
protected int         numTypes
protected Randoms     random
protected double      smoothingOnlyMass
protected int[]       tokensPerTopic
protected int         topicBits
protected int         topicMask
protected int[][]     typeTopicCounts
-
Constructor Summary
Constructors
Constructor
TopicInferencer()
TopicInferencer(int[][] typeTopicCounts, int[] tokensPerTopic, Alphabet alphabet, double[] alpha, double beta, double betaSum)
-
Method Summary
Methods
Modifier and Type, Method, Description
double[] getSampledDistribution(Instance instance, int numIterations, int thinning, int burnIn)
    Use Gibbs sampling to infer a topic distribution.
static TopicInferencer read(java.io.File f)
void setRandomSeed(int seed)
void writeInferredDistributions(InstanceList instances, java.io.File distributionsFile, int numIterations, int thinning, int burnIn, double threshold, int max)
    Infer topics for the provided instances and write distributions to the provided file.
-
-
-
Field Detail
-
numTopics
protected int numTopics
-
topicMask
protected int topicMask
-
topicBits
protected int topicBits
-
numTypes
protected int numTypes
-
alpha
protected double[] alpha
-
beta
protected double beta
-
betaSum
protected double betaSum
-
typeTopicCounts
protected int[][] typeTopicCounts
-
tokensPerTopic
protected int[] tokensPerTopic
-
alphabet
protected Alphabet alphabet
-
random
protected Randoms random
-
smoothingOnlyMass
protected double smoothingOnlyMass
-
cachedCoefficients
protected double[] cachedCoefficients
-
-
Constructor Detail
-
TopicInferencer
public TopicInferencer(int[][] typeTopicCounts, int[] tokensPerTopic, Alphabet alphabet, double[] alpha, double beta, double betaSum)
-
TopicInferencer
public TopicInferencer()
-
-
Method Detail
-
setRandomSeed
public void setRandomSeed(int seed)
-
getSampledDistribution
public double[] getSampledDistribution(Instance instance, int numIterations, int thinning, int burnIn)
Use Gibbs sampling to infer a topic distribution. Each token's topic is initialized to its most probable topic (or to one of them, if several are tied). With zero iterations, the method returns exactly this initial topic distribution. This method does not adjust the type-topic counts: P(w|t) is clamped, so only the document's topic assignments change during sampling.
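The interplay of numIterations, thinning, and burnIn can be illustrated with a small standalone sketch. It does not call MALLET; the class SampleSchedule and the method savedIterations are hypothetical names, and the saving rule (discard the first burnIn iterations, then keep every thinning-th iteration) is an assumption based on the parameter descriptions on this page, not a confirmed copy of the implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of MALLET: lists the iterations at which
// a sampler would save a sample under the assumed schedule.
class SampleSchedule {

    // Returns the 1-based iteration numbers at which a sample is saved.
    static List<Integer> savedIterations(int numIterations, int thinning, int burnIn) {
        if (thinning <= 0) {
            throw new IllegalArgumentException("thinning must be positive");
        }
        List<Integer> saved = new ArrayList<>();
        for (int iteration = 1; iteration <= numIterations; iteration++) {
            // Assumption: iterations up to burnIn are discarded, and after
            // that every thinning-th iteration contributes a sample.
            if (iteration > burnIn && (iteration - burnIn) % thinning == 0) {
                saved.add(iteration);
            }
        }
        return saved;
    }

    public static void main(String[] args) {
        // 100 iterations, a sample every 10 iterations, 50 burn-in iterations.
        System.out.println(savedIterations(100, 10, 50)); // prints [60, 70, 80, 90, 100]
    }
}
```

Under this assumed schedule, numIterations = 0 saves no samples at all, which matches the documented behavior of returning the initial topic distribution unchanged.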
-
writeInferredDistributions
public void writeInferredDistributions(InstanceList instances, java.io.File distributionsFile, int numIterations, int thinning, int burnIn, double threshold, int max) throws java.io.IOException
Infer topics for the provided instances and write distributions to the provided file.
- Parameters:
instances -
distributionsFile -
numIterations - The total number of iterations of sampling per document
thinning - The number of iterations between saved samples
burnIn - The number of iterations before the first saved sample
threshold - The minimum proportion of a given topic that will be written
max - The total number of topics to report per document
- Throws:
java.io.IOException
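As a rough illustration of how threshold and max might jointly limit what is written for one document, here is a standalone sketch that does not use MALLET. The class TopicReport and the method topicsToReport are hypothetical, and the selection rule (sort topics by descending proportion, stop at the first topic below threshold or once max topics have been chosen) is an assumption drawn from the parameter descriptions above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of MALLET: picks which topics of a single
// document's distribution would be reported.
class TopicReport {

    // Returns topic indices in descending order of proportion, keeping at
    // most max topics and only those with proportion >= threshold.
    static List<Integer> topicsToReport(double[] distribution, double threshold, int max) {
        List<Integer> order = new ArrayList<>();
        for (int topic = 0; topic < distribution.length; topic++) {
            order.add(topic);
        }
        // Stable sort: topics with equal proportions keep their index order.
        order.sort((a, b) -> Double.compare(distribution[b], distribution[a]));

        List<Integer> reported = new ArrayList<>();
        for (int topic : order) {
            // Assumption: both limits apply; whichever is hit first ends the report.
            if (reported.size() >= max || distribution[topic] < threshold) {
                break;
            }
            reported.add(topic);
        }
        return reported;
    }

    public static void main(String[] args) {
        double[] distribution = {0.05, 0.6, 0.25, 0.1};
        System.out.println(topicsToReport(distribution, 0.1, 2)); // prints [1, 2]
    }
}
```

With a larger max, the same distribution and threshold would also admit topic 3, since its proportion equals the threshold; whether the real method treats the boundary as inclusive is not stated on this page.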
-
read
public static TopicInferencer read(java.io.File f) throws java.lang.Exception
- Throws:
java.lang.Exception
-
-