Package cc.mallet.topics
Class WorkerRunnable
java.lang.Object
    cc.mallet.topics.WorkerRunnable

All Implemented Interfaces:
java.lang.Runnable
public class WorkerRunnable extends java.lang.Object implements java.lang.Runnable
A parallel topic model runnable task.
- Author:
- David Mimno, Andrew McCallum
-
-
Field Summary
Modifier and Type      Field
protected double[]     alpha
protected double       alphaSum
protected double       beta
protected double       betaSum
protected double[]     cachedCoefficients
static double          DEFAULT_BETA
protected int[]        docLengthCounts
protected int          numTopics
protected int          numTypes
protected Randoms      random
protected double       smoothingOnlyMass
protected int[]        tokensPerTopic
protected int          topicBits
protected int[][]      topicDocCounts
protected int          topicMask
protected int[][]      typeTopicCounts
-
Constructor Summary
Constructor
WorkerRunnable()
WorkerRunnable(int numTopics, double[] alpha, double alphaSum, double beta, Randoms random, java.util.ArrayList<TopicAssignment> data, int[][] typeTopicCounts, int[] tokensPerTopic, int startDoc, int numDocs)
-
Method Summary
Modifier and Type    Method
void                 buildLocalTypeTopicCounts()
    Once we have sampled the local counts, trash the "global" type topic counts and reuse the space to build a summary of the type topic counts specific to this worker's section of the corpus.
void                 collectAlphaStatistics()
int[]                getDocLengthCounts()
int[]                getTokensPerTopic()
int[][]              getTopicDocCounts()
int[][]              getTypeTopicCounts()
void                 initializeAlphaStatistics(int size)
void                 makeOnlyThread()
    If there is only one thread, we don't need to go through communication overhead.
void                 resetBeta(double beta, double betaSum)
void                 run()
protected void       sampleTopicsForOneDoc(FeatureSequence tokenSequence, FeatureSequence topicSequence, boolean readjustTopicsAndStats)
-
-
-
Field Detail
-
numTopics
protected int numTopics
-
topicMask
protected int topicMask
-
topicBits
protected int topicBits
-
numTypes
protected int numTypes
-
alpha
protected double[] alpha
-
alphaSum
protected double alphaSum
-
beta
protected double beta
-
betaSum
protected double betaSum
-
DEFAULT_BETA
public static final double DEFAULT_BETA
- See Also:
- Constant Field Values
-
smoothingOnlyMass
protected double smoothingOnlyMass
-
cachedCoefficients
protected double[] cachedCoefficients
-
typeTopicCounts
protected int[][] typeTopicCounts
-
tokensPerTopic
protected int[] tokensPerTopic
-
docLengthCounts
protected int[] docLengthCounts
-
topicDocCounts
protected int[][] topicDocCounts
-
random
protected Randoms random
-
-
Constructor Detail
-
WorkerRunnable
public WorkerRunnable()
-
WorkerRunnable
public WorkerRunnable(int numTopics, double[] alpha, double alphaSum, double beta, Randoms random, java.util.ArrayList<TopicAssignment> data, int[][] typeTopicCounts, int[] tokensPerTopic, int startDoc, int numDocs)
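The startDoc and numDocs parameters suggest that each worker is given a contiguous slice of the document list. The following is a minimal sketch of how a caller might compute such slices, assuming an even split with the remainder going to the last worker. DocRangeSketch and partition are hypothetical names used only for illustration and are not part of MALLET; the actual caller may divide the corpus differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of MALLET): splits a corpus into contiguous per-worker ranges. */
public class DocRangeSketch {

    /** Returns one {startDoc, numDocs} pair per worker. Assumes numThreads >= 1. */
    static List<int[]> partition(int totalDocs, int numThreads) {
        List<int[]> ranges = new ArrayList<>();
        int docsPerThread = totalDocs / numThreads;
        int offset = 0;
        for (int thread = 0; thread < numThreads; thread++) {
            int count = docsPerThread;
            // The last worker also takes the remainder left by integer division.
            if (thread == numThreads - 1) {
                count = totalDocs - offset;
            }
            ranges.add(new int[] { offset, count });
            offset += count;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] range : partition(10, 3)) {
            System.out.println("startDoc=" + range[0] + " numDocs=" + range[1]);
        }
    }
}
```

Each pair would then be passed as the last two constructor arguments of one worker, alongside the shared model parameters.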
-
-
Method Detail
-
makeOnlyThread
public void makeOnlyThread()
If there is only one thread, we don't need to go through communication overhead. This method asks this worker not to prepare local type-topic counts. The method should be called when we are using this code in a non-threaded environment.
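The difference between the two modes can be sketched with a stand-in worker. SketchWorker below is a hypothetical class, not the MALLET implementation: it only models the flag that makeOnlyThread() is described as setting, with placeholder booleans where the real class would sample topics and prepare local counts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OnlyThreadSketch {

    /** Hypothetical stand-in for a worker; it models only the "only thread" flag. */
    static class SketchWorker implements Runnable {
        private boolean prepareLocalCounts = true;
        boolean sampled = false;
        boolean preparedLocalCounts = false;

        /** Asks the worker to skip preparing per-worker counts. */
        void makeOnlyThread() {
            prepareLocalCounts = false;
        }

        @Override
        public void run() {
            sampled = true; // placeholder for the sampling sweep
            if (prepareLocalCounts) {
                preparedLocalCounts = true; // placeholder for building local counts
            }
        }
    }

    /** Non-threaded use: call run() directly on a single worker. */
    static SketchWorker runSingleThreaded() {
        SketchWorker worker = new SketchWorker();
        worker.makeOnlyThread();
        worker.run();
        return worker;
    }

    /** Threaded use: submit one worker per thread and wait for all of them. */
    static List<SketchWorker> runInPool(int numThreads) {
        ExecutorService executor = Executors.newFixedThreadPool(numThreads);
        List<SketchWorker> workers = new ArrayList<>();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < numThreads; i++) {
                SketchWorker worker = new SketchWorker();
                workers.add(worker);
                futures.add(executor.submit(worker));
            }
            for (Future<?> future : futures) {
                future.get(); // blocks until that worker has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return workers;
    }

    public static void main(String[] args) {
        System.out.println("single: prepared=" + runSingleThreaded().preparedLocalCounts);
        System.out.println("pool:   prepared=" + runInPool(2).get(0).preparedLocalCounts);
    }
}
```

In the single-threaded path the worker can operate on the shared counts directly, so there is nothing to merge afterwards.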
-
getTokensPerTopic
public int[] getTokensPerTopic()
-
getTypeTopicCounts
public int[][] getTypeTopicCounts()
-
getDocLengthCounts
public int[] getDocLengthCounts()
-
getTopicDocCounts
public int[][] getTopicDocCounts()
-
initializeAlphaStatistics
public void initializeAlphaStatistics(int size)
-
collectAlphaStatistics
public void collectAlphaStatistics()
-
resetBeta
public void resetBeta(double beta, double betaSum)
-
buildLocalTypeTopicCounts
public void buildLocalTypeTopicCounts()
Once we have sampled the local counts, trash the "global" type topic counts and reuse the space to build a summary of the type topic counts specific to this worker's section of the corpus.
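As an illustration of the idea, the sketch below clears a shared count table in place and refills it from the topic assignments of one worker's document range. It assumes a plain dense int[numTypes][numTopics] layout and plain int arrays for tokens and assignments; the real class works on FeatureSequence data and may store counts in a different, more compact form. LocalCountsSketch and buildLocalCounts are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical sketch (not the MALLET implementation) of rebuilding per-worker counts in place. */
public class LocalCountsSketch {

    /**
     * Overwrites typeTopicCounts and tokensPerTopic with counts taken only from
     * documents startDoc .. startDoc + numDocs - 1.
     * docTokens[d][i] is the word type at position i of document d;
     * docTopics[d][i] is the topic assigned to that token.
     */
    static void buildLocalCounts(int[][] typeTopicCounts, int[] tokensPerTopic,
                                 int[][] docTokens, int[][] docTopics,
                                 int startDoc, int numDocs) {
        // Discard the previous ("global") contents and reuse the same arrays.
        Arrays.fill(tokensPerTopic, 0);
        for (int[] row : typeTopicCounts) {
            Arrays.fill(row, 0);
        }
        int endDoc = Math.min(startDoc + numDocs, docTokens.length);
        for (int doc = startDoc; doc < endDoc; doc++) {
            for (int pos = 0; pos < docTokens[doc].length; pos++) {
                int type = docTokens[doc][pos];
                int topic = docTopics[doc][pos];
                typeTopicCounts[type][topic]++;
                tokensPerTopic[topic]++;
            }
        }
    }

    public static void main(String[] args) {
        int[][] counts = { { 9, 9 }, { 9, 9 } }; // stale values to be overwritten
        int[] totals = { 9, 9 };
        int[][] docTokens = { { 0, 1 }, { 1, 1 }, { 0 } };
        int[][] docTopics = { { 0, 0 }, { 1, 0 }, { 1 } };
        buildLocalCounts(counts, totals, docTokens, docTopics, 1, 2);
        System.out.println(Arrays.deepToString(counts) + " " + Arrays.toString(totals));
    }
}
```

Reusing the existing arrays avoids allocating a second table per worker; the per-worker summaries can later be summed to recover corpus-wide counts.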
-
run
public void run()
- Specified by:
run
in interface java.lang.Runnable
-
sampleTopicsForOneDoc
protected void sampleTopicsForOneDoc(FeatureSequence tokenSequence, FeatureSequence topicSequence, boolean readjustTopicsAndStats)
-
-