Package cc.mallet.topics
Class NPTopicModel
- java.lang.Object
  - cc.mallet.topics.NPTopicModel
- All Implemented Interfaces:
java.io.Serializable
public class NPTopicModel extends java.lang.Object implements java.io.Serializable
A non-parametric topic model that uses the "minimal path" assumption to reduce bookkeeping.
- Author:
- David Mimno
- See Also:
- Serialized Form
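The "minimal path" assumption named in the class description can be illustrated with a small standalone sketch. It assumes the usual reading of the term for hierarchical Dirichlet process topic models: a topic is counted once for each document in which it appears, however many tokens in that document carry it. The class `MinimalPathCounts` and its methods are hypothetical and not part of MALLET. The names `docsPerTopic` and `totalDocTopics` are borrowed from the field list of this class, but this page does not confirm how those fields are populated.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not part of cc.mallet.topics.
public class MinimalPathCounts {

    /**
     * For each topic, counts the documents that use it at least once.
     * docTopics[d][i] is the topic assigned to token i of document d.
     */
    public static Map<Integer, Integer> docsPerTopic(int[][] docTopics) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int[] doc : docTopics) {
            Set<Integer> seenInDoc = new HashSet<>();
            for (int topic : doc) {
                // Only the first occurrence of a topic in a document is counted.
                if (seenInDoc.add(topic)) {
                    counts.merge(topic, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** Sums the per-topic document counts. */
    public static int totalDocTopics(Map<Integer, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] docTopics = { {0, 0, 1}, {1, 1}, {0, 2, 2, 2} };
        Map<Integer, Integer> counts = docsPerTopic(docTopics);
        System.out.println("docsPerTopic = " + counts);
        System.out.println("totalDocTopics = " + totalDocTopics(counts));
    }
}
```

Under this reading, the only per-document bookkeeping needed is whether a topic is present in a document, which is why the global statistics reduce to one integer per topic plus a running total.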
-
-
Field Summary
Fields
Modifier and Type                                   Field                 Description
protected double                                    alpha
protected Alphabet                                  alphabet
protected double                                    beta
protected double                                    betaSum
protected java.util.ArrayList<TopicAssignment>      data
static double                                       DEFAULT_BETA
protected com.carrotsearch.hppc.IntIntHashMap       docsPerTopic
protected java.text.NumberFormat                    formatter
protected double                                    gamma
protected int                                       maxTopic
protected int                                       numTopics
protected int                                       numTypes
protected boolean                                   printLogLikelihood
protected Randoms                                   random
int                                                 showTopicsInterval
protected com.carrotsearch.hppc.IntIntHashMap       tokensPerTopic
protected LabelAlphabet                             topicAlphabet
protected int                                       totalDocTopics
protected com.carrotsearch.hppc.IntIntHashMap[]     typeTopicCounts
int                                                 wordsPerTopic
-
Constructor Summary
Constructors
Constructor                                                Description
NPTopicModel(double alpha, double gamma, double beta)
-
Method Summary
Modifier and Type    Method                                                                                   Description
void                 addInstances(InstanceList training, int initialTopics)
static void          main(java.lang.String[] args)
void                 printState(java.io.File f)
void                 printState(java.io.PrintStream out)
void                 sample(int iterations)
protected void       sampleTopicsForOneDoc(FeatureSequence tokenSequence, FeatureSequence topicSequence)
void                 setRandomSeed(int seed)
void                 setTopicDisplay(int interval, int n)
java.lang.String     topWords(int numWords)
-
-
-
Field Detail
-
data
protected java.util.ArrayList<TopicAssignment> data
-
alphabet
protected Alphabet alphabet
-
topicAlphabet
protected LabelAlphabet topicAlphabet
-
maxTopic
protected int maxTopic
-
numTopics
protected int numTopics
-
numTypes
protected int numTypes
-
alpha
protected double alpha
-
gamma
protected double gamma
-
beta
protected double beta
-
betaSum
protected double betaSum
-
DEFAULT_BETA
public static final double DEFAULT_BETA
- See Also:
- Constant Field Values
-
typeTopicCounts
protected com.carrotsearch.hppc.IntIntHashMap[] typeTopicCounts
-
tokensPerTopic
protected com.carrotsearch.hppc.IntIntHashMap tokensPerTopic
-
docsPerTopic
protected com.carrotsearch.hppc.IntIntHashMap docsPerTopic
-
totalDocTopics
protected int totalDocTopics
-
showTopicsInterval
public int showTopicsInterval
-
wordsPerTopic
public int wordsPerTopic
-
random
protected Randoms random
-
formatter
protected java.text.NumberFormat formatter
-
printLogLikelihood
protected boolean printLogLikelihood
-
-
Constructor Detail
-
NPTopicModel
public NPTopicModel(double alpha, double gamma, double beta)
- Parameters:
alpha - this parameter balances the local document topic counts with the global distribution over topics.
gamma - this parameter is the weight on a completely new, never-before-seen topic in the global distribution.
beta - this parameter controls the variability of the topic-word distributions.
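The roles of the three hyperparameters can be made concrete with a standalone sketch of unnormalized sampling weights. It assumes a standard collapsed-Gibbs form for a hierarchical Dirichlet process topic model and assumes `betaSum = beta * numTypes`; this page does not confirm that NPTopicModel computes its weights exactly this way. The class `TopicWeightSketch` and its methods are hypothetical, and the parameter names only echo the fields listed above.

```java
// Hypothetical illustration; not part of cc.mallet.topics.
public class TopicWeightSketch {

    /**
     * Unnormalized weight for assigning a token to an existing topic.
     * alpha mixes the local count with the topic's share of the global distribution;
     * beta smooths the topic-word probability.
     */
    public static double existingTopicWeight(
            int docTopicCount,    // tokens in this document already assigned to the topic
            int docsWithTopic,    // global count for the topic (documents using it)
            int totalDocTopics,   // sum of the global counts over all topics
            int typeTopicCount,   // times this word type is assigned to the topic
            int tokensInTopic,    // total tokens assigned to the topic
            double alpha, double gamma, double beta, int numTypes) {
        double globalShare = docsWithTopic / (totalDocTopics + gamma);
        double betaSum = beta * numTypes; // assumption: symmetric prior over word types
        double wordProb = (typeTopicCount + beta) / (tokensInTopic + betaSum);
        return (docTopicCount + alpha * globalShare) * wordProb;
    }

    /**
     * Unnormalized weight for opening a never-before-seen topic.
     * gamma is the new topic's weight in the global distribution; with no
     * observations, every word type has probability 1 / numTypes.
     */
    public static double newTopicWeight(
            int totalDocTopics, double alpha, double gamma, int numTypes) {
        double globalShare = gamma / (totalDocTopics + gamma);
        return alpha * globalShare / numTypes;
    }

    public static void main(String[] args) {
        double existing = existingTopicWeight(3, 2, 5, 4, 10, 1.0, 1.0, 0.5, 20);
        double fresh = newTopicWeight(5, 1.0, 1.0, 20);
        System.out.println("existing topic weight = " + existing);
        System.out.println("new topic weight      = " + fresh);
    }
}
```

In this sketch a larger alpha pulls each document toward the global topic proportions, a larger gamma makes new topics more likely, and a smaller beta concentrates each topic on fewer word types.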
-
-
Method Detail
-
setTopicDisplay
public void setTopicDisplay(int interval, int n)
-
setRandomSeed
public void setRandomSeed(int seed)
-
addInstances
public void addInstances(InstanceList training, int initialTopics)
-
sample
public void sample(int iterations) throws java.io.IOException
- Throws:
java.io.IOException
-
sampleTopicsForOneDoc
protected void sampleTopicsForOneDoc(FeatureSequence tokenSequence, FeatureSequence topicSequence)
-
topWords
public java.lang.String topWords(int numWords)
-
printState
public void printState(java.io.File f) throws java.io.IOException
- Throws:
java.io.IOException
-
printState
public void printState(java.io.PrintStream out)
-
main
public static void main(java.lang.String[] args) throws java.io.IOException
- Throws:
java.io.IOException
-
-