Package cc.mallet.util
Class BulkLoader
java.lang.Object
    cc.mallet.util.BulkLoader
public class BulkLoader extends java.lang.Object
This class reads through a single file, breaking each line into data and (optional) name and label fields.
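The line-splitting step described above can be illustrated with a minimal sketch. It is not BulkLoader's actual parsing code: the tab delimiter, the name/label/data field order, and the names LineFieldsSketch, Fields, and parseLine are all assumptions made for illustration.

```java
import java.util.Objects;

public class LineFieldsSketch {

    /** Holder for one parsed line; name and label are null when absent. */
    static final class Fields {
        final String name;
        final String label;
        final String data;

        Fields(String name, String label, String data) {
            this.name = name;
            this.label = label;
            this.data = Objects.requireNonNull(data);
        }
    }

    /**
     * Splits a line on tabs (an assumed delimiter, not taken from MALLET).
     * The last field is always treated as data; any preceding fields are
     * read as name, then label.
     */
    static Fields parseLine(String line) {
        String[] parts = line.split("\t", 3);
        switch (parts.length) {
            case 3:  return new Fields(parts[0], parts[1], parts[2]);
            case 2:  return new Fields(parts[0], null, parts[1]);
            default: return new Fields(null, null, parts[0]);
        }
    }

    public static void main(String[] args) {
        Fields f = parseLine("doc1\tsports\tthe match ended in a draw");
        System.out.println(f.name + " | " + f.label + " | " + f.data);
    }
}
```

Limiting the split to three parts keeps any further tabs inside the data field, so the data text is never truncated.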
Constructor Summary
Constructor    Description
BulkLoader()
Method Summary
Modifier and Type    Method    Description
static void    generateStoplist(SimpleTokenizer prunedTokenizer)    Reads the data from inputFile, then writes every word that occurs fewer than pruneCount.value times to the pruned word file.
static void    main(java.lang.String[] args)
static void    writeInstanceList(SimpleTokenizer prunedTokenizer)
Method Detail
generateStoplist
public static void generateStoplist(SimpleTokenizer prunedTokenizer) throws java.io.IOException
Reads the data from inputFile, then writes every word that occurs fewer than pruneCount.value times to the pruned word file.
Parameters:
    prunedTokenizer - the tokenizer that will be used to write instances
Throws:
    java.io.IOException
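The pruning rule described for generateStoplist can be sketched with the standard library alone. This is an illustration, not the MALLET implementation: the class RareWordSketch, the method rareWords, and the tokenization rule (lowercase, split on non-letter characters) are assumptions, the threshold is passed as a plain int instead of being read from pruneCount.value, and occurrences are counted in total across all lines.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class RareWordSketch {

    /** Returns every word that occurs fewer than pruneCount times across all lines. */
    static SortedSet<String> rareWords(List<String> lines, int pruneCount) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Assumed tokenization: lowercase, split on runs of non-letter characters.
            for (String token : line.toLowerCase().split("[^\\p{L}]+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        // Keep only the words below the threshold; these form the stoplist.
        SortedSet<String> rare = new TreeSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() < pruneCount) {
                rare.add(e.getKey());
            }
        }
        return rare;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the cat sat", "the cat ran", "the dog");
        System.out.println(rareWords(lines, 2)); // prints [dog, ran, sat]
    }
}
```

In this example "the" (3 occurrences) and "cat" (2) meet the threshold of 2 and are kept out of the stoplist, while the three words seen only once are returned.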
writeInstanceList
public static void writeInstanceList(SimpleTokenizer prunedTokenizer) throws java.io.IOException
Throws:
    java.io.IOException
main
public static void main(java.lang.String[] args) throws java.io.IOException
Throws:
    java.io.IOException