Package cc.mallet.pipe
Class FixedVocabTokenizer
java.lang.Object
  cc.mallet.pipe.Pipe
    cc.mallet.pipe.FixedVocabTokenizer
All Implemented Interfaces:
AlphabetCarrying, java.io.Serializable
public class FixedVocabTokenizer extends Pipe implements java.io.Serializable
A simple Unicode tokenizer that accepts sequences of letters as tokens.
See Also:
Serialized Form
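The description above can be illustrated with a small standalone sketch. It is not MALLET's implementation and does not use the MALLET API: the class name FixedVocabSketch, the tokenize helper, and the use of a plain Set<String> in place of an Alphabet are all hypothetical. It also assumes that tokens outside the fixed vocabulary are dropped, that minimumLength acts as a lower bound on token length, and that no case folding is applied; the page itself confirms none of these details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of MALLET.
final class FixedVocabSketch {
    // One or more Unicode letters; \p{L} matches letters in any script.
    private static final Pattern LETTERS = Pattern.compile("\\p{L}+");

    // Returns the letter-sequence tokens of `text` that appear in `vocabulary`.
    // Assumption: unknown tokens and tokens shorter than `minimumLength` are skipped.
    static List<String> tokenize(CharSequence text, Set<String> vocabulary, int minimumLength) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = LETTERS.matcher(text);
        while (matcher.find()) {
            String token = matcher.group();
            if (token.length() >= minimumLength && vocabulary.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> vocabulary = new HashSet<>(Arrays.asList("topic", "model"));
        // "The" and "v" are letter sequences but are not in the vocabulary.
        System.out.println(tokenize("The topic-model, v2!", vocabulary, 1)); // prints [topic, model]
    }
}
```

In MALLET itself the vocabulary would be the Alphabet passed to the constructor, and the tokenizer would run as one step of a pipe sequence rather than as a static helper.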
Field Summary
Fields
Modifier and Type    Field            Description
int                  minimumLength
Constructor Summary
Constructors
Constructor                               Description
FixedVocabTokenizer(Alphabet alphabet)
Method Summary
Modifier and Type    Method                     Description
Instance             pipe(Instance instance)    Really this should be 'protected', but isn't for historical reasons.
Methods inherited from class cc.mallet.pipe.Pipe
alphabetsMatch, getAlphabet, getAlphabets, getDataAlphabet, getInstanceId, getTargetAlphabet, instanceFrom, instancesFrom, instancesFrom, isDataAlphabetSet, isTargetProcessing, newIteratorFrom, preceedingPipeDataAlphabetNotification, preceedingPipeTargetAlphabetNotification, precondition, readResolve, setDataAlphabet, setOrCheckDataAlphabet, setOrCheckTargetAlphabet, setTargetAlphabet, setTargetProcessing
Constructor Detail
FixedVocabTokenizer
public FixedVocabTokenizer(Alphabet alphabet)