Class FixedVocabTokenizer

  • All Implemented Interfaces:
    AlphabetCarrying, java.io.Serializable

    public class FixedVocabTokenizer
    extends Pipe
    implements java.io.Serializable
    A simple Unicode tokenizer that accepts sequences of letters as tokens.
    See Also:
    Serialized Form
    • Field Detail

      • minimumLength

        public int minimumLength
    • Constructor Detail

      • FixedVocabTokenizer

        public FixedVocabTokenizer(Alphabet alphabet)
    • Method Detail

      • pipe

        public Instance pipe(Instance instance)
        Description copied from class: Pipe
        Really this should be 'protected', but isn't for historical reasons.
        Overrides:
        pipe in class Pipe
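
        Illustrative sketch (not part of the library): this page does not say how the alphabet or minimumLength affect tokenization, so the following stand-alone class only shows one plausible reading. It assumes that tokens are maximal runs of Unicode letters, that tokens shorter than minimumLength are dropped, and that only tokens present in the fixed vocabulary are kept. The class name FixedVocabSketch, the Set<String> vocabulary (standing in for Alphabet), the tokenize method, and the default minimumLength of 1 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration only; it does not use or reproduce the library class.
public class FixedVocabSketch {
    // Mirrors the public minimumLength field; the default of 1 is an assumption.
    public int minimumLength = 1;

    // Assumed role of the alphabet: a fixed set of admissible tokens.
    private final Set<String> vocabulary;

    public FixedVocabSketch(Set<String> vocabulary) {
        this.vocabulary = vocabulary;
    }

    // Splits text into maximal runs of Unicode letters and keeps the runs
    // that are long enough and present in the vocabulary.
    public List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (Character.isLetter(codePoint)) {
                current.appendCodePoint(codePoint);
            } else {
                flush(current, tokens);
            }
            i += Character.charCount(codePoint);
        }
        flush(current, tokens);
        return tokens;
    }

    // Emits the pending letter run if it passes both filters, then resets the buffer.
    private void flush(StringBuilder current, List<String> tokens) {
        if (current.length() == 0) {
            return;
        }
        String token = current.toString();
        int length = token.codePointCount(0, token.length());
        if (length >= minimumLength && vocabulary.contains(token)) {
            tokens.add(token);
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        FixedVocabSketch tokenizer = new FixedVocabSketch(Set.of("the", "cat", "sat"));
        // Prints [the, cat, sat]: "on", "a", and "mat" are not in the vocabulary.
        System.out.println(tokenizer.tokenize("the cat sat on a mat"));
    }
}
```

        In this sketch, setting minimumLength to 4 before calling tokenize would drop all three vocabulary words, since each has only three letters. Whether the real class applies the length check before or after the vocabulary lookup is not stated on this page.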