Class SimpleTaggerSentence2StringTokenization

    • Constructor Detail

      • SimpleTaggerSentence2StringTokenization

        public SimpleTaggerSentence2StringTokenization()
        Creates a new SimpleTaggerSentence2StringTokenization instance. By default we include tokens as features.
      • SimpleTaggerSentence2StringTokenization

        public SimpleTaggerSentence2StringTokenization(boolean inc)
        Creates a new SimpleTaggerSentence2StringTokenization instance which includes tokens as features if and only if the supplied argument is true.
    • Method Detail

      • pipe

        public Instance pipe(Instance carrier)
        Takes an Instance with data of type String or String[][] and creates an Instance with data of type StringTokenization. Each Token in the sequence gets the text of its corresponding line and one feature of value 1 for each "Feature" in the line. For example, if the String[][] is {{a,b},{c,d,e}} (and target processing is off) then the text would be "a b" for the first token and "c d e" for the second. Also, the features "a" and "b" would be set for the first token and "c", "d" and "e" for the second. When target processing is on, the last element in the String[] for the current token is taken as the target (label), so in the previous example "b" would have been the label of the first token.
        Overrides:
        pipe in class SimpleTaggerSentence2TokenSequence
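
        The mapping described for pipe can be illustrated with a small, library-free sketch. It is not the class's actual implementation: TokenizationSketch, SketchToken and tokenize are hypothetical names introduced only for this example, and the choice to build the token text from the non-label elements when target processing is on is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the String[][] -> token mapping described above.
// None of these names belong to the library.
final class TokenizationSketch {

    /** Stand-in for a token: its text, its binary features, and an optional label. */
    static final class SketchToken {
        final String text;
        final Set<String> features; // each listed feature is treated as having value 1
        final String label;         // null when target processing is off

        SketchToken(String text, Set<String> features, String label) {
            this.text = text;
            this.features = features;
            this.label = label;
        }
    }

    /** Builds one token per input line; empty lines are skipped. */
    static List<SketchToken> tokenize(String[][] lines, boolean targetProcessing) {
        List<SketchToken> tokens = new ArrayList<>();
        for (String[] line : lines) {
            if (line.length == 0) {
                continue;
            }
            // With target processing on, the last element is the label, not a feature.
            int featureCount = targetProcessing ? line.length - 1 : line.length;
            String label = targetProcessing ? line[line.length - 1] : null;
            String[] feats = Arrays.copyOfRange(line, 0, featureCount);
            // Assumption: the token text is the space-joined feature elements.
            String text = String.join(" ", feats);
            tokens.add(new SketchToken(text, new LinkedHashSet<>(Arrays.asList(feats)), label));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[][] input = {{"a", "b"}, {"c", "d", "e"}};
        for (SketchToken t : tokenize(input, false)) {
            System.out.println(t.text + " -> " + t.features);
        }
    }
}
```

        With target processing off, the sketch yields the text "a b" with features a and b for the first token and "c d e" with features c, d and e for the second, matching the example in the description. With target processing on, "b" becomes the label of the first token and "e" the label of the second.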