Class CharSequenceRemoveHTML

  • All Implemented Interfaces:
    AlphabetCarrying, java.io.Serializable

    public class CharSequenceRemoveHTML
    extends Pipe
    This pipe removes HTML markup from a CharSequence. The HTML is actually parsed rather than matched against a pattern, so less markup should slip through. The trade-off is that parsing is almost certainly much slower than a regular expression, and it could fail on broken HTML.
    Author:
    Greg Druck gdruck@cs.umass.edu
    See Also:
    Serialized Form
    • Constructor Detail

      • CharSequenceRemoveHTML

        public CharSequenceRemoveHTML()
    • Method Detail

      • pipe

        public Instance pipe(Instance carrier)
        Description copied from class: Pipe
        This method should really be 'protected'; it is public only for historical reasons.
        Overrides:
        pipe in class Pipe
      • main

        public static void main(java.lang.String[] args)
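
The class description contrasts parsing HTML with stripping it by regular expression. The following is a minimal, self-contained sketch of that contrast, not this class's actual implementation. It assumes the JDK's Swing HTML parser (`ParserDelegator`) as the parser, and the names `HtmlStripSketch`, `parseStrip`, and `regexStrip` are hypothetical, introduced only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical illustration class; not part of the documented library.
public class HtmlStripSketch {

    // Parse-based stripping: run the input through an HTML parser and
    // keep only the text chunks it reports, joined by single spaces.
    public static String parseStrip(String html) {
        final StringBuilder text = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                if (text.length() > 0) {
                    text.append(' ');
                }
                text.append(data);
            }
        };
        try {
            // The third argument tells the parser to ignore charset directives.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    // Regex-based stripping: delete anything that looks like a tag.
    // Cheap, but it does not understand HTML structure.
    public static String regexStrip(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello <b>world</b></p></body></html>";
        System.out.println(parseStrip(html));
        System.out.println(regexStrip(html));
    }
}
```

The regex variant is a single pass over the string, which is why it tends to be faster, but it can be confused by input such as a `>` inside a quoted attribute value. The parser variant pays for building a token stream in exchange for handling such cases structurally.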