Package cc.mallet.extract

Interface Summary

Interface              Description
ExtractionEvaluator    Created: Oct 8, 2004
Extractor              Generic interface for objects that do information extraction.
FieldCleaner           Interface for functions that are used to clean up field values after extraction has been performed.
FieldComparator        Interface for functions that compare extracted values of a field to see if they match.
Span                   A sub-section of a document, either linear or two-dimensional.
Tokenization
TokenizationFilter     Created: Nov 12, 2004
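
As a rough illustration of the roles that FieldCleaner and FieldComparator play, here is a minimal, self-contained sketch. The nested interfaces `Cleaner` and `Comparator`, their method names, and the class `FieldSketch` are hypothetical stand-ins and are not MALLET's actual signatures; the punctuation-stripping rule is likewise an assumption about what "ignoring punctuation" might mean.

```java
import java.util.regex.Pattern;

// Hypothetical stand-ins for illustration only; these are NOT MALLET's interfaces.
final class FieldSketch {

    /** Cleans up a field value after extraction (hypothetical signature). */
    interface Cleaner {
        String clean(String rawValue);
    }

    /** Decides whether two extracted values of a field match (hypothetical signature). */
    interface Comparator {
        boolean matches(String a, String b);
    }

    /** Returns a cleaner that removes every occurrence of the given regex. */
    static Cleaner regexCleaner(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return raw -> pattern.matcher(raw).replaceAll("");
    }

    /** Two values match only if they are identical strings. */
    static final Comparator EXACT = String::equals;

    /** Two values match if they are identical once ASCII punctuation is removed (assumed rule). */
    static final Comparator IGNORE_PUNCTUATION =
            (a, b) -> stripPunctuation(a).equals(stripPunctuation(b));

    private static String stripPunctuation(String s) {
        return s.replaceAll("\\p{Punct}", "");
    }

    public static void main(String[] args) {
        Cleaner dropParentheticals = regexCleaner("\\(.*?\\)");
        System.out.println(dropParentheticals.clean("Amherst (MA)").trim());
        System.out.println(EXACT.matches("U. Mass.", "U Mass"));
        System.out.println(IGNORE_PUNCTUATION.matches("U. Mass.", "U Mass"));
    }
}
```

Keeping cleaning and comparison behind separate small interfaces lets an evaluator combine them freely, for example cleaning both the predicted and the true value before comparing them.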

Class Summary

Class                                      Description
AccuracyCoverageEvaluator                  Constructs an accuracy-coverage graph, using confidence values to sort Fields.
BIOTokenizationFilter                      Created: Nov 12, 2004
BIOTokenizationFilterWithTokenIndices
ConfidenceTokenizationFilter               Created: Oct 26, 2005
CRFExtractor                               Created: Oct 12, 2004
DefaultTokenizationFilter                  Created: Nov 12, 2004
DocumentExtraction                         Created: Oct 12, 2004
DocumentViewer                             Diagnostic class that outputs HTML pages that allow you to view errors on a more global, per-instance basis.
Element
ExactMatchComparator                       Created: Nov 23, 2004
Extraction                                 The results of doing information extraction.
ExtractionConfidenceEstimator              Estimates the confidence in the labeling of a LabeledSpan.
Field                                      Created: Oct 12, 2004
HierarchicalTokenizationFilter             Tokenization filter that will create nested spans based on a hierarchical labeling of the data.
LabeledSpan                                Created: Oct 12, 2004
LabeledSpans                               Created: Oct 31, 2004
LatticeViewer                              Created: Oct 31, 2004
PerDocumentF1Evaluator                     Created: Oct 8, 2004
PerFieldF1Evaluator                        Created: Oct 8, 2004
PunctuationIgnoringComparator              Created: Nov 23, 2004
Record                                     Created: Oct 12, 2004
RegexFieldCleaner                          A field cleaner that removes all occurrences of a given regex.
StringSpan                                 A sub-section of a linear string.
StringTokenization
Text
TransducerExtractionConfidenceEstimator    Estimates the confidence in the labeling of a LabeledSpan using a TransducerConfidenceEstimator.
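
The summary does not say how the BIO tokenization filter works. As a general illustration of the BIO convention its name refers to, the following self-contained sketch groups per-token tags (`B-x` begins a field, `I-x` continues it, `O` is outside any field) into labeled token ranges. The class `BioSketch`, its `Span` type, and the lenient handling of a stray `I-` tag are all assumptions made for this example; this is not MALLET's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of BIO decoding; NOT MALLET's BIOTokenizationFilter.
final class BioSketch {

    /** A labeled run of tokens; start is inclusive, end is exclusive. */
    static final class Span {
        final String label;
        final int start;
        final int end;

        Span(String label, int start, int end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return label + "[" + start + "," + end + ")";
        }
    }

    /** Groups BIO tags into spans. A stray I- tag is treated as starting a new span. */
    static List<Span> toSpans(String[] tags) {
        List<Span> spans = new ArrayList<>();
        String current = null; // label of the span being built, or null if none
        int start = -1;
        // Iterate one step past the end with a virtual "O" so the last span is closed.
        for (int i = 0; i <= tags.length; i++) {
            String tag = i < tags.length ? tags[i] : "O";
            boolean continues = current != null && tag.equals("I-" + current);
            if (continues) {
                continue;
            }
            if (current != null) {
                spans.add(new Span(current, start, i));
                current = null;
            }
            if (tag.startsWith("B-") || tag.startsWith("I-")) {
                current = tag.substring(2);
                start = i;
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tags = {"B-name", "I-name", "O", "B-date", "O"};
        System.out.println(toSpans(tags));
    }
}
```

For the tags in `main`, the sketch yields one `name` span covering tokens 0 and 1 and one `date` span covering token 3.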