Package cc.mallet.extract
Class Extraction
java.lang.Object
    cc.mallet.extract.Extraction
public class Extraction extends java.lang.Object
The results of information extraction. This class is designed to handle field extraction from a single document, or relation extraction and coreference across multiple documents.
-
Constructor Summary
Constructor and Description
Extraction(Extractor extractor, LabelAlphabet dict)
    Creates an empty Extraction object.
Extraction(Extractor extractor, LabelAlphabet dict, java.lang.String name, Tokenization input, Sequence output, java.lang.String background)
    Creates an extraction from a sequence output by a per-sequence labeler, such as an HMM or a CRF.
-
Method Summary
Modifier and Type, Method
void addDocumentExtraction(DocumentExtraction docseq)
void cleanFields(FieldCleaner cleaner)
DocumentExtraction getDocumentExtraction(int idx)
Extractor getExtractor()
LabelAlphabet getLabelAlphabet()
int getNumDocuments()
int getNumRecords()
Record getRecord(int idx)
Record getTargetRecord(int docnum)
void print(java.io.PrintWriter writer)
-
Constructor Detail
-
Extraction
public Extraction(Extractor extractor, LabelAlphabet dict)
Creates an empty Extraction object. DocumentExtractions can be added later with the addDocumentExtraction method.
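The create-empty-then-add pattern described above can be sketched as follows. This is a self-contained illustration only: DocStandIn and ExtractionStandIn are hypothetical stand-ins, not MALLET classes, and they borrow just the method names addDocumentExtraction, getNumDocuments, getDocumentExtraction and print from the summary on this page. Zero-based indexing and the printed format are assumptions, since this page does not document either.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

public class ExtractionSketch {

    // Hypothetical stand-in for a DocumentExtraction; only a name is modeled.
    static class DocStandIn {
        final String name;
        DocStandIn(String name) { this.name = name; }
    }

    // Hypothetical stand-in that mirrors method names from the Extraction summary.
    static class ExtractionStandIn {
        private final List<DocStandIn> docs = new ArrayList<>();

        // Starts empty; documents are appended one at a time.
        void addDocumentExtraction(DocStandIn docseq) { docs.add(docseq); }

        int getNumDocuments() { return docs.size(); }

        // Assumption: indices are zero-based.
        DocStandIn getDocumentExtraction(int idx) { return docs.get(idx); }

        // Assumption: one "index: name" line per document; the real format is not documented here.
        void print(PrintWriter writer) {
            for (int i = 0; i < getNumDocuments(); i++) {
                writer.println(i + ": " + getDocumentExtraction(i).name);
            }
            writer.flush();
        }
    }

    // Builds an empty container, adds two documents, and renders it to a string.
    static String demo() {
        ExtractionStandIn extraction = new ExtractionStandIn();
        extraction.addDocumentExtraction(new DocStandIn("doc-a"));
        extraction.addDocumentExtraction(new DocStandIn("doc-b"));
        StringWriter buffer = new StringWriter();
        extraction.print(new PrintWriter(buffer));
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

With the real class, the same loop over getNumDocuments() and getDocumentExtraction(int) would presumably apply, but the constructor arguments (an Extractor and a LabelAlphabet) have to come from an actual MALLET pipeline.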
-
Extraction
public Extraction(Extractor extractor, LabelAlphabet dict, java.lang.String name, Tokenization input, Sequence output, java.lang.String background)
Creates an extraction from a sequence output by a per-sequence labeler, such as an HMM or a CRF. The extraction will contain a single document.
-
Method Detail
-
addDocumentExtraction
public void addDocumentExtraction(DocumentExtraction docseq)
-
getRecord
public Record getRecord(int idx)
-
getNumRecords
public int getNumRecords()
-
getDocumentExtraction
public DocumentExtraction getDocumentExtraction(int idx)
-
getNumDocuments
public int getNumDocuments()
-
getExtractor
public Extractor getExtractor()
-
getTargetRecord
public Record getTargetRecord(int docnum)
-
getLabelAlphabet
public LabelAlphabet getLabelAlphabet()
-
cleanFields
public void cleanFields(FieldCleaner cleaner)
-
print
public void print(java.io.PrintWriter writer)