Package cc.mallet.extract
Class Extraction
- java.lang.Object
-
- cc.mallet.extract.Extraction
-
public class Extraction extends java.lang.Object
Holds the results of information extraction. It is designed to handle field extraction from a single document, or relation extraction and coreference across multiple documents.
-
-
Constructor Summary
Constructors
Constructor Description
Extraction(Extractor extractor, LabelAlphabet dict)
Creates an empty Extraction object.
Extraction(Extractor extractor, LabelAlphabet dict, java.lang.String name, Tokenization input, Sequence output, java.lang.String background)
Creates an extraction from a sequence output by a per-sequence labeler, such as an HMM or a CRF.
-
Method Summary
Modifier and Type Method
void addDocumentExtraction(DocumentExtraction docseq)
void cleanFields(FieldCleaner cleaner)
DocumentExtraction getDocumentExtraction(int idx)
Extractor getExtractor()
LabelAlphabet getLabelAlphabet()
int getNumDocuments()
int getNumRecords()
Record getRecord(int idx)
Record getTargetRecord(int docnum)
void print(java.io.PrintWriter writer)
-
-
-
Constructor Detail
-
Extraction
public Extraction(Extractor extractor, LabelAlphabet dict)
Creates an empty Extraction object. DocumentExtractions can be added later with the addDocumentExtraction method.
-
Extraction
public Extraction(Extractor extractor, LabelAlphabet dict, java.lang.String name, Tokenization input, Sequence output, java.lang.String background)
Creates an extraction from a sequence output by a per-sequence labeler, such as an HMM or a CRF. The extraction will contain a single document.
-
-
Method Detail
-
addDocumentExtraction
public void addDocumentExtraction(DocumentExtraction docseq)
-
getRecord
public Record getRecord(int idx)
-
getNumRecords
public int getNumRecords()
-
getDocumentExtraction
public DocumentExtraction getDocumentExtraction(int idx)
-
getNumDocuments
public int getNumDocuments()
-
getExtractor
public Extractor getExtractor()
-
getTargetRecord
public Record getTargetRecord(int docnum)
-
getLabelAlphabet
public LabelAlphabet getLabelAlphabet()
-
cleanFields
public void cleanFields(FieldCleaner cleaner)
-
print
public void print(java.io.PrintWriter writer)
-
-
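Documents and records are exposed through count-plus-index accessor pairs (getNumDocuments with getDocumentExtraction, getNumRecords with getRecord), not through an Iterable. The sketch below shows one way to turn such a pair into a List. IndexedAccess and toList are hypothetical names that are not part of MALLET, and the code assumes indices run from 0 to count - 1, which this page does not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical helper, not part of MALLET: adapts a (count, get-by-index)
// accessor pair to a java.util.List.
final class IndexedAccess {
    private IndexedAccess() {}

    // Calls getter for each index 0..count-1 (assumed zero-based) and
    // returns the results in index order.
    static <T> List<T> toList(int count, IntFunction<T> getter) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative: " + count);
        }
        List<T> items = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            items.add(getter.apply(i));
        }
        return items;
    }
}
```

With MALLET on the classpath and an Extraction instance named extraction, a call such as IndexedAccess.toList(extraction.getNumDocuments(), extraction::getDocumentExtraction) should yield the per-document results. The same shape applies to getNumRecords and getRecord.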