java.lang.Object
- cc.mallet.fst.semi_supervised.tui.SimpleTaggerWithConstraints

```
public class SimpleTaggerWithConstraints
extends java.lang.Object
```
Version of SimpleTagger that trains CRFs with expectation constraints rather than labeled data. This class's main method trains, tests, or runs a generic CRF-based sequence tagger.
Training and test files consist of blocks of lines, one block for each instance, separated by blank lines. Each block of lines should have the first form specified for the input of SimpleTagger.SimpleTaggerSentence2FeatureVectorSequence. A variety of command line options control the operation of the main program, as described in the comments for main.
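
The block layout described above can be illustrated with a small standalone sketch. It assumes that the "first form" is plain text with one sequence element per line, written as whitespace-separated tokens; that reading, the class name `BlockFormatSketch`, and the helper `parseBlocks` are illustrative assumptions and not part of MALLET.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockFormatSketch {
    /** Splits file contents into instances (blocks); each instance is a list of token rows. */
    static List<List<String[]>> parseBlocks(String contents) {
        List<List<String[]>> instances = new ArrayList<>();
        List<String[]> current = new ArrayList<>();
        for (String line : contents.split("\\R", -1)) {
            if (line.trim().isEmpty()) {
                // A blank line closes the current instance, if one is open.
                if (!current.isEmpty()) {
                    instances.add(current);
                    current = new ArrayList<>();
                }
            } else {
                // Assumption: each non-blank line is one element, tokens split on whitespace.
                current.add(line.trim().split("\\s+"));
            }
        }
        if (!current.isEmpty()) {
            instances.add(current); // the last block may lack a trailing blank line
        }
        return instances;
    }

    public static void main(String[] args) {
        // Hypothetical data: two instances, with two rows and one row respectively.
        String data = "CAPITALIZED Bill\nLOWERCASE slept\n\nLOWERCASE here\n";
        List<List<String[]>> instances = parseBlocks(data);
        System.out.println(instances.size());        // prints 2
        System.out.println(instances.get(0).size()); // prints 2
    }
}
```

This only mirrors the file layout; the actual conversion to feature vectors is done by MALLET's own pipes.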

Version:

1.0

Author:

Gregory Druck gdruck@cs.umass.edu

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| `static Sequence[]` | `apply(Transducer model, Sequence input, int k)` | Apply a transducer to an input sequence to produce the k highest-scoring output sequences. |
| `static CRF` | `getCRF(InstanceList training, int[] orders, java.lang.String defaultLabel, java.lang.String forbidden, java.lang.String allowed, boolean connected)` | |
| `static void` | `main(java.lang.String[] args)` | Command-line wrapper to train, test, or run a generic CRF-based tagger. |
| `static void` | `test(TransducerTrainer tt, TransducerEvaluator eval, InstanceList testing)` | Test a transducer on the given test data, evaluating accuracy with the given evaluator. |
| `static CRF` | `trainGE(InstanceList training, InstanceList testing, java.util.ArrayList<GEConstraint> constraints, CRF crf, TransducerEvaluator eval, int iterations, double var, int resets)` | Train the given CRF with generalized expectation (GE) constraints, optionally testing it on the given test data. |
| `static CRF` | `trainPR(InstanceList training, InstanceList testing, java.util.ArrayList<PRConstraint> constraints, CRF crf, TransducerEvaluator eval, int iterations, double var)` | Train the given CRF with posterior regularization (PR) constraints, optionally testing it on the given test data. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Method Detail
  - trainGE
```
public static CRF trainGE(InstanceList training,
                          InstanceList testing,
                          java.util.ArrayList<GEConstraint> constraints,
                          CRF crf,
                          TransducerEvaluator eval,
                          int iterations,
                          double var,
                          int resets)
```
    Train the given CRF with generalized expectation (GE) constraints, optionally testing it on the given test data.
    
    Parameters:
    
    training - training data
    
    testing - test data (possibly null)
    
    constraints - the GE constraints to train with
    
    crf - the CRF model to train
    
    eval - accuracy evaluator (possibly null)
    
    iterations - number of training iterations
    
    var - Gaussian prior variance
    
    resets - Number of resets.
    
    Returns:
    
    the trained model
  - trainPR
```
public static CRF trainPR(InstanceList training,
                          InstanceList testing,
                          java.util.ArrayList<PRConstraint> constraints,
                          CRF crf,
                          TransducerEvaluator eval,
                          int iterations,
                          double var)
```
    Train the given CRF with posterior regularization (PR) constraints, optionally testing it on the given test data.
    
    Parameters:
    
    training - training data
    
    testing - test data (possibly null)
    
    constraints - the PR constraints to train with
    
    crf - the CRF model to train
    
    eval - accuracy evaluator (possibly null)
    
    iterations - number of training iterations
    
    var - Gaussian prior variance
    
    Returns:
    
    the trained model
  - getCRF
```
public static CRF getCRF(InstanceList training,
                         int[] orders,
                         java.lang.String defaultLabel,
                         java.lang.String forbidden,
                         java.lang.String allowed,
                         boolean connected)
```
  - test
```
public static void test(TransducerTrainer tt,
                        TransducerEvaluator eval,
                        InstanceList testing)
```
    Test a transducer on the given test data, evaluating accuracy with the given evaluator.
    
    Parameters:
    
    tt - the TransducerTrainer whose transducer is tested
    
    eval - accuracy evaluator
    
    testing - test data
  - apply
```
public static Sequence[] apply(Transducer model,
                               Sequence input,
                               int k)
```
    Apply a transducer to an input sequence to produce the k highest-scoring output sequences.
    
    Parameters:
    
    model - the Transducer
    
    input - the input sequence
    
    k - the number of answers to return
    
    Returns:
    
    array of the k highest-scoring output sequences
  - main
```
public static void main(java.lang.String[] args)
                 throws java.lang.Exception
```
    Command-line wrapper to train, test, or run a generic CRF-based tagger.
    Parameters:
    
    args - the command line arguments. Options (shell and Java quoting should be added as needed):
    
    --help boolean
    
    Print this command line option usage information. Give true for longer documentation. Default is false.
    
    --prefix-code Java-code
    
    Java code you want run before any other interpreted code. Note that the text is interpreted without modification, so unlike some other Java code options, you need to include any necessary 'new's. Default is null.
    
    --gaussian-variance positive-number
    
    The Gaussian prior variance used for training. Default is 10.0.
    
    --train boolean
    
    Whether to train. Default is false.
    
    --iterations positive-integer
    
    Number of training iterations. Default is 500.
    
    --test lab or seg=start-1.continue-1,...,start-n.continue-n
    
    Test measuring labeling or segmentation (start-i, continue-i) accuracy. Default is no testing.
    
    --training-proportion number-between-0-and-1
    
    Fraction of data to use for training in a random split. Default is 0.5.
    
    --model-file filename
    
    The filename for reading (train/run) or saving (train) the model. Default is null.
    
    --random-seed integer
    
    The random seed for randomly selecting a proportion of the instance list for training. Default is 0.
    
    --orders comma-separated-integers
    
    List of label Markov orders (main and backoff). Default is 1.
    
    --forbidden regular-expression
    
    If label-1,label-2 matches the expression, the corresponding transition is forbidden. Default is \\s (nothing forbidden).
    
    --allowed regular-expression
    
    If label-1,label-2 does not match the expression, the corresponding transition is forbidden. Default is .* (everything allowed).
    
    --default-label string
    
    Label for initial context and uninteresting tokens. Default is O.
    
    --viterbi-output boolean
    
    Print Viterbi output periodically during training. Default is false.
    
    --fully-connected boolean
    
    Include all allowed transitions, even those not in training data. Default is true.
    
    --weights sparse|some-dense|dense
    
    Create sparse, some dense (using a heuristic), or dense features on transitions. Default is some-dense.
    
    --n-best positive-integer
    
    Number of answers to output when applying model. Default is 1.
    
    --include-input boolean
    
    Whether to include input features when printing decoding output. Default is false.
    
    --threads positive-integer
    
    Number of threads for CRF training. Default is 1.
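    
    How the --forbidden and --allowed expressions interact can be sketched in plain Java. The sketch assumes each expression must match the whole string `label-1,label-2`; the class `TransitionFilterSketch`, the helper `isAllowed`, and the label names are hypothetical and only illustrate the rule stated in the option descriptions.

```java
import java.util.regex.Pattern;

public class TransitionFilterSketch {
    /** Returns true if the transition prev -> next passes both filters. */
    static boolean isAllowed(String prev, String next, Pattern forbidden, Pattern allowed) {
        String pair = prev + "," + next; // the "label-1,label-2" string from the option text
        // Assumption: whole-string matching for both expressions.
        if (forbidden.matcher(pair).matches()) {
            return false; // explicitly forbidden
        }
        return allowed.matcher(pair).matches(); // anything not allowed is forbidden too
    }

    public static void main(String[] args) {
        Pattern allowAll = Pattern.compile(".*");       // the documented --allowed default
        Pattern noOutsideToInside = Pattern.compile("O,I-.*"); // hypothetical --forbidden value
        Pattern defaultForbidden = Pattern.compile("\\s");     // the documented --forbidden default

        System.out.println(isAllowed("O", "I-PER", noOutsideToInside, allowAll));     // prints false
        System.out.println(isAllowed("B-PER", "I-PER", noOutsideToInside, allowAll)); // prints true
        System.out.println(isAllowed("O", "I-PER", defaultForbidden, allowAll));      // prints true
    }
}
```

    Under that assumption the default --forbidden value never matches a label pair, which is consistent with the "nothing forbidden" note.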
    
    Remaining arguments:
    
    training-data-file if training
    
    training-and-test-data-file, if training and testing with random split
    
    training-data-file test-data-file if training and testing from separate files
    
    test-data-file if testing
    
    input-data-file if applying to new data (unlabeled)
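    
    For the random-split case, the roles of --training-proportion and --random-seed can be sketched as a seeded shuffle followed by a cut. This is an illustration only: `RandomSplitSketch` and `split` are hypothetical names, and MALLET's own splitting routine may select instances differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RandomSplitSketch {
    /** Returns two lists: the training part first, then the remaining (test) part. */
    static <T> List<List<T>> split(List<T> instances, double trainingProportion, long seed) {
        List<T> shuffled = new ArrayList<>(instances);
        // The same seed always yields the same shuffle, hence the same split.
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(trainingProportion * shuffled.size());
        List<List<T>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            ids.add(i); // stand-ins for instances
        }
        List<List<Integer>> parts = split(ids, 0.5, 0L); // the documented defaults
        System.out.println(parts.get(0).size()); // prints 5
        System.out.println(parts.get(1).size()); // prints 5
    }
}
```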
    
    Throws:
    
    java.lang.Exception - if an error occurs
