Package cc.mallet.fst

Class CRF

  • All Implemented Interfaces:
    java.io.Serializable
    Direct Known Subclasses:
    MEMM

    public class CRF
    extends Transducer
    implements java.io.Serializable
    Represents a CRF model.
    See Also:
    Serialized Form
    • Field Detail

      • inputAlphabet

        protected Alphabet inputAlphabet
      • outputAlphabet

        protected Alphabet outputAlphabet
      • states

        protected java.util.ArrayList<CRF.State> states
      • initialStates

        protected java.util.ArrayList<CRF.State> initialStates
      • name2state

        protected java.util.HashMap<java.lang.String,​CRF.State> name2state
      • featureInducers

        protected java.util.ArrayList<FeatureInducer> featureInducers
      • weightsValueChangeStamp

        protected int weightsValueChangeStamp
      • weightsStructureChangeStamp

        protected int weightsStructureChangeStamp
      • cachedNumParametersStamp

        protected int cachedNumParametersStamp
      • numParameters

        protected int numParameters
    • Constructor Detail

      • CRF

        public CRF​(Pipe inputPipe,
                   Pipe outputPipe)
      • CRF

        public CRF​(CRF other)
        Create a CRF whose states and weights are a copy of those from another CRF.
    • Method Detail

      • getInputAlphabet

        public Alphabet getInputAlphabet()
      • getOutputAlphabet

        public Alphabet getOutputAlphabet()
      • weightsStructureChanged

        public void weightsStructureChanged()
        This method should be called whenever the CRF's weights (parameters) have their structure/arity/number changed.
      • weightsValueChanged

        public void weightsValueChanged()
        This method should be called whenever the CRF's weights (parameters) are changed.
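        The change stamps and the cached parameter count can be read as a cache-invalidation scheme. The following is a minimal sketch of that idea in plain Java, not MALLET's actual implementation: StampedWeights and all of its members are hypothetical names, and the assumption that each change simply advances an integer counter is made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a model that tracks weight changes with integer stamps. */
final class StampedWeights {
    private final List<double[]> weightSets = new ArrayList<>();
    private int structureChangeStamp = 0;       // advanced when the number or shape of weights changes
    private int valueChangeStamp = 0;           // advanced when any weight value changes
    private int cachedNumParametersStamp = -1;  // structure stamp at which numParameters was computed
    private int numParameters = 0;

    /** Adding a weight set changes the structure; this sketch also counts it as a value change. */
    void addWeightSet(int size) {
        weightSets.add(new double[size]);
        structureChangeStamp++;
        valueChangeStamp++;
    }

    /** Changing a single value leaves the structure stamp untouched. */
    void setWeight(int setIndex, int featureIndex, double value) {
        weightSets.get(setIndex)[featureIndex] = value;
        valueChangeStamp++;
    }

    /** Recounts the parameters only if the structure changed since the last count. */
    int getNumParameters() {
        if (cachedNumParametersStamp != structureChangeStamp) {
            int total = 0;
            for (double[] set : weightSets) {
                total += set.length;
            }
            numParameters = total;
            cachedNumParametersStamp = structureChangeStamp;
        }
        return numParameters;
    }

    int getStructureChangeStamp() { return structureChangeStamp; }

    int getValueChangeStamp() { return valueChangeStamp; }
}
```

        A caller that keeps its own derived data (for example, a trainer's gradient buffers) could compare a remembered stamp against the current one in the same way to decide whether that data is stale.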
      • newState

        protected CRF.State newState​(java.lang.String name,
                                     int index,
                                     double initialWeight,
                                     double finalWeight,
                                     java.lang.String[] destinationNames,
                                     java.lang.String[] labelNames,
                                     java.lang.String[][] weightNames,
                                     CRF crf)
      • addState

        public void addState​(java.lang.String name,
                             double initialWeight,
                             double finalWeight,
                             java.lang.String[] destinationNames,
                             java.lang.String[] labelNames,
                             java.lang.String[][] weightNames)
      • addState

        public void addState​(java.lang.String name,
                             double initialWeight,
                             double finalWeight,
                             java.lang.String[] destinationNames,
                             java.lang.String[] labelNames,
                             java.lang.String[] weightNames)
      • addState

        public void addState​(java.lang.String name,
                             double initialWeight,
                             double finalWeight,
                             java.lang.String[] destinationNames,
                             java.lang.String[] labelNames)
        Default gives separate parameters to each transition.
      • addState

        public void addState​(java.lang.String name,
                             java.lang.String[] destinationNames)
        Add a state with parameters equal to zero, and with labels on outgoing arcs that have the same names as their destination states.
      • addFullyConnectedStates

        public void addFullyConnectedStates​(java.lang.String[] stateNames)
        Add a group of states that are fully connected with each other, with parameters equal to zero, and with labels on their outgoing arcs that have the same names as their destination states.
      • addFullyConnectedStatesForLabels

        public void addFullyConnectedStatesForLabels()
      • addStartState

        public void addStartState()
      • addStartState

        public void addStartState​(java.lang.String name)
      • setAsStartState

        public void setAsStartState​(CRF.State state)
      • addStatesForLabelsConnectedAsIn

        public void addStatesForLabelsConnectedAsIn​(InstanceList trainingSet)
        Add states to create a first-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
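        To make "only those transitions that occur in the given trainingSet" concrete, here is a small sketch in plain Java that does not use MALLET. It assumes each training sequence is already available as a list of label strings; ObservedTransitions and collect are hypothetical names, and the real method reads its labels from an InstanceList instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper: finds which label-to-label transitions appear in the data. */
final class ObservedTransitions {
    /** Maps each source label to the sorted set of labels that directly follow it in some sequence. */
    static Map<String, Set<String>> collect(List<List<String>> labelSequences) {
        Map<String, Set<String>> destinations = new TreeMap<>();
        for (List<String> sequence : labelSequences) {
            // Every label becomes a state, even if it never has an outgoing transition.
            for (String label : sequence) {
                destinations.computeIfAbsent(label, k -> new TreeSet<>());
            }
            // Record each adjacent pair (previous label -> current label).
            for (int i = 1; i < sequence.size(); i++) {
                destinations.get(sequence.get(i - 1)).add(sequence.get(i));
            }
        }
        return destinations;
    }

    public static void main(String[] args) {
        List<List<String>> data = Arrays.asList(
                Arrays.asList("O", "B", "I", "O"),
                Arrays.asList("B", "I", "I"));
        // prints {B=[I], I=[I, O], O=[B]}
        System.out.println(collect(data));
    }
}
```

        In this toy data the transition from O to I never occurs, so a model built this way would have no arc for it, whereas a fully connected model would.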
      • addStatesForHalfLabelsConnectedAsIn

        public void addStatesForHalfLabelsConnectedAsIn​(InstanceList trainingSet)
        Add as many states as there are labels, but don't create separate weights for each source-destination pair of states. Instead have all the incoming transitions to a state share the same weights.
      • addStatesForThreeQuarterLabelsConnectedAsIn

        public void addStatesForThreeQuarterLabelsConnectedAsIn​(InstanceList trainingSet)
        Add as many states as there are labels, but don't create separate observational-test weights for each source-destination pair of states. Instead, have all the incoming transitions to a state share the same observational-feature-test weights. However, do create a separate default feature for each transition (which acts as an HMM-style transition probability).
      • addFullyConnectedStatesForThreeQuarterLabels

        public void addFullyConnectedStatesForThreeQuarterLabels​(InstanceList trainingSet)
      • addFullyConnectedStatesForBiLabels

        public void addFullyConnectedStatesForBiLabels()
      • addStatesForBiLabelsConnectedAsIn

        public void addStatesForBiLabelsConnectedAsIn​(InstanceList trainingSet)
        Add states to create a second-order Markov model on labels, adding only those transitions that occur in the given trainingSet.
      • addFullyConnectedStatesForTriLabels

        public void addFullyConnectedStatesForTriLabels()
      • addSelfTransitioningStateForAllLabels

        public void addSelfTransitioningStateForAllLabels​(java.lang.String name)
      • addOrderNStates

        public java.lang.String addOrderNStates​(InstanceList trainingSet,
                                                int[] orders,
                                                boolean[] defaults,
                                                java.lang.String start,
                                                java.util.regex.Pattern forbidden,
                                                java.util.regex.Pattern allowed,
                                                boolean fullyConnected)
        Assumes that the CRF's output alphabet contains Strings. Creates an order-n CRF with input predicates and output labels given by trainingSet, and with order, connectivity, and weights given by the remaining arguments.
        Parameters:
        trainingSet - the training instances
        orders - an array of increasing non-negative numbers giving the orders of the features for this CRF. The largest number n is the Markov order of the CRF. States are n-tuples of output labels. Each of the other numbers k in orders represents a weight set shared by all destination states whose last (most recent) k labels agree. If orders is null, an order-0 CRF is built.
        defaults - If non-null, it must be the same length as orders, with true positions indicating that the weight set for the corresponding order contains only the weight for a default feature; otherwise, the weight set has weights for all features built from input predicates.
        start - The label that represents the context of the start of a sequence. It may also be used for sequence labels. If no label of this name exists, one will be added. Connections will be added between the start label and all other labels, even if fullyConnected is false. This argument may be null, in which case no special start state is added.
        forbidden - If non-null, specifies what pairs of successive labels are not allowed, both for constructing order-n states and for transitions. A label pair (u,v) is not allowed if u + "," + v matches forbidden.
        allowed - If non-null, specifies what pairs of successive labels are allowed, both for constructing order-n states and for transitions. A label pair (u,v) is allowed only if u + "," + v matches allowed.
        fullyConnected - Whether to include all allowed transitions, even those not occurring in trainingSet.
        Returns:
        The name of the start state.
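        The forbidden and allowed parameters both test the string u + "," + v against a regular expression. The sketch below shows one way such a check could look using only java.util.regex. LabelPairFilter and isPermitted are hypothetical names, and the use of full-string matching (Matcher.matches()) is an assumption, since the parameter descriptions do not say whether a full or partial match is intended.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that applies forbidden/allowed patterns to a label pair. */
final class LabelPairFilter {
    /**
     * Returns true if the transition u -> v passes both filters.
     * A null pattern means "no constraint" for that filter.
     */
    static boolean isPermitted(String u, String v, Pattern forbidden, Pattern allowed) {
        String pair = u + "," + v;
        // Assumption: the whole pair string must match, not just a substring of it.
        if (forbidden != null && forbidden.matcher(pair).matches()) {
            return false;
        }
        if (allowed != null && !allowed.matcher(pair).matches()) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Example constraint for BIO-style tags: an "inside" tag may not follow "O".
        Pattern forbidden = Pattern.compile("O,I-.*");
        System.out.println(isPermitted("O", "I-PER", forbidden, null));     // prints false
        System.out.println(isPermitted("B-PER", "I-PER", forbidden, null)); // prints true
    }
}
```

        With this reading, a pattern such as "O,I-.*" would rule out every transition from O to any label beginning with I-, regardless of whether that transition appears in the training data.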
      • getState

        public CRF.State getState​(java.lang.String name)
      • setWeights

        public void setWeights​(int weightsIndex,
                               SparseVector transitionWeights)
      • setWeights

        public void setWeights​(java.lang.String weightName,
                               SparseVector transitionWeights)
      • getWeightsName

        public java.lang.String getWeightsName​(int weightIndex)
      • getWeights

        public SparseVector getWeights​(java.lang.String weightName)
      • getWeights

        public SparseVector getWeights​(int weightIndex)
      • getDefaultWeights

        public double[] getDefaultWeights()
      • setDefaultWeights

        public void setDefaultWeights​(double[] w)
      • setDefaultWeight

        public void setDefaultWeight​(int widx,
                                     double val)
      • isWeightsFrozen

        public boolean isWeightsFrozen​(int weightsIndex)
      • freezeWeights

        public void freezeWeights​(int weightsIndex)
        Freezes a set of weights to their current values. Frozen weights are used for labeling sequences (as in transduce), but are not modified by the train methods.
        Parameters:
        weightsIndex - Index of weight set to freeze.
      • freezeWeights

        public void freezeWeights​(java.lang.String weightsName)
        Freezes a set of weights to their current values. Frozen weights are used for labeling sequences (as in transduce), but are not modified by the train methods.
        Parameters:
        weightsName - Name of weight set to freeze.
      • unfreezeWeights

        public void unfreezeWeights​(java.lang.String weightsName)
        Unfreezes a set of weights. Frozen weights are used for labeling sequences (as in transduce), but are not modified by the train methods.
        Parameters:
        weightsName - Name of weight set to unfreeze.
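        The freeze/unfreeze contract described above (frozen weights are still read for labeling, but skipped by training updates) can be illustrated with a small stand-alone sketch. This is not MALLET code: FreezableWeights and its methods are hypothetical, and MALLET's trainers may handle frozen weight sets differently internally.

```java
/** Hypothetical container of weight sets, each of which can be frozen independently. */
final class FreezableWeights {
    private final double[][] weights;
    private final boolean[] frozen;

    FreezableWeights(int numSets, int setSize) {
        weights = new double[numSets][setSize];
        frozen = new boolean[numSets];
    }

    void freeze(int setIndex) { frozen[setIndex] = true; }

    void unfreeze(int setIndex) { frozen[setIndex] = false; }

    boolean isFrozen(int setIndex) { return frozen[setIndex]; }

    double get(int setIndex, int featureIndex) { return weights[setIndex][featureIndex]; }

    /** Training-style update: adds step to every weight, skipping frozen sets. */
    void addToAll(double step) {
        for (int s = 0; s < weights.length; s++) {
            if (frozen[s]) {
                continue; // frozen sets keep their current values
            }
            for (int f = 0; f < weights[s].length; f++) {
                weights[s][f] += step;
            }
        }
    }

    /** Labeling-style read: sums a set's weights whether or not the set is frozen. */
    double sum(int setIndex) {
        double total = 0.0;
        for (double w : weights[setIndex]) {
            total += w;
        }
        return total;
    }
}
```

        Freezing is useful, for instance, when some weight sets come from an earlier training run and should be kept fixed while the remaining sets are fitted to new data.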
      • setFeatureSelection

        public void setFeatureSelection​(int weightIdx,
                                        FeatureSelection fs)
      • setWeightsDimensionAsIn

        public void setWeightsDimensionAsIn​(InstanceList trainingData)
      • setWeightsDimensionAsIn

        public void setWeightsDimensionAsIn​(InstanceList trainingData,
                                            boolean useSomeUnsupportedTrick)
      • setWeightsDimensionDensely

        public void setWeightsDimensionDensely()
      • getWeightsIndex

        public int getWeightsIndex​(java.lang.String weightName)
      • isTrainable

        public boolean isTrainable()
      • getWeightsValueChangeStamp

        public int getWeightsValueChangeStamp()
      • getWeightsStructureChangeStamp

        public int getWeightsStructureChangeStamp()
      • getParametersAbsNorm

        public double getParametersAbsNorm()
      • setParameter

        public void setParameter​(int sourceStateIndex,
                                 int destStateIndex,
                                 int featureIndex,
                                 double value)
        Only sets the parameter from the first group of parameters.
      • setParameter

        public void setParameter​(int sourceStateIndex,
                                 int destStateIndex,
                                 int featureIndex,
                                 int weightIndex,
                                 double value)
      • getParameter

        public double getParameter​(int sourceStateIndex,
                                   int destStateIndex,
                                   int featureIndex)
        Only gets the parameter from the first group of parameters.
      • getParameter

        public double getParameter​(int sourceStateIndex,
                                   int destStateIndex,
                                   int featureIndex,
                                   int weightIndex)
      • getNumParameters

        public int getNumParameters()
      • predict

        @Deprecated
        public Sequence[] predict​(InstanceList testing)
        Deprecated.
        This method is deprecated.
      • induceFeaturesFor

        public void induceFeaturesFor​(InstanceList instances)
        When the CRF has done feature induction, these new feature conjunctions must be created in the test or validation data in order for them to take effect.
      • print

        public void print​(java.io.PrintWriter out)
      • write

        public void write​(java.io.File f)