Package cc.mallet.types
Class Alphabet
- java.lang.Object
  - cc.mallet.types.Alphabet
- All Implemented Interfaces: java.io.Serializable
- Direct Known Subclasses: LabelAlphabet
public class Alphabet extends java.lang.Object implements java.io.Serializable
A mapping between integers and objects where the mapping in each direction is efficient. Integers are assigned consecutively, starting at zero, as objects are added to the Alphabet. Objects cannot be deleted from the Alphabet, and thus the integers are never reused.
The most common use of an alphabet is as a dictionary of feature names associated with a FeatureVector in an Instance. In a simple document classification usage, each unique word in a document would be a unique entry in the Alphabet with a unique integer associated with it. FeatureVectors rely on the integer part of the mapping to efficiently represent the subset of the Alphabet present in the FeatureVector.
- See Also:
FeatureVector, Instance, Pipe, Serialized Form
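The document-classification usage described above can be sketched as follows. This is a minimal, self-contained illustration, not MALLET's implementation: the class `WordIndexSketch` and its `countWords` helper are hypothetical names, and the sparse index-to-count map only stands in for what a FeatureVector stores.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for illustration; not MALLET's Alphabet implementation.
class WordIndexSketch {
    private final Map<String, Integer> indexOf = new HashMap<>(); // object -> integer
    private final List<String> entries = new ArrayList<>();       // integer -> object

    // Returns the existing index of word, or assigns the next consecutive one.
    int lookupIndex(String word) {
        Integer index = indexOf.get(word);
        if (index == null) {
            index = entries.size(); // indices start at zero and are never reused
            indexOf.put(word, index);
            entries.add(word);
        }
        return index;
    }

    String lookupObject(int index) {
        return entries.get(index);
    }

    int size() {
        return entries.size();
    }

    // Sparse bag of words: index -> count, holding only the words present in the document.
    Map<Integer, Integer> countWords(String document) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String word : document.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(lookupIndex(word), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        WordIndexSketch alphabet = new WordIndexSketch();
        System.out.println(alphabet.countWords("the cat saw the dog")); // {0=2, 1=1, 2=1, 3=1}
        System.out.println(alphabet.lookupObject(1));                   // cat
    }
}
```

Keeping a hash map for one direction and an index-ordered list for the other is one simple way to make both lookups cheap.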
-
-
Method Summary
Modifier and Type, Method, Description
- static boolean alphabetsMatch(AlphabetCarrying object1, AlphabetCarrying object2)
  Convenience method that can often implement alphabetsMatch in classes that implement the AlphabetCarrying interface.
- java.lang.Object clone()
- boolean contains(java.lang.Object entry)
- void dump()
- void dump(java.io.PrintStream out)
- void dump(java.io.PrintWriter out)
- java.lang.Class entryClass()
- java.util.UUID getInstanceId()
- boolean growthStopped()
- java.util.Iterator iterator()
- int lookupIndex(java.lang.Object entry)
- int lookupIndex(java.lang.Object entry, boolean addIfNotPresent)
  Returns the index of entry, or -1 if entry isn't present and is not added.
- int[] lookupIndices(java.lang.Object[] objects, boolean addIfNotPresent)
- java.lang.Object lookupObject(int index)
- java.lang.Object[] lookupObjects(int[] indices)
- java.lang.Object[] lookupObjects(int[] indices, java.lang.Object[] buf)
  Returns an array of the objects corresponding to the given indices.
- java.lang.Object readResolve()
  This gets called after readObject; it lets the object decide whether to return itself or return a previously read-in version.
- void setInstanceId(java.util.UUID id)
- int size()
- void startGrowth()
- void stopGrowth()
- java.lang.Object[] toArray()
- java.lang.Object[] toArray(java.lang.Object[] in)
  Returns an array containing all the entries in the Alphabet.
- java.lang.String toString()
  Returns a String representation of all Alphabet entries, each separated by a newline.
-
-
-
Method Detail
-
clone
public java.lang.Object clone()
- Overrides:
clone in class java.lang.Object
-
lookupIndex
public int lookupIndex(java.lang.Object entry, boolean addIfNotPresent)
Returns the index of entry, or -1 if entry isn't present and is not added.
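A minimal sketch of the lookup-or-add behavior may make the -1 case concrete. The class `LookupSketch` is a hypothetical stand-in, not MALLET's source. It also assumes that stopGrowth() freezes the mapping so that unseen entries yield -1 even when addIfNotPresent is true; this page does not state that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not MALLET's source code.
class LookupSketch {
    private final Map<Object, Integer> indexOf = new HashMap<>();
    private final List<Object> entries = new ArrayList<>();
    private boolean growthStopped = false; // assumed effect of stopGrowth()

    void stopGrowth() { growthStopped = true; }
    void startGrowth() { growthStopped = false; }

    int lookupIndex(Object entry, boolean addIfNotPresent) {
        Integer known = indexOf.get(entry);
        if (known != null) {
            return known;                 // already present: return its index
        }
        if (!addIfNotPresent || growthStopped) {
            return -1;                    // absent and not allowed to add
        }
        int index = entries.size();       // next consecutive index
        indexOf.put(entry, index);
        entries.add(entry);
        return index;
    }

    public static void main(String[] args) {
        LookupSketch alphabet = new LookupSketch();
        System.out.println(alphabet.lookupIndex("price", false)); // -1
        System.out.println(alphabet.lookupIndex("price", true));  // 0
        alphabet.stopGrowth();
        System.out.println(alphabet.lookupIndex("brand", true));  // -1 (growth stopped, by assumption)
        System.out.println(alphabet.lookupIndex("price", true));  // 0
    }
}
```

Under that assumption, freezing the mapping before processing held-out data keeps unseen features from extending the index space.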
-
lookupIndex
public int lookupIndex(java.lang.Object entry)
-
lookupObject
public java.lang.Object lookupObject(int index)
-
toArray
public java.lang.Object[] toArray()
-
toArray
public java.lang.Object[] toArray(java.lang.Object[] in)
Returns an array containing all the entries in the Alphabet. The runtime type of the returned array is the runtime type of in. If in is large enough to hold everything in the alphabet, then it is used. The returned array is such that for all entries obj, ret[lookupIndex(obj)] == obj.
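The contract described here resembles that of java.util.Collection.toArray(T[]). The sketch below illustrates it with a plain List standing in for the Alphabet's index-ordered entries. `ToArraySketch` is a hypothetical name, and delegating to List.toArray is an assumption made for illustration, not a statement about how MALLET implements the method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the toArray(in) contract using a plain List.
class ToArraySketch {
    // entries is in index order, so position i holds the entry whose index is i.
    static <T> T[] toArray(List<?> entries, T[] in) {
        // List.toArray(T[]) reuses 'in' when it is large enough; otherwise it
        // allocates a new array with the same runtime type as 'in'.
        return entries.toArray(in);
    }

    public static void main(String[] args) {
        List<Object> entries = new ArrayList<>(Arrays.asList("red", "green", "blue"));

        String[] big = new String[3];
        String[] reused = toArray(entries, big);
        System.out.println(reused == big);        // true: 'big' was large enough

        String[] small = new String[0];
        String[] fresh = toArray(entries, small);
        System.out.println(fresh == small);       // false: a new String[] was allocated
        System.out.println(fresh[1]);             // green
    }
}
```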
-
iterator
public java.util.Iterator iterator()
-
lookupObjects
public java.lang.Object[] lookupObjects(int[] indices)
-
lookupObjects
public java.lang.Object[] lookupObjects(int[] indices, java.lang.Object[] buf)
Returns an array of the objects corresponding to the given indices.
- Parameters:
indices - An array of indices to look up
buf - An array to store the returned objects in.
- Returns:
An array of values from this Alphabet. The runtime type of the array is the same as buf
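As a rough sketch of the buffer-filling variant: the helper below writes each looked-up object into the caller's array and returns that same array, which is why the result has the runtime type of buf. `LookupObjectsSketch` is a hypothetical name, the List stands in for the Alphabet's entries, and the sketch assumes buf is at least as long as indices.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; assumes buf.length >= indices.length.
class LookupObjectsSketch {
    static Object[] lookupObjects(List<Object> entries, int[] indices, Object[] buf) {
        for (int i = 0; i < indices.length; i++) {
            buf[i] = entries.get(indices[i]); // position i receives the entry at indices[i]
        }
        return buf; // same array the caller passed in
    }

    public static void main(String[] args) {
        List<Object> entries = Arrays.<Object>asList("red", "green", "blue");
        String[] buf = new String[2];
        Object[] out = lookupObjects(entries, new int[] {2, 0}, buf);
        System.out.println(out == buf);           // true
        System.out.println(Arrays.toString(buf)); // [blue, red]
    }
}
```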
-
lookupIndices
public int[] lookupIndices(java.lang.Object[] objects, boolean addIfNotPresent)
-
contains
public boolean contains(java.lang.Object entry)
-
size
public int size()
-
stopGrowth
public void stopGrowth()
-
startGrowth
public void startGrowth()
-
growthStopped
public boolean growthStopped()
-
entryClass
public java.lang.Class entryClass()
-
toString
public java.lang.String toString()
Returns a String representation of all Alphabet entries, each separated by a newline.
- Overrides:
toString in class java.lang.Object
-
dump
public void dump()
-
dump
public void dump(java.io.PrintStream out)
-
dump
public void dump(java.io.PrintWriter out)
-
alphabetsMatch
public static boolean alphabetsMatch(AlphabetCarrying object1, AlphabetCarrying object2)
Convenience method that can often implement alphabetsMatch in classes that implement the AlphabetCarrying interface.
-
getInstanceId
public java.util.UUID getInstanceId()
-
setInstanceId
public void setInstanceId(java.util.UUID id)
-
readResolve
public java.lang.Object readResolve() throws java.io.ObjectStreamException
This gets called after readObject; it lets the object decide whether to return itself or return a previously read-in version. We use a HashMap of instanceIds to determine if we have already read in this object.
- Throws:
java.io.ObjectStreamException
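The readResolve technique described here can be sketched with standard Java serialization. This is an illustration under stated assumptions, not MALLET's code: `SharedDictionary`, its `SEEN` registry, and the `roundTrip` helper are hypothetical names, and a ConcurrentHashMap is used in place of the HashMap the description mentions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical class showing instance-id based deduplication on deserialization.
class SharedDictionary implements Serializable {
    private static final long serialVersionUID = 1L;

    // Registry of instances already read in by this JVM, keyed by instance id.
    private static final ConcurrentMap<UUID, SharedDictionary> SEEN = new ConcurrentHashMap<>();

    private final UUID instanceId = UUID.randomUUID();

    UUID getInstanceId() { return instanceId; }

    // Invoked by Java serialization after the fields have been read.
    public Object readResolve() throws ObjectStreamException {
        SharedDictionary previous = SEEN.putIfAbsent(instanceId, this);
        return previous != null ? previous : this; // prefer the copy read earlier
    }

    // Serializes to bytes and reads the object back, wrapping checked exceptions.
    static Object roundTrip(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = serialize(new SharedDictionary());
        Object first = roundTrip(data);
        Object second = roundTrip(data);
        System.out.println(first == second); // true: the second read resolves to the first
    }
}
```

Resolving to a single instance means that objects serialized separately but referring to the same dictionary still share it after loading.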
-
-