Class KMeans

  • All Implemented Interfaces:
    java.io.Serializable

    public class KMeans
    extends Clusterer
    A k-means Clusterer. Clusters the points into k clusters by minimizing the total intra-cluster variance. It uses a given Metric to measure the distance between Instances, each of which should have a SparseVector in its data field.
    See Also:
    Serialized Form
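
    The class description above can be illustrated with a minimal, self-contained sketch of the usual k-means iteration (assign each point to its nearest mean, then recompute the means). This is not MALLET's implementation: the class name KMeansSketch, the dense double[] points, and the squared Euclidean distance standing in for the Metric are all assumptions made for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only; names and data layout are assumptions, not MALLET code.
final class KMeansSketch {

    // Squared Euclidean distance, standing in for a Metric.
    static double squaredEuclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Assignment step: one cluster index per point, chosen as the nearest mean.
    static int[] assign(double[][] points, double[][] means) {
        int[] labels = new int[points.length];
        for (int p = 0; p < points.length; p++) {
            int best = 0;
            for (int c = 1; c < means.length; c++) {
                if (squaredEuclidean(points[p], means[c])
                        < squaredEuclidean(points[p], means[best])) {
                    best = c;
                }
            }
            labels[p] = best;
        }
        return labels;
    }

    // Update step: each mean becomes the average of its members.
    // Simplification: a cluster with no members keeps its previous mean.
    static double[][] updateMeans(double[][] points, int[] labels, double[][] oldMeans) {
        int k = oldMeans.length;
        int dim = points[0].length;
        double[][] means = new double[k][dim];
        int[] counts = new int[k];
        for (int p = 0; p < points.length; p++) {
            counts[labels[p]]++;
            for (int d = 0; d < dim; d++) {
                means[labels[p]][d] += points[p][d];
            }
        }
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) {
                means[c] = oldMeans[c].clone();
                continue;
            }
            for (int d = 0; d < dim; d++) {
                means[c][d] /= counts[c];
            }
        }
        return means;
    }

    // Alternate the two steps until the assignments stop changing or maxIter is reached.
    static int[] cluster(double[][] points, double[][] initialMeans, int maxIter) {
        double[][] means = initialMeans;
        int[] labels = assign(points, means);
        for (int iter = 0; iter < maxIter; iter++) {
            means = updateMeans(points, labels, means);
            int[] next = assign(points, means);
            if (Arrays.equals(next, labels)) {
                break;
            }
            labels = next;
        }
        return labels;
    }

    public static void main(String[] args) {
        double[][] points = { {0, 0}, {0, 1}, {10, 10}, {10, 11} };
        double[][] init = { {0, 0}, {10, 10} };
        System.out.println(Arrays.toString(cluster(points, init, 100))); // [0, 0, 1, 1]
    }
}
```

    Each update step cannot increase the summed squared distance of points to their assigned means, which is the sense in which the iteration minimizes intra-cluster variance when the distance is squared Euclidean.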
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static int EMPTY_DROP
      Drop an empty cluster.
      static int EMPTY_ERROR
      Treat an empty cluster as an error condition.
      static int EMPTY_SINGLE
      Place the single instance furthest from the previous cluster mean into the empty cluster.
    • Constructor Summary

      Constructors 
      Constructor Description
      KMeans(Pipe instancePipe, int numClusters, Metric metric)
      Construct a KMeans object that treats an empty cluster as an error.
      KMeans(Pipe instancePipe, int numClusters, Metric metric, int emptyAction)
      Construct a KMeans object with an explicit action for empty clusters.
    • Field Detail

      • EMPTY_ERROR

        public static final int EMPTY_ERROR
        Treat an empty cluster as an error condition.
        See Also:
        Constant Field Values
      • EMPTY_SINGLE

        public static final int EMPTY_SINGLE
        Place the single instance furthest from the previous cluster mean into the empty cluster.
        See Also:
        Constant Field Values
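
        The three empty-cluster actions described by these fields could be sketched as follows. This is a hypothetical illustration, not MALLET's code: the EmptyClusterSketch class, the EmptyAction enum, the one-dimensional points, and the reading of the "single" action (move the instance furthest from its own cluster mean into the empty cluster) are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of empty-cluster handling; not MALLET's implementation.
final class EmptyClusterSketch {

    // Stand-ins for EMPTY_ERROR, EMPTY_DROP and EMPTY_SINGLE.
    enum EmptyAction { ERROR, DROP, SINGLE }

    // Handles the empty cluster at index `empty`, modifying both lists in place.
    static void handleEmpty(List<List<Double>> clusters, List<Double> means,
                            int empty, EmptyAction action) {
        switch (action) {
            case ERROR: {
                throw new IllegalStateException("Cluster " + empty + " is empty");
            }
            case DROP: {
                // Remove the cluster and its mean, leaving one fewer cluster.
                clusters.remove(empty);
                means.remove(empty);
                break;
            }
            case SINGLE: {
                // Find the instance furthest from its own cluster mean.
                int srcCluster = -1;
                int srcIndex = -1;
                double worst = -1.0;
                for (int c = 0; c < clusters.size(); c++) {
                    List<Double> members = clusters.get(c);
                    if (members.size() < 2) {
                        continue; // do not empty another cluster by taking its only member
                    }
                    for (int i = 0; i < members.size(); i++) {
                        double dist = Math.abs(members.get(i) - means.get(c));
                        if (dist > worst) {
                            worst = dist;
                            srcCluster = c;
                            srcIndex = i;
                        }
                    }
                }
                if (srcCluster < 0) {
                    throw new IllegalStateException("No instance available to reassign");
                }
                // Move it into the empty cluster and use it as that cluster's mean.
                Double moved = clusters.get(srcCluster).remove(srcIndex);
                clusters.get(empty).add(moved);
                means.set(empty, moved);
                break;
            }
        }
    }

    public static void main(String[] args) {
        List<List<Double>> clusters = new ArrayList<>();
        clusters.add(new ArrayList<>(Arrays.asList(1.0, 2.0, 9.0)));
        clusters.add(new ArrayList<>());
        List<Double> means = new ArrayList<>(Arrays.asList(4.0, 100.0));

        handleEmpty(clusters, means, 1, EmptyAction.SINGLE);
        System.out.println(clusters); // [[1.0, 2.0], [9.0]]
        System.out.println(means);    // [4.0, 9.0]
    }
}
```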
    • Constructor Detail

      • KMeans

        public KMeans(Pipe instancePipe,
                      int numClusters,
                      Metric metric,
                      int emptyAction)
        Construct a KMeans object with an explicit action to take when a cluster becomes empty.
        Parameters:
        instancePipe - Pipe for the instances being clustered
        numClusters - Number of clusters to use
        metric - Metric object to measure instance distances
        emptyAction - What should happen when an empty cluster occurs: one of EMPTY_ERROR, EMPTY_DROP, or EMPTY_SINGLE
      • KMeans

        public KMeans(Pipe instancePipe,
                      int numClusters,
                      Metric metric)
        Construct a KMeans object. If an empty cluster occurs, it is treated as an error (see EMPTY_ERROR).
        Parameters:
        instancePipe - Pipe for the instances being clustered
        numClusters - Number of clusters to use
        metric - Metric object to measure instance distances

    • Method Detail

      • getClusterMeans

        public java.util.ArrayList<SparseVector> getClusterMeans()
        Return the ArrayList of cluster means after a run of the algorithm.
        Returns:
        An ArrayList of SparseVectors, one for each cluster mean.
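
        As a rough illustration of what a list of cluster means contains, the sketch below averages sparse vectors cluster by cluster. It does not use MALLET's SparseVector: the SparseMeanSketch class and the index-to-value Map representation are assumptions chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; sparse vectors are modeled as index -> value maps.
final class SparseMeanSketch {

    // Mean of a set of sparse vectors; indices missing from a vector count as 0.
    static Map<Integer, Double> mean(List<Map<Integer, Double>> vectors) {
        Map<Integer, Double> result = new TreeMap<>();
        if (vectors.isEmpty()) {
            return result;
        }
        // Sum the values index by index.
        for (Map<Integer, Double> v : vectors) {
            for (Map.Entry<Integer, Double> e : v.entrySet()) {
                result.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        // Divide every summed entry by the number of vectors.
        final double n = vectors.size();
        result.replaceAll((index, sum) -> sum / n);
        return result;
    }

    // One mean per cluster, in cluster order.
    static ArrayList<Map<Integer, Double>> clusterMeans(
            List<List<Map<Integer, Double>>> clusters) {
        ArrayList<Map<Integer, Double>> means = new ArrayList<>();
        for (List<Map<Integer, Double>> members : clusters) {
            means.add(mean(members));
        }
        return means;
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = new TreeMap<>();
        a.put(0, 2.0);
        a.put(3, 4.0);
        Map<Integer, Double> b = new TreeMap<>();
        b.put(3, 2.0);

        List<Map<Integer, Double>> cluster = new ArrayList<>();
        cluster.add(a);
        cluster.add(b);
        System.out.println(mean(cluster)); // {0=1.0, 3=3.0}
    }
}
```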