Class PagedInstanceList

  • All Implemented Interfaces:
    AlphabetCarrying, java.io.Serializable, java.lang.Cloneable, java.lang.Iterable<Instance>, java.util.Collection<Instance>, java.util.List<Instance>, java.util.RandomAccess

    public class PagedInstanceList
    extends InstanceList
    An InstanceList that avoids OutOfMemoryErrors by saving Instances to disk when there is not enough memory to create a new Instance. It implements a fixed-size paging scheme in which each page on disk stores instancesPerPage Instances. So, while the number of Instances per page is constant, the size in bytes of each page may vary. Using this class instead of InstanceList means the number of Instances you can store is essentially limited only by disk size (and patience). The paging scheme is optimized for the most frequent case of looping through the InstanceList from index 0 to n. Instances 0 through instancesPerPage-1 are stored together on page 1, instances instancesPerPage through 2*instancesPerPage-1 are on page 2, and so on. This way, Instances adjacent in the list will usually be on the same page.
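The index-to-page mapping described above can be sketched with integer division. This is a minimal illustration, not the class's actual code: the class name `PageMathSketch` and both method names are hypothetical, and pages are numbered from 0 here rather than from 1.

```java
// Hypothetical helper illustrating fixed-size paging arithmetic.
// Assumes every page holds exactly instancesPerPage instances.
final class PageMathSketch {

    // Zero-based number of the page that holds the instance at `index`.
    static int pageOf(int index, int instancesPerPage) {
        return index / instancesPerPage;
    }

    // Position of the instance inside its page.
    static int offsetInPage(int index, int instancesPerPage) {
        return index % instancesPerPage;
    }
}
```

With `instancesPerPage = 10`, indices 0 through 9 map to page 0 and index 10 starts page 1, so a sequential loop touches each page once before moving to the next.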
    Author:
    Aron Culotta culotta@cs.umass.edu
    See Also:
    InstanceList, Serialized Form
    • Constructor Detail

      • PagedInstanceList

        public PagedInstanceList(Pipe pipe,
                                 int numPages,
                                 int instancesPerPage,
                                 java.io.File swapDir)
        Creates a PagedInstanceList that swaps instances to disk, instancesPerPage at a time, in the directory swapDir when memory runs low, keeping at most numPages pages in memory.
        Parameters:
        pipe - instance pipe
        numPages - number of pages to keep in memory
        instancesPerPage - number of Instances to store in each page
        swapDir - where the pages on disk live.
      • PagedInstanceList

        public PagedInstanceList(Pipe pipe,
                                 int numPages,
                                 int instancesPerPage)
    • Method Detail

      • split

        public InstanceList[] split(java.util.Random r,
                                    double[] proportions)
        Shuffles the elements of this list among several smaller lists. Overrides InstanceList.split to add instances in original order, to prevent thrashing.
        Overrides:
        split in class InstanceList
        Parameters:
        proportions - A list of numbers (not necessarily summing to 1) which, when normalized, correspond to the proportion of elements in each returned sublist.
        r - The source of randomness to use in shuffling.
        Returns:
        one InstanceList for each element of proportions
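To make the normalization of proportions concrete, the sketch below turns a proportions array into sublist sizes for a list of n elements. It is an assumption about one reasonable rounding scheme (cumulative boundaries rounded to the nearest integer), not a statement of how split computes sizes internally. The class `SplitSizesSketch` and its method are hypothetical.

```java
// Hypothetical helper: converts unnormalized proportions into sublist sizes.
final class SplitSizesSketch {

    static int[] sizes(double[] proportions, int n) {
        double total = 0.0;
        for (double p : proportions) {
            total += p;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("proportions must sum to a positive value");
        }
        int[] sizes = new int[proportions.length];
        double running = 0.0;
        int previousBoundary = 0;
        for (int i = 0; i < proportions.length; i++) {
            running += proportions[i];
            // The last boundary is pinned to n so every element is assigned.
            int boundary = (i == proportions.length - 1)
                    ? n
                    : (int) Math.round(running / total * n);
            sizes[i] = boundary - previousBoundary;
            previousBoundary = boundary;
        }
        return sizes;
    }
}
```

For example, proportions `{3, 1}` over 8 elements give sizes 6 and 2, the same result as `{0.75, 0.25}`.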
      • add

        public boolean add(Instance instance)
        Appends the instance to this list. Note that since memory for the Instance has already been allocated, no check is made to catch OutOfMemoryError.
        Specified by:
        add in interface java.util.Collection<Instance>
        Specified by:
        add in interface java.util.List<Instance>
        Overrides:
        add in class InstanceList
        Returns:
        true if successful
      • get

        public Instance get(int index)
        Returns the Instance at the specified index. If this Instance is not in memory, swaps a block of instances back into memory.
        Specified by:
        get in interface java.util.List<Instance>
        Overrides:
        get in class java.util.ArrayList<Instance>
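The swap-on-miss behavior of get can be illustrated with a small bounded page cache. This sketch makes several assumptions that the documentation does not confirm: eviction is least-recently-used, pages are plain String arrays, and a hypothetical `pageLoader` function stands in for reading a page from disk. None of the names below belong to the real class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical sketch of a cache that keeps at most numPages pages in memory.
final class PageCacheSketch {
    private final int instancesPerPage;
    private final IntFunction<String[]> pageLoader; // stand-in for a disk read
    private final LinkedHashMap<Integer, String[]> inMemory;
    private int swapIns = 0;

    PageCacheSketch(final int numPages, int instancesPerPage,
                    IntFunction<String[]> pageLoader) {
        this.instancesPerPage = instancesPerPage;
        this.pageLoader = pageLoader;
        // Access-ordered map; the eldest entry is the least recently used page.
        this.inMemory = new LinkedHashMap<Integer, String[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String[]> eldest) {
                return size() > numPages; // evict once the page budget is exceeded
            }
        };
    }

    String get(int index) {
        int pageId = index / instancesPerPage;
        String[] page = inMemory.get(pageId);
        if (page == null) {
            // Page miss: load the whole page and count the swap-in.
            page = pageLoader.apply(pageId);
            inMemory.put(pageId, page);
            swapIns++;
        }
        return page[index % instancesPerPage];
    }

    int getSwapIns() {
        return swapIns;
    }
}
```

Under this scheme a sequential scan triggers one swap-in per page, while alternating between two distant indices with a one-page budget triggers a swap-in on nearly every access.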
      • set

        public Instance set(int index,
                            Instance instance)
        Replaces the Instance at position index with a new one. Note that this is the only sanctioned way of changing an Instance.
        Specified by:
        set in interface java.util.List<Instance>
        Overrides:
        set in class InstanceList
      • getCollectGarbage

        public boolean getCollectGarbage()
      • setCollectGarbage

        public void setCollectGarbage(boolean b)
      • clear

        public void clear()
        Specified by:
        clear in interface java.util.Collection<Instance>
        Specified by:
        clear in interface java.util.List<Instance>
        Overrides:
        clear in class InstanceList
      • getSwapIns

        public int getSwapIns()
      • getSwapInTime

        public long getSwapInTime()
      • getSwapOuts

        public int getSwapOuts()
      • getSwapOutTime

        public long getSwapOutTime()
      • size

        public int size()
        Specified by:
        size in interface java.util.Collection<Instance>
        Specified by:
        size in interface java.util.List<Instance>
        Overrides:
        size in class java.util.ArrayList<Instance>
      • iterator

        public java.util.Iterator<Instance> iterator()
        Specified by:
        iterator in interface java.util.Collection<Instance>
        Specified by:
        iterator in interface java.lang.Iterable<Instance>
        Specified by:
        iterator in interface java.util.List<Instance>
        Overrides:
        iterator in class java.util.ArrayList<Instance>
      • load

        public static InstanceList load(java.io.File file)
        Constructs a new InstanceList, deserialized from file. If the string value of file is "-", then deserialize from System.in.
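The "-" convention described above can be sketched as a choice of input stream. This is only an illustration of the convention, under the assumption that the check compares the File's string value to "-". The class `LoadSourceSketch` and its method are hypothetical and not part of the library.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: picks the stream to deserialize from.
final class LoadSourceSketch {

    static InputStream open(File file) {
        // A file whose string value is "-" means "read from standard input".
        if (file.toString().equals("-")) {
            return System.in;
        }
        try {
            return new FileInputStream(file);
        } catch (FileNotFoundException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would typically wrap the returned stream in a `java.io.ObjectInputStream` before reading the serialized list.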