valueindexes

exception fmask.valueindexes.NonIntTypeError
exception fmask.valueindexes.RangeError
exception fmask.valueindexes.ValueIndexesError
class fmask.valueindexes.ValueIndexes(a, nullVals=[])

An object which contains the indexes for every value in a given array. This class is intended to mimic the REVERSE_INDICES keyword of the HISTOGRAM function in IDL, only nicer.

Takes an array and works out which unique values exist in it. Provides a method which returns all the indexes into the original array for a given value.

The array must be of an integer-like type. Floating point arrays will not work. If one needs to look at ranges of values within a float array, it is possible to use numpy.digitize() to create an integer array corresponding to a set of bins, and then use ValueIndexes with that.
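As a minimal sketch of the binning step described above, the following uses numpy.digitize() to turn a float array into an integer array of bin numbers. The data and the bin edges are made up purely for illustration.

```python
import numpy

# Illustrative data only: a small float array and hypothetical bin edges
floatArr = numpy.array([0.05, 0.4, 0.75, 0.2, 0.95, 0.4])
binEdges = numpy.array([0.25, 0.5, 0.75])

# numpy.digitize returns, for each element, the number of the bin it falls in.
# With the default right=False, bin i satisfies binEdges[i-1] <= x < binEdges[i].
binned = numpy.digitize(floatArr, binEdges)

# binned is an integer array with the same shape as floatArr, so it could be
# given to ValueIndexes in place of the float array.
print(binned)
```

Each distinct value in binned then stands for one range of the original float values.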

Example usage, for a given array a:

from fmask.valueindexes import ValueIndexes

valIndexes = ValueIndexes(a)
for val in valIndexes.values:
    ndx = valIndexes.getIndexes(val)
    # Do something with all the indexes

This is a much faster and more efficient alternative to something like:

values = numpy.unique(a)
for val in values:
    mask = (a == val)
    # Do something with the mask

When a is large, and/or the number of distinct values is large, the loop above becomes very slow and can take up lots of memory: each iteration searches through all of a again, looking for a different value, and builds a full-size boolean mask. This class provides a much more efficient way of doing the same thing, requiring only one pass through a. When a is small, or when the number of distinct values is very small, it probably makes little difference.
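To illustrate the general idea of collecting the indexes for all values without rescanning the array once per value, here is a rough sketch in plain numpy. It is not this class's implementation: it swaps in a single stable sort (so it is not strictly one pass), and the helper name groupIndexes is hypothetical.

```python
import numpy

def groupIndexes(a):
    """Hypothetical helper: map each distinct value in a to its flat indexes."""
    flat = numpy.asarray(a).ravel()
    # A stable sort places equal values next to each other while keeping
    # their original positions in ascending order.
    order = numpy.argsort(flat, kind="stable")
    # On the sorted data, return_index gives where each value's run starts
    # and return_counts gives how long that run is.
    values, start, counts = numpy.unique(flat[order], return_index=True,
                                         return_counts=True)
    return {int(v): order[s:s + c] for (v, s, c) in zip(values, start, counts)}

a = numpy.array([[1, 2, 1], [3, 2, 1]])
groups = groupIndexes(a)
for (val, flatNdx) in groups.items():
    # Convert flat positions back into one index array per dimension
    ndx = numpy.unravel_index(flatNdx, a.shape)
    assert (a[ndx] == val).all()
```

The array is sorted once, and every value's indexes are then just a slice of the sorted order, rather than the result of a fresh comparison against the whole array.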

If one or more null values are given to the constructor, they will not be counted, and will not be available to the getIndexes() method. This saves memory, because the indexes of a potentially large number of null elements are never stored.

A ValueIndexes object has the following attributes:

  • values: Array of all values indexed

  • counts: Array of counts for each value

  • nDims: Number of dimensions of original array

  • indexes: Packed array of indexes

  • start: Starting points in indexes array for each value

  • end: End points in indexes for each value

  • valLU: Lookup table for each value, to find it in the values array without explicitly searching.

  • nullVals: Array of the null values requested.
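The following sketch shows one way that attributes like these could relate to each other. It is an assumption made for illustration, not a description of the class internals: it supposes that each value's indexes fill one contiguous block of the packed array, that end is exclusive, and that the lookup table is addressed by (value - minimum value). The data is made up.

```python
import numpy

# Hypothetical data: three indexed values and how often each occurs
values = numpy.array([3, 7, 8])
counts = numpy.array([2, 1, 3])

# Assumed layout: each value's indexes occupy one contiguous block of the
# packed array, covering the half-open row range start[i]:end[i].
end = numpy.cumsum(counts)
start = end - counts

# Assumed lookup table: position of each value in the values array, or -1 if
# the value is absent, addressed by (value - minimum value) so that no
# search is needed.
NOT_PRESENT = -1
valLU = numpy.full(values.max() - values.min() + 1, NOT_PRESENT)
valLU[values - values.min()] = numpy.arange(len(values))

# Find the value 7 without searching the values array
i = valLU[7 - values.min()]
print(i, start[i], end[i])
```

Under these assumptions, finding the block for a value costs one table lookup and one slice, however many distinct values there are.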

Limitations: The array index values are stored as unsigned 32-bit integers, so this will not work if the data array is larger than 4 GB. It would probably not fail very gracefully, either.
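Since a failure is not expected to be graceful, a caller may want to check the array size first. The sketch below is a conservative guard, with a hypothetical function name: it assumes that an array is safe whenever its total element count fits in an unsigned 32-bit integer. The exact limit enforced by the class may differ.

```python
import numpy

# Largest value an unsigned 32-bit integer can hold
UINT32_MAX = int(numpy.iinfo(numpy.uint32).max)

def fitsUint32Indexes(shape):
    """Hypothetical check: True if the element count fits in uint32."""
    nElements = 1
    for dim in shape:
        # Use Python ints so the product cannot overflow
        nElements *= int(dim)
    return nElements <= UINT32_MAX

print(fitsUint32Indexes((1000, 1000)))
print(fitsUint32Indexes((70000, 70000)))
```

The function takes a shape tuple such as a.shape, so the check can be made before constructing a ValueIndexes object.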

getIndexes(val)

Return a set of indexes into the original array, for which the value in the array is equal to val.