cf.Data

class cf.Data(data=None, units=None, _FillValue=None, chunk=True)

Bases: object

An N-dimensional data array with units and masked values.

  • Contains an N-dimensional, indexable and broadcastable array with many similarities to a numpy array.
  • Contains the units of the array elements.
  • Supports masked arrays, regardless of whether or not it was initialized with a masked array.
  • Uses Large Amounts of massive Arrays (LAMA) functionality to store and operate on arrays which are larger than the available memory.
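
The general idea of operating on an array that does not fit in memory can be sketched with plain numpy: visit one partition at a time and combine the partial results. This is only an illustration of the technique; the helper name chunked_sum and the chunk size are hypothetical, and nothing here reflects how cf.Data implements LAMA internally.

```python
import numpy as np


def chunked_sum(array, chunk_size):
    """Sum a 1-d array by visiting one fixed-size partition at a time.

    Hypothetical helper for illustration; not part of cf.
    """
    total = 0
    for start in range(0, array.shape[0], chunk_size):
        # Only this slice needs to be resident in memory at once.
        total += array[start:start + chunk_size].sum()
    return total


data = np.arange(10)
result = chunked_sum(data, chunk_size=4)
```

The partial sums agree with a single whole-array sum, whatever chunk size is chosen.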

Indexing

A data array is indexed in much the same way as a numpy array, with two important differences:

  • Size 1 dimensions are never removed.

    An integer index i takes the i-th element but does not reduce the rank of the output array by one.

  • When advanced indexing is used on more than one dimension, the advanced indices work independently.

    When more than one dimension’s slice is a 1-d boolean array or 1-d sequence of integers, then these indices work independently along each dimension (similar to the way vector subscripts work in Fortran), rather than by their elements.

Examples

>>> d.shape
[12, 19, 73, 96]
>>> d[0, :, [0,1], [0,1,2]].shape
[1, 19, 2, 3]
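
For comparison, the same result can be approximated in plain numpy. The sketch below assumes only numpy (it does not use cf): plain numpy tries to broadcast the two index lists together and fails, whereas numpy.ix_ builds an open mesh so that each list acts independently on its own dimension, and a one-element list in place of the integer keeps the size 1 dimension.

```python
import numpy as np

a = np.zeros((12, 19, 73, 96), dtype=np.int8)

# Plain numpy pairs the two index lists element by element, so lists of
# different lengths (2 and 3) cannot be broadcast together.
try:
    a[0, :, [0, 1], [0, 1, 2]]
    numpy_broadcast_ok = True
except IndexError:
    numpy_broadcast_ok = False

# numpy.ix_ makes each 1-d index apply independently along its own axis.
# Using [0] rather than 0 keeps the leading size 1 dimension.
index = np.ix_([0], np.arange(19), [0, 1], [0, 1, 2])
orthogonal_shape = list(a[index].shape)
```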

Conversion to a numpy array

The data array may be converted to either a numpy array view or an independent numpy array of the underlying data with the varray and array attributes respectively. Changing a numpy array view in place will also change the data array. Note that the numpy array created with the array or varray attribute forces all of the data to be read into memory at the same time, which may not be possible for very large arrays.
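
The difference between a view and an independent array can be seen with numpy alone. The following is a sketch using numpy's own view() and copy() methods as stand-ins for the varray and array attributes; it assumes those attributes follow the same sharing semantics described above.

```python
import numpy as np

base = np.arange(5)
view = base.view()  # shares memory with base
copy = base.copy()  # independent of base

# Changing the view in place also changes the array it was taken from,
# but leaves the independent copy untouched.
view[0] = 999
```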

Initialization

Parameters :
data : array-like, optional

The data for the array.

units : str or Units, optional

The units of the data.

_FillValue : object, optional

The fill value of the data. If set to None then the default numpy fill value appropriate to the data type will be used.

chunk : bool, optional

If True then the data array will be partitioned if it is larger than the chunk size.

>>> d = cf.Data(5)
>>> d = cf.Data([1,2,3], units='K')
>>> import numpy
>>> d = cf.Data(numpy.arange(10).reshape(2,5), units=cf.Units('m/s'), _FillValue=-999)
>>> d = cf.Data(('a', 'b', 'c'))
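
The default numpy fill value for a given data type can be inspected with numpy.ma.default_fill_value. This sketch uses numpy only; that cf.Data looks up its default in exactly this way when _FillValue is None is an assumption.

```python
import numpy as np

# Default fill values depend on the kind of data type.
float_fill = np.ma.default_fill_value(np.dtype('float64'))
int_fill = np.ma.default_fill_value(np.dtype('int32'))
```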
add_partitions(extra_boundaries, adim, existing_boundaries=None)

Examples

>>> d.add_partitions(    )
all()

Test whether all array elements evaluate to True.

Masked values are considered as True during computation.

Examples

>>> d.array
array([0, 3, 0])
>>> d.all()
False
>>> d.array
array([1, 3, 2])
>>> d.all()
True
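
The treatment of masked values can be illustrated with a numpy masked array, whose all method also considers masked values as True. This is a numpy-only sketch; whether cf.Data delegates to numpy.ma here is an assumption.

```python
import numpy as np

# The zero is masked, so it does not make the test fail.
masked = np.ma.array([0, 3, 1], mask=[True, False, False])
masked_all = bool(masked.all())

# Without the mask, the zero evaluates to False.
plain_all = bool(np.array([0, 3, 1]).all())
```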
any()

Test whether any array elements evaluate to True.

Masked values are considered as True during computation.

Examples

>>> d.array
array([0, 0, 0])
>>> d.any()
False
>>> d.array
array([0, 3, 0])
>>> d.any()
True
change_dimension_names(dim_name_map)

Change the dimension names of the data.

change_units(new_units)
chunk(chunksize=None, extra_boundaries=None, chunk_dims=None)
Parameters :

chunksize : int, optional

extra_boundaries : sequence of lists or tuples, optional

chunk_dims : sequence of lists or tuples, optional

Returns :

extra_boundaries, chunk_dims : {list, list}

Examples

>>> d.chunk()
>>> d.chunk(100000)
>>> d.chunk(extra_boundaries=([3, 6],), chunk_dims=['dim0'])
>>> d.chunk(extra_boundaries=([3, 6], [40, 80]), chunk_dims=['dim0', 'dim1'])
copy()

Return a deep copy.

Equivalent to copy.deepcopy(d)

Returns :
out :

The deep copy.

Examples

>>> e = d.copy()
dump(id=None, omit=())
equals(other, rtol=None, atol=None)
expand_aggregating_dims(adim)
expand_dims(axis=0, dim='None', direction=True)

No check is made that dim is not already in self.order.

hash()
new_dim_name()
partition_boundaries()
reverse(axes=None)

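Parameters :
axes : None, int or sequence of ints, optional

The axes to be reversed, given as an integer or a sequence of zero or more integers. If None (the default) then all dimensions are reversed.

Returns :
axes :

The axes which were reversed (in arbitrary numerical order).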
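
The effect of reversing dimensions can be sketched with numpy.flip, which likewise reverses every axis when no axis is given. Note that numpy.flip returns a new view rather than returning the reversed axes, so this is an analogue of the reordering only and not of the method's return value.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

all_reversed = np.flip(a)          # no axis given: reverse every axis
one_reversed = np.flip(a, axis=1)  # reverse the second axis only
```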
save_to_disk(itemsize=None)
Parameters :
itemsize : int, optional

Returns :
out : bool
set_location_map()
squeeze(axes=None)

Remove size 1 dimensions from the shape of the data in place.

Parameters :

axes : int or tuple of ints, optional

The axes to be squeezed given by their positions. If unset then all size one dimensions of the data array are removed.

Returns :
out : tuple of ints

The axes which were squeezed as a tuple of their positions.

Examples

>>> v.shape
[1]
>>> axes = v.squeeze()
>>> v.shape
[]
>>> v.shape
[1, 2, 1, 3, 1, 4, 1, 5, 1, 6, 1]
>>> axes = v.squeeze(axes=2)
>>> v.shape
[1, 2, 3, 1, 4, 1, 5, 1, 6, 1]
>>> axes = v.squeeze(axes=(0,))
>>> v.shape
[2, 3, 1, 4, 1, 5, 1, 6, 1]
>>> axes = v.squeeze(axes=(2, 4))
>>> v.shape
[2, 3, 4, 5, 1, 6, 1]
>>> axes = v.squeeze()
>>> v.shape
[2, 3, 4, 5, 6]
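
The bookkeeping described above (which positions are removed, and what shape remains) can be sketched in pure Python. The function squeeze_shape is a hypothetical helper written for this illustration; it is not part of cf and only operates on a shape list, not on data.

```python
def squeeze_shape(shape, axes=None):
    """Return (new_shape, squeezed_axes) for a list of dimension sizes.

    Hypothetical helper for illustration; not part of cf.
    """
    if axes is None:
        # Default: every size 1 dimension is squeezed.
        axes = [i for i, n in enumerate(shape) if n == 1]
    elif isinstance(axes, int):
        axes = [axes]
    for i in axes:
        if shape[i] != 1:
            raise ValueError("cannot squeeze axis %d of size %d" % (i, shape[i]))
    new_shape = [n for i, n in enumerate(shape) if i not in axes]
    return new_shape, tuple(axes)


shape = [1, 2, 1, 3, 1, 4, 1, 5, 1, 6, 1]
shape, squeezed = squeeze_shape(shape, axes=2)
```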
to_disk()
to_memory(regardless=False)

Store the data array in memory if it is smaller than the chunk size.

Parameters :
regardless : bool, optional

If True then store the data array in memory regardless of its size.

Returns :

None

Examples

>>> d.to_memory()
>>> d.to_memory(True)
transpose(axes=None)
Parameters :
axes : list of ints, optional

By default, reverse the dimensions, otherwise permute the axes according to the values given.
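
The same convention is used by numpy.transpose, so the two behaviours can be sketched with numpy alone. This illustrates the permutation rule only and makes no claim about what the cf method returns.

```python
import numpy as np

a = np.zeros((2, 3, 4))

# With no axes given, the order of the dimensions is reversed.
default_shape = np.transpose(a).shape

# Otherwise the dimensions are permuted as requested.
permuted_shape = np.transpose(a, axes=[1, 0, 2]).shape
```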
ufunc(func, *args, **kwargs)

Return a

array

A numpy array copy of the data array.

Examples

>>> a = d.array
>>> type(a)
<type 'numpy.ndarray'>
binary_mask
dtype

Numpy data-type of the data array.

If the array is partitioned internally into sub-arrays with different data-types, then the normal data-type coercion rules apply (e.g. under numpy’s promotion rules, partitions with data-types ‘int32’ and ‘float32’ give a realised array with data-type ‘float64’).

Examples

>>> type(f.dtype)
<type 'numpy.dtype'>
>>> f.dtype
dtype('float64')
first_datum

The first element of the data array.

Equivalent to x[(slice(0,1),) * x.ndim].array.item() or x[(slice(0,1),) * x.ndim] = y

Examples

>>> d.array
array([[1, 2],
       [3, 4]])
>>> d.first_datum
1
>>> d.first_datum = 999
>>> d.array
array([[999,   2],
       [  3,   4]])
is_masked

True if the data array has any masked values.

Examples

>>> d.is_masked
True
is_scalar

True if the data array is a 0-d scalar array.

Examples

>>> d.ndim
0
>>> d.is_scalar
True
>>> d.ndim >= 1
True
>>> d.is_scalar
False
last_datum

The last element of the data array.

Equivalent to x[(slice(-1,None),) * x.ndim].array.item() or x[(slice(-1,None),) * x.ndim] = y

Examples

>>> d.array
array([[1, 2],
       [3, 4]])
>>> d.last_datum
4
>>> d.last_datum = 999
>>> d.array
array([[  1,   2],
       [  3, 999]])
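
The slice-based forms for reading and setting a single corner element can be checked on a plain numpy array. In this sketch the first element is selected with slice(0, 1) along every dimension and the last with slice(-1, None); it uses numpy only, so the .array step of the cf expression is not needed.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# One size 1 slice per dimension selects a single corner element.
first_index = (slice(0, 1),) * x.ndim
last_index = (slice(-1, None),) * x.ndim

first = x[first_index].item()
last = x[last_index].item()

# Assigning through the same index sets that element in place.
x[last_index] = 999
```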
mask

The boolean missing data mask of the data array.

Returned as a Data object. The mask may be set to ‘no missing data’ by deleting the attribute.

Examples

>>> d.shape
[12, 73, 96]
>>> m = d.mask
>>> m.dtype
dtype('bool')
>>> m.shape
[12, 73, 96]
>>> del d.mask
>>> d.array.mask
False
>>> import numpy
>>> d.array.mask is numpy.ma.nomask
True
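
The corresponding numpy.ma operations are sketched below: a full boolean mask with the same shape as the data, a test for any masked values, and an array that carries no mask at all. This uses numpy only and is meant as an analogue of the mask and is_masked attributes, not as a description of their implementation.

```python
import numpy as np

m = np.ma.array([[1, 2], [3, 4]], mask=[[False, True], [False, False]])

# A boolean mask array with the same shape as the data.
mask = np.ma.getmaskarray(m)

# True because at least one element is masked.
has_masked = np.ma.is_masked(m)

# A masked array built from the bare data carries no mask.
unmasked = np.ma.array(m.data)
no_mask = np.ma.getmask(unmasked) is np.ma.nomask
```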
ndim

Number of dimensions in the data array.

Examples

>>> d.shape
[73, 96]
>>> d.ndim
2
shape

List of the data array’s dimension sizes.

Note that this attribute is a list, not a tuple.

Examples

>>> d.shape
[73, 96]
size

Number of elements in the data array.

Examples

>>> d.shape
[73, 96]
>>> d.size
7008
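
The relationship between shape, ndim and size can be written out in pure Python: the number of dimensions is the length of the shape list and the number of elements is the product of its entries. This is a sketch of the arithmetic only.

```python
from functools import reduce
from operator import mul

shape = [73, 96]

ndim = len(shape)
# The initial value 1 makes an empty shape (a scalar) have one element.
size = reduce(mul, shape, 1)
```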
varray

A numpy array view of the data array.

Note that making changes to elements of the returned view changes the underlying data.

Examples

>>> a = d.varray
>>> type(a)
<type 'numpy.ndarray'>
>>> a
array([0, 1, 2, 3, 4])
>>> a[0] = 999
>>> d.varray
array([999,   1,   2,   3,   4])
