Field structure

A field (stored in a cf.Field object) is a container for a data array (stored in a cf.Data object) and its metadata. The metadata comprise properties that describe the physical nature of the data and a coordinate system (called a domain, stored in a cf.Domain object) that describes the position of each element of the data array.

It is structured in exactly the same way as a field construct defined by the CF data model.

The field’s domain may contain coordinates and cell measures, which themselves contain data arrays and properties to describe them and are stored in cf.Coordinate and cf.CellMeasure objects respectively. It may also contain transforms (stored in cf.Transform objects), which provide further coordinate metadata and describe how other coordinates may be computed.

As in the CF data model, all components of a field are optional.
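
The containment described above can be pictured with a minimal sketch in plain Python. The names SketchField and SketchDomain and their attributes are hypothetical stand-ins chosen for illustration, not the cf classes. They show only which component holds what, and every component defaults to empty because all components are optional.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchDomain:
    """Hypothetical stand-in for a domain: coordinates, cell measures, transforms."""
    coordinates: Dict[str, Any] = field(default_factory=dict)
    cell_measures: Dict[str, Any] = field(default_factory=dict)
    transforms: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SketchField:
    """Hypothetical stand-in for a field: data array, properties and a domain."""
    data: Optional[List[Any]] = None
    properties: Dict[str, Any] = field(default_factory=dict)
    domain: Optional[SketchDomain] = None


# Every component is optional, so an empty field is valid.
empty = SketchField()

# A field with data, one descriptive property and a one-coordinate domain.
tas = SketchField(
    data=[271.5, 272.1, 273.0],
    properties={"standard_name": "air_temperature"},
    domain=SketchDomain(coordinates={"dim0": [0.0, 1.0, 2.0]}),
)
```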

Example

The structure is exposed by printing a full dump of a field; descriptions of some of the output sections follow the dump:

>>> type(f)
<class 'cf.field.Field'>
>>> cf.dump(f)
======================
Field: air_temperature
======================
Dimensions
    height(1)
    latitude(64)
    longitude(128)
    time(12)

Data(time(12), latitude(64), longitude(128)) = [[[236.512756348, ..., 256.93371582]]] K
cell_methods = time: mean (interval: 1.0 month)

experiment_id = 'pre-industrial control experiment'
long_name = 'Surface Air Temperature'
standard_name = 'air_temperature'
title = 'model output prepared for IPCC AR4'

Dimension coordinate: time
    Data(time(12)) = [ 450-11-16 00:00:00, ...,  451-10-16 12:00:00] noleap calendar
    Bounds(time(12), 2) = [[ 450-11-01 00:00:00, ...,  451-11-01 00:00:00]] noleap calendar
    axis = 'T'
    long_name = 'time'
    standard_name = 'time'

Dimension coordinate: latitude
    Data(latitude(64)) = [-87.8638000488, ..., 87.8638000488] degrees_north
    Bounds(latitude(64), 2) = [[-90.0, ..., 90.0]] degrees_north
    axis = 'Y'
    long_name = 'latitude'
    standard_name = 'latitude'

Dimension coordinate: longitude
    Data(longitude(128)) = [0.0, ..., 357.1875] degrees_east
    Bounds(longitude(128), 2) = [[-1.40625, ..., 358.59375]] degrees_east
    axis = 'X'
    long_name = 'longitude'
    standard_name = 'longitude'

Dimension coordinate: height
    Data(height(1)) = [2.0] m
    axis = 'Z'
    long_name = 'height'
    positive = 'up'
    standard_name = 'height'

Dimensions
    Describes the identities and sizes of the field’s dimensions.
air temperature field
    Describes the field’s data array (array shape, first and last values, units and cell methods) and other descriptive CF properties (experiment_id, long_name, standard_name and title).
domain
    Describes the coordinate system of the field by describing the coordinates, cell measures and transforms. See the Domain structure section for more details.
time coordinate
    Describes the coordinate’s data array (array shape, first and last values and units), the coordinate’s cell bounds array (array shape, first and last values and units) and other descriptive CF properties (axis, long_name and standard_name).

CF properties and attributes

Most CF properties are stored as familiar Python objects (str, int, float, tuple, list, numpy.ndarray, etc.):

>>> f.standard_name
'air_temperature'
>>> f._FillValue
1e+20
>>> f.valid_range
(-50.0, 50.0)
>>> f.flag_values
array([0, 1, 2, 4], dtype=int8)

There are some CF properties which have their own class:

Property      Class           Description
cell_methods  cf.CellMethods  The characteristics that are represented by cell values

>>> f.cell_methods
<CF CellMethods: time: mean (interval: 1.0 month)>

There are some attributes which store metadata other than CF properties which require their own class:

Attribute  Class      Description
Flags      cf.Flags   The self-describing CF flag values, meanings and masks
Units      cf.Units   The units of the data array
domain     cf.Domain  The field’s domain

>>> f.Flags
<CF Flags: values=[0 1 2], masks=[0 2 2], meanings=['low' 'medium' 'high']>
>>> f.Units
<CF Units: days since 1860-1-1 calendar=360_day>
>>> f.domain
<CF Domain: (110, 106, 1, 19)>

The cf.Units object may be accessed through the field’s units and calendar CF properties and the cf.Flags object may be accessed through the field’s flag_values, flag_meanings and flag_masks CF properties:

>>> f.calendar = 'noleap'
>>> f.flag_values = ['a', 'b', 'c']

The cf.Units and cf.Flags objects may also be manipulated directly, which automatically adjusts the relevant CF properties:

>>> f.Units
<CF Units: 'm'>
>>> f.units
'm'
>>> f.Units *= 1000
>>> f.Units
<CF Units: '1000 m'>
>>> f.units
'1000 m'
>>> f.Units.units = '10 m'
>>> f.units
'10 m'
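
One way such two-way behaviour can arise is for the units property to read from and write to the units object, so that only one copy of the value exists. The sketch below assumes that design for illustration. The source does not say how cf implements it, and SketchUnits and SketchField are hypothetical names.

```python
class SketchUnits:
    """Hypothetical units holder with a single string attribute."""

    def __init__(self, units):
        self.units = units


class SketchField:
    """Hypothetical field whose 'units' property is a view on its Units object."""

    def __init__(self, units):
        self.Units = SketchUnits(units)

    @property
    def units(self):
        # Read straight from the units object, so there is nothing to keep in sync.
        return self.Units.units

    @units.setter
    def units(self, value):
        # Writing through the property updates the units object.
        self.Units.units = value


f = SketchField("m")
f.Units.units = "10 m"        # a change made on the units object...
via_object = f.units          # ...is visible through the property
f.units = "km"                # a change made through the property...
via_property = f.Units.units  # ...is visible on the units object
```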

Other attributes used commonly (but not reserved) are:

Attribute  Description
file       The name of the file the field was read from
id         An identifier for the field in the absence of a standard name. This may be used for ascertaining if two fields are aggregatable or combinable.
ncvar      The netCDF variable name of the field

>>> f.file
'/home/me/file.nc'
>>> f.id
'data_123'
>>> f.ncvar
'tas'

Data array

A field’s data array is stored by the Data attribute as a cf.Data object:

>>> type(f.Data)
<class 'cf.data.Data'>

The cf.Data object:

  • Contains an N-dimensional array with many similarities to a numpy array.
  • Contains the units of the array elements.
  • Uses LAMA functionality to store and operate on arrays which are larger than the available memory.
  • Supports masked arrays [1], regardless of whether or not it was initialized with a masked array.

Attributes

A field has attributes which give information about its data array. These are analogous to their numpy counterparts with the same name.

Field attribute  Description                                Numpy counterpart
size             Number of elements in the data array       numpy.ndarray.size
shape            Tuple of the data array’s dimension sizes  numpy.ndarray.shape
ndim             Number of dimensions in the data array     numpy.ndarray.ndim
dtype            Numpy data type of the data array          numpy.ndarray.dtype
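
For numpy arrays, ndim is the length of shape and size is the product of its entries. Assuming the field attributes follow their numpy counterparts in this respect too, the relationships can be checked with plain Python. The shape used here is taken from the air temperature examples in this document.

```python
from math import prod

# Shape of a (time, latitude, longitude) data array.
shape = (12, 73, 96)

# Number of dimensions is the number of entries in the shape tuple.
ndim = len(shape)

# Number of elements is the product of the dimension sizes.
size = prod(shape)
```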

Data mask

The data array’s mask may be retrieved with the field’s mask attribute. The mask is returned as a field with a boolean data array:

>>> f
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> m = f.mask
>>> m
<CF Field: mask(time(12), latitude(73), longitude(96))>
>>> m.dtype
dtype('bool')
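
The idea of a boolean mask with the same shape as the data can be sketched with nested lists. Using None to mark a missing element is a convention chosen for this illustration only and is not how cf stores missing data.

```python
# A small 2 x 3 array in which None marks a missing element (illustrative convention).
data = [
    [280.1, None, 281.4],
    [None, 279.9, 280.0],
]

# The mask has the same shape as the data and is True where an element is missing.
mask = [[value is None for value in row] for row in data]
```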

Domain structure

A domain completely describes the field’s coordinate system.

It contains the dimension constructs, auxiliary coordinate constructs, transform constructs and cell measure constructs defined by the CF data model.

A field’s domain is stored in its domain attribute, the value of which is a cf.Domain object.

The domain is a dictionary-like object whose key/value pairs identify and store the coordinate and cell measure constructs which describe it.

Dimensionality

The dimension sizes of the domain are given by the domain’s dimension_sizes attribute:

>>> f.domain.dimension_sizes
{'dim1': 19, 'dim0': 12, 'dim2': 73, 'dim3': 96}

Keys are dimension identifiers ('dimN') and values are integers giving the size of each dimension.

The N part of each key identifier is replaced by an arbitrary integer greater than or equal to zero, the only restriction being that the resulting identifier is not already in use. No meaning should be inferred from the integer part of the identifiers, which need not include zero nor be consecutive (although these will commonly be the case).
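
The only rule for a new identifier is that it must not already be in use. A minimal sketch of one way to pick one is shown below. The helper new_identifier is hypothetical and is not part of cf, and choosing the smallest free integer is just one valid policy.

```python
def new_identifier(existing, prefix="dim"):
    """Return the first '<prefix>N' (N >= 0) that is not a key of 'existing'."""
    n = 0
    while f"{prefix}{n}" in existing:
        n += 1
    return f"{prefix}{n}"


dimension_sizes = {"dim1": 19, "dim0": 12, "dim2": 73, "dim3": 96}

# Add a new size 1 dimension under an unused identifier.
key = new_identifier(dimension_sizes)
dimension_sizes[key] = 1

# Existing identifiers need not include zero nor be consecutive.
sparse_key = new_identifier({"dim3": 5, "dim7": 2})
```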

Components

The domain’s key/value pairs identify and store its coordinate and cell measure constructs.

Keys are dimension coordinate, auxiliary coordinate and cell measure identifiers ('dimN', 'auxN' and 'cmN' respectively) and values are cf.Coordinate and cf.CellMeasure objects as appropriate:

>>> f.domain['dim0']
<CF Coordinate: time(12)>
>>> f.domain['dim2']
<CF Coordinate: latitude(73)>
>>> f.domain['aux0']
<CF Coordinate: forecast_time(12)>

The dimensions of each of these components, and of the field’s data array, are stored as ordered lists in the dimensions attribute:

>>> f.domain.dimensions
{'data': ['dim0', 'dim1', 'dim2', 'dim3'],
 'aux0': ['dim0'],
 'dim0': ['dim0'],
 'dim1': ['dim1'],
 'dim2': ['dim2'],
 'dim3': ['dim3']}

Keys are dimension coordinate identifiers ('dimN'), auxiliary coordinate identifiers ('auxN') and cell measure construct identifiers ('cmN'), and values are lists of dimension identifiers ('dimN'), stating the dimensions, in order, of the construct concerned. The dimension identifiers must all exist as keys to the dimension_sizes dictionary.

The special key 'data' stores the ordered list of dimension identifiers ('dimN') relating to a containing field’s data array.
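
The relationship between the two dictionaries can be sketched with plain dicts that mirror the examples above. The helpers undefined_dimensions and shape_of are hypothetical and are not part of cf. They check that every dimension identifier in use has a size, and derive the shape implied for a construct or for the data array.

```python
dimension_sizes = {"dim0": 12, "dim1": 19, "dim2": 73, "dim3": 96}
dimensions = {
    "data": ["dim0", "dim1", "dim2", "dim3"],
    "aux0": ["dim0"],
    "dim0": ["dim0"],
    "dim1": ["dim1"],
    "dim2": ["dim2"],
    "dim3": ["dim3"],
}


def undefined_dimensions(dimensions, dimension_sizes):
    """Return the dimension identifiers that are used but have no recorded size."""
    used = {dim for dims in dimensions.values() for dim in dims}
    return sorted(used - set(dimension_sizes))


def shape_of(key, dimensions, dimension_sizes):
    """Return the shape implied for a construct key, or for the data array ('data')."""
    return tuple(dimension_sizes[dim] for dim in dimensions[key])


missing = undefined_dimensions(dimensions, dimension_sizes)
data_shape = shape_of("data", dimensions, dimension_sizes)
aux_shape = shape_of("aux0", dimensions, dimension_sizes)
```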

The N part of each key identifier should be replaced by an arbitrary integer greater than or equal to zero, the only restriction being that the resulting identifier is not already in use. No meaning should be inferred from the integer part of the identifiers, which need not include zero nor be consecutive (although these will commonly be the case).

Note

The field’s data array may contain fewer size 1 dimensions than its domain.

Transform constructs are stored in the transforms attribute, which is a dictionary-like object containing cf.Transform objects:

>>> f.domain.transforms
{'trans0': <CF Transform: atmosphere_sigma_coordinate>,
 'trans1': <CF Transform: rotated_latitude_longitude>}

Keys are transform identifiers ('transN') and values are cf.Transform objects.

The N part of each key identifier should be replaced by an arbitrary integer greater than or equal to zero, the only restriction being that the resulting identifier is not already in use. No meaning should be inferred from the integer part of the identifiers, which need not include zero nor be consecutive (although these will commonly be the case).

A transform may be associated with any number of the domain’s coordinates via their transform attributes.

Field list

A cf.FieldList object is an ordered sequence of fields analogous to a built-in Python list.

It has all of the Python list-like methods (__contains__, __getitem__, __setitem__, __len__, __delitem__, append, count, extend, index, insert, pop, remove, reverse), which behave as expected. For example:

>>> type(fl)
<class 'cf.field.FieldList'>
>>> fl
[<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>,
 <CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>]
>>> len(fl)
2
>>> for f in fl:
...     print(repr(f))
...
<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> for f in fl[::-1]:
...     print(repr(f))
...
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>
>>> f = fl[0]
>>> type(f)
<class 'cf.field.Field'>
>>> f in fl
True
>>> f = fl.pop()
>>> type(f)
<class 'cf.field.Field'>

Field versus field list

In some contexts, whether an object is a field or a field list is not known and does not matter. So to avoid ungainly type testing, some aspects of the cf.FieldList interface are shared by a cf.Field and vice versa.

Attributes and methods

Any attribute or method belonging to a field may be used on a field list and will be applied independently to each element:

>>> fl.ndim
[2, 3]
>>> fl.subspace[..., 0]
[<CF Field: x_wind(grid_latitude(110), grid_longitude(1)) m s-1>,
 <CF Field: air_temperature(time(12), latitude(73), longitude(1)) K>]
>>> fl **= 2
>>> for f in fl:
...     f.long_name = f.standard_name + '**2'
...
>>> fl
[<CF Field: long_name:x_wind**2(grid_latitude(110), grid_longitude(1)) m2 s-2>,
 <CF Field: long_name:air_temperature**2(time(12), latitude(73), longitude(1)) K2>]
>>> fl.squeeze('longitude')
[<CF Field: long_name:x_wind**2(grid_latitude(110)) m2 s-2>,
 <CF Field: long_name:air_temperature**2(time(12), latitude(73)) K2>]
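
Applying an attribute or method to every element can be sketched with a list subclass that forwards unknown attribute lookups to its elements. This is an illustration of the idea under that assumption, not the cf.FieldList implementation, and SketchField and SketchFieldList are hypothetical names.

```python
class SketchField:
    """Hypothetical field with one attribute and one method."""

    def __init__(self, name, ndim):
        self.name = name
        self.ndim = ndim

    def describe(self, prefix):
        return prefix + self.name


class SketchFieldList(list):
    """Hypothetical list that forwards unknown attributes to every element."""

    def __getattr__(self, name):
        # Only called when normal lookup fails, so list methods are unaffected.
        values = [getattr(item, name) for item in self]
        if values and all(callable(v) for v in values):
            # Methods: return a function that calls each one and collects the results.
            return lambda *args, **kwargs: [v(*args, **kwargs) for v in values]
        # Plain attributes: return one value per element.
        return values


fl = SketchFieldList([SketchField("x_wind", 2), SketchField("air_temperature", 3)])
ndims = fl.ndim
labels = fl.describe("field: ")
```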

CF properties may be changed to a common value with the setattr method:

>>> fl.setattr('comment', 'my data')
>>> fl.comment
['my data', 'my data']
>>> fl.setattr('foo', 'bar')
>>> fl.getattr('foo')
['bar', 'bar']

Changes tailored to each individual field in the list need to be carried out in a loop:

>>> long_names = ('square of x wind', 'square of temperature')
>>> for f, value in zip(fl, long_names):
...     f.long_name = value
...
>>> for f in fl:
...     f.long_name = 'square of ' + f.long_name
...

Looping

Just as it is straightforward to iterate over the fields in a field list, a field will behave like a single-element field list in iterative and indexing contexts:

>>> f
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> f is f[0]
True
>>> f is f[-1]
True
>>> f is f[slice(0, 1)]
True
>>> f is f[slice(0, None, -1)]
True
>>> for g in f:
...     print(repr(g))
...
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
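
The behaviour shown above can be imitated by an object whose indexing and iteration return the object itself. The sketch below is a hypothetical illustration, not the cf.Field implementation. In particular, raising IndexError for an out-of-range index or an empty slice is an assumption, since the source does not say what cf does in those cases.

```python
class SketchField:
    """Hypothetical field that imitates a one-element sequence of itself."""

    def __len__(self):
        return 1

    def __iter__(self):
        # Iterating yields the field itself exactly once.
        yield self

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice that selects the single element returns the field.
            if len(range(*index.indices(1))) == 1:
                return self
            raise IndexError("slice selects no elements")
        if index in (0, -1):
            return self
        raise IndexError("field index out of range")


f = SketchField()
items = [g for g in f]
```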

Footnotes

[1] Arrays that may have missing or invalid entries.