Parameters: |
- method : str
Define the collapse method. All of the axes specified by the
axes parameter are collapsed simultaneously by this
method. The method is given by one of the following strings:
method | Description
'max' or 'maximum' | Maximum
'min' or 'minimum' | Minimum
'sum' | Sum
'mid_range' | Mid-range
'range' | Range
'mean' or 'average' or 'avg' | Mean
'sd' or 'standard_deviation' | Standard deviation
'var' or 'variance' | Variance
'sample_size' | Sample size
'sum_of_weights' | Sum of weights
'sum_of_weights2' | Sum of squares of weights
An alternative form is to provide a CF cell methods-like
string. In this case an ordered sequence of collapses may be
defined and both the collapse methods and their axes are
provided. The axes are interpreted as for the axes
parameter, which must not also be set. For example:
>>> g = f.collapse('time: max (interval 1 hr) X: Y: mean dim3: sd')
is equivalent to:
>>> g = f.collapse('max', axes='time')
>>> g = g.collapse('mean', axes=['X', 'Y'])
>>> g = g.collapse('sd', axes='dim3')
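The way such a string decomposes into ordered steps can be sketched in plain Python. The helper `split_cell_methods` below is hypothetical and written only for illustration; it is not the parser used by cf, and it simply discards parenthesised qualifiers such as `(interval 1 hr)`.

```python
import re


def split_cell_methods(string):
    """Split a cell methods-like string into ordered (axes, method) steps.

    Hypothetical helper for illustration only; not cf's own parser.
    """
    # Drop parenthesised qualifiers, e.g. '(interval 1 hr)'
    string = re.sub(r"\([^)]*\)", " ", string)

    steps = []
    axes = []
    for token in string.split():
        if token.endswith(":"):
            # A token ending in ':' names an axis for the next method
            axes.append(token[:-1])
        elif axes:
            # The first word after the axes is the collapse method
            steps.append((axes, token))
            axes = []
        elif steps:
            # Further words are modifiers such as 'within years'
            last_axes, last_method = steps[-1]
            steps[-1] = (last_axes, last_method + " " + token)
        else:
            raise ValueError("method given before any axis: " + token)
    return steps


steps = split_cell_methods("time: max (interval 1 hr) X: Y: mean dim3: sd")
```

Each resulting step corresponds to one of the chained `collapse` calls shown above, in the same order.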
Climatological collapses are carried out if a method string
contains any of the modifiers 'within days', 'within
years', 'over days' or 'over years'. For example, to
collapse a time axis into multiannual means of calendar
monthly minima:
>>> g = f.collapse('time: minimum within years T: mean over years',
... within_years=cf.M())
which is equivalent to:
>>> g = f.collapse('time: minimum within years', within_years=cf.M())
>>> g = g.collapse('mean over years', axes='T')
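As a rough illustration of what "multiannual means of calendar monthly minima" computes, the following sketch works on plain Python tuples rather than a field. The sample values are invented, and the grouping logic is a simplified stand-in for what the library does with time coordinates and bounds.

```python
from collections import defaultdict

# Invented sample data: (year, month, value)
samples = [
    (2000, 1, 3.0), (2000, 1, 1.0), (2000, 2, 4.0), (2000, 2, 6.0),
    (2001, 1, 5.0), (2001, 1, 2.0), (2001, 2, 8.0), (2001, 2, 7.0),
]

# Step 1, "minimum within years": one minimum per calendar month of each year
within = defaultdict(list)
for year, month, value in samples:
    within[(year, month)].append(value)
monthly_minima = {key: min(values) for key, values in within.items()}

# Step 2, "mean over years": average the minima that share a calendar month
over = defaultdict(list)
for (year, month), minimum in monthly_minima.items():
    over[month].append(minimum)
multiannual_means = {
    month: sum(values) / len(values) for month, values in sorted(over.items())
}
```

Here the result has one value per calendar month, which is the size the collapsed time axis would have in the equivalent two-step collapse.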
- axes, kwargs : optional
The axes to be collapsed. The axes are those that would be
selected by this call of the field’s axes method:
f.axes(axes, **kwargs). See cf.Field.axes for
details. If an axis has size 1 then it is ignored. By default
all axes with size greater than 1 are collapsed. An exception
is raised if axes is not None and method is a CF cell
methods-like string.
- weights : optional
Specify the weights for the collapse. The weights are those
that would be returned by this call of the field’s
weights method: f.weights(weights,
components=True). By default weights is 'auto', meaning
that a combination of volume, area and linear weights is
created based on the field’s metadata. See cf.Field.weights
for details.
- Example:
To specify weights based on cell areas use
weights='area'. To specify weights based on cell areas
and linear height you could set weights=('area', 'Z').
- squeeze : bool, optional
If True then size 1 collapsed axes are removed from the output
data array. By default the axes which are collapsed are
retained in the result’s data array.
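The effect of squeeze on the shape of the output data array can be sketched as follows. The function `collapsed_shape` is a hypothetical helper operating on a shape tuple and integer axis positions; it only mirrors the rule stated above.

```python
def collapsed_shape(shape, collapsed_axes, squeeze=False):
    """Return the output shape after collapsing the given axis positions.

    Hypothetical helper for illustration only.
    """
    out = []
    for position, size in enumerate(shape):
        if position in collapsed_axes:
            # A collapsed axis has size 1 and is kept unless squeezed
            if not squeeze:
                out.append(1)
        else:
            out.append(size)
    return tuple(out)


retained = collapsed_shape((12, 73, 96), {0})
squeezed = collapsed_shape((12, 73, 96), {0}, squeeze=True)
```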
- mtol : number, optional
Set the fraction of input array elements which is allowed to
contain missing data when contributing to an individual output
array element. Where this fraction exceeds mtol, missing
data is returned. The default is 1, meaning that a missing
datum in the output array only occurs when its contributing
input array elements are all missing data. A value of 0 means
that a missing datum in the output array occurs whenever any
of its contributing input array elements are missing data. Any
intermediate value is permitted.
- Example:
To ensure that an output array element is a missing datum
if more than 25% of its input array elements are missing
data: mtol=0.25.
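The mtol rule can be sketched for a single output element in plain Python. In this illustration `None` stands for missing data and `collapse_mean` is a hypothetical helper, not part of cf.

```python
def collapse_mean(values, mtol=1.0):
    """Mean of the non-missing values, or None if too many are missing.

    None marks a missing datum. Hypothetical helper for illustration only.
    """
    present = [v for v in values if v is not None]
    missing_fraction = (len(values) - len(present)) / len(values)

    # Missing data is returned where the missing fraction exceeds mtol,
    # and in any case when there is nothing left to average
    if missing_fraction > mtol or not present:
        return None
    return sum(present) / len(present)


one_of_four_missing = collapse_mean([1.0, None, 3.0, 5.0], mtol=0.25)
two_of_four_missing = collapse_mean([1.0, None, None, 5.0], mtol=0.25)
```

With mtol=0.25 the first call still yields a mean because exactly 25% of the inputs are missing, whereas the second returns a missing datum because 50% are missing.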
- ddof : number, optional
The delta degrees of freedom in the calculation of a standard
deviation or variance. The number of degrees of freedom used
in the calculation is (N-ddof) where N represents the number
of non-missing elements. By default ddof is 1, meaning the
standard deviation and variance of the population are estimated
according to the usual formula with (N-1) in the denominator
to avoid the bias caused by the use of the sample mean
(Bessel’s correction).
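For the unweighted case, the role of ddof can be sketched with a small function that divides by (N-ddof). The helper `variance` is written for illustration and is checked here against the standard library's `statistics` module; it does not reproduce the weighted calculation.

```python
import math
import statistics


def variance(values, ddof=1):
    """Unweighted variance with (N - ddof) in the denominator.

    Illustrative helper only; missing data and weights are not handled.
    """
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# ddof=1 applies Bessel's correction; ddof=0 gives the population formula
sample_variance = variance(data, ddof=1)
population_variance = variance(data, ddof=0)
```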
- a : optional
Specify the value of \(a\) in the calculation of a
weighted standard deviation or variance when the ddof
parameter is greater than 0. See the notes above for
details. A value is required for each output array element, so a
must be a single number or else a field which is broadcastable
to the collapsed field. By default the calculation of each
output array element uses an approximate value of a which is
the smallest positive number whose products with the smallest
and largest of the contributing weights, and their sum, are
all integers. In this case, a positive number is considered to
be an integer if its decimal part is sufficiently small (no
greater than \(10^{-8}\) plus \(10^{-5}\) times its
integer part).
- Example:
To guarantee that \(\tilde{s}\) is exact when the
weights for each output array element are collectively
coprime integers: a=1.
- Note:
- The default approximation will never overestimate
\(a\), so \(\tilde{s}\) will always be greater than
or equal to its true value when \(a\) is not
specified.
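The integer test used by the default approximation of a can be sketched as below. This is a literal reading of the tolerance quoted in the description of the a parameter; the function name and its `atol` and `rtol` arguments are assumptions made for illustration, and the library's actual check may differ in detail.

```python
import math


def is_near_integer(x, atol=1e-8, rtol=1e-5):
    """Whether a positive number counts as an integer under the tolerance.

    Literal reading of the documented rule; hypothetical helper only.
    """
    integer_part = math.floor(x)
    decimal_part = x - integer_part
    # The decimal part must be no greater than atol + rtol * integer part
    return decimal_part <= atol + rtol * integer_part


exact = is_near_integer(3.0)
close = is_near_integer(3.00002)
far = is_near_integer(3.5)
```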
- coordinate : str, optional
Set how the cell coordinate values for collapsed axes are
defined. This has no effect on the cell bounds for the
collapsed axes, which always represent the extrema of the
input coordinates. Valid values are:
coordinate | Description
'mid_range' | An output coordinate is the average of the first and last input coordinate bounds (or the first and last coordinates if there are no bounds). This is the default.
'min' | An output coordinate is the minimum of the input coordinates.
'max' | An output coordinate is the maximum of the input coordinates.
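The three coordinate options can be sketched for one collapsed axis as follows. The helper `collapsed_coordinate` is hypothetical, and it assumes that "first and last input coordinate bounds" means the lower bound of the first cell and the upper bound of the last cell.

```python
def collapsed_coordinate(coords, bounds=None, coordinate="mid_range"):
    """Coordinate value of a collapsed axis.

    Hypothetical helper; `bounds` is a list of (lower, upper) pairs.
    """
    if coordinate == "min":
        return min(coords)
    if coordinate == "max":
        return max(coords)
    if coordinate == "mid_range":
        if bounds is not None:
            # Assumed reading: lower bound of first cell, upper of last
            first, last = bounds[0][0], bounds[-1][1]
        else:
            first, last = coords[0], coords[-1]
        return (first + last) / 2
    raise ValueError("unknown coordinate option: " + repr(coordinate))


coords = [2, 15, 25]
bounds = [(0, 10), (10, 20), (20, 30)]

with_bounds = collapsed_coordinate(coords, bounds)
without_bounds = collapsed_coordinate(coords)
```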
- group : optional
Independently collapse groups of axis elements. Upon output,
the results of the collapses are concatenated so that the
output axis has a size equal to the number of groups. The
group parameter defines how the elements are partitioned
into groups, and may be one of:
A numpy.array of integers defining groups. The array
must have the same length as the axis to be collapsed and
its sequence of values corresponds to the axis
elements. Each group contains the elements which
correspond to a common non-negative integer value in the
numpy array. Upon output, the collapsed axis is arranged
in order of increasing group number.
- Example:
For an axis of size 8, create two groups, the first
containing the first and last elements and the second
containing the 3rd, 4th and 5th elements, whilst
ignoring the 2nd, 6th and 7th elements:
group=numpy.array([0, -1, 4, 4, 4, -1, -2, 0]).
- Note:
- The groups do not have to be in runs of consecutive
elements; they may be scattered throughout the axis.
- An element which corresponds to a negative integer
in the array will not be in a group.
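How an integer array partitions an axis into groups can be sketched with a plain list of labels. The helper `groups_from_labels` is hypothetical; it returns the element positions of each group, ordered by increasing group number, and skips negative labels.

```python
def groups_from_labels(labels):
    """Element positions of each group, in order of increasing label.

    Hypothetical helper; negative labels are not placed in any group.
    """
    groups = {}
    for position, label in enumerate(labels):
        if label >= 0:
            groups.setdefault(label, []).append(position)
    return [groups[label] for label in sorted(groups)]


labels = [0, -1, 4, 4, 4, -1, -2, 0]
positions = groups_from_labels(labels)

# Collapse each group independently, here with a mean
values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
group_means = [sum(values[i] for i in group) / len(group) for group in positions]
```

For the size 8 axis of the example above, the output axis has size 2: the first group holds the first and last elements and the second holds the 3rd, 4th and 5th.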
- group_by : str, optional
Specify how coordinates are assigned to the groups defined by
the group, within_days or within_years
parameter. Ignored unless one of these parameters is a
cf.Data or cf.TimeDuration object. The group_by
parameter may be one of:
- 'coords'. This is the default. Each group contains the
axis elements whose coordinate values lie within the group
limits. Every element will be in a group.
- 'bounds'. Each group contains the axis elements whose
upper and lower coordinate bounds both lie within the
group limits. Some elements may not be inside any group,
either because the group limits do not coincide with
coordinate bounds or because the group size is
sufficiently small.
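The difference between the two group_by options can be sketched as below. The helper `assign_to_groups` is hypothetical, and the treatment of group limits as half-open intervals for 'coords' is an assumption made only for this illustration.

```python
def assign_to_groups(cells, limits, group_by="coords"):
    """Positions of the cells falling in each group.

    Hypothetical helper. `cells` is a list of (coord, (lower, upper))
    pairs and `limits` is a list of (start, end) group limits.
    """
    groups = []
    for start, end in limits:
        members = []
        for position, (coord, (lower, upper)) in enumerate(cells):
            if group_by == "coords":
                # Assumed half-open interval: start <= coord < end
                inside = start <= coord < end
            elif group_by == "bounds":
                # Both bounds must lie within the group limits
                inside = start <= lower and upper <= end
            else:
                raise ValueError("unknown group_by: " + repr(group_by))
            if inside:
                members.append(position)
        groups.append(members)
    return groups


cells = [(5, (0, 10)), (15, (10, 20)), (25, (20, 30))]
limits = [(0, 12), (12, 30)]

by_coords = assign_to_groups(cells, limits, "coords")
by_bounds = assign_to_groups(cells, limits, "bounds")
```

With 'bounds', the middle cell straddles the limit at 12 and so belongs to no group, whereas with 'coords' every cell is assigned.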
- regroup : bool, optional
For grouped collapses, return a numpy.array of integers
which identifies the groups defined by the group
parameter. The array is interpreted as for a numpy array value
of the group parameter, and thus may subsequently be used as the
group parameter in a separate collapse. For example:
>>> groups = f.collapse('time: mean', group=10, regroup=True)
>>> g = f.collapse('time: mean', group=groups)
is equivalent to:
>>> g = f.collapse('time: mean', group=10)
- within_days : optional
Independently collapse groups of reference-time axis elements
for CF “within days” climatological statistics. Each group
contains elements whose coordinates span a time interval of up
to one day. Upon output, the results of the collapses are
concatenated so that the output axis has a size equal to the
number of groups.
- Note:
For CF compliance, a “within days” collapse should be
followed by an “over days” collapse.
The within_days parameter defines how the elements are
partitioned into groups, and may be one of:
- within_years : optional
Independently collapse groups of reference-time axis elements
for CF “within years” climatological statistics. Each group
contains elements whose coordinates span a time interval of up
to one calendar year. Upon output, the results of the
collapses are concatenated so that the output axis has a size
equal to the number of groups.
- Note:
For CF compliance, a “within years” collapse should be
followed by an “over years” collapse.
The within_years parameter defines how the elements are
partitioned into groups, and may be one of:
- over_days : optional
Independently collapse groups of reference-time axis elements
for CF “over days” climatological statistics. Each group
contains elements whose coordinates are matching, in that
their lower bounds have a common time of day but different
dates of the year, and their upper bounds also have a common
time of day but different dates of the year. Upon output, the
results of the collapses are concatenated so that the output
axis has a size equal to the number of groups.
- Example:
An element with coordinate bounds {1999-12-31 06:00:00,
1999-12-31 18:00:00} matches an element with
coordinate bounds {2000-01-01 06:00:00, 2000-01-01
18:00:00}.
- Example:
An element with coordinate bounds {1999-12-31 00:00:00,
2000-01-01 00:00:00} matches an element with
coordinate bounds {2000-01-01 00:00:00, 2000-01-02
00:00:00}.
- Note:
A coordinate parameter value of 'min' is assumed,
regardless of its given value.
A group_by parameter value of 'bounds' is assumed,
regardless of its given value.
An “over days” collapse must be preceded by a “within
days” collapse, as described by the CF conventions. If the
field already contains sub-daily data, but does not have
the “within days” cell methods flag then it may be added,
for example, as follows (this example assumes that the
appropriate cell method is the most recently applied,
which need not be the case; see cf.CellMethods for
details):
>>> f.cell_methods[-1].within = 'days'
The over_days parameter defines how the elements are
partitioned into groups, and may be one of:
- None. This is the default. Each collection of
matching elements forms a group.
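The matching rule for "over days" groups can be sketched with the standard library's `datetime`. The helper `over_days_groups` is hypothetical: it groups elements whose lower bounds share a time of day and whose upper bounds share a time of day, and it ignores calendars and any other detail the library handles.

```python
from collections import defaultdict
from datetime import datetime


def over_days_groups(bounds):
    """Group element positions by the times of day of their bounds.

    Hypothetical helper; `bounds` is a list of (lower, upper) datetimes.
    """
    groups = defaultdict(list)
    for position, (lower, upper) in enumerate(bounds):
        groups[(lower.time(), upper.time())].append(position)
    # Dictionaries preserve insertion order, so groups appear in order
    # of their first element
    return list(groups.values())


bounds = [
    (datetime(1999, 12, 31, 6), datetime(1999, 12, 31, 18)),
    (datetime(1999, 12, 31, 18), datetime(2000, 1, 1, 6)),
    (datetime(2000, 1, 1, 6), datetime(2000, 1, 1, 18)),
    (datetime(2000, 1, 1, 18), datetime(2000, 1, 2, 6)),
]
matching = over_days_groups(bounds)
```

The first and third elements match, as in the first example above, so four sub-daily cells collapse to an output axis of size 2.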
- over_years : optional
Independently collapse groups of reference-time axis elements
for CF “over years” climatological statistics. Each group
contains elements whose coordinates are matching, in that
their lower bounds have a common sub-annual date but different
years, and their upper bounds also have a common sub-annual
date but different years. Upon output, the results of the
collapses are concatenated so that the output axis has a size
equal to the number of groups.
- Example:
An element with coordinate bounds {1999-06-01 06:00:00,
1999-09-01 06:00:00} matches an element with
coordinate bounds {2000-06-01 06:00:00, 2000-09-01
06:00:00}.
- Example:
An element with coordinate bounds {1999-12-01 00:00:00,
2000-12-01 00:00:00} matches an element with
coordinate bounds {2000-12-01 00:00:00, 2001-12-01
00:00:00}.
- Note:
A coordinate parameter value of 'min' is assumed,
regardless of its given value.
A group_by parameter value of 'bounds' is assumed,
regardless of its given value.
An “over years” collapse must be preceded by a “within
years” or an “over days” collapse, as described by the
CF conventions. If the field already contains sub-annual
data, but does not have the “within years” or “over
days” cell methods flag then it may be added, for
example, as follows (this example assumes that the
appropriate cell method is the most recently applied,
which need not be the case; see cf.CellMethods for
details):
>>> f.cell_methods[-1].over = 'days'
The over_years parameter defines how the elements are
partitioned into groups, and may be one of:
- None. Each collection of matching elements forms a
group. This is the default.
- i : bool, optional
If True then update the field list in place. By default a new
field list is created. In either case, a field list is
returned.
|