cf.Field.collapse

Field.collapse(method, axes=None, squeeze=False, mtol=1, weights=None, ddof=1, a=None, i=False, **kwargs)

Collapse axes by statistical calculations.

Missing data array elements and those with zero weight are omitted from the calculation.

The following collapse methods are available over any subset of the field’s axes:

Method Notes
Maximum The maximum of the values.
Minimum The minimum of the values.
Sum The sum of the values.
Mid-range The average of the maximum and the minimum of the values.
Range The absolute difference between the maximum and the minimum of the values.
Mean

The unweighted mean, \(m\), of \(N\) values \(x_i\) is

\[m=\frac{1}{N}\sum_{i=1}^{N} x_i\]

The weighted mean, \(\tilde{m}\), of \(N\) values \(x_i\) with corresponding weights \(w_i\) is

\[\tilde{m}=\frac{1}{\sum_{i=1}^{N} w_i} \sum_{i=1}^{N} w_i x_i\]
Standard deviation

The unweighted standard deviation, \(s\), of \(N\) values \(x_i\) with mean \(m\) and with \(N-ddof\) degrees of freedom (\(ddof\ge0\)) is

\[s=\sqrt{\frac{1}{N-ddof} \sum_{i=1}^{N} (x_i - m)^2}\]

The weighted standard deviation, \(\tilde{s}_N\), of \(N\) values \(x_i\) with corresponding weights \(w_i\), weighted mean \(\tilde{m}\) and with \(N\) degrees of freedom is

\[\tilde{s}_N=\sqrt{\frac{1} {\sum_{i=1}^{N} w_i} \sum_{i=1}^{N} w_i(x_i - \tilde{m})^2}\]

The weighted standard deviation, \(\tilde{s}\), of \(N\) values \(x_i\) with corresponding weights \(w_i\) and with \(N-ddof\) degrees of freedom (\(ddof>0\)) is

\[\tilde{s}=\sqrt{\frac{a \sum_{i=1}^{N} w_i}{a \sum_{i=1}^{N} w_i - ddof}} \tilde{s}_N\]

where \(a\) is the smallest positive number whose product with each weight is an integer. \(a \sum_{i=1}^{N} w_i\) is the size of a new sample created by each \(x_i\) having \(aw_i\) repeats. In practice, \(a\) may not exist or may be difficult to calculate, so \(a\) is either set to a predetermined value or an approximate value is calculated. The approximation is the smallest positive number whose products with the smallest and largest weights and the sum of the weights are all integers, where a positive number is considered to be an integer if its decimal part is sufficiently small (no greater than \(10^{-8}\) plus \(10^{-5}\) times its integer part). This approximation will never overestimate \(a\), so \(\tilde{s}\) will never be underestimated when the approximation is used. If the weights are all integers which are collectively coprime then setting \(a=1\) will guarantee that \(\tilde{s}\) is exact.

Variance The variance is the square of the standard deviation.
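
The weighted formulas above can be checked with a short, library-independent sketch. It is not the cf implementation: the helper names `weighted_mean` and `weighted_sd` are hypothetical, and \(a\) is passed in by the caller rather than approximated. With integer weights and \(a=1\), the result should match the ordinary sample standard deviation of the data with each \(x_i\) repeated \(w_i\) times.

```python
import math
import statistics

def weighted_mean(x, w):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

def weighted_sd(x, w, ddof=0, a=1.0):
    """Weighted standard deviation with N - ddof degrees of freedom.

    `a` rescales the weights to an effective sample size a * sum(w).
    It is supplied by the caller here (an assumption of this sketch).
    """
    total = sum(w)
    m = weighted_mean(x, w)
    # Weighted standard deviation with N degrees of freedom (ddof = 0)
    s_n = math.sqrt(sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x)) / total)
    if ddof == 0:
        return s_n
    n_eff = a * total
    return math.sqrt(n_eff / (n_eff - ddof)) * s_n

x = [1.0, 2.0, 4.0]
w = [1, 2, 1]          # integer, collectively coprime weights, so a = 1 is exact

mean_w = weighted_mean(x, w)
sd_w = weighted_sd(x, w, ddof=1, a=1)

# Same data with each value repeated according to its weight
expanded_sd = statistics.stdev([1.0, 2.0, 2.0, 4.0])
```

Here `sd_w` and `expanded_sd` agree to within floating-point rounding, which illustrates why \(a \sum w_i\) is described as the size of a new sample.
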
Parameters :
method : str or cf.CellMethods

Define the collapse method. All of the axes specified by the axes parameter are collapsed simultaneously by this method. Each method is given by one of the following strings:

Method Possible strings
Maximum 'max', 'maximum'
Minimum 'min', 'minimum'
Sum 'sum'
Mid-range 'mid_range'
Range 'range'
Mean 'mean', 'average', 'avg'
Standard deviation 'sd', 'standard_deviation'
Variance 'var', 'variance'

Alternatively, provide a CF cell methods-like string or an equivalent cf.CellMethods object. This form may define an ordered sequence of collapses, giving both the collapse methods and the axes to which each one applies. The axes are interpreted as for the axes parameter.

Example:

>>> g = f.collapse('time: max X: Y: mean dim3: sd')

is equivalent to:

>>> g = f.collapse('max', axes='time')
>>> g = g.collapse('mean', axes=['X', 'Y'])
>>> g = g.collapse('sd', axes='dim3')
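
To make the ordering explicit, the following sketch splits a cell methods-like string into its sequence of (axes, method) steps. The function `parse_collapse_string` is a hypothetical illustration written for this page, not cf's own parser, and it assumes the simple whitespace-separated form used in the example above.

```python
def parse_collapse_string(spec):
    """Split 'axis: [axis: ...] method ...' into ordered (axes, method) steps."""
    steps = []
    axes = []
    for token in spec.split():
        if token.endswith(':'):
            # A token ending in ':' names an axis for the next method
            axes.append(token[:-1])
        else:
            if not axes:
                raise ValueError(f"method {token!r} has no axes")
            steps.append((axes, token))
            axes = []
    if axes:
        raise ValueError(f"axes {axes} have no method")
    return steps

steps = parse_collapse_string('time: max X: Y: mean dim3: sd')
for step_axes, step_method in steps:
    print(step_method, step_axes)
```

Each step corresponds to one of the chained collapse calls shown above, applied in the order given.
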
axes, kwargs : optional

The axes to be collapsed. The axes are those that would be selected by this call of the field’s axes method: f.axes(axes, **kwargs). See cf.Field.axes for details. If an axis has size 1 then it is ignored. By default all axes with size greater than 1 are collapsed.

weights : optional

squeeze : bool, optional

If True then collapsed axes are removed from the data array. By default the axes which are collapsed are left in the result’s data array as axes with size 1.
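
The effect of squeeze on the shape of the data array can be sketched as follows. The helper `collapsed_shape` is hypothetical and works on plain shape tuples and axis positions only; it assumes every collapsed axis has size greater than 1.

```python
def collapsed_shape(shape, axes, squeeze=False):
    """Shape of a data array after collapsing the given axis positions."""
    out = []
    for position, size in enumerate(shape):
        if position not in axes:
            out.append(size)   # untouched axis keeps its size
        elif not squeeze:
            out.append(1)      # collapsed axis is kept with size 1
        # when squeeze is True the collapsed axis is dropped
    return tuple(out)

kept = collapsed_shape((12, 73, 96), axes={0})
squeezed = collapsed_shape((12, 73, 96), axes={0}, squeeze=True)
```
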

mtol : number, optional

For each element in the output data array, the fraction of contributing input array elements which is allowed to contain missing data. Where this fraction exceeds mtol, missing data is returned. The default is 1, meaning a missing datum in the output array only occurs when its contributing input array elements are all missing data. A value of 0 means that a missing datum in the output array occurs whenever any of its contributing input array elements are missing data. Any intermediate value is permitted.
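
The mtol rule can be illustrated for a single output element. This is a minimal sketch, not the cf implementation: the function `collapse_mean` is hypothetical, missing data are represented by `None`, and the input list is assumed to be non-empty.

```python
def collapse_mean(values, mtol=1.0):
    """Mean of the non-missing values, or None if too many are missing."""
    valid = [v for v in values if v is not None]
    missing_fraction = (len(values) - len(valid)) / len(values)
    # Missing output if nothing contributes or the missing fraction exceeds mtol
    if not valid or missing_fraction > mtol:
        return None
    return sum(valid) / len(valid)

default_result = collapse_mean([1.0, None, 3.0])            # mtol=1
strict_result = collapse_mean([1.0, None, 3.0], mtol=0)     # any missing input masks output
half_result = collapse_mean([1.0, None, None, None], mtol=0.5)
all_missing = collapse_mean([None, None])
```
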

ddof : number, optional

The delta degrees of freedom in the calculation of a standard deviation or variance. The number of degrees of freedom used in the calculation is (N-ddof) where N represents the number of elements. By default ddof is 1, meaning the standard deviation of the population is estimated according to the usual formula with (N-1) in the denominator to avoid the bias caused by the use of the sample mean (Bessel’s correction).
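
The role of ddof in the unweighted case can be seen with a small standard-library sketch. The function `sd` is a hypothetical helper that follows the unweighted formula given earlier; it is compared against Python's `statistics` module, which is used here only as an independent check.

```python
import math
import statistics

def sd(values, ddof=1):
    """Unweighted standard deviation with N - ddof degrees of freedom."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / (n - ddof))

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population_sd = sd(sample, ddof=0)   # divides by N
sample_sd = sd(sample)               # divides by N - 1 (Bessel's correction)
```

With ddof=0 the result matches `statistics.pstdev(sample)`, and with the default ddof=1 it matches `statistics.stdev(sample)`.
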

a : optional

Specify the value of \(a\) in the calculation of a weighted standard deviation or variance when ddof is greater than 0. See the notes above for details. a must be either a number or a field of values which is broadcastable to the collapsed field. By default approximate values are calculated.

i : bool, optional

If True then update the field in place. By default a new field is created.

Returns :
out : cf.Field

The collapsed field.
