This program computes comparison statistics between a list of model outputs
and observation data. It writes the results to a NetCDF file and, if the user
requests it, also to a plain-text ASCII file.
Both model output and data are assumed to be climatologies: they span exactly
one year in a number of time steps, from 1 (annual mean) to 12 (monthly means)
or more. If there is only one time step, the time axis need not be present.
Spatially, both are assumed to be mapped onto rectangular grids, which need
not be identical or evenly spaced. They are assumed to span at least a common
region of the globe, and the comparison is made over that common region.
For both, if a vertical dimension is present (depth or level), only the
surface level is compared.
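The notion of a common region can be illustrated with a minimal sketch. It only
finds the overlapping coordinate range of two axes; the function names are
hypothetical, longitude wrap-around is ignored, and how the program itself
selects or regrids the common region is not described here.

```python
def common_range(axis_a, axis_b):
    """Return (low, high) bounds shared by two coordinate axes, or None.

    Assumption: each axis is a plain sequence of coordinate values
    (for example latitudes in degrees); no wrap-around is handled.
    """
    low = max(min(axis_a), min(axis_b))
    high = min(max(axis_a), max(axis_b))
    return (low, high) if low <= high else None


def indices_within(axis, low, high):
    """Indices of the axis points that fall inside [low, high]."""
    return [i for i, value in enumerate(axis) if low <= value <= high]


model_lat = [-90, -45, 0, 45, 90]
obs_lat = [-60, -30, 0, 30, 75]
bounds = common_range(model_lat, obs_lat)
```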
The model list is given as a parameter by the user. Any model whose output
file cannot be found is ignored.
Computed comparison statistics are:
For observation data:
  - Standard deviation
  - Global annual average
For each model:
  - Standard deviation
  - Correlation between model and data
  - RMS between model and data
  - Global annual average
For mean and median of all models:
  - same as for each model
For an optional second data set:
  - same as for each model
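As an illustration of the per-model statistics, here is a minimal sketch in
plain Python. It rests on assumptions that this text does not confirm: values
are unweighted, standard deviations use the population form (divide by N), and
"RMS" is taken as the root mean square of the differences model - data. The
function name comparison_stats is hypothetical.

```python
import math


def comparison_stats(model, obs):
    """Return (std_model, std_obs, correlation, rms) for two sequences.

    Assumptions: unweighted values, population standard deviation,
    RMS computed on the raw differences model - obs.
    """
    n = len(model)
    if n == 0 or n != len(obs):
        raise ValueError("model and obs must be non-empty and of equal length")
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    var_m = sum((m - mean_m) ** 2 for m in model) / n
    var_o = sum((o - mean_o) ** 2 for o in obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs)) / n
    std_m, std_o = math.sqrt(var_m), math.sqrt(var_o)
    # The correlation is undefined when either field is constant.
    corr = cov / (std_m * std_o) if std_m > 0 and std_o > 0 else float("nan")
    rms = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n)
    return std_m, std_o, corr, rms
```

A real implementation on an unevenly spaced grid would presumably weight each
grid cell, which this sketch leaves out.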
Component list:
Standard deviations, correlations and RMS are computed for the following
components:
  - A  Total space-time field
  - B  Zonal annual mean
  - C  Zonal monthly mean
  - D  Zonal monthly anomaly: monthly anomaly to zonal annual mean
  - E  Annual mean map
  - F  Monthly map anomaly: monthly anomaly to annual mean map
  - G  Longitudinal anomaly: longitudinal anomaly to monthly zonal mean
  - H  Annual mean longitudinal anomaly: annual mean longitudinal anomaly
       to zonal mean
Naturally, when the number of time steps differs from 12, replace 'monthly' in
the names above with the appropriate term (quarterly, ...). When there is only
one time step (no time dimension), only components A, B and H are meaningful.
These components may also be defined more precisely as follows, where:
  x, y and t are the longitude, latitude and time dimensions, respectively
  S is the variable
  _x_, _y_ and _t_ denote an average over that dimension

  - A: S(x, y, t)
  - B: S(_x_, y, _t_)
       average over longitude and time, function of latitude
  - C: S(_x_, y, t)
       average over longitude, function of latitude and time
  - D: S(_x_, y, t) - S(_x_, y, _t_)
       C - B, function of latitude and time
  - E: S(x, y, _t_)
       average over time, function of longitude and latitude
  - F: S(x, y, t) - S(x, y, _t_)
       A - E, function of 3 dimensions
  - G: S(x, y, t) - S(_x_, y, t)
       A - C, function of 3 dimensions
  - H: S(x, y, _t_) - S(_x_, y, _t_)
       E - B, function of longitude and latitude
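The definitions above can be sketched with NumPy as follows. This is only an
illustration under stated assumptions: the field is an array with axes ordered
(t, y, x), averages are plain unweighted means, and the function name
decompose is hypothetical.

```python
import numpy as np


def decompose(S):
    """Split a field S with axes (t, y, x) into components A to H.

    Assumption: unweighted means. keepdims=True keeps each averaged
    axis with length 1 so the differences broadcast correctly.
    """
    S = np.asarray(S, dtype=float)
    A = S                                  # S(x, y, t)
    C = S.mean(axis=2, keepdims=True)      # S(_x_, y, t)
    B = C.mean(axis=0, keepdims=True)      # S(_x_, y, _t_)
    E = S.mean(axis=0, keepdims=True)      # S(x, y, _t_)
    return {
        "A": A,
        "B": B,
        "C": C,
        "D": C - B,   # zonal anomaly in time
        "E": E,
        "F": A - E,   # anomaly to the annual mean map
        "G": A - C,   # anomaly to the zonal mean at each time step
        "H": E - B,   # annual mean anomaly to the zonal mean
    }
```

By construction G averages to zero along longitude and D averages to zero
along time, which is a convenient sanity check on any implementation.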
Output file structure:
The NetCDF output file contains the following 1D variables:
For observation data:
  - ReferenceStandardDeviation: 1D variable of 7 standard deviations
    for statistical components A to G, in this order.
  - ReferenceAverage: a scalar, the global annual average
For each model:
  - <model-name>_StandardDeviation
  - <model-name>_Correlation
  - <model-name>_RMS
    1D variables that contain standard deviations, correlations with
    observation data and RMS with observation data, respectively,
    for statistical components A to G, in this order.
  - <model-name>_Average: a scalar, the global annual average
For mean and median of all models:
  - same as for each model, with the character strings "Mean" and "Median"
    in place of <model-name>
For an optional second data set:
  - same as for each model, with the character string "Data_2" in place
    of <model-name>
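To make the naming scheme concrete, the sketch below lists the variable names
one would expect in the output file for a given set of names. The function
name and the ordering of the returned list are illustrative assumptions; only
the variable names themselves come from the description above.

```python
def output_variable_names(names, with_data2=False):
    """List expected output variable names.

    `names` holds model names and, if wanted, "Mean" and "Median".
    The ordering of the result is arbitrary (an assumption).
    """
    result = ["ReferenceStandardDeviation", "ReferenceAverage"]
    prefixes = list(names)
    if with_data2:
        prefixes.append("Data_2")
    for prefix in prefixes:
        for suffix in ("StandardDeviation", "Correlation", "RMS", "Average"):
            result.append(f"{prefix}_{suffix}")
    return result


expected = output_variable_names(["Model_one", "Mean"], with_data2=True)
```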
Usage: model_vs_data_comp_stat.py [options] <obs-file-name> <obs_var>
                                  <model-file-templ> <model-var-templ>

  obs-file-name:    NetCDF file name of observation data
  obs_var:          variable id in this file
  model-file-templ: NetCDF file template name for model output
  model-var-templ:  variable template name in model output files
A file or variable template name is a file name (respectively, a variable
name) in which each occurrence of the model name is replaced with one of
the following strings:
  %(MODEL) or %(model)
Example of model file template:
  /home/geocean/ocmip/phase2/%(MODEL)/Abiotic/hist/%(model)_TAKA_Ftot_1995_drift.nc
Example of model variable template:
  SST_%(model) : variable id of Sea Surface Temperature for each model
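A minimal sketch of how such a template might be expanded is given here. It
assumes that both placeholders receive the model name exactly as given; whether
%(MODEL) and %(model) differ (for instance in letter case) is not specified in
this text. The helper name expand_template is hypothetical.

```python
def expand_template(template, model_name):
    """Replace the %(MODEL) and %(model) placeholders with a model name.

    Assumption: both placeholders get the name unchanged. These are not
    Python %-format fields (no trailing 's'), so plain string
    replacement is used rather than the % operator.
    """
    return template.replace("%(MODEL)", model_name).replace("%(model)", model_name)


file_template = (
    "/home/geocean/ocmip/phase2/%(MODEL)/Abiotic/hist/"
    "%(model)_TAKA_Ftot_1995_drift.nc"
)
model_file = expand_template(file_template, "Model_one")
model_var = expand_template("SST_%(model)", "Model_one")
```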
Options are:
  --help
  -h
      Print this help.
  --models=<model-list>
  -m <model-list>
      Comma-separated list of all models/composite-vars for which to compute
      comparison statistics. Composite variables may be 'Median' or 'Mean',
      for example. Another way to specify the model list is to use a
      parameter file named 'model_dictionary.py' (see below).
      The names appearing in the model list are used as keys to access
      model output files (see model file template, above).
      These names are also used as base names for variables in the statistics
      output file: if <name> is a model name in the list, <name>_Correlation,
      <name>_StandardDeviation and <name>_RMS are 1D variables in the
      output file.
  --output=<filename>
  -o <filename>
      NetCDF file name for storing statistics results;
      default is "model_vs_data_comp_stats.nc" in the current working
      directory.
  --text=<filename>
  -t <filename>
      Optional plain-text file name for storing statistics results.
  --data2=<filename>,<varname>
  -2 <filename>,<varname>
      NetCDF file name and variable name (id) for the optional second
      data set. If <varname> is omitted, it defaults to the variable id
      of the first data set.
  --normalize-sign
  -n
      Change the sign of model output before comparing it to reference data,
      so that it agrees with the sign of the reference.
  --scale=<factor>
  -s <factor>
      Apply a scaling factor to model output before comparing it to
      reference data: multiply model output by <factor>.
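The options above could be parsed along the following lines with the standard
getopt module. This is a sketch, not the program's actual parser: the function
name parse_args, the settings keys and the default scale of 1.0 (a no-op
multiplication) are assumptions made for illustration.

```python
import getopt


def parse_args(argv):
    """Parse the documented options; return (settings, positional args)."""
    opts, positional = getopt.getopt(
        argv,
        "hm:o:t:2:ns:",
        ["help", "models=", "output=", "text=", "data2=",
         "normalize-sign", "scale="],
    )
    settings = {
        "help": False,
        "models": [],
        "output": "model_vs_data_comp_stats.nc",  # documented default
        "text": None,
        "data2": None,
        "normalize_sign": False,
        "scale": 1.0,  # assumed default: leaves model output unchanged
    }
    for flag, value in opts:
        if flag in ("-h", "--help"):
            settings["help"] = True
        elif flag in ("-m", "--models"):
            settings["models"] = [name for name in value.split(",") if name]
        elif flag in ("-o", "--output"):
            settings["output"] = value
        elif flag in ("-t", "--text"):
            settings["text"] = value
        elif flag in ("-2", "--data2"):
            filename, _, varname = value.partition(",")
            # None means: fall back to the variable id of the first data set.
            settings["data2"] = (filename, varname or None)
        elif flag in ("-n", "--normalize-sign"):
            settings["normalize_sign"] = True
        elif flag in ("-s", "--scale"):
            settings["scale"] = float(value)
    return settings, positional
```

getopt stops at the first non-option argument, which matches the usage line
where options precede the four positional arguments.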
Other parameter file:
The file 'model_dictionary.py', if it exists in the working directory, is a
Python script that lists all models and composite vars used for comparison.
It also describes, for each model/composite-var, the letter code and the
color used to represent this model/var on the Taylor diagram.
File structure:
It defines two Python variables named:
  'model_dictionary' (mandatory, for models)
  and
  'other_dictionary' (optional, for additional variables).
These variables are Python dictionaries with,
  - as keys: model or other variable names
  - as values: inner two-item dictionaries defining the code letter and
    color for each model/variable
The code letter is a string of one or more letters, typically 1 or 2.
The color is a number ranging from 241 to 255. It is an index into the
default color palette.
Example:
Let's say we deal with two models, 'Model_one' and 'Model_two'.
The 'model_dictionary' variable will be defined as follows:
model_dictionary = {
    "Model_one": {"code_letter": "A", "color": 241},
    "Model_two": {"code_letter": "2", "color": 242},
}