Taylor diagram

MODEL-DATA climatologies comparison statistics SUB-PROGRAM

Sub-program description:

    This program computes comparison statistics between the output of a list
    of models and observation data. It writes the results to a NetCDF file and,
    if required by the user, also to a plain text ASCII file.

    Both model output and data are assumed to be climatologies: they span exactly
    one year in a number of steps from 1 (annual mean) to 12 (monthly means)
    or more. If there is only one time step, the time axis need not be present.

    Spatially, both are assumed to be mapped onto a rectangular grid,
    not necessarily the same nor evenly spaced. They are assumed to span at least
    a common region of the globe. Comparison will be made over the common region.
    For both, if a vertical dimension is present (depth or level), only the
    surface level is compared.
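
    As an illustration of the common-region rule above, here is a minimal
    sketch of how the shared latitude (or longitude) range of two grids could
    be found. The function names common_range and select_common are
    hypothetical and are not taken from the program; longitude wrap-around and
    any regridding the program performs are not handled here.

```python
import numpy as np


def common_range(axis_a, axis_b):
    """Return the (low, high) bounds shared by two 1D coordinate axes.

    Returns None when the two axes do not overlap at all.
    """
    low = max(np.min(axis_a), np.min(axis_b))
    high = min(np.max(axis_a), np.max(axis_b))
    if low > high:
        return None
    return float(low), float(high)


def select_common(axis, bounds):
    """Return the indices of the axis points that fall inside the bounds."""
    low, high = bounds
    axis = np.asarray(axis)
    return np.nonzero((axis >= low) & (axis <= high))[0]


# Two latitude axes that are neither identical nor evenly matched.
model_lat = np.array([-80.0, -40.0, 0.0, 40.0, 80.0])
obs_lat = np.array([-60.0, -30.0, 0.0, 30.0, 60.0, 90.0])

bounds = common_range(model_lat, obs_lat)
model_idx = select_common(model_lat, bounds)
```

    The same two calls would be applied to the longitude axes; the comparison
    is then restricted to the selected points of each grid.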

    The model list is given as a parameter by the user. Any model whose
    output file cannot be found is ignored.

    Computed comparison statistics are:

        For observation data:
            - Standard deviation
            - Global annual average
        For each model:
            - Standard deviation
            - Correlation between model and data
            - RMS between model and data
            - Global annual average
        For mean and median of all models:
            - same as for each model
        For an optional second data set:
            - same as for each model
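
    The statistics listed above can be sketched with NumPy as follows. This
    is only an illustration under stated assumptions: the function name
    comparison_stats is hypothetical, no area weighting or missing-value
    handling is applied, and since the text does not say whether the
    program's RMS is centered (means removed) or not, both variants are
    computed.

```python
import numpy as np


def comparison_stats(model, data):
    """Unweighted comparison statistics between two same-shaped fields."""
    model = np.asarray(model, dtype=float).ravel()
    data = np.asarray(data, dtype=float).ravel()
    model_anom = model - model.mean()
    data_anom = data - data.mean()
    return {
        "model_std": model.std(),            # population standard deviation
        "data_std": data.std(),
        "correlation": np.corrcoef(model, data)[0, 1],
        "rms": np.sqrt(np.mean((model - data) ** 2)),
        "centered_rms": np.sqrt(np.mean((model_anom - data_anom) ** 2)),
        "model_average": model.mean(),
        "data_average": data.mean(),
    }


stats = comparison_stats([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

# Relation underlying the Taylor diagram:
# centered_rms^2 = model_std^2 + data_std^2 - 2 * model_std * data_std * correlation
taylor_rhs = (stats["model_std"] ** 2 + stats["data_std"] ** 2
              - 2.0 * stats["model_std"] * stats["data_std"] * stats["correlation"])
```

    The centered RMS, the two standard deviations and the correlation are
    linked by the law-of-cosines relation in the comment, which is what
    allows all of them to be shown on a single Taylor diagram.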


     Component list:

     Standard deviations, correlations and RMS are computed
     for the following components:

        - A Total space-time field
        - B Zonal annual mean
        - C Zonal monthly mean
        - D Zonal monthly anomaly: Monthly anomaly to zonal annual mean
        - E Annual mean map
        - F Monthly map anomaly:   Monthly anomaly to annual mean map
        - G Longitudinal anomaly:  Longitudinal anomaly to monthly zonal mean
        - H Annual mean longitudinal anomaly:  Annual mean longitudinal anomaly to zonal mean

     Naturally, when the number of time steps is different from 12, replace 'monthly' in the names
     above with the appropriate term (quarterly, ...). When the number of time steps is 1
     (no time dimension), only components A, B and H are meaningful.

    These components may also be more precisely defined this way:

        where x,y and t are respectively longitude, latitude and time dimensions
              S is the variable
              _x_, _y_ and _t_ represent average over this dimension

        - A:    S( x, y, t)
        - B:    S( _x_, y, _t_)                    average over longitude and time, fct of latitude
        - C:    S( _x_, y, t)                      average over longitude         , fct of latitude and time
        - D:    S( _x_, y, t) - S( _x_, y, _t_)    C - B                          , fct of latitude and time
        - E:    S( x, y, _t_)                      average over time              , fct of longitude & latitude
        - F:    S( x, y, t) - S( x, y, _t_)        A - E                          , fct of 3 dimensions
        - G:    S( x, y, t) - S( _x_, y, t)        A - C                          , fct of 3 dimensions
        - H:    S( x, y, _t_) - S( _x_, y, _t_)    E - B                          , fct of longitude & latitude
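
    The component definitions above translate directly into array
    operations. The sketch below assumes the field is stored as a NumPy
    array with dimensions ordered (time, latitude, longitude) and uses plain
    unweighted averages; the function name components and the dimension
    order are assumptions, and the program itself may weight by grid-cell
    area.

```python
import numpy as np


def components(S):
    """Split a field S(time, lat, lon) into the components A to H."""
    A = S                                # total space-time field
    B = S.mean(axis=(0, 2))              # zonal annual mean, shape (lat,)
    C = S.mean(axis=2)                   # zonal monthly mean, shape (time, lat)
    D = C - B[np.newaxis, :]             # C - B
    E = S.mean(axis=0)                   # annual mean map, shape (lat, lon)
    F = S - E[np.newaxis, :, :]          # A - E
    G = S - C[:, :, np.newaxis]          # A - C
    H = E - B[:, np.newaxis]             # E - B
    return {"A": A, "B": B, "C": C, "D": D, "E": E, "F": F, "G": G, "H": H}


# Synthetic field: 12 time steps, 3 latitudes, 4 longitudes.
field = np.arange(12 * 3 * 4, dtype=float).reshape(12, 3, 4) ** 1.5
parts = components(field)
```

    By construction, each anomaly component averages to zero over the
    dimension that was removed: D and F over time, G and H over longitude.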

    Output file structure:

    The NetCDF output file contains the following 1D variables:

        For observation data:
            - ReferenceStandardDeviation: 1D variable of 7 standard deviations
              for statistical components A to G, in this order.
            - ReferenceAverage: a scalar, the global annual average

        For each model:
            - <model-name>_StandardDeviation:
            - <model-name>_Correlation:
            - <model-name>_RMS:
                1D variables that contain standard deviations, correlations
                with observation data and RMS with observation data, respectively,
                for statistical components A to G, in this order.
            - <model-name>_Average: a scalar, the global annual average

        For mean and median of all models:
            - same as for each model, with the character strings "Mean" and "Median"
              in place of <model-name>

        For an optional second data set:
            - same as for each model, with the character string "Data_2" in place
              of <model-name>
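
    The naming scheme above can be summarised by a small helper that lists
    the variable names to expect in the output file. The helper names and
    the zero-based indexing of components inside each 1D variable are
    assumptions made for illustration; only the variable names themselves
    come from the description above.

```python
# Order of the statistical components inside each 1D variable, as documented.
COMPONENT_ORDER = "ABCDEFG"

SUFFIXES = ("StandardDeviation", "Correlation", "RMS", "Average")


def output_variable_names(name):
    """Names of the output variables written for one model or composite."""
    return [f"{name}_{suffix}" for suffix in SUFFIXES]


def expected_variables(model_names, with_data2=False):
    """All variable names expected in the statistics output file."""
    names = ["ReferenceStandardDeviation", "ReferenceAverage"]
    for name in list(model_names) + ["Mean", "Median"]:
        names.extend(output_variable_names(name))
    if with_data2:
        names.extend(output_variable_names("Data_2"))
    return names


def component_index(letter):
    """Zero-based position of a component letter in a 1D statistics variable."""
    return COMPONENT_ORDER.index(letter)


variables = expected_variables(["Model_one", "Model_two"], with_data2=True)
```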



Sub-program usage:

Usage: model_vs_data_comp_stat.py [options] <obs-file-name> <obs_var> <model-file-templ> <model-var-templ>

        obs-file-name:      NetCDF file name of observation data
        obs_var:            variable id in this file
        model-file-templ:   NetCDF file template name for model output
        model-var-templ:    variable template name in model output files

        A file or variable template name is a file name or variable name
        in which each occurrence of the model name is replaced with
        one of the following strings:
            %(MODEL) or
            %(model)

    Example of model file template:

        /home/geocean/ocmip/phase2/%(MODEL)/Abiotic/hist/%(model)_TAKA_Ftot_1995_drift.nc

    Example of model variable template:

        SST_%(model)    : variable id of Sea Surface Temperature for each model
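
    Template expansion, together with the rule that models without an output
    file are ignored, could look like the sketch below. It assumes that both
    %(MODEL) and %(model) are replaced verbatim by the model name as given
    in the model list; whether the program treats the two placeholders
    differently (for example by changing case) is not stated here. The
    function names are hypothetical.

```python
import os


def expand_template(template, model_name):
    """Replace both placeholders in a template with the model name."""
    return (template.replace("%(MODEL)", model_name)
                    .replace("%(model)", model_name))


def existing_models(model_names, file_template):
    """Keep only the models whose expanded output file exists on disk."""
    return [name for name in model_names
            if os.path.exists(expand_template(file_template, name))]


file_template = ("/home/geocean/ocmip/phase2/%(MODEL)/Abiotic/hist/"
                 "%(model)_TAKA_Ftot_1995_drift.nc")
model_file = expand_template(file_template, "Model_one")
model_var = expand_template("SST_%(model)", "Model_one")
```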

    Options are:

     --help
     -h              Print out this help.

     --models=<model-list>
     -m <model-list> Comma-separated list of all models/composite variables for which
                     to compute comparison statistics. Composite variables may be 'Median'
                     or 'Mean', for example. The model list can also be specified in a
                     parameter file named 'model_dictionary.py' (see below).

                     The names appearing in the model list are used as keys to access
                     model output files (see model file template, above).

                     These names are also used as base names for variables in the statistics
                     output file: if <name> is a model name in the list, <name>_Correlation,
                     <name>_StandardDeviation and <name>_RMS are 1D variables in the output file.

     --output=<filename>
     -o <filename>   NetCDF file name for storing the statistics results;
                     default is "model_vs_data_comp_stats.nc" in the current working directory.

     --text=<filename>
     -t <filename>   Optional plain text file name for storing the statistics results.

     --data2=<filename>,<varname>
     -2 <filename>,<varname>
                     NetCDF file name and variable name (id) for the optional second data set.
                     If <varname> is omitted, it defaults to the variable id of the first data set.

     --normalize-sign
     -n              Change the sign of model output before comparing to reference data,
                     so that it agrees with that of the reference.

     --scale=<factor>
     -s <factor>     Apply a scaling factor to model output before comparing to reference data:
                     multiply model output by <factor>.
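
    For readers who want to reproduce this command-line interface, the
    options above can be declared with Python's standard argparse module as
    sketched below. This is not the program's own parsing code, which may
    well use a different mechanism; the helper names build_parser,
    comma_list and split_data2 are hypothetical.

```python
import argparse


def comma_list(value):
    """Turn 'a,b,c' into ['a', 'b', 'c']."""
    return value.split(",")


def split_data2(value, default_var):
    """Split '<filename>,<varname>'; fall back to the first data set's variable."""
    filename, _, varname = value.partition(",")
    return filename, varname or default_var


def build_parser():
    parser = argparse.ArgumentParser(prog="model_vs_data_comp_stat.py")
    parser.add_argument("obs_file_name")
    parser.add_argument("obs_var")
    parser.add_argument("model_file_templ")
    parser.add_argument("model_var_templ")
    parser.add_argument("-m", "--models", type=comma_list, default=[])
    parser.add_argument("-o", "--output", default="model_vs_data_comp_stats.nc")
    parser.add_argument("-t", "--text")
    parser.add_argument("-2", "--data2")
    parser.add_argument("-n", "--normalize-sign", action="store_true")
    parser.add_argument("-s", "--scale", type=float, default=1.0)
    return parser


args = build_parser().parse_args([
    "--models=Model_one,Model_two", "-n", "--scale=0.5",
    "--data2=obs2.nc",
    "obs.nc", "SST", "%(model).nc", "SST_%(model)",
])
data2_file, data2_var = split_data2(args.data2, args.obs_var)
```

    Because -2 is registered as an option, a negative scaling factor is best
    passed in the --scale=<factor> form so that it is not mistaken for an
    option.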

    Other parameter file:

    File 'model_dictionary.py', if it exists in the working directory, is a Python script
         that lists all models and composite variables used for comparison.
         It also describes, for each model/composite variable, the letter code and the color
         used to represent this model/variable on the Taylor diagram.

    File structure:
         It defines two Python variables named 'model_dictionary' (mandatory, for models)
         and 'other_dictionary' (optional, for additional variables).

         These variables are Python dictionaries with,
         - as keys: model or other variable names
         - as values: inner two-item dictionaries defining
                      code-letter and color for each model/variable

         Code-letter is a string of one or more letter(s), typically 1 or 2.
         Color is a number ranging from 241 to 255. It is an index
         into the default color palette.

    Example:
         Suppose we deal with two models, 'Model_one' and 'Model_two'.
         The 'model_dictionary' variable is then defined as follows:

            model_dictionary = {
                "Model_one":      { "code_letter": "A",  "color": 241 },
                "Model_two":      { "code_letter": "2",  "color": 242 },
                }
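
    A parameter file of this form can be loaded and checked with a few lines
    of Python. The sketch below is an illustration only: the function names
    load_dictionaries and validate_dictionary are hypothetical, and the
    checks simply restate the constraints given above (a non-empty
    code-letter string and a color index from 241 to 255).

```python
import runpy


def load_dictionaries(path="model_dictionary.py"):
    """Run the parameter file and return (model_dictionary, other_dictionary)."""
    namespace = runpy.run_path(path)
    models = namespace["model_dictionary"]            # mandatory
    others = namespace.get("other_dictionary", {})    # optional
    return models, others


def validate_dictionary(dictionary):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, entry in dictionary.items():
        letter = entry.get("code_letter")
        color = entry.get("color")
        if not isinstance(letter, str) or not letter:
            problems.append(f"{name}: code_letter must be a non-empty string")
        if not isinstance(color, int) or not 241 <= color <= 255:
            problems.append(f"{name}: color must be an integer from 241 to 255")
    return problems


example = {
    "Model_one": {"code_letter": "A", "color": 241},
    "Model_two": {"code_letter": "2", "color": 242},
}
problems = validate_dictionary(example)
```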