Problems discovered while processing the climate model data (CMIP5 and CMIP5-CORDEX)

Climate model data is not perfect, and quality control sometimes reveals deviations in the data. Analysts should be aware of these deviations when using the data. This page describes some issues that were discovered while processing the climate model data. Some of the problems were handled with different methods to reduce the discrepancies.

Composition of reference data set

The reference data set HydroGFD2.0 (Berg et al., 2018) covers land areas at 0.5 degree resolution. Its geographical extent is conservative: mainly grid boxes with full land coverage are included, so many grid boxes along coastlines that are only partially covered by land are left out. The World-Wide HYPE model (Arheimer et al., 2020) needs input data for all land areas. Therefore, a composite data set was created by adding ERA-Interim data to the HydroGFD2.0 data so that all land areas, including coasts, are covered. Since the reference data thus comes from two different sources, weather patterns in adjacent grid boxes along the coastline might not be consistent.
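
The compositing step described above can be illustrated with a minimal sketch. It assumes that both data sets are already on the same grid and that grid boxes missing from HydroGFD2.0 are marked as NaN. The function name `composite_reference`, the land mask and the toy values are hypothetical and are not taken from the actual processing chain.

```python
import math

def composite_reference(primary, fallback, land_mask):
    """Use `primary` where it has data and `fallback` elsewhere on land.

    All three arguments are 2-D lists with the same shape (assumption).
    Non-land grid boxes are returned as NaN.
    """
    result = []
    for p_row, f_row, m_row in zip(primary, fallback, land_mask):
        row = []
        for p, f, is_land in zip(p_row, f_row, m_row):
            if not is_land:
                row.append(math.nan)   # ocean: no reference data needed
            elif math.isnan(p):
                row.append(f)          # land box missing in primary: fill from fallback
            else:
                row.append(p)          # land box covered by primary: keep it
        result.append(row)
    return result

nan = math.nan
# Toy 2x2 temperature grids (hypothetical values, degrees Celsius).
hydrogfd = [[12.0, nan], [nan, nan]]
era_interim = [[11.5, 13.0], [14.0, 15.0]]
land_mask = [[True, True], [True, False]]

composite = composite_reference(hydrogfd, era_interim, land_mask)
```

In this toy grid the two boxes in the first row end up coming from different sources, which is the situation in which adjacent coastal boxes may show inconsistent weather patterns.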

Heat islands

When bias adjustment was performed, “heat islands” with extreme temperature values were discovered. This occurred in several CORDEX domains but was most common in the North America domain. The problem arises from the combination of climate model data and bias adjustment method. In the affected regions, the model data had a strongly underestimated standard deviation compared to the reference data for daily mean (tas), daily minimum (tasmin) and daily maximum (tasmax) temperature. This leads to large scaling factors in the bias adjustment, which amplify any deviation of tas, tasmin and tasmax from their mean values in the reference period. Since some values became unrealistic, the correction of the variability was restricted to a limited range. This removed most of the extreme values, with the drawback that the variability in the reference period could not be corrected completely. The problem is not completely solved, however: extreme temperatures still occur in the bias-adjusted data.
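
The mechanism can be illustrated with a simple mean-and-variance scaling. This is a stand-in chosen for clarity and is not necessarily the bias adjustment method used for the data. The function names, the limit `max_factor=2.0` and all numbers are hypothetical.

```python
import statistics

def scaling_factor(model_ref, obs_ref, max_factor=None):
    """Ratio of reference to model standard deviation, optionally limited."""
    factor = statistics.pstdev(obs_ref) / statistics.pstdev(model_ref)
    if max_factor is not None:
        # Restrict the variability correction to [1/max_factor, max_factor].
        factor = min(max(factor, 1.0 / max_factor), max_factor)
    return factor

def adjust(value, model_ref, obs_ref, max_factor=None):
    """Shift to the reference mean and scale the deviation from the model mean."""
    factor = scaling_factor(model_ref, obs_ref, max_factor)
    deviation = value - statistics.fmean(model_ref)
    return statistics.fmean(obs_ref) + factor * deviation

# Hypothetical reference-period temperatures (degrees Celsius):
# the model varies far less than the reference data.
model_ref = [14.0, 15.0, 16.0]
obs_ref = [5.0, 15.0, 25.0]

future_value = 18.0  # 3 degrees above the model's reference mean
unrestricted = adjust(future_value, model_ref, obs_ref)
restricted = adjust(future_value, model_ref, obs_ref, max_factor=2.0)
print(round(unrestricted, 1), round(restricted, 1))
```

With these toy numbers the unrestricted scaling factor is about 10, so a 3 degree deviation is inflated to roughly 30 degrees. Limiting the factor keeps the adjusted value plausible, at the cost of leaving part of the variability bias uncorrected.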

Incomplete time series

Some climate models run only to the year 2098 or 2099. Since the period 2071-2100 is used for statistics, data up to the end of 2100 is needed. Scenarios with incomplete time series were extended by copying the last available year of data and appending it to the end of the time series until the full period was covered.
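
The padding can be sketched as follows, assuming the data is held as a mapping from year to a list of daily values. The function name `pad_with_last_year` and the toy data are hypothetical, and details such as calendar or leap-day handling are not covered here.

```python
def pad_with_last_year(series_by_year, end_year):
    """Repeat the last available year until `end_year` is covered."""
    padded = {year: list(values) for year, values in series_by_year.items()}
    last_year = max(padded)
    for year in range(last_year + 1, end_year + 1):
        # Copy the last available year and append it as the next year.
        padded[year] = list(padded[last_year])
    return padded

# Toy series ending in 2098 (two values per year instead of daily data).
series = {2097: [1.0, 2.0], 2098: [3.0, 4.0]}
padded = pad_with_last_year(series, end_year=2100)
```

Here both 2099 and 2100 receive a copy of the 2098 values, so statistics for 2071-2100 can be computed over a full period.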

Discrepancies in reference data

In the regions Bengkulu and Sumatera Barat in Indonesia, discrepancies were found in the HydroGFD2.0 reference data. The standard deviation appears to be unrealistically high, which may originate from poor observation data in the area. This affects all bias-adjusted climate data in the area and results in future temperatures that are too high. We therefore recommend caution when using bias-adjusted data from this area.
