SST data: HadSST4
HadSST4 provides monthly SST anomalies on a 5°x5° grid for 1850-present. The anomalies are calculated relative to a 30-year climatology spanning 1961-90. Coverage is nominally global, but no interpolation is applied; thus, grid boxes without observations are missing in the final product. This means that uncertainties due to limited spatial coverage and systematic errors are relatively easy to identify compared with interpolated SST products; it also minimizes the loss of variance in SST anomalies that occurs in interpolated products. The primary input data to HadSST4 are from ICOADS release 3.0. Bias adjustments to the ICOADS SSTs account for changes in measurement methods (e.g. engine room intake, bucket measurements, or buoy data). These adjustments are carried forward to the present. Uncertainties in these adjustments have complex spatial and temporal dependencies and are represented by an ensemble of 200 realizations of the data set, which allows the spatial and temporal characteristics of the uncertainties and their effects on climate signals to be explored. In addition to the ensemble, a median estimate is provided. Additional uncertainty arises from local measurement errors and local sampling errors; these are represented as gridded fields. Correlated measurement errors are represented in error-covariance matrices. Unadjusted raw data are provided but their use for climate studies is discouraged.
Derivative products: HadSST4 is used in the HadCRUT5 global combined land-ocean surface temperature data set. It has also been used in the Cowtan & Way global combined land-ocean surface temperature data set and will likely also be used in a forthcoming version of the Berkeley Earth global land-ocean data set.
Long record, extending back to 1850 and up to present with regular updates
Quantification of systematic uncertainties (due to measurement methods) in space and time presented as a 200-member ensemble; bias adjustments extend up to the present. For ~2000 to present, HadSST4 is in good agreement with trends in ERSSTv4/v5 and with data from satellites, buoys, and floats
No interpolation so anomalies in a grid cell can be traced to original observations. Also, no loss of variance that often arises from statistical reconstruction and interpolation methods.
Coarse resolution; features like western boundary currents and eddies not resolved
Substantial coverage gaps exist, especially early in the record
Uncertainty estimates can be difficult to use
Kennedy, J. J., Rayner, N. A., Atkinson, C. P., & Killick, R. E. (2019). An ensemble data set of sea‐surface temperature change from 1850: the Met Office Hadley Centre HadSST.4.0.0.0 data set. Journal of Geophysical Research: Atmospheres, 124. https://doi.org/10.1029/2018JD029867
Expert Developer Guidance
The following was contributed by Dr. John Kennedy, in September, 2019:
HadSST.4.0.0.0 (the Hadley Centre Sea-Surface Temperature data set version 4, Kennedy et al. 2019) is a gridded data set of sea-surface temperature anomalies (that is, temperature difference from the 1961-1990 average) from January 1850 to December 2018. The monthly grids have a resolution of 5° latitude by 5° longitude. HadSST.4.0.0.0 is based on quality-controlled in situ measurements of sea-surface temperature from the International Comprehensive Ocean Atmosphere Data Set (ICOADS) release 3.0. It is updated using ship and moored buoy data from ICOADS release 3.0.1 and drifting buoy data provided by the Copernicus Marine Environment Monitoring Service. In situ measurements are those made at the surface by ships, drifting buoys and moored buoys. Ship measurements are made using a variety of methods and adjustments have been applied to the data to minimise the impact of artificial variability caused by changes in instrumentation. Uncertainties associated with bias adjustments, measurement errors and sampling error have been estimated and are provided with the data. Grid-box SST anomalies are only estimated in those grid boxes that contain observations. Consequently, the data set is not globally complete.
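To make the notion of an anomaly relative to the 1961-1990 average concrete, here is a minimal sketch on a synthetic monthly series for a single location. The function name `monthly_anomalies` and the per-calendar-month climatology are illustrative assumptions; the text above does not describe how the actual climatology is constructed.

```python
import numpy as np

def monthly_anomalies(sst, years, months, base=(1961, 1990)):
    """Subtract a calendar-month climatology computed over the base period.

    Hypothetical helper for illustration only; not the HadSST4 procedure.
    """
    sst = np.asarray(sst, dtype=float)
    in_base = (years >= base[0]) & (years <= base[1])
    anomalies = np.full_like(sst, np.nan)
    for m in range(1, 13):
        sel = months == m
        # Mean of this calendar month over the base period, ignoring gaps
        climatology = np.nanmean(sst[sel & in_base])
        anomalies[sel] = sst[sel] - climatology
    return anomalies

# Synthetic series: seasonal cycle plus a slow warming trend
years = np.repeat(np.arange(1950, 2001), 12)
months = np.tile(np.arange(1, 13), 51)
sst = 15.0 + 3.0 * np.cos(2 * np.pi * (months - 1) / 12) + 0.01 * (years - 1950)

anom = monthly_anomalies(sst, years, months)
print(anom[:3], anom[-3:])
```

Because the seasonal cycle is removed month by month, the anomalies in this toy example retain only the slow trend, and they average to zero over the base period by construction.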
What are the key strengths of this data set?
- The data set starts in 1850 and is therefore relatively long compared to most instrumental data sets.
- It is based on the latest release of the International Comprehensive Ocean Atmosphere Data Set, ICOADS release 3.0.0.
- SSTs are only estimated for a grid box when there are observations in that grid box. Therefore, every data point is based directly on observed values and uncertainties are easier to quantify.
- The grid-box average is estimated using simple resistant statistics that are less sensitive to outliers in the input data.
- Adjustments have been applied to the data set from 1850 to present to minimise the effect of changes in measurement method. Residual errors in the adjustments are included in the uncertainty estimation.
- Detailed uncertainty information is included, and a product user guide describes how to use it, with example processing code in Python.
- Uncertainties associated with the bias adjustments are presented as an ensemble of 200 data sets, provided as a set of NetCDF files in a consistent format.
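The resistant grid-box average mentioned in the list above can be illustrated with a Winsorised mean, which clamps the most extreme values before averaging. This is a sketch of one resistant statistic only: the list does not say which statistic is used, and the function name `winsorized_mean` and the clamping fraction are assumptions.

```python
import numpy as np

def winsorized_mean(values, fraction=0.25):
    """Mean after clamping the lowest and highest `fraction` of the values.

    Illustrative only; the fraction is an assumed parameter.
    """
    if not 0.0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    if n == 0:
        return float("nan")
    k = int(np.floor(fraction * n))  # number of values clamped at each end
    if k > 0:
        x[:k] = x[k]              # raise the k smallest to the next value up
        x[n - k:] = x[n - k - 1]  # lower the k largest to the next value down
    return float(x.mean())

# Five anomalies in one grid box; the last is a gross outlier
obs = [0.1, 0.2, 0.15, 0.3, 8.0]
print("plain mean:     ", np.mean(obs))
print("winsorized mean:", winsorized_mean(obs))
```

The plain mean is dominated by the single outlier, whereas the clamped version stays close to the bulk of the observations.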
What are the key limitations of this data set?
- The resolution of the data set is limited. Monthly fields are presented on a 5° latitude by 5° longitude grid.
- There are gaps in the data, particularly: early in the record; at high latitude; and during the two world wars. In general, coverage of the northern hemisphere is better than that in the southern hemisphere.
- No smoothing or interpolation is applied beyond aggregating observations onto a regular grid. Consequently, the data are noisier than more heavily processed data sets. However, the uncertainty estimates can be used to understand the relative reliability of the grid box values.
- The uncertainties associated with systematic errors from individual ships lead to large scale correlations in the errors. This uncertainty component is likely under-estimated because many observations lack appropriate metadata (ship callsigns).
- There are limitations to the uncertainty analysis. Therefore, it is best to use HadSST4 in combination with at least one other long term analysis such as ERSSTv5 or COBE-SST-2 to assess the structural uncertainty.
- Some users find the ensemble of 200 data sets unwieldy. Therefore a ‘median’ estimate is also provided.
What are the typical research applications of this data set?
- HadSST.4.0.0.0 is intended for use in studies looking at long-term changes in sea-surface temperature, or where detailed uncertainty information is required.
- Climate monitoring
- Detection and attribution
- As an input for globally infilled datasets.
What are the most common mistakes that users encounter when processing or interpreting these data?
- A common mistake is not to use all the uncertainty information provided. A product user guide shows how to process and combine the different uncertainty components to get an estimate of the total uncertainty.
- Because there are gaps in the data, where historically there were no observations, area-averages calculated from the data can suffer from coverage biases. When comparing data sets or comparing HadSST.4.0.0.0 to model data, it is a good idea to ensure that all data sets are reduced to their common coverage.
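Reducing two fields to their common coverage before area-averaging, as recommended above, can be sketched as follows. The arrays are synthetic, the helper names are hypothetical, and the cosine-of-latitude weighting is a common approximation for grid-box area rather than something the guide prescribes.

```python
import numpy as np

def common_coverage(a, b):
    """Set both fields to NaN wherever either one is missing."""
    both = ~np.isnan(a) & ~np.isnan(b)
    return np.where(both, a, np.nan), np.where(both, b, np.nan)

def area_weighted_mean(field, lats):
    """Cosine-latitude weighted mean of a (lat, lon) field, ignoring NaNs."""
    weights = np.cos(np.deg2rad(lats))[:, np.newaxis] * np.ones_like(field)
    valid = ~np.isnan(field)
    return np.sum(field[valid] * weights[valid]) / np.sum(weights[valid])

# 5-degree grid-box centres: 36 latitudes by 72 longitudes
lats = np.arange(-87.5, 90.0, 5.0)
lons = np.arange(-177.5, 180.0, 5.0)

rng = np.random.default_rng(0)
obs = rng.normal(0.0, 0.5, size=(lats.size, lons.size))
obs[:6, :] = np.nan  # pretend the southernmost boxes have no observations
model = rng.normal(0.0, 0.5, size=obs.shape)  # globally complete field

obs_c, model_c = common_coverage(obs, model)
print("obs mean:  ", area_weighted_mean(obs_c, lats))
print("model mean:", area_weighted_mean(model_c, lats))
```

For a time series, the same masking would be applied month by month, since the observed coverage changes over time.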
What are some comparable data sets, if any?
- There are many SST data sets that cover the satellite period from 1980 to the present. Data sets that are of comparable length to HadSST.4.0.0.0 are:
- ERSST NOAA Extended Reconstruction SST version 5, 1854-present (Huang et al. 2017)
- COBE-SST-2 Centennial Observation Based Estimates of Sea Surface Temperature, 1870-2014 (Hirahara et al. 2013)
- HadISST, 1870-present (Rayner et al. 2003).
- ICOADS summaries (Freeman et al. 2017), though note that these are not bias adjusted.
- Kaplan et al. (1997)
How is uncertainty characterized in these data?
- Uncertainty arising from three sources has been quantified: sampling error, measurement error and residual data biases.
- Uncertainties associated with sampling and uncorrelated measurement errors are provided as simple gridded fields of uncertainty, in a format consistent with that of the SST anomaly fields.
- Uncertainties associated with correlated measurement errors are provided as error-covariances, one per month, which can be used to propagate the uncertainties appropriately.
- Residual data bias errors are expressed using a 200-member ensemble. Each ensemble member has a different set of bias adjustments applied to it. Together these provide information about the uncertainty in the adjustments.
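The components listed above could be combined into a total uncertainty on an area average roughly as in the sketch below. It assumes the gridded sampling and uncorrelated measurement uncertainties are standard deviations that are independent between grid boxes, that the covariance matrix applies to the same grid boxes, and that the ensemble spread represents the bias-adjustment uncertainty. All names and numbers are hypothetical; the product user guide is the authoritative description of the procedure.

```python
import numpy as np

def area_average_uncertainty(weights, sampling_unc, uncorr_unc, corr_cov, ensemble):
    """Standard uncertainty of a weighted average over n grid boxes.

    weights      : (n,) area weights (normalised internally)
    sampling_unc : (n,) sampling uncertainty (standard deviation)
    uncorr_unc   : (n,) uncorrelated measurement uncertainty
    corr_cov     : (n, n) covariance of correlated measurement errors
    ensemble     : (m, n) anomalies from m ensemble members
    """
    w = weights / weights.sum()
    # Independent terms add in quadrature, scaled by the squared weights
    var_uncorrelated = np.sum(w**2 * (sampling_unc**2 + uncorr_unc**2))
    # Correlated errors are propagated through the full covariance matrix
    var_correlated = w @ corr_cov @ w
    # Bias-adjustment uncertainty: spread of the per-member area averages
    var_bias = (ensemble @ w).var(ddof=1)
    return float(np.sqrt(var_uncorrelated + var_correlated + var_bias))

rng = np.random.default_rng(42)
n_boxes, n_members = 4, 200
weights = np.cos(np.deg2rad(np.array([2.5, 7.5, 12.5, 17.5])))
sampling_unc = np.full(n_boxes, 0.20)
uncorr_unc = np.full(n_boxes, 0.10)
a = rng.normal(scale=0.05, size=(n_boxes, n_boxes))
corr_cov = a @ a.T  # symmetric, positive semi-definite by construction
ensemble = (0.3 + rng.normal(scale=0.05, size=(n_members, 1))
            + rng.normal(scale=0.01, size=(n_members, n_boxes)))

total = area_average_uncertainty(weights, sampling_unc, uncorr_unc, corr_cov, ensemble)
print(f"total standard uncertainty of the area average: {total:.3f}")
```

The point of the sketch is that the three components enter differently: simple quadrature for the independent terms, a full matrix product for the correlated term, and a spread across members for the ensemble.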
Were corrections made to account for changes in observing systems or practices, sampling density, satellite drift, or similar issues?
- Yes. Corrections were made to reduce the effect of instrumentation changes. Residual errors remaining after the
How useful are these data for characterizing means as well as extremes?
- The data are based on point observations aggregated onto a monthly 5° latitude by 5° longitude grid. They can be used to assess changes in monthly mean sea-surface temperature. No information about specific extreme values is included in the data set.
How do I best compare these data with model output?
It depends on what you are doing. There are two main things to bear in mind.
First, there are gaps in the data and the gaps are not randomly distributed in time. Early in the record and during the two World Wars, the coverage is very sparse. Some analyses have accounted for this by masking the model output to have a similar coverage to that of the observations.
Second, there are uncertainties in the data. The data set has been presented as an ensemble of 200 data sets that characterise the uncertainties in the bias adjustments and, used together with separate fields of uncertainties associated with other measurement and sampling errors, these provide information about the estimated total uncertainty in the gridded fields.
The idea behind providing an ensemble was that it would be relatively easy to assess the sensitivity of an analysis to observational uncertainty by re-running the analysis on some, or all, of the observational ensemble. We very much welcome feedback from users on this general approach of using ensembles of observational data sets.
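Re-running an analysis on every ensemble member might look like the following sketch, which fits a linear trend to each of 200 synthetic global-mean series and summarises the spread. The data are invented for illustration, and `ensemble_trends` is a hypothetical helper, not part of any distributed code.

```python
import numpy as np

def ensemble_trends(years, ensemble):
    """Least-squares linear trend, per decade, for each ensemble member.

    ensemble has shape (n_members, n_years).
    """
    x = years - years.mean()  # centre the time axis for numerical stability
    slopes = np.polyfit(x, ensemble.T, 1)[0]  # one slope per member
    return slopes * 10.0

rng = np.random.default_rng(1)
years = np.arange(1900, 2019)
elapsed = years - years[0]

# Synthetic members: shared warming, a member-specific trend error that
# stands in for bias-adjustment uncertainty, and independent noise
members = np.array([
    0.007 * elapsed
    + rng.normal(0.0, 0.001) * elapsed
    + rng.normal(0.0, 0.05, size=years.size)
    for _ in range(200)
])

trends = ensemble_trends(years, members)
low, mid, high = np.percentile(trends, [2.5, 50, 97.5])
print(f"trend: {mid:.3f} ({low:.3f} to {high:.3f}) degC per decade")
```

The resulting range reflects only the uncertainty sampled by the ensemble (the bias adjustments); the measurement and sampling uncertainties held in the separate fields would need to be added for a total estimate.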
Are there spurious (non-climatic) features in the temporal record?
Probably. Although every effort has been made to minimise the effects of known changes in measurement methods, information concerning how measurements were made is limited. An attempt to quantify the uncertainty has been made, but the period between 1935 and 1970 is particularly problematic, as there were large but poorly documented changes in the way that measurements were made, as well as discontinuities in the data sources used to form the ICOADS data base. It is recommended that users test their analyses using a range of long-term SST data sets in order to get a broader understanding of the uncertainties.
How do I access these data?
Data are available from https://www.metoffice.gov.uk/hadobs/hadsst4/data/download.html
How frequently are the data updated?
The data set is not currently being updated (it runs from January 1850 to December 2018), but monthly updates will commence in early 2020, typically with a two-week lag.
Figure 1: Global average sea-surface temperature anomaly from HadSST.4.0.0.0 in black (uncertainty shown by grey shading), compared to three comparison data sets: (a, b) comparison with Argo float data in purple, lower panel shows relative coverage; (c, d) comparison to Along Track Scanning Radiometer Reanalysis for Climate data set in red; (e, f) comparison to drifting buoy data in blue. Note that all comparisons are collocated to control for differences in data coverage. The unadjusted in situ data are shown as a dotted black line. (Contributed by J. Kennedy)
ICOADS SSTs release 3.0 and 3.0.1; drifting buoys from CMEMS