# Trend Analysis

The following is by Dennis Shea (NCAR):

The detection, estimation and prediction of trends, and of their statistical and physical significance, are important aspects of climate research. Given a time series of (say) temperatures, the trend is the rate at which temperature changes over a time period. The trend may be linear or non-linear. Generally, however, it is synonymous with the slope of the line fit to the time series. Simple linear regression is most commonly used to estimate the linear trend (slope) and its statistical significance (via a Student t-test). The null hypothesis is no trend (i.e., an unchanging climate).

The non-parametric (i.e., distribution-free) Mann-Kendall (M-K) test can also be used to assess the significance of a monotonic trend, whether linear or non-linear. It is much less sensitive to outliers and skewed distributions than linear regression. (Note: if the deviations from the trend line are approximately normally distributed, the M-K test will return essentially the same result as simple linear regression.) The M-K test is often combined with the Theil-Sen robust estimate of linear trend.

Whatever test is used, the user should understand the assumptions underlying both the technique used to estimate the trend and the statistical method used to test it. For example, the Student t-test assumes the residuals have zero mean and constant variance. Further, a time series of N values may have fewer than N independent values due to serial correlation or seasonal effects. The estimated number of independent values is sometimes called the equivalent sample size, and there are methodologies for estimating it. It is this value, not N, that should be used when assessing statistical significance with (say) the Student t-test. Alternatively, the series may be pre-whitened or deseasonalized prior to applying the regression or M-K tests.
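As an illustration of the estimators named above, the following is a minimal pure-Python sketch, not the author's implementation. It computes a least-squares slope, the Theil-Sen slope (median of all pairwise slopes), and the Mann-Kendall statistic with a normal approximation. The function names and the synthetic series are hypothetical, and the M-K variance used here assumes no tied values and independent observations.

```python
import math
from statistics import median


def ols_slope(t, y):
    """Least-squares slope of y regressed on t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    return sxy / sxx


def theil_sen_slope(t, y):
    """Theil-Sen estimate: the median of the slopes between all pairs of points."""
    slopes = [
        (y[j] - y[i]) / (t[j] - t[i])
        for i in range(len(t))
        for j in range(i + 1, len(t))
        if t[j] != t[i]
    ]
    return median(slopes)


def mann_kendall(y):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test.

    Assumptions (simplifications for this sketch): no tied values,
    independent observations, normal approximation for the p-value.
    """
    n = len(y)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = y[j] - y[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return s, z, p


# Hypothetical annual anomalies (not real data): a 0.02/yr trend plus small
# deterministic wiggles, with one gross outlier added to the final year.
years = list(range(2000, 2020))
temps = [0.02 * k + 0.1 * math.sin(k) for k in range(len(years))]
temps[-1] += 5.0

print("OLS slope:      ", ols_slope(years, temps))
print("Theil-Sen slope:", theil_sen_slope(years, temps))
print("Mann-Kendall (S, Z, p):", mann_kendall(temps))
```

Because the Theil-Sen slope is a median, a single outlier shifts it far less than it shifts the least-squares slope. For series with ties or serial correlation, the variance of S needs a correction that this sketch omits.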

There are numerous caveats that should be kept in mind when analyzing trends. Some of these include:

1. Long-term, observationally based estimates are subject to differing sampling networks. Coarser sampling is likely to result in larger uncertainties. Variables with large spatial autocorrelation (e.g., temperature, sea level pressure) may have smaller sampling errors than (say) precipitation, which generally has lower spatial correlation.
2. The climate system within which the observations are made is not stationary.
3. Station, ship and satellite observations are subject to assorted errors. These may be random, systematic, or external, such as changes in instruments, observation times or observational environments. Much work has been done on creating time series that take these factors into account.
4. While reanalysis projects provide unchanging data assimilation and model frameworks, the observational mix changes over time. That may introduce discontinuities in the time series, which can cause a trend to be judged significant when in fact it is an artifact of the discontinuities.
5. Even a long series of random numbers may have segments with short-term trends. For example, the well-known surface temperature record from the Climatic Research Unit, which spans 1850-present, shows an undeniable long-term warming trend. However, there are short-term negative trends of 10-15 years embedded within this series. Also, the rate of warming changes depending on the starting date used in that time series.
6. As noted above, a series of N observations does not necessarily contain N independent observations. Often, there is some temporal correlation. This should be taken into account, for example, when computing the degrees of freedom of the t-test.
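Caveats (5) and (6) can be illustrated with a short sketch on synthetic data. The series below is made up (a small linear trend plus AR(1) noise from a seeded generator) and is not the Climatic Research Unit record. The helper names are hypothetical, and the formula N_eff = N(1 - r1)/(1 + r1), with r1 the lag-1 autocorrelation, is one commonly used approximation for the equivalent sample size, assumed here rather than taken from the text.

```python
import random


def ols_fit(y):
    """Least-squares (slope, intercept) of y regressed on its index 0..n-1."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxy = sum((k - t_mean) * (yk - y_mean) for k, yk in enumerate(y))
    sxx = sum((k - t_mean) ** 2 for k in range(n))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def lag1_autocorr(x):
    """Lag-1 autocorrelation of x (sample estimate about the series mean)."""
    mean = sum(x) / len(x)
    dev = [xi - mean for xi in x]
    num = sum(dev[i] * dev[i + 1] for i in range(len(x) - 1))
    den = sum(d * d for d in dev)
    return num / den


def effective_sample_size(x):
    """Approximate equivalent sample size, N * (1 - r1) / (1 + r1).

    Negative r1 is clipped to zero so N is never inflated; that is a
    conservative choice made for this sketch.
    """
    r1 = max(lag1_autocorr(x), 0.0)
    return len(x) * (1.0 - r1) / (1.0 + r1)


def windowed_slopes(y, window):
    """OLS slope for every contiguous segment of the given length."""
    return [ols_fit(y[s:s + window])[0] for s in range(len(y) - window + 1)]


# Synthetic series (an assumption for illustration): 150 "years" with a
# true trend of 0.005 per year plus AR(1) noise (phi = 0.6).
rng = random.Random(0)
n_years, true_trend, phi = 150, 0.005, 0.6
noise, series = 0.0, []
for k in range(n_years):
    noise = phi * noise + rng.gauss(0.0, 0.1)
    series.append(true_trend * k + noise)

# Caveat (6): estimate independence from the residuals about the trend line.
slope, intercept = ols_fit(series)
residuals = [yk - (intercept + slope * k) for k, yk in enumerate(series)]
print("full-record slope:", slope)
print("N =", n_years, " N_eff ~", effective_sample_size(residuals))

# Caveat (5): short segments of a series with an upward trend can trend downward.
slopes_15 = windowed_slopes(series, 15)
n_negative = sum(1 for s in slopes_15 if s < 0)
print("15-year windows with negative slope:", n_negative, "of", len(slopes_15))
```

The autocorrelation is computed on the detrended residuals rather than the raw series, since the trend itself would otherwise inflate r1. The resulting N_eff would then replace N when setting the degrees of freedom for the t-test.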