Vautard and Ghil (1989; VG hereafter) recognized the formal similarity between classical lagged-covariance analysis and the method of delays. They exploited this similarity further by pointing out that a pair of SSA eigenmodes with nearly equal eigenvalues, whose associated (temporal) principal components are nearly in phase quadrature, can efficiently represent a nonlinear, anharmonic oscillation. This is because a single pair of data-adaptive eigenmodes can capture the basic periodicity of, say, a boxcar- or seesaw-shaped oscillation, rather than requiring the many overtones that would appear in methods with fixed basis functions.
There are three basic steps in SSA: (i) embedding the sampled time series in a vector space of dimension M; (ii) computing the M x M lag-covariance matrix C_D of the data; and (iii) diagonalizing C_D.
Step (i): The time series {x(t): t = 1, …, N} is embedded into a vector space of dimension M by considering M lagged copies {x(t-j): j = 1, …, M} thereof. The choice of the dimension M is not obvious, but SSA is typically successful at analyzing periods in the range (M/5, M).
Step (ii): One defines the M x M lag-covariance matrix estimator C_D. Three distinct methods are in wide use. In the BK (Broomhead and King) algorithm, a window of length M is moved along the time series, producing a sequence of N' = N - M + 1 vectors in the embedding space. This sequence is used to form the N' x M trajectory matrix D, whose i-th row is the i-th view of the time series through the window. In this approach, C_D is defined by

C_D = (1/N') D^T D.
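For concreteness, a minimal numpy sketch of the BK estimate follows; the function name bk_covariance is ours, and the code illustrates the formula above rather than kSpectra's actual implementation.

import numpy as np

def bk_covariance(x, M):
    """BK estimate: C_D = (1/N') D^T D, with D the N' x M trajectory matrix."""
    x = np.asarray(x, dtype=float)
    Np = len(x) - M + 1                                   # N' = N - M + 1
    D = np.column_stack([x[j:j + Np] for j in range(M)])  # row i: i-th windowed view
    return D.T @ D / Np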
In the VG algorithm, C_D is estimated directly from the data as a Toeplitz matrix with constant diagonals, i.e., its entries c_ij depend only on the lag k = |i - j|:

c_ij = (1/(N - k)) Σ_{t=1}^{N-k} x(t) x(t + k).
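A matching sketch of the VG estimator (again with an illustrative name) computes the lag covariances and arranges them in Toeplitz form.

import numpy as np
from scipy.linalg import toeplitz

def vg_covariance(x, M):
    """VG estimate: entry (i, j) is the lag covariance at lag k = |i - j|."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    N = len(x)
    c = np.array([x[:N - k] @ x[k:] / (N - k) for k in range(M)])
    return toeplitz(c)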
Burg covariance estimation is an iterative process based on fitting an AR model with a number of AR components equal to the SSA window length. Both the Burg and VG methods impose a Toeplitz structure upon the autocovariance matrix, whereas the BK method does not. The Burg approach should, in principle, involve less "power leakage" due to the finite length of the time series and should therefore improve resolution. However, the Burg estimate can induce significant biases when nonstationarities and very low-frequency variations are present in the series; in such cases it is worthwhile to try the VG method. Also, for long series (N on the order of 5,000), the VG estimate is less computationally burdensome and thus completes more quickly.
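One way to realize a Burg-type estimate in Python is sketched below, assuming the statsmodels package is available: its burg routine fits the AR coefficients, and converting the fitted model to autocovariances via arma_acovf is our illustrative choice, not necessarily the toolkit's.

import numpy as np
from scipy.linalg import toeplitz
from statsmodels.regression.linear_model import burg
from statsmodels.tsa.arima_process import arma_acovf

def burg_covariance(x, M):
    """Fit an AR(M) model by Burg's method, then use the model's theoretical
    autocovariances at lags 0, ..., M-1 as the Toeplitz diagonals."""
    rho, sigma2 = burg(np.asarray(x, dtype=float), order=M, demean=True)
    ar = np.r_[1.0, -rho]                             # lag polynomial: 1 - rho_1 z - ...
    c = arma_acovf(ar, np.ones(1), nobs=M) * sigma2   # acovf is linear in sigma2
    return toeplitz(c)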
In practice, the VG method is more prone than the Burg method to numerical instabilities (yielding negative eigenvalues) when pure oscillations are analyzed. The BK method is somewhat less prone to problems with nonstationary time series, although the VG method seems untroubled by all but the most extreme nonstationarities. The Toeplitz methods do, however, appear to yield more stable results under minor perturbations of the time series.
The Burg, BK, and VG algorithms are all implemented in the kSpectra Toolkit.
Step (iii): The covariance matrix calculated from the N sample points is then diagonalized, and the eigenvalues are ranked in decreasing order. Each eigenvalue λ_k (k = 1, …, M) gives the variance of the time series in the direction specified by the corresponding eigenvector E_k; the square roots of the eigenvalues are called singular values, and the corresponding set is the singular spectrum. These terms give SSA its name; BK showed how to obtain the singular spectrum by applying singular value decomposition (SVD) to the trajectory matrix D. VG called the E_k's temporal empirical orthogonal functions (EOFs), by analogy with the meteorological term used when PCA is applied in the spatial domain.
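In code, step (iii) reduces to a symmetric eigendecomposition; the sketch below (illustrative names) returns the eigenvalues in decreasing order together with the matching temporal EOFs.

import numpy as np

def ssa_spectrum(C):
    """Diagonalize the symmetric lag-covariance matrix C; return the eigenvalues
    (variances) in decreasing order and the EOFs as the columns of E."""
    lam, E = np.linalg.eigh(C)
    order = np.argsort(lam)[::-1]
    return lam[order], E[:, order]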
When the singular values are plotted in decreasing order, one can often distinguish an initial steep slope, representing the signal, and a (more or less) "flat floor," representing the noise level. To separate signal from noise correctly, this visual inspection of the spectral plot must be confirmed by suitable criteria for statistical significance, as discussed later.
Once these three steps have been completed, a number of other useful results can be obtained. For each EOF E_k with components {E_k(j): j = 1, …, M}, we can construct the time series of length N' given by

a_k(t) = Σ_{j=1}^{M} x(t + j - 1) E_k(j),   1 ≤ t ≤ N',
called the k-th principal component (PC). It represents the projection of the original time series onto the k-th EOF. The sum of the power spectra of the PCs is identical to the power spectrum of the time series x(t); therefore we can study separately the spectral contribution of the various components. The PCs, however, have length N', not N, and do not contain phase information.
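An illustrative sketch of this projection: each of the N' windowed views of the series is projected onto the EOFs, so all PCs are obtained in one matrix product.

import numpy as np

def principal_components(x, E):
    """a_k(t) = sum_{j=1..M} x(t+j-1) E_k(j); column k of the result is the k-th PC."""
    x = np.asarray(x, dtype=float)
    M = E.shape[0]
    Np = len(x) - M + 1
    D = np.column_stack([x[j:j + Np] for j in range(M)])  # trajectory matrix
    return D @ E                                          # shape (N', M)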
In order to extract time series of length N corresponding to a chosen subset K of eigenelements, the associated PCs are combined to form the partial reconstruction R_K(t) of the original time series x(t). Over the central part of the time series (M ≤ t ≤ N'),

R_K(t) = (1/M) Σ_{k∈K} Σ_{j=1}^{M} a_k(t - j + 1) E_k(j),

and near either endpoint the inner sum runs only over the lags j for which 1 ≤ t - j + 1 ≤ N', with the factor 1/M replaced by one over the number of terms actually summed.
These series of length N are called reconstructed components (RCs). They have the important property of preserving the phase of the time series; therefore x(t) and R_K(t) can be superimposed. When K contains a single mode E_k, R_K(t) is called the k-th RC. The time series can be reconstructed completely, without losing any information, by summing all of its RCs. When a linear trend is present, reconstruction based on the BK algorithm has certain advantages near the endpoints of the sample.
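A sketch of the reconstruction, with the endpoint normalization handled explicitly (names illustrative; K holds zero-based column indices here). Summing the RCs over all K = {0, …, M-1} recovers x(t) to machine precision, which is a useful sanity check.

import numpy as np

def reconstruct(A, E, K):
    """Diagonal averaging: combine the PCs (columns of A) with the EOFs
    (columns of E) over the index set K to form the length-N series R_K(t)."""
    Np, M = A.shape[0], E.shape[0]
    N = Np + M - 1
    R = np.zeros(N)
    count = np.zeros(N)                       # M_t: number of terms available at time t
    for j in range(M):
        count[j:j + Np] += 1.0
        for k in K:
            R[j:j + Np] += A[:, k] * E[j, k]  # accumulate a_k(t-j+1) E_k(j) per t
    return R / count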
The correct partial reconstruction of the signal, i.e., the reconstruction over the correct set K = {1, 2, …, S} of EOFs, yields the optimal signal-to-noise (S/N) enhancement with respect to white noise. MEM spectral estimation works very well on this clean signal, which, unlike the raw time series, is band-limited. As a result, a low-order AR-process fit can yield good spectral resolution without spurious peaks.
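For illustration, such a low-order MEM estimate can be written directly from the standard AR spectral formula; the Burg fit below again assumes statsmodels.

import numpy as np
from statsmodels.regression.linear_model import burg

def mem_spectrum(r, p, nfreq=512):
    """AR(p) (maximum-entropy) spectral estimate of the signal r:
    P(f) = sigma2 / |1 - sum_m phi_m exp(-2 pi i f m)|^2."""
    phi, sigma2 = burg(np.asarray(r, dtype=float), order=p, demean=True)
    f = np.linspace(0.0, 0.5, nfreq)          # frequency in cycles per sample
    ejw = np.exp(-2j * np.pi * np.outer(f, np.arange(1, p + 1)))
    return f, sigma2 / np.abs(1.0 - ejw @ phi) ** 2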
In order to perform a good reconstruction of the signal or of its oscillatory part(s), several methods - either heuristic or based on Monte Carlo ideas - can be used for S/N separation and for the reliable identification of oscillatory pairs and trends; these are included in the kSpectra Toolkit.
Monte Carlo SSA (MC-SSA) can be used to establish whether a given time series is linearly distinguishable from any well-defined process, including the output of a deterministic chaotic system, but we will focus on testing against the linear stochastic processes normally considered as "noise". "Red noise" is often used to refer to any linear stochastic process in which power declines monotonically with increasing frequency, but we prefer to use the term to refer specifically to a first-order autoregressive, or AR(1), process, whose value at time t depends only on its value at time t - 1:

x(t) = a1 [x(t - 1) - x0] + ξ(t) + x0,
where ξ(t) is a Gaussian-distributed white-noise process with variance σ², x0 is the process mean, and a1 is a constant coefficient.
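An illustrative surrogate generator for this process; the initial value is drawn from the stationary distribution, whose variance is σ²/(1 - a1²).

import numpy as np

def ar1_surrogate(N, a1, sigma, x0=0.0, rng=None):
    """One realization of x(t) = a1 [x(t-1) - x0] + xi(t) + x0,
    with xi(t) ~ N(0, sigma^2)."""
    rng = np.random.default_rng(rng)
    x = np.empty(N)
    x[0] = x0 + rng.normal(0.0, sigma / np.sqrt(1.0 - a1 ** 2))
    for t in range(1, N):
        x[t] = x0 + a1 * (x[t - 1] - x0) + rng.normal(0.0, sigma)
    return x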
When testing against a red-noise null hypothesis, the first step in MC-SSA is to determine the red-noise coefficients σ and a1 from the time series x(t) using a maximum-likelihood criterion. Based on these coefficients, an ensemble of surrogate red-noise series is generated and, for each realization, a covariance matrix C_R is computed. These covariance matrices are then projected onto the eigenvector basis E_X of the original data:

Λ_R = E_X^T C_R E_X.
The resulting matrix Λ_R is not necessarily diagonal, but it measures the resemblance of a given surrogate set to the data set. This resemblance can be quantified by computing the statistics of the diagonal elements of Λ_R. The statistical distribution of these elements, determined from the ensemble of Monte Carlo simulations, gives confidence intervals outside which a time series can be considered significantly different from a generic red-noise realization. For instance, if an eigenvalue λ_k lies outside the 95% noise percentile, then the red-noise null hypothesis for the associated EOF (and PC) can be rejected with this confidence. Otherwise, that particular SSA component of the time series cannot be considered significantly different from red noise.
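Putting the pieces together, a bare-bones MC-SSA sketch, reusing the vg_covariance, ssa_spectrum, and ar1_surrogate sketches above (ensemble size and percentile are illustrative):

import numpy as np

def mc_ssa(x, M, a1, sigma, n_surr=1000, pct=97.5, seed=0):
    """Project each surrogate covariance C_R onto the data EOF basis E_X and
    compare the data eigenvalues with percentiles of diag(Lambda_R)."""
    x = np.asarray(x, dtype=float)
    lam_data, E = ssa_spectrum(vg_covariance(x, M))   # data eigenbasis E_X
    rng = np.random.default_rng(seed)
    diag_proj = np.empty((n_surr, M))
    for i in range(n_surr):
        s = ar1_surrogate(len(x), a1, sigma, x0=x.mean(), rng=rng)
        diag_proj[i] = np.diag(E.T @ vg_covariance(s, M) @ E)  # diag of Lambda_R
    upper = np.percentile(diag_proj, pct, axis=0)
    return lam_data, upper, lam_data > upper          # True where noise is rejected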
Special care must be taken at the parameter-estimation stage because, for the results of a test against AR(1) noise to be meaningful, we must ensure that we are testing against the specific AR(1) process that maximizes the likelihood of our failing to reject the null hypothesis. Only if we can reject this "worst case" red-noise process can we be confident of rejecting all other red-noise processes at the same or higher confidence level. A second important point is test multiplicity: comparing M data eigenvalues with M confidence intervals computed from the surrogate ensemble, we expect about M/10 of them to lie above the 90th percentile even if the null hypothesis is true. Thus a small number of excursions above a relatively low percentile should be interpreted with caution.
The MC-SSA algorithm described above can be adapted to eliminate known periodic components and test the residual against noise. This adaptation provides sharper insight into the dynamics captured by the data, since known periodicities (like orbital forcing on the Quaternary time scale or seasonal forcing on the intraseasonal-to-interannual one) often generate much of the variance at the lower frequencies manifest in a time series and alter the rest of the spectrum. The refinement of MC-SSA consists in restricting the projections to the EOFs that do not account for known periodic behavior. This refinement is implemented in the kSpectra Toolkit, where it is referred to as the "Include EOFs" option.
Other variations of the basic MC-SSA include projecting C_R onto the eigenvectors of the null-hypothesis covariance matrix (the AR(1) basis), and a computationally much faster chi-squared approximation. In contrast to full MC-SSA, the chi-squared test approximates the distribution of the surrogate projections as chi-squared with 3N/M degrees of freedom; this approximation is quite accurate for sinusoidal EOFs and a linear (AR-type) noise null hypothesis. The probability of n excursions above the m-th percentile is then estimated using the binomial distribution.
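For example, scipy makes the multiplicity arithmetic immediate (numbers illustrative):

from scipy.stats import binom

# With M = 40 eigenvalues tested against 90th-percentile bounds, how surprising
# are n = 5 excursions if the null hypothesis is true?
M, n = 40, 5
p = binom.sf(n - 1, M, 0.10)   # P(X >= 5) for X ~ Binomial(40, 0.1)
print(round(p, 2))             # ~0.37: five excursions are quite unremarkable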
A typical testing procedure for the Monte Carlo or chi-squared tests is as follows: (i) begin by testing against pure noise and identify frequencies at which the data display anomalously high power against this null hypothesis; (ii) find the data EOFs that correspond to these frequencies (being data-adaptive, these will often provide a better filter for the signal than the corresponding EOFs of the noise); and (iii) re-test with these EOFs included in the null hypothesis, to check whether other features of the spectrum were concealed by the dominant signal. It is, of course, consistent to iterate this procedure, but beware of test multiplicity: iteration makes the test increasingly liberal, so results that are still on the margins of acceptable significance after a couple of iterations, with successively more EOFs included in the "signal", should be ignored.
Results of the Monte Carlo or chi-squared tests should also be compared with the output of the Pairing tests and Heuristic error bars, and one should check that the relevant EOFs and PCs are indeed in phase quadrature.
1. Allen, M.R., and Smith, L.A., 1996: Monte Carlo SSA: detecting irregular oscillations in the presence of colored noise. J. Climate, 9, 3373-3404.
2. Broomhead, D.S., and King, G.P., 1986: Extracting qualitative dynamics from experimental data. Physica D, 20, 217-236.
3. Burg, J.P., 1978: A new analysis technique for time series data. In Modern Spectrum Analysis (D.G. Childers, Ed.), IEEE Press, New York, 42-48.
4. Kondrashov, D., and Ghil, M., 2006: Spatio-temporal filling of missing points in geophysical data sets. Nonlin. Processes Geophys., 13, 151-159.
5. Vautard, R., and Ghil, M., 1989: Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series. Physica D, 35, 395-424.