Measuring the "Complexity" of a time series
Bruce Land and Damian Elias

Introduction

People want to measure the complexity of various signals, such as bird song, ECG, a protein sequence, or DNA. For instance, for ECG there is the observation that healthy hearts seem to have a more varied and complex beat rate, so measuring the complexity of an ECG signal might be used as a diagnostic tool. Complexity estimates have been performed on DNA (Gusev et al. 1999, Orlov) (review by Potapov et al.), evolutionary sequences (Adami 2003), morphology (Fusco and Minelli 2000), development (Ricardo et al. 2005), manufacturing (Efstathiou et al.), and information systems vulnerability analysis (Bush and Hughes 2003), as well as on various time series outlined below. There are many proposed measures of complexity and very little agreement about their usefulness or rigor (e.g. Perakh, or Crutchfield 2000, or Shalizi 2003, or Masum 1995, or Jäntti 2004). The variety of measures is bewildering and daunting, but I will attempt to classify and outline what little I understand so far. Funes has a readable summary of the field.

Techniques Review

We are going to consider only time series analysis in this section. There are reviews of "complexity measures" (Feldman and Crutchfield 1998, Daw et al. 2002). Kuusela et al. (2002) compare the results of ten different complexity measures on heart rate and blood pressure time series.

Our Attempts

We are interested in comparing communications signals between species, modalities, genera, etc. We tried a few different schemes to see if we could come up with a useful measure.

We found Matlab code to compute sample entropy. Test cases for sines, noise, and a simple chaotic system (the logistic equation) were written. The figure below shows output for the logistic equation over a range of growth values, with template length m = 2.
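For reference, a minimal sketch of the sample entropy calculation is given below. This is an illustration in the spirit of the downloaded code, not the code itself; the function name is our own, and the tolerance r is typically taken as about 0.2 times the standard deviation of the signal. Sample entropy is the negative log of the conditional probability that two length-m templates which match within r also match when extended to length m+1.

    function se = sampen(x, m, r)
        % Sample entropy of the series x with template length m and tolerance r.
        N = length(x);
        A = 0;   % number of matching template pairs of length m+1
        B = 0;   % number of matching template pairs of length m
        for i = 1:N-m
            for j = i+1:N-m
                % Chebyshev (max-difference) distance between length-m templates
                if max(abs(x(i:i+m-1) - x(j:j+m-1))) < r
                    B = B + 1;
                    % check whether the templates still match when extended by one point
                    if abs(x(i+m) - x(j+m)) < r
                        A = A + 1;
                    end
                end
            end
        end
        se = -log(A/B);
    end

For the logistic-equation test, the series is generated by iterating x(n+1) = g*x(n)*(1 - x(n)) for each growth rate g, and the entropy is plotted against g.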

We also wrote code to compute permutation entropy. The test case was, again, the logistic equation, whose dynamics are periodic for some ranges of the growth rate and chaotic for others. The program plots the computed entropy versus the logistic growth rate. The figure below shows output for 4th-order patterns. You can clearly see the regions of stability around a growth rate of 3.83.
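The calculation can be summarized by a sketch like the following (again an illustration, not our original code): each window of n consecutive samples is reduced to its ordinal pattern (the ranking of its values), and the Shannon entropy of the distribution of patterns is reported.

    function pe = permentropy(x, n)
        % Permutation entropy of order n (Bandt-Pompe style ordinal patterns).
        counts = containers.Map('KeyType','char','ValueType','double');
        for i = 1:length(x)-n+1
            [~, idx] = sort(x(i:i+n-1));      % ordinal pattern of this window
            key = sprintf('%d', idx);          % pattern encoded as a string key
            if isKey(counts, key)
                counts(key) = counts(key) + 1;
            else
                counts(key) = 1;
            end
        end
        p = cell2mat(values(counts));
        p = p / sum(p);                        % relative frequency of each pattern
        pe = -sum(p .* log(p));                % can be normalized by log(factorial(n))
    end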

Neither sample entropy nor permutation entropy worked very well for the long data sets we are interested in analyzing, but we had not read about (nor yet figured out ourselves) the multiscale approaches mentioned above. We need to redo the analyses with multiscale techniques (for example, Costa et al. 2003).
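The core of the multiscale idea, as we read Costa et al. (2003), is to coarse-grain the series before computing the entropy, then repeat over a range of scale factors. A sketch, with a hypothetical function name:

    function y = coarsegrain(x, tau)
        % Average non-overlapping windows of length tau (the scale factor).
        % Sample entropy is then computed on y for each tau of interest.
        n = floor(length(x)/tau);
        y = mean(reshape(x(1:n*tau), tau, n), 1);
    end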

We developed a simple technique which is multiscale and which has given interesting results, but which suffers from all of the defects noted in various reviews of complexity measures: the technique has no relation to any other measure, is not supported by theory, and has few physically interpretable parameters. A description of the algorithm follows. A signal is divided into N non-overlapping segments. For each segment the Fourier entropy is calculated, then the mean and standard deviation of all N entropies are computed. The complexity measure (at the time scale set by N) is the product of the mean and the standard deviation. The measure is recomputed for a large range of N, to account for structuring on different scales. The measure tends to zero for sine waves and for white noise, and it indicates a higher "complexity" for many sorts of non-stationary signals. For example, AM or FM modulated signals appear more complex than noise or sine waves.

A bird song example is shown below. Eight species of sparrow were characterized. Single repeats of the entire song were isolated (by Damian) and analyzed. The spectrograms are here. The "complexity" ordering from the plot roughly matches the ordering of the spectrograms (as done by Damian, no double blind). The ordering was (from most complex to least): song, fox, swamp, field, white crown, brewers, white throat, chipping.
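A minimal sketch of the measure at a single scale is given below. Here "Fourier entropy" is taken to mean the Shannon entropy of the normalized power spectrum of a segment (our working assumption), and the function name is for illustration only.

    function c = segmentComplexity(x, N)
        % Divide x into N non-overlapping segments, compute the spectral
        % (Fourier) entropy of each, and return mean * standard deviation.
        L = floor(length(x)/N);              % samples per segment
        H = zeros(1, N);
        for k = 1:N
            seg = x((k-1)*L+1 : k*L);
            P = abs(fft(seg)).^2;
            P = P(1:floor(L/2));             % keep positive frequencies only
            P = P / sum(P);                  % normalize to a probability distribution
            P = P(P > 0);                    % avoid log(0) terms
            H(k) = -sum(P .* log(P));        % Fourier entropy of this segment
        end
        c = mean(H) * std(H);                % complexity at the time scale set by N
    end

The full curve is then generated by evaluating this quantity over a large range of N.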

We redid the analysis using fixed-length segments instead of a fixed number of segments per song, as was done above. The ordering of the songs changes somewhat.

The next two things to try are the second-order entropy techniques and the Rényi entropy.
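For reference (our note, not part of the analysis above), the Rényi entropy of order alpha generalizes the Shannon entropy and could be dropped into either of the sketches above in place of the -sum(p.*log(p)) step:

    % Renyi entropy of order alpha (alpha > 0, alpha ~= 1) for a probability vector p;
    % as alpha -> 1 it reduces to the Shannon entropy.
    renyi = @(p, alpha) log(sum(p.^alpha)) / (1 - alpha);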

References

General Sources


Copyright Cornell University, 2005