Scan Test Example

Scan

Description: A test for temporal clustering in a single time series. The test statistic is Sw, the maximum number of cases observed when a pre-defined window is moved along the time series. The test is most sensitive to clustering when the scanning window is the same width as natural clusters in the data.

Notation:

- N: Number of cases

- T: Number of time periods

- w: Window width

- Sw: Maximum number of cases observed in w as it is slid along the time series
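The definition of Sw can be made concrete with a short sketch. The Python below is illustrative only: the function name scan_statistic and the example counts are hypothetical (they are not the contents of bones.tim), and it assumes the series is supplied as a list of case counts, one per time period, with the window covering w adjacent periods.

```python
def scan_statistic(counts, w):
    """Return Sw: the largest number of cases in any w adjacent time periods."""
    if not 1 <= w <= len(counts):
        raise ValueError("window width must be between 1 and the series length")
    window_sum = sum(counts[:w])  # cases in the first window
    best = window_sum
    for t in range(w, len(counts)):
        # Slide the window one period: add the new cell, drop the oldest one.
        window_sum += counts[t] - counts[t - w]
        best = max(best, window_sum)
    return best


# Hypothetical series: T = 8 time periods, N = 12 cases.
example_counts = [1, 0, 2, 5, 3, 0, 1, 0]
print(scan_statistic(example_counts, 2))  # largest 2-period total (5 + 3 = 8)
```

With w = 1 the statistic reduces to the largest single-period count, and with w = T it equals N, which is why the choice of window width matters.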

Null hypothesis: Cases are distributed at random across the time series. The expectation of Sw under the null hypothesis is denoted E(Sw).

Sw is larger than its expectation when cases cluster in a few time cells. Sw is smaller than its expectation when cases are uniformly distributed among the time cells.
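One way to get a feel for E(Sw) is to simulate the null hypothesis directly. The sketch below is a Monte Carlo estimate, not the formula the program uses: it assumes each case falls independently and with equal probability into one of the T time cells, and the function name, seed and example settings are all hypothetical.

```python
import random


def simulate_expected_sw(n_cases, n_periods, w, n_sims=2000, seed=0):
    """Monte Carlo estimate of E(Sw) when cases fall at random into time cells."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = 0
    for _sim in range(n_sims):
        counts = [0] * n_periods
        for _case in range(n_cases):
            counts[rng.randrange(n_periods)] += 1  # each case picks a cell uniformly
        # Sw for this simulated series: maximum over all windows of w adjacent cells.
        total += max(sum(counts[i:i + w]) for i in range(n_periods - w + 1))
    return total / n_sims


# Hypothetical settings: N = 24 cases, T = 12 periods, window width w = 2.
estimate = simulate_expected_sw(24, 12, 2)
print(round(estimate, 2))
```

An observed Sw well above such an estimate points towards clustering, and one well below it points towards a more even spread than chance would produce.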

Significance: Statistical significance arises when many cases occur in one cell or when time cells with many cases fit within the scanning window. The probability of obtaining, under the null hypothesis, a scan statistic greater than or equal to j, given the window width w, T time periods and N cases, is written P(j; w, T, N) and is approximated following Wallenstein and Neff (1987). The approximation involves a binomial coefficient. Because we test for clusters, this probability is one-tailed.
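For readers who want to reproduce the calculation, the approximation attributed to Wallenstein and Neff (1987) is commonly quoted as P(k) ≈ (k/p - N - 1) b(k; N, p) + 2 Gb(k; N, p), where p = w/T, b(k; N, p) is the binomial probability of exactly k successes in N trials, and Gb(k; N, p) is the binomial probability of k or more. The sketch below implements that commonly quoted form. It is an assumption that this matches the program's formula exactly, and the function name and example inputs are hypothetical. The approximation is meant for small tail probabilities and can exceed 1 when k is small, so treat large values as a rough guide only.

```python
from math import comb


def scan_pvalue_approx(k, n_cases, w, n_periods):
    """Approximate P(Sw >= k) using the commonly quoted Wallenstein-Neff form.

    Assumption: p = w / n_periods is the fraction of the series covered by the window.
    """
    p = w / n_periods

    def b(i):
        # Binomial probability that exactly i of the n_cases fall in a fixed window.
        return comb(n_cases, i) * p**i * (1 - p) ** (n_cases - i)

    upper_tail = sum(b(i) for i in range(k, n_cases + 1))  # Gb(k; N, p)
    return (k / p - n_cases - 1) * b(k) + 2 * upper_tail


# Hypothetical inputs: N = 24 cases, T = 20 periods, window width w = 2.
print(scan_pvalue_approx(8, 24, 2, 20))
```

The value shrinks as k grows, reflecting that larger window totals are harder to obtain by chance under the null hypothesis.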

Data screen: Enter the name of the time series file (bones.tim) and the width of the scanning window (2). If possible, select the scanning window width to reflect underlying hypotheses about the temporal scale of the suspected cluster.

Run screen: Press 'F10' and select 'Run', then press 'Enter' to complete the calculations. By selecting a window width of 2 we are asking the scan method to determine whether the sum of bone fractures in any 2 adjacent time intervals is excessively large. The window on the right of the run screen displays Row (1), which is the number of time series; N, the number of cases (24); W, the window width (2); Sw, the scan statistic (7); and its expectation E(Sw) (3.6). P is the probability, under the null hypothesis, of obtaining an Sw greater than or equal to the observed Sw (0.4788). Using a scanning window width of 2, the scan test finds no evidence of temporal clustering in these data.

A graph of Sw against its expectation is also shown. When Sw equals its expectation the test is not significant and a line with a slope of 45° results. When clustering exists Sw is large relative to E(Sw) and the points will be above the 45° line. Under uniformity (e.g. an equal number of cases in each time period) Sw will be small relative to its expectation and the points will be below the 45° line.

References:

Wallenstein, S. 1980. A test for detection of clustering over time. American Journal of Epidemiology 104: 576-584.

Wallenstein, S. and N. Neff. 1987. An approximation for the distribution of the scan statistic. Statistics in Medicine 6: 197-207.


Website maintained by Andy Long. Comments appreciated.