Ederer-Myers-Mantel Example

Ederer-Myers-Mantel method

Description: Used to detect time clustering in several time series, this test is insensitive to differences in population size over the areas from which the time series originate. The test statistic, m₁, is the largest number of cases observed in any of a sequence of time intervals. Case counts in each time interval are required.

Notation:

- t (written T in places): number of time intervals in each time series

- r_i: number of cases in time series i

- f(r): number of time series, over all time series, with exactly r total cases

- m_1i: the largest number of cases in any time interval of time series i

Null hypothesis: In each time series the r_i cases are allocated randomly among the t cells.

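Significance: The sum of the m_1i is used to construct a Chi-square statistic with one degree of freedom:

$$\chi^2 = \frac{\left(\left|\sum_i m_{1i} - \sum_i E(m_{1i})\right| - 0.5\right)^2}{\sum_i \mathrm{Var}(m_{1i})}$$

The summations are over the time series: $\sum_i m_{1i}$ is the sum of the maximum number of cases over all time series, and $\sum_i E(m_{1i})$ and $\sum_i \mathrm{Var}(m_{1i})$ are the sums of the expectation and variance of m_1i under the null hypothesis. The 0.5 is a continuity correction.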
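
As a rough sketch of how the sums described above combine into the test statistic, the Python below assumes the continuity-corrected, one-degree-of-freedom form given by Ederer, Myers and Mantel (1964). The function name `emm_chi_square` is illustrative and is not part of Stat!. The only inputs shown are the Washtenaw county values quoted in the run-screen discussion, so the result is for that single series and not the 18-county analysis.

```python
from math import erfc, sqrt

def emm_chi_square(observed_m1, expected_m1, variance_m1):
    """Combine per-series maxima into one chi-square (1 df) and its p-value."""
    diff = abs(sum(observed_m1) - sum(expected_m1))
    # Assumed 0.5 continuity correction; it is not allowed to go below zero.
    corrected = max(diff - 0.5, 0.0)
    chi2 = corrected ** 2 / sum(variance_m1)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value

# Washtenaw county alone: m1 = 18, E(m1) = 8.718, Var(m1) = 1.789.
chi2, p = emm_chi_square([18], [8.718], [1.789])
print(round(chi2, 2))  # prints 43.11
```

Passing one list entry per retained time series gives the overall statistic in the same way.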

Data screen: Select `Time' from the horizontal methods menu, `Ederer-Myers-Mantel' and `Data'. Then enter the name of the file containing the time series to be analyzed (measles.tim). To use this method the number of time intervals in the series must be between 2 and 5. You can analyze subsets or reorganize the data into fewer time intervals when T>5.

These data describe the number of measles cases reported in 18 Michigan counties from 1983 through 1987 (see the file). There are five time intervals, one for each year, and there are 18 time series, one for each county. The first number in the file is the length of the time series (5). The second row is the number of time series (18).

The remaining rows are the time series. For example, row 3 in the file is `0 18 11 0 0', and is the time series for Washtenaw county. The total number of disease cases, r, is 29, while m₁ is 18. The null hypothesis under EMM states that, in Washtenaw county, each of the 29 cases is equally likely to occur in any of the 5 years. Measles is a contagious disease and we expect these data to demonstrate strong time clustering.
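
The file layout and the two summary values described above can be sketched in Python as follows. This is a minimal illustration and not Stat!'s own reader: the function name `parse_time_series` is hypothetical, the values are assumed to be whitespace-separated, and the sample holds only the Washtenaw row quoted above (so its series count is 1, not 18).

```python
def parse_time_series(text):
    """Read: line 1 = series length, line 2 = number of series, then one row per series."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    n_intervals = int(rows[0][0])
    n_series = int(rows[1][0])
    series = [[int(x) for x in row] for row in rows[2:2 + n_series]]
    if len(series) != n_series:
        raise ValueError("fewer time series than the header declares")
    for counts in series:
        if len(counts) != n_intervals:
            raise ValueError("a time series has the wrong number of intervals")
    return n_intervals, series

sample = "5\n1\n0 18 11 0 0\n"  # one series only, for illustration
t, series = parse_time_series(sample)
for counts in series:
    r = sum(counts)   # total cases in the series
    m1 = max(counts)  # largest count in any single time interval
    print(t, r, m1)   # prints: 5 29 18
```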

Run screen: Press `F10' to exit the data entry screen, select `Run' and press `Enter'. The calculations will take a moment and the run screen will be displayed. Examine the window on the lower right of the screen. Column 1 is r, the total number of cases in a time series; Column 2 is f(r), the number of time series with that number of cases. Columns 3 and 4 are the expectation and variance of m₁ under the null hypothesis for a time series of length 5 and with r total cases.

Again consider Washtenaw county, with 29 measles cases in five years. In the right hand window look up `r=29' to learn there was 1 county with r=29 (f(r)=1). Under the null hypothesis a county with r=29 cases has an expected maximum of E(m₁)=8.718 cases, with a variance of 1.789. However, the observed maximum was m₁=18, which strongly suggests a clustering of measles cases in Washtenaw county.
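
For readers who want to see where such tabled values can come from, here is a hedged Python sketch that computes the exact mean and variance of m₁ directly from the null hypothesis (r cases falling independently and with equal probability into t intervals). It is an assumption about how the figures can be reproduced, not a description of Stat!, which uses published tables and asymptotic estimates. The name `max_cell_moments` is illustrative.

```python
from fractions import Fraction
from math import factorial

def max_cell_moments(r, t):
    """Exact mean and variance of the largest of t cell counts for r uniform cases."""
    def prob_max_at_most(k):
        # P(max <= k) = r! / t^r * [x^r] (sum_{j=0..k} x^j / j!)^t
        term = [Fraction(1, factorial(j)) for j in range(min(k, r) + 1)]
        poly = [Fraction(1)]
        for _ in range(t):
            new = [Fraction(0)] * (r + 1)  # truncate at degree r
            for a, ca in enumerate(poly):
                if ca == 0:
                    continue
                for b, cb in enumerate(term):
                    if a + b > r:
                        break
                    new[a + b] += ca * cb
            poly = new
        return poly[r] * factorial(r) / Fraction(t) ** r

    cdf = [prob_max_at_most(k) for k in range(r + 1)]
    pmf = [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, r + 1)]
    mean = sum(k * p for k, p in enumerate(pmf))
    second = sum(k * k * p for k, p in enumerate(pmf))
    return mean, second - mean ** 2

mean, var = max_cell_moments(2, 5)
print(float(mean), float(var))  # prints: 1.2 0.16
```

Calling `max_cell_moments(29, 5)` gives values that can be compared with the tabled 8.718 and 1.789 for Washtenaw county.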

The tabled values in the results screen are used to calculate a Chi-square to test for departures from the null hypothesis in all of the counties simultaneously. Notice there are two ways the distribution of cases can differ from their null distribution: They may cluster (m₁ larger than expected) or they may be uniform (m₁ smaller than expected). Either of these alternatives will inflate the Chi-square. Use the plot of the expected m₁ on the observed m₁ to determine the nature of the departure (if any) from the null distribution. This plot is in the lower left of the run screen. The function m₁=E(m₁) is shown by the dashed line. On average, time series whose distribution of cases is consistent with the null hypothesis will plot on this line. Time clustering causes m₁ to be larger than its expectation, and clustered time series will plot above the dashed line. Uniformity -- an equal number of cases in each time interval -- causes m₁ to be smaller than its expectation, and uniform time series plot below the dashed line.
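
The reading of the plot described above amounts to comparing each observed m₁ with its expectation. A small Python sketch, with the hypothetical helper name `departure`, makes the rule explicit. The first call uses the Washtenaw values from this example; the second uses an invented series with two cases in two different years, for which E(m₁)=1.2 when there are five intervals.

```python
def departure(observed_m1, expected_m1):
    """Say where a series plots relative to the line m1 = E(m1)."""
    if observed_m1 > expected_m1:
        return "above the line (clustered)"
    if observed_m1 < expected_m1:
        return "below the line (uniform)"
    return "on the line"

print(departure(18, 8.718))  # Washtenaw: above the line (clustered)
print(departure(1, 1.2))     # invented series: below the line (uniform)
```

A single series above or below the line is only suggestive; the chi-square is what tests the departure across all series together.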

The time series for Washtenaw county plots well above the dashed line, suggesting strong time clustering. Notice there is one symbol plotted for each time series described in the table on the right. Sum the f(r) column to determine the number of time series plotted; you should obtain 8. This is also the number of time series used to calculate the overall Chi-square. The other 10 counties in the data set were automatically dropped from the analysis because they had 1 or 0 cases. Two of the time series plot at E(m₁)=1.2, Var(m₁)=0.160, the null values for a series with r=2 cases in five intervals. Intermediate sums used to calculate the Chi-square are given in the results table, followed by the Chi-square (384.858) and its significance (0.0000). Not surprisingly, measles cases are strongly clustered.
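
The screening step and the f(r) column described above can be sketched in Python as follows. The function name `screen_series` is illustrative, and only the first row is taken from this example; the other four rows are invented to show series being dropped or kept and are not the measles data.

```python
from collections import Counter

def screen_series(series):
    """Drop series with 0 or 1 cases, then count how many kept series share each total r."""
    kept = [counts for counts in series if sum(counts) >= 2]
    f = Counter(sum(counts) for counts in kept)
    return kept, f

series = [
    [0, 18, 11, 0, 0],  # Washtenaw row quoted in this example
    [0, 0, 0, 0, 0],    # invented: no cases, dropped
    [1, 0, 0, 0, 0],    # invented: one case, dropped
    [1, 1, 0, 0, 0],    # invented: two cases, kept
    [0, 2, 0, 0, 0],    # invented: two cases, kept
]
kept, f = screen_series(series)
for r in sorted(f):
    print(r, f[r])          # prints "2 2" then "29 1"
print(sum(f.values()))      # prints 3, the number of series analyzed
```

With the full measles file the same sum over f(r) should come to 8, as stated above.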

The large number of cases in Oakland county (69) reflects its population size, 1,006,273 people in 1983. Oakland county includes suburbs of Detroit. This contrasts with rural Branch county, with 39,057 people. One of the advantages of the EMM test is that it is not affected by differences in population sizes across time series. Therefore, we have confidence in our results even though vast differences in county population size exist.

Notes: The expectation and variance may be checked against tables published by Ederer, Myers and Mantel (1964). Stat calculates the expectation and variance using tables or the asymptotic estimates of Stark and Mantel (1967). Tables and linear interpolation are used when the total number of cases in a time series is less than 499; otherwise asymptotic estimates (Table 5 in Mantel, Kryscio and Myers 1976) are used. The test is biased by changes in population size through time, although it is not sensitive to differences in population size across areas.

Reference:

Stark, C. R. and N. Mantel. 1967. Lack of seasonal or temporal spatial clustering of Down's Syndrome births in Michigan. American Journal of Epidemiology 86: 199-213.

Website maintained by Andy Long. Comments appreciated.