School of Public Health, University of Michigan, Department of Epidemiology

Scenario 1

Scenario: the geographic distribution of standardized mortality rates across Glasgow.

Source: R. Haining, Spatial Data Analysis in the Social and Behavioral Sciences, CUP, 1990; R. Haining, "Bivariate correlation with spatial data," Geographical Analysis, 23 (1991): 210-227.

The geographic distribution of disease cases has been of interest to public health officials ever since Dr. Snow mapped cholera cases in London in 1854. The general geographic processes in operation are contagion and/or common locational exposure risks. These processes result in the clustering of cases, and hence spatial autocorrelation in maps of individual and/or geographically aggregated cases.

This case study tracks and maps selected standardized mortality rates in Glasgow, based upon aggregates for eighty-seven community medicine areas during 1980-82. These rates are related to contributory factors; for example, cancer rates are correlated with low quality of life, which is believed to induce stress. Two proper measures of bivariate correlation between these georeferenced variables are outlined, together with their accompanying sampling distributions. The first involves prewhitening the two georeferenced variables under study, that is, filtering the spatial autocorrelation out of each variable before correlating them. The second involves calculating the effective sample size associated with spatially autocorrelated data. The demonstrations presented emphasize that the presence of spatial autocorrelation in map data undermines the standard statistical theory of inference pertaining to bivariate correlation.
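The prewhitening idea can be illustrated with a minimal sketch. It is not taken from Haining's text: it assumes a simultaneous autoregressive structure x = rho W x + e, a known autocorrelation parameter `rho`, and a hypothetical chain of 87 areas in which each area neighbours the next. The helper names `chain_weights` and `prewhiten` and all the data are made up for illustration; in practice `rho` would have to be estimated and `W` built from the real map.

```python
import numpy as np

def chain_weights(n):
    """Row-standardized weights for a hypothetical chain of n areas
    (each area neighbours the one before and the one after it)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def prewhiten(x, W, rho):
    """Filter out first-order spatial dependence: x* = x - rho * W x."""
    return x - rho * (W @ x)

rng = np.random.default_rng(0)
n, rho = 87, 0.6          # 87 areas as in the scenario; rho is assumed
W = chain_weights(n)

# Two independent white-noise series (simulated, not real mortality data)
e_x = rng.standard_normal(n)
e_y = rng.standard_normal(n)

# Impose spatial autocorrelation on each: x = (I - rho W)^{-1} e
A_inv = np.linalg.inv(np.eye(n) - rho * W)
x = A_inv @ e_x
y = A_inv @ e_y

# Correlate the raw and the prewhitened variables
r_raw = np.corrcoef(x, y)[0, 1]
x_f = prewhiten(x, W, rho)
y_f = prewhiten(y, W, rho)
r_filtered = np.corrcoef(x_f, y_f)[0, 1]

print(f"raw r = {r_raw:.3f}, prewhitened r = {r_filtered:.3f}")
```

Because `rho` is known exactly here, the filter recovers the underlying independent series, so standard inference applies to `r_filtered` but not to `r_raw`.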

The objective of securing a good estimate of the population mean is also treated by Haining (1990). As before, the discussion is cast in terms of the impact of spatial autocorrelation on the sampling distribution, here that of a sample mean. The demonstrations emphasize that spatial dependency implies a loss of information in the estimation of a mean. Information loss arising from positive spatial autocorrelation can lead to incorrectly failing to reject a null hypothesis.
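The loss of information can be made concrete with a small sketch. For n observations with common variance and correlation matrix R, the variance of the sample mean is sigma^2 (1'R1) / n^2, so one common definition of the effective sample size is n* = n^2 / (1'R1). The code below assumes that definition, a hypothetical exponential distance-decay correlation function, and randomly placed area centroids; the function names, the coordinates, and the range parameter are illustrative assumptions, not Haining's Glasgow data.

```python
import numpy as np

def effective_sample_size_mean(R):
    """n* = n^2 / (sum of all entries of the correlation matrix R)."""
    R = np.asarray(R, dtype=float)
    n = R.shape[0]
    return n**2 / R.sum()

def exponential_correlation(coords, range_param):
    """Assumed correlation model: corr(i, j) = exp(-d_ij / range_param)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    return np.exp(-dist / range_param)

rng = np.random.default_rng(0)
n = 87                                   # number of areas in the scenario
coords = rng.uniform(0.0, 1.0, (n, 2))   # made-up centroids in a unit square

# Independent observations: no information is lost
n_eff_indep = effective_sample_size_mean(np.eye(n))

# Positively autocorrelated observations: fewer "equivalent" independent ones
R = exponential_correlation(coords, range_param=0.2)
n_eff_spatial = effective_sample_size_mean(R)

print(f"independent: n* = {n_eff_indep:.1f}")
print(f"autocorrelated: n* = {n_eff_spatial:.1f}")
```

Under positive autocorrelation every off-diagonal entry of R is positive, so 1'R1 exceeds n and n* falls below the nominal sample size; a standard error computed with n in place of n* is therefore too small.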

Principal concepts: effective sample size (1990, p. 318; 1991, pp. 213-214), prewhitening (1991, p. 216), bivariate correlation (1990, pp. 313-323; 1991, p. 217), sample mean (1990, pp. 161-166).

Questions:

  1. Why does spatial autocorrelation cause information loss when calculating a sample mean?
  2. How can the phrase "effective sample size" be defined?
  3. What happens to the sampling distribution of a correlation coefficient or a sample mean as positive spatial autocorrelation increases?
  4. Why should an epidemiologist be concerned about the presence of spatial autocorrelation in disease rates?
  5. What are some implications for a multivariate analysis of georeferenced disease rates that can be gleaned from univariate and bivariate findings concerning the presence of spatial autocorrelation?