School of Public Health, University of Michigan, Department of Epidemiology

Scenario 2

Scenario: the geographic distribution of male lung cancer mortality in France

Source: S. Richardson, "A method for testing the significance of geographical correlations with application to industrial lung cancer in France," Statistics in Medicine, 9 (1990): 515-528.

The geographic distribution of disease cases has been of interest to public health officials ever since Dr. John Snow mapped cholera cases in London in 1854. The general geographic processes in operation are contagion and/or common locational exposure risks. These processes result in the clustering of cases, and hence in spatial autocorrelation in maps of individual and/or geographically aggregated cases.

This paper discusses some of the problems encountered when testing the significance of sample relationships uncovered in geographical epidemiology, where variables typically exhibit spatial autocorrelation. Standardized male lung cancer mortality rates in specific industries in France over the two-year period of 1968-69 are tracked and mapped, using 82 administrative units. Lung cancer was selected for study because it is frequently associated with industrial exposure/occupational risk factors for men. Notably, adjustments are made for the time lag between smoking and the onset of lung pathology. Effective degrees of freedom are calculated, and correlation coefficient significance test results are reported, comparing a standard test with a test that takes into account the presence of spatial autocorrelation. The demonstrations presented emphasize that geographical epidemiology studies that account for latent spatial autocorrelation by employing a modified test statistic can identify risk factors related to those found in individual epidemiological studies. These demonstrations also emphasize that using standard methods, and hence overlooking the presence of spatial autocorrelation, can yield somewhat misleading statistical results.

Principal concepts: effective sample size (p. 517), modified test statistic (p. 521), bivariate correlation (pp. 523, 525).
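
To make the idea of an effective sample size concrete, the following Python sketch may help. It is not the estimator used in the paper. It relies on one standard result: for n equal-variance observations with correlation matrix R, the variance of the sample mean equals that of n* = n^2 / (sum of all entries of R) independent observations. The correlation structure (`ar1_corr`, units arranged along a line with correlation rho^|i-j|), the value rho = 0.5, and the sample correlation r = 0.3 are all hypothetical choices made for illustration. Plugging the mean-based n* into the t statistic for a correlation is a deliberate simplification, intended only to show the direction of the effect.

```python
import math

def effective_sample_size(corr):
    """Effective sample size for a sample mean: n* = n^2 / sum of all
    entries of the n x n correlation matrix (equal variances assumed)."""
    n = len(corr)
    total = sum(sum(row) for row in corr)
    return n * n / total

def ar1_corr(n, rho):
    """Hypothetical correlation structure: n units on a line, with
    correlation rho**|i - j| between units i and j."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def t_statistic(r, n):
    """t statistic for a correlation coefficient r, using n - 2
    degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

n = 82      # number of administrative units in the scenario
rho = 0.5   # assumed strength of positive autocorrelation (illustrative)
r = 0.3     # assumed sample correlation (illustrative)

n_eff = effective_sample_size(ar1_corr(n, rho))
t_standard = t_statistic(r, n)      # treats all 82 units as independent
t_adjusted = t_statistic(r, n_eff)  # uses the reduced effective size

print(f"effective sample size: {n_eff:.1f} of {n}")
print(f"standard t: {t_standard:.2f}, adjusted t: {t_adjusted:.2f}")
```

With no autocorrelation (rho = 0) the function returns n itself; as positive autocorrelation increases, n* falls below n and the adjusted t statistic shrinks, so a relationship judged significant by the standard test may no longer be so.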

Questions:

  1. How can the phrase "effective sample size" be defined?
  2. What happens to the sampling distribution of a correlation coefficient or a sample mean as positive spatial autocorrelation increases?
  3. What is the impact of spatial autocorrelation on a partial correlation coefficient?
  4. Why should an epidemiologist be concerned about the presence of spatial autocorrelation in disease rates?
  5. What are some implications for a multivariate analysis of georeferenced disease rates that can be gleaned from univariate and bivariate findings concerning the presence of spatial autocorrelation?