Scenario:

Assume data describing cancer mortality rates in census districts, where the size of the at-risk population is known for each district. An investigator wishes to construct a randomization test to assess possible spatial autocorrelation in the mortality rates of a given cancer. He considers two possible randomization null hypotheses:

(1) A case can occur with equal probability anywhere in the study area. There are k census districts, so a given case has probability 1/k of occurring in any one district. He should therefore randomize the observations by allocating them at random among the k census districts.

(2) The k census districts have different population sizes. In the absence of a source of excess mortality, each individual in the population has an equal probability of dying of cancer. He should therefore randomize the observations by allocating them among the k census districts in proportion to population size.
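To make the two allocation schemes concrete, here is a minimal Python sketch of a single randomization draw under each one. The function name `allocate_cases`, the population figures, and the case count are all hypothetical and are not part of the scenario. The sketch only shows what each scheme does mechanically; it is not offered as an answer to the questions below.

```python
import random
from collections import Counter


def allocate_cases(n_cases, populations, proportional, rng):
    """Randomly allocate n_cases among districts and return per-district counts.

    proportional=False: each district is equally likely (probability 1/k).
    proportional=True:  each district is weighted by its population size.
    """
    k = len(populations)
    weights = populations if proportional else [1] * k
    # Draw a district index for every case, independently, with replacement.
    draws = rng.choices(range(k), weights=weights, k=n_cases)
    counts = Counter(draws)
    return [counts.get(i, 0) for i in range(k)]


# Hypothetical example data (assumed for illustration only).
populations = [500, 1500, 3000, 5000]
n_cases = 200
rng = random.Random(42)  # fixed seed so the draw is reproducible

equal_alloc = allocate_cases(n_cases, populations, False, rng)
prop_alloc = allocate_cases(n_cases, populations, True, rng)

print("Scheme 1 (equal probability):", equal_alloc)
print("Scheme 2 (population-weighted):", prop_alloc)
```

In a full randomization test, a draw like this would be repeated many times and the chosen autocorrelation statistic recomputed for each draw to build a reference distribution.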

Questions:

  1. Is either of these approaches correct? Why or why not?
  2. How would you randomize these data?