The GeoMed/Epi Project:

Modules

Introduction to Spatial Analysis in Health
Introduction to the nature of spatial data. Characteristics of space and of spatially-referenced events. Relevance and use of spatial pattern in other fields. Problems in measuring and interpreting spatial patterns in health. Possibilities of inference.
GIS and Spatial Data Handling
This module is an introduction to Geographic Information Systems (GIS) and spatial data handling.
The first hour of lecture focuses on many of the important issues involved in creating, manipulating, and analyzing spatial data. The focal point of this discussion is GIS, which provides a vehicle for carrying out many of these functions:
- Creation: Scientists may need to collect spatial data, digitize maps to add to their databases, or add images collected by others (e.g. satellite imagery). We will talk about how one may go about this.
- Manipulation: Often one of the biggest obstacles is managing data: importing and exporting it in the forms that various software packages need.
- Analysis: What are some of the standard analysis procedures invoked from within a GIS?
The second hour of lecture deals with the role of spatial statistics in the analysis and creation of spatial data. Topics covered include quantification of spatial structure and the construction of inferences from spatial data. Examples in population genetics and cancer epidemiology are presented.
Our attention in this first lecture is therefore on the first stage of the course strategy, data collection and manipulation, and on the transition to maps created from the data. The lecture also looks forward to the stages at which some analysis will be required (particularly techniques of visualization).
Exploratory Spatial Data Analysis
This module deals with Exploratory Spatial Data Analysis (ESDA). Topics include the objectives of ESDA, its methods, both graphical and statistical, and the role of ESDA in hypothesis generation and testing. The module closes with a discussion of multiple testing and experimentwise error.
Exploratory SDA is just that: exploratory! This is the first step in making sense of the data you have collected/assembled. Some techniques are available in the GIS studied in the previous week, but we need to go outside GIS: the evolution from management and visualization to analysis has been slow for GIS, and so other software must be called into play.
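To make the multiple-testing issue concrete, here is a minimal sketch of how the experimentwise error rate grows with the number of tests. It assumes the tests are independent, which is rarely strictly true for spatial data; the function names are ours, chosen for illustration, and the Bonferroni adjustment is shown as one common remedy rather than the one this module necessarily recommends.

```python
def experimentwise_error(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests,
    each carried out at per-test level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_level(alpha, n_tests):
    """Per-test level that keeps the experimentwise error at or below alpha."""
    return alpha / n_tests


# Hypothetical scenario: 20 exploratory tests, each at the 0.05 level.
error_20_tests = experimentwise_error(0.05, 20)  # about 0.64
adjusted_level = bonferroni_level(0.05, 20)      # 0.0025
```

With twenty tests, a false positive somewhere is more likely than not, which is why hypotheses generated during exploration need separate confirmation.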
Scientific Visualization
The point of scientific visualization is to enhance your understanding of your data, as well as to provide you with new insights that won't come from standard numerical statistics alone. How do we convey the maximum amount of information in the neatest, most intelligible form?
Visualization takes many forms: static or dynamic; from one-dimensional to n-dimensional; monocular or binocular (stereo). In this module we present a variety of methods to help you to garner more information by looking at your data sensibly.
We examine techniques for visualizing your data, and follow that with some exploratory techniques for investigating spatial autocorrelation.
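One widely used measure of spatial autocorrelation is Moran's I. The sketch below is illustrative only: the module text does not say which statistic it uses, and the six-site transect, the binary neighbour weights, and the data values are all made up for the example.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for n sites, given an n-by-n spatial weight matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()  # deviations from the mean
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)


# Hypothetical transect of six sites; adjacent sites are neighbours.
n = 6
w = np.zeros((n, n))
for i in range(n - 1):
    w[i, i + 1] = w[i + 1, i] = 1.0

smooth = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])       # neighbours are alike
alternating = np.array([1.0, 6.0, 1.0, 6.0, 1.0, 6.0])  # neighbours differ

i_smooth = morans_i(smooth, w)            # 0.6: positive autocorrelation
i_alternating = morans_i(alternating, w)  # -1.0: negative autocorrelation
```

Values near zero would suggest no spatial structure. Judging whether an observed value is unusual requires a reference distribution, for example one obtained by randomization.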
Disease Surveillance
Uses of space-time data to define outbreak clusters. Spatial patterns in monitoring for disease and directing intervention.
Public health surveillance is the ongoing systematic collection, analysis and interpretation of health event data with the objective of disease control and prevention. This module presents an introduction to issues in disease surveillance. The background of disease cluster investigations is presented, along with their role in public health. The basic cluster types are introduced and the cluster investigation guidelines of the Centers for Disease Control are described. The module closes with important issues such as 'Texas Sharpshooter' sampling and whether disease causality may be inferred from cluster investigations.
Intro to Spatial Statistics
There are three points in the course strategy where spatial statistics and models come into play: in the transition from data to thematic map (e.g. geostatistical techniques); in the exploration of spatial autocorrelation and clustering; and in the testing of a prediction based on theory, at the end of the process.
Spatial statistics are statistics calculated from spatial data, and differ from 'classical' statistics (e.g. ANOVA, regression) in several ways. This module begins by identifying these differences, and then presents several spatial models. Spatial statistics are then developed as a special case of randomization tests, and issues of statistical inference with spatial data are discussed. Finally, the utility and limitations of empirical distributions are presented.
Spatial clustering and surveillance statistics will be discussed separately, as will geostatistical models.
Which disease models give rise to which patterns? Experimentalists gather the patterns, and then attempt to deduce the process (developing a model which they hope captures or reflects the process); theoreticians may attempt to develop a model, then find the data to verify their predictions. We will examine some specific models of contagion.
Disease Clusters
This module provides an overview of disease clustering methods. It opens with a discussion of the role of disease clustering in scientific inference, and then describes tests for temporal clustering, spatial clustering, and space-time interaction. These are presented within the framework of global, local and focused tests. Next, disease surveillance methods are described, and the module concludes with recommendations regarding hypothesis-driven versus data-driven approaches.
Designer Spatial Statistics
Many spatial statistics are special cases of a flexible mathematical form called the Gamma product. This module describes the Gamma product and its constituent parts, including proximity metrics, data metrics, and spatial randomization procedures. Because it is so flexible, the Gamma product provides a ready means for creating 'Designer' spatial statistics customized to specific requirements.
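As a rough illustration, the Gamma product can be written as the sum over pairs of sites of a proximity measure a_ij times a data measure b_ij, with significance judged by re-computing the statistic after randomly reassigning the observations to the sites. The sketch below assumes that form. The function names, the adjacency proximity metric, the cross-product data metric, and the 15-site transect are our choices for the example, not the module's.

```python
import numpy as np


def gamma_product(proximity, data_metric):
    """Gamma = sum over pairs i != j of a_ij * b_ij."""
    a = np.asarray(proximity, dtype=float)
    b = np.asarray(data_metric, dtype=float)
    off_diagonal = ~np.eye(a.shape[0], dtype=bool)
    return float((a * b)[off_diagonal].sum())


def cross_product_metric(values):
    """Data metric b_ij = (x_i - mean) * (x_j - mean), as in Moran-type statistics."""
    z = values - values.mean()
    return np.outer(z, z)


def gamma_randomization_test(proximity, values, data_metric_fn, n_perm=999, seed=0):
    """Upper-tail randomization test: shuffle the values over the sites."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    observed = gamma_product(proximity, data_metric_fn(values))
    simulated = np.empty(n_perm)
    for k in range(n_perm):
        shuffled = rng.permutation(values)
        simulated[k] = gamma_product(proximity, data_metric_fn(shuffled))
    # The observed arrangement counts as one of the possible arrangements.
    p_upper = (1 + np.sum(simulated >= observed)) / (n_perm + 1)
    return observed, p_upper


# Hypothetical data: 15 sites on a transect with a steady gradient in rates.
n_sites = 15
proximity = np.zeros((n_sites, n_sites))
for i in range(n_sites - 1):
    proximity[i, i + 1] = proximity[i + 1, i] = 1.0
rates = np.arange(1.0, n_sites + 1.0)

observed, p_upper = gamma_randomization_test(proximity, rates, cross_product_metric)
```

Swapping in a different proximity metric (e.g. inverse distance) or data metric (e.g. squared difference) yields a different statistic from the same machinery, which is the sense in which the statistics are 'designed'.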
Leaps and Creeps: Hierarchical spatial modeling
As a country experiences transition from third world to first world status, its territory tends to become increasingly organized in a hierarchical fashion. The hierarchical organization promotes efficiency within the country, including facilitating flows between locations and administration of health services. The hierarchy is almost always reflected in the organization of a country's urban places.
The chance of a diffusion materializing at a location increases with the size of the population residing there (a hierarchical component), and decreases as its distance from the source location increases (a contagion component). This diffusion pattern may be described with a social gravity model.
Some of the issues addressed include:
- Mechanisms that create geographic hierarchy (e.g. air travel).
- What factors lead to hierarchical versus contagious diffusion?
- Predicting "where" when diseases leap from place to place.
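The social gravity model mentioned in this module is commonly written as an interaction proportional to the product of the two populations divided by a power of the distance between them. The sketch below assumes that form. The constant k, the exponent beta, and the populations and distances are hypothetical values chosen to show how the exponent shifts the balance between leaps and creeps.

```python
def gravity_interaction(pop_i, pop_j, distance, k=1.0, beta=2.0):
    """Expected interaction between two places under a simple gravity model.

    k and beta are assumed parameters; in practice they are estimated from data.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_i * pop_j / distance ** beta


# Hypothetical setting: an outbreak in a city of 1,000,000 people.
source = 1_000_000
big_far_city = (500_000, 300.0)   # (population, distance in km)
small_near_town = (20_000, 50.0)

# Strong distance decay (beta = 2): the nearby town interacts more (a creep).
strong_big = gravity_interaction(source, *big_far_city, beta=2.0)
strong_small = gravity_interaction(source, *small_near_town, beta=2.0)

# Weak distance decay (beta = 1): the distant large city dominates (a leap).
weak_big = gravity_interaction(source, *big_far_city, beta=1.0)
weak_small = gravity_interaction(source, *small_near_town, beta=1.0)
```

Anything that weakens the effect of distance, such as air travel, pushes the pattern toward hierarchical diffusion in this formulation.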
The Search for Spatial Associations
Conventional statistics deal with IID (Independent and Identically Distributed) data; spatial autocorrelation invalidates the independence property of these data. This type of correlation may be viewed as the presence of redundant information in the data. Impacts of positive spatial autocorrelation include:
- a distortion of tests for normality,
- inflation of the estimated variance,
- inflation of the estimated covariance, and
- the need to estimate an autoregressive model.
Properly accounting for spatial autocorrelation involves estimating the inflation factors, and adjusting the sample size, N -- which becomes the effective sample size, N* -- in order to relate spatially autocorrelated data to equivalent hypothetical IID data. The procedural steps involved in doing this are:
1. evaluate normality (quantile plots, Shapiro-Wilk statistic), and if necessary apply a power transformation to each variable;
2. estimate the autoregressive parameter for each georeferenced variable (this module employs the Simultaneous AutoRegressive [SAR] model);
3. estimate the means, inflation factors, and correlation coefficients;
4. estimate the effective sample size N*, and its associated degrees of freedom;
5. calculate the t-statistics; and
6. identify extreme values and perform significance tests.
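Steps 2 and 4 above can be sketched for a single variable as follows. This is an illustration under stated assumptions, not the module's exact procedure: the autoregressive parameter rho is taken as already estimated, the SAR covariance is taken to be proportional to the inverse of (I - rho W)'(I - rho W), and the effective sample size is defined as the number of independent observations that would give the same variance of the sample mean. The ten-site transect and the function names are hypothetical.

```python
import numpy as np


def row_standardized_transect(n):
    """Neighbour weights for n sites on a line; each row sums to one."""
    w = np.zeros((n, n))
    for i in range(n - 1):
        w[i, i + 1] = w[i + 1, i] = 1.0
    return w / w.sum(axis=1, keepdims=True)


def effective_sample_size(weights, rho):
    """N* for the mean of a variable following a SAR model with parameter rho.

    Assumed definition: N* = n * trace(S) / (1' S 1), where S is the SAR
    covariance structure. With rho = 0 this reduces to N* = n.
    """
    n = weights.shape[0]
    a = np.eye(n) - rho * weights
    s = np.linalg.inv(a.T @ a)  # covariance up to the scale factor sigma^2
    ones = np.ones(n)
    return n * np.trace(s) / (ones @ s @ ones)


w = row_standardized_transect(10)
n_eff_iid = effective_sample_size(w, rho=0.0)  # equals 10 for independent data
n_eff_sar = effective_sample_size(w, rho=0.6)  # fewer than 10 "equivalent" observations
```

The reduced N* then replaces N when the degrees of freedom for the t-statistics are computed.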
A variety of georeferenced health data sets are discussed.
Models of Process/Compartmental Models (Dr. John Jacquez)
Dr. Jacquez provided us with an introduction to compartmental models, which we can use to simulate or model a spatial process. He began with an introduction to the topic, and finished with a discussion of how spatial aspects can be modelled with compartmental models.
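As a simple illustration of how space can enter a compartmental model, the sketch below runs a susceptible-infected-recovered (SIR) model in two patches that share a fraction of their contacts. It is our own minimal example, not a model from Dr. Jacquez's lecture: the parameter values, the symmetric coupling, and the fixed-step Euler integration are all assumptions.

```python
import numpy as np


def simulate_two_patch_sir(beta, gamma, coupling, s0, i0, r0, dt=0.1, steps=1000):
    """Euler integration of an SIR model in two coupled patches.

    beta: transmission rate; gamma: recovery rate;
    coupling: fraction of contacts made with the other patch.
    Returns an array of shape (steps + 1, 3, 2): time, compartment (S, I, R), patch.
    """
    s = np.array(s0, dtype=float)
    i = np.array(i0, dtype=float)
    r = np.array(r0, dtype=float)
    n = s + i + r  # patch population sizes, constant over time
    mixing = np.array([[1.0 - coupling, coupling],
                       [coupling, 1.0 - coupling]])
    history = [np.stack([s, i, r])]
    for _ in range(steps):
        force = beta * (mixing @ (i / n))  # force of infection in each patch
        new_infections = force * s * dt
        new_recoveries = gamma * i * dt
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append(np.stack([s, i, r]))
    return np.array(history)


# Hypothetical run: one initial case in patch 1, none in patch 2.
history = simulate_two_patch_sir(beta=0.5, gamma=0.1, coupling=0.05,
                                 s0=[999, 1000], i0=[1, 0], r0=[0, 0])
peak_infected_patch_2 = history[:, 1, 1].max()
```

Because the patches exchange contacts, the infection introduced in patch 1 eventually produces an epidemic in patch 2 as well. Adding more patches and a distance-dependent mixing matrix is one way to extend the idea.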
Geostatistical Models
Geostatistical models are used essentially for three reasons:
1. to characterize (i.e. model) spatial autocorrelation (via the variogram),
2. to create continuous maps based on the data for an area (via kriging), and
3. to simulate random realizations (data sets) based on a given variogram structure (spatial autocorrelation model).
The data may be disease rates, or probabilities of disease occurrence, etc.
Geostatistics, then, will be useful in moving from the data to thematic maps, in the analysis of spatial autocorrelation, and in creating maps for visualization and ESDA.
These techniques are notoriously complicated, however, so this will be a rather elementary and descriptive approach to geostatistical modelling. We will try to provide insight into the ideas via examples and many pictures, although there will be some math (the most you've seen so far).
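To give a flavour of the first use listed above, here is a minimal sketch of the classical empirical semivariogram: for each distance class, half the average squared difference between values at pairs of sites separated by roughly that distance. The function name, the distance bins, and the ten-site transect are hypothetical, and fitting a variogram model and kriging are not shown.

```python
import numpy as np


def empirical_semivariogram(coords, values, bin_edges):
    """Half the mean squared difference between pairs, grouped by separation distance.

    Returns (semivariances, pair_counts), one entry per distance bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # each pair of sites once
    distances = np.linalg.norm(coords[i] - coords[j], axis=1)
    squared_diffs = (values[i] - values[j]) ** 2
    semivariances, counts = [], []
    for lower, upper in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (distances > lower) & (distances <= upper)
        counts.append(int(in_bin.sum()))
        semivariances.append(0.5 * squared_diffs[in_bin].mean() if in_bin.any() else np.nan)
    return np.array(semivariances), np.array(counts)


# Hypothetical data: ten sites one unit apart, with values rising steadily.
x = np.arange(10.0)
coords = np.column_stack([x, np.zeros_like(x)])
values = x.copy()

semivariances, pair_counts = empirical_semivariogram(coords, values, [0.5, 1.5, 2.5, 3.5])
# semivariances -> [0.5, 2.0, 4.5]; pair_counts -> [9, 8, 7]
```

The semivariance rises with distance here because nearby sites have similar values. A steady trend like this one is a deliberately simple case; real variograms usually level off at some range.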
Transmission and Exposure: implications for pattern and process
A special guest lecture by Dr. Uriel Kitron. In this module Dr. Kitron draws on many examples from his years of experience in the area to continue addressing the problem of inferring the underlying process that gives rise to an observed pattern.
Pattern and Process: Gaining insights from time-space characteristics of disease distribution
Using knowledge of disease mechanisms to hypothesize underlying processes that produce observed patterns. Summarize various kinds of exposure and transmission. Develop hypotheses that might explain different time-space patterns. Compare direct and indirect contagious processes with different environmental exposures.
Unanswered questions and future directions

Website maintained by Andy Long. Comments appreciated.
aelon@sph.umich.edu