Introduction to the nature of spatial data. Characteristics of space and of spatially-referenced events. Relevance and use of spatial pattern in other fields. Problems in measuring and interpreting spatial patterns in health. Possibilities of inference.
This module is an introduction to Geographic Information Systems (GIS) and spatial data handling.
The first hour of lecture focuses on many of the important issues involved in creating, manipulating, and analyzing spatial data. The focal point of this discussion will be GIS (Geographic Information Systems), which provide a vehicle for carrying out many of these functions:
The second hour of lecture deals with the role of spatial statistics in the analysis and creation of spatial data. Topics covered include the quantification of spatial structure and the construction of inferences from spatial data. Examples in population genetics and cancer epidemiology are presented.
Our attention in this first lecture is therefore on the very first stage of the course strategy: data collection and manipulation, and the transition to maps created from the data. The lecture also looks ahead to the stages at which some analysis will be required (particularly techniques of visualization).
Exploratory spatial data analysis (ESDA) is just that: exploratory! This is the first step in making sense of the data you have collected/assembled. Some techniques are available in the GIS studied in the previous week, but we need to go outside GIS: the evolution from management and visualization to analysis has been slow for GIS, and so other software must be called into play.
The point of scientific visualization is to enhance your understanding of your data, as well as to provide you with new insights that won't come by standard numerical statistics alone. How do we convey the maximal amount of information possible in the neatest, most intelligible form?
Visualization takes many forms: static or dynamic; from one-dimensional to n-dimensional; monocular or binocular (stereo). In this module we present a variety of methods to help you to garner more information by looking at your data sensibly.
We examine techniques for visualizing your data, and follow that with some exploratory techniques for investigating spatial autocorrelation.
Uses of space-time data to define outbreak clusters. Spatial patterns in monitoring for disease and directing intervention.
Public health surveillance is the ongoing systematic collection, analysis and interpretation of health event data with the objective of disease control and prevention. This module presents an introduction to issues in disease surveillance. The background of disease cluster investigations is presented, along with their role in public health. The basic cluster types are introduced and the cluster investigation guidelines of the Centers for Disease Control are described. The module closes with important issues such as `Texas Sharpshooter' sampling and whether disease causality may be inferred from cluster investigations.
There are three points in the course strategy where spatial statistics and models come into play: in the transition from data to thematic map (e.g. geostatistical techniques); in the exploration of spatial autocorrelation and clustering; and in the testing of a prediction based on theory, at the end of the process.
Spatial statistics are statistics calculated from spatial data, and differ from `classical' statistics (e.g. ANOVA, regression) in several ways. This module begins by identifying these differences, and then presents several spatial models. Spatial statistics are then developed as a special case of randomization tests, and issues of statistical inference with spatial data are discussed. Finally, the utility and limitations of empirical distributions are presented.
Spatial clustering and surveillance statistics will be discussed separately, as will geostatistical models.
Which disease models give rise to which patterns? Experimentalists gather the patterns, and then attempt to deduce the process (developing a model which they hope captures or reflects the process); theoreticians may attempt to develop a model, then find the data to verify their predictions. We will examine some specific models of contagion.
This module provides an overview of disease clustering methods. It opens with a discussion of the role of disease clustering in scientific inference, and then describes tests for temporal clustering, spatial clustering, and space-time interaction. These are presented within the framework of global, local and focused tests. Next, disease surveillance methods are described, and the module concludes with recommendations regarding hypothesis-driven vs. data-driven approaches.
Many spatial statistics are special cases of a flexible mathematical form called the Gamma product. This module describes the Gamma product and its constituent parts, including proximity metrics, data metrics, and spatial randomization procedures. Because it is so flexible, the Gamma product provides a ready means for creating `Designer' spatial statistics customized to specific requirements.
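As a rough illustration of the idea, a Gamma product can be written as the sum, over all pairs of locations, of a proximity measure multiplied by a data measure, with significance judged by re-shuffling the observations over the locations. The sketch below is only one possible instance: the inverse-distance proximity metric, the cross-product data metric, and all function names are assumptions chosen for illustration, not the module's own definitions.

```python
import math
import random


def gamma_product(proximity, data_metric):
    """Sum of proximity[i][j] * data_metric[i][j] over all pairs with i != j."""
    n = len(proximity)
    return sum(proximity[i][j] * data_metric[i][j]
               for i in range(n) for j in range(n) if i != j)


def inverse_distance(coords):
    """Assumed proximity metric: 1 / Euclidean distance (locations must be distinct)."""
    n = len(coords)
    return [[0.0 if i == j else 1.0 / math.dist(coords[i], coords[j])
             for j in range(n)] for i in range(n)]


def cross_product(values):
    """Assumed data metric: product of deviations from the mean."""
    mean = sum(values) / len(values)
    return [[(a - mean) * (b - mean) for b in values] for a in values]


def gamma_randomization_test(coords, values, n_perm=999, seed=0):
    """Return (observed Gamma, one-sided p-value) under spatial randomization."""
    rng = random.Random(seed)
    proximity = inverse_distance(coords)
    observed = gamma_product(proximity, cross_product(values))
    shuffled = list(values)
    as_extreme = 1  # the observed arrangement counts as one realization
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign the observations to locations at random
        if gamma_product(proximity, cross_product(shuffled)) >= observed:
            as_extreme += 1
    return observed, as_extreme / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: eight sites on a line, low values at one end, high at the other.
    sites = [(float(x), 0.0) for x in range(8)]
    rates = [1, 1, 1, 1, 10, 10, 10, 10]
    gamma, p_value = gamma_randomization_test(sites, rates)
    print(f"Gamma = {gamma:.2f}, p = {p_value:.3f}")
```

Swapping in a different proximity metric (for example adjacency) or a different data metric (for example squared differences) changes which statistic is being computed while the randomization procedure stays the same, which is the sense in which such statistics can be customized.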
As a country experiences transition from third world to first world status, its territory tends to become increasingly organized in a hierarchical fashion. The hierarchical organization promotes efficiency within the country, including facilitating flows between locations and administration of health services. The hierarchy is almost always reflected in the organization of a country's urban places.
The chance of a diffusion materializing between two locations increases as the size of the population residing at a location increases (a hierarchical component), and decreases as the distance separating the two locations increases (a contagion component). This diffusion pattern may be described with a social gravity model.
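A minimal sketch of a gravity-type interaction score follows, assuming the common form in which interaction is proportional to the product of the two populations divided by distance raised to a power. The constant k, the exponent beta, the function names, and the example places are all hypothetical; the module may use a different specification.

```python
def gravity_interaction(pop_origin, pop_dest, distance, k=1.0, beta=2.0):
    """Assumed gravity form: k * P_i * P_j / d_ij ** beta."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_origin * pop_dest / distance ** beta


def rank_destinations(pop_origin, destinations, k=1.0, beta=2.0):
    """Order destination names from strongest to weakest expected interaction.

    destinations maps a name to a (population, distance from origin) pair.
    """
    scores = {name: gravity_interaction(pop_origin, pop, dist, k, beta)
              for name, (pop, dist) in destinations.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical places: population and distance (arbitrary units) from the origin.
    places = {
        "large_far": (1_000_000, 200.0),
        "small_near": (20_000, 15.0),
        "medium_mid": (150_000, 60.0),
    }
    print(rank_destinations(500_000, places))
```

A larger beta makes distance (the contagion component) dominate, while a smaller beta lets population size (the hierarchical component) dominate, so the same function can mimic either diffusion pattern.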
Some of the issues addressed include
Conventional statistics deal with IID (Independent and Identically Distributed) data; spatial autocorrelation invalidates the independence property of these data. This type of correlation may be viewed as the presence of redundant information in the data. Impacts of positive spatial autocorrelation include:
Properly accounting for spatial autocorrelation involves estimating the inflation factors, and adjusting the sample size, N -- which becomes the effective sample size, N* -- in order to relate spatially autocorrelated data to equivalent hypothetical IID data. The procedural steps involved in doing this are:
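The idea of an effective sample size can be illustrated for the simplest case, the variance of a sample mean. The sketch below is not the module's procedure: it assumes the correlation between every pair of observations is already known (here an exponential decay with distance, chosen purely for illustration), computes the variance inflation factor as the sum of all entries of the correlation matrix divided by N, and sets N* = N / inflation factor. All function names and the range parameter are hypothetical.

```python
import math


def exponential_correlation(coords, corr_range):
    """Assumed correlation model: rho_ij = exp(-d_ij / corr_range)."""
    n = len(coords)
    return [[math.exp(-math.dist(coords[i], coords[j]) / corr_range)
             for j in range(n)] for i in range(n)]


def variance_inflation_factor(corr):
    """Inflation of Var(sample mean) relative to IID data: sum(corr) / N."""
    n = len(corr)
    return sum(sum(row) for row in corr) / n


def effective_sample_size(corr):
    """N* = N / inflation factor; equals N for IID data and 1 for perfect correlation."""
    return len(corr) / variance_inflation_factor(corr)


if __name__ == "__main__":
    # Hypothetical 5 x 5 grid of sampling locations with unit spacing.
    grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
    corr = exponential_correlation(grid, corr_range=2.0)
    print(f"N = {len(grid)}, N* = {effective_sample_size(corr):.1f}")
```

With positive spatial autocorrelation the inflation factor exceeds 1, so N* is smaller than N, which is one way to quantify the redundant information in the data.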
Dr. Jacquez introduces compartmental models, which we can use to simulate or model a spatial process. He begins with the fundamentals of the topic, and finishes with a discussion of how spatial aspects can be modelled with compartmental models.
Geostatistical models are used essentially for three reasons:
Geostatistics, then, will be useful in moving from the data to thematic maps, in the analysis of spatial autocorrelation, and in creating maps for visualization and ESDA.
These techniques are notoriously complicated, however, so this will be a rather elementary and descriptive approach to geostatistical modelling. We will try to provide insight into the ideas via examples and many pictures, although there will be some math (the most you've seen so far).
A special guest lecture by Dr. Uriel Kitron. In this module Dr. Kitron draws on many examples from his years of experience in the area to continue pursuing the problem of inferring the underlying process that gives rise to an observed pattern.
 
Website maintained by Andy Long. Comments appreciated. aelon@sph.umich.edu