Scatterplot Matrices

Another approach to graphing a set of variables is to look at a matrix of all possible pairwise scatterplots of the variables. The scatterplot-matrix function will produce such a plot. The data

(def hardness (list 45 55 61 66 71 71 81 86 53 60 64 68 79 81 56
                    68 75 83 88 59 71 80 82 89 51 59 65 74 81 86))
(def tensile-strength (list 162 233 232 231 231 237 224 219 203 189 
                            210 210 196 180 200 173 188 161 119 161 
                            151 165 151 128 161 146 148 144 134 127))
(def abrasion-loss (list 372 206 175 154 136 112 55 45 221 166 164
                         113  82  32 228 196 128 97 64 249 219 186
                         155 114 341 340 284 267 215 148))

were produced in a study of the abrasion loss in rubber tires and the expression

(scatterplot-matrix (list hardness tensile-strength abrasion-loss)
                    :variable-labels
                    (list "Hardness" "Tensile Strength" "Abrasion Loss"))

produces the scatterplot matrix in Figure 9.

Figure 9: Scatterplot matrix of abrasion loss data.

The plot of abrasion-loss against tensile-strength gives you an idea of the joint variation in these two variables. But hardness varies from point to point as well. To understand the relationship among all three variables, it would be nice to be able to fix hardness at various levels and look at the way the plot of abrasion-loss against tensile-strength changes as you change these levels. You can do this kind of exploration in the scatterplot matrix by using two highlighting techniques, selecting and brushing.

Selecting. Your plot is in the selecting mode when the cursor is an arrow. This is the default setting. In this mode you can select a point by clicking the mouse on top of it. To select a group of points, drag a selection rectangle around the group. If the group does not fit in a rectangle, you can build up your selection by holding down the shift key as you click or drag. If you click without the shift key, any existing selection will be unselected; when the shift key is down, selected points remain selected.

Brushing. You can enter the brushing mode by choosing Mouse Mode... from the Scatmat menu and selecting Brushing from the dialog box that pops up. In this mode the cursor will look like a paint brush, and a dashed rectangle, the brush, will be attached to your cursor. As you move the brush across the plot, points in the brush will be highlighted. Points outside of the brush will not be highlighted unless they are marked as selected. To select points in the brushing mode (that is, to make their highlighting permanent), hold the mouse button down as you move the brush.
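
The idea of fixing hardness at a level can also be sketched without the mouse. The following is a minimal sketch, not a replacement for brushing. It assumes the hardness, tensile-strength and abrasion-loss lists defined above are already in your workspace. The helper hardness-slice, the name mid, and the cutoffs 60 and 75 are hypothetical choices made only for illustration.

```lisp
;; Hypothetical helper (not part of Lisp-Stat): return the indices of the
;; cases whose hardness lies in the closed interval [lo, hi].
(defun hardness-slice (lo hi)
  (which (mapcar #'(lambda (h) (<= lo h hi)) hardness)))

;; Indices of the cases with hardness between 60 and 75 (illustrative cutoffs).
(def mid (hardness-slice 60 75))

;; Plot abrasion loss against tensile strength for that slice only.
(plot-points (select tensile-strength mid)
             (select abrasion-loss mid)
             :variable-labels (list "Tensile Strength" "Abrasion Loss"))
```

Calling hardness-slice with other cutoffs gives a static version of moving the brush along the hardness axis.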

In the plot in Figure 10

Figure 10: Scatterplot matrix with middle hardness values highlighted.

the points within the middle of the hardness range have been highlighted using a long, thin brush (you can change the size of your brush using the Resize Brush command on the Scatmat menu). In the plot of abrasion-loss against tensile-strength you can see that the highlighted points seem to follow a curve. If you want to fit a model to these data, this suggests choosing one that accounts for the curvature.
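
One way to allow for such curvature is to add a squared tensile-strength term to a linear regression. The sketch below is only an illustration of that idea, not a model recommended by the source. It assumes the three data lists defined above are in your workspace and uses the Lisp-Stat function regression-model; the names ts-centered and curved-fit are hypothetical.

```lisp
;; Center tensile strength before squaring so that the linear and
;; quadratic terms are less strongly correlated with each other.
(def ts-centered (- tensile-strength (mean tensile-strength)))

;; Illustrative model only: hardness, tensile strength, and a squared
;; tensile-strength term as predictors of abrasion loss.
(def curved-fit
     (regression-model (list hardness ts-centered (^ ts-centered 2))
                       abrasion-loss
                       :predictor-names
                       (list "Hardness" "Tensile Strength" "Tensile Strength^2")))
```

The printed summary shows a coefficient for the squared term, which you can compare with its standard error to judge whether the curvature seen in the plot is worth modeling.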

A scatterplot matrix is also useful for examining the relationship between a quantitative variable and several categorical variables. In the data

(def yield (list 7.9 9.2 10.5 11.2 12.8 13.3 12.1 12.6 14.0 9.1 10.8 12.5
                 8.1 8.6 10.1 11.5 12.7 13.7 13.7 14.4 15.5 11.3 12.5 14.5  
                15.3 16.1 17.5 16.6 18.5 19.2 18.0 20.8 21 17.2 18.4 18.9 )) 
(def density (list 1 1 1 2 2 2 3 3 3 4 4 4 1 1 1 2 2 2 3 3 3 4 4 4 
                   1 1 1 2 2 2 3 3 3 4 4 4))
(def variety (list 1 1 1  1 1 1  1 1 1  1 1 1  2 2 2  2 2 2  2 2 2
                   2 2 2  3 3 3  3 3 3  3 3 3  3 3 3))

(Devore and Peck [11, page 595, Example 14]) the yield of tomato plants is recorded for an experiment run at four different planting densities and using three different varieties. In the plot in Figure 11

Figure 11: Scatterplot matrix for tomato yield data with points from the third variety highlighted.

a long, thin brush has been used to highlight the points in the third variety. If there is no interaction between the varieties and the density, then the pattern of highlighted points should keep roughly the same shape and shift approximately in parallel as the brush is moved from one variety to another.
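
The same parallelism can be checked numerically by comparing cell means. The following is a minimal sketch, assuming the yield, density and variety lists defined above are in your workspace. The helper cell-mean and the names cell-means and variety-3-vs-1 are hypothetical and are introduced only for this illustration.

```lisp
;; Hypothetical helper: mean yield for the cases with variety v and density d.
(defun cell-mean (v d)
  (mean (select yield
                (which (mapcar #'(lambda (vi di) (and (= vi v) (= di d)))
                               variety density)))))

;; One row per variety, each row holding the means for densities 1 to 4.
(def cell-means
     (mapcar #'(lambda (v)
                 (mapcar #'(lambda (d) (cell-mean v d)) (list 1 2 3 4)))
             (list 1 2 3)))

;; Differences between the third and first varieties at each density.
;; Without interaction these should be roughly constant.
(def variety-3-vs-1 (- (third cell-means) (first cell-means)))
```

If the entries of variety-3-vs-1 differ a great deal from one density to the next, the highlighted points will not shift in parallel as the brush moves between varieties.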

Like spin-plot, the function scatterplot-matrix also accepts the optional keyword argument scale.


Anthony Rossini
Fri Oct 20 10:29:17 EDT 1995