Section 12.2: Describing the relationship: sample percentages
The web example from section 12.1 leads us into section 12.2. Once the
chi-square test indicates that the variables are dependent, we want an
easy way to describe that dependence, and sample percentages are the easiest way.
Note: we do not perform the chi-square test on the
percentages, but on the actual data collected (the counts). If we do not
detect a dependency, then we do not go on to discuss the differences
in percentages.
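The order of operations described above (test on the counts first, then describe with percentages) can be sketched as follows. This is only an illustration: the table of counts is made up and is not the web example from section 12.1, and the function names are hypothetical. The critical value 3.841 is the standard chi-square table value for df = 1 at the 0.05 level.

```python
# Hypothetical 2x2 table of counts (NOT the data from section 12.1).
counts = {
    "design A": {"clicked": 30, "did not click": 70},
    "design B": {"clicked": 50, "did not click": 50},
}

def chi_square_statistic(table):
    """Chi-square statistic computed from the raw counts, not from percentages."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    total = sum(row_totals.values())
    stat = 0.0
    for r in rows:
        for c in cols:
            # Expected count if the two variables were independent.
            expected = row_totals[r] * col_totals[c] / total
            stat += (table[r][c] - expected) ** 2 / expected
    return stat

def row_percentages(table):
    """Percentage of each row's total that falls in each column."""
    result = {}
    for r, cells in table.items():
        row_total = sum(cells.values())
        result[r] = {c: 100.0 * v / row_total for c, v in cells.items()}
    return result

CRITICAL_VALUE = 3.841  # chi-square table, df = 1, alpha = 0.05

stat = chi_square_statistic(counts)
if stat > CRITICAL_VALUE:
    # Dependence detected, so describe it with sample percentages.
    for row, pcts in row_percentages(counts).items():
        print(row, {c: round(p, 1) for c, p in pcts.items()})
else:
    print("No dependence detected; percentages are not compared.")
```

With these made-up counts the statistic is about 8.33, which exceeds 3.841, so the percentages (30% versus 50% clicked) are reported.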
Sections 11.1/11.2: Describing Relationships: Scatterplots and the Correlation Coefficient
In contrast to chapter 12, where we attempted to determine if two
qualitative variables were dependent, in chapter 11 we look at the relationship
between two quantitative variables, and attempt to decide if it is
linear.
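We use a "scatterplot" to represent the two variables: each pair of values is plotted as one point, with the independent variable (X) on the horizontal axis and the dependent variable (Y) on the vertical axis.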
Interpreting a scatterplot:
positive linear relationship (as x increases, y increases)
negative linear relationship (as x increases, y decreases)
no linear relationship apparent
Draw conclusions only for values of the
independent variable (X) that fall within the range of
the observed data values for X; do not extrapolate beyond them.
Write your interpretation in terms of the variables of
interest, not "X" and "Y".
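We use the correlation coefficient (denoted "r") to
characterize the strength of the linear relationship:
Values of r near 1 indicate a strong positive linear relationship, values near -1 a strong negative linear relationship, and values near 0 little or no linear relationship.
Always between -1 and 1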
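As a rough illustration of how r is computed, here is a minimal sketch using the usual sum-of-products formula for the sample correlation coefficient. The function name and the data (hours studied and quiz scores) are hypothetical and chosen only to keep the arithmetic small.

```python
import math

def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r for paired values (xs[i], ys[i])."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products of deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = correlation_coefficient(hours, scores)
print(round(r, 4))
```

For this made-up data r is about 0.77, a fairly strong positive linear relationship: as hours studied increase, quiz scores tend to increase.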
Correlation coefficient test: a test for linear dependence of two
variables (p. 446)
Decision rule: Accept Ha if |t| > t(alpha/2), the two-sided critical value from the t table
Test Statistic: t = r * sqrt(n - 2) / sqrt(1 - r^2),
where n is the number of pairs, and df = n - 2.
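The test can be sketched in code under two assumptions that the notes do not spell out: that the statistic has the usual form t = r * sqrt(n - 2) / sqrt(1 - r^2), and that the decision rule is two-sided (accept Ha when |t| exceeds the table value). The function names are hypothetical, and the critical value is passed in by hand from a t table rather than computed.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing linear dependence, with df = n - 2.

    Assumes the usual form t = r * sqrt(n - 2) / sqrt(1 - r^2).
    """
    if n < 3:
        raise ValueError("need at least 3 pairs so that df = n - 2 >= 1")
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def linear_dependence_detected(r, n, critical_value):
    """Two-sided decision rule (an assumption): accept Ha if |t| > critical value."""
    return abs(correlation_t_statistic(r, n)) > critical_value

# Example: n = 10 pairs, so df = 8; the t-table value for a
# two-sided test at alpha = 0.05 with df = 8 is 2.306.
T_CRITICAL_DF8 = 2.306
print(round(correlation_t_statistic(0.6, 10), 4))
print(linear_dependence_detected(0.6, 10, T_CRITICAL_DF8))
print(linear_dependence_detected(0.8, 10, T_CRITICAL_DF8))
```

With 10 pairs, r = 0.6 gives t of about 2.12, which falls short of 2.306, while r = 0.8 gives t of about 3.77, which exceeds it. The same value of r can therefore lead to different conclusions depending on the sample size.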
Conditions for using the test for correlation (p. 447)
At each value of x, the distribution of the
y values in the population is normal, with
standard deviation independent of x (see Figure
11.5, p. 447)
The sampled values of y are independent of one
another.