Here's another example: A modeling misadventure with corn seedlings.
I asked you to read this because straight lines are not enough: we need a variety of functions, for a variety of situations. I thought that Ben's comment on the first page was funny: "If you begin to feel bogged down you can skip ahead and use the section for reference as needed." (I can see why one could get bogged down!)
If you did get bogged down, then I hope that you at least had a look at the pictures, on pages 123, 124, 128, 132!
So yesterday, my January, 2018 edition of Science appeared, and I've flagged all the linear regressions in the Research Section (in 77 pages, there are ten examples!).
Our first answer: We're not sure! (And that's okay! But do we have enough to continue, or do we bail out?)
It appears that there's more going on for the minima than for the maxima; but there does seem to be something going on. That is, we've identified a pattern, consistent with "leaning yes", at least in some of the cities.
That means we can proceed to phase II: modeling.
We suggested that we might need to consider other things, like
(Permit me to explain why I would put it in that way....)
Let's suppose that there were no change in temperature across the years. Then we might expect temperatures to vary erratically, randomly over the course of time about the mean min or max. So we would expect our scatter to flit about the means, call them $Max$ and $min$, for a particular town.
That model, which we might call the "null model", would consist of the constant functions \[ y^M(t)=Max \] and \[ y^m(t)=min \] (where $y^M$ indicates we're talking about maxima, and $y^m$ indicates minima), and the data would include some random fluctuations (which we might characterize as "noise"): \[ y^M_i=Max+\epsilon_i \] or \[ y^m_i=min+\tau_i \]
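To see what data from the null model might look like, here is a minimal simulation sketch. It assumes Gaussian noise, and the level `Max = 30.0`, the noise size, and the helper name `simulate_null_model` are all made up for illustration; none of them come from the temperature data.

```python
import random

def simulate_null_model(level, noise_sd, n_years, seed=0):
    """Return n_years of synthetic values y_i = level + eps_i (no trend).

    Assumption: eps_i is Gaussian with mean 0 and standard deviation noise_sd.
    """
    rng = random.Random(seed)
    return [level + rng.gauss(0.0, noise_sd) for _ in range(n_years)]

# Hypothetical numbers, for illustration only (not real station data).
Max = 30.0
y = simulate_null_model(Max, noise_sd=1.5, n_years=100)

# Under the null model, the sample mean is the least-squares estimate
# of the constant level.
estimate = sum(y) / len(y)
print(round(estimate, 2))
```

The simulated values flit about `Max` with no trend, which is the pattern we would expect to see in the scatter if nothing were changing.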
The simplest alternative to the null model allows a steady trend: $y(t)=a+bt$. This function is linear in $t$, and has as its graph a straight line.
Patrick has moved on from the Planning part of UPCE, to the "Carrying Out" of the plan, and even to the Evaluation part -- I heard him assert that the upward temperature trend was significant.
So he has carried out UPCE -- but "E" should not be thought of as "End"! We iterate -- do UPCE again -- using what we learned in the first phase.
In ordinary least squares regression, our objective is to compute $a$ and $b$ so as to minimize the sum of squared residuals: \[ S(a,b)=\sum_{i}(y_i-y(t_i))^2=\sum_{i}(y_i-(a+bt_i))^2 \]
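The minimizing $a$ and $b$ have a standard closed form: $b=\sum_i (t_i-\bar t)(y_i-\bar y)\,/\,\sum_i (t_i-\bar t)^2$ and $a=\bar y-b\,\bar t$. The sketch below computes them in plain Python. The function names and the five data points are invented for illustration, and in practice one would probably reach for a statistics package instead.

```python
def ols_line(t, y):
    """Return (a, b) minimizing S(a, b) = sum_i (y_i - (a + b*t_i))**2."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    s_ty = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    b = s_ty / s_tt          # slope
    a = y_bar - b * t_bar    # intercept: the line passes through (t_bar, y_bar)
    return a, b

def sum_squared_residuals(a, b, t, y):
    """Evaluate S(a, b) for the line y = a + b*t."""
    return sum((yi - (a + b * ti)) ** 2 for ti, yi in zip(t, y))

# Made-up data lying exactly on y = 2 + 0.5*t, so the fit should
# recover a = 2 and b = 0.5 with S essentially zero.
t = [0, 1, 2, 3, 4]
y = [2.0, 2.5, 3.0, 3.5, 4.0]
a, b = ols_line(t, y)
S = sum_squared_residuals(a, b, t, y)
print(a, b, S)
```

Any other choice of $a$ and $b$ gives a larger $S$ on the same data, which is easy to check by nudging either parameter and re-evaluating `sum_squared_residuals`.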
But today I want to show how our notion of "linear models" can be extended to more interesting cases. Consider atmospheric CO2 (the Keeling data from Mauna Loa) in more detail than Stewart showed (he was looking at yearly means):
On page 120, he calls $f(x)=a+bx+cx^2$ the "simplest nonlinear model". In what sense is $f$ nonlinear? In another sense it is linear, however! It is linear in the parameters, $a$, $b$, and $c$!
So it turns out that we can use linear regression to find the best parameters of this "nonlinear model". So we'll also be doing that next week.
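Here is a rough sketch of why "linear in the parameters" is what matters. Minimizing the sum of squared residuals for $f(x)=a+bx+cx^2$ leads to a system of three linear equations in $a$, $b$, and $c$ (the normal equations), which can be solved by ordinary elimination. The helper names and the test data are made up for illustration, and this is one way to carry out the fit, not necessarily the one we'll use next week.

```python
def solve_linear_system(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot, for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def fit_quadratic(x, y):
    """Least-squares fit of f(x) = a + b*x + c*x**2; returns [a, b, c]."""
    # Each row holds the basis functions 1, x, x^2 evaluated at one data point.
    X = [[1.0, xi, xi ** 2] for xi in x]
    # Normal equations: (X^T X) [a, b, c]^T = X^T y, linear in a, b, c.
    XtX = [[sum(row[j] * row[k] for row in X) for k in range(3)] for j in range(3)]
    Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(3)]
    return solve_linear_system(XtX, Xty)

# Made-up data lying exactly on f(x) = 1 + 2x + 3x^2.
x = [-2, -1, 0, 1, 2]
y = [1 + 2 * xi + 3 * xi ** 2 for xi in x]
a, b, c = fit_quadratic(x, y)
print(a, b, c)
```

The only place $x^2$ enters is in building the columns of the design matrix; once those columns are computed, the fitting step is the same linear machinery as for the straight line.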