Project 2 - due 4/20

You may work on this project with one other student and, in fact, are encouraged to do so. If you choose to do so,

Your report should be typed. You should respond to all questions using complete sentences, proper spelling, good grammar, etc.

In general, answers alone are not sufficient -- you must show appropriate work and reasoning. You may use available technology, whether your calculators, MATLAB, Mathematica, or any other. Work turned in should be neat, carefully written, and complete.

In this project you will be studying a topic of ongoing interest: linear regression. It is linear regression that I described at the beginning of the course, when I needed to buy a car and wanted to find a good buy among many options.

You'll want to read section III.VI (projection) and check out the topic "Line of Best Fit"; they might be helpful.

  1. Your job will also be to find the best buy among many options. You will identify a product (not cars) with several (perhaps even many) quantifiable features (for cars, those might be year, mpg, horsepower, etc.). You will collect data on various models or types of your product (as many as possible), as well as price.

    You will build a linear model to identify the "best buy". This linear model will be of the form

    $$A\mathbf{c} = \mathbf{p}$$

    where $A$ is the matrix of values of quantifiable features (one row per product), and $\mathbf{c}$ is the vector of coefficients that weights those features to produce the price vector, $\mathbf{p}$.

    Suppose that there are 3 features, which we might call $f_1$, $f_2$, and $f_3$. Then we're looking at a model where, for product $i$,

    $$p_i = c_0 + c_1 f_{i1} + c_2 f_{i2} + c_3 f_{i3}$$

    Notice that we've added a fourth coefficient $c_0$, a constant "intercept" coefficient, which multiplies the "feature" identically equal to 1 -- that is, $f_{i0} = 1$ for every product $i$.

    Construct the feature matrix $A$ and the price vector $\mathbf{p}$ for your product. You should have a dozen or so products, and three or so features.

  2. Now, with so many rows (product samples) and so few columns (quantifiable features), it's highly unlikely that the price vector $\mathbf{p}$ actually resides in the column space of the feature matrix $A$: that is, you're unlikely to find a coefficient vector $\mathbf{c}$ that solves the system.

    Attempt to solve the system $A\mathbf{c} = \mathbf{p}$.

  3. This is usually handled by projecting the vector $\mathbf{p}$ onto the column space of $A$.

    Solve

    $$A^T A \mathbf{c} = A^T \mathbf{p}.$$

  4. Let $\hat{\mathbf{c}}$ be the solution of

    $$A^T A \mathbf{c} = A^T \mathbf{p}.$$

    Then let

    $$\hat{\mathbf{p}} = A \hat{\mathbf{c}}.$$

    This is the estimated price, based on our best model. By comparing the estimated price to the actual price, develop a criterion for determining the best value among your products. Justify your answer.

  5. Critique this process in the case of your product.
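
The steps above can be sketched on a tiny example. What follows is only an illustration, not the required method: the three "products" and their prices are made-up toy numbers (one feature each, arbitrary units), the helper names `design_matrix`, `solve`, and `least_squares` are hypothetical, and exact fractions are used only so the arithmetic is easy to check by hand. With real data you would more likely use MATLAB, Mathematica, or similar.

```python
from fractions import Fraction

# Hypothetical toy data (NOT real products): one feature per product.
features = [[1], [2], [3]]
prices = [1, 2, 2]


def design_matrix(rows):
    """Build A: prepend the constant feature 1 (for the intercept) to each row."""
    return [[Fraction(1)] + [Fraction(v) for v in row] for row in rows]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def solve(M, rhs):
    """Solve the square system M z = rhs by Gauss-Jordan elimination.

    Raises ValueError if M is singular (linearly dependent feature columns).
    """
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("A^T A is singular; features are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def least_squares(A, p):
    """Return the solution of the normal equations A^T A c = A^T p."""
    At = transpose(A)
    AtA = matmul(At, A)
    Atp = [sum(a * Fraction(b) for a, b in zip(row, p)) for row in At]
    return solve(AtA, Atp)


A = design_matrix(features)
c_hat = least_squares(A, prices)

# Estimated prices p_hat = A c_hat, and actual minus estimated price.
p_hat = [sum(a * c for a, c in zip(row, c_hat)) for row in A]
residuals = [Fraction(p) - ph for p, ph in zip(prices, p_hat)]

print("c_hat     =", [str(c) for c in c_hat])
print("p_hat     =", [str(v) for v in p_hat])
print("residuals =", [str(r) for r in residuals])
```

For this toy data the coefficients come out to 2/3 (intercept) and 1/2, and the residuals (actual minus estimated price) are -1/6, 1/3, and -1/6. Because the residuals are not all zero, the original system has no exact solution for this data, which is the situation described in step 2. How to turn the comparison of actual and estimated prices into a "best value" criterion is left to you, as is the critique.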


Website maintained by Andy Long. Comments appreciated.