ppr(formula, data=sys.parent(), weights, subset, na.action, contrasts=NULL, ww=rep(1,q), nterms, max.terms=nterms, optlevel=2, sm.method=c("supsmu", "spline", "gcvspline"), bass=0, span=0, df=5, gcvpen=1) ppr(x, y, weights=rep(1,n), ww=...... (as above) ...)
formula
| a regression formula specifying one or more response variables and the explanatory variables. |
x
| matrix of explanatory variables. Rows represent observations, and columns represent variables. Missing values are not accepted. |
nterms
| number of terms to include in the final model. |
data
|
Data frame from which variables specified in formula are
preferentially to be taken.
|
weights
| a vector of weights for each case. |
ww
|
a vector of weights for each response, so the fit criterion is
the sum over case i and responses j of
w_i ww_j (y_ij - fit_ij)^2 divided by the sum of w_i .
|
subset
| An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.) |
na.action
|
A function to specify the action to be taken if NA s are
found. The
default action is for the procedure to fail. An alternative is
na.omit , which leads to rejection of cases with missing values on
any required variable. (NOTE: If given, this argument must be named.)
|
contrasts
| the contrasts to be used when any factor explanatory variables are coded. |
max.terms
| maximum number of terms to choose from when building the model. |
optlevel
| integer from 0 to 3 which determines the throughness of an optimization routine in the SMART program. See the Details section. |
sm.method
|
the method used for smoothing the ridge functions. The default is to
use Friedman's super smoother supsmu . The alternatives are to use
the smoothing spline code underlying smooth.spline , either with a
specified (equivalent) degrees of freedom for each ridge functions, or
to allow the smoothness to be chosen by GCV.
|
bass
|
super smoother bass tone control used with automatic span selection
(see supsmu ); the range of values is 0 to 10, with larger values
resulting in increased smoothing.
|
span
|
super smoother span control (see supsmu ). The default, 0 ,
results in automatic span selection by local cross validation. span
can also take a value in (0, 1] .
|
df
|
if sm.method is "spline" specifies the smoothness of each ridge
term via the requested equivalent degrees of freedom.
|
gcvpen
|
if sm.method is "gcvspline" this is the penalty used in the GCV
selection for each degree of freedom used.
|
ppreg
. The answers will be very similar on a given
machine, but this code is extremely sensitive to the compiler used.
The differences are the ability to use spline smoothers and the
interface which should be much easier to use.max.terms
ridge terms one at a
time; it will use less if it is unable to find a term to add that makes
sufficient difference. It then removes the least "important"
term at each step until nterm
terms are left.
The levels of optimization (argument optlevel
)
differ in how thoroughly the models are refitted during this process.
At level 0 the existing ridge terms are not refitted. At level 1
the projection directions are not refitted, but the ridge
functions and the regression coefficients are.
Levels 2 and 3 refit all the terms and are equivalent for one
response; level 3 is more careful to re-balance the contributions
from each regressor at each step and so is a little less likely to
converge to a saddle point of the sum of squares criterion.
call
| the matched call |
p
| the number of explanatory variables (after any coding) |
q
| the number of response variables |
ml
|
the argument max.terms
|
gof
| the overall residual (weighted) sum of squares for the selected model |
gofn
|
the overall residual (weighted) sum of squares against the number of
terms, up to max.terms . Will be invalid (and zero) for less than
nterms .
|
df
|
the argument df
|
edf
|
if sm.method is "spline" or "gcvspline" the equivalent number of
degrees of freedom for each ridge term used.
|
xnames
| the names of the explanatory variables |
ynames
| the names of the response variables |
alpha
| a matrix of the projection directions, with a column for each ridge term |
beta
| a matrix of the coefficients applied for each response to the ridge terms: the rows are the responses and the columns the ridge terms |
yb
| the weighted means of each response |
ys
|
the overall scale factor used: internally the responses are divided by
ys to have unit total weighted sum of squares.
|
fitted.values
|
the fitted values, as a matrix if q > 1
|
residuals
|
the residuals, as a matrix if q > 1
|
smod
| internal work array, which includes the ridge functions evaluated at the training set points. |
Friedman, J. H. (1984) SMART User's Guide. Laboratory for Computational Statistics, Stanford University Technical Report No. 1.
plot.ppr
, ppreg
, supsmu
,
smooth.spline
# Note: your numerical values may differ data(rock) attach(rock) area1 <- area/10000; peri1 <- peri/10000 rock.ppr <- ppr(log(perm) ~ area1 + peri1 + shape, data=rock, nterms=2, max.terms=5) rock.ppr # Call: # ppr.formula(formula = log(perm) ~ area1 + peri1 + shape, data = rock, # nterms = 2, max.terms = 5) # # Goodness of fit: # 2 terms 3 terms 4 terms 5 terms # 8.737806 5.289517 4.745799 4.490378 summary(rock.ppr) # ..... (same as above) # ..... # # Projection direction vectors: # term 1 term 2 # area1 0.34357179 0.37071027 # peri1 -0.93781471 -0.61923542 # shape 0.04961846 0.69218595 # # Coefficients of ridge terms: # term 1 term 2 # 1.6079271 0.5460971 par(mfrow=c(3,2))# maybe: , pty="s") plot(rock.ppr, main="ppr(log(perm)~ ., nterms=2, max.terms=5)") plot(update(rock.ppr, bass=5), main = "update(..., bass = 5)") plot(update(rock.ppr, sm.method="gcv", gcvpen=2), main = "update(..., sm.method=\"gcv\", gcvpen=2)") # Note: your numerical values may differ data(rock) attach(rock) area1 <- area/10000; peri1 <- peri/10000 rock.ppr <- ppr(log(perm) ~ area1 + peri1 + shape, data=rock, nterms=2, max.terms=5) rock.ppr # Call: # ppr.formula(formula = log(perm) ~ area1 + peri1 + shape, data = rock, # nterms = 2, max.terms = 5) # # Goodness of fit: # 2 terms 3 terms 4 terms 5 terms # 8.737806 5.289517 4.745799 4.490378 summary(rock.ppr) # ..... (same as above) # ..... # # Projection direction vectors: # term 1 term 2 # area1 0.34357179 0.37071027 # peri1 -0.93781471 -0.61923542 # shape 0.04961846 0.69218595 # # Coefficients of ridge terms: # term 1 term 2 # 1.6079271 0.5460971 par(mfrow=c(3,2))# maybe: , pty="s") plot(rock.ppr, main="ppr(log(perm)~ ., nterms=2, max.terms=5)") plot(update(rock.ppr, bass=5), main = "update(..., bass = 5)") plot(update(rock.ppr, sm.method="gcv", gcvpen=2), main = "update(..., sm.method=\"gcv\", gcvpen=2)")