validate.ols {Design} | R Documentation |
The validate function, when used on an object created by ols, does
resampling validation of a multiple linear regression model, with or
without backward step-down variable deletion. It uses resampling to
estimate the optimism in various measures of predictive accuracy, which
include R^2, MSE (mean squared error with a denominator of n), and the
intercept and slope of an overall calibration a + b * (predicted y).
The "corrected" slope can be thought of as a shrinkage factor that takes
overfitting into account.
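The optimism correction described above can be sketched in base R. This is only an illustration of the general bootstrap idea (index on the resample minus index on the original data, averaged over resamples), assuming a plain lm() fit on simulated data; it is not the code used by validate.ols, and the helper name indexes and the simulated variables are hypothetical.

```r
# Illustrative sketch only: base R with lm(), not the code used by validate.ols.
set.seed(1)
n  <- 200
x1 <- runif(n)
x3 <- rnorm(n)
y  <- x1 + 0.5 * x3 + rnorm(n)
d  <- data.frame(y, x1, x3)

# Accuracy indexes for predictions 'pred' of observed outcomes 'obs'
# (hypothetical helper)
indexes <- function(pred, obs) {
  cal <- coef(lm(obs ~ pred))              # calibration: obs = a + b * pred
  c(R2        = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2),
    MSE       = mean((obs - pred)^2),      # denominator of n
    Intercept = unname(cal[1]),
    Slope     = unname(cal[2]))
}

full <- lm(y ~ x1 + x3, data = d)
orig <- indexes(fitted(full), d$y)         # apparent (original) indexes

B <- 40
opt <- replicate(B, {
  b     <- d[sample(n, replace = TRUE), ]  # bootstrap resample
  fitb  <- lm(y ~ x1 + x3, data = b)
  train <- indexes(fitted(fitb), b$y)                # resample model, resample data
  test  <- indexes(predict(fitb, newdata = d), d$y)  # resample model, original data
  train - test                                       # optimism in this resample
})

optimism  <- rowMeans(opt)                 # average optimism
corrected <- orig - optimism               # bias-corrected indexes
round(cbind(index.orig = orig, optimism, index.corrected = corrected), 4)
```

In this sketch the apparent calibration intercept and slope are 0 and 1 by construction, so the corrected slope equals the average slope obtained when resample-derived models are applied to the original data.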
validate.ols can also be used when a model for a continuous response
is going to be applied to a binary response. A Somers' D_{xy} for this
case is computed for each resample by dichotomizing y. This can be used
to obtain an ordinary receiver operating characteristic curve area using
the formula 0.5(D_{xy} + 1). The Nagelkerke-Maddala R^2 index for the
dichotomized y is also given.
See predab.resample for the list of resampling methods.
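The 0.5(D_{xy} + 1) conversion mentioned above can be checked with a small base R sketch. It assumes a plain lm() fit on simulated data and a hypothetical helper somers_dxy that counts concordant and discordant pairs directly; it is not the routine that validate.ols calls internally.

```r
# Illustrative sketch only; somers_dxy is a hypothetical helper.
# Somers' Dxy of predictions against a binary outcome:
# (concordant - discordant) / (number of event / non-event pairs).
somers_dxy <- function(pred, event) {
  p1 <- pred[event == 1]
  p0 <- pred[event == 0]
  diffs <- outer(p1, p0, "-")              # all event / non-event pairs
  (sum(diffs > 0) - sum(diffs < 0)) / length(diffs)
}

set.seed(1)
x    <- rnorm(100)
y    <- x + rnorm(100)
pred <- fitted(lm(y ~ x))                  # predictions for the continuous response

u     <- 0.5                               # cutoff (simulated example value)
event <- as.integer(y > u)                 # dichotomized y

dxy <- somers_dxy(pred, event)
auc <- 0.5 * (dxy + 1)                     # ROC area from Dxy
c(Dxy = dxy, AUC = auc)
```

Tied predictions count as neither concordant nor discordant here, which is what makes 0.5(D_{xy} + 1) agree with the usual rank-based ROC area.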
# fit <- fitting.function(formula=response ~ terms, x=TRUE, y=TRUE)

## S3 method for class 'ols':
validate(fit, method="boot", B=40,
         bw=FALSE, rule="aic", type="residual", sls=0.05, aics=0,
         pr=FALSE, u=NULL, rel=">", tolerance=1e-7, ...)
fit |
a fit derived by ols. The options x=TRUE and y=TRUE
must have been specified. See validate for a description of
the arguments method through pr.
|
method |
|
B |
|
bw |
|
rule |
|
type |
|
sls |
|
aics |
|
pr |
see validate and predab.resample |
u |
If specified, y is also dichotomized at the cutoff u for
the purpose of getting a bias-corrected estimate of D_{xy}.
|
rel |
relationship for dichotomizing predicted y. Defaults to
">" to use y>u. rel can also be "<",
">=", and "<=".
|
tolerance |
tolerance for singularity; passed to lm.fit.qr.
|
... |
other arguments to pass to predab.resample, such as group, cluster, and subset
|
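The u and rel arguments described above amount to applying a comparison operator at a cutoff. A minimal sketch of that dichotomization in base R follows; the helper name dichotomize is hypothetical and is not a function in Design.

```r
# Hypothetical helper for illustration only; not part of Design.
# Dichotomizes y at cutoff u using one of the four relationships
# that the rel argument accepts.
dichotomize <- function(y, u, rel = c(">", "<", ">=", "<=")) {
  rel <- match.arg(rel)                    # defaults to ">"
  as.integer(match.fun(rel)(y, u))         # e.g. y > u
}

y <- c(0.2, 0.5, 0.9)
dichotomize(y, u = 0.5)                    # 0 0 1
dichotomize(y, u = 0.5, rel = ">=")        # 0 1 1
dichotomize(y, u = 0.5, rel = "<")         # 1 0 0
```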
a matrix with rows corresponding to R-square, MSE, intercept, slope, and optionally D_{xy} and R^2 for the dichotomized y (when u is specified), and columns for the original index, resample estimates, indexes applied to the whole or omitted sample using the model derived from the resample, average optimism, corrected index, and number of successful resamples.
prints a summary, and optionally statistics for each re-fit
Frank Harrell
Department of Biostatistics, Vanderbilt University
f.harrell@vanderbilt.edu
ols, predab.resample, fastbw, Design, Design.trans, calibrate
set.seed(1)
x1 <- runif(200)
x2 <- sample(0:3, 200, TRUE)
x3 <- rnorm(200)
distance <- (x1 + x2/3 + rnorm(200))^2

f <- ols(sqrt(distance) ~ rcs(x1,4) + scored(x2) + x3, x=TRUE, y=TRUE)

#Validate full model fit (from all observations) but for x1 < .75
validate(f, B=20, subset=x1 < .75)   # normally B=150

#Validate stepwise model with typical (not so good) stopping rule
validate(f, B=20, bw=TRUE, rule="p", sls=.1, type="individual")