calibrate {Design} | R Documentation |
Uses bootstrapping or cross-validation to get bias-corrected (overfitting-
corrected) estimates of predicted vs. observed values based on
subsetting predictions into intervals (for survival models) or on
nonparametric smoothers (for other models). There are calibration
functions for Cox (cph
), parametric survival models (psm
),
binary and ordinal logistic models (lrm
) and ordinary least squares (ols
).
For survival models,
"predicted" means predicted survival probability at a single
time point, and "observed" refers to the corresponding Kaplan-Meier
survival estimate, stratifying on intervals of predicted survival.
For logistic and linear models, a nonparametric calibration curve is
estimated over a sequence of predicted values.
The fit must have specified x=TRUE, y=TRUE
. The print
and plot
methods
for lrm
and ols
models (which use calibrate.default
) print the mean
absolute error in predictions, the mean squared error, and the 0.9 quantile
of the absolute error. Here, error refers to the difference between the
predicted values and the corresponding bias-corrected calibrated values.
Below, the second, third, and fourth invocations of calibrate
are, respectively, for ols
and lrm
, cph
, and
psm
. The first and second plot
invocation are
respectively for lrm
and ols
fits or all other fits.
calibrate(fit, ...) ## Default S3 method: calibrate(fit, predy, method=c("boot","crossvalidation",".632","randomization"), B=40, bw=FALSE, rule=c("aic","p"), type=c("residual","individual"), sls=.05, pr=FALSE, kint, smoother="lowess", ...) ## S3 method for class 'cph': calibrate(fit, method="boot", u, m=150, cuts, B=40, bw=FALSE, rule="aic", type="residual", sls=0.05, aics=0, pr=FALSE, what="observed-predicted", tol=1e-12, ...) ## S3 method for class 'psm': calibrate(fit, method="boot", u, m=150, cuts, B=40, bw=FALSE,rule="aic", type="residual",sls=.05,aics=0, pr=FALSE,what="observed-predicted",tol=1e-12, maxiter=15, rel.tolerance=1e-5, ...) ## S3 method for class 'calibrate': print(x, ...) ## S3 method for class 'calibrate.default': print(x, ...) ## S3 method for class 'calibrate': plot(x, xlab, ylab, subtitles=TRUE, conf.int=TRUE, ...) ## S3 method for class 'calibrate.default': plot(x, xlab, ylab, xlim, ylim, legend=TRUE, subtitles=TRUE, ...)
fit |
a fit from ols , lrm , cph or psm
|
x |
an object created by calibrate |
method |
|
B |
|
bw |
|
rule |
|
type |
|
sls |
|
aics |
see validate |
u |
the time point for which to validate predictions for survival models. For cph fits,
you must have specified surv=TRUE, time.inc=u , where u is
the constant specifying the time to predict.
|
m |
group predicted u -time units survival into intervals containing
m subjects on the average (for survival models only)
|
cuts |
actual cut points for predicted survival probabilities. You may
specify only one of m and cuts (for survival models only)
|
pr |
set to TRUE to print intermediate results for each re-sample
|
what |
The default is "observed-predicted" , meaning to estimate optimism
in this difference. This is preferred as it accounts for skewed
distributions of predicted probabilities in outer intervals. You can
also specify "observed" . This argument applies to survival models only.
|
tol |
criterion for matrix singularity (default is 1e-12 ) |
maxiter |
for psm , this is passed to
survreg.control (default is 15 iterations)
|
rel.tolerance |
parameter passed to
survreg.control for psm (default is 1e-5).
|
predy |
a scalar or vector of predicted values to calibrate (for lrm ,
ols ). Default is 50 equally spaced points between the 5th
smallest and the 5th largest predicted values. For lrm the
predicted values are probabilities (see kint ).
|
kint |
For an ordinal logistic model the default predicted
probability that Y>=q the middle level. Specify kint to specify the
intercept to use, e.g., kint=2 means to calibrate Prob(Y>=q
b), where b is the second level of Y.
|
smoother |
a function in two variables which produces x- and
y-coordinates by smoothing the input y . The default is to
use lowess(x, y, iter=0) .
|
... |
other arguments to pass to predab.resample , such as group ,
cluster , and subset .
Also, other arguments for plot .
|
xlab |
defaults to "Predicted x-units Survival" or to a suitable label for other models |
ylab |
defaults to "Fraction Surviving x-units" or to a suitable label for other models |
xlim |
|
ylim |
2-vectors specifying x- and y-axis limits, if not using defaults |
subtitles |
set to FALSE to suppress subtitles in plot describing method and for lrm
and ols the mean absolute error and original sample size
|
conf.int |
set to FALSE to suppress plotting 0.95 confidence intervals for
Kaplan-Meier estimates
|
legend |
set to FALSE to suppress legends (for lrm , ols
only) on the calibration plot, or specify a list with elements x
and y containing the coordinates of the upper left corner of the
legend. By default, a legend will be drawn in the lower right 1/16th of
the plot.
|
If the fit was created using penalized maximum likelihood estimation,
the same penalty
and penalty.scale
parameters are used during
validation.
matrix specifying mean predicted survival in each interval, the
corresponding estimated bias-corrected Kaplan-Meier estimates,
number of subjects, and other statistics. For linear and logistic models,
the matrix instead has rows corresponding to the prediction points, and
the vector of predicted values being validated is returned as an attribute.
The returned object has class "calibrate"
or "calibrate.default"
.
prints, and stores an object pred.obs
or .orig.cal
Frank Harrell
Department of Biostatistics
Vanderbilt University
f.harrell@vanderbilt.edu
validate
, predab.resample
, groupkm
, errbar
, cph
, psm
,
lowess
set.seed(1) d.time <- rexp(200) x1 <- runif(200) x2 <- factor(sample(c('a','b','c'),200,TRUE)) f <- cph(Surv(d.time) ~ pol(x1,2)*x2, x=TRUE, y=TRUE, surv=TRUE, time.inc=2) #or f <- psm(S ~ ...) cal <- calibrate(f, u=2, m=50, B=20) # usually B=200 or 300 plot(cal) y <- sample(0:2, 200, TRUE) x1 <- runif(200) x2 <- runif(200) x3 <- runif(200) x4 <- runif(200) f <- lrm(y ~ x1+x2+x3*x4, x=TRUE, y=TRUE) cal <- calibrate(f, kint=2, predy=seq(.2,.8,length=60), group=y) # group= does k-sample validation: make resamples have same # numbers of subjects in each level of y as original sample plot(cal) #See the example for the validate function for a method of validating #continuation ratio ordinal logistic models. You can do the same #thing for calibrate