validate.cph {Design} | R Documentation |
This is the version of the validate function specific to models
fitted with cph or psm.
# fit <- cph(formula=Surv(ftime,event) ~ terms, x=TRUE, y=TRUE, ...)

## S3 method for class 'cph':
validate(fit, method="boot", B=40, bw=FALSE, rule="aic", type="residual",
         sls=.05, aics=0, pr=FALSE, dxy=FALSE, u, tol=1e-9, ...)

## S3 method for class 'psm':
validate(fit, method="boot", B=40, bw=FALSE, rule="aic", type="residual",
         sls=.05, aics=0, pr=FALSE, dxy=FALSE, tol=1e-12,
         rel.tolerance=1e-5, maxiter=15, ...)
fit |
a fit derived by cph. The options x=TRUE and y=TRUE
must have been specified. If the model contains any stratification factors
and dxy=TRUE, the options surv=TRUE and time.inc=u must also have been
given, where u is the same value of u given to validate.
|
method |
see validate |
B |
number of repetitions. For method="crossvalidation", B is the
number of groups of omitted observations.
|
rel.tolerance |
|
maxiter |
|
bw |
TRUE to do fast step-down using the fastbw function,
for both the overall model and for each repetition. fastbw
keeps parameters together that represent the same factor.
|
rule |
Applies if bw=TRUE. "aic" to use Akaike's information criterion as a
stopping rule (i.e., a factor is deleted if the chi-square falls below
twice its degrees of freedom), or "p" to use P-values.
|
type |
"residual" or "individual": the stopping rule is based on the residual
chi-square for all variables deleted, or on individual factors, respectively.
|
sls |
significance level for a factor to be kept in a model, or for judging the residual chi-square. |
aics |
cutoff on AIC when rule="aic".
|
pr |
TRUE to print results of each repetition
|
tol |
|
... |
see validate or predab.resample |
dxy |
set to TRUE to validate Somers' Dxy using rcorr.cens, which takes longer.
|
u |
must be specified if the model has any stratification factors and dxy=TRUE.
In that case, strata are not included in X beta and the
survival curves may cross. Predictions at time t=u are
correlated with observed survival times. Does not apply to validate.psm.
|
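The stopping criteria described for rule, sls, and aics above can be illustrated with a minimal base R sketch. The helper drop_factor is hypothetical and is not part of Design or fastbw. It assumes that the AIC of a factor is its chi-square minus twice its degrees of freedom, compared against aics, and that under rule="p" a factor is dropped when its P-value exceeds sls.

```r
# Hypothetical helper (not a Design function): would a factor with
# chi-square `chisq` on `df` degrees of freedom be deleted?
drop_factor <- function(chisq, df, rule = c("aic", "p"),
                        sls = 0.05, aics = 0) {
  rule <- match.arg(rule)
  if (rule == "aic") {
    # Assumed AIC scale: chi-square minus twice the degrees of freedom
    (chisq - 2 * df) < aics
  } else {
    # Upper-tail chi-square P-value compared with the significance level
    pchisq(chisq, df, lower.tail = FALSE) > sls
  }
}

dropped_aic <- drop_factor(3, df = 2)              # chi-square 3 vs. 2*df = 4
dropped_p   <- drop_factor(5, df = 1, rule = "p")  # P-value vs. sls = 0.05
```

Under these assumptions the first factor is deleted because its chi-square is below twice its degrees of freedom, while the second is kept because its P-value is below sls.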
Statistics validated include the Nagelkerke R^2,
Dxy, slope shrinkage, the discrimination index D
[(model L.R. chi-square - 1)/L], the unreliability index
U = (difference in -2 log likelihood between uncalibrated
X beta and X beta with overall slope calibrated to test sample) / L,
and the overall quality index Q = D - U.
L is -2 log likelihood with beta=0. The "corrected" slope
can be thought of as a shrinkage factor that takes overfitting into account.
See predab.resample for the list of resampling methods.
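As a worked illustration of the index definitions above, the following base R sketch computes D, U, and Q from a set of log likelihood quantities. All numeric values are made up for illustration, and the variable names are assumptions rather than names used inside Design.

```r
# Made-up inputs, for illustration only
L0        <- 1200     # -2 log likelihood with beta = 0 (L in the text)
lr.chisq  <- 85       # model L.R. chi-square
dev.uncal <- 1118.0   # -2 log likelihood, uncalibrated X beta, test sample
dev.cal   <- 1115.5   # -2 log likelihood after calibrating the overall slope

D <- (lr.chisq - 1) / L0          # discrimination index
U <- (dev.uncal - dev.cal) / L0   # unreliability index
Q <- D - U                        # overall quality index

round(c(D = D, U = U, Q = Q), 4)
```

A larger U means that slope calibration improves the fit on the test sample by more, which lowers the overall quality index Q.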
matrix with rows corresponding to Dxy, Slope, D,
U, and Q, and columns for the original index, resample estimates,
indexes applied to whole or omitted sample using model derived from
resample, average optimism, corrected index, and number of successful
resamples.
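The relationship between the columns can be sketched in base R as follows. This is a simplified illustration of the usual optimism correction with made-up index values for four resamples, not the code used by predab.resample; the variable names are assumptions.

```r
# Made-up values of one index (e.g. Dxy), for illustration only
index.orig <- 0.62                      # index of the original fit
training   <- c(0.64, 0.66, 0.63, 0.65) # index in each resample
test       <- c(0.60, 0.61, 0.59, 0.62) # resample model applied to the
                                        # whole or omitted sample

optimism        <- mean(training - test)   # average optimism
index.corrected <- index.orig - optimism   # corrected index
n               <- length(training)        # number of successful resamples

c(index.orig = index.orig, training = mean(training), test = mean(test),
  optimism = optimism, index.corrected = index.corrected, n = n)
```

In this sketch the corrected index is the original index minus the average amount by which the resample models look better on their own data than on the comparison sample.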
The values corresponding to the row Dxy are equal to 2 *
(C - 0.5) where C is the C-index or concordance probability.
If the user is correlating the linear predictor (predicted log hazard)
with survival time and she wishes to get the more usual correlation
using predicted survival time or predicted survival probability,
Dxy should be negated.
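The sign convention can be checked with a small base R sketch. The function concordance below is a hypothetical helper that handles uncensored data only; it is a simplification of what rcorr.cens does and ignores censoring entirely. The survival times and linear predictor values are made up.

```r
# Hypothetical helper: concordance probability C for uncensored data.
# A pair is concordant when the larger prediction goes with the larger time;
# ties in the prediction count as 1/2.
concordance <- function(pred, time) {
  idx <- combn(length(time), 2)            # all pairs of observations
  dp  <- pred[idx[1, ]] - pred[idx[2, ]]
  dt  <- time[idx[1, ]] - time[idx[2, ]]
  ok  <- dt != 0                           # pairs with distinct times
  mean((sign(dp[ok]) == sign(dt[ok])) + 0.5 * (dp[ok] == 0))
}

time <- c(2, 5, 7, 9, 12)                  # made-up survival times
lp   <- c(1.5, 0.2, 0.9, -0.3, -1.1)       # made-up predicted log hazards

dxy.hazard   <- 2 * (concordance(lp, time) - 0.5)   # log hazard vs. time
dxy.survival <- 2 * (concordance(-lp, time) - 0.5)  # reversed predictor
c(dxy.hazard = dxy.hazard, dxy.survival = dxy.survival)
```

Because a high log hazard goes with short survival, the first value is negative, and reversing the direction of the predictor gives the same magnitude with a positive sign.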
prints a summary, and optionally statistics for each re-fit (if pr=TRUE)
Frank Harrell
Department of Biostatistics, Vanderbilt University
f.harrell@vanderbilt.edu
validate, predab.resample, fastbw, Design, Design.trans, calibrate,
rcorr.cens, cph, coxph.fit
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, TRUE))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
S <- Surv(dt,e)

f <- cph(S ~ age*sex, x=TRUE, y=TRUE)
# Validate full model fit
validate(f, B=10)               # normally B=150

# Validate a model with stratification.  Dxy is the only
# discrimination measure for such models, but Dxy requires
# one to choose a single time at which to predict S(t|X)
f <- cph(S ~ rcs(age)*strat(sex),
         x=TRUE, y=TRUE, surv=TRUE, time.inc=2)
validate(f, dxy=TRUE, u=2, B=10)   # normally B=150
# Note u=time.inc