sensuc {Design} | R Documentation |
Performs an analysis of the sensitivity of a binary treatment (X) effect to an
unmeasured binary confounder (U) for a fitted binary logistic or an
unstratified non-time-dependent Cox survival model (the function works
well for the former, not so well for the latter). This is done by
fitting a sequence of models with separately created U variables
added to the original model. The sequence of models is formed
by simultaneously varying a and b,
where a measures the association between U and X and b
measures the association between U and Y, where Y is the outcome
of interest. For Cox models, an
approximate solution is used by letting Y represent some binary
classification of the event/censoring time and the event indicator.
For example, Y could be just be the event indicator, ignoring time of
the event or censoring, or it could be 1 if a subject failed before
one year and 0 otherwise.
When for each combination of a
and b the vector of binary values U is generated, one of two
methods is used to constrain the properties of U. With either
method, the overall prevalence of U is constrained to be prev.u
.
With the default
method (or.method="x:u y:u"
), U is sampled so that the X:U odds
ratio is a and the Y:U odds ratio is b. With the second method,
U is sampled according to the model
logit(U=1 | X, Y) = α + β*Y + gamma*X, where
β=log(b) and gamma=log(a) and α is
determined so that the prevalence of U=1 is prev.u
. This
second method results in the adjusted odds ratio for Y:U given
X being b whereas the default method forces the
unconditional (marginal) Y:U odds ratio to be b. Rosenbaum
uses the default method.
There is a plot
method for plotting objects created by sensuc
.
Values of a are placed on the x-axis and observed marginal odds or
hazards ratios for U (unadjusted ratios) appear on the y-axis. For
Cox models, the hazard ratios will not agree exactly with X:event
indicator odds ratios but they sometimes be made close through judicious choice
of the event
function. The default plot
uses four symbols which differentiate whether for the a,b
combination the effect of X adjusted for U (and for any other
covariables that were in the original model fit) is positive
(usually meaning an effect ratio greater than 1) and "significant",
merely positive, not positive and non significant, or not positive but
significant. There is also an
option to draw the numeric value
of the X effect ratio at the a,b combination along
with its Z statistic underneath in smaller letters, and an option
to draw the effect ratio in one of four colors depending on the
significance of the Z statistic.
# fit <- lrm(formula=y ~ x + other.predictors, x=TRUE, y=TRUE) #or # fit <- cph(formula=Surv(event.time,event.indicator) ~ x + other.predictors, # x=TRUE, y=TRUE) sensuc(fit, or.xu=seq(1, 6, by = 0.5), or.u=or.xu, prev.u=0.5, constrain.binary.sample=TRUE, or.method=c("x:u y:u","u|x,y"), event=function(y) if(is.matrix(y))y[,ncol(y)] else 1*y) ## S3 method for class 'sensuc': plot(x, ylim=c((1+trunc(min(x$effect.u)-.01))/ ifelse(type=='numbers',2,1), 1+trunc(max(x$effect.u)-.01)), xlab='Odds Ratio for X:U', ylab=if(x$type=='lrm')'Odds Ratio for Y:U' else 'Hazard Ratio for Y:U', digits=2, cex.effect=.75, cex.z=.6*cex.effect, delta=diff(par('usr')[3:4])/40, type=c('symbols','numbers','colors'), pch=c(15,18,5,0), col=c(2,3,1,4), alpha=.05, impressive.effect=function(x)x > 1,...)
fit |
result of lrm or cph with x=TRUE, y=TRUE . The
first variable in the right hand side of the model formula must have
been the binary X variable, and it may not interact with other
predictors.
|
x |
result of sensuc
|
or.xu |
vector of possible odds ratios measuring the X:U association. |
or.u |
vector of possible odds ratios measuring the Y:U association.
Default is or.xu .
|
prev.u |
desired prevalence of U=1. Default is 0.5, which is usually a "worst case" for sensitivity analyses. |
constrain.binary.sample |
By default, the binary U values are sampled from the appropriate
distributions conditional on Y and X so that the proportions of
U=1 in each sample are exactly the desired probabilities, to within
the closeness of ntimesprobability to an integer. Specify
constrain.binary.sample=FALSE to sample from ordinary Bernoulli
distributions, to allow proportions of U=1 to reflect sampling fluctuations.
|
or.method |
see above |
event |
a function classifying the response variable into a binary event for the
purposes of constraining the association between U and Y.
For binary logistic models, event is left at its default value, which
is the identify function, i.e, the original Y values are taken as the
events (no other choice makes any sense here). For Cox models, the
default event function takes the last column of the Surv object
stored with the fit. For rare events (high proportion of censored
observations), odds ratios approximate hazard ratios, so the default is OK.
For other cases, the survival times should be considered (probably in
conjunction with the event indicators), although it may not be possible
to get a high enough hazard ratio between U and Y by sampling U by
temporarily making Y binary. See the last example which is
for a 2-column Surv object (first column of response variable=event time,
second=event indicator). When
dichotomizing survival time at a given point, it is advantageous to choose
the cutpoint so that not many censored survival times preceed the cutpoint.
Note that in fitting Cox models to examine sensitivity to U, the original
non-dichotomized failure times are used.
|
ylim |
y-axis limits for plot
|
xlab |
x-axis label |
ylab |
y-axis label |
digits |
number of digits to the right of the decimal point for drawing numbers
on the plot, for
type="numbers" or type="colors" .
|
cex.effect |
character size for drawing effect ratios |
cex.z |
character size for drawing Z statistics |
delta |
decrement in y value used to draw Z values below effect ratios |
type |
specify "symbols" (the default), "numbers" , or "colors" (see above)
|
pch |
4 plotting characters corresponding to positive and significant effects for X, positive and non-significant effects, not positive and not significant, not positive but significant |
col |
4 colors as for pch
|
alpha |
significance level |
impressive.effect |
a function of the odds or hazard ratio for X returning TRUE for a
positive effect. By default, a positive effect is taken to mean a
ratio exceeding one.
|
... |
optional arguments passed to plot
|
sensuc
returns an object of class "sensuc"
with the following elements: OR.xu
(vector of desired X:U odds ratios or a values), OOR.xu
(observed marginal X:U odds ratios), OR.u
(desired Y:U odds
ratios or b values), effect.x
(adjusted odds or hazards ratio for
X in a model adjusted for U and all of the other predictors),
effect.u
(unadjusted Y:U odds or hazards ratios), effect.u.adj
(adjusted Y:U odds or hazards ratios), Z (Z-statistics), prev.u
(input to sensuc
), cond.prev.u
(matrix with one row per a,b
combination, specifying prevalences of U conditional on Y and X
combinations), and type
("lrm"
or "cph"
).
Frank Harrell
Mark Conaway
Department of Biostatistics
Vanderbilt University School of Medicine
f.harrell@vanderbilt.edu, mconaway@virginia.edu
Rosenbaum, Paul R (1995): Observational Studies. New York: Springer-Verlag.
Rosenbaum P, Rubin D (1983): Assessing sensitivity to an unobserved binary covariate in an observational study with binary outcome. J Roy Statist Soc B 45:212–218.
set.seed(17) x <- sample(0:1, 500,TRUE) y <- sample(0:1, 500,TRUE) y[1:100] <- x[1:100] # induce an association between x and y x2 <- rnorm(500) f <- lrm(y ~ x + x2, x=TRUE, y=TRUE) #Note: in absence of U odds ratio for x is exp(2nd coefficient) g <- sensuc(f, c(1,3)) # Note: If the generated sample of U was typical, the odds ratio for # x dropped had U been known, where U had an odds ratio # with x of 3 and an odds ratio with y of 3 plot(g) # Fit a Cox model and check sensitivity to an unmeasured confounder # f <- cph(Surv(d.time,death) ~ treatment + pol(age,2)*sex, x=TRUE, y=TRUE) # sensuc(f, event=function(y) y[,2] & y[,1] < 365.25 ) # Event = failed, with event time before 1 year # Note: Analysis uses f$y which is a 2-column Surv object