family {stats} | R Documentation |
Family objects provide a convenient way to specify the details of the
models used by functions such as glm. See the documentation for glm
for the details on how such model fitting takes place.
family(object, ...)

binomial(link = "logit")
gaussian(link = "identity")
Gamma(link = "inverse")
inverse.gaussian(link = "1/mu^2")
poisson(link = "log")
quasi(link = "identity", variance = "constant")
quasibinomial(link = "logit")
quasipoisson(link = "log")
link |
a specification for the model link function. This can be
a name/expression, a literal character string, a length-one character
vector or an object of class "link-glm" (provided it is not specified
via one of the standard names given next).
The gaussian family accepts the links "identity", "log" and "inverse";
the binomial family the links "logit", "probit", "cauchit"
(corresponding to logistic, normal and Cauchy CDFs respectively),
"log" and "cloglog" (complementary log-log);
the Gamma family the links "inverse", "identity" and "log";
the poisson family the links "log", "identity" and "sqrt";
and the inverse.gaussian family the links "1/mu^2", "inverse",
"identity" and "log".
The quasi family accepts the links "logit", "probit", "cloglog",
"identity", "inverse", "log", "1/mu^2" and "sqrt", and
the function power can be used to create a power link function.
|
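As a brief illustration of the link specifications described above, the following sketch builds a few family objects with non-default links. The object names (fam_probit, fam_sqrt, fam_pow) are made up for this example, and the cube-root power link is just one arbitrary choice.

```r
# Links given as quoted character strings, taken from the lists above
fam_probit <- binomial(link = "probit")
fam_sqrt   <- poisson(link = "sqrt")

# power() returns an object of class "link-glm"; here a cube-root link
# passed to quasi (the exponent 1/3 is an arbitrary illustration)
fam_pow <- quasi(link = power(1/3), variance = "mu")

fam_probit$link        # name of the link in use
fam_sqrt$linkinv(3)    # inverse of the sqrt link: eta^2
fam_pow$linkfun(8)     # cube root of the mean
```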
variance |
for all families other than quasi, the variance
function is determined by the family. The quasi family will
accept the literal character string (or unquoted as a name/expression)
specifications "constant", "mu(1-mu)", "mu", "mu^2" and "mu^3",
a length-one character vector taking one of those values, or a list
containing components varfun, validmu, dev.resids, initialize
and name.
|
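A small sketch of how the variance argument changes a quasi family; the names q_const, q_mu2 and mu are chosen only for this illustration.

```r
# Two quasi families that differ only in link and variance specification
q_const <- quasi(link = "identity", variance = "constant")
q_mu2   <- quasi(link = "log", variance = "mu^2")

mu <- c(0.5, 1, 2)
q_const$variance(mu)   # constant variance function: all ones
q_mu2$variance(mu)     # variance proportional to the squared mean
```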
object |
the function family accesses the family
objects which are stored within objects created by modelling
functions (e.g., glm). |
... |
further arguments passed to methods. |
family is a generic function with methods for classes "glm" and "lm"
(the latter returning gaussian()).
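For instance, the generic can be applied to fitted models to recover the family in use. This sketch uses the cars data set shipped with R; the model formulas are arbitrary and serve only to produce one "lm" and one "glm" object.

```r
# One linear model and one Poisson GLM on the built-in 'cars' data
fit_lm  <- lm(dist ~ speed, data = cars)
fit_glm <- glm(dist ~ speed, data = cars, family = poisson(link = "log"))

family(fit_lm)          # the "lm" method returns gaussian()
family(fit_glm)$family  # family name stored in the fitted glm
family(fit_glm)$link    # link name stored in the fitted glm
```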
The quasibinomial and quasipoisson families differ from the binomial
and poisson families only in that the dispersion parameter is not
fixed at one, so they can “model” over-dispersion. For the binomial
case see McCullagh and Nelder (1989, pp. 124–8). Although they show
that there is (under some restrictions) a model with variance
proportional to mean as in the quasi-binomial model, note that glm
does not compute maximum-likelihood estimates in that model. The
behaviour of S is closer to the quasi- variants.
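The difference can be seen by fitting the same model with both families. The sketch below uses the warpbreaks data set shipped with R as an arbitrary example of over-dispersed counts; the object names are illustrative.

```r
# Same mean model, with the dispersion fixed at one versus estimated
fit_p  <- glm(breaks ~ wool + tension, data = warpbreaks,
              family = poisson())
fit_qp <- glm(breaks ~ wool + tension, data = warpbreaks,
              family = quasipoisson())

all.equal(coef(fit_p), coef(fit_qp))  # point estimates agree
summary(fit_p)$dispersion             # taken to be 1
summary(fit_qp)$dispersion            # estimated from the data
fit_qp$aic                            # NA for the quasi- families
```

Only the standard errors and the tests based on them change between the two fits, since they are scaled by the estimated dispersion.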
An object of class "family" (which has a concise print method).
This is a list with elements
family |
character: the family name. |
link |
character: the link name. |
linkfun |
function: the link. |
linkinv |
function: the inverse of the link function. |
variance |
function: the variance as a function of the mean. |
dev.resids |
function giving the deviance residuals as a function
of (y, mu, wt). |
aic |
function giving the AIC value if appropriate (but NA
for the quasi- families). See logLik for the assumptions
made about the dispersion parameter. |
mu.eta |
function: derivative function(eta) dmu/deta. |
initialize |
expression. This needs to set up whatever data
objects are needed for the family as well as n (needed for
AIC in the binomial family) and mustart (see glm). |
validmu |
logical function. Returns TRUE if a mean
vector mu is within the domain of variance. |
valideta |
logical function. Returns TRUE if a linear
predictor eta is within the domain of linkinv. |
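The elements can be called directly, which is a convenient way to check how they relate to one another. This is only a sketch for the binomial family with the logit link; the vectors mu and eta are arbitrary test values, and the validity checks are accessed under the names validmu and valideta.

```r
fam <- binomial(link = "logit")

mu  <- c(0.1, 0.5, 0.9)
eta <- fam$linkfun(mu)       # map means to the linear predictor scale

fam$linkinv(eta)             # maps back to mu
fam$variance(mu)             # mu * (1 - mu) for the binomial family
fam$mu.eta(eta)              # dmu/deta, equal to mu * (1 - mu) for logit
fam$validmu(mu)              # TRUE: all means lie strictly in (0, 1)
fam$validmu(c(0.5, 1.2))     # FALSE: 1.2 is outside the valid range
fam$valideta(eta)            # TRUE: any linear predictor is allowed
```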
The link and variance arguments have rather awkward
semantics for back-compatibility. The recommended way is to supply
them as quoted character strings, but they can also be supplied
unquoted (as names or expressions). In addition, they can be
supplied as a length-one character vector giving the name of one of
the options, or as a list (for link, of class "link-glm").
This is potentially ambiguous: supplying link=logit could mean
the unquoted name of a link or the value of object logit. It
is interpreted if possible as the name of an allowed link, then
as an object. (You can force the interpretation to always be the value of
an object via logit[1].)
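The resolution order can be checked with a short sketch. The variables lnk and logit are defined here purely to provoke the ambiguity; defining an object called logit is not something to do in practice.

```r
binomial(link = "probit")$link   # recommended: quoted string
binomial(link = probit)$link     # unquoted name of an allowed link

lnk <- "cloglog"
binomial(link = lnk)$link        # not a link name, so the object's value is used

logit <- "probit"                # deliberately confusing object name
binomial(link = logit)$link      # taken as the link name "logit"
binomial(link = logit[1])$link   # forces evaluation: value "probit" is used
```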
The design was inspired by S functions of the same names described
in Hastie & Pregibon (1992) (except quasibinomial and quasipoisson).
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models. London: Chapman and Hall.
Dobson, A. J. (1983) An Introduction to Statistical Modelling. London: Chapman and Hall.
Cox, D. R. and Snell, E. J. (1981) Applied Statistics; Principles and Examples. London: Chapman and Hall.
Hastie, T. J. and Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
nf <- gaussian()  # Normal family
nf
str(nf)  # internal STRucture

gf <- Gamma()
gf
str(gf)
gf$linkinv
gf$variance(-3:4)  #- == (.)^2

## quasipoisson. compare with example(glm)
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
d.AD <- data.frame(treatment, outcome, counts)
glm.qD93 <- glm(counts ~ outcome + treatment, family = quasipoisson())
glm.qD93
anova(glm.qD93, test = "F")
summary(glm.qD93)
## for Poisson results use
anova(glm.qD93, dispersion = 1, test = "Chisq")
summary(glm.qD93, dispersion = 1)

## Example of user-specified link, a logit model for p^days
## See Shaffer, T. 2004. Auk 121(2): 526-540.
logexp <- function(days = 1)
{
    linkfun <- function(mu) qlogis(mu^(1/days))
    linkinv <- function(eta) plogis(eta)^days
    ## derivative of the logit inverse taken from the binomial family,
    ## instead of calling the internal C routine by name
    mu.eta <- function(eta) days * plogis(eta)^(days-1) *
        binomial()$mu.eta(eta)
    valideta <- function(eta) TRUE
    link <- paste("logexp(", days, ")", sep = "")
    structure(list(linkfun = linkfun, linkinv = linkinv,
                   mu.eta = mu.eta, valideta = valideta, name = link),
              class = "link-glm")
}
binomial(logexp(3))
## in practice this would be used with a vector of 'days', in
## which case use an offset of 0 in the corresponding formula
## to get the null deviance right.

## tests of quasi
x <- rnorm(100)
y <- rpois(100, exp(1+x))
glm(y ~ x, family = quasi(variance = "mu", link = "log"))
# which is the same as
glm(y ~ x, family = poisson)
glm(y ~ x, family = quasi(variance = "mu^2", link = "log"))
## Not run:
glm(y ~ x, family = quasi(variance = "mu^3", link = "log"))  # should fail
## End(Not run)
y <- rbinom(100, 1, plogis(x))
# needs to set a starting value for the next fit
glm(y ~ x, family = quasi(variance = "mu(1-mu)", link = "logit"), start = c(0,1))