histogram {lattice}    R Documentation

Histograms and Kernel Density Plots

Description

Draw Histograms and Kernel Density Plots, possibly conditioned on other variables.

Usage

histogram(x, data, ...)
densityplot(x, data, ...)
## S3 method for class 'formula':
histogram(x,
          data,
          allow.multiple, outer = TRUE,
          auto.key = FALSE,
          aspect = "fill",
          panel = "panel.histogram",
          prepanel, scales, strip, groups,
          xlab, xlim, ylab, ylim,
          type = c("percent", "count", "density"),
          nint = if (is.factor(x)) nlevels(x)
          else round(log2(length(x)) + 1),
          endpoints = extend.limits(range(x, finite = TRUE), prop = 0.04),
          breaks,
          equal.widths = TRUE,
          drop.unused.levels = lattice.getOption("drop.unused.levels"),
          ...,
          default.scales = list(),
          subscripts,
          subset)

## S3 method for class 'numeric':
histogram(x, data, xlab, ...)
## S3 method for class 'factor':
histogram(x, data, xlab, ...)

## S3 method for class 'formula':
densityplot(x,
            data,
            allow.multiple = is.null(groups) || outer,
            outer = !is.null(groups),
            auto.key = FALSE,
            aspect = "fill",
            panel = "panel.densityplot",
            prepanel, scales, strip, groups,
            xlab, xlim, ylab, ylim,
            bw, adjust, kernel, window, width, give.Rkern,
            n = 50, from, to, cut, na.rm,
            drop.unused.levels = lattice.getOption("drop.unused.levels"),
            ...,
            default.scales = list(),
            subscripts,
            subset)
## S3 method for class 'numeric':
densityplot(x, data, xlab, ...)

do.breaks(endpoints, nint)

Arguments

x The object on which method dispatch is carried out.
For the formula method, a formula of the form ~ x | g1 * g2 * ... indicates that histograms or kernel density estimates of x should be produced conditioned on the levels of the (optional) variables g1, g2, .... x can be numeric (or a factor for histogram), and each of g1, g2, ... must be either a factor or a shingle.
As a special case, the right hand side of the formula can contain more than one variable separated by a + sign. What happens in this case is described in detail in the documentation for xyplot. Note that in either form, all the variables involved in the formula must have the same length.
For the numeric and factor methods, x replaces the x vector described above. Conditioning is not allowed in these cases.
data For the formula method, an optional data frame in which variables are to be evaluated. Ignored with a warning in other cases.
type Character string indicating type of histogram to be drawn. "percent" and "count" give relative frequency and frequency histograms, and can be misleading when breakpoints are not equally spaced. "density" produces a density scale histogram.
type defaults to "percent", except when the breakpoints are unequally spaced or breaks = NULL, when it defaults to "density".
nint Number of bins. Applies only when breaks is unspecified or NULL in the call. Not applicable when the variable being plotted is a factor.
endpoints vector of length 2 indicating the range of x-values that is to be covered by the histogram. This applies only when breaks is unspecified and the variable being plotted is not a factor. In do.breaks, this specifies the interval that is to be divided up.
breaks usually a numeric vector of length (number of bins + 1) defining the breakpoints of the bins. Note that when breakpoints are not equally spaced, the only value of type that makes sense is "density". When unspecified, the default is to use
      breaks = seq_len(1 + nlevels(x)) - 0.5
when x is a factor, and
      breaks = do.breaks(endpoints, nint)
otherwise. Breakpoints calculated in such a manner are used in all panels.
Other values of breaks are possible, in which case they affect the display in each panel differently. A special value of breaks is NULL, in which case the number of bins is determined by nint and then breakpoints are chosen according to the value of equal.widths. Other valid values of breaks are those of the breaks argument in hist. This allows specification of breaks as an integer giving the number of bins (similar to nint), as a character string denoting a method, or as a function.
equal.widths logical, relevant only when breaks=NULL. If TRUE, equally spaced bins will be selected, otherwise, approximately equal area bins will be selected (this would mean that the breakpoints will not be equally spaced).
n number of points at which density is to be evaluated
panel The function that uses the packet (subset of display variables) corresponding to a panel to create a display. Default panel functions are documented separately, and often have arguments that can be used to customize their display in various ways. Such arguments can usually be directly supplied to the high level function.
allow.multiple, outer, auto.key, aspect, prepanel, scales, strip, groups, xlab, xlim, ylab, ylim, drop.unused.levels, default.scales, subscripts, subset See xyplot
bw, adjust, kernel, window, width, give.Rkern, from, to, cut, na.rm arguments to density, passed on as appropriate
... Further arguments. See corresponding entry in xyplot for non-trivial details.
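
As an illustration of the breaks = NULL and equal.widths arguments described above, the following is a minimal sketch rather than a definitive recipe. The data frame d and its exponentially distributed column x are hypothetical and generated here only for the purpose of the example. With breaks = NULL and equal.widths = FALSE, the bins are chosen to have approximately equal area, so the breakpoints are unequally spaced and type is expected to default to "density".

```r
library(lattice)

## Hypothetical skewed data, generated only for this illustration
set.seed(1)
d <- data.frame(x = rexp(200))

## breaks = NULL: the number of bins comes from nint, and
## equal.widths = FALSE requests approximately equal area bins
p_equal_area <- histogram(~ x, data = d,
                          breaks = NULL, equal.widths = FALSE,
                          nint = 8)

## For comparison: equally spaced bins with the same nint
p_equal_width <- histogram(~ x, data = d,
                           breaks = NULL, equal.widths = TRUE,
                           nint = 8)
```

Printing either object (for example with print(p_equal_area)) draws the corresponding plot.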

Details

histogram draws Conditional Histograms, while densityplot draws Conditional Kernel Density Plots. The density estimate in densityplot is actually calculated using the function density, and all arguments accepted by it can be passed (as ...) in the call to densityplot to control the output. See documentation of density for details. (Note: The default value of the argument n of density is changed to 50.)
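
A short sketch of passing density arguments through densityplot is given here. The particular choices of kernel, adjust and n are arbitrary and serve only to show the mechanism. They are not recommended settings.

```r
library(lattice)

## kernel and adjust are arguments of density(); n overrides the
## number of points at which the estimate is evaluated
p_dens <- densityplot(~ height | voice.part, data = singer,
                      layout = c(2, 4),
                      kernel = "epanechnikov", adjust = 1.5, n = 100,
                      xlab = "Height (inches)")
```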

These and all other high level Trellis functions have several arguments in common. These are extensively documented only in the help page for xyplot, which should be consulted for more detailed usage.

do.breaks is a utility function that calculates breakpoints given an interval and the number of pieces to break it into.
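
For instance, the following sketch computes breakpoints with do.breaks and supplies them explicitly through breaks. The endpoints and nint values are the ones used in the first example in the Examples section. The variable name brks is introduced here only for illustration.

```r
library(lattice)

## Divide the interval [59.5, 76.5] into 17 pieces of equal width
brks <- do.breaks(endpoints = c(59.5, 76.5), nint = 17)
length(brks)   # 18, i.e. nint + 1 breakpoints: 59.5, 60.5, ..., 76.5

## The same breakpoints are then used in all panels
p_brks <- histogram(~ height | voice.part, data = singer,
                    breaks = brks, xlab = "Height (inches)")
```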

Value

An object of class "trellis". The update method can be used to update components of the object and the print method (usually called by default) will plot it on an appropriate plotting device.
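
A minimal sketch of working with the returned object: it is stored, modified with update, and then plotted explicitly with print. The object names p and p2 and the title text are illustrative only.

```r
library(lattice)

## Creating the object does not draw anything when it is assigned
p <- histogram(~ height | voice.part, data = singer,
               xlab = "Height (inches)")

## update returns a modified copy; p itself is left unchanged
p2 <- update(p, layout = c(2, 4),
             main = "Singer heights by voice part")

## Plot on the current (or default) device
print(p2)
```

Calling print explicitly is needed mainly inside functions, loops, or sourced scripts, where the object is not printed automatically.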

Note

The form of the arguments accepted by the default panel function panel.histogram is different from that in S-PLUS. Whereas S-PLUS calculates the heights inside histogram and passes only the breakpoints and the heights to the panel function, here the original variable x is passed along with the breakpoints. This allows plots as in the second example below.

Author(s)

Deepayan Sarkar Deepayan.Sarkar@R-project.org

See Also

xyplot, panel.histogram, density, panel.densityplot, panel.mathdensity, Lattice

Examples

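require(stats)    # dnorm and sd, used in the second example
require(lattice)  # histogram, densityplot and the singer data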
histogram( ~ height | voice.part, data = singer, nint = 17,
          endpoints = c(59.5, 76.5), layout = c(2,4), aspect = 1,
          xlab = "Height (inches)")

histogram( ~ height | voice.part, data = singer,
          xlab = "Height (inches)", type = "density",
          panel = function(x, ...) {
              panel.histogram(x, ...)
              panel.mathdensity(dmath = dnorm, col = "black",
                                args = list(mean=mean(x),sd=sd(x)))
          } )

densityplot( ~ height | voice.part, data = singer, layout = c(2, 4),
            xlab = "Height (inches)", bw = 5)

[Package lattice version 0.14-16 Index]