split {base}    R Documentation

Divide into Groups

Description

split divides the data in the vector x into the groups defined by f. The replacement forms replace values corresponding to such a division. unsplit reverses the effect of split.
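
As a minimal sketch of the three operations described above (the vector x and the grouping f below are made-up illustration data, not part of the original page):

```r
# Illustrative data (hypothetical): six values in three groups.
x <- c(10, 20, 30, 40, 50, 80)
f <- c("a", "b", "a", "c", "b", "a")

# split: one list component per group defined by as.factor(f).
groups <- split(x, f)

# unsplit: put the grouped values back into their original positions.
restored <- unsplit(groups, f)

# Replacement form: overwrite each group's values in place,
# here by subtracting the group mean.
y <- x
split(y, f) <- lapply(split(x, f), function(v) v - mean(v))
```

Here groups$a holds 10, 30, 80, restored is identical to x, and y holds the group-centred values in the original order.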

Usage

split(x, f, drop = FALSE, ...)
split(x, f, drop = FALSE, ...) <- value
unsplit(value, f, drop = FALSE)

Arguments

x      vector or data frame containing values to be divided into groups.
f      a “factor” in the sense that as.factor(f) defines the grouping, or a list of such factors in which case their interaction is used for the grouping.
drop   logical indicating if levels that do not occur should be dropped (if f is a factor or a list).
value  a list of vectors or data frames compatible with a splitting of x. Recycling applies if the lengths do not match.
...    further potential arguments passed to methods.
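
The effect of passing a list of factors as f, and of drop, can be sketched as follows. The variables scores, sex and site are hypothetical illustration data.

```r
# Hypothetical data: four scores and two grouping variables.
scores <- c(1, 2, 3, 4)
sex    <- c("f", "m", "f", "m")
# 'site' declares a level ("south") that never occurs in the data.
site   <- factor(rep("north", 4), levels = c("north", "south"))

# A list of factors: groups are formed from the interaction of sex and site.
by_both <- split(scores, list(sex, site))
length(by_both)          # 4 combinations, two of them empty

# drop = TRUE removes the combinations that do not occur.
by_both_used <- split(scores, list(sex, site), drop = TRUE)
names(by_both_used)
```

With drop = FALSE the empty combinations are kept as zero-length components, which can be convenient when the result must have the same shape across several data sets.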

Details

split and split<- are generic functions with default and data.frame methods.

f is recycled as necessary; if the length of x is not a multiple of the length of f, a warning is issued. unsplit works only with lists of vectors. The data frame method can also be used to split a matrix into a list of matrices, and likewise for the replacement form, provided they are invoked explicitly.
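
A short sketch of the recycling rule and of the explicit call to the data frame method on a matrix. The matrix m and the labels in row_group are hypothetical, and calling split.data.frame directly is assumed to be what "invoked explicitly" refers to.

```r
# f is recycled: the two group labels are reused along x.
halves <- split(1:6, 1:2)   # group "1": 1 3 5, group "2": 2 4 6

# length(x) = 5 is not a multiple of length(f) = 2, so a warning is expected.
warned <- tryCatch({ split(1:5, 1:2); FALSE },
                   warning = function(w) TRUE)

# Splitting a matrix by rows: call the data frame method explicitly.
m <- matrix(1:12, nrow = 4)
row_group <- c("top", "top", "bottom", "bottom")   # hypothetical labels
parts <- split.data.frame(m, row_group)
dim(parts$top)              # rows 1 and 2 of m, all 3 columns
```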

Any missing values in f are dropped together with the corresponding values of x.
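
For instance, in the following sketch (made-up data) the second and fifth values of x do not appear in any group because f is NA at those positions:

```r
x <- c(1, 2, 3, 4, 5)
f <- c("a", NA, "b", "a", NA)

groups <- split(x, f)
sapply(groups, length)   # a: 2, b: 1; the values 2 and 5 are dropped
```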

Value

The value returned from split is a list of vectors containing the values for the groups. The components of the list are named by the levels of f that are used. (If f is longer than x then some of the components will be of zero length.)
The replacement forms return their right-hand side. unsplit returns a vector x for which split(x, f) equals value.
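
The properties stated above can be checked with a small sketch. All data here are hypothetical, and suppressWarnings is used only because the length mismatch in the last call may raise the warning described under Details.

```r
f <- factor(c("lo", "hi", "lo", "hi"), levels = c("lo", "hi"))

# split returns a list whose names are the levels of f.
res <- split(1:4, f)
is.list(res)
names(res)

# unsplit returns a vector v for which split(v, f) reproduces 'value'.
value <- list(lo = c(10, 30), hi = c(20, 40))
v <- unsplit(value, f)
all.equal(split(v, f), value)

# f longer than x: the level "c" receives no data.
short <- suppressWarnings(split(1:2, c("a", "b", "c")))
length(short$c)
```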

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

cut

Examples

require(stats)
n <- 10; nn <- 100
g <- factor(round(n * runif(n * nn)))
x <- rnorm(n * nn) + sqrt(as.numeric(g))
xg <- split(x, g)
boxplot(xg, col = "lavender", notch = TRUE, varwidth = TRUE)
sapply(xg, length)
sapply(xg, mean)

## Calculate z-scores by group

z <- unsplit(lapply(split(x, g), scale), g)
tapply(z, g, mean)

## or, using the replacement form

z <- x
split(z, g) <- lapply(split(x, g), scale)
tapply(z, g, sd)

## Split a matrix into a list by columns
ma <- cbind(x = 1:10, y = (-4:5)^2)
split(ma, col(ma))

## f is recycled: odd and even positions go to groups 1 and 2
split(1:10, 1:2)

[Package base version 2.4.1]