R: predict method for random forest objects

predict.randomForest {randomForest}

R Documentation

predict method for random forest objects

Description

Prediction of test data using random forest.

Usage

## S3 method for class 'randomForest':
predict(object, newdata, type="response",
  norm.votes=TRUE, predict.all=FALSE, proximity=FALSE, nodes=FALSE,
  cutoff, ...)

Arguments

`object`	an object of class `randomForest`, as created by the function `randomForest`.
`newdata`	a data frame or matrix containing new data. (Note: If not given, the out-of-bag prediction in `object` is returned.)
`type`	one of `response`, `prob`, or `vote`, indicating the type of output: predicted values, matrix of class probabilities, or matrix of vote counts. `class` is allowed, but automatically converted to `response`, for backward compatibility.
`norm.votes`	Should the vote counts be normalized (i.e., expressed as fractions)? Ignored if `object$type` is `regression`.
`predict.all`	Should the predictions of all trees be kept?
`proximity`	Should proximity measures be computed? An error is issued if `object$type` is `regression`.
`nodes`	Should the terminal node indicators (an n by ntree matrix) be returned? If so, it is in the "nodes" attribute of the returned object.
`cutoff`	(Classification only) A vector of length equal to the number of classes. The "winning" class for an observation is the one with the maximum ratio of proportion of votes to cutoff. Default is taken from the `forest$cutoff` component of `object` (i.e., the setting used when running `randomForest`).
`...`	not used currently.
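
The `cutoff` rule can be illustrated without fitting a forest. The following is a minimal sketch in base R; the vote fractions, the class names `a` and `b`, and the cutoff values are made up for illustration and are not taken from the package.

```r
# Hypothetical vote fractions: three observations (rows), two classes (columns).
votes <- matrix(c(0.60, 0.40,
                  0.70, 0.30,
                  0.45, 0.55),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("a", "b")))

# Hypothetical cutoff vector, one entry per class.
cutoff <- c(a = 0.65, b = 0.35)

# Divide each class column by its cutoff, then take the class with the
# largest ratio in each row.
ratio  <- sweep(votes, 2, cutoff, "/")
winner <- colnames(votes)[max.col(ratio, ties.method = "first")]

# For comparison: the plain majority vote, ignoring the cutoff.
majority <- colnames(votes)[max.col(votes, ties.method = "first")]

winner    # "b" "a" "b"
majority  # "a" "a" "b"
```

With equal cutoffs for all classes the ratio rule reduces to the plain majority vote; unequal cutoffs shift decisions toward the class with the smaller cutoff, as in the first observation above.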

Value

If `object$type` is `regression`, a vector of predicted values is returned. If `predict.all=TRUE`, then the returned object is a list of two components: `aggregate`, which is the vector of values predicted by the forest, and `individual`, which is a matrix where each column contains the predictions of one tree in the forest.
If `object$type` is `classification`, the object returned depends on the argument `type`:
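
As a sketch of the regression case, the snippet below fits a small forest and inspects the two components returned with `predict.all=TRUE`. The use of `mtcars`, the train/test split, and `ntree = 50` are arbitrary choices made for illustration.

```r
library(randomForest)

set.seed(1)
train <- mtcars[1:24, ]   # arbitrary split, for illustration only
test  <- mtcars[25:32, ]

# Regression forest: the response mpg is numeric.
mtcars.rf <- randomForest(mpg ~ ., data = train, ntree = 50)

pred <- predict(mtcars.rf, test, predict.all = TRUE)

names(pred)           # the two list components
length(pred$aggregate)  # one forest prediction per test row
dim(pred$individual)    # test rows by trees
```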

`response`	predicted classes (the classes with majority vote).
`prob`	matrix of class probabilities (one column for each class and one row for each input).
`vote`	matrix of vote counts (one column for each class and one row for each new input); either in raw counts or in fractions (if `norm.votes=TRUE`).

If `predict.all=TRUE`, then the `individual` component of the returned object is a character matrix where each column contains the class predicted by one tree in the forest.
If `proximity=TRUE`, the returned object is a list with two components: `pred` is the prediction (as described above) and `proximity` is the proximity matrix. An error is issued if `object$type` is `regression`.
If `nodes=TRUE`, the returned object has a "nodes" attribute, which is an n by ntree matrix, each column containing the node number that the cases fall in for that tree.
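
The optional outputs for a classification forest can be inspected as follows. This is a sketch only: the split mirrors the one used in the Examples section, `ntree = 100` is an arbitrary choice, and each option is requested in a separate call to keep the returned structures easy to read.

```r
library(randomForest)

data(iris)
set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ], ntree = 100)
test <- iris[ind == 2, ]

# Per-tree class predictions: a character matrix, one column per tree.
pa <- predict(iris.rf, test, predict.all = TRUE)
is.character(pa$individual)

# Terminal node numbers: stored in the "nodes" attribute (n by ntree).
nd <- predict(iris.rf, test, nodes = TRUE)
dim(attr(nd, "nodes"))

# Proximities: the result becomes a list with a proximity component.
px <- predict(iris.rf, test, proximity = TRUE)
is.matrix(px$proximity)
```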

Author(s)

Andy Liaw <andy_liaw@merck.com> and Matthew Wiener <matthew_wiener@merck.com>, based on original Fortran code by Leo Breiman and Adele Cutler.

References

Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.

Examples

library(randomForest)
data(iris)
set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob=c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data=iris[ind == 1,])
iris.pred <- predict(iris.rf, iris[ind == 2,])
table(observed = iris[ind==2, "Species"], predicted = iris.pred)
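
A further sketch, showing the other output types for a classification forest. It refits the forest so that it runs on its own; `ntree = 100` and the object names `prob` and `votes` are choices made for this illustration.

```r
library(randomForest)

data(iris)
set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
ntree <- 100
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ], ntree = ntree)
test <- iris[ind == 2, ]

# Class probabilities: one row per test case, one column per class.
prob <- predict(iris.rf, test, type = "prob")
head(prob)

# Raw vote counts: every tree casts one vote per case,
# so each row sums to the number of trees.
votes <- predict(iris.rf, test, type = "vote", norm.votes = FALSE)
head(votes)
all(rowSums(votes) == ntree)
```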

[Package randomForest version 4.5-18 Index]