dists {cba}    R Documentation

Matrix Distance Computation

Description

This function computes and returns the auto-distance matrix between the rows of a matrix, as well as the cross-distance matrix between the rows of two matrices.

Usage

dists(x, y = NULL, method = "minkowski", p = 2)

dapply(x, y, FUN, ...)

Arguments

x       a numeric matrix object.
y       NULL, or a numeric matrix object.
method  a mnemonic string referencing the distance measure.
p       Minkowski metric parameter.
FUN     a user supplied function.
...     further arguments to the user supplied function.

Details

The interface is modeled on dist: you select the distance measure by passing its (not so) mnemonic name as method.

Methods that are also implemented in dist are: minkowski, maximum, canberra, and binary. See the documentation there. Note that for binary the arguments x (and y) must be logical.
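As a point of reference for the shared methods, the following minimal sketch uses only base R's dist (not dists itself) to show how the Minkowski parameter p changes the result. The matrix m is made-up illustration data.

```r
# Base R only: stats::dist provides the methods shared with dists.
m <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)

# Minkowski distance with p = 2 is the Euclidean distance.
d2 <- dist(m, method = "minkowski", p = 2)

# Minkowski distance with p = 1 is the Manhattan (city-block) distance.
d1 <- dist(m, method = "minkowski", p = 1)

# The maximum (Chebyshev) distance is the limiting case of large p.
dmax <- dist(m, method = "maximum")

print(d2)
print(d1)
print(dmax)
```

The values are stored column-wise from the lower triangle, i.e. in the order (2,1), (3,1), (3,2).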

Additional methods implemented are:

ebinary:
one minus extended Jaccard similarities on real-valued vectors using Euclidean distances.
fbinary:
one minus fuzzy Jaccard similarities on positive real-valued vectors as proposed by Kurt Hornik.
angular:
one minus cosine similarities on real-valued vectors.
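The three additional measures can be sketched for a single pair of vectors in base R as follows. These are the common textbook definitions, written here as an assumption about what the method names refer to; the package's compiled code may differ in details such as NA and zero-vector handling. The function names angular_dist, ejaccard_dist and fjaccard_dist are hypothetical helpers, not part of cba.

```r
# Hypothetical helpers illustrating the textbook definitions.

# One minus cosine similarity.
angular_dist <- function(x, y) {
  1 - sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

# One minus extended Jaccard (Tanimoto) similarity.
# The denominator equals sum(x * y) + sum((x - y)^2), which is where
# the squared Euclidean distance enters.
ejaccard_dist <- function(x, y) {
  xy <- sum(x * y)
  1 - xy / (sum(x^2) + sum(y^2) - xy)
}

# One minus fuzzy Jaccard similarity for non-negative vectors:
# sum of element-wise minima over sum of element-wise maxima.
fjaccard_dist <- function(x, y) {
  1 - sum(pmin(x, y)) / sum(pmax(x, y))
}

a <- c(1, 2, 0)
b <- c(2, 1, 0)

angular_dist(a, b)   # 1 - 4/5
ejaccard_dist(a, b)  # 1 - 4/6
fjaccard_dist(a, b)  # 1 - 2/4
```

On 0/1 vectors both Jaccard variants reduce to the ordinary Jaccard distance, since products and minima then count joint presences.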

Missing values are allowed but are excluded from all computations involving the rows within which they occur. However, rows (and columns) of NAs are not dropped as in dist.

For compatibility, the distance is zero instead of NA when two (near) zero vectors are involved in the computation of binary, ebinary, or angular. Note that this is inconsistent with the coding of NA by as.dummy.
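To see why a convention is needed, note that the naive cosine formula yields 0/0 for two zero vectors. The sketch below contrasts the naive result with a guarded version in base R. The helper names and the tolerance eps are assumptions for illustration; the threshold the package actually uses for "near zero" is not stated here.

```r
# Naive cosine distance: undefined (NaN) when both vectors are zero.
angular_naive <- function(x, y) {
  1 - sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

# Guarded variant: return 0 when both vectors are (near) zero.
# eps is a hypothetical tolerance chosen for this example only.
angular_guarded <- function(x, y, eps = 1e-12) {
  nx <- sum(x^2)
  ny <- sum(y^2)
  if (nx < eps && ny < eps) return(0)
  1 - sum(x * y) / sqrt(nx * ny)
}

z <- c(0, 0, 0)
naive   <- angular_naive(z, z)    # 0/0, hence NaN
guarded <- angular_guarded(z, z)  # 0 by convention
```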

The function dapply lets the user apply an arbitrary distance function that takes at least two vectors as arguments (i.e. a row of x and a row of x or y) and returns a scalar real value.
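The idea behind applying a user-supplied function to all pairs of rows can be sketched in base R as below. pairwise_apply is a hypothetical stand-in written for this illustration, not the implementation of dapply; it only mirrors the documented behaviour (auto-distances as a dist object, cross-distances as a matrix).

```r
# Hypothetical base R stand-in for a pairwise row-wise apply.
pairwise_apply <- function(x, y = NULL, FUN, ...) {
  FUN <- match.fun(FUN)
  auto <- is.null(y)
  if (auto) y <- x
  out <- matrix(NA_real_, nrow(x), nrow(y),
                dimnames = list(rownames(x), rownames(y)))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      out[i, j] <- FUN(x[i, ], y[j, ], ...)
    }
  }
  # as.dist keeps the lower triangle only, so FUN should be symmetric.
  if (auto) as.dist(out) else out
}

manhattan <- function(a, b) sum(abs(a - b))

pts <- matrix(c(0, 0,
                1, 1,
                2, 4), ncol = 2, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), NULL))

d_auto  <- pairwise_apply(pts, FUN = manhattan)       # class "dist"
d_cross <- pairwise_apply(pts, pts, FUN = manhattan)  # 3 x 3 matrix
```

A double loop in R is slow for large inputs; the sketch favours clarity over speed.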

Value

Auto-distances are returned as an object of class dist and cross-distances as an object of class matrix.
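For readers unfamiliar with the dist class, this short base R sketch shows how such an object differs from a full matrix. It uses stats::dist rather than dists, and the data are made up.

```r
pts <- matrix(c(0, 0,
                3, 4), ncol = 2, byrow = TRUE,
              dimnames = list(c("A", "B"), NULL))

d <- dist(pts)       # compact storage of the lower triangle
full <- as.matrix(d) # full symmetric matrix with a zero diagonal

class(d)             # "dist"
attr(d, "Size")      # number of observations
full["A", "B"]       # a single pairwise distance
```

A dist object stores only n(n-1)/2 values, which is why converting it with as.matrix is the less efficient route to a full matrix.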

Warning

The interface is experimental and may change in the future.

Author(s)

Christian Buchta

See Also

dist for compatibility information.

Examples

library(cba)

### binary data
x <- matrix(sample(c(FALSE, TRUE), 8, replace = TRUE), ncol = 2)
dists(x, method = "binary")
### for real-valued data
dists(x, method = "ebinary")
### for positive real-valued data
dists(x, method = "fbinary")
### cross distances
dists(x, x, method = "binary")
### this is the same but less efficient
as.matrix(dists(x, method = "binary"))
### test inheritance of names
rownames(x) <- LETTERS[1:4]
dists(x)
dists(x, x)
### custom distance function (here an inner product, for illustration)
f <- function(x, y) sum(x * y)
dapply(x, FUN = f)
dapply(x, x, FUN = f)

[Package cba version 0.2-1 Index]