Convert Numeric to Factor
Usage
cut(x, ...)
cut.default(x, breaks, labels = NULL,
include.lowest = FALSE, right = TRUE, dig.lab = 3)
Arguments
x
    a numeric vector which is to be converted to a factor by cutting.
breaks
    either a vector of cut points or a number giving the number of
    intervals into which x is to be cut.
labels
    labels for the levels of the resulting category. By default,
    labels are constructed using "(a,b]" interval notation. If
    labels = FALSE, simple integer codes are returned instead of
    a factor.
include.lowest
    logical, indicating if an x[i] equal to the lowest (or highest,
    for right = FALSE) breaks value should be included.
right
    logical, indicating if the intervals should be closed on the
    right (and open on the left) or vice versa.
dig.lab
    integer which is used when labels are not given. It determines
    the number of digits used in formatting the break numbers.
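The interplay of include.lowest and right is easiest to see when some values lie exactly on the break points. The following is a minimal sketch, not taken from the original page; the vectors x and br and the result names are hypothetical and chosen only for illustration.

```r
# Hypothetical data chosen so that values sit exactly on the break points.
x <- c(0, 1, 2, 3, 4)
br <- c(0, 2, 4)

# Default (right = TRUE): intervals (0,2] and (2,4];
# the value 0 lies outside both and becomes NA.
f_default <- cut(x, breaks = br)

# include.lowest = TRUE also closes the first interval on the left,
# so 0 is now assigned to the first level.
f_lowest <- cut(x, breaks = br, include.lowest = TRUE)

# right = FALSE flips the closed side: [0,2) and [2,4);
# now the value 4 lies outside and becomes NA.
f_left <- cut(x, breaks = br, right = FALSE)

print(f_default)
print(f_lowest)
print(f_left)
```

Values outside every interval are coded as NA rather than raising an error, so it is worth checking for missing values after cutting.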
Description
cut divides the range of x into intervals and codes the values in x
according to which interval they fall into. The leftmost interval
corresponds to level one, the next leftmost to level two, and so on.
Details
If a labels parameter is specified, its values are used to name the
factor levels. If none is specified, the factor level labels are
constructed as "(b1, b2]", "(b2, b3]" etc. for right=TRUE and as
"[b1, b2)", ... if right=FALSE. In this case, dig.lab indicates how
many digits should be used in formatting the numbers b1, b2, ....
Value
A factor is returned, unless labels = FALSE, in which case only the
integer level codes are returned.
Note
Instead of table(cut(x, br)), hist(x, br, plot = FALSE) is more
efficient and less memory hungry.
See Also
split for splitting a variable according to a group factor;
factor, tabulate, table.
Examples
Z <- rnorm(10000)
table(cut(Z, breaks = -6:6))
## Compare timings: table(cut(...)) versus hist(..., plot = FALSE)
system.time(print(sum(table(cut(Z, breaks = -6:6, labels = FALSE)))))
system.time(print(sum(hist(Z, breaks = -6:6, plot = FALSE)$counts)))
cut(rep(1, 5), 4) #-- dummy
tx0 <- c(9, 4, 6, 5, 3, 10, 5, 3, 5)
x <- rep(0:8, tx0)
tx <- table(x)
all(tx == tx0)
table(cut(x, breaks = 8))
table(cut(x, breaks = 3*(-2:5)))
table(cut(x, breaks = 3*(-2:5), right = FALSE))
##--- some values OUTSIDE the breaks :
table(cx <- cut(x, breaks = 2*(0:4)))
table(cxl <- cut(x, breaks = 2*(0:4), right = FALSE))
which(is.na(cx)); x[is.na(cx)] #-- the first 9 values, all 0
which(is.na(cxl)); x[is.na(cxl)] #-- the last 5 values, all 8
## Label construction:
y <- rnorm(100)
table(cut(y, breaks = pi/3*(-3:3)))
table(cut(y, breaks = pi/3*(-3:3), dig.lab=4))
table(cut(y, breaks = 1*(-3:3), dig.lab = 4)) # extra digits don't "harm" here
table(cut(y, breaks = 1*(-3:3), right = FALSE)) # same counts, since no y is exactly an integer
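
The examples above rely on the automatically constructed interval labels. As a further sketch, assuming that one label is supplied per interval, the labels argument can be used to name the levels explicitly or set to FALSE to obtain plain integer codes. The data and the names scores, br, grade and codes are hypothetical.

```r
# Hypothetical scores and grade names, used only for illustration.
scores <- c(35, 52, 68, 71, 90)
br <- c(0, 50, 70, 100)

# Three intervals, (0,50], (50,70] and (70,100], so three labels are given.
grade <- cut(scores, breaks = br, labels = c("low", "mid", "high"))
print(grade)
print(levels(grade))

# labels = FALSE returns the integer interval codes rather than a factor.
codes <- cut(scores, breaks = br, labels = FALSE)
print(codes)
print(is.factor(codes))
```

Explicit labels keep downstream tables readable, while integer codes are convenient when the result is only used for indexing or counting.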