SparseArray objects
The SparseArray package defines the SparseArray virtual class whose purpose is to be extended by other S4 classes that aim at representing in-memory multidimensional sparse arrays.
It currently has two concrete subclasses, COO_SparseArray and SVT_SparseArray, both also defined in this package. Each subclass uses its own internal representation for the nonzero multidimensional data: the COO layout for COO_SparseArray and the SVT layout for SVT_SparseArray. The two layouts are described in the COO_SparseArray and SVT_SparseArray man pages, respectively.
Finally, the package also defines the SparseMatrix virtual class, as a subclass of the SparseArray class, for the specific 2D case.
Arguments
- x
An ordinary matrix or array, or a dg[C|R]Matrix object, or an lg[C|R]Matrix object, or any matrix-like or array-like object that supports coercion to SVT_SparseArray.
- type
A single string specifying the requested type of the object.
By default, the SparseArray object returned by the constructor function has the same type() as x. However, the user can use the type argument to request a different type. Note that doing:
    sa <- SparseArray(x, type=type)
is equivalent to doing:
    sa <- SparseArray(x)
    type(sa) <- type
but the former is more convenient and will generally be more efficient.
Supported types are all R atomic types plus "list".
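The equivalence described above can be checked on a small input. This is a minimal sketch, assuming the SparseArray package is installed; the matrix m and the names sa1 and sa2 are illustrative only and do not come from the package documentation.

```r
library(SparseArray)

# A small integer matrix with two nonzero entries (illustrative data).
m <- matrix(c(0L, 3L, 0L, 0L, 0L, 7L), nrow=2)

# Route 1: request the type at construction time.
sa1 <- SparseArray(m, type="double")

# Route 2: construct first, then change the type.
sa2 <- SparseArray(m)
type(sa2) <- "double"

type(SparseArray(m))  # type of the input is kept when 'type' is omitted
type(sa1)
type(sa2)

# Both routes should give the same values once turned back into an
# ordinary matrix.
stopifnot(identical(as.array(sa1), as.array(sa2)))
```

The one-step form avoids building an intermediate object of the original type, which is presumably where the efficiency gain mentioned above comes from.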
Details
The SparseArray class extends the Array virtual class defined in the S4Arrays package. Here is the full SparseArray sub-hierarchy as defined in the SparseArray package (virtual classes are marked with an asterisk):
  : Array class       :                    Array*
  : hierarchy         :                       ^
                                              |
  - - - - - - - - - - - - - - - - - - - - - - | - - - - - - - - - - - - -
  : SparseArray       :                 SparseArray*
  : sub-hierarchy     :                 ^     ^     ^
                                        |     |     |
                          COO_SparseArray     |     SVT_SparseArray
                                ^             |             ^
  - - - - - - - - - - - - - - - | - - - - - - | - - - - - - | - - - - - -
  : SparseMatrix      :         |       SparseMatrix*       |
  : sub-sub-hierarchy :         |         ^       ^         |
                                |         |       |         |
                            COO_SparseMatrix    SVT_SparseMatrix
Any object that belongs to a class that extends SparseArray (e.g. an SVT_SparseArray or SVT_SparseMatrix object) is called a SparseArray derivative.
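The class relationships in the hierarchy can be inspected with the standard tools from the methods package. The following is a sketch, assuming the SparseArray package is installed; the small matrix is illustrative only.

```r
library(SparseArray)

# SparseArray and SparseMatrix are virtual; the COO_* and SVT_* classes
# are concrete.
isVirtualClass("SparseArray")
isVirtualClass("SparseMatrix")
isVirtualClass("SVT_SparseArray")

# The 2D classes extend both their N-dimensional parent and SparseMatrix.
extends("SVT_SparseMatrix", "SVT_SparseArray")
extends("SVT_SparseMatrix", "SparseMatrix")
extends("COO_SparseMatrix", "SparseArray")

# Testing whether an object is a SparseArray derivative.
svt <- SparseArray(matrix(c(0L, 1L, 0L, 2L), nrow=2))
is(svt, "SparseArray")
is(svt, "SparseMatrix")
```

Using is(x, "SparseArray") rather than comparing class(x) to a fixed string keeps a check valid for every derivative, including subclasses defined in other packages.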
Most of the standard matrix and array API defined in base R should work on SparseArray derivatives, including dim(), length(), dimnames(), `dimnames<-`(), [, drop(), `[<-` (subassignment), t(), rbind(), cbind(), etc.
SparseArray derivatives also support type(), `type<-`(), is_sparse(), nzcount(), nzwhich(), nzvals(), `nzvals<-`(), sparsity(), arbind(), and acbind().
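A few of the functions listed above can be tried on a tiny object. This is a sketch only, assuming the SparseArray package is installed; the 4 x 5 matrix and its three nonzero values are made up for illustration.

```r
library(SparseArray)

# A 4 x 5 integer matrix with 3 nonzero elements (illustrative data).
m <- matrix(0L, nrow=4, ncol=5)
m[c(2, 7, 20)] <- c(10L, 20L, 30L)
svt <- SparseArray(m)

## Part of the base R matrix/array API:
dim(svt)
length(svt)
dim(t(svt))            # transposition swaps the two dimensions
dim(rbind(svt, svt))   # binding along the rows doubles the row count

## Sparse-specific accessors:
type(svt)
is_sparse(svt)
nzcount(svt)           # number of nonzero elements
sparsity(svt)          # proportion of zero-valued elements
```

Comparing these results with the same calls on the ordinary matrix m is a quick way to confirm that a SparseArray derivative behaves like its dense counterpart.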
Value
A SparseArray derivative, that is, an SVT_SparseArray, COO_SparseArray, SVT_SparseMatrix, or COO_SparseMatrix object.
The type() of the input object is preserved, except if a different one was requested via the type argument.
What is considered a zero depends on the type():
- "logical" zero is FALSE;
- "integer" zero is 0L;
- "double" zero is 0;
- "complex" zero is 0+0i;
- "raw" zero is raw(1);
- "character" zero is "" (empty string);
- "list" zero is NULL.
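These zeros match what base R's vector() produces when asked for a length-1 vector of each type, which gives an easy way to list them. The first part of this sketch uses base R only. The second part assumes the SparseArray package is installed and uses a made-up character matrix to show the empty string being treated as a zero.

```r
# Base R only: build a length-1 vector of each supported type.
types <- c("logical", "integer", "double", "complex",
           "raw", "character", "list")
zeros <- lapply(types, function(type) vector(mode=type, length=1L))
names(zeros) <- types
str(zeros)

# For "list", the zero is the element itself, that is, NULL.
is.null(zeros[["list"]][[1L]])

# Assumes SparseArray is installed: in a character matrix, "" is the
# zero, so only the non-empty strings should count as nonzero.
library(SparseArray)
chr <- matrix(c("a", "", "", "b"), nrow=2)
nzcount(SparseArray(chr))
```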
See also
The COO_SparseArray and SVT_SparseArray classes.
is_nonzero for is_nonzero() and nz*() functions nzcount(), nzwhich(), etc.
SparseArray_aperm for permuting the dimensions of a SparseArray object (e.g. transposition).
SparseArray_subsetting for subsetting a SparseArray object.
SparseArray_subassignment for SparseArray subassignment.
SparseArray_abind for combining 2D or multidimensional SparseArray objects.
SparseArray_summarization for SparseArray summarization methods.
SparseArray_Arith, SparseArray_Compare, and SparseArray_Logic, for operations from the Arith, Compare, and Logic groups on SparseArray objects.
SparseArray_Math for operations from the Math and Math2 groups on SparseArray objects.
SparseArray_Complex for operations from the Complex group on SparseArray objects.
SparseArray_misc for miscellaneous operations on a SparseArray object.
SparseArray_matrixStats for col/row summarization methods for SparseArray objects.
rowsum_methods for rowsum() methods for sparse matrices.
SparseMatrix_mult for SparseMatrix multiplication and cross-product.
randomSparseArray to generate a random SparseArray object.
readSparseCSV to read/write a sparse matrix from/to a CSV (comma-separated values) file.
dgCMatrix-class, dgRMatrix-class, and lgCMatrix-class in the Matrix package, for the de facto standard for sparse matrix representations in the R ecosystem.
is_sparse in the S4Arrays package.
The Array class defined in the S4Arrays package.
Ordinary array objects in base R.
base::which in base R.
Examples
## ---------------------------------------------------------------------
## Display details of class definition & known subclasses
## ---------------------------------------------------------------------
showClass("SparseArray")
#> Virtual Class "SparseArray" [package "SparseArray"]
#>
#> Slots:
#>
#> Name: dim dimnames
#> Class: integer list
#>
#> Extends: "Array"
#>
#> Known Subclasses:
#> Class "SparseMatrix", directly
#> Class "COO_SparseArray", directly
#> Class "SVT_SparseArray", directly
#> Class "COO_SparseMatrix", by class "SparseMatrix", distance 2
#> Class "COO_SparseMatrix", by class "COO_SparseArray", distance 2, with explicit coerce
#> Class "SVT_SparseMatrix", by class "SVT_SparseArray", distance 2
## ---------------------------------------------------------------------
## The SparseArray() constructor
## ---------------------------------------------------------------------
a <- array(rpois(9e6, lambda=0.3), dim=c(500, 3000, 6))
SparseArray(a) # an SVT_SparseArray object
#> <500 x 3000 x 6 SparseArray> of type "integer" [nzcount=2333666 (26%)]:
#> ,,1
#> [,1] [,2] [,3] [,4] ... [,2997] [,2998] [,2999] [,3000]
#> [1,] 1 1 1 0 . 0 0 1 1
#> [2,] 0 1 0 0 . 1 0 0 0
#> ... . . . . . . . . .
#> [499,] 0 0 0 1 . 0 0 0 1
#> [500,] 0 0 0 1 . 0 0 0 0
#>
#> ...
#>
#> ,,6
#> [,1] [,2] [,3] [,4] ... [,2997] [,2998] [,2999] [,3000]
#> [1,] 0 1 1 0 . 0 1 0 1
#> [2,] 0 3 0 1 . 0 0 0 2
#> ... . . . . . . . . .
#> [499,] 0 0 0 0 . 0 0 0 1
#> [500,] 0 0 0 0 . 1 0 0 0
#>
m <- matrix(rpois(9e6, lambda=0.3), ncol=500)
SparseArray(m) # an SVT_SparseMatrix object
#> <18000 x 500 SparseMatrix> of type "integer" [nzcount=2333381 (26%)]:
#> [,1] [,2] [,3] [,4] ... [,497] [,498] [,499] [,500]
#> [1,] 0 0 0 0 . 0 0 0 0
#> [2,] 1 0 1 0 . 0 1 0 0
#> [3,] 0 1 0 2 . 0 0 0 1
#> [4,] 0 0 0 0 . 0 0 1 0
#> [5,] 1 0 0 0 . 1 0 0 0
#> ... . . . . . . . . .
#> [17996,] 0 0 0 0 . 0 0 0 2
#> [17997,] 0 0 0 1 . 0 0 0 0
#> [17998,] 0 1 0 0 . 0 0 0 1
#> [17999,] 1 1 0 0 . 0 0 0 0
#> [18000,] 0 0 0 0 . 1 0 0 1
dgc <- sparseMatrix(i=c(4:1, 2:4, 9:12, 11:9), j=c(1:7, 1:7),
                    x=runif(14), dims=c(12, 7))
class(dgc)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
SparseArray(dgc) # an SVT_SparseMatrix object
#> <12 x 7 SparseMatrix> of type "double" [nzcount=14 (17%)]:
#> [,1] [,2] [,3] ... [,6] [,7]
#> [1,] 0.0000000 0.0000000 0.0000000 . 0.00000000 0.00000000
#> [2,] 0.0000000 0.0000000 0.2981720 . 0.00000000 0.00000000
#> [3,] 0.0000000 0.1679951 0.0000000 . 0.08551527 0.00000000
#> [4,] 0.7284175 0.0000000 0.0000000 . 0.00000000 0.58189642
#> [5,] 0.0000000 0.0000000 0.0000000 . 0.00000000 0.00000000
#> ... . . . . . .
#> [8,] 0.0000000 0.0000000 0.0000000 . 0.0000000 0.0000000
#> [9,] 0.3782352 0.0000000 0.0000000 . 0.0000000 0.6285016
#> [10,] 0.0000000 0.1226004 0.0000000 . 0.4560827 0.0000000
#> [11,] 0.0000000 0.0000000 0.5249243 . 0.0000000 0.0000000
#> [12,] 0.0000000 0.0000000 0.0000000 . 0.0000000 0.0000000
dgr <- as(dgc, "RsparseMatrix")
class(dgr)
#> [1] "dgRMatrix"
#> attr(,"package")
#> [1] "Matrix"
SparseArray(dgr) # a COO_SparseMatrix object
#> <12 x 7 SparseMatrix> of type "double" [nzcount=14 (17%)]:
#> [,1] [,2] [,3] ... [,6] [,7]
#> [1,] 0.0000000 0.0000000 0.0000000 . 0.00000000 0.00000000
#> [2,] 0.0000000 0.0000000 0.2981720 . 0.00000000 0.00000000
#> [3,] 0.0000000 0.1679951 0.0000000 . 0.08551527 0.00000000
#> [4,] 0.7284175 0.0000000 0.0000000 . 0.00000000 0.58189642
#> [5,] 0.0000000 0.0000000 0.0000000 . 0.00000000 0.00000000
#> ... . . . . . .
#> [8,] 0.0000000 0.0000000 0.0000000 . 0.0000000 0.0000000
#> [9,] 0.3782352 0.0000000 0.0000000 . 0.0000000 0.6285016
#> [10,] 0.0000000 0.1226004 0.0000000 . 0.4560827 0.0000000
#> [11,] 0.0000000 0.0000000 0.5249243 . 0.0000000 0.0000000
#> [12,] 0.0000000 0.0000000 0.0000000 . 0.0000000 0.0000000
## ---------------------------------------------------------------------
## nzcount(), nzwhich(), nzvals(), `nzvals<-`()
## ---------------------------------------------------------------------
x <- SparseArray(a)
## Get the number of nonzero array elements in 'x':
nzcount(x)
#> [1] 2333666
## nzwhich() returns the indices of the nonzero array elements in 'x'.
## Either as an integer (or numeric) vector of length 'nzcount(x)'
## containing "linear indices":
nzidx <- nzwhich(x)
length(nzidx)
#> [1] 2333666
head(nzidx)
#> [1] 1 5 10 14 15 26
## Or as an integer matrix with 'nzcount(x)' rows and one column per
## dimension where the rows represent "array indices" (a.k.a. "array
## coordinates"):
Mnzidx <- nzwhich(x, arr.ind=TRUE)
dim(Mnzidx)
#> [1] 2333666 3
## Each row in the matrix is an n-tuple representing the "array
## coordinates" of a nonzero element in 'x':
head(Mnzidx)
#> [,1] [,2] [,3]
#> [1,] 1 1 1
#> [2,] 5 1 1
#> [3,] 10 1 1
#> [4,] 14 1 1
#> [5,] 15 1 1
#> [6,] 26 1 1
tail(Mnzidx)
#> [,1] [,2] [,3]
#> [2333661,] 477 3000 6
#> [2333662,] 482 3000 6
#> [2333663,] 487 3000 6
#> [2333664,] 489 3000 6
#> [2333665,] 497 3000 6
#> [2333666,] 499 3000 6
## Extract the values of the nonzero array elements in 'x' and return
## them in a vector "parallel" to 'nzwhich(x)':
x_nzvals <- nzvals(x) # equivalent to 'x[nzwhich(x)]'
length(x_nzvals)
#> [1] 2333666
head(x_nzvals)
#> [1] 1 1 1 1 1 1
nzvals(x) <- log1p(nzvals(x))
x
#> <500 x 3000 x 6 SparseArray> of type "double" [nzcount=2333666 (26%)]:
#> ,,1
#> [,1] [,2] [,3] ... [,2999] [,3000]
#> [1,] 0.6931472 0.6931472 0.6931472 . 0.6931472 0.6931472
#> [2,] 0.0000000 0.6931472 0.0000000 . 0.0000000 0.0000000
#> ... . . . . . .
#> [499,] 0 0 0 . 0.0000000 0.6931472
#> [500,] 0 0 0 . 0.0000000 0.0000000
#>
#> ...
#>
#> ,,6
#> [,1] [,2] [,3] ... [,2999] [,3000]
#> [1,] 0.0000000 0.6931472 0.6931472 . 0.0000000 0.6931472
#> [2,] 0.0000000 1.3862944 0.0000000 . 0.0000000 1.0986123
#> ... . . . . . .
#> [499,] 0 0 0 . 0.0000000 0.6931472
#> [500,] 0 0 0 . 0.0000000 0.0000000
#>
## Sanity checks:
stopifnot(identical(nzidx, which(a != 0)))
stopifnot(identical(Mnzidx, which(a != 0, arr.ind=TRUE, useNames=FALSE)))
stopifnot(identical(x_nzvals, a[nzidx]))
stopifnot(identical(x_nzvals, a[Mnzidx]))
stopifnot(identical(`nzvals<-`(x, nzvals(x)), x))