Random SparseArray object
randomSparseArray.Rd
randomSparseArray() and poissonSparseArray() can be used to generate a random SparseArray object efficiently.
Usage
randomSparseArray(dim, density=0.05, dimnames=NULL)
poissonSparseArray(dim, lambda=-log(0.95), density=NA, dimnames=NULL)
## Convenience wrappers for the 2D case:
randomSparseMatrix(nrow, ncol, density=0.05, dimnames=NULL)
poissonSparseMatrix(nrow, ncol, lambda=-log(0.95), density=NA,
dimnames=NULL)
Arguments
- dim
The dimensions (specified as an integer vector) of the SparseArray object to generate.
- density
The desired density (specified as a number >= 0 and <= 1) of the SparseArray object to generate, that is, the ratio between its number of nonzero elements and its total number of elements. This is nzcount(x)/length(x) or 1 - sparsity(x).
Note that for poissonSparseArray() and poissonSparseMatrix(), density must be < 1, and the actual density of the returned object won't be exactly as requested but will typically be very close.
- dimnames
The dimnames to put on the object to generate. Must be NULL or a list of length the number of dimensions. Each list element must be either NULL or a character vector along the corresponding dimension.
- lambda
The mean of the Poisson distribution. Passed internally to the calls to rpois().
Only one of lambda and density can be specified.
When density is requested, rpois() is called internally with lambda set to -log(1 - density). This is expected to generate Poisson data with the requested density.
Finally, note that the default value for lambda corresponds to a requested density of 0.05.
- nrow, ncol
Number of rows and columns of the SparseMatrix object to generate.
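The link between lambda and density described for the lambda argument follows from the Poisson probability of a zero, P(X = 0) = exp(-lambda), so the expected fraction of nonzero values is 1 - exp(-lambda). The following is a minimal base R sketch that checks this numerically. The helper name density_to_lambda is hypothetical and not part of the package.

```r
## Hypothetical helper (NOT part of SparseArray): the lambda that yields a
## given expected density of nonzero values in Poisson data.
density_to_lambda <- function(density) {
    stopifnot(is.numeric(density), density >= 0, density < 1)
    -log(1 - density)
}

## The default lambda, -log(0.95), corresponds to a density of 0.05.
lambda <- density_to_lambda(0.05)
all.equal(lambda, -log(0.95))

## Expected density under Poisson(lambda) is P(X > 0) = 1 - exp(-lambda).
expected_density <- 1 - exp(-lambda)

## Empirical check: the observed density is close to, but generally not
## exactly equal to, the requested one.
set.seed(42)
x <- rpois(1e6, lambda)
observed_density <- mean(x != 0)
```

This also illustrates why density must be < 1 for the Poisson generators: -log(1 - density) is infinite at density = 1.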
Details
randomSparseArray() mimics the rsparsematrix() function from the Matrix package but returns a SparseArray object instead of a dgCMatrix object.
poissonSparseArray() populates a SparseArray object with Poisson data, i.e. it's equivalent to filling an ordinary (dense) array a of the same dimensions with rpois() values and then converting it with SparseArray(a), but is faster and more memory efficient because the intermediate dense array a is never generated.
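A minimal sketch of this equivalence, modeled on the svt3 example in the Examples section but with smaller, arbitrarily chosen dimensions. The variable names are illustrative only.

```r
library(SparseArray)

dims <- c(60, 170, 8)   # arbitrary small dimensions, chosen for illustration
lambda <- 0.01

## Dense route: materialize the full array, then convert it.
set.seed(123)
a <- array(rpois(prod(dims), lambda), dims)
svt_dense_route <- SparseArray(a)

## Direct route: the intermediate dense array 'a' is never created.
set.seed(123)
svt_direct <- poissonSparseArray(dims, lambda=lambda)

## With the same seed, both routes are expected to give the same object.
identical(svt_direct, svt_dense_route)
```

For small arrays the two routes cost about the same. The direct route matters when the dense array would not fit comfortably in memory.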
Value
A SparseArray derivative (of class SVT_SparseArray or SVT_SparseMatrix) with the requested dimensions and density.
The type of the returned object is "double" for randomSparseArray() and randomSparseMatrix(), and "integer" for poissonSparseArray() and poissonSparseMatrix().
Note
Unlike with Matrix::rsparsematrix() there's no limit on the number of nonzero elements that can be contained in the returned SparseArray object.
For example Matrix::rsparsematrix(3e5, 2e4, density=0.5) will fail with an error but randomSparseMatrix(3e5, 2e4, density=0.5) should work (even though it will take some time and the memory footprint of the resulting object will be about 18 Gb).
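The size of this example can be checked with simple arithmetic. The sketch below assumes (this is not stated in the note itself) that the rsparsematrix() failure comes from the requested number of nonzero elements exceeding the largest value representable as an R integer, .Machine$integer.max.

```r
n_row <- 3e5
n_col <- 2e4
density <- 0.5

## Number of nonzero elements requested.
nzcount <- n_row * n_col * density
nzcount                            # 3e+09

## Largest R integer (2^31 - 1).
.Machine$integer.max               # 2147483647

## Assumption: this comparison is what makes the dgCMatrix route fail.
nzcount > .Machine$integer.max     # TRUE
```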
See also
- The Matrix::rsparsematrix function in the Matrix package.
- The stats::rpois function in the stats package.
- SVT_SparseArray objects.
Examples
## ---------------------------------------------------------------------
## randomSparseArray() / randomSparseMatrix()
## ---------------------------------------------------------------------
set.seed(123)
dgcm1 <- Matrix::rsparsematrix(2500, 950, density=0.1)
set.seed(123)
svt1 <- randomSparseMatrix(2500, 950, density=0.1)
svt1
#> <2500 x 950 SparseMatrix> of type "double" [nzcount=237500 (10%)]:
#> [,1] [,2] [,3] ... [,949] [,950]
#> [1,] 0 0 0 . 0 0
#> [2,] 0 0 0 . 0 0
#> [3,] 0 0 0 . 0 0
#> [4,] 0 0 0 . 0 0
#> [5,] 0 0 0 . 0 0
#> ... . . . . . .
#> [2496,] 0.00 0.00 0.00 . -0.18 0.00
#> [2497,] 0.00 0.00 0.00 . 0.00 0.00
#> [2498,] 0.00 0.00 0.00 . 0.00 0.00
#> [2499,] 0.00 0.00 0.00 . 0.00 0.00
#> [2500,] 0.00 0.53 0.00 . 0.00 0.00
type(svt1) # "double"
#> [1] "double"
stopifnot(identical(as(svt1, "dgCMatrix"), dgcm1))
## ---------------------------------------------------------------------
## poissonSparseArray() / poissonSparseMatrix()
## ---------------------------------------------------------------------
svt2 <- poissonSparseMatrix(2500, 950, density=0.1)
svt2
#> <2500 x 950 SparseMatrix> of type "integer" [nzcount=237276 (10%)]:
#> [,1] [,2] [,3] [,4] ... [,947] [,948] [,949] [,950]
#> [1,] 1 0 0 0 . 0 0 0 1
#> [2,] 1 0 0 0 . 0 0 0 0
#> [3,] 0 1 0 0 . 0 1 0 0
#> [4,] 0 1 0 0 . 0 0 0 0
#> [5,] 0 0 0 0 . 0 0 0 0
#> ... . . . . . . . . .
#> [2496,] 0 0 0 0 . 0 1 0 0
#> [2497,] 0 0 0 0 . 0 0 0 0
#> [2498,] 0 0 0 0 . 0 0 0 0
#> [2499,] 1 0 1 0 . 0 0 0 2
#> [2500,] 0 0 1 0 . 0 0 0 0
type(svt2) # "integer"
#> [1] "integer"
1 - sparsity(svt2) # very close to the requested density
#> [1] 0.09990568
set.seed(123)
svt3 <- poissonSparseArray(c(600, 1700, 80), lambda=0.01)
set.seed(123)
a3 <- array(rpois(length(svt3), lambda=0.01), dim(svt3))
stopifnot(identical(svt3, SparseArray(a3)))
## The memory footprint of 'svt3' is more than 15x smaller than that of 'a3':
object.size(svt3)
#> 20613424 bytes
object.size(a3)
#> 326400224 bytes
as.double(object.size(a3) / object.size(svt3))
#> [1] 15.83435