Write/read the dimnames of an HDF5 dataset
h5writeDimnames.Rd
h5writeDimnames
and h5readDimnames
can be used to
write/read the dimnames of an HDF5 dataset to/from the HDF5 file.
Note that h5writeDimnames
is used internally by
HDF5Array::writeHDF5Array(x, ..., with.dimnames=TRUE)
to write the dimnames of x
to the HDF5 file together
with the array data.
set_h5dimnames
and get_h5dimnames
are low-level
utilities that can be used to attach existing HDF5 datasets
along the dimensions of a given HDF5 dataset, or to retrieve
the names of the HDF5 datasets that are attached along the
dimensions of a given HDF5 dataset.
Usage
h5writeDimnames(dimnames, filepath, name, group=NA, h5dimnames=NULL)
h5readDimnames(filepath, name, as.character=FALSE)
set_h5dimnames(filepath, name, h5dimnames, dry.run=FALSE)
get_h5dimnames(filepath, name)
Arguments
- dimnames
The dimnames to write to the HDF5 file. Must be supplied as a list (possibly named) with one list element per dimension in the HDF5 dataset specified via the
name
argument. Each list element indimnames
must be an atomic vector or aNULL
. When not aNULL
, its length must equal the extent of the corresponding dimension in the HDF5 dataset.- filepath
For
h5writeDimnames
andh5readDimnames
: The path (as a single string) to the HDF5 file where the dimnames should be written to or read from.For
set_h5dimnames
andget_h5dimnames
: The path (as a single string) to the HDF5 file where to set or get the h5dimnames.- name
For
h5writeDimnames
andh5readDimnames
: The name of the dataset in the HDF5 file for which the dimnames should be written or read.For
set_h5dimnames
andget_h5dimnames
: The name of the dataset in the HDF5 file for which to set or get the h5dimnames.- group
NA
(the default) or the name of the HDF5 group where to write the dimnames. If set toNA
then the group name is automatically generated fromname
. If set to the empty string (""
) then no group will be used.Except when
group
is set to the empty string, the names inh5dimnames
(see below) must be relative to the group.- h5dimnames
For
h5writeDimnames
:NULL
(the default) or a character vector containing the names of the HDF5 datasets (one per list element indimnames
) where to write the dimnames. Names associated withNULL
list elements indimnames
are ignored and should typically be NAs.If set to
NULL
then the names are automatically set to numbers indicating the associated dimensions ("1"
for the first dimension,"2"
for the second, etc...)For
set_h5dimnames
: A character vector containing the names of the HDF5 datasets to attach as dimnames of the dataset specified inname
. The vector must have one element per dimension in datasetname
. NAs are allowed and indicate dimensions along which nothing should be attached.- as.character
Even though the dimnames of an HDF5 dataset are usually stored as datasets of type
"character"
(H5 datatype"H5T_STRING"
) in the HDF5 file, this is not a requirement. By defaulth5readDimnames
will return them as-is. Setas.character
toTRUE
to make sure that they are returned as character vectors. See example below.- dry.run
When set to
TRUE
,set_h5dimnames
doesn't make any change to the HDF5 file but will still raise errors if the operation cannot be done.
Value
h5writeDimnames
and set_h5dimnames
return nothing.
h5readDimnames
returns a list (possibly named) with one list
element per dimension in HDF5 dataset name
and containing its
dimnames retrieved from the file.
get_h5dimnames
returns a character vector containing the names
of the HDF5 datasets that are currently set as the dimnames of the
dataset specified in name
. The vector has one element per
dimension in dataset name
. NAs in the vector indicate dimensions
along which nothing is set.
See also
writeHDF5Array
in the HDF5Array package for a high-level function to write an array-like object and its dimnames to an HDF5 file.h5write
in the rhdf5 package thath5writeDimnames
uses internally to write the dimnames to the HDF5 file.h5mread
in this package (h5mread) thath5readDimnames
uses internally to read the dimnames from the HDF5 file.h5ls
to list the content of an HDF5 file.
Examples
## ---------------------------------------------------------------------
## BASIC EXAMPLE
## ---------------------------------------------------------------------
library(rhdf5) # for h5write()
m0 <- matrix(1:60, ncol=5)
colnames(m0) <- LETTERS[1:5]
h5file <- tempfile(fileext=".h5")
h5write(m0, h5file, "M0") # h5write() ignores the dimnames
h5ls(h5file)
#> group name otype dclass dim
#> 0 / M0 H5I_DATASET INTEGER 12 x 5
h5writeDimnames(dimnames(m0), h5file, "M0")
h5ls(h5file)
#> group name otype dclass dim
#> 0 / .M0_dimnames H5I_GROUP
#> 1 /.M0_dimnames 2 H5I_DATASET STRING 5
#> 2 / M0 H5I_DATASET INTEGER 12 x 5
get_h5dimnames(h5file, "M0")
#> [1] NA "/.M0_dimnames/2"
h5readDimnames(h5file, "M0")
#> [[1]]
#> NULL
#>
#> [[2]]
#> [1] "A" "B" "C" "D" "E"
#>
## Reconstruct 'm0' from HDF5 file:
m1 <- h5mread(h5file, "M0")
dimnames(m1) <- h5readDimnames(h5file, "M0")
stopifnot(identical(m0, m1))
## Create an HDF5Array object that points to HDF5 dataset M0:
HDF5Array::HDF5Array(h5file, "M0")
#> <12 x 5> HDF5Matrix object of type "integer":
#> A B C D E
#> [1,] 1 13 25 37 49
#> [2,] 2 14 26 38 50
#> [3,] 3 15 27 39 51
#> [4,] 4 16 28 40 52
#> [5,] 5 17 29 41 53
#> ... . . . . .
#> [8,] 8 20 32 44 56
#> [9,] 9 21 33 45 57
#> [10,] 10 22 34 46 58
#> [11,] 11 23 35 47 59
#> [12,] 12 24 36 48 60
## Sanity checks:
stopifnot(
identical(dimnames(m0), h5readDimnames(h5file, "M0")),
identical(dimnames(m0), dimnames(HDF5Array::HDF5Array(h5file, "M0")))
)
## ---------------------------------------------------------------------
## SHARED DIMNAMES
## ---------------------------------------------------------------------
## If a collection of HDF5 datasets share the same dimnames, the
## dimnames only need to be written once in the HDF5 file. Then they
## can be attached to the individual datasets with set_h5dimnames():
h5write(array(runif(240), c(12, 5:4)), h5file, "A1")
set_h5dimnames(h5file, "A1", get_h5dimnames(h5file, "M0"))
get_h5dimnames(h5file, "A1")
#> [1] NA "/.M0_dimnames/2" NA
h5readDimnames(h5file, "A1")
#> [[1]]
#> NULL
#>
#> [[2]]
#> [1] "A" "B" "C" "D" "E"
#>
#> [[3]]
#> NULL
#>
HDF5Array::HDF5Array(h5file, "A1")
#> <12 x 5 x 4> HDF5Array object of type "double":
#> ,,1
#> A B C D E
#> [1,] 0.5138681 0.5613142 0.3359823 0.1017115 0.6887957
#> [2,] 0.8903995 0.5563498 0.5707793 0.4997081 0.7345539
#> ... . . . . .
#> [11,] 0.22040122 0.03208188 0.04340145 0.19352432 0.36232432
#> [12,] 0.63289237 0.56658312 0.22279007 0.75390293 0.89293491
#>
#> ...
#>
#> ,,4
#> A B C D E
#> [1,] 0.97442437 0.34824101 0.73438011 0.05068026 0.48549050
#> [2,] 0.43504479 0.56824725 0.41129356 0.62114155 0.70585658
#> ... . . . . .
#> [11,] 0.9499032 0.7897759 0.8846402 0.1689505 0.7994001
#> [12,] 0.4284685 0.4287632 0.9297739 0.9330450 0.3604983
#>
h5write(matrix(sample(letters, 60, replace=TRUE), ncol=5), h5file, "A2")
set_h5dimnames(h5file, "A2", get_h5dimnames(h5file, "M0"))
get_h5dimnames(h5file, "A2")
#> [1] NA "/.M0_dimnames/2"
h5readDimnames(h5file, "A2")
#> [[1]]
#> NULL
#>
#> [[2]]
#> [1] "A" "B" "C" "D" "E"
#>
HDF5Array::HDF5Array(h5file, "A2")
#> <12 x 5> HDF5Matrix object of type "character":
#> A B C D E
#> [1,] "f" "p" "p" "v" "w"
#> [2,] "g" "f" "x" "b" "s"
#> [3,] "w" "b" "j" "o" "x"
#> [4,] "c" "h" "c" "b" "z"
#> [5,] "g" "h" "d" "y" "f"
#> ... . . . . .
#> [8,] "n" "a" "e" "n" "y"
#> [9,] "n" "n" "k" "g" "i"
#> [10,] "t" "e" "d" "b" "i"
#> [11,] "b" "i" "a" "g" "v"
#> [12,] "a" "i" "u" "d" "c"
## Sanity checks:
stopifnot(identical(dimnames(m0), h5readDimnames(h5file, "A1")[1:2]))
stopifnot(identical(dimnames(m0), h5readDimnames(h5file, "A2")))
## ---------------------------------------------------------------------
## USE h5writeDimnames() AFTER A CALL TO writeHDF5Array()
## ---------------------------------------------------------------------
## After calling writeHDF5Array(x, ..., with.dimnames=FALSE) the
## dimnames on 'x' can still be written to the HDF5 file by doing the
## following:
## 1. Write 'm0' to the HDF5 file and ignore the dimnames (for now):
HDF5Array::writeHDF5Array(m0, h5file, "M2", with.dimnames=FALSE)
#> <12 x 5> HDF5Matrix object of type "integer":
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 13 25 37 49
#> [2,] 2 14 26 38 50
#> [3,] 3 15 27 39 51
#> [4,] 4 16 28 40 52
#> [5,] 5 17 29 41 53
#> ... . . . . .
#> [8,] 8 20 32 44 56
#> [9,] 9 21 33 45 57
#> [10,] 10 22 34 46 58
#> [11,] 11 23 35 47 59
#> [12,] 12 24 36 48 60
## 2. Use h5writeDimnames() to write 'dimnames(m0)' to the file and
## associate them with the "M2" dataset:
h5writeDimnames(dimnames(m0), h5file, "M2")
## 3. Use the HDF5Array() constructor to make an HDF5Array object that
## points to the "M2" dataset:
HDF5Array::HDF5Array(h5file, "M2")
#> <12 x 5> HDF5Matrix object of type "integer":
#> A B C D E
#> [1,] 1 13 25 37 49
#> [2,] 2 14 26 38 50
#> [3,] 3 15 27 39 51
#> [4,] 4 16 28 40 52
#> [5,] 5 17 29 41 53
#> ... . . . . .
#> [8,] 8 20 32 44 56
#> [9,] 9 21 33 45 57
#> [10,] 10 22 34 46 58
#> [11,] 11 23 35 47 59
#> [12,] 12 24 36 48 60
## Note that at step 2. you can use the extra arguments of
## h5writeDimnames() to take full control of where the dimnames
## should be stored in the file:
HDF5Array::writeHDF5Array(m0, h5file, "M3", with.dimnames=FALSE)
#> <12 x 5> HDF5Matrix object of type "integer":
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 13 25 37 49
#> [2,] 2 14 26 38 50
#> [3,] 3 15 27 39 51
#> [4,] 4 16 28 40 52
#> [5,] 5 17 29 41 53
#> ... . . . . .
#> [8,] 8 20 32 44 56
#> [9,] 9 21 33 45 57
#> [10,] 10 22 34 46 58
#> [11,] 11 23 35 47 59
#> [12,] 12 24 36 48 60
h5writeDimnames(dimnames(m0), h5file, "M3",
group="a_secret_place", h5dimnames=c("NA", "M3_dim2"))
h5ls(h5file)
#> group name otype dclass dim
#> 0 / .M0_dimnames H5I_GROUP
#> 1 /.M0_dimnames 2 H5I_DATASET STRING 5
#> 2 / .M2_dimnames H5I_GROUP
#> 3 /.M2_dimnames 2 H5I_DATASET STRING 5
#> 4 / A1 H5I_DATASET FLOAT 12 x 5 x 4
#> 5 / A2 H5I_DATASET STRING 12 x 5
#> 6 / M0 H5I_DATASET INTEGER 12 x 5
#> 7 / M2 H5I_DATASET INTEGER 12 x 5
#> 8 / M3 H5I_DATASET INTEGER 12 x 5
#> 9 / a_secret_place H5I_GROUP
#> 10 /a_secret_place M3_dim2 H5I_DATASET STRING 5
## h5readDimnames() and HDF5Array() still "find" the dimnames:
h5readDimnames(h5file, "M3")
#> [[1]]
#> NULL
#>
#> [[2]]
#> [1] "A" "B" "C" "D" "E"
#>
HDF5Array::HDF5Array(h5file, "M3")
#> <12 x 5> HDF5Matrix object of type "integer":
#> A B C D E
#> [1,] 1 13 25 37 49
#> [2,] 2 14 26 38 50
#> [3,] 3 15 27 39 51
#> [4,] 4 16 28 40 52
#> [5,] 5 17 29 41 53
#> ... . . . . .
#> [8,] 8 20 32 44 56
#> [9,] 9 21 33 45 57
#> [10,] 10 22 34 46 58
#> [11,] 11 23 35 47 59
#> [12,] 12 24 36 48 60
## Sanity checks:
stopifnot(
identical(dimnames(m0), h5readDimnames(h5file, "M3")),
identical(dimnames(m0), dimnames(HDF5Array::HDF5Array(h5file, "M3")))
)
## ---------------------------------------------------------------------
## STORE THE DIMNAMES AS NON-CHARACTER TYPES
## ---------------------------------------------------------------------
HDF5Array::writeHDF5Array(m0, h5file, "M4", with.dimnames=FALSE)
#> <12 x 5> HDF5Matrix object of type "integer":
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 13 25 37 49
#> [2,] 2 14 26 38 50
#> [3,] 3 15 27 39 51
#> [4,] 4 16 28 40 52
#> [5,] 5 17 29 41 53
#> ... . . . . .
#> [8,] 8 20 32 44 56
#> [9,] 9 21 33 45 57
#> [10,] 10 22 34 46 58
#> [11,] 11 23 35 47 59
#> [12,] 12 24 36 48 60
dimnames <- list(1001:1012, as.raw(11:15))
h5writeDimnames(dimnames, h5file, "M4")
h5ls(h5file)
#> group name otype dclass dim
#> 0 / .M0_dimnames H5I_GROUP
#> 1 /.M0_dimnames 2 H5I_DATASET STRING 5
#> 2 / .M2_dimnames H5I_GROUP
#> 3 /.M2_dimnames 2 H5I_DATASET STRING 5
#> 4 / .M4_dimnames H5I_GROUP
#> 5 /.M4_dimnames 1 H5I_DATASET INTEGER 12
#> 6 /.M4_dimnames 2 H5I_DATASET INTEGER 5
#> 7 / A1 H5I_DATASET FLOAT 12 x 5 x 4
#> 8 / A2 H5I_DATASET STRING 12 x 5
#> 9 / M0 H5I_DATASET INTEGER 12 x 5
#> 10 / M2 H5I_DATASET INTEGER 12 x 5
#> 11 / M3 H5I_DATASET INTEGER 12 x 5
#> 12 / M4 H5I_DATASET INTEGER 12 x 5
#> 13 / a_secret_place H5I_GROUP
#> 14 /a_secret_place M3_dim2 H5I_DATASET STRING 5
h5readDimnames(h5file, "M4")
#> [[1]]
#> [1] 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012
#>
#> [[2]]
#> [1] 0b 0c 0d 0e 0f
#>
h5readDimnames(h5file, "M4", as.character=TRUE)
#> [[1]]
#> [1] "1001" "1002" "1003" "1004" "1005" "1006" "1007" "1008" "1009" "1010"
#> [11] "1011" "1012"
#>
#> [[2]]
#> [1] "0b" "0c" "0d" "0e" "0f"
#>
## Sanity checks:
stopifnot(identical(dimnames, h5readDimnames(h5file, "M4")))
dimnames(m0) <- dimnames
stopifnot(identical(
dimnames(m0),
h5readDimnames(h5file, "M4", as.character=TRUE)
))