Skip to contents

h5writeDimnames and h5readDimnames can be used to write/read the dimnames of an HDF5 dataset to/from the HDF5 file.

Note that h5writeDimnames is used internally by HDF5Array::writeHDF5Array(x, ..., with.dimnames=TRUE) to write the dimnames of x to the HDF5 file together with the array data.

set_h5dimnames and get_h5dimnames are low-level utilities that can be used to attach existing HDF5 datasets along the dimensions of a given HDF5 dataset, or to retrieve the names of the HDF5 datasets that are attached along the dimensions of a given HDF5 dataset.

Usage

h5writeDimnames(dimnames, filepath, name, group=NA, h5dimnames=NULL)
h5readDimnames(filepath, name, as.character=FALSE)

set_h5dimnames(filepath, name, h5dimnames, dry.run=FALSE)
get_h5dimnames(filepath, name)

Arguments

dimnames

The dimnames to write to the HDF5 file. Must be supplied as a list (possibly named) with one list element per dimension in the HDF5 dataset specified via the name argument. Each list element in dimnames must be an atomic vector or a NULL. When not a NULL, its length must equal the extent of the corresponding dimension in the HDF5 dataset.

filepath

For h5writeDimnames and h5readDimnames: The path (as a single string) to the HDF5 file where the dimnames should be written to or read from.

For set_h5dimnames and get_h5dimnames: The path (as a single string) to the HDF5 file where to set or get the h5dimnames.

name

For h5writeDimnames and h5readDimnames: The name of the dataset in the HDF5 file for which the dimnames should be written or read.

For set_h5dimnames and get_h5dimnames: The name of the dataset in the HDF5 file for which to set or get the h5dimnames.

group

NA (the default) or the name of the HDF5 group where to write the dimnames. If set to NA then the group name is automatically generated from name. If set to the empty string ("") then no group will be used.

Except when group is set to the empty string, the names in h5dimnames (see below) must be relative to the group.

h5dimnames

For h5writeDimnames: NULL (the default) or a character vector containing the names of the HDF5 datasets (one per list element in dimnames) where to write the dimnames. Names associated with NULL list elements in dimnames are ignored and should typically be NAs.

If set to NULL then the names are automatically set to numbers indicating the associated dimensions ("1" for the first dimension, "2" for the second, etc...)

For set_h5dimnames: A character vector containing the names of the HDF5 datasets to attach as dimnames of the dataset specified in name. The vector must have one element per dimension in dataset name. NAs are allowed and indicate dimensions along which nothing should be attached.

as.character

Even though the dimnames of an HDF5 dataset are usually stored as datasets of type "character" (H5 datatype "H5T_STRING") in the HDF5 file, this is not a requirement. By default h5readDimnames will return them as-is. Set as.character to TRUE to make sure that they are returned as character vectors. See example below.

dry.run

When set to TRUE, set_h5dimnames doesn't make any change to the HDF5 file but will still raise errors if the operation cannot be done.

Value

h5writeDimnames and set_h5dimnames return nothing.

h5readDimnames returns a list (possibly named) with one list element per dimension in HDF5 dataset name and containing its dimnames retrieved from the file.

get_h5dimnames returns a character vector containing the names of the HDF5 datasets that are currently set as the dimnames of the dataset specified in name. The vector has one element per dimension in dataset name. NAs in the vector indicate dimensions along which nothing is set.

See also

  • writeHDF5Array in the HDF5Array package for a high-level function to write an array-like object and its dimnames to an HDF5 file.

  • h5write in the rhdf5 package that h5writeDimnames uses internally to write the dimnames to the HDF5 file.

  • h5mread in this package (h5mread) that h5readDimnames uses internally to read the dimnames from the HDF5 file.

  • h5ls to list the content of an HDF5 file.

Examples

## ---------------------------------------------------------------------
## BASIC EXAMPLE
## ---------------------------------------------------------------------
library(rhdf5)  # for h5write()

m0 <- matrix(1:60, ncol=5)
colnames(m0) <- LETTERS[1:5]

h5file <- tempfile(fileext=".h5")
h5write(m0, h5file, "M0")  # h5write() ignores the dimnames
h5ls(h5file)
#>   group name       otype  dclass    dim
#> 0     /   M0 H5I_DATASET INTEGER 12 x 5

h5writeDimnames(dimnames(m0), h5file, "M0")
h5ls(h5file)
#>           group         name       otype  dclass    dim
#> 0             / .M0_dimnames   H5I_GROUP               
#> 1 /.M0_dimnames            2 H5I_DATASET  STRING      5
#> 2             /           M0 H5I_DATASET INTEGER 12 x 5

get_h5dimnames(h5file, "M0")
#> [1] NA                "/.M0_dimnames/2"
h5readDimnames(h5file, "M0")
#> [[1]]
#> NULL
#> 
#> [[2]]
#> [1] "A" "B" "C" "D" "E"
#> 

## Reconstruct 'm0' from HDF5 file:
m1 <- h5mread(h5file, "M0")
dimnames(m1) <- h5readDimnames(h5file, "M0")
stopifnot(identical(m0, m1))

## Create an HDF5Array object that points to HDF5 dataset M0:
HDF5Array::HDF5Array(h5file, "M0")
#> <12 x 5> HDF5Matrix object of type "integer":
#>        A  B  C  D  E
#>  [1,]  1 13 25 37 49
#>  [2,]  2 14 26 38 50
#>  [3,]  3 15 27 39 51
#>  [4,]  4 16 28 40 52
#>  [5,]  5 17 29 41 53
#>   ...  .  .  .  .  .
#>  [8,]  8 20 32 44 56
#>  [9,]  9 21 33 45 57
#> [10,] 10 22 34 46 58
#> [11,] 11 23 35 47 59
#> [12,] 12 24 36 48 60

## Sanity checks:
stopifnot(
  identical(dimnames(m0), h5readDimnames(h5file, "M0")),
  identical(dimnames(m0), dimnames(HDF5Array::HDF5Array(h5file, "M0")))
)

## ---------------------------------------------------------------------
## SHARED DIMNAMES
## ---------------------------------------------------------------------
## If a collection of HDF5 datasets share the same dimnames, the
## dimnames only need to be written once in the HDF5 file. Then they
## can be attached to the individual datasets with set_h5dimnames():

h5write(array(runif(240), c(12, 5:4)), h5file, "A1")
set_h5dimnames(h5file, "A1", get_h5dimnames(h5file, "M0"))
get_h5dimnames(h5file, "A1")
#> [1] NA                "/.M0_dimnames/2" NA               
h5readDimnames(h5file, "A1")
#> [[1]]
#> NULL
#> 
#> [[2]]
#> [1] "A" "B" "C" "D" "E"
#> 
#> [[3]]
#> NULL
#> 
HDF5Array::HDF5Array(h5file, "A1")
#> <12 x 5 x 4> HDF5Array object of type "double":
#> ,,1
#>                A          B          C          D          E
#>  [1,]  0.5138681  0.5613142  0.3359823  0.1017115  0.6887957
#>  [2,]  0.8903995  0.5563498  0.5707793  0.4997081  0.7345539
#>   ...          .          .          .          .          .
#> [11,] 0.22040122 0.03208188 0.04340145 0.19352432 0.36232432
#> [12,] 0.63289237 0.56658312 0.22279007 0.75390293 0.89293491
#> 
#> ...
#> 
#> ,,4
#>                A          B          C          D          E
#>  [1,] 0.97442437 0.34824101 0.73438011 0.05068026 0.48549050
#>  [2,] 0.43504479 0.56824725 0.41129356 0.62114155 0.70585658
#>   ...          .          .          .          .          .
#> [11,]  0.9499032  0.7897759  0.8846402  0.1689505  0.7994001
#> [12,]  0.4284685  0.4287632  0.9297739  0.9330450  0.3604983
#> 

h5write(matrix(sample(letters, 60, replace=TRUE), ncol=5), h5file, "A2")
set_h5dimnames(h5file, "A2", get_h5dimnames(h5file, "M0"))
get_h5dimnames(h5file, "A2")
#> [1] NA                "/.M0_dimnames/2"
h5readDimnames(h5file, "A2")
#> [[1]]
#> NULL
#> 
#> [[2]]
#> [1] "A" "B" "C" "D" "E"
#> 
HDF5Array::HDF5Array(h5file, "A2")
#> <12 x 5> HDF5Matrix object of type "character":
#>       A   B   C   D   E  
#> [1,]  "f" "p" "p" "v" "w"
#> [2,]  "g" "f" "x" "b" "s"
#> [3,]  "w" "b" "j" "o" "x"
#> [4,]  "c" "h" "c" "b" "z"
#> [5,]  "g" "h" "d" "y" "f"
#> ...   .   .   .   .   .  
#> [8,]  "n" "a" "e" "n" "y"
#> [9,]  "n" "n" "k" "g" "i"
#> [10,] "t" "e" "d" "b" "i"
#> [11,] "b" "i" "a" "g" "v"
#> [12,] "a" "i" "u" "d" "c"

## Sanity checks:
stopifnot(identical(dimnames(m0), h5readDimnames(h5file, "A1")[1:2]))
stopifnot(identical(dimnames(m0), h5readDimnames(h5file, "A2")))

## ---------------------------------------------------------------------
## USE h5writeDimnames() AFTER A CALL TO writeHDF5Array()
## ---------------------------------------------------------------------
## After calling writeHDF5Array(x, ..., with.dimnames=FALSE) the
## dimnames on 'x' can still be written to the HDF5 file by doing the
## following:

## 1. Write 'm0' to the HDF5 file and ignore the dimnames (for now):
HDF5Array::writeHDF5Array(m0, h5file, "M2", with.dimnames=FALSE)
#> <12 x 5> HDF5Matrix object of type "integer":
#>       [,1] [,2] [,3] [,4] [,5]
#>  [1,]    1   13   25   37   49
#>  [2,]    2   14   26   38   50
#>  [3,]    3   15   27   39   51
#>  [4,]    4   16   28   40   52
#>  [5,]    5   17   29   41   53
#>   ...    .    .    .    .    .
#>  [8,]    8   20   32   44   56
#>  [9,]    9   21   33   45   57
#> [10,]   10   22   34   46   58
#> [11,]   11   23   35   47   59
#> [12,]   12   24   36   48   60

## 2. Use h5writeDimnames() to write 'dimnames(m0)' to the file and
##    associate them with the "M2" dataset:
h5writeDimnames(dimnames(m0), h5file, "M2")

## 3. Use the HDF5Array() constructor to make an HDF5Array object that
##    points to the "M2" dataset:
HDF5Array::HDF5Array(h5file, "M2")
#> <12 x 5> HDF5Matrix object of type "integer":
#>        A  B  C  D  E
#>  [1,]  1 13 25 37 49
#>  [2,]  2 14 26 38 50
#>  [3,]  3 15 27 39 51
#>  [4,]  4 16 28 40 52
#>  [5,]  5 17 29 41 53
#>   ...  .  .  .  .  .
#>  [8,]  8 20 32 44 56
#>  [9,]  9 21 33 45 57
#> [10,] 10 22 34 46 58
#> [11,] 11 23 35 47 59
#> [12,] 12 24 36 48 60

## Note that at step 2. you can use the extra arguments of
## h5writeDimnames() to take full control of where the dimnames
## should be stored in the file:
HDF5Array::writeHDF5Array(m0, h5file, "M3", with.dimnames=FALSE)
#> <12 x 5> HDF5Matrix object of type "integer":
#>       [,1] [,2] [,3] [,4] [,5]
#>  [1,]    1   13   25   37   49
#>  [2,]    2   14   26   38   50
#>  [3,]    3   15   27   39   51
#>  [4,]    4   16   28   40   52
#>  [5,]    5   17   29   41   53
#>   ...    .    .    .    .    .
#>  [8,]    8   20   32   44   56
#>  [9,]    9   21   33   45   57
#> [10,]   10   22   34   46   58
#> [11,]   11   23   35   47   59
#> [12,]   12   24   36   48   60
h5writeDimnames(dimnames(m0), h5file, "M3",
                group="a_secret_place", h5dimnames=c("NA", "M3_dim2"))
h5ls(h5file)
#>              group           name       otype  dclass        dim
#> 0                /   .M0_dimnames   H5I_GROUP                   
#> 1    /.M0_dimnames              2 H5I_DATASET  STRING          5
#> 2                /   .M2_dimnames   H5I_GROUP                   
#> 3    /.M2_dimnames              2 H5I_DATASET  STRING          5
#> 4                /             A1 H5I_DATASET   FLOAT 12 x 5 x 4
#> 5                /             A2 H5I_DATASET  STRING     12 x 5
#> 6                /             M0 H5I_DATASET INTEGER     12 x 5
#> 7                /             M2 H5I_DATASET INTEGER     12 x 5
#> 8                /             M3 H5I_DATASET INTEGER     12 x 5
#> 9                / a_secret_place   H5I_GROUP                   
#> 10 /a_secret_place        M3_dim2 H5I_DATASET  STRING          5
## h5readDimnames() and HDF5Array() still "find" the dimnames:
h5readDimnames(h5file, "M3")
#> [[1]]
#> NULL
#> 
#> [[2]]
#> [1] "A" "B" "C" "D" "E"
#> 
HDF5Array::HDF5Array(h5file, "M3")
#> <12 x 5> HDF5Matrix object of type "integer":
#>        A  B  C  D  E
#>  [1,]  1 13 25 37 49
#>  [2,]  2 14 26 38 50
#>  [3,]  3 15 27 39 51
#>  [4,]  4 16 28 40 52
#>  [5,]  5 17 29 41 53
#>   ...  .  .  .  .  .
#>  [8,]  8 20 32 44 56
#>  [9,]  9 21 33 45 57
#> [10,] 10 22 34 46 58
#> [11,] 11 23 35 47 59
#> [12,] 12 24 36 48 60

## Sanity checks:
stopifnot(
  identical(dimnames(m0), h5readDimnames(h5file, "M3")),
  identical(dimnames(m0), dimnames(HDF5Array::HDF5Array(h5file, "M3")))
)

## ---------------------------------------------------------------------
## STORE THE DIMNAMES AS NON-CHARACTER TYPES
## ---------------------------------------------------------------------
HDF5Array::writeHDF5Array(m0, h5file, "M4", with.dimnames=FALSE)
#> <12 x 5> HDF5Matrix object of type "integer":
#>       [,1] [,2] [,3] [,4] [,5]
#>  [1,]    1   13   25   37   49
#>  [2,]    2   14   26   38   50
#>  [3,]    3   15   27   39   51
#>  [4,]    4   16   28   40   52
#>  [5,]    5   17   29   41   53
#>   ...    .    .    .    .    .
#>  [8,]    8   20   32   44   56
#>  [9,]    9   21   33   45   57
#> [10,]   10   22   34   46   58
#> [11,]   11   23   35   47   59
#> [12,]   12   24   36   48   60
dimnames <- list(1001:1012, as.raw(11:15))
h5writeDimnames(dimnames, h5file, "M4")
h5ls(h5file)
#>              group           name       otype  dclass        dim
#> 0                /   .M0_dimnames   H5I_GROUP                   
#> 1    /.M0_dimnames              2 H5I_DATASET  STRING          5
#> 2                /   .M2_dimnames   H5I_GROUP                   
#> 3    /.M2_dimnames              2 H5I_DATASET  STRING          5
#> 4                /   .M4_dimnames   H5I_GROUP                   
#> 5    /.M4_dimnames              1 H5I_DATASET INTEGER         12
#> 6    /.M4_dimnames              2 H5I_DATASET INTEGER          5
#> 7                /             A1 H5I_DATASET   FLOAT 12 x 5 x 4
#> 8                /             A2 H5I_DATASET  STRING     12 x 5
#> 9                /             M0 H5I_DATASET INTEGER     12 x 5
#> 10               /             M2 H5I_DATASET INTEGER     12 x 5
#> 11               /             M3 H5I_DATASET INTEGER     12 x 5
#> 12               /             M4 H5I_DATASET INTEGER     12 x 5
#> 13               / a_secret_place   H5I_GROUP                   
#> 14 /a_secret_place        M3_dim2 H5I_DATASET  STRING          5

h5readDimnames(h5file, "M4")
#> [[1]]
#>  [1] 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012
#> 
#> [[2]]
#> [1] 0b 0c 0d 0e 0f
#> 
h5readDimnames(h5file, "M4", as.character=TRUE)
#> [[1]]
#>  [1] "1001" "1002" "1003" "1004" "1005" "1006" "1007" "1008" "1009" "1010"
#> [11] "1011" "1012"
#> 
#> [[2]]
#> [1] "0b" "0c" "0d" "0e" "0f"
#> 

## Sanity checks:
stopifnot(identical(dimnames, h5readDimnames(h5file, "M4")))
dimnames(m0) <- dimnames
stopifnot(identical(
    dimnames(m0),
    h5readDimnames(h5file, "M4", as.character=TRUE)
))