Multi-attribute Arrays¶
In this tutorial we will learn how to add multiple attributes to TileDB arrays. We will focus only on dense arrays, as everything you learn here applies to sparse arrays as well in a straightforward manner. It is recommended to read the tutorial on dense arrays first.
Basic concepts and definitions¶
Creating a multi-attribute array¶
This is similar to what we covered in the simple dense array example. The only
difference is that we add two attributes to the array schema instead of one,
namely a1, which stores characters, and a2, which stores floats. Notice,
however, that a2 is defined to store two float values per cell.
Note
In the current version of TileDB, once an array has been created, you cannot modify the array schema. This means that it is not currently possible to add attributes to, or remove attributes from, an existing array.
Writing to the array¶
Writing is similar to the simple dense array example. The difference here is that
we need to prepare two data buffers (one for a1 and one for a2).
Note that there should be a one-to-one correspondence between the values of a1
and a2 in the buffers; for instance, value 1 in data_a1 is associated with value
(1.1, 1.2) in data_a2 (recall that each cell stores two floats on a2), 2 in
data_a1 with (2.1, 2.2) in data_a2, etc.
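The buffer correspondence described above can be sketched in plain Python. This is only an illustration of the layout, not TileDB API code: the lists stand in for the write buffers, the 4x4 array size (16 cells) is an assumption carried over from the dense array example, and the values beyond the first two cells simply continue the pattern implied by "etc.". The helper cell_values is hypothetical.

```python
# Illustrative only: plain Python lists stand in for TileDB write buffers.
NUM_CELLS = 16          # assumed 4x4 dense array
A2_VALS_PER_CELL = 2    # a2 stores two floats per cell

# a1: one value per cell -> 1, 2, ..., 16
data_a1 = list(range(1, NUM_CELLS + 1))

# a2: two values per cell, flattened -> 1.1, 1.2, 2.1, 2.2, ...
data_a2 = []
for v in data_a1:
    data_a2.extend([round(v + 0.1, 1), round(v + 0.2, 1)])


def cell_values(i):
    """Return (a1 value, a2 pair) for the i-th cell (0-based) in buffer order."""
    start = i * A2_VALS_PER_CELL
    return data_a1[i], tuple(data_a2[start:start + A2_VALS_PER_CELL])


first = cell_values(0)    # a1 value 1 paired with (1.1, 1.2)
second = cell_values(1)   # a1 value 2 paired with (2.1, 2.2)
```

The a2 buffer is twice as long as the a1 buffer, and the values for cell i occupy positions 2i and 2i+1 of data_a2.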
Warning
During writing, you must provide a value for every attribute of each cell being written; otherwise, an error will be thrown.
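As a rough sketch of what this rule implies for the buffers, the hypothetical check below rejects a write that omits an attribute or supplies the wrong number of values. It is not TileDB's own validation logic; the function name check_write_buffers and the SCHEMA dictionary are assumptions made for illustration.

```python
# Hypothetical pre-write check; this is not TileDB's own validation code.
# Maps attribute name -> number of values stored per cell.
SCHEMA = {"a1": 1, "a2": 2}


def check_write_buffers(buffers, num_cells, schema=SCHEMA):
    """Raise ValueError unless every attribute has a buffer of the right length."""
    missing = sorted(set(schema) - set(buffers))
    if missing:
        raise ValueError(f"no buffer provided for attribute(s): {missing}")
    for name, vals_per_cell in schema.items():
        expected = num_cells * vals_per_cell
        if len(buffers[name]) != expected:
            raise ValueError(
                f"attribute {name!r}: expected {expected} values, "
                f"got {len(buffers[name])}"
            )


# A complete write for 2 cells passes silently.
check_write_buffers({"a1": [1, 2], "a2": [1.1, 1.2, 2.1, 2.2]}, num_cells=2)

# Omitting a2 is rejected.
try:
    check_write_buffers({"a1": [1, 2]}, num_cells=2)
    rejected = False
except ValueError:
    rejected = True
```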
The array on disk now stores the written data. The resulting array is depicted in the figure below.
Reading from the array¶
We focus on subarray [1,2], [2,4].
Subselecting on attributes¶
While you must provide values for all attributes during writes, the same is not true during reads.
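The idea of reading a subarray for only some attributes can be sketched in plain Python. This does not use the TileDB API: the read function is hypothetical, and the sketch assumes a 4x4 array with domain [1,4] on both dimensions, row-major cell order, and a1 values 1 to 16 paired with a2 values (1.1, 1.2), (2.1, 2.2), and so on.

```python
# Illustrative only: emulates a dense read over plain Python lists.
COLS = 4   # assumed 4x4 array, domain [1,4] on both dimensions
VALS_PER_CELL = {"a1": 1, "a2": 2}

# Assumed contents, stored in row-major order.
ARRAY = {
    "a1": list(range(1, 17)),
    "a2": [round(v + d, 1) for v in range(1, 17) for d in (0.1, 0.2)],
}


def read(row_range, col_range, attrs=("a1", "a2")):
    """Return the requested attributes for an inclusive subarray, row-major."""
    result = {name: [] for name in attrs}
    for r in range(row_range[0], row_range[1] + 1):
        for c in range(col_range[0], col_range[1] + 1):
            cell = (r - 1) * COLS + (c - 1)   # 0-based position of the cell
            for name in attrs:
                n = VALS_PER_CELL[name]
                result[name].extend(ARRAY[name][cell * n:(cell + 1) * n])
    return result


both = read((1, 2), (2, 4))                     # a1 and a2
only_a1 = read((1, 2), (2, 4), attrs=("a1",))   # a2 is never touched
```

Under these assumptions, the subarray [1,2], [2,4] covers six cells, and a read restricted to a1 returns six values without touching the a2 data at all.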
If you compile and run the example of this tutorial as shown below, you should see the following output:
On-disk structure¶
Let us look at the contents of the array of this example on disk.
$ ls -l multi_attribute_array/
total 8
drwx------ 5 stavros staff 160 Jun 25 15:34 __1561491299419_1561491299419_fcb0ee91899142baad8a08049c0e2319
-rwx------ 1 stavros staff 159 Jun 25 15:34 __array_schema.tdb
-rwx------ 1 stavros staff 0 Jun 25 15:34 __lock.tdb
drwx------ 2 stavros staff 64 Jun 25 15:34 __meta
$ ls -l multi_attribute_array/__1561491299419_1561491299419_fcb0ee91899142baad8a08049c0e2319/
total 24
-rwx------ 1 stavros staff 939 Jun 25 15:34 __fragment_metadata.tdb
-rwx------ 1 stavros staff 36 Jun 25 15:34 a1.tdb
-rwx------ 1 stavros staff 148 Jun 25 15:34 a2.tdb
TileDB created two separate attribute files in fragment subdirectory
__1561491299419_1561491299419_fcb0ee91899142baad8a08049c0e2319:
a1.tdb, which stores the cell values on attribute a1 (the file size is 36
bytes, equal to the 16 bytes required for storing 16 1-byte characters plus
20 bytes of metadata overhead), and a2.tdb, which stores the cell values on
attribute a2 (the file size is 148 bytes, equal to the 128 bytes required for
storing 32 4-byte floats, recalling that each cell stores two floats, plus
the 20 bytes of metadata).
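The file sizes in the listing can be checked with a little arithmetic. The sketch below assumes the 16-cell array and the 20-byte per-file overhead stated in the text; it uses only Python's standard struct module to obtain the per-cell sizes and is not part of TileDB.

```python
import struct

NUM_CELLS = 16        # 4x4 dense array (assumed)
OVERHEAD_BYTES = 20   # per-file metadata overhead reported in the text

a1_cell = struct.calcsize("<c")    # one 1-byte character per cell
a2_cell = struct.calcsize("<2f")   # two 4-byte floats per cell

a1_file = NUM_CELLS * a1_cell + OVERHEAD_BYTES
a2_file = NUM_CELLS * a2_cell + OVERHEAD_BYTES
print(a1_file, a2_file)  # 36 148
```

These totals match the 36 and 148 bytes that ls reports for a1.tdb and a2.tdb.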