DSEFS can compress files to save storage space and bandwidth. DSE compresses a file during upload only when the user explicitly requests it. Decompression is transparent: the server always decompresses data before returning it to the client.
Compression is performed within block boundaries. The unit of compression—the chunk of data that gets compressed individually—is called a frame and its size can be specified during file upload.
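The relationship between blocks and frames can be illustrated with a small Python sketch. This is not DSEFS code: zlib from the standard library stands in for lz4, and the function names and the tiny block and frame sizes are assumptions chosen for illustration. It only shows that each frame is compressed on its own and that no frame spans two blocks.

```python
import zlib

def split_into_frames(data: bytes, block_size: int, frame_size: int):
    """Yield (block_index, frame_bytes); frames never cross a block boundary."""
    for block_index, block_start in enumerate(range(0, len(data), block_size)):
        block = data[block_start:block_start + block_size]
        for frame_start in range(0, len(block), frame_size):
            yield block_index, block[frame_start:frame_start + frame_size]

def compress_frames(data: bytes, block_size: int, frame_size: int):
    """Compress each frame independently (zlib as a stand-in encoder)."""
    return [(idx, zlib.compress(frame))
            for idx, frame in split_into_frames(data, block_size, frame_size)]

def decompress_frames(frames):
    """Reassemble the original bytes by decompressing the frames in order."""
    return b"".join(zlib.decompress(payload) for _, payload in frames)

data = b"DSEFS " * 100  # 600 bytes of sample input
frames = compress_frames(data, block_size=256, frame_size=100)
assert decompress_frames(frames) == data
```

Because every frame is compressed separately, a reader in this sketch can decompress one frame without touching the others, at the cost of a somewhat lower compression ratio for small frames.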
DSEFS is shipped with the lz4 encoder which works out of the box.
To compress files, use the -c, --compression-encoder parameter for the put command. The parameter specifies the compression encoder to use for the file that is about to be uploaded.
$ dsefs / > put -c lz4 file /path/to/file
The frame size can optionally be set with the -f, --compression-frame-size option.
Where is the dse.yaml file?
The location of the dse.yaml file depends on the type of installation:
Package installations + Installer-Services installations
Tarball installations + Installer-No Services installations
The maximum frame size in bytes is set in the compression_frame_max_size option in dse.yaml.
If a user sets the frame size to a value greater than compression_frame_max_size when using put -f, an error is thrown and the command fails.
Adjust the compression_frame_max_size setting based on the available memory of the node.
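The frame size check can be sketched in Python as follows. This is a hedged illustration of the rule described above, not the DSEFS implementation: the function name validate_frame_size, the error text, and the limit of 1048576 bytes are all assumptions. The real limit comes from compression_frame_max_size in dse.yaml.

```python
def validate_frame_size(requested: int, max_size: int) -> int:
    """Return the requested frame size, or raise if it exceeds the limit."""
    if requested > max_size:
        raise ValueError(
            f"frame size {requested} exceeds "
            f"compression_frame_max_size {max_size}"
        )
    return requested

# Hypothetical limit for illustration only; read the real value from dse.yaml.
MAX_FRAME_SIZE = 1_048_576

frame_size = validate_frame_size(65_536, MAX_FRAME_SIZE)
```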
Compressed files can be appended to in the same way as uncompressed files.
If the file is compressed, the appended data is transparently compressed with the encoder that was specified for the file's initial put.
Directories can have a default compression encoder specified during directory creation with the mkdir command.
Files newly added with the put command inherit the default compression encoder from the containing directory.
You can override the default compression encoder with the -c parameter during put.
$ dsefs / > mkdir -c lz4 /some/path
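The inheritance rule can be summarized with a short Python sketch. It assumes a simple precedence model (an encoder given explicitly with -c wins, otherwise the directory default applies) and is not taken from the DSEFS source. The function name resolve_encoder and the encoder name "other-encoder" are hypothetical.

```python
from typing import Optional

def resolve_encoder(explicit: Optional[str],
                    directory_default: Optional[str]) -> Optional[str]:
    """Pick the encoder for a new file.

    An encoder passed explicitly (as with put -c) overrides the default
    inherited from the containing directory. None means no encoder is set.
    """
    if explicit is not None:
        return explicit
    return directory_default

# Directory created with a default of lz4, file uploaded without -c.
inherited = resolve_encoder(None, "lz4")
# Same directory, but the upload names an encoder explicitly.
overridden = resolve_encoder("other-encoder", "lz4")
```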
Decompression is performed automatically for all commands that transport data to the client. There is no need for additional configuration to retrieve the original, decompressed file content.
Enabling compression creates a distinction between the logical and physical file size.
The logical size is the size of a file before uploading it to DSEFS, where it is then compressed.
The logical size is shown by the stat command under Size.
$ dsefs dsefs://10.0.0.1:5598/ > stat /tmp/wikipedia-sample.bz2
FILE dsefs://10.0.0.1:5598/tmp/wikipedia-sample.bz2:
Owner none
Group none
Permission rwxrwxrwx
Created 2017-04-06 20:06:21+0000
Modified 2017-04-06 20:06:21+0000
Accessed 2017-04-06 20:06:21+0000
Size 7723180
Block size 67108864
Redundancy 3
Compressed true
Encrypted false
Comment
The physical size is the actual size of the data stored on the storage device.
The physical size is shown by the df command and by the stat -v command for each block separately, under the Compressed length column.
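The difference between logical and physical size can be made concrete with a Python sketch. It is an illustration under stated assumptions: zlib stands in for lz4, the sample data and the 4096-byte frame size are made up, and the function name sizes is hypothetical. The logical size is the length of the original bytes, and the physical size is the sum of the compressed frame lengths.

```python
import zlib

def sizes(data: bytes, frame_size: int):
    """Return (logical_size, physical_size) for frame-wise compressed data."""
    logical = len(data)
    physical = sum(
        len(zlib.compress(data[i:i + frame_size]))
        for i in range(0, len(data), frame_size)
    )
    return logical, physical

logical, physical = sizes(b"wikipedia sample text " * 1000, frame_size=4096)
print(f"logical={logical} physical={physical} ratio={physical / logical:.2f}")
```

Highly repetitive input such as this compresses well, so the physical size is far smaller than the logical size. Data that is already compressed, such as a .bz2 archive, would typically show little or no reduction.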
Truncating compressed files is not possible.