DSEFS command line tool
The DSEFS functionality supports operations including uploading, downloading, moving, and deleting files, creating directories, and verifying the DSEFS status.
DSEFS commands are available only in the logical datacenter. DSEFS works with secured and unsecured clusters; see DSEFS authentication.
You can interact with the DSEFS file system in several modes:
-
Interactive command line shell.
To start DSEFS and launch the DSE FS shell:
dse fs
-
As part of dse commands.
-
With a REST API.
Configuring DSEFS shell logging
The default location of the DSEFS shell log file .dsefs-shell.log is the user home directory.
The default log level is INFO.
To configure DSEFS shell logging, edit the <installation_location>/resources/dse/conf/logback-dsefs-shell.xml file.
Using with the dse command line
Precede the DSEFS command with dse fs:
dse [<dse_auth_credentials>] fs <dsefs_command> [<options>]
For example, to list the file system status and disk space usage in human-readable format:
dse -u user1 -p mypassword fs "df -h"
Optional command arguments are enclosed in square brackets. For example, [<dse_auth_credentials>] and [-R].
Variable values are italicized. For example, <directory> and [<subcommand>].
Working with the local file system in the DSEFS shell
You can refer to files in the local file system by prefixing paths with the file: prefix.
For example, the following command lists files in the system root directory:
dsefs dsefs://127.0.0.1:5598/ > ls file:/
bin cdrom dev home lib32 lost+found mnt proc run srv tmp var initrd.img.old vmlinuz.old
boot data etc lib lib64 media opt root sbin sys usr initrd.img vmlinuz
If you need to perform many subsequent operations on the local file system, first change the current working directory to file: or any local file system path:
dsefs dsefs://127.0.0.1:5598/ > cd file:
dsefs file:/home/user1/path/to/local/files > ls
conf src target build.sbt
dsefs file:/home/user1/path/to/local/files > cd ..
dsefs file:/home/user1/path/to/local >
DSEFS shell remembers the last working directory of each file system separately. To go back to the previous DSEFS directory, enter:
dsefs file:/home/user1/path/to/local/files > cd dsefs:
dsefs dsefs://127.0.0.1:5598/ >
To go back again to the previous local directory:
dsefs dsefs://127.0.0.1:5598/ > cd file:
dsefs file:/home/user1/path/to/local/files >
To refer to a path relative to the last working directory of a file system, prefix the relative path with either dsefs: or file:.
The following session creates a directory new_directory in the directory /home/user1:
dsefs dsefs://127.0.0.1:5598/ > cd file:/home/user1
dsefs file:/home/user1 > cd dsefs:
dsefs dsefs://127.0.0.1:5598/ > mkdir file:new_directory
dsefs dsefs://127.0.0.1:5598/ > realpath file:new_directory
file:/home/user1/new_directory
dsefs dsefs://127.0.0.1:5598/ > stat file:new_directory
DIRECTORY file:/home/user1/new_directory:
Owner user1
Group user1
Permission rwxr-xr-x
Created 2017-01-15 13:10:06+0200
Modified 2017-01-15 13:10:06+0200
Accessed 2017-01-15 13:10:06+0200
Size 4096
To copy a file between two different file systems, you can also use the cp command with explicit file system prefixes in the paths:
dsefs file:/home/user1/test > cp dsefs:archive.tgz another-archive-copy.tgz
dsefs file:/home/user1/test > ls
another-archive-copy.tgz archive-copy.tgz archive.tgz
Authentication
For dse <dse_auth_credentials>, you can provide user credentials in several ways; see Providing credentials from DSE tools.
For authentication with DSEFS, see DSEFS authentication.
Wildcard support
Some DSEFS commands support wildcard pattern expansion in the path argument. Path arguments containing wildcards are expanded before method invocation into a set of paths matching the wildcard pattern, then the given method is invoked for each expanded path.
For example, in the following directory tree:
dirA
|--dirB
|--file1
|--file2
The stat dirA/* command would be transparently translated into three invocations: stat dirA/dirB, stat dirA/file1, and stat dirA/file2.
DSEFS supports the following wildcard patterns:
-
* matches any file system entry (file or directory) name, as in the example of stat dirA/*.
-
? matches any single character in the file system entry name. For example, stat dirA/dir? matches dirA/dirB.
-
[] matches any of the characters enclosed within the brackets. For example, stat dirA/file[0123] matches dirA/file1 and dirA/file2.
-
{} matches any of the character sequences enclosed within the braces and separated with a comma (,). For example, stat dirA/{dirB,file2} matches dirA/dirB and dirA/file2.
There are no limitations on the number of wildcard patterns in a single path.
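The expansion described above resembles ordinary shell globbing. As a rough local analogue, the following POSIX shell sketch recreates the example tree in a temporary directory and prints what each pattern expands to. It never contacts DSEFS, the function names make_example_tree and expand_pattern are hypothetical, and the DSEFS matcher may differ from the shell in edge cases. The {} pattern is left out because brace expansion is not part of POSIX sh.

```shell
#!/bin/sh
# Local analogue of DSEFS wildcard expansion using ordinary shell globbing.
# Nothing here talks to DSEFS; it only illustrates the matching rules.

# Recreate the example tree: dirA containing dirB, file1, and file2.
# Prints the path of the temporary root directory.
make_example_tree() {
    root=$(mktemp -d) || return 1
    mkdir -p "$root/dirA/dirB"
    : > "$root/dirA/file1"
    : > "$root/dirA/file2"
    printf '%s\n' "$root"
}

# Print the paths a pattern expands to, one per line, relative to the root.
# $2 is deliberately left unquoted so that the shell expands the pattern.
# A pattern that matches nothing is printed unchanged.
expand_pattern() {
    (cd "$1" && printf '%s\n' $2)
}

tree=$(make_example_tree)
expand_pattern "$tree" 'dirA/*'           # dirA/dirB, dirA/file1, dirA/file2
expand_pattern "$tree" 'dirA/dir?'        # dirA/dirB
expand_pattern "$tree" 'dirA/file[0123]'  # dirA/file1, dirA/file2
rm -rf "$tree"
```

Each printed path corresponds to one invocation of the command, in the same way that stat dirA/* becomes three separate stat calls.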
Executing multiple commands
DSEFS can execute multiple commands on one line. Use quotes around the commands and arguments. Each command will be executed separately by DSEFS.
dse fs 'cat file1 file2 file3 file4' 'ls dir1'
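Because each quoted string reaches dse fs as a single argument, a script can collect commands and run them in one invocation. The following is a minimal sketch, not part of DSE: the helper name dsefs_batch and the DSE_BIN variable are hypothetical. DSE_BIN defaults to dse and can be pointed at echo for a dry run.

```shell
#!/bin/sh
# Hypothetical helper: run several DSEFS commands in one `dse fs` invocation.
# Each argument is one complete DSEFS command, for example "ls dir1".
# DSE_BIN is an assumption of this sketch, not a DSE setting.
dsefs_batch() {
    if [ "$#" -eq 0 ]; then
        echo "usage: dsefs_batch '<dsefs_command>' ..." >&2
        return 2
    fi
    # "$@" keeps every command as a separate argument, so a command that
    # contains spaces is not split into several words.
    "${DSE_BIN:-dse}" fs "$@"
}

# Dry run: echo prints the arguments instead of contacting a cluster.
DSE_BIN=echo
dsefs_batch 'cat file1 file2 file3 file4' 'ls dir1'
```

Unset DSE_BIN (or set it to dse) to run the commands against a cluster.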
Forcing synchronization
Before confirming that a file has been written, DSEFS by default forces all blocks of the file to be written to the storage devices.
This behavior can be controlled with the --no-force-sync and --force-sync flags when creating files or directories in the DSEFS shell with the mkdir, put, and cp commands.
If neither flag is specified, the force/no-force behavior is inherited from the parent directory.
For example, if a directory is created with --no-force-sync, then all files in it are created with --no-force-sync unless --force-sync is explicitly set during file creation.
Turning off forced synchronization improves latency and performance at a cost of durability. For example, if a power loss occurs before writing the data to the storage device, you may lose data. Turn off forced synchronization only if you have a reliable backup power supply in your datacenter and failure of all replicas is unlikely, or if you can afford losing file data.
The Hadoop SYNC_BLOCK flag has the same effect as --force-sync in DSEFS.
The Hadoop LAZY_PERSIST flag has the same effect as --no-force-sync in DSEFS.
Removing a DSEFS node
When removing a node running DSEFS from a DSE cluster, additional steps are needed to keep the DSEFS data set correct.
Make sure the replication factor for the cluster is greater than ONE before continuing.
-
From a node in the same datacenter as the node to be removed, start the DSEFS shell.
dse fs
-
Show the current DSEFS nodes with the df command.
dsefs > df
Location                              Status  DC              Rack   Host               Address        Port  Directory            Used  Free         Reserved
144e587c-11b1-4d74-80f7-dc5e0c744aca  up      GraphAnalytics  rack1  node1.example.com  10.200.179.38  5598  /var/lib/dsefs/data  0     29289783296  5368709120
98ca0435-fb36-4344-b5b1-8d776d35c7d6  up      GraphAnalytics  rack1  node2.example.com  10.200.179.39  5598  /var/lib/dsefs/data  0     29302099968  5368709120
-
Find the node to be removed in the list and note the UUID value for it under the Location column.
-
If the node is up, unmount it from DSEFS with the command umount <UUID>.
dsefs > umount 98ca0435-fb36-4344-b5b1-8d776d35c7d6
-
If the node is not up (for example, after a hardware failure), force unmount it from DSEFS with the command umount -f <UUID>.
dsefs > umount -f 98ca0435-fb36-4344-b5b1-8d776d35c7d6
-
Run a file system check with the fsck command to make sure all blocks are replicated.
dsefs > fsck
-
Continue with the normal steps for removing a node.
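Looking up the UUID by eye is error-prone when the cluster has many nodes. The following sketch filters saved df output by host name. The helper name dsefs_location_for_host is hypothetical, and the sketch assumes one row per line, whitespace-separated columns in the order shown in the df output above (Location first, Host fifth), and no spaces inside a field.

```shell
#!/bin/sh
# Hypothetical helper: read DSEFS `df` output on standard input and print the
# Location UUID of every row whose Host column equals the given host name.
# Assumes Location is the first column and Host is the fifth.
# Exits with status 1 when no row matches.
dsefs_location_for_host() {
    awk -v host="$1" '
        $5 == host { print $1; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example, with the df output saved to a file named df.txt (hypothetical):
#   dsefs_location_for_host node2.example.com < df.txt
```

If a node serves more than one DSEFS directory, the helper prints one UUID per matching row, so check the Directory column before passing a value to umount.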
If data was written to a DSEFS node, more nodes were added to the cluster, and the original node was removed without running |
Removing old DSEFS directories
If you have changed the DSEFS data directory and the old directory is still visible, remove it using the umount command.
-
Start the DSEFS shell as a role with superuser privileges.
dse fs
-
Show the current DSEFS nodes with the df command.
dsefs > df
-
Find the directory to be removed in the list and note the UUID value for it under the Location column.
-
Unmount it from DSEFS with the command umount <UUID>.
dsefs > umount 98ca0435-fb36-4344-b5b1-8d776d35c7d6
-
Run a file system check with the fsck command to make sure all blocks are replicated.
dsefs > fsck
If the file system check results in an IOException, make sure all the nodes in the cluster are running.
Examples
In the DSEFS shell, these commands upload the local file bluefile to DSEFS as greenfile:
dsefs / > ls -l
dsefs / > put file:/bluefile greenfile
To view the new file in the DSEFS directory:
dsefs / > ls -l
Type Permission Owner Group Length Modified Name
file rwxrwxrwx none none 17 2016-05-11 09:34:26+0000 greenfile
Using the dse command, these commands create the test2 directory and upload the local README.md file to the new DSEFS directory.
dse fs "mkdir /test2" &&
dse fs "put README.md /test2/README.md"
To view the new directory listing:
dse fs "ls -l /test2"
Type Permission Owner Group Length Modified Name
file rwxrwxrwx none none 3382 2016-03-07 23:20:34+0000 README.md
You can run two or more DSEFS commands in a single dse fs command line. This is faster because the JVM is launched only once, and it connects to and disconnects from DSEFS only once. For example:
dse fs "mkdir /test2" "put README.md /test2/README.md"
The following example shows how to use the --no-force-sync flag on a directory, and how to check the state of the --force-sync flag using stat.
These commands are run from within the DSEFS shell.
dsefs> mkdir --no-force-sync /tmp
dsefs> put file:some-file.dat /tmp/file.tmp
dsefs> stat /tmp/file.tmp
FILE dsefs://127.0.0.1:5598/tmp/file.tmp
Owner none
Group none
Permission rwxrwxrwx
Created 2017-03-06 17:54:35+0100
Modified 2017-03-06 17:54:35+0100
Accessed 2017-03-06 17:54:35+0100
Size 1674
Block size 67108864
Redundancy 3
Compressed false
Encrypted false
Forces sync false
Comment
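To check the flag from a script rather than by reading the listing, the stat output can be filtered for the Forces sync line. The following is a minimal sketch: the helper name dsefs_forces_sync is hypothetical, and it assumes the field label and layout match the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: read DSEFS `stat` output on standard input and print
# the value of the "Forces sync" field (true or false in the sample above).
# Exits with status 1 when the field is not present.
dsefs_forces_sync() {
    awk '
        $1 == "Forces" && $2 == "sync" { print $3; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example, with the stat output saved to a file named stat.txt (hypothetical):
#   dsefs_forces_sync < stat.txt
```

For the file created above, the helper would print false, because /tmp/file.tmp inherited --no-force-sync from its parent directory.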