DSEFS command line tool

Options and command arguments for the DSE File System (DSEFS).

The DSEFS functionality supports operations including uploading, downloading, moving, and deleting files, creating directories, and verifying the DSEFS status.

DSEFS commands are available only in the logical datacenter. DSEFS works with secured and unsecured clusters; see DSEFS authentication.

You can interact with the DSEFS file system in several modes:

Configuring DSEFS shell logging

The default location of the DSEFS shell log file .dsefs-shell.log is the user home directory. The default log level is INFO. To configure DSEFS shell logging, edit the installation_location/resources/dse/conf/logback-dsefs-shell.xml file.

Using with the dse command line

Precede the DSEFS command with dse fs:
dse [dse_auth_credentials] fs dsefs_command [options]
For example, to list the file system status and disk space usage in human-readable format:
dse -u user1 -p mypassword fs "df -h"

Optional command arguments are enclosed in square brackets. For example, [dse_auth_credentials] and [-R].

Variable values are italicized. For example, directory and [subcommand].
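
The synopsis above can be wrapped in a small shell function when the same credentials are reused across calls. This is a minimal sketch, not part of DSE: the function name dsefs_run and the DRY_RUN variable are hypothetical, and only the dse options shown in the example above (-u and -p) are used.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the synopsis "dse [dse_auth_credentials] fs dsefs_command".
# Usage: dsefs_run USER PASSWORD 'dsefs_command [options]'
# Pass an empty USER to omit the optional credentials.
# DRY_RUN=1 (an assumption of this sketch) prints the argument list, one
# argument per line, instead of running dse.
dsefs_run() {
  local user="$1" password="$2"
  shift 2
  local cmd=(dse)
  if [ -n "$user" ]; then
    cmd+=(-u "$user" -p "$password")
  fi
  # "$@" keeps each DSEFS command, such as "df -h", as a single shell word.
  cmd+=(fs "$@")
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "${cmd[@]}"
  else
    "${cmd[@]}"
  fi
}
```

Calling DRY_RUN=1 dsefs_run user1 mypassword "df -h" prints the arguments without contacting the cluster, which makes it easy to confirm that "df -h" is passed as one argument.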

Working with the local file system in the DSEFS shell

You can refer to files in the local file system by prefixing paths with file:. For example, the following command lists the files in the system root directory:

dsefs dsefs://127.0.0.1:5598/ > ls file:/
bin   cdrom  dev  home  lib32  lost+found  mnt  proc  run   srv  tmp  var         initrd.img.old  vmlinuz.old  
boot  data   etc  lib   lib64  media       opt  root  sbin  sys  usr  initrd.img  vmlinuz         

If you need to perform many subsequent operations on the local file system, first change the current working directory to file: or any local file system path:

dsefs dsefs://127.0.0.1:5598/ > cd file:
dsefs file:/home/user1/path/to/local/files > ls
conf  src  target  build.sbt  
dsefs file:/home/user1/path/to/local/files > cd ..
dsefs file:/home/user1/path/to/local > 

The DSEFS shell remembers the last working directory of each file system separately. To go back to the previous DSEFS directory, enter:

dsefs file:/home/user1/path/to/local/files > cd dsefs:
dsefs dsefs://127.0.0.1:5598/ > 

To go back again to the previous local directory:

dsefs dsefs://127.0.0.1:5598/ > cd file:
dsefs file:/home/user1/path/to/local/files >

To refer to a path relative to the last working directory of the file system, prefix a relative path with either dsefs: or file:. The following session will create a directory new_directory in the directory /home/user1:

dsefs dsefs://127.0.0.1:5598/ > cd file:/home/user1
dsefs file:/home/user1 > cd dsefs:
dsefs dsefs://127.0.0.1:5598/ > mkdir file:new_directory
dsefs dsefs://127.0.0.1:5598/ > realpath file:new_directory
file:/home/user1/new_directory
dsefs dsefs://127.0.0.1:5598/ > stat file:new_directory
DIRECTORY file:/home/user1/new_directory:
Owner           user1
Group           user1
Permission      rwxr-xr-x
Created         2017-01-15 13:10:06+0200
Modified        2017-01-15 13:10:06+0200
Accessed        2017-01-15 13:10:06+0200
Size            4096

To copy a file between two different file systems, you can also use the cp command with explicit file system prefixes in the paths:

dsefs file:/home/user1/test > cp dsefs:archive.tgz another-archive-copy.tgz
dsefs file:/home/user1/test > ls
another-archive-copy.tgz archive-copy.tgz  archive.tgz  

Authentication

For dse dse_auth_credentials, you can provide user credentials in several ways. For authentication with DSEFS, see DSEFS authentication.

Wildcard support

Some DSEFS commands support wildcard pattern expansion in the path argument. Path arguments containing wildcards are expanded before method invocation into a set of paths matching the wildcard pattern, then the given method is invoked for each expanded path.

For example, in the following directory tree:

dirA
|--dirB
|--file1
|--file2

The command stat dirA/* is transparently translated into three invocations: stat dirA/dirB, stat dirA/file1, and stat dirA/file2.

DSEFS supports the following wildcard patterns:

  • * matches any file system entry (file or directory) name, as in the example of stat dirA/*.
  • ? matches any single character in the file system entry name. For example, stat dirA/dir? matches dirA/dirB.
  • [] matches any single character enclosed within the brackets. For example, stat dirA/file[0123] matches dirA/file1 and dirA/file2.
  • {} matches any of the character sequences enclosed within the braces and separated with commas. For example, stat dirA/{dirB,file2} matches dirA/dirB and dirA/file2.

There are no limitations on the number of wildcard patterns in a single path.
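
The four patterns can be tried out on a local copy of the example tree. The sketch below uses bash's own globbing and brace expansion purely as an analogy: DSEFS performs its own expansion, and its behavior in edge cases may differ from bash. The scratch directory held in the demo variable exists only for this demonstration.

```shell
#!/usr/bin/env bash
# Local analogy only: bash, not DSEFS, expands the patterns below.
demo="$(mktemp -d)"                      # scratch directory for the example tree
mkdir -p "$demo/dirA/dirB"
touch "$demo/dirA/file1" "$demo/dirA/file2"

(
  cd "$demo" || exit 1
  echo dirA/*               # dirA/dirB dirA/file1 dirA/file2
  echo dirA/dir?            # dirA/dirB
  echo dirA/file[0123]      # dirA/file1 dirA/file2
  echo dirA/{dirB,file2}    # dirA/dirB dirA/file2
)
```

When a wildcard is passed through the dse command line, quote the whole DSEFS command, for example dse fs "stat dirA/*", so that the local shell does not expand the pattern against local files first.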

Executing multiple commands

DSEFS can execute multiple commands on one line. Use quotes around the commands and arguments. Each command will be executed separately by DSEFS.

dse fs 'cat file1 file2 file3 file4' 'ls dir1'
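
When the list of commands is assembled in a script, a bash array keeps each DSEFS command as one quoted argument. This is a minimal sketch: the function name dsefs_batch and the DRY_RUN switch are hypothetical conveniences, and the file and directory names are the ones from the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sends every argument to a single "dse fs" invocation.
# Each argument is one complete DSEFS command and stays a single shell word.
dsefs_batch() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Preview only: show the invocation with each command in single quotes.
    # (The preview is not safe to paste if a command itself contains a quote.)
    printf 'dse fs'
    printf " '%s'" "$@"
    printf '\n'
  else
    dse fs "$@"
  fi
}

commands=(
  "cat file1 file2 file3 file4"
  "ls dir1"
)

# Preview the combined invocation without contacting the cluster.
DRY_RUN=1 dsefs_batch "${commands[@]}"
```

Expanding the array as "${commands[@]}" is what preserves the grouping; an unquoted expansion would split every command into separate words.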

Forcing synchronization

Before confirming that a file has been written, DSEFS by default forces all blocks of the file to be written to the storage devices. This behavior can be controlled with the --no-force-sync and --force-sync flags when creating files or directories in the DSEFS shell with the mkdir, put, and cp commands. If neither flag is specified, the behavior is inherited from the parent directory. For example, if a directory is created with --no-force-sync, then all files in it are created with --no-force-sync unless --force-sync is explicitly set during file creation.

Turning off forced synchronization improves latency and performance at the cost of durability. For example, if a power loss occurs before the data is written to the storage device, you may lose data. Turn off forced synchronization only if you have a reliable backup power supply in your datacenter and failure of all replicas is unlikely, or if you can afford to lose file data.

The Hadoop SYNC_BLOCK flag has the same effect as --force-sync in DSEFS. The Hadoop LAZY_PERSIST flag has the same effect as --no-force-sync in DSEFS.
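
The inheritance rule described in this section can be written down as a small decision function. The sketch models the documented behavior only and does not query DSEFS; the function name effective_force_sync and the true/false encoding of the parent setting are assumptions of the sketch.

```shell
#!/usr/bin/env bash
# Model of the documented rule, not a DSEFS API.
# $1: setting of the parent directory, "true" (forces sync) or "false"
# $2: flag given at creation: "--force-sync", "--no-force-sync", or empty
effective_force_sync() {
  local parent="$1" flag="${2:-}"
  case "$flag" in
    --force-sync)    echo true ;;
    --no-force-sync) echo false ;;
    "")              echo "$parent" ;;   # no flag: inherit from the parent directory
    *)               echo "unknown flag: $flag" >&2; return 1 ;;
  esac
}

effective_force_sync false ""              # prints: false (inherited)
effective_force_sync false --force-sync    # prints: true (explicit override)
```

The first call corresponds to a file created without a flag inside a directory that was created with --no-force-sync; the second shows the explicit override.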

Removing a DSEFS node

When removing a node running DSEFS from a DSE cluster, additional steps are needed to keep the DSEFS data set correct.

Important: Make sure the replication factor for the cluster is greater than ONE before continuing.
  1. From a node in the same datacenter as the node to be removed, start the DSEFS shell.
    dse fs
  2. Show the current DSEFS nodes with the df command.
    dsefs > df
    Location                              Status  DC              Rack   Host               Address        Port  Directory            Used         Free    Reserved
    144e587c-11b1-4d74-80f7-dc5e0c744aca  up      GraphAnalytics  rack1  node1.example.com  10.200.179.38  5598  /var/lib/dsefs/data     0  29289783296  5368709120
    98ca0435-fb36-4344-b5b1-8d776d35c7d6  up      GraphAnalytics  rack1  node2.example.com  10.200.179.39  5598  /var/lib/dsefs/data     0  29302099968  5368709120
  3. Find the node to be removed in the list and note the UUID value for it under the Location column.
  4. If the node is up, unmount it from DSEFS with the command umount UUID.
    dsefs > umount 98ca0435-fb36-4344-b5b1-8d776d35c7d6
  5. If the node is not up (for example, after a hardware failure), force unmount it from DSEFS with the command umount -f UUID.
    dsefs > umount -f 98ca0435-fb36-4344-b5b1-8d776d35c7d6
  6. Run a file system check with the fsck command to make sure all blocks are replicated.
    dsefs > fsck
  7. Continue with the normal steps for removing a node.
Note: If data was written to a DSEFS node, more nodes were added to the cluster, and the original node was removed without running fsck, the data in the original node may be permanently lost.
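
Steps 3 to 6 can be sketched as a script that reads saved df output, looks up the status of the chosen UUID, and prints the DSEFS shell commands to enter. It assumes the column layout shown in step 2 (Location first, Status second). The function name plan_dsefs_removal is hypothetical, and the script only prints commands; it does not run them.

```shell
#!/usr/bin/env bash
# Hypothetical planner: reads "df" output on stdin and prints the DSEFS shell
# commands for the given location UUID.
# Assumes Location is column 1 and Status is column 2, as in the sample output.
plan_dsefs_removal() {
  local uuid="$1" status
  status="$(awk -v id="$uuid" '$1 == id { print $2; exit }')"
  case "$status" in
    up) echo "umount $uuid" ;;
    "") echo "UUID $uuid not found in df output" >&2; return 1 ;;
    *)  echo "umount -f $uuid" ;;   # node is not up: force unmount
  esac
  echo "fsck"                       # verify that all blocks are replicated
}

# Sample df output copied from step 2.
df_output='Location                              Status  DC              Rack   Host               Address        Port  Directory            Used         Free    Reserved
144e587c-11b1-4d74-80f7-dc5e0c744aca  up      GraphAnalytics  rack1  node1.example.com  10.200.179.38  5598  /var/lib/dsefs/data     0  29289783296  5368709120
98ca0435-fb36-4344-b5b1-8d776d35c7d6  up      GraphAnalytics  rack1  node2.example.com  10.200.179.39  5598  /var/lib/dsefs/data     0  29302099968  5368709120'

plan_dsefs_removal 98ca0435-fb36-4344-b5b1-8d776d35c7d6 <<< "$df_output"
```

Because the node in the sample is up, the script prints the plain umount command followed by fsck. Review the printed commands before entering them in the DSEFS shell.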

Removing old DSEFS directories

If you have changed the DSEFS data directory and the old directory is still visible, remove it using the umount command.

  1. Start the DSEFS shell as a role with superuser privileges.
    dse fs
  2. Show the current DSEFS nodes with the df command.
    dsefs > df
  3. Find the directory to be removed in the list and note the UUID value for it under the Location column.
  4. Unmount it from DSEFS with the command umount UUID.
    dsefs > umount 98ca0435-fb36-4344-b5b1-8d776d35c7d6
  5. Run a file system check with the fsck command to make sure all blocks are replicated.
    dsefs > fsck

If the file system check results in an IOException, make sure all the nodes in the cluster are running.

Examples

Using the DSEFS shell, these commands put the local bluefile to the remote DSEFS greenfile:
dsefs / >  ls -l 
dsefs / >  put file:/bluefile greenfile
To view the new file in the DSEFS directory:
dsefs / >  ls -l 
Type  Permission  Owner  Group  Length  Modified                  Name                        
file  rwxrwxrwx   none   none       17  2016-05-11 09:34:26+0000  greenfile  
Using the dse command, these commands create the test2 directory and upload the local README.md file to the new DSEFS directory.
dse fs "mkdir /test2" &&
dse fs "put README.md /test2/README.md"
To view the new directory listing:
dse fs "ls -l /test2"
Type Permission Owner Group Length Modified Name
file rwxrwxrwx none none 3382 2016-03-07 23:20:34+0000 README.md
You can use two or more DSEFS commands in a single dse command line. This is faster because the JVM is launched, and the connection to DSEFS is opened and closed, only once. For example:
dse fs "mkdir /test2" "put README.md /test2/README.md"
The following example shows how to use the --no-force-sync flag on a directory, and how to check the state of the --force-sync flag using stat. These commands are run from within the DSEFS shell.
dsefs> mkdir --no-force-sync /tmp
dsefs> put file:some-file.dat /tmp/file.tmp
dsefs> stat /tmp/file.tmp
FILE dsefs://127.0.0.1:5598/tmp/file.tmp
Owner           none
Group           none
Permission      rwxrwxrwx
Created         2017-03-06 17:54:35+0100
Modified        2017-03-06 17:54:35+0100
Accessed        2017-03-06 17:54:35+0100
Size            1674
Block size      67108864
Redundancy      3
Compressed      false
Encrypted       false
Forces sync     false
Comment