DSEFS command line tool

The DSEFS functionality supports operations including uploading, downloading, moving, and deleting files, creating directories, and verifying the DSEFS status.

DSEFS commands are available only in the logical datacenter. DSEFS works with secured and unsecured clusters. See DSEFS authentication.

Interact with the DSEFS file system in several modes:

Interactive command line shell.

To start DSEFS and launch the DSEFS shell:
```
dse fs
```
As part of dse commands.
With a REST API.

Configuring DSEFS shell logging

By default, the DSEFS shell log file, .dsefs-shell.log, is written to the user home directory. The default log level is INFO. To configure DSEFS shell logging, edit the INSTALLATION_LOCATION/resources/dse/conf/logback-dsefs-shell.xml file.

Using DSEFS shell with the DSE command line

Precede the DSEFS command with dse fs:

dse [DSE_AUTH_CREDENTIALS] fs DSEFS_COMMAND  [OPTIONS]

For example, to list the file system status and disk space usage in human-readable format:

dse -u user1 -p mypassword fs "df -h"

Optional command arguments are enclosed in square brackets. For example, [DSE_AUTH_CREDENTIALS] and [-R].

Variable values are italicized. For example, <directory> and [<subcommand>].
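
The syntax above can also be assembled programmatically. The following is a minimal sketch, not part of DSE: the helper name build_dse_fs_argv is hypothetical, and it only builds the argument list in the order shown above (dse, optional -u and -p credentials, fs, then one argument per DSEFS command). Nothing is executed.

```python
import shlex


def build_dse_fs_argv(commands, user=None, password=None):
    """Return the argv list for one `dse fs` invocation (nothing is run)."""
    argv = ["dse"]
    if user is not None:
        argv += ["-u", user]          # optional credentials come before `fs`
    if password is not None:
        argv += ["-p", password]
    argv.append("fs")
    argv.extend(commands)             # each DSEFS command is ONE argument
    return argv


argv = build_dse_fs_argv(["df -h"], user="user1", password="mypassword")
# shlex.join renders the list as a shell-safe command line for inspection.
print(shlex.join(argv))
```

On a host where DSE is installed, the same list could be passed to subprocess.run(argv). Passing the password on the command line exposes it to other local users; see Providing credentials from DSE tools for alternatives.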

Working with the local file system in the DSEFS shell

You can refer to files in the local file system by prefixing paths with file:. For example, the following command lists files in the system root directory:

dsefs dsefs://127.0.0.1:5598/ > ls file:/
bin   cdrom  dev  home  lib32  lost+found  mnt  proc  run   srv  tmp  var         initrd.img.old  vmlinuz.old
boot  data   etc  lib   lib64  media       opt  root  sbin  sys  usr  initrd.img  vmlinuz

If you need to perform many subsequent operations on the local file system, first change the current working directory to file: or any local file system path:

dsefs dsefs://127.0.0.1:5598/ > cd file:
dsefs file:/home/user1/path/to/local/files > ls
conf  src  target  build.sbt
dsefs file:/home/user1/path/to/local/files > cd ..
dsefs file:/home/user1/path/to/local >

DSEFS shell remembers the last working directory of each file system separately. To go back to the previous DSEFS directory, enter:

dsefs file:/home/user1/path/to/local/files > cd dsefs:
dsefs dsefs://127.0.0.1:5598/ >

To go back again to the previous local directory:

dsefs dsefs://127.0.0.1:5598/ > cd file:
dsefs file:/home/user1/path/to/local/files >

To refer to a path relative to the last working directory of the file system, prefix a relative path with either dsefs: or file:. The following session creates a directory new_directory in the directory /home/user1:

dsefs dsefs://127.0.0.1:5598/ > cd file:/home/user1
dsefs file:/home/user1 > cd dsefs:
dsefs dsefs://127.0.0.1:5598/ > mkdir file:new_directory
dsefs dsefs://127.0.0.1:5598/ > realpath file:new_directory
file:/home/user1/new_directory
dsefs dsefs://127.0.0.1:5598/ > stat file:new_directory
DIRECTORY file:/home/user1/new_directory:
Owner           user1
Group           user1
Permission      rwxr-xr-x
Created         2017-01-15 13:10:06+0200
Modified        2017-01-15 13:10:06+0200
Accessed        2017-01-15 13:10:06+0200
Size            4096
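
The per-file-system working directory behavior shown in the sessions above can be modeled in a few lines. This is a toy sketch for illustration only, not the DSEFS shell implementation: the class ShellPaths and its methods are hypothetical, both working directories start at / (the real shell starts the local one elsewhere), and the DSEFS prefix is printed as dsefs: rather than the full dsefs://host:port form.

```python
import posixpath


class ShellPaths:
    """Toy model: one remembered working directory per file system prefix."""

    def __init__(self):
        self.cwd = {"dsefs": "/", "file": "/"}   # assumed starting points
        self.current = "dsefs"

    def _split(self, path):
        # "file:docs" -> ("file", "docs"); "docs" -> (current prefix, "docs")
        for scheme in self.cwd:
            if path.startswith(scheme + ":"):
                return scheme, path[len(scheme) + 1:]
        return self.current, path

    def _absolute(self, scheme, rest):
        # A relative path is resolved against that file system's own cwd.
        return posixpath.normpath(posixpath.join(self.cwd[scheme], rest))

    def realpath(self, path):
        scheme, rest = self._split(path)
        return f"{scheme}:{self._absolute(scheme, rest)}"

    def cd(self, path):
        scheme, rest = self._split(path)
        self.cwd[scheme] = self._absolute(scheme, rest)
        self.current = scheme


shell = ShellPaths()
shell.cd("file:/home/user1")   # remember a local working directory
shell.cd("dsefs:")             # switch back to DSEFS; local cwd is kept
print(shell.realpath("file:new_directory"))
```

The final call resolves file:new_directory against the remembered local directory even though the current file system is DSEFS, which mirrors the mkdir and realpath session above.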

To copy a file between two different file systems, you can also use the cp command with explicit file system prefixes in the paths:

dsefs file:/home/user1/test > cp dsefs:archive.tgz another-archive-copy.tgz
dsefs file:/home/user1/test > ls
another-archive-copy.tgz archive-copy.tgz  archive.tgz

Authentication

You can provide user credentials for dse <dse_auth_credentials> in several ways; see Providing credentials from DSE tools. For authentication with DSEFS, see DSEFS authentication.

Wildcard support

Some DSEFS commands support wildcard pattern expansion in the path argument. Path arguments containing wildcards are expanded before method invocation into a set of paths matching the wildcard pattern, then the given method is invoked for each expanded path.

For example, in the following directory tree:

dirA
|--dirB
|--file1
|--file2

Running the stat dirA/* command transparently translates into three invocations: stat dirA/dirB, stat dirA/file1, and stat dirA/file2.

DSEFS supports the following wildcard patterns:

* matches any file system entry (file or directory) name, as in the example of stat dirA/*.
? matches any single character in the file system entry name. For example, stat dirA/dir? matches dirA/dirB.
[] matches any single character enclosed within the brackets. For example, stat dirA/file[0123] matches dirA/file1 and dirA/file2.
{} matches any of the comma-separated character sequences enclosed within the braces. For example, stat dirA/{dirB,file2} matches dirA/dirB and dirA/file2.

There are no limitations on the number of wildcard patterns in a single path.
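
To make the expansion step concrete, the sketch below approximates it for the entries of a single directory. It is an illustration under stated assumptions, not the DSEFS implementation: the function names are hypothetical, Python's fnmatch stands in for the *, ?, and [] matching rules, and brace groups are expanded by a small helper because fnmatch does not support them.

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand {a,b,...} groups into a list of brace-free patterns."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


def expand_wildcards(directory, entries, pattern):
    """Return the paths in `directory` whose entry name matches `pattern`."""
    names = []
    for plain in expand_braces(pattern):
        for name in entries:
            if fnmatch.fnmatchcase(name, plain) and name not in names:
                names.append(name)
    return [f"{directory}/{name}" for name in names]


ENTRIES = ["dirB", "file1", "file2"]   # contents of dirA in the example tree
for pattern in ["*", "dir?", "file[0123]", "{dirB,file2}"]:
    # One `stat` invocation would be issued per expanded path.
    print(pattern, "->", expand_wildcards("dirA", ENTRIES, pattern))
```

Each printed list corresponds to the set of paths a command such as stat would be invoked on, one invocation per path.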


Executing multiple commands

DSEFS can execute multiple commands on one line. Use quotes around the commands and arguments. DSEFS executes each command separately.

dse fs 'cat file1 file2 file3 file4' 'ls dir1'
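
The quotes matter because the operating system shell, not DSEFS, splits the line into arguments. The sketch below uses Python's shlex module, which follows POSIX shell quoting rules, to show how the command line above is tokenized with and without quotes. It only illustrates the splitting; how dse fs then treats each argument is as described above.

```python
import shlex

quoted = "dse fs 'cat file1 file2 file3 file4' 'ls dir1'"
unquoted = "dse fs cat file1 file2 file3 file4 ls dir1"

# With quotes, each DSEFS command arrives as a single argument.
quoted_args = shlex.split(quoted)
print(quoted_args)

# Without quotes, every word becomes its own argument, so the boundary
# between the `cat` command and the `ls` command is lost.
unquoted_args = shlex.split(unquoted)
print(unquoted_args)
```

The quoted form yields two arguments after fs, one per DSEFS command, while the unquoted form yields seven separate words.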

Forcing synchronization

Before confirming that a file has been written, DSEFS by default forces all blocks of the file to be written to the storage devices. Control this behavior with the --no-force-sync and --force-sync flags when creating files or directories in the DSEFS shell with the mkdir, put, and cp commands. The force/no-force behavior is inherited from the parent directory when no flag is set. For example, if a directory is created with --no-force-sync, then all files in it are created with --no-force-sync unless --force-sync is explicitly set during file creation.

Turning off forced synchronization improves latency and performance at a cost of durability. For example, if a power loss occurs before writing the data to the storage device, you may lose data. Turn off forced synchronization only if you have a reliable backup power supply in your datacenter and failure of all replicas is unlikely, or if you can afford losing file data.

The Hadoop SYNC_BLOCK flag has the same effect as --force-sync in DSEFS. The Hadoop LAZY_PERSIST flag has the same effect as --no-force-sync in DSEFS.
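
The inheritance rule can be summarized as a small decision function. This is a sketch of the rule as described in this section, not DSEFS code; the function name effective_force_sync is hypothetical.

```python
def effective_force_sync(parent_forces_sync, flag=None):
    """Return True if a new file or directory forces sync on write.

    parent_forces_sync: setting of the parent directory.
    flag: "--force-sync", "--no-force-sync", or None when no flag is given.
    """
    if flag == "--force-sync":
        return True
    if flag == "--no-force-sync":
        return False
    if flag is None:
        return parent_forces_sync      # inherited when no flag is set
    raise ValueError(f"unexpected flag: {flag}")


root = True                                          # DSEFS forces sync by default
tmp_dir = effective_force_sync(root, "--no-force-sync")
plain_file = effective_force_sync(tmp_dir)           # inherits from tmp_dir
durable_file = effective_force_sync(tmp_dir, "--force-sync")
print(tmp_dir, plain_file, durable_file)
```

Here a file created without a flag inside a --no-force-sync directory does not force sync, while an explicit --force-sync on the file overrides the directory setting.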

Removing a DSEFS node

When you remove a node running DSEFS from a DSE cluster, you must perform additional steps to ensure data integrity within the DSEFS dataset.

Make sure the replication factor for the cluster is greater than ONE before continuing.

From a node in the same datacenter as the node you want to remove, start the DSEFS shell.
```
dse fs
```

Show the current DSEFS nodes with the df command.

dsefs > df

Location                              Status  DC              Rack   Host               Address        Port  Directory            Used         Free    Reserved
144e587c-11b1-4d74-80f7-dc5e0c744aca  up      GraphAnalytics  rack1  node1.example.com  10.200.179.38  5598  /var/lib/dsefs/data     0  29289783296  5368709120
98ca0435-fb36-4344-b5b1-8d776d35c7d6  up      GraphAnalytics  rack1  node2.example.com  10.200.179.39  5598  /var/lib/dsefs/data     0  29302099968  5368709120

Find the node you want to remove in the list, and note its UUID value under the Location column.
If the node is up, unmount it from DSEFS with the command umount UUID.
```
dsefs > umount 98ca0435-fb36-4344-b5b1-8d776d35c7d6
```
If the node is not up (for example, after a hardware failure), force unmount it from DSEFS with the command umount -f <UUID>.
```
dsefs > umount -f 98ca0435-fb36-4344-b5b1-8d776d35c7d6
```
Run a file system check with the fsck command to make sure all blocks are replicated.
```
dsefs > fsck
```
Continue with the normal steps for removing a node.

If, after writing data to a DSEFS node and then adding more nodes to the cluster, you remove the original node without running fsck, the data in the original node may be permanently lost.
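
When many locations are listed, picking the right UUID by eye is error-prone. The sketch below parses df output like the listing above and prints the matching umount command for a given host. It is a hypothetical helper, not a DSE tool: it assumes whitespace-separated columns with no spaces inside values, and it treats any status other than up as requiring umount -f.

```python
DF_OUTPUT = """\
Location                              Status  DC              Rack   Host               Address        Port  Directory            Used         Free    Reserved
144e587c-11b1-4d74-80f7-dc5e0c744aca  up      GraphAnalytics  rack1  node1.example.com  10.200.179.38  5598  /var/lib/dsefs/data     0  29289783296  5368709120
98ca0435-fb36-4344-b5b1-8d776d35c7d6  up      GraphAnalytics  rack1  node2.example.com  10.200.179.39  5598  /var/lib/dsefs/data     0  29302099968  5368709120
"""


def parse_df(text):
    """Turn `df` output into a list of dicts keyed by column header."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def umount_commands(rows, host):
    """Return one umount command per DSEFS location on `host`."""
    commands = []
    for row in rows:
        if row["Host"] == host:
            # Assumption: anything other than "up" needs a forced unmount.
            prefix = "umount" if row["Status"] == "up" else "umount -f"
            commands.append(f"{prefix} {row['Location']}")
    return commands


for command in umount_commands(parse_df(DF_OUTPUT), "node2.example.com"):
    print(command)
```

The printed commands are meant to be reviewed and then entered in the DSEFS shell, followed by fsck as described in the steps above.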

Removing old DSEFS directories

After changing the DSEFS data directory, if the old directory is still visible, remove it with the umount command.

Start the DSEFS shell as a role with superuser privileges.
```
dse fs
```
Show the current DSEFS nodes with the df command.
```
dsefs > df
```
Find the directory you want to remove in the list, and note the UUID value for it under the Location column.

Unmount it from DSEFS with the command umount UUID.

dsefs > umount 98ca0435-fb36-4344-b5b1-8d776d35c7d6

Run a file system check with the fsck command to make sure replication is complete on all blocks.
```
dsefs > fsck
```

If the file system check results in an IOException, make sure all the nodes in the cluster are running.

Examples

Using the DSEFS shell, these commands copy the local file bluefile to DSEFS as greenfile:

dsefs / >  ls -l
dsefs / >  put file:/bluefile greenfile

To view the new file in the DSEFS directory:

dsefs / >  ls -l
Type  Permission  Owner  Group  Length  Modified                  Name
file  rwxrwxrwx   none   none       17  2016-05-11 09:34:26+0000  greenfile

Using the dse command, these commands create the test2 directory and upload the local README.md file to the new DSEFS directory.

dse fs "mkdir /test2" &&
dse fs "put README.md /test2/README.md"

To view the new directory listing:

dse fs "ls -l /test2"

Type Permission Owner Group Length Modified Name
file rwxrwxrwx none none 3382 2016-03-07 23:20:34+0000 README.md

You can pass two or more DSEFS commands to a single dse fs invocation. This is faster because the JVM is launched, and the connection to DSEFS is opened and closed, only once. For example:

dse fs "mkdir /test2" "put README.md /test2/README.md"

The following example shows how to use the --no-force-sync flag on a directory, and how to check the state of the --force-sync flag using stat. Run these commands from within the DSEFS shell.

dsefs> mkdir --no-force-sync /tmp
dsefs> put file:some-file.dat /tmp/file.tmp
dsefs> stat /tmp/file.tmp
FILE dsefs://127.0.0.1:5598/tmp/file.tmp
Owner           none
Group           none
Permission      rwxrwxrwx
Created         2017-03-06 17:54:35+0100
Modified        2017-03-06 17:54:35+0100
Accessed        2017-03-06 17:54:35+0100
Size            1674
Block size      67108864
Redundancy      3
Compressed      false
Encrypted       false
Forces sync     false
Comment