Enabling DSEFS

By default, DSEFS is enabled on analytics nodes and disabled on non-analytics nodes. You can also enable the DSEFS service explicitly on any node in a DataStax Enterprise cluster. Nodes in the same datacenter that have DSEFS enabled join together to form a DSEFS cluster.

Procedure

On each node:

  1. In the dse.yaml file, uncomment and set the properties for the DSEFS options:

    # dsefs_options:
    #    enabled:
    #    keyspace_name: dsefs
    #    work_dir: /var/lib/dsefs
    #    public_port: 5598
    #    private_port: 5599
    #    data_directories:
    #      - dir: /var/lib/dsefs/data
    #        storage_weight: 1.0
    #        min_free_space: 268435456

    1. To enable DSEFS:

      enabled: true

      If enabled is blank or commented out, DSEFS starts only if the node is configured to run analytics workloads.

    2. Define the keyspace for storing the DSEFS metadata:

      keyspace_name: dsefs

      You can optionally configure multiple DSEFS file systems in a single datacenter.

    3. Define the work directory for storing the DSEFS metadata for the local node. The work directory should not be shared with other DSEFS nodes:

      work_dir: /var/lib/dsefs

    4. Define the public port on which DSEFS listens for clients:

      public_port: 5598

      DataStax recommends that all nodes in the cluster have the same value. Firewalls must open this port to trusted clients. The service on this port is bound to the native_transport_address.

    5. Define the private port for DSEFS internode communication:

      private_port: 5599

      Do not open this port in firewalls; the private port must not be visible from outside the cluster.

    6. Set the data directories where the file data blocks are stored locally on each node:

      data_directories:
          - dir: /var/lib/dsefs/data

      If you use the default `/var/lib/dsefs/data` data directory, verify that the directory exists and that you have root access. Otherwise, you can define your own directory location, change the ownership of the directory, or both:

      sudo mkdir -p /var/lib/dsefs/data && sudo chown -R $USER:$GROUP /var/lib/dsefs/data

      Ensure that the data directory is writeable by the DataStax Enterprise user. Put the data directories on different physical devices than the database. Using multiple data directories on JBOD improves performance and capacity.

    7. For each data directory, set the weighting factor to specify how much data to place in this directory, relative to other directories in the cluster. This soft constraint determines how DSEFS distributes the data. For example, a directory with a value of 3.0 receives about three times more data than a directory with a value of 1.0.

      data_directories:
          - dir: /var/lib/dsefs/data
            storage_weight: 1.0
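
As a rough illustration of how the weights translate into shares, the sketch below models each directory's expected fraction of data as its weight divided by the sum of all weights. The helper name weight_shares and the directory paths are hypothetical, and because storage_weight is a soft constraint, actual placement is only approximately proportional.

```shell
#!/bin/sh
# Expected share of data per directory, modeled as weight / sum of weights.
# weight_shares is a hypothetical helper; the paths and weights are examples.
weight_shares() {
  # Arguments: alternating "directory weight" pairs.
  # printf reuses the format for each pair, emitting one pair per line.
  printf '%s %s\n' "$@" | awk '
    { dir[NR] = $1; w[NR] = $2; total += $2 }
    END {
      for (i = 1; i <= NR; i++)
        printf "%s %.1f%%\n", dir[i], 100 * w[i] / total
    }'
}

weight_shares /mnt/disk1/dsefs 3.0 /mnt/disk2/dsefs 1.0
```

With weights of 3.0 and 1.0, the first directory is expected to hold about 75% of the data and the second about 25%, which matches the "three times more data" rule of thumb.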

    8. For each data directory, define the amount of space, in bytes, to keep free rather than use for storing file data blocks. See min_free_space.

      data_directories:
          - dir: /var/lib/dsefs/data
            storage_weight: 1.0
            min_free_space: 5368709120
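
Because min_free_space is given in bytes, it can help to derive the value from a more readable unit. The sketch below uses two hypothetical helpers, gib_to_bytes and mib_to_bytes, and assumes binary units (1 GiB = 1024 MiB = 1024^3 bytes). Under that assumption, 5368709120 is 5 GiB and 268435456 is 256 MiB.

```shell
#!/bin/sh
# Hypothetical helpers for computing min_free_space values in bytes.
# Assumes binary units: 1 MiB = 1024 * 1024 bytes, 1 GiB = 1024 MiB.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 5     # prints 5368709120
mib_to_bytes 256   # prints 268435456
```
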

  2. Restart the node.

  3. Repeat these steps on each remaining node.

  4. With guidance from DataStax Support, you can tune advanced DSEFS properties:

    #     service_startup_timeout_ms: 60000
    #     service_close_timeout_ms: 600000
    #     server_close_timeout_ms: 2147483647 # Integer.MAX_VALUE
    #     compression_frame_max_size: 1048576
    #     query_cache_size: 2048
    #     query_cache_expire_after_ms: 2000
    #     gossip_options:
    #         round_delay_ms: 2000
    #         startup_delay_ms: 5000
    #         shutdown_delay_ms: 10000
    #     rest_options:
    #         request_timeout_ms: 330000
    #         connection_open_timeout_ms: 55000
    #         client_close_timeout_ms: 60000
    #         server_request_timeout_ms: 300000
    #         idle_connection_timeout_ms: 60000
    #         internode_idle_connection_timeout_ms: 120000
    #         core_max_concurrent_connections_per_host: 8
    #     transaction_options:
    #         transaction_timeout_ms: 3000
    #         conflict_retry_delay_ms: 200
    #         conflict_retry_count: 40
    #         execution_retry_delay_ms: 1000
    #         execution_retry_count: 3
    #     block_allocator_options:
    #         overflow_margin_mb: 1024
    #         overflow_factor: 1.05

  5. Continue with using DSEFS.

