Configuring DSE Multi-Instance  

Steps to add DataStax Enterprise nodes on a host machine to configure DSE Multi-Instance.

The location of the cassandra.yaml file depends on the type of installation:
  Installer-Services      /etc/dse/cassandra/cassandra.yaml
  Package installations   /etc/dse/cassandra/cassandra.yaml
  Installer-No Services   install_location/resources/cassandra/conf/cassandra.yaml
  Tarball installations   install_location/resources/cassandra/conf/cassandra.yaml

The location of the dse.yaml file depends on the type of installation:
  Installer-Services      /etc/dse/dse.yaml
  Package installations   /etc/dse/dse.yaml
  Installer-No Services   install_location/resources/dse/conf/dse.yaml
  Tarball installations   install_location/resources/dse/conf/dse.yaml

With package installs and DataStax Installer installs, the dse add-node command simplifies adding and configuring nodes on a host machine.

Tarball installs do not support adding more nodes on a single host machine. To install DSE Multi-Instance in a tarball installation, unpack the tarball in multiple locations on a single host machine. Each tarball installation becomes a DataStax Enterprise node on the host machine.

On the host machine, the DSE Multi-Instance root directory is /etc/defaults. This default location is not configurable. The node type is defined in the /etc/default/dse-nodeId file.
Note: DSE Multi-Instance is supported only for package and DataStax Installer installations.
These actions occur for each node that is added:
  • The node configuration is modified according to the command arguments.
  • A script is created so that the node can be started and stopped.
  • The run levels are updated to the default values so that the node is started and stopped when the host machine is booted or halted.
  • The /etc/default/dse-nodeId file is created to set the default node type as a Cassandra transactional node.
With DSE Multi-Instance, when you run the dse command on a node in the host machine, the node configuration is read from:
  • Package and DataStax Installer installations: /etc/dse/serverconfig/dse-nodeId
  • Tarball installations: the /etc/dse directory is the default configuration location in each location where you installed DataStax Enterprise.
Note: With DSE Multi-Instance, multiple DataStax Enterprise nodes reside on a single host machine. To segregate the configuration for each DataStax Enterprise node, node-specific directory structures are used to store configuration and operational files. For example, in addition to /etc/dse/dse.yaml, the DSE Multi-Instance dse.yaml files are stored in /etc/dse-nodeId/dse.yaml locations. The server_id option is generated in DSE Multi-Instance /etc/dse-nodeId/dse.yaml files to uniquely identify the physical server on which multiple instances are running and is unique for each database instance.
Directories      Description
/etc/dse         /etc/dse/dse.yaml is the primary configuration file for DataStax Enterprise
/etc/dse-node1   /etc/dse-node1/dse.yaml is the configuration file for the DataStax Enterprise node in the dse-node1 directory
/etc/dse-node2   /etc/dse-node2/dse.yaml is the configuration file for the DataStax Enterprise node in the dse-node2 directory
For DSE Multi-Instance nodes, two files control the configuration of the node. For example, for the node named dse-node1:
  • /etc/dse/serverconfig/dse-node1 specifies the directories for the configuration files
  • /etc/default/dse-node1 configures the node behavior, including the node type and the number of times the start script checks whether the DSE service has started.
For package installations, see directories for DSE Multi-Instance for a comprehensive list of file locations in a DSE Multi-Instance cluster.
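
As a quick reference, the per-node paths named in this topic can be derived from the node ID. The following shell sketch only prints those paths for a package installation; the function name dse_node_paths is hypothetical, and the sketch does not check that the files exist.

```shell
#!/bin/sh
# Hypothetical helper: print the configuration paths this topic lists
# for a DSE Multi-Instance node. It does not check that the files exist.
dse_node_paths() {
    node_id=$1
    [ -n "$node_id" ] || { echo "usage: dse_node_paths NODE_ID" >&2; return 1; }
    # Accept either "node1" or "dse-node1".
    case $node_id in
        dse-*) name=$node_id ;;
        *)     name=dse-$node_id ;;
    esac
    printf '%s\n' \
        "/etc/dse/serverconfig/$name" \
        "/etc/default/$name" \
        "/etc/$name/dse.yaml" \
        "/etc/$name/cassandra/cassandra.yaml"
}

dse_node_paths node1
```

Running the sketch for node1 lists the serverconfig file, the defaults file, and the two YAML files under /etc/dse-node1.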

Procedure

  1. Verify that your existing DataStax Enterprise installation has the default node configuration in the /etc/dse directory. The configuration files for the default node include /etc/dse/dse.yaml and /etc/dse/cassandra/cassandra.yaml.
  2. Give the default cluster a meaningful name. For example, change the default cluster named dse to payroll.
  3. Verify that the node binds to working IP addresses.
  4. Add DataStax Enterprise nodes to the DSE Multi-Instance cluster.
    • For Installer-Services and package installations, you can use the dse add-node command. For example, to add a Cassandra node that will join the cluster payroll on startup:
      $ sudo dse add-node --node-id nodeId --cluster payroll --listen-address unused_ip_of_server \
        --rpc-address unused_ip_of_server --seeds ip_of_default_node
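
When several nodes are added to one host, it can help to review each add-node command before running it. The sketch below only prints the command, using the options shown in this step. The function name print_add_node and the IP addresses are placeholders, and stripping a leading dse- from the node ID is a precaution based on the prefix being added automatically by add-node.

```shell
#!/bin/sh
# Sketch: print (not run) a dse add-node command built from the options
# shown in this step. All values passed in are placeholders.
print_add_node() {
    if [ "$#" -ne 4 ]; then
        echo "usage: print_add_node NODE_ID CLUSTER NEW_NODE_IP SEED_IP" >&2
        return 1
    fi
    # add-node adds the dse- prefix itself, so strip it if present.
    node_id=${1#dse-}
    printf 'sudo dse add-node --node-id %s --cluster %s --listen-address %s --rpc-address %s --seeds %s\n' \
        "$node_id" "$2" "$3" "$3" "$4"
}

# Hypothetical addresses: 10.0.0.12 is unused, 10.0.0.11 is the default node.
print_add_node node1 payroll 10.0.0.12 10.0.0.11
```

The same unused address is passed for both the listen address and the RPC address, which mirrors the example in this step.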
  5. Before starting the new node, set the node type in the /etc/default/dse-nodeId file:
    • DSE Search:
      SOLR_ENABLED=1
    • DSE Analytics:
      SPARK_ENABLED=1
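
Setting the node type can be scripted. The sketch below assumes the defaults file holds plain KEY=VALUE lines, which is an assumption about the file layout; the function name set_default is hypothetical. It is demonstrated on a scratch file rather than on /etc/default/dse-nodeId.

```shell
#!/bin/sh
# Sketch: set KEY=VALUE in a defaults file, replacing an existing
# (possibly commented-out) assignment or appending one if none exists.
set_default() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if grep -q "^#\{0,1\}[[:space:]]*$key=" "$file"; then
        sed "s/^#\{0,1\}[[:space:]]*$key=.*/$key=$value/" "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        printf '%s=%s\n' "$key" "$value" >> "$tmp"
    fi
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a scratch file; on a real host the target would be
# /etc/default/dse-nodeId and the edit would need root privileges.
demo=$(mktemp)
printf 'SOLR_ENABLED=0\nSPARK_ENABLED=0\n' > "$demo"
set_default "$demo" SOLR_ENABLED 1
cat "$demo"
rm -f "$demo"
```

The helper rewrites the whole file through a temporary copy instead of relying on sed -i, whose syntax differs between platforms.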
  6. Continue configuring the node as appropriate.
    1. To change default DataStax Enterprise configuration values, edit the configuration files in /etc/dse-nodeId.
      Ensure that the JMX port is configured for each node.
    2. To change default Cassandra configuration values, edit the /etc/dse-nodeId/cassandra/cassandra.yaml file.
  7. After you make configuration changes, start the node.
    Starting DataStax Enterprise as a service runs a script that sets up the environment and launches the DSE service. After launching the service, the script checks whether it is running. The DSE service can take a few seconds to start, so this warning might display:
    WARNING: Timed out while waiting for DSE to start.
    However, this warning does not necessarily mean that the DSE service failed to start. Check the log files, for example /var/log/cassandra/system.out.
  8. To give the DSE service more time before the start script reports a timeout, increase the number of times the script checks whether the service is running. The start script checks once per second, so the number of checks equals the number of seconds to wait. Uncomment and edit the WAIT_FOR_START option in the /etc/default/dse-node1 file, and then restart the DSE service:
    # Uncomment if you want longer/shorter waits checking if the service is up
    WAIT_FOR_START=14
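
To illustrate why the number of checks equals the number of seconds, here is a rough sketch of a once-per-second poll. It is not the actual DSE start script; the function name wait_for_start and the stand-in check command are hypothetical.

```shell
#!/bin/sh
# Sketch of a once-per-second readiness poll, NOT the actual DSE start
# script. The arguments after the count form any command that exits 0
# once the service is up.
wait_for_start() {
    checks=$1
    shift
    i=0
    while [ "$i" -lt "$checks" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "started after $i second(s)"
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    echo "WARNING: Timed out while waiting for DSE to start." >&2
    return 1
}

# Stand-in check that succeeds immediately.
wait_for_start 14 true
```

With 14 checks, a service that needs longer than about 14 seconds to come up triggers the warning even though it may still start successfully.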
  9. Verify that the nodes are running and are part of the cluster.
    For example, to verify the cluster status from a local node named dse-node1 on a DSE Multi-Instance cluster:
    $ sudo dse dse-node1 dsetool ring
    With DSE Multi-Instance, the output includes the Server ID:
    Server ID          Address      DC          Rack   Workload    Graph  Status  State    Load       Owns     VNodes                                       Health [0,1)
    42-01-0A-F0-00-02  10.240.0.2   Cassandra   rack1  Cassandra   no     Up      Normal   92.13 KB   46.86%   -9223372036854775808                         0.17        
    42-01-0A-F0-00-02  127.0.0.1    Cassandra   rack1  Cassandra   no     Up      Normal   150.6 KB   53.14%   579561378715200106 
    Using the standard dsetool ring command provides the status of the default node dse:
    $ sudo dsetool ring
    When a DSE Multi-Instance server is present in the cluster, the output always includes the Server ID column, even when you run the command on a server that is not a DSE Multi-Instance host machine:
    
    Server ID          Address      DC          Rack   Workload    Graph  Status  State    Load       Owns     VNodes                                       Health [0,1)
    42-01-0A-F0-00-02  10.240.0.2   Cassandra   rack1  Cassandra   no     Up      Normal   92.13 KB   46.86%   -9223372036854775808                         0.17        
    42-01-0A-F0-00-02  127.0.0.1    Cassandra   rack1  Cassandra   no     Up      Normal   150.6 KB   53.14%   579561378715200106 
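
Because nodes on the same host machine share a Server ID, counting rows per Server ID gives a quick view of how many instances each host runs. The following sketch assumes the column layout shown in the sample output; the function name count_by_server_id is hypothetical.

```shell
#!/bin/sh
# Sketch: count ring entries per Server ID. Rows are recognized by a
# first field made of dash-separated hex pairs, which skips the header.
count_by_server_id() {
    awk '$1 ~ /^[0-9A-Fa-f][0-9A-Fa-f](-[0-9A-Fa-f][0-9A-Fa-f])+$/ { n[$1]++ }
         END { for (id in n) print id, n[id] }'
}

# Sample rows abbreviated from the dsetool ring output in this step.
count_by_server_id <<'EOF'
Server ID          Address      DC          Rack   Workload    Graph  Status  State
42-01-0A-F0-00-02  10.240.0.2   Cassandra   rack1  Cassandra   no     Up      Normal
42-01-0A-F0-00-02  127.0.0.1    Cassandra   rack1  Cassandra   no     Up      Normal
EOF
```

On a host, the ring output could be piped into the function instead, for example: sudo dse dse-node1 dsetool ring | count_by_server_id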
  10. To run standard DataStax Enterprise commands for nodes on a DSE Multi-Instance host machine, specify the node name using this syntax:
    $ sudo dse dse-node-id tool [command_arguments]

    The node ID that is specified with the add-node command is automatically prefixed with dse-. In all instances except for add-node, the command syntax requires the dse- prefix.

    For example, with DSE Multi-Instance, the command to start a Spark shell on a node named dse-spark-node is:
    $ sudo dse dse-spark-node spark
    In contrast, the command to start a Spark shell without DSE Multi-Instance is:
    $ dse spark
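
The prefix rule is easy to get wrong when switching between add-node and other commands. The sketch below builds the per-node form of a command and prints it rather than running it; the function name dse_node_cmd is hypothetical.

```shell
#!/bin/sh
# Sketch: build the per-node form of a dse command. Prints the command
# rather than running it.
dse_node_cmd() {
    node=$1
    shift
    # Every command except add-node needs the dse- prefix.
    case $node in
        dse-*) ;;
        *) node=dse-$node ;;
    esac
    echo "sudo dse $node $*"
}

dse_node_cmd spark-node spark
dse_node_cmd dse-node1 dsetool ring
```

Both forms of the node name, with or without the dse- prefix, produce a command that carries the prefix.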