Analytics node configuration for DSE Hadoop (deprecated)
Steps to configure analytics nodes for DSE Hadoop. Hadoop is deprecated for use with DataStax Enterprise, as are DSE Hadoop and BYOH (Bring Your Own Hadoop).
- Disabling virtual nodes
- Setting the replication factor
- Configuring the verbosity of log messages
- Connecting to non-standard Cassandra native port
Advanced users can also configure DataStax Enterprise to run jobs remotely.
Setting the replication factor
Change the default replication factor to a production-appropriate value of at least 3.
Configuring the verbosity of log messages
To adjust the verbosity of log messages for Hadoop map/reduce tasks, add the following settings to the logback.xml file on each analytic node:
logback.logger.org.apache.hadoop.mapred=WARN
logback.logger.org.apache.hadoop.filecache=WARN
The default location of the logback.xml file depends on the type of installation:
Installer-Services and Package installations | /etc/dse/cassandra/logback.xml |
Installer-No Services and Tarball installations | install_location/resources/cassandra/conf/logback.xml |
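Because logback.xml is an XML file, standard Logback syntax expresses logger levels as <logger> elements rather than as key=value properties. The fragment below is a minimal sketch of how the two settings above would look in that syntax. It assumes the file uses the usual top-level <configuration> element and that the new elements sit alongside any existing <logger> entries; check it against the logback.xml shipped with your installation before applying it.

```xml
<configuration>
  <!-- Existing appenders, loggers, and root settings stay as they are. -->

  <!-- Limit Hadoop map/reduce task logging to warnings and above. -->
  <logger name="org.apache.hadoop.mapred" level="WARN"/>

  <!-- Limit Hadoop file cache logging to warnings and above. -->
  <logger name="org.apache.hadoop.filecache" level="WARN"/>
</configuration>
```

Only the two <logger> lines are new; the surrounding <configuration> element is shown for placement and should not be duplicated in an existing file.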
Connecting to non-standard Cassandra native port
If the Cassandra native port was changed from the default port 9042, you must change the cassandra.input.native.port configuration setting so that Hive and Hadoop use the non-default port. The following examples change Cassandra native protocol connections to use port 9999.
- Inside the Hive shell, set the port after starting the DataStax Enterprise Hive shell:
  dse hive
  hive> set cassandra.input.native.port=9999;
- For general Hive use, add cassandra.input.native.port to the hive-site.xml file. There are two instances of the hive-site.xml file.
  For use with Spark, the default location of the hive-site.xml file is:
  Installer-Services and Package installations | /etc/dse/spark/hive-site.xml |
  Installer-No Services and Tarball installations | install_location/resources/spark/conf/hive-site.xml |
  For use with Hive, the default location of the hive-site.xml file is:
  Installer-Services and Package installations | /etc/dse/hive/hive-site.xml |
  Installer-No Services and Tarball installations | install_location/resources/hive/conf/hive-site.xml |
  <property>
    <name>cassandra.input.native.port</name>
    <value>9999</value>
  </property>
- For Hadoop, add cassandra.input.native.port to the core-site.xml file. The default location of the core-site.xml file depends on the type of installation:
  Installer-Services and Package installations | /etc/dse/hadoop/conf/core-site.xml |
  Installer-No Services and Tarball installations | install_location/resources/hadoop/conf/core-site.xml |
  <property>
    <name>cassandra.input.native.port</name>
    <value>9999</value>
  </property>
Configuration for running jobs on a remote cluster
This information is intended for advanced users.
Procedure
To connect to external addresses: