Configuring Apache Spark™ logging options
You can configure logging options, such as log levels and rolling logging, for the Spark components.
Log directories
The Spark logging directory is the directory where the Spark components store individual log files. DataStax Enterprise places logs in the following locations:
- Executor logs:
  - SPARK_WORKER_DIR/worker-n/application_id/executor_id/stderr
  - SPARK_WORKER_DIR/worker-n/application_id/executor_id/stdout
- Spark Master and Worker logs:
  - Spark Master: the global system.log
  - Spark Worker: SPARK_WORKER_LOG_DIR/worker-n/worker.log

  The default SPARK_WORKER_LOG_DIR location is /var/log/spark/worker.
- Spark SQL Thrift server log:
  The default log directory for starting the Spark SQL Thrift server is $HOME/spark-thrift-server.
- Spark Shell and application logs:
  Spark Shell and application logs are output to the console.
- SparkR shell log:
  The default location of the SparkR shell log is $HOME/.sparkR.log.
- Log configuration files:
  Log configuration files are located in the same directory as spark-env.sh.
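Because executor logs are spread across per-application and per-executor subdirectories, finding them by hand can be tedious. The following shell sketch locates all executor stderr/stdout files under the worker directory; the /var/lib/spark/worker default shown here is an assumption, so verify SPARK_WORKER_DIR in your own spark-env.sh:

```shell
# Sketch: list all executor log files under the Spark worker directory.
# /var/lib/spark/worker is an assumed default; check SPARK_WORKER_DIR in spark-env.sh.
SPARK_WORKER_DIR=${SPARK_WORKER_DIR:-/var/lib/spark/worker}

# Each executor writes stderr and stdout under worker-n/application_id/executor_id/.
find "$SPARK_WORKER_DIR" -type f \( -name stderr -o -name stdout \) 2>/dev/null

# To follow the stderr of every executor on this node:
# tail -f "$SPARK_WORKER_DIR"/worker-*/*/*/stderr
```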
Where is the spark-env.sh file?

The default location of the spark-env.sh file depends on the type of installation:

Installation Type | Location
---|---
Package installations + Installer-Services installations | /etc/dse/spark/spark-env.sh
Tarball installations + Installer-No Services installations | <installation_location>/resources/spark/conf/spark-env.sh
Procedure

To configure Spark logging options:

1. Configure logging options, such as log levels, in the following files:

   Spark logging configuration options

   Component | File
   ---|---
   Executors | logback-spark-executor.xml
   Spark Master | logback.xml
   Spark Worker | logback-spark-server.xml
   Spark Driver (Spark Shell, Spark applications) | logback-spark.xml
   SparkR | logback-sparkR.xml

   Where is the logback.xml file?

   The location of the logback.xml file depends on the type of installation:

   Installation Type | Location
   ---|---
   Package installations + Installer-Services installations | /etc/dse/cassandra/logback.xml
   Tarball installations + Installer-No Services installations | <installation_location>/resources/cassandra/conf/logback.xml
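These files use standard Logback syntax. As an illustration, a fragment like the following in logback-spark.xml raises the log level for a noisy package while leaving the rest at INFO; the logger name here is only an example, so substitute the packages you actually want to tune:

```xml
<!-- Illustrative fragment for logback-spark.xml (standard Logback syntax).
     The logger name below is an example; match it to the packages you want to tune. -->
<configuration>
  <!-- Quiet the Spark scheduler while keeping warnings visible -->
  <logger name="org.apache.spark.scheduler" level="WARN"/>
  <!-- The root level applies to everything without a more specific logger -->
  <root level="INFO"/>
</configuration>
```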
2. To enable rolling logging for Spark executors, add the following options to spark-daemon-defaults.conf.

   This example enables size-based rolling logging, retaining 3 log files before deletion, each with a maximum size of 50,000 bytes:

   spark.executor.logs.rolling.maxRetainedFiles 3
   spark.executor.logs.rolling.strategy size
   spark.executor.logs.rolling.maxSize 50000

   The default location of the Spark configuration files depends on the type of installation:

   - Package installations and Installer-Services: /etc/dse/spark/
   - Tarball installations and Installer-No Services: <installation_location>/resources/spark/conf
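The options above can be appended from the shell. The sketch below assumes a package installation (configuration directory /etc/dse/spark/); adjust CONF_DIR for tarball installations, and note the write guard so the command is a no-op when the file is absent or not writable:

```shell
# Sketch: append executor rolling-log options to spark-daemon-defaults.conf.
# /etc/dse/spark is the package-install default; adjust for tarball installs.
CONF_DIR=${CONF_DIR:-/etc/dse/spark}
CONF_FILE="$CONF_DIR/spark-daemon-defaults.conf"

# Append only if the file exists and is writable.
if [ -w "$CONF_FILE" ]; then
  cat >> "$CONF_FILE" <<'EOF'
spark.executor.logs.rolling.maxRetainedFiles 3
spark.executor.logs.rolling.strategy size
spark.executor.logs.rolling.maxSize 50000
EOF
  # Confirm the options are present:
  grep 'spark.executor.logs.rolling' "$CONF_FILE"
fi
```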
3. Configure a safe communication channel, such as a VPN or SSH, to access the Spark user interface.

   When user credentials are specified in plain text on the dse command line, as in dse -u username -p password, the credentials appear in the logs of Spark workers when the driver is run in cluster mode. The Spark Master, Spark Worker, executor, and driver logs might include sensitive information, such as passwords and digest authentication tokens for Kerberos that are passed on the command line or in the Spark configuration. DataStax recommends using only safe communication channels, such as VPN and SSH, to access the Spark user interface.

   Authentication credentials can be provided in several ways; see Connecting to authentication enabled clusters.
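One way to keep credentials off the command line (and therefore out of worker logs) is to store them in a credentials file read by the dse command. The sketch below assumes your DSE version supports a ~/.dserc file with key=value entries; both the filename and the format should be verified against your release's authentication documentation:

```shell
# Sketch: store credentials in ~/.dserc instead of passing -u/-p on the dse command line.
# The ~/.dserc filename and key=value format are assumptions; verify for your DSE release.
DSERC="$HOME/.dserc"

umask 077                 # new files readable by the owner only
cat > "$DSERC" <<'EOF'
username=myuser
password=mypassword
EOF

# dse tools can then run without plain-text credentials on the command line, e.g.:
# dse spark
```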