Configure DataStax Studio
Studio is a desktop application that automatically saves your work and, typically, runs locally. For basic usage, you probably don’t need to edit the default configuration.
To change configuration settings, do the following:
- Stop Studio if it is running.
- In your Studio installation directory, open the configuration.yaml file, edit the settings as needed, and then save the file. For all options, see the configuration.yaml section.
- To edit JVM settings, create a setenv file for your Studio installation, as explained in JVM settings.
- Restart Studio to apply the changes.
configuration.yaml
The Studio configuration.yaml file is located at the root of your Studio install directory.
In the configuration.yaml file, top-level options are outdented with no leading whitespace.
Child options, if any, are indented by two spaces.
For example:
server:
  httpPort: 9091
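To make the indentation rule concrete, the following minimal Python sketch reads a file laid out this way into nested dictionaries. It is an illustration only: `parse_two_level` is a hypothetical helper, not part of Studio, and it handles only `key: value` lines and `#` comments rather than full YAML syntax.

```python
def parse_two_level(text):
    """Parse top-level and two-space-indented child options into nested dicts.

    Illustration only: values are kept as strings, and anything other than
    zero or exactly two spaces of indentation is rejected.
    """
    config = {}
    section = None  # most recent top-level key that opened a section
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comment lines
        key, _, value = line.strip().partition(":")
        value = value.strip()
        if not line[0].isspace():
            # Top-level option: no leading whitespace.
            if value:
                config[key] = value
                section = None
            else:
                config[key] = {}
                section = key
        elif line.startswith("  ") and not line.startswith("   ") and section:
            # Child option: indented by exactly two spaces under a section.
            config[section][key] = value
        else:
            raise ValueError(f"unexpected indentation: {raw!r}")
    return config


sample = """\
resultSizeLimit: 1000
server:
  httpPort: 9091
  httpBindAddress: localhost
"""
print(parse_two_level(sample))
```

If a child option such as `httpPort` loses its two-space indent, a parser treats it as a separate top-level option instead of a member of the `server` section.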
The following example configuration file includes the default settings.
Advanced options are also available, but they aren’t set by default in the configuration file. Only use advanced options if you are certain you need them. See the following sections for advanced options.
Example Studio configuration file
# Maximum number of items returned per cell execution. Result set will be
# truncated to this number of rows, edges, vertices, etc.
# Default: 1000 items
resultSizeLimit: 1000
# Graph: maximum content length returned for a cell result (bytes).
# Default: 524288 bytes
maxResultSizeBytes: 524288
# Cell execution timeout (milliseconds). A value of 0 indicates no timeout and
# will depend on the DSE server timeouts configured in dse.yaml.
# Default: 0 milliseconds (no timeout)
executionTimeoutMs: 0
# Determines if selected editor text can be executed, instead of the entire
# cell content. When set to 'true':
# Graph cells: only the exact selected text is executed
# CQL and Spark SQL cells: the selected text is expanded to full statement(s)
# before being executed.
# Default: true
executeSelectionEnabled: true
# The timezone used to display Date and Timestamp values.
# Examples: UTC, GMT, CET, America/Los_Angeles, America/New_York, Europe/Paris, Asia/Kolkata, etc
# These zone IDs are defined by Java in java.time.ZoneId
# Some special values are also supported:
# * SYSTEM - use the default system timezone.
# * TIMESTAMP - display as timestamp value (milliseconds since epoch)
# Default: UTC
displayTimezone: UTC
# Studio web server options.
server:
  # Studio web server http port
  # Default: 9091
  httpPort: 9091
  # WARNING: Changing the setting from the default (localhost) can pose a
  # security risk as users on external machines can gain access to notebooks
  # and the DSE clusters those notebooks are connected to.
  # Studio is designed to be used as a desktop application. Distributed
  # deployment introduces potential security risks.
  # Default: localhost
  httpBindAddress: localhost
# Logging options.
logging:
  # Log file name.
  # Default: studio.log
  fileName: studio.log
  # Max log file size.
  # Default: 250 MB
  maxLogFileSize: 250 MB
  # Max number of archived logs to retain.
  # Default: 10
  maxFiles: 10
  # Log directory.
  # Default: ./logs
  directory: ./logs
  # Spark SQL log level.
  # 0: Disable all logging
  # 1: Log severe error events that abort the driver
  # 2: Log errors that may allow driver to continue
  # 3: Log events that might result in an error
  # 4: Log general driver progress information
  # 5: Log detailed driver debug information
  # 6: Log all driver activity
  # Default: 0
  sparkSQLLogLevel: 0
# User application data options
userData:
  # Application data directory location for user data, including connection
  # and notebook data, events, and history.
  # A value of 'null' will translate to ~/.datastax_studio
  # Default: null
  baseDirectory: null
  # Save frequency for minor cell revisions (seconds). Changes considered 'minor'
  # include: cell editor and settings changes. Major changes, such as cell
  # execution and new results always create a new revision immediately.
  # Default: 300 seconds
  historySaveFrequencyInSeconds: 300
  # Determines if notebook revisions should be deleted based on age.
  # When there are more than 'minHistoryRevisionsToKeep' history revisions,
  # revisions older than 'maxDaysOfHistoryToKeep' days are deleted.
  # Default: true
  pruneRevisionHistoryEnabled: true
  # Minimum number of revisions to retain before pruning by date is enforced.
  # Default: 25
  minHistoryRevisionsToKeep: 25
  # Maximum amount of time notebook revision history files are kept before
  # being deleted (days). This age limit is only applied after the minimum
  # number of history revisions to keep ('minHistoryRevisionsToKeep') has
  # been exceeded.
  # Default: 30 days
  maxDaysOfHistoryToKeep: 30
# Database connection options.
connectionManagement:
  # Java driver: socket options connection timeout (milliseconds).
  # Default: 5000 milliseconds
  connectTimeoutInMillis: 5000
  # Java driver: socket options read timeout (milliseconds).
  # Default: 3000 milliseconds
  readTimeoutInMillis: 3000
  # Java driver: Constant reconnect policy delay (milliseconds).
  # Default: 10000 milliseconds
  constantReconnectPolicyDelayInMillis: 10000
  # Tinkerpop driver: Gremlin server port.
  # Default: 8182 using a standard connection, 30046 when using a secure bundle (Astra)
  defaultGremlinPort: 8182
  # Tinkerpop driver: Time limit for establishing a connection (milliseconds).
  # Default: 5000 milliseconds
  maxWaitForConnection: 5000
  # Spark JDBC driver: Spark SQL port.
  # Default: 10000
  defaultSparkSQLPort: 10000
General options
These are general Studio options that aren’t indented under a particular configuration section. For example:
resultSizeLimit: 1000
maxResultSizeBytes: 524288
executionTimeoutMs: 0
executeSelectionEnabled: true
displayTimezone: UTC
- resultSizeLimit
-
Maximum number of items returned per cell execution. Additional items will be truncated.
Default: 1000
- maxResultSizeBytes
-
Maximum size of a cell result in bytes. If a cell result exceeds this size then the cell execution will fail.
Default: 524288
- executionTimeoutMs
-
Cell execution timeout in milliseconds.
Set to 0 for no timeout. For DSE connections, when this option is set to 0, Studio uses the DSE server timeouts configured in dse.yaml. For Astra connections, the Astra server-side timeout applies, if any.
Default: 0
- executeSelectionEnabled
-
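When set to true, only the text selected in the editor is executed instead of the entire cell content. In Graph cells, only the exact selected text is executed. In CQL and Spark SQL cells, the selected text is expanded to full statements before being executed.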
Default: true
- displayTimezone
-
The timezone used to display Date and Timestamp values. Accepts the zone IDs defined by java.time.ZoneId, as well as the special values SYSTEM (use the system timezone) and TIMESTAMP (display the timestamp value in milliseconds since epoch).
Default: UTC
- schemaRefreshIntervalMs (advanced)
-
Schema refresh polling interval in milliseconds.
Default: 3000 (3 seconds)
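As an illustration of how the displayTimezone values differ, the following Python sketch renders one epoch-millisecond value under each kind of setting. The function name `render_timestamp` and the ISO 8601 output format are assumptions made for this example; they don't reflect Studio's actual implementation or display format. Python's `zoneinfo` module is used here only because it accepts region IDs such as Europe/Paris.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def render_timestamp(epoch_ms, display_timezone="UTC"):
    """Render an epoch-millisecond value the way each setting is described.

    Hypothetical helper: the ISO 8601 output format is an assumption.
    """
    if display_timezone == "TIMESTAMP":
        # Special value: show the raw milliseconds since epoch.
        return str(epoch_ms)
    moment = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    if display_timezone == "SYSTEM":
        # Special value: convert to the default system timezone.
        return moment.astimezone().isoformat()
    if display_timezone == "UTC":
        # UTC needs no lookup in the zone database.
        return moment.isoformat()
    # Any other value is treated as a region ID, such as Europe/Paris.
    return moment.astimezone(ZoneInfo(display_timezone)).isoformat()


print(render_timestamp(0, "UTC"))        # 1970-01-01T00:00:00+00:00
print(render_timestamp(0, "TIMESTAMP"))  # 0
```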
Server options
Use these options to configure the Studio web server.
These options are indented under the server: section in the configuration file.
For example:
server:
  httpPort: 9091
  httpBindAddress: localhost
- httpPort
-
The port on which the Studio server listens.
Default: 9091
- httpBindAddress
-
The IP address to which the Studio server is bound.
DataStax Studio is designed to be a local desktop application; distributed deployment introduces potential security risks.
Changing the httpBindAddress setting from the default (localhost) can pose a security risk because users on external machines can gain access to your Studio notebooks and, by extension, the data in the DSE clusters connected to your notebooks.
Default: localhost
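After changing httpPort or httpBindAddress and restarting Studio, one way to confirm that something is accepting connections on the configured address is a plain TCP connection attempt. The following Python sketch shows the idea. The helper name `is_listening` is hypothetical, and a successful connection shows only that some process is listening on that address and port, not that the process is Studio.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False


# Default values from the server section of configuration.yaml.
print(is_listening("localhost", 9091))
```

Running the same check from another machine against the host's external address is a quick way to confirm that a localhost binding isn't reachable from outside.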
Logging options
Use these options to configure logging.
These options are indented under the logging: section in the configuration file.
For example:
logging:
  fileName: studio.log
  maxLogFileSize: 250 MB
  maxFiles: 10
  directory: ./logs
  sparkSQLLogLevel: 0
- fileName
-
Name of the log file.
Default: studio.log
- maxLogFileSize
-
Maximum size of a log file.
Default: 250 MB
- maxFiles
-
Maximum number of log files.
Default: 10
- directory
-
Path of the directory in which log files are stored.
Default: ./logs
- sparkSQLLogLevel
-
Spark SQL log level, from 0 to 6:
  - 0 (default): Disable all logging
  - 1: Log severe error events that cause the driver to stop
  - 2: Log errors that may allow the driver to continue
  - 3: Log events that might result in an error
  - 4: Log general driver progress information
  - 5: Log detailed driver debug information
  - 6: Log all driver activity
User data options
The following user data management options are available.
These options are indented under the userData: section in the configuration file.
For example:
userData:
  baseDirectory: null
  historySaveFrequencyInSeconds: 300
  pruneRevisionHistoryEnabled: true
  minHistoryRevisionsToKeep: 25
  maxDaysOfHistoryToKeep: 30
- baseDirectory
-
The path on the local file system where DataStax Studio stores notebooks, revision history, connections, and other user data. The default is a .datastax_studio folder in your home directory, such as ~/.datastax_studio. Set to a non-null value to override.
The .datastax_studio folder is for creating backups of user data. Don’t edit or copy files from Studio’s user data directories.
User data in DataStax Studio 2.0
If you are upgrading to DataStax Studio 6.8 from 2.0, note that the user data and notebook history functionality has changed. In DataStax Studio 2.0, notebook snapshots were stored across multiple files in the $userdata/eventlog and $userdata/snapshots directories.
Snapshots are still automatically captured in DataStax Studio 6.8, and the notebook history is integrated into the UI. Studio automatically manages all notebook history and user data in the .datastax_studio folder (set by the baseDirectory option). Don’t edit the .datastax_studio folder or its contents.
- historySaveFrequencyInSeconds
-
Time interval, in seconds, between notebook revision saves when only minor changes are made, such as revising cell code or changing settings. Major changes, such as executing a cell and getting a new result, always create a new revision unless the result is identical to the prior values.
Default: 300
- pruneRevisionHistoryEnabled
-
Enable pruning of notebook history revisions. When the number of history revisions exceeds minHistoryRevisionsToKeep, revisions older than maxDaysOfHistoryToKeep days are deleted.
Default: true
- minHistoryRevisionsToKeep
-
Minimum number of notebook revisions to retain before enforcing pruning by date.
Default: 25
- maxDaysOfHistoryToKeep
-
Maximum number of days to retain notebook history revisions.
Default: 30
- connectionsDirectory (advanced)
-
The directory where connections are stored.
Default: connections
- snapshotSaveIntervalInSeconds (advanced)
-
Default: 300
- entityCacheIdleTimeoutInSeconds (advanced)
-
Default: 3600
- maxKeyspaceSessionsPerConnection (advanced)
-
Maximum number of sessions associated with a specific keyspace to keep open at a time. Least recently used sessions are closed first.
Default: 5
- eventReplayTimeoutInSeconds (advanced)
-
Default: 600
- eventReplayBatchSize (advanced)
-
Default: 10
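The interaction between pruneRevisionHistoryEnabled, minHistoryRevisionsToKeep, and maxDaysOfHistoryToKeep can be sketched in Python as follows. This is one reading of the documented rule, not Studio's implementation: the function name `revisions_to_prune` is hypothetical, and the sketch assumes that the newest minHistoryRevisionsToKeep revisions are always retained, which the option descriptions suggest but don't state outright.

```python
from datetime import datetime, timedelta, timezone


def revisions_to_prune(timestamps, now, min_keep=25, max_days=30, enabled=True):
    """Return the revision timestamps that would be deleted.

    Hypothetical helper illustrating the documented rule with default values.
    """
    # No pruning unless it is enabled and the minimum count is exceeded.
    if not enabled or len(timestamps) <= min_keep:
        return []
    cutoff = now - timedelta(days=max_days)
    newest_first = sorted(timestamps, reverse=True)
    # Assumption: the newest min_keep revisions are never pruned.
    candidates = newest_first[min_keep:]
    # Only revisions older than max_days are deleted.
    return [ts for ts in candidates if ts < cutoff]


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
# 30 revisions, saved two days apart, so ages range from 0 to 58 days.
history = [now - timedelta(days=2 * i) for i in range(30)]
print(len(revisions_to_prune(history, now)))  # 5
```

In this example, the revisions aged 32 to 48 days are older than the 30-day limit but are kept because they fall within the newest 25 revisions.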
Connection options
The following connection management options are nested under the connectionManagement: section in the configuration file.
For example:
connectionManagement:
  connectTimeoutInMillis: 5000
  readTimeoutInMillis: 3000
  constantReconnectPolicyDelayInMillis: 10000
  defaultGremlinPort: 8182
  maxWaitForConnection: 5000
  defaultSparkSQLPort: 10000
- connectTimeoutInMillis
-
Connection timeout, in milliseconds, used in Java driver socket options.
Default: 5000
- readTimeoutInMillis
-
Read timeout, in milliseconds, used in Java driver socket options.
Default: 3000
- constantReconnectPolicyDelayInMillis
-
Delay, in milliseconds, used by the Java driver’s constant reconnect policy.
Default: 10000
- defaultGremlinPort
-
The port on a DSE node running a Gremlin Server. For DSE connections, this port must match the value of gremlin_server in the cluster’s dse.yaml file.
The default value depends on the connection type. For DSE connections, the default is 8182. For Astra Classic connections, the default is 30046.
This setting is commented out by default in the configuration file. Only uncomment it if you need to connect to a DSE node running a Gremlin Server on a non-default port.
- maxWaitForConnection
-
Maximum time in milliseconds to wait for a connection to the Gremlin Server.
Default: 5000
- defaultSparkSQLPort
-
The Thrift port on a DSE node running AlwaysOn SQL (AOSS). This port must match the value of thrift_port in the cluster’s dse.yaml file.
This setting isn’t relevant to Astra connections.
Default: 10000
- idleTimeoutInSeconds (advanced)
-
Time, in seconds, after which an unused connection expires and is closed.
Default: 3600 (1 hour)
Security options
These options are indented under a security: section in the configuration file.
This section and its options aren’t included in the configuration file by default because they are advanced options.
- encryptionPasswordFile (advanced)
-
To make encryption of passwords unique for your installation, you can change the password in this file. Use a strong generated password. DataStax recommends following security best practices, including avoiding simple words and phrases.
Default: conf/security/security.properties
JVM settings
You can specify JVM command-line options for the local Studio server.
Studio runs with the following default JVM settings:
- Minimum heap (Xms): 256 MB
- Maximum heap (Xmx): 4 GB
- Temporary directory (java.io.tmpdir): /tmp
These default values are expressed as follows:
export STUDIO_JVM_ARGS="-Xms256m -Xmx4g -Djava.io.tmpdir=/tmp"
You can adjust them as needed for your environment. To adjust these settings, do the following:
- Stop Studio if it is running.
- Create a setenv file:
  - Linux and macOS: Create setenv.sh.
  - Microsoft Windows: Create setenv.bat.
- Add your Studio JVM arguments to the setenv file using the following format:
export STUDIO_JVM_ARGS="JVM_OPTIONS"
For example, to change the maximum heap size to 8 GB, add the following line to the setenv file:
export STUDIO_JVM_ARGS="$JVM_OPTS -Xmx8g"
- Save the file in the same directory as your Studio server.sh file, such as datastax-studio/bin/setenv.sh.
- Restart Studio to apply the changes.