DataStax Enterprise 5.1 release notes

DataStax Enterprise release notes cover cluster requirements, upgrade guidance, components, security updates, changes and enhancements, issues, and resolved issues for DataStax Enterprise (DSE) 5.1.
Note: Each point release includes a highlights and executive summary section to provide guidance and add visibility to important improvements.

Requirement for Uniform Licensing

All nodes in each cluster must be uniformly licensed to use the same subscription. For example, if a cluster contains 5 nodes, all 5 nodes within that cluster must be either DataStax Distribution of Apache Cassandra™, or all 5 nodes must be DataStax Enterprise. Mixing different subscriptions within a cluster is not permitted. The DataStax Advanced Workloads Pack may be added to any DataStax Enterprise (not DataStax Distribution of Apache Cassandra) cluster in an incremental fashion. For example, a 10-node DSE cluster may be extended to include 3 nodes of the Advanced Workloads Pack. “Cluster” means a collection of nodes running the software which communicate with one another using gossip. See Enterprise Terms.

Note: For third-party software, see DataStax Enterprise 5.1.x third-party software (not all entries apply to DDAC).

Before you upgrade

Upgrade advice and compatibility
Before you upgrade to a later major version, upgrade to the latest patch release (5.1.17) on your current version. Be sure to read the relevant upgrade documentation. Upgrades to DSE 5.1 are supported from:
Product compatibility: Check the DSE 5.1 product compatibility page for your products.
DataStax Drivers: You may need to recompile your client application code. See Upgrading DataStax drivers.
DataStax Bulk Loader: Use DataStax Bulk Loader for loading and unloading data. It loads data into DSE 5.0 or later and unloads data from any Apache Cassandra™ 2.1 or later data source.

DSE 5.1.17

Release notes for DataStax Enterprise 5.1.17.

  • Select Hadoop libraries

    Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0, and were removed in DSE 5.1. Hadoop removal from DSE 5.1 and later means that DSE does not allow for the startup of Hadoop services previously included in DSE, including MapReduce JobTracker and TaskTracker.

    However, DSE has supported built-in Spark since DSE 4.5 and Bring-Your-Own-Spark (BYOS) since DSE 5.0, and that support continues today. Because Spark depends on certain Hadoop libraries on the server and the client, DSE continues to ship with Hadoop libraries that are required for running Spark and BYOS.

    To view the included Hadoop libraries, see DataStax Enterprise 5.1.x third-party software.

Important: DataStax recommends the latest patch release for most environments.

2 October 2019

Table 1. DSE functionality

5.1.17 Components

All components from DSE 5.1.17 are listed. Components that are updated for DSE 5.1.17 are indicated with an asterisk (*).
  • Apache Solr™ 6.0.1.0.2524 *
  • Apache Spark™ 2.0.2.27 *
  • Apache TinkerPop™ 3.2.11
  • Apache Tomcat® 8.0.53
  • DataStax Spark Cassandra Connector 2.0.11
  • DSE Java Driver 1.2.8 *
  • Netty 4.0.54.Final
  • Spark Jobserver 0.6.2.239 requires compatible API *
  • Select Hadoop libraries

DSE 5.1.17 is compatible with Apache Cassandra™ 3.11 and adds production-certified changes to Cassandra.

5.1.17 Highlights

High-value benefits of upgrading to DSE 5.1.17 include these highlights:

5.1.17 DataStax Enterprise database core highlights

  • New options to select cipher suite and protocol to configure KMIP encryption when connecting to a KMIP server. (DSP-17294)
  • Standalone cqlsh client tool provides an interface for developers to interact with the database and issue CQL commands without having to install the database software. From DataStax Labs, download the version of CQLSH that corresponds to your DataStax database version. (DSP-18694)

5.1.17 DSE Graph highlights

  • Fixed an issue where T values get hidden by property keys of the same name in valueMap(). (DSP-19261)

5.1.17 DSE Search highlights

  • Improved logging and tools to identify and troubleshoot expensive Solr queries to prevent performance issues. (DSP-18693)

5.1.17 DataStax Enterprise core

Changes and enhancements:
  • Improved troubleshooting. A log entry is now created when autocompaction is disabled or enabled for a table. (DB-1635)
  • Reformatted StatusLogger output to reduce details in the INFO level system.log. The detailed output is still present in the debug.log. (DB-2552)
  • Prevent changing the replication strategy of system keyspaces. (DB-2960)
  • New nodetool commands to get current values: getcachecapacity, getcachekeystosave, and gethintedhandoffthrottlekb. (DB-3618)
  • New options to select cipher suite and protocol to configure KMIP encryption when connecting to a KMIP server. (DSP-17294)
  • Upgrade Jackson Databind to address CVE-2018-11307 and CVE-2018-19361. (DB-2911, DSP-18099, DSP-19319)
  • Standalone cqlsh client. (DSP-18694)
  • Update Jackson Databind to 2.9.9.1 for all components except DataStax Bulk Loader. (DSP-19441)
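
The three nodetool commands added by DB-3618 can be called together to print the current values on a node. The following is a minimal sketch, not an official procedure: the function name show_current_values and the NODETOOL override variable are assumptions made for illustration, and only the three command names come from the notes above.

```shell
# NODETOOL is an assumed override so the sketch can point at a specific binary.
NODETOOL="${NODETOOL:-nodetool}"

# Print the current value reported by each of the getter commands from DB-3618.
show_current_values() {
  for cmd in getcachecapacity getcachekeystosave gethintedhandoffthrottlekb; do
    echo "== nodetool $cmd =="
    # Report a failure on stderr but keep going with the remaining commands.
    "$NODETOOL" "$cmd" || echo "failed: $NODETOOL $cmd" >&2
  done
}

# Example call on a node where nodetool is on the PATH:
#   show_current_values
```
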
Resolved issues:
  • Fix to prevent NPE during repair in mixed-version clusters. (DB-1985)
  • In tarball installations, running two instances on the same physical server with remote JMX access, each bound to a separate IP address on port 7199, causes the JMX error Address already in use (Bind failed) because com.sun.management.jmxremote.host is ignored. (DB-2483)
  • DSE fails to start with ERROR Attempted serializing to buffer exceeded maximum of 65535 bytes. Improved error to identify a workaround for commitlog corruption. (DB-3162)
  • sstablepartitions incorrectly handles -k and -x options. (DB-3442)

    Workaround: To specify multiple keys, repeat the -k or -x option several times.

  • After upgrades from DSE 4.8 to DSE 5.0 or DSE 5.1, deleted data might be resurrected when rows were deleted on DSE 4.8 tables with multiple collection columns. (DB-3492)

    This fix provides protection from this potential condition. Reappearing rows are naturally fixed by compaction.

    Note: If you experience reappearing data after upgrading from DSE 4.8, DataStax recommends running nodetool scrub to correct potentially affected SSTables.
  • Reads against older version ma and mc SSTables hit more SSTables than necessary due to the bug fixed by CASSANDRA-14861. (DB-3691)
    Attention: DataStax recommends reading and following all upgrade instructions in Upgrade DataStax Enterprise documentation. Do not skip this step:
    Upgrade the SSTables on each node to ensure that all SSTables are on the current version.
    If the SSTables are already on the current version, the command returns immediately and no action is taken. See DataStax Enterprise, Apache Cassandra, CQL, and SSTable compatibility.
  • Upgraded Apache MINA Core library to 2.0.21 to prevent a security issue where Apache MINA Core was vulnerable to information disclosure. (DSP-19213)
  • Error in custom provider prevents DSE node startup. With this fix, the node will start up but insights is not active. See the DataStax Support Knowledge Base for steps to resolve existing missing or incorrect keyspace replication problems. (DSP-19521)
  • Latency metrics, like dse_client_request_latency_bucket, are not present. (DSP-19549)
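
The sstablepartitions workaround for DB-3442 (repeat the -k or -x option once per key) can be scripted. The sketch below assumes that sstablepartitions accepts its options before the SSTable path; the function name partitions_for_keys and the SSTABLEPARTITIONS override variable are hypothetical.

```shell
# SSTABLEPARTITIONS is an assumed override for the tool location.
SSTABLEPARTITIONS="${SSTABLEPARTITIONS:-sstablepartitions}"

# usage: partitions_for_keys <sstable-path> <key>...
# Passes every key as its own -k option instead of one combined option.
partitions_for_keys() {
  sstable=$1; shift
  n=$#
  # Rotate through the arguments, replacing each key with "-k key".
  while [ "$n" -gt 0 ]; do
    key=$1; shift
    set -- "$@" -k "$key"
    n=$((n - 1))
  done
  "$SSTABLEPARTITIONS" "$@" "$sstable"
}

# Example call (the path and keys are placeholders):
#   partitions_for_keys /path/to/sstable key1 key2
```

The same loop works for excluded keys if -k is replaced with -x.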
Known issues:
  • On Oracle Linux 7.x, StorageService.java:4970 exception occurs with DSE package installation. (DSP-19625)

    Workaround: On Oracle Linux 7.x operating systems, install DSE using the binary tarball.

5.1.17 DSE Graph

Changes and enhancements:
  • New graph truncate command to remove all data from graph. (DSP-17609)
  • Support for ifExists() before truncate(), like system.graph("foo").ifExists().truncate(), in DSE Graph (classic graph) API. (DSP-19357)
Resolved issues:
  • T values are hidden by property keys of the same name in valueMap(). (DSP-19261)
  • Credentials are not masked in the debug level logs for Spark Jobserver and Spark submitted jobs. (DSP-19490)

5.1.17 DSE Search

Changes and enhancements:
  • Improved logging and tools to identify and troubleshoot expensive Solr queries to prevent performance issues. (DSP-18693)
  • For token ranges dictated by distribution, filter cache warming occurs when a node is restarted, a search index is rebuilt, or when node health score is up to 0.9. New per-core metrics for metric type WarmupMetrics and other improvements. (DSP-8621)
Resolved issues:
  • A Solr CQL count query incorrectly returns the total count of all data instead of the total count minus the start offset. (DSP-16153)
  • No validation error is returned when docValues are applied to types that do not allow docValues. (DSP-16884)
    With this fix, the following exception behavior is applied:
    • Throw exception when docValues:true is specified for a column and column type does not support docValues.
    • Do not throw exception and ignore docValues:true for columns with types that do not support docValues if docValues:true is set for *.
  • While using live indexing, also known as RT or real-time indexing, a race condition can be triggered when concurrently indexing and running heavy facet queries. The race condition fails an assertion that, in turn, fails searcher opening and leaves the index in an inconsistent state. (DSP-18786)
  • When driver uses paging, CQL query fails when using a Solr index to query with a sort on a field that contains the primary key name in the field: InvalidRequest: Error from server: code=2200 [Invalid query] message="Cursor functionality requires a sort containing a uniqueKey field tie breaker". (DSP-19210)
  • The count() query with Solr enabled can be inaccurate or inconsistent. (DSP-19401)

Cassandra enhancements for DSE 5.1.17

A list of DataStax Enterprise 5.1.17 enhancements to Apache Cassandra™ 3.11.

DataStax Enterprise (DSE) 5.1.17 includes all changes from previous releases. These production-certified changes are enhancements to Apache Cassandra 3.11. (For Cassandra updates, see CHANGES.txt.)

  • Skipping illegal legacy cells can break reverse iteration of indexed partitions. (CASSANDRA-15178)
  • Skip cells with illegal column names when reading legacy SSTables. (CASSANDRA-15086)
  • SSTable min/max metadata can cause data loss. (CASSANDRA-14861)

General upgrade advice for DSE 5.1.17

General upgrade advice for DataStax Enterprise 5.1.17.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for DSE 5.1.17

A list of DataStax Enterprise 5.1.17 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.17 includes DataStax Spark Cassandra Connector 2.0.11 with all changes from earlier versions.

TinkerPop changes for DSE 5.1.17

Enhancements to Apache TinkerPop 3.2.11.

DataStax Enterprise (DSE) 5.1.17 includes all changes from previous releases. These production-certified changes are enhancements to Apache TinkerPop™ 3.2.11. For TinkerPop changes, see TinkerPop Upgrade Information.

DSE 5.1.16

Release notes for DataStax Enterprise 5.1.16.

Important: DataStax recommends the latest patch release for most environments.

9 July 2019

5.1.16 Components

All components from DSE 5.1.16 are listed. Components that are updated for DSE 5.1.16 are indicated with an asterisk (*).
  • Apache Solr™ 6.0.1.0.2463 *
  • Apache Spark™ 2.0.2.25
  • Apache TinkerPop™ 3.2.11
  • Apache Tomcat® 8.0.53
  • DataStax Spark Cassandra Connector 2.0.11
  • DSE Java Driver 1.2.7
  • Netty 4.0.54.Final
  • Spark Jobserver 0.6.2.238 requires compatible API
  • Select Hadoop libraries

DSE 5.1.16 is compatible with Apache Cassandra™ 3.11 and includes all DataStax enhancements from earlier versions.

5.1.16 DataStax Enterprise

Important bug fix:
  • Fixed possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
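
The minimum versions named in the warning can be checked mechanically before deciding whether an upgrade is required. This is a sketch only: the function name has_tiered_storage_fix is hypothetical, the input is assumed to be a major.minor.patch version string, and release lines other than 5.1, 6.0, and 6.7 are reported as not covered because the warning does not mention them.

```shell
# Returns 0 when the version is at or above the minimum named for DB-3404,
# 1 when it is below the minimum, and 2 for release lines the warning omits.
has_tiered_storage_fix() {
  version=$1
  major=${version%%.*}
  rest=${version#*.}
  minor=${rest%%.*}
  patch=${rest#*.}
  patch=${patch%%[!0-9]*}   # drop any suffix after the numeric patch level
  case "$major.$minor" in
    5.1) min_patch=16 ;;
    6.0) min_patch=8 ;;
    6.7) min_patch=4 ;;
    *) return 2 ;;          # not covered by the warning above
  esac
  [ "${patch:-0}" -ge "$min_patch" ]
}

# Example:
#   has_tiered_storage_fix 5.1.15 || echo "upgrade required if using DSE Tiered Storage"
```
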

Cassandra enhancements for DSE 5.1.16

A list of DataStax Enterprise 5.1.16 enhancements to Apache Cassandra™ 3.11.

DataStax Enterprise (DSE) 5.1.16 includes all changes from previous releases that are enhancements to Apache Cassandra 3.11. (For Cassandra updates, see CHANGES.txt.)

General upgrade advice for DSE 5.1.16

General upgrade advice for DataStax Enterprise 5.1.16.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for DSE 5.1.16

A list of DataStax Enterprise 5.1.16 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.16 includes DataStax Spark Cassandra Connector 2.0.11 with all changes from earlier versions.

TinkerPop changes for DSE 5.1.16

Enhancements to Apache TinkerPop 3.2.11.

DataStax Enterprise (DSE) 5.1.16 includes all changes from previous releases. For TinkerPop changes, see TinkerPop Upgrade Information.

DSE 5.1.15

Release notes for DataStax Enterprise 5.1.15.

dse.yaml

The location of the dse.yaml file depends on the type of installation:

  • Package installations and Installer-Services installations: /etc/dse/dse.yaml
  • Tarball installations and Installer-No Services installations: installation_location/resources/dse/conf/dse.yaml

Important: DataStax recommends the latest patch release for most environments.

11 June 2019

Table 2. DSE functionality

5.1.15 Components

All components from DSE 5.1.15 are listed. Components that are updated for DSE 5.1.15 are indicated with an asterisk (*).
  • Apache Solr™ 6.0.1.0.2463 *
  • Apache Spark™ 2.0.2.25
  • Apache TinkerPop™ 3.2.11
  • Apache Tomcat® 8.0.53
  • DataStax Spark Cassandra Connector 2.0.11
  • DSE Java Driver 1.2.7
  • Netty 4.0.54.Final
  • Spark Jobserver 0.6.2.238 requires compatible API
  • Select Hadoop libraries

DSE 5.1.15 is compatible with Apache Cassandra™ 3.11 and adds production-certified changes to Cassandra.

5.1.15 Highlights

High-value benefits of upgrading to DSE 5.1.15 include these highlights:

5.1.15 DSE Analytics highlights

  • When DSE authentication is enabled, Spark security is forced to be enabled. (DSP-17274)

5.1.15 DSE Graph highlights

  • DseGraphFrame cannot directly copy graph from one cluster to another. You can now dynamically pass cluster and connection configuration for different graph objects. (DSP-18605)
  • UnsatisfiedLinkError when inserting multi-edges with DseGraphFrame in BYOS (Bring Your Own Spark). (DSP-18916)

5.1.15 DSE Search highlights

  • Performance improvements to Solr deletes that correspond to Cassandra rows. (DSP-17419)
  • Changes to correct uneven distribution of shard requests with the STATIC set cover finder. (DSP-18197)
  • New recommended method for case-insensitive text search, faceting, grouping, and sorting with new LowerCaseStrField Solr field type. This type sets field values as lowercase and stores them as lowercase in docValues. (DSP-18763)
  • The queryExecutorThreads and timeAllowed Solr parameters can be used together. (DSP-18717)

5.1.15 DataStax Enterprise

Resolved issues:

  • Improved logging identifies which client, keyspace, table, and partition key are rejected when a mutation exceeds the size threshold. (DB-1051)
  • Nodes in a cluster continue trying to connect to a decommissioned node. (DB-2886)
  • Bootstrap should fail if the node is not able to fetch the schema from other nodes in the cluster. (DB-3186)
  • Slow startup or node hangs when encryption is used. (DB-3050)

Known issue:

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.

5.1.15 DSE Analytics

Changes and enhancements:
  • A warning message is displayed when DSE authentication is enabled, but Spark security is not enabled. (DSP-17273)
  • When DSE authentication is enabled, Spark security is forced to be enabled. (DSP-17274)
    In dse.yaml, Spark security is enforced as follows:
    • authentication_options: when enabled: true, Spark security is enforced.
    • spark_security_enabled: this setting is ignored.
    • spark_security_encryption_enabled: this setting is ignored.

Known issues:

  • When the Spark security options are not configured in dse.yaml, the native CQL protocol authentication can be sidestepped with direct access to the Netty RPC client. Although this access should fail to run Spark applications, the CQL authentication can be bypassed on systems with an open Netty port 7077 using Spark RPC. (DSP-17271)
    Solution: Configure the Spark security options in dse.yaml:
    spark_shared_secret_bit_length: 256
    spark_security_enabled: true
    spark_security_encryption_enabled: true
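
A quick way to confirm that a node carries the three settings from the solution above is to search its dse.yaml for them. The following sketch assumes the settings appear as top-level, single-line entries; the function name check_spark_security is hypothetical, and the path in the example call is the package-installation location listed elsewhere in these notes.

```shell
# Checks that a dse.yaml file contains the Spark security settings from DSP-17271.
# Returns 0 when all three are present with the expected values, 1 otherwise.
check_spark_security() {
  dse_yaml=$1
  status=0
  for setting in \
    'spark_shared_secret_bit_length: 256' \
    'spark_security_enabled: true' \
    'spark_security_encryption_enabled: true'
  do
    key=${setting%%:*}
    value=${setting#*: }
    # Match "key: value" at the start of a line, ignoring trailing comments.
    if grep -Eq "^[[:space:]]*${key}:[[:space:]]*${value}([[:space:]]|#|\$)" "$dse_yaml"; then
      echo "ok: $setting"
    else
      echo "missing or different: $setting"
      status=1
    fi
  done
  return "$status"
}

# Example call for a package installation:
#   check_spark_security /etc/dse/dse.yaml
```
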

5.1.15 DSE Graph

Resolved issues:
  • DseGraphFrame cannot directly copy graph from one cluster to another. You can now dynamically pass cluster and connection configuration for different graph objects. (DSP-18605)
    Workaround for earlier versions:
    1. Export graph to DSEFS:
      g.V.write.format("csv").save("dsefs://cluster1/tmp/vertices")
      g.E.write.format("csv").save("dsefs://cluster1/tmp/edges")
    2. Import graph to the other cluster:
      g.updateVertices(spark.read.format("csv").load("dsefs://cluster1/tmp/vertices"))
      g.updateEdges(spark.read.format("csv").load("dsefs://cluster1/tmp/edges"))
  • UnsatisfiedLinkError when inserting multi-edges with DseGraphFrame in BYOS (Bring Your Own Spark). (DSP-18916)
  • DSE Graph does not use primary key predicate in Search/.has() predicate. (DSP-18993)

5.1.15 DSE Search

Changes and enhancements:
  • Changes to correct uneven distribution of shard requests with the STATIC set cover finder. (DSP-18197)
  • New recommended method for case-insensitive text search, faceting, grouping, and sorting with new LowerCaseStrField custom Solr field type. This type sets field values as lowercase and stores them as lowercase in docValues. (DSP-18763)
    Note: DataStax does not support using the TextField Solr field type with solr.KeywordTokenizer and solr.LowerCaseFilterFactory to achieve single-token, case-insensitive indexing on a CQL text field.
Resolved issues:
  • SASI queries don't work on tables with row level access control (RLAC). (DB-3082)
  • Documents might not be removed from the index if a key element has value equal to a Solr reserved word. (DSP-17419)
  • FQ broken with queryExecutorThreads and timeAllowed set. (DSP-18717)
  • Search should error out, rather than timeout, on Solr query with non-existing field list (fl) fields. (DSP-18218)

Cassandra enhancements for DSE 5.1.15

A list of DataStax Enterprise 5.1.15 enhancements to Apache Cassandra™ 3.11.

DataStax Enterprise (DSE) 5.1.15 includes all changes from previous releases that are enhancements to Apache Cassandra 3.11. (For Cassandra updates, see CHANGES.txt.)

General upgrade advice for DSE 5.1.15

General upgrade advice for DataStax Enterprise 5.1.15.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for DSE 5.1.15

A list of DataStax Enterprise 5.1.15 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.15 includes DataStax Spark Cassandra Connector 2.0.11 with all changes from earlier versions, and adds these production-certified changes:
  • Added case in StringConverter to properly output InetAddress. (SPARKC-559)
  • Added java.time.Instant -> java.util.Date conversion. (SPARKC-560)
  • RegularStatements not cached by SessionProxy. (SPARKC-558)
  • Fix CassandraSourceRelation option Parsing in Spark 2.0. (SPARKC-551)

TinkerPop changes for DSE 5.1.15

Enhancements to Apache TinkerPop 3.2.11.

DataStax Enterprise (DSE) 5.1.15 includes all changes from previous releases. These production-certified changes are enhancements to Apache TinkerPop™ 3.2.11. For TinkerPop changes, see TinkerPop Upgrade Information.
  • Graph OLAP: secret tokens are redacted in log files.
  • Masked sensitive configuration options in the logs of KryoShimServiceLoader.
  • Changes to the SSL configuration in Gremlin Server. See the TinkerPop SSL Security documentation.

DSE 5.1.14

Release notes for DataStax Enterprise 5.1.14.

Important: DataStax recommends the latest patch release for most environments.

16 April 2019

5.1.14 Components

All components from DSE 5.1.14 are listed. Components that are updated for DSE 5.1.14 are indicated with an asterisk (*).
  • Apache Solr™ 6.0.1.0.2414 *
  • Apache Spark™ 2.0.2.25 *
  • Apache TinkerPop™ 3.2.11 *
  • Apache Tomcat® 8.0.53
  • DataStax Spark Cassandra Connector 2.0.11 *
  • DSE Java Driver 1.2.7
  • Netty 4.0.54.Final
  • Spark Jobserver 0.6.2.238 requires compatible API
  • Select Hadoop libraries

DSE 5.1.14 is compatible with Apache Cassandra™ 3.11 and adds production-certified changes to Cassandra.

Table 3. DSE functionality

5.1.14 Highlights

Executive summary highlights for DSE 5.1.14: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.14 DataStax Enterprise highlights

  • DataStax Enterprise Metrics Collector aggregates DSE metrics and integrates with existing monitoring solutions to facilitate problem resolution and remediation. (DSP-17869)
  • Fixed anti-compaction transaction for atomicity and index building. (DB-3016)
  • Remedy deadlock during node startup when calculating disk boundaries. (DB-3028)
  • Correct handling of dropped UDT columns in SSTables. (DB-3031)

    Workaround: If issues with UDTs in SSTables exist after upgrade from DSE 5.0.x, run sstablescrub -e fix-only offline on the SSTables that have or had UDTs that were created in DSE 5.0.x.
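
The sstablescrub workaround for DB-3031 can be applied to several tables in one pass. This is a sketch under stated assumptions: it assumes sstablescrub takes a keyspace and a table name after its options, the function name fix_udt_headers and the SSTABLESCRUB override variable are hypothetical, and the keyspace and table names in the example call are placeholders.

```shell
# SSTABLESCRUB is an assumed override for the tool location.
SSTABLESCRUB="${SSTABLESCRUB:-sstablescrub}"

# usage: fix_udt_headers <keyspace> <table>...
# Run only while the node is stopped, because the workaround is an offline scrub.
fix_udt_headers() {
  keyspace=$1; shift
  for table in "$@"; do
    echo "scrubbing $keyspace.$table with -e fix-only"
    # Stop at the first failure so a problem is not hidden by later tables.
    "$SSTABLESCRUB" -e fix-only "$keyspace" "$table" || return 1
  done
}

# Example call (placeholder names):
#   fix_udt_headers my_keyspace table_with_udts another_table
```
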

5.1.14 DSE Analytics and DSEFS highlights

  • Fixed an issue where properties unattached to vertex show up with null values. (DSP-12300)
  • DSEFS auth demo is fixed. (DSP-17700)
  • Fixed a leak in BulkTableWriter. (DSP-18513)

5.1.14 DSE Graph highlights

  • Time, date, inet, and duration data types are now supported in graph search indexes. (DSP-17694)
  • Data caching improvements during DSE GraphFrame operations. (DSP-17870)
  • DseGraphFrame supports properties with symbols, like period (.), in names. (DSP-17818)
  • Improved graph robustness in resource-constrained environments. (DSP-18005)
  • Graph OLAP: secret tokens are redacted in log files. (DSP-18074)
  • Some minor DSE GraphFrame code fixes. (DSP-18215)

5.1.14 DSE Search highlights

  • Fixed a class of SSTable reference leaks. (DSP-17975)
  • Indexing rows that contain frozen maps is supported. (DSP-18073)
  • Fixed timestamp PK routing with solr_query. (DSP-18223)
  • Fixed facets and stats queries when using queryExecutorThreads. (DSP-18237, DSP-18665)

5.1.14 Known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DSE Analytics: Spark application Web UI: When TLS is enabled and a driver is submitted in cluster mode, the driver starts on port 4040 and is not secured. (DSP-16926)
    Workaround: To enable SSL for a Spark application Web UI with secure HTTPS on port 4440, see the Spark documentation for SSL Configuration. To pass the SSL configuration with standard Spark commands, use --conf options, as in this dse spark-submit example:
    dse spark-submit --conf spark.ssl.ui.enabled=true \
    --conf spark.ssl.ui.keyPassword=ctool_keystore \
    --conf spark.ssl.ui.keyStore=/home/automaton/ctool_security/ctool_keystore

5.1.14 DataStax Enterprise

Resolved issues:

  • Native server Message.Dispatcher.Flusher task stalls under heavy load. (DB-1814)
  • Reference leak in SSTableRewriter in sstableupgrade when keepOriginals is true. (DB-2944)
  • Anti-compaction transaction causes temporary data loss. (DB-3016)
  • Check of two versions of metadata for a column fails on upgrade from DSE 5.0.x when type is not of same class. Loosen the check from CASSANDRA-13776 to prevent Trying to compare 2 different types ERROR on upgrades. (DB-3021)
  • Deadlock during node startup when calculating disk boundaries. (DB-3028)
  • Correct handling of dropped UDT columns in SSTables. (DB-3031)
  • Mishandling of frozen in complex nested types. (DB-3081)
  • cqlsh EXECUTE AS command does not work. (DB-3098)
  • Security: java-xmlbuilder is vulnerable to XML external entities (XXE). (DSP-13962)
  • Timestamp PK routing on solr_query fails. (DSP-18223)
  • Leak in BulkTableWriter. (DSP-18513)

5.1.14 DSE Analytics

Resolved issues

  • dse client-tool configuration byos-export does not export required Spark properties. (DSP-15938)
  • CVE-2018-1334 Apache Spark local privilege escalation vulnerability. (DSP-16715)
  • Downloaded Spark JAR files are executable for all users. (DSP-17692)
  • Spark Cassandra Connector does not properly cache manually prepared RegularStatements. See SPARKC-558. (DSP-18075)
  • Invalid options show for dse spark-submit command line help. (DSP-18293)

5.1.14 DSEFS

Resolved issues

  • DSEFS demo does not work. (DSP-17700)
  • Change dsefs:// default port when the DSEFS setting public_port is changed in dse.yaml. (DSP-17962)
  • SparkContext closing is faulty with significantly increased shutdown time. (DSP-17699)
  • DSEFS WebHDFS API GETFILESTATUS op returns AccessDeniedException for the file even when user has correct permission. (DSP-18044)

5.1.14 DSE Graph

Resolved issues

  • Do not report errors for leases when a DC is removed. (DSP-16801)
  • Properties unattached to vertex show up with null values. (DSP-12300)
  • g.V().repeat(...).until(...).path() returns incomplete path without edges. (DSP-17933)
  • DseGraphFrame fails to read properties with symbols, like period (.), in names. (DSP-17818)
  • DSE GraphFrame operations cache but do not explicitly uncache. (DSP-17870)
  • Inconsistent results when using gremlin on static data. (DSP-18005)
  • Graph OLAP: secret tokens are unmasked in log files. (DSP-18074)
  • Unexpected gossip failure. java.lang.NullPointerException: null. (DSP-18194)
  • OLAP traversal duplicates the partition key properties: OLAP g.V().properties() prints 'first' vertex n times with custom ids. (DSP-15688)
  • Time, date, inet, and duration data types are not supported in graph search indexes. (DSP-17694)

5.1.14 DSE Search

Resolved issues

  • java.lang.AssertionError: rtDocValues.maxDoc=5230 maxDoc=4488 error is thrown in the system.log during indexing and reindexing. (DSP-17529)
  • Strong self-ref loop detected after reindex is finished. (DSP-17975)
  • Loading frozen map columns fails during search read-before-write. (DSP-18073)
  • Avoid interrupting request threads when an internode handshake fails so that the Lucene file channel lock cannot be interrupted. (DSP-18211)
  • Facets and stats queries broken when using queryExecutorThreads. (DSP-18237, DSP-18665)

Cassandra enhancements for DSE 5.1.14

A list of DataStax Enterprise 5.1.14 enhancements to Apache Cassandra™ 3.11.

DataStax Enterprise (DSE) 5.1.14 includes all changes from previous releases. This production-certified change is an enhancement to Apache Cassandra 3.11. (For Cassandra updates, see CHANGES.txt.)

  • Severe concurrency issues in STCS, DTCS, TWCS, TMD.Topology, and TypeParser. (CASSANDRA-14781)

General upgrade advice for DSE 5.1.14

General upgrade advice for DataStax Enterprise 5.1.14.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for DSE 5.1.14

A list of DataStax Enterprise 5.1.14 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.14 includes DataStax Spark Cassandra Connector 2.0.11 with all changes from earlier versions, and adds these production-certified changes:
  • Added case in StringConverter to properly output InetAddress (SPARKC-559)
  • Added java.time.Instant -> java.util.Date conversion (SPARKC-560)
  • RegularStatements not Cached by SessionProxy (SPARKC-558)
  • Fix CassandraSourceRelation option Parsing in Spark 2.0 (SPARKC-551)

TinkerPop changes for DSE 5.1.14

Enhancements to Apache TinkerPop 3.2.11.

DataStax Enterprise (DSE) 5.1.14 includes all changes from previous releases. These production-certified changes are enhancements to Apache TinkerPop™ 3.2.11. For TinkerPop changes, see TinkerPop Upgrade Information.
  • Graph OLAP: secret tokens are redacted in log files.
  • Masked sensitive configuration options in the logs of KryoShimServiceLoader.
  • Changes to the SSL configuration in Gremlin Server. See the TinkerPop SSL Security documentation.

DSE 5.1.13

Release notes for DataStax Enterprise 5.1.13.

  • Select Hadoop libraries

    Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0, and were removed in DSE 5.1. Hadoop removal from DSE 5.1 and later means that DSE does not allow for the startup of Hadoop services previously included in DSE, including MapReduce JobTracker and TaskTracker.

    However, DSE has supported built-in Spark since DSE 4.5 and Bring-Your-Own-Spark (BYOS) since DSE 5.0, and that support continues today. Because Spark depends on certain Hadoop libraries on the server and the client, DSE continues to ship with Hadoop libraries that are required for running Spark and BYOS.

    To view the included Hadoop libraries, see DataStax Enterprise 5.1.x third-party software.

Important: DataStax recommends the latest patch release for most environments.

27 February 2019

5.1.13 Components

All components from DSE 5.1.13 are listed.

  • Apache Solr™ 6.0.1.0.2370
  • Apache Spark™ 2.0.2.22
  • Apache TinkerPop™ 3.2.9-20181026-f24c1d4b
  • Apache Tomcat® 8.0.53
  • DataStax Spark Cassandra Connector 2.0.10
  • DSE Java Driver 1.2.7
  • Netty 4.0.54.Final
  • Spark Jobserver 0.6.2.238 requires compatible API
  • Select Hadoop libraries

DSE 5.1.13 is compatible with Apache Cassandra™ 3.11 and includes all production-certified changes from previous releases.

5.1.13 Resolved issue

  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)

    If the DSE 5.0.x schema contains user-defined types (UDTs), the SSTable serialization headers are fixed when DSE is started with DSE 5.1.13 or later.

5.1.13 Known issue

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.

Cassandra enhancements for DSE 5.1.13

A list of DataStax Enterprise 5.1.13 enhancements to Apache Cassandra™ 3.11.

DataStax Enterprise (DSE) 5.1.13 includes all changes from previous releases that are enhancements to Apache Cassandra 3.11. (For Cassandra updates, see CHANGES.txt.)

General upgrade advice for DSE 5.1.13

General upgrade advice for DataStax Enterprise 5.1.13.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

DSE 5.1.13

Upgrading
  • SSTables for tables with a frozen UDT written by DSE 5.0 or Cassandra 3.0 appear as corrupted. See DB-2954, CASSANDRA-15035 in the DSE resolved issues release notes.

Spark Cassandra Connector changes for DSE 5.1.13

A list of DataStax Enterprise 5.1.13 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.13 includes DataStax Spark Cassandra Connector 2.0.10 and all production-certified changes from earlier versions.

TinkerPop changes for DSE 5.1.13

Enhancements to Apache TinkerPop 3.2.9.

DataStax Enterprise (DSE) 5.1.13 includes all changes from previous releases. For TinkerPop changes, see TinkerPop Upgrade Information.

DSE 5.1.12

Release notes for DataStax Enterprise 5.1.12.

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:

  • Package installations and Installer-Services installations: /etc/dse/cassandra/cassandra.yaml
  • Tarball installations and Installer-No Services installations: installation_location/resources/cassandra/conf/cassandra.yaml

Important: DataStax recommends the latest patch release for most environments.

26 December 2018

Table 4. DSE functionality

5.1.12 Components

All components from DSE 5.1.12 are listed. Components that are updated for DSE 5.1.12 are indicated with an asterisk (*).

  • Apache Solr™ 6.0.1.0.2370 *
  • Apache Spark™ 2.0.2.22 *
  • Apache TinkerPop™ 3.2.9-20181026-f24c1d4b *
  • Apache Tomcat® 8.0.53 *
  • DataStax Spark Cassandra Connector 2.0.10
  • DSE Java Driver 1.2.7 *
  • Netty 4.0.54.Final
  • Spark Jobserver 0.6.2.238 requires compatible API
  • Select Hadoop libraries

DSE 5.1.12 is compatible with Apache Cassandra™ 3.11 and includes production-certified changes to Cassandra.

5.1.12 Highlights

Executive summary highlights for DSE 5.1.12: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.12 DataStax Enterprise core highlights

  • Skip fetching streamed range in repair during consistent-replace. (DB-2596)
  • Fixed an issue where user-defined aggregates (UDAs) that instantiate user-defined types (UDTs) break after restart. (DB-2771)
  • Fixed NullPointerException that can occur during compaction if users use TWCS and allow_unsafe_aggressive_sstable_expiration. (DB-2472)
  • Fixed resource leak related to streaming operations that affects tiered storage users. Excessive number of TieredRowWriter threads causing java.lang.OutOfMemoryError. (DB-2463)
  • General stability improvements:
    • Invalidate chunk cache on SSTable rename. (DB-2594)
    • Fixes to several thread-safety bugs. (DB-2602, DB-2609)
    • Fix for static and regular collision when using same ColumnIdentifier and ComposedTypes. (DB-1630)
  • Upgrade improvements:
    • Fixed handling of deletions for dropped collections in static rows in mixed-version clusters. (DB-2341)
  • Operational improvements:
    • Support for QUORUM/LOCAL_QUORUM consistent replace_address. (DB-1577, DB-2596)
    • Expose information about stored hints by using JMX/nodetool listendpointspendinghints. (DB-1674)
    • New sstablepartitions tool to identify large partitions. (DB-803)
    • Fixed incorrect order of application in nodetool garbagecollect that left tombstones that should be deleted. (DB-2658)
    • Custom HeapDumpPath is not overwritten. (DB-714)
    • By default, rebuild only locally replicated keyspaces. (DB-2301)

5.1.12 DSE Analytics and DSEFS highlights

  • Jetty 9.4.1 upgrade addresses security vulnerabilities in Spark dependencies packaged with DSE. (DSP-16893)
  • DSE 5.0.x DSEFS client is now able to list files when connected to DSE 5.1.x and later DSEFS server. (DSP-17600)

5.1.12 DSE Graph highlights

  • Fix unresponsive nodes following Gremlin timeouts. (DSP-16544)
  • Graph/Search escaping fixes. (DSP-17216, DSP-17277, DSP-17816)

5.1.12 DSE Search highlights

  • Security fixes. (DSP-17029, DSP-17303)
  • Critical memory leak and corruption fixes for encrypted indexes. (DSP-17111)
  • Change to the default merge scheduler configuration. See config option MaxMergeCount. (DSP-17597)
  • CQL timestamp field can be part of a Solr unique key. (DSP-17761)
  • Minor query memory usage improvements. (DSP-17147)

5.1.12 DataStax Enterprise

Changes and enhancements:

  • New DSE start-up parameter -Ddse.consistent_replace improves LOCAL_QUORUM and QUORUM consistency on new node after node replacement. (DB-1577)
  • New nodetool listendpointspendinghints command prints hint information about the endpoints this node has hints for. (DB-1674)
  • New sstablepartitions tool to identify large partitions. (DB-803)
  • New JMX operations for graph MBeans. (DSP-15928)
    • adjacency-cache.size - adjacency cache size attribute
    • adjacency-cache.clear - operation to clean adjacency cache
    • index-cache.size - vertex cache size attribute
    • index-cache.clear - operation to clean vertex cache
    JMX operations are not cluster-aware. Invoke on each node as appropriate to your environment.
  • Improved encryption key error reporting. (DSP-17723)
  • New -Dcassandra.range_tombstone_bound_check_chance start-up parameter checks for bad range tombstones on a percentage of queries. (DSP-17969)

Resolved issues:

  • Custom HeapDumpPath is overwritten. (DB-714)
  • Deleting a static column and adding it back as a non-static column introduces corruption. (DB-1630)
  • Rebuild should not fail when a keyspace is not replicated to other datacenters. (DB-2301)
  • Corrupted static collection deletions for dropped collections in mixed-version clusters. (DB-2341)
  • repair may skip some ranges due to received range cache. (DB-2432)
  • Excessive number of TieredRowWriter threads causing java.lang.OutOfMemoryError. (DB-2463)
  • NullPointerException during compaction on table with TimeWindowCompactionStrategy (TWCS). (DB-2472)
  • Prevent potential SSTable corruption with nodetool refresh. (DB-2594)
  • The nodetool gcstats command output incorrectly reports the GC reclaimed metric in bytes, instead of the expected MB. (DB-2598)
  • TypeParser is not thread safe. (DB-2602)
  • STCS, DTCS, TWCS, TMD aren't thread-safe. (DB-2609)
  • Incorrect order of application of nodetool garbagecollect leaves tombstones that should be deleted. (DB-2658)
  • User-defined aggregates (UDAs) that instantiate user-defined types (UDTs) break after restart. (DB-2771)
  • Fix sstableloader error when internode encryption, client_encryption, and config encryption are enabled. (DSP-17536)
  • EverywhereStrategy picks non-token-owning nodes as endpoints. (DSP-16951)

5.1.12 DSE Analytics

Changes and enhancements:

  • Improved error handling: only submission-related error exceptions from Spark submitted applications are wrapped in a Dse Spark Submit Bootstrapper Failed to Submit error. (DSP-16359)
  • Jetty 9.4.1 upgrade addresses security vulnerabilities in Spark dependencies packaged with DSE. (DSP-16893)
  • dse spark-submit kill and status commands support an optional explicit master address. (DSP-16910, DSP-16991)

Resolved issues:

  • Redirect to cluster mode for Spark applications whose public DNS is set. (DSP-15705)
  • Race condition allows Spark Executor working directories to be removed before stopping those executors. (DSP-15769)
  • Restore DseGraphFrame support in BYOS and spark-dependencies artifacts. Include graph frames python library in graphframe.jar. (DSP-16383)
  • Search optimizations for search analytics Spark SQL queries are applied to a datacenter that no longer has search enabled. Queries launched from a search-enabled datacenter cause search optimizations even when the target datacenter does not have search enabled. (DSP-16465)
  • DSE 5.0.x DSEFS client is not able to list files when connected to 5.1.x (and up) DSEFS server. (DSP-17600)
  • dse spark-sql-metastore-migrate does not work with DSE Unified Authentication and internal authentication. (DSP-17632)

5.1.12 DSEFS

Changes and enhancements:

  • Improved error message when no available chunks are found. (DSP-16623)

Resolved issues:

  • DSEFS throws exceptions and cannot initialize when listen_address is left blank. (DSP-16296)
  • Timeout issues in DSEFS startup. (DSP-16875)
    Initialization would fail with error messages similar to:
    com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
  • DSEFS exit code is not set in some cases. (DSP-17266)
  • DSEFS does not support listen_on_broadcast_address as configured in cassandra.yaml. (DSP-17363)
  • Moving a directory under itself causes data loss and orphan data structures. (DSP-17347)

5.1.12 DSE Graph

Resolved issues:

  • Graph OLAP KryoException for geometry types. (DSP-16955)
  • A Gremlin query with search predicate containing \u2028 or \u2029 characters fails. (DSP-17227)
  • Geo.inside predicate with Polygon no longer works on secondary index if JTS is not installed. (DSP-17284)
  • Search indexes on key fields work only with non-tokenized queries. (DSP-17386)

5.1.12 DSE Search

Changes and enhancements:

  • If a client executes a query that results in a shard attempting to send an internode frame larger than the size specified in frame_length_in_mb, the client receives an error message like this:
    Attempted to write a frame of <n> bytes with a maximum frame size of <n> bytes

    In earlier versions, the query timed out with no message. Information was provided only as an error in the logs.

  • Avoid unnecessary exception and error creation in the Solr query parser. (DSP-17147)
  • In earlier releases, CQL search queries failed with UTFDataFormatException on very large SELECT clauses and when tables have a very large number of columns. (DSP-17220)

    With this fix, CQL search queries fail with UTFDataFormatException only when SELECT clauses constitute a string larger than 64k UTF-8 encoded bytes.

  • Requesting a core reindex with dsetool reload_core or REBUILD SEARCH INDEX no longer builds up a queue of reindexing tasks on a node. Instead, a single starting reindexing task handles all reindex requests that are already submitted to that node. (DSP-17045, DSP-13030)
  • Security improvements:
    • Upgrade Apache Tomcat to prevent Denial Of Service (DoS), CVE-2018-1336. (DSP-17303)
    • Upgrade Apache Commons Compress to prevent Denial Of Service (DoS) vulnerability present in Commons Compress 1.16.1, CVE-2018-11771. (DSP-17019)
  • The calculated value for maxMergeCount is changed to improve indexing performance. (DSP-17597)
    max(max(<maxThreadCount * 2>, <num_tokens * 8>), <maxThreadCount + 5>)
    where num_tokens is the number of token ranges to assign to the virtual node (vnode) as configured in cassandra.yaml. See config option MaxMergeCount.
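
The maxMergeCount calculation above can be checked by hand with a short sketch. The function name and the sample inputs are hypothetical, chosen only for illustration; the expression itself is the one given for DSP-17597.

```python
def calculated_max_merge_count(max_thread_count: int, num_tokens: int) -> int:
    """Evaluate max(max(maxThreadCount * 2, num_tokens * 8), maxThreadCount + 5)."""
    return max(max(max_thread_count * 2, num_tokens * 8), max_thread_count + 5)


if __name__ == "__main__":
    # Illustrative inputs only: real values come from the search index
    # configuration (maxThreadCount) and cassandra.yaml (num_tokens).
    for threads, tokens in [(2, 1), (4, 1), (4, 8), (16, 1)]:
        result = calculated_max_merge_count(threads, tokens)
        print(f"maxThreadCount={threads} num_tokens={tokens} -> {result}")
```

With many vnodes the num_tokens * 8 term usually dominates; with a single token per node, one of the maxThreadCount terms decides the result.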

Resolved issues:

  • Memory leak and corruption for encrypted indexes. (DSP-17111)
  • Solr parsing error on Gremlin statement that contains OR, AND, or NOT and uses a search index. (DSP-17216)
  • CQL search queries failed with UTFDataFormatException on very large SELECT clauses and when tables have a very large number of columns. (DSP-17220)
  • CQL timestamp field can be part of a Solr unique key. (DSP-17761)
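
For DSP-17220, a client can estimate whether a SELECT clause approaches the 64k limit before it sends the query. The following is a minimal sketch, not DSE code. It assumes that "64k" means 65535 bytes (the limit of Java's DataOutputStream.writeUTF) and it uses the standard UTF-8 byte count as an approximation of the encoded size. Both function names are hypothetical.

```python
# Assumption: the limit is 65535 encoded bytes. The release note only says "64k".
ASSUMED_LIMIT_BYTES = 65535


def select_clause_utf8_size(select_clause: str) -> int:
    """Return the size of the clause in standard UTF-8 encoded bytes."""
    return len(select_clause.encode("utf-8"))


def exceeds_assumed_limit(select_clause: str, limit: int = ASSUMED_LIMIT_BYTES) -> bool:
    """True when the encoded clause is larger than the assumed limit."""
    return select_clause_utf8_size(select_clause) > limit


if __name__ == "__main__":
    # A clause built from many long column names, for illustration only.
    columns = ", ".join(f"column_with_a_long_name_{i}" for i in range(3000))
    print(select_clause_utf8_size(columns), exceeds_assumed_limit(columns))
```

Non-ASCII column names count for more than one byte each, so the character count alone can underestimate the size.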

5.1.12 Known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
  • Upgrades from DSE 5.0.x to DSE 5.1.x on RHEL-based systems incorrectly install DSE 6.x when demos are installed. (DSP-15937)
    Workaround: For upgrades on RHEL-based systems that have demos installed, you must specify the package installation in a single line, and specify the version for dse-full and dse-demos. For example:
    sudo yum install dse-full-5.1.17-1 dse-demos-5.1.17-1
    If the wrong DSE version was installed:
    1. Uninstall the incorrect DSE version:
      sudo yum remove "dse-*" "datastax-*"
    2. Install the DSE 5.1.x version again:
      sudo yum install dse-full-5.1.17-1 dse-demos-5.1.17-1
  • Spark shutdown stops executors but does not wait for everything else to close, causing CoarseGrainedScheduler errors on app termination: org.apache.spark.SparkException: Could not find CoarseGrainedScheduler or it has been stopped. (DSP-16751)

Cassandra enhancements for DSE 5.1.12

A list of DataStax Enterprise 5.1.12 enhancements to Apache Cassandra™ 3.11.

DataStax Enterprise (DSE) 5.1.12 includes all changes from previous releases. These production-certified changes are enhancements to Apache Cassandra™ 3.11. (For Cassandra updates, see CHANGES.txt.)

  • Legacy sstables with multi block range tombstones create invalid bound sequences (CASSANDRA-14823)
  • Expand range tombstone validation checks to multiple interim request stages (CASSANDRA-14824)
  • Reverse order reads can return incomplete results (CASSANDRA-14803)
  • Avoid calling iter.next() in a loop when notifying indexers about range tombstones (CASSANDRA-14794)
  • Fix purging semi-expired RT boundaries in reversed iterators (CASSANDRA-14672)
  • DESC order reads can fail to return the last Unfiltered in the partition (CASSANDRA-14766)
  • Fix corrupted collection deletions for dropped columns in 3.0 <-> 2.{1,2} messages (CASSANDRA-14568)
  • Handle failures in parallelAllSSTableOperation (cleanup/upgradesstables/etc) (CASSANDRA-14657)
  • Improve TokenMetaData cache populating performance avoid long locking (CASSANDRA-14660)
  • Fix static column order for SELECT * wildcard queries (CASSANDRA-14638)
  • sstableloader should use discovered broadcast address to connect intra-cluster (CASSANDRA-14522)
  • Fix reading columns with non-UTF names from schema (CASSANDRA-14468)
  • Fix incorrect cqlsh results when selecting same columns multiple times (CASSANDRA-13262)
  • Returns null instead of NaN or Infinity in JSON strings (CASSANDRA-14377)

General upgrade advice for DSE 5.1.12

General upgrade advice for DataStax Enterprise 5.1.12.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

DSE 5.1.12

Operations

  • A new property, cassandra.range_tombstone_bound_check_chance, checks for bad range tombstones on a percentage of queries. The default is 0.01 (valid range: 0.0 to 1.0).
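
As a sketch of how the property might be passed at start-up, the helper below validates a value against the documented range and formats it as a -D option. The helper name is hypothetical and not part of DSE. How the resulting option is supplied to the JVM depends on your installation.

```python
PROPERTY = "cassandra.range_tombstone_bound_check_chance"
DEFAULT_CHANCE = 0.01  # default stated in the release notes


def range_tombstone_check_flag(chance: float = DEFAULT_CHANCE) -> str:
    """Format the start-up option after checking the documented 0.0 to 1.0 range."""
    if not 0.0 <= chance <= 1.0:
        raise ValueError(f"{PROPERTY} must be between 0.0 and 1.0, got {chance}")
    return f"-D{PROPERTY}={chance}"


if __name__ == "__main__":
    print(range_tombstone_check_flag())     # default value
    print(range_tombstone_check_flag(0.5))  # check roughly half of the queries
```

A higher value checks more queries, so raise it only for as long as you are investigating suspected bad range tombstones.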

Spark Cassandra Connector changes for DSE 5.1.12

A list of DataStax Enterprise 5.1.12 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.12 includes DataStax Spark Cassandra Connector 2.0.10 and all production-certified changes from earlier versions.

TinkerPop changes for DSE 5.1.12

Enhancements to Apache TinkerPop 3.2.9.

DataStax Enterprise (DSE) 5.1.12 includes all changes from previous releases. These production-certified changes are enhancements to Apache TinkerPop™ 3.2.9. For TinkerPop changes, see TinkerPop Upgrade Information.

Resolved issues:

  • Gremlin materializes snapshots lazily. (DSP-17576, TINKERPOP-2081)

DSE 5.1.11

Release notes for DataStax Enterprise 5.1.11.

dse.yaml

The location of the dse.yaml file depends on the type of installation:

  • Package installations and Installer-Services installations: /etc/dse/dse.yaml
  • Tarball installations and Installer-No Services installations: installation_location/resources/dse/conf/dse.yaml

Important: DataStax recommends the latest patch release for most environments.

14 September 2018

Table 5. DSE functionality

5.1.11 Components

All components from DSE 5.1.11 are listed. Components that are updated for DSE 5.1.11 are indicated with an asterisk (*).

  • Apache Solr™ 6.0.1.0.2304 *
  • Apache Spark™ 2.0.2.21 *
  • Apache TinkerPop™ 3.2.9-20180507-f6ead8b2
  • Apache Tomcat® 8.0.47
  • DataStax Spark Cassandra Connector 2.0.10 *
  • DSE Java Driver 1.2.6
  • Netty 4.0.54.Final *
  • Spark Jobserver 0.6.2.238 requires compatible API
  • Select Hadoop libraries
DSE 5.1.11 is compatible with Apache Cassandra™ 3.11.0 and adds production-certified changes and enhancements.

5.1.11 Highlights

Executive summary highlights for DSE 5.1.11: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.11 DSE Analytics and DSEFS highlights

  • Improved security with Spark user isolation. (DSP-16093)
  • Client and internode connection improvements. Configurable connections and pools. (DSP-14284, DSP-16065)
  • Improved security: DSEFS uses an isolated native memory pool for file data and metadata sent between nodes. This isolation makes it harder to exploit potential memory management bugs. (DSP-16492)
  • Fix for duration type in a keyspace that prevented DSEFS from starting. (DSP-16825)
  • Fix for failures in Spark when wrong type of exceptions occur on file not found. (DSP-16933)

5.1.11 DSE Graph highlights

  • Fix unresponsive nodes following Gremlin timeouts. (DSP-16544)

5.1.11 DSE Search highlights

  • Fixes NoSuchMethodError or NoClassDefFoundError exceptions when attempting to use a Snowball-generated stemmer. (DSP-16116)
  • DSE will not start without appropriate Tomcat JAR scanning exclusions. (DSP-16841)

5.1.11 DSE

Changes and enhancements:

  • Connections on non-serialization errors are not dropped. (DB-2233)
  • Create a log message when DDL statements are executed. (DB-2383)
  • Improved error handling and logging for TDE encryption key management. (DSP-15314)
  • sstableloader supports custom config file locations. (DSP-16092)
  • DataStax does more extensive testing on OpenJDK 8 due to the end of public updates for Oracle JRE/JDK 8. (DSP-16179)

Resolved issues:

  • Set MX4J_ADDRESS to 127.0.0.1 if not explicitly set. (DB-1950)
  • Digest mismatch for same data between nodes with flushed memtables and nodes with non-flushed memtables. (DB-1980)
  • Fix handling of start bound in legacy paged queries. (DB-1984)
  • Move TWCS message "No compaction necessary for bucket size" to Trace level or NoSpam. (DB-2022)
  • Limit max cached direct buffer on NIO to 1 MB. (DB-2028)
  • Compaction strategy instantiation errors don't generate meaningful error messages, instead return only InvocationTargetException. (DB-2404)
  • Non-portable syntax (MX4J bash-isms) in cassandra-env.sh broke service scripts. (DB-2123)
  • nodetool describecluster incorrectly shows DseDelegateSnitch instead of the snitch configured in cassandra.yaml. (DSP-16158)
  • nodetool upgradesstables fails with 20-year TTL. After upgrade to 5.1.11, take required action. (DB-2109)
  • Add missing equality sign to SASI schema snapshot. (DB-2129)
  • For tables using DSE Tiered Storage, nodetool cleanup places cleaned SSTables in the wrong tier. (DB-2173)
  • sstableloader options assume the RPC/native (client) interface is the same as the internode (node-to-node) interface. (DB-2184)
  • Audit events for CREATE ROLE and ALTER ROLE with incorrect spacing expose PASSWORD in plain text. (DB-2285)
  • Client warnings are not always propagated via LocalSessionWrapper. (DB-2304)
  • Timestamps inserted with ISO 8601 format are saved with wrong millisecond value. (DB-2312)
  • Compaction fails with IllegalArgumentException: null. (DB-2329)
  • BulkLoader class exits without printing the stack trace for throwable error. (DB-2377)
  • sstableloader does not decrypt passwords using config encryption in DSE. (DSP-13492)
  • Support creating system keys before the output directory is configured in dse.yaml. (DSP-15380)
  • Using geo types does not work when memtable allocation type is set to offheap_objects. (DSP-16302)
  • Improved compatibility with external tables stored in the DSE Metastore in remote systems. (DSP-16561)
  • Heap-size calculation is incorrect for RpcCallStatement + SearchIndexStatement. (DSP-16731)
  • Non-internal users are unable to use permissions granted on CREATE. (DSP-16824)
  • The -graph option for the cassandra-stress tool failed on generating the target output html in the JAR file. (DSP-17046)

5.1.11 DSE Analytics

Changes and enhancements:

  • DSE client applications, like Spark, hard stop if user home is not defined, does not exist, or the current user does not have write permissions. (DSP-15476)

Resolved issues

  • A Spark application can be registered twice in rare instances. (DSP-15247)
  • Java driver in Spark Connector uses daemon threads to prevent shutdown hooks from being blocked by driver thread pools. (DSP-16051)
  • dse client-tool spark sql-schema --all exports definitions for solr_admin keyspace. (DSP-16073)
  • Improved security prevents run_as runner for Spark from running a malicious program. (DSP-16093)
  • DSEFS silently fails when TCP port 5599 is not open between nodes. (DSP-16101)
  • cassandra nonsuperuser gets dsefs AccessDeniedException due to Insufficient permissions. (DSP-16713)
  • Unable to get available memory before Spark Workers are registered. (DSP-16790)

5.1.11 DSEFS

Changes and enhancements:

  • DSEFS operations: chown, chgrp, and chmod support recursive (-R) and verbose (-v) flags. (DSP-14238)
  • Client and internode connection improvements. (DSP-14284, DSP-16065)
  • Improved error message when performing an operation on a corrupted path. (DSP-16340)
  • Security improvements:
    • Only super users are able to remove corrupted non-empty directories when authentication is enabled for DSEFS. (DSP-16340)
    • DSEFS uses an isolated native memory pool for file data and metadata sent between nodes. This isolation makes it harder to exploit potential memory management bugs. (DSP-16492)

Resolved issues

  • DSEFS fails to start when there is a table with duration type or another type that DSEFS can't understand. (DSP-16825)
  • Under high loads, DSEFS reports temporary incorrect state for various files/directories. (DSP-17178)
  • IllegalStateException during plugin shutdown causes Failed to abort request body error. (DSP-17003)

5.1.11 DSE Graph

Changes and enhancements:

  • Improved Gremlin console authentication configuration. (DSP-9905)
  • Maximum evaluation timeout is 1094 days. (DSP-16709)
  • Default write consistency level (CL) for Graph is LOCAL_QUORUM. (DSP-17140)
    Attention: In earlier DSE versions, the default QUORUM write consistency level (CL) was not appropriate for multi-datacenter production environments.
  • Added convenience methods for reading graph configuration: getEffectiveAllowScan and getEffectiveSchemaMode. (DSP-16650)
  • The hardcoded default schema_mode is changed from Development to Production. (DSP-16650)

Resolved issues

  • Search indexes are broken for multi cardinality properties. (DSP-14802)
  • Changing search index schema using a gremlin script might fail with Search index may not be modified while it is being reindexed. Please wait until reindexing has finished. message. (DSP-15831)
  • Align query behavior using geo.inside() predicate for polygon search with and without search indexes. (DSP-16108)
  • Classpath conflict between Lucene and SASI versions of Snowball. (DSP-16116)
  • Avoid looping indefinitely when a thread making internode requests is interrupted while trying to acquire a connection. (DSP-16544)
  • Setting graph.traversal_sources.g.evaluation_timeout breaks graph. (DSP-16709)
  • Deleting a search index that was defined inside a graph fails. (DSP-16765)
  • DSEFS Hadoop layer doesn't properly translate DSEFS exceptions to Hadoop exceptions in some methods. (DSP-16933)

5.1.11 DSE Search

Changes and enhancements:

  • Log fewer messages at INFO level in TTLIndexRebuildTask. (DSP-15600)
  • Search index permissions can be applied at keyspace level. (DSP-15385)
  • CQL solr_query supports Solr facet heatmaps. (DSP-16404)
  • Drop operations (ALTER SEARCH INDEX SCHEMA DROP) on the schema now require including at least one attribute on the element being dropped and support dropping only one element at a time. (DSP-15947)

    The required attributes by element are:

    • field - name
    • fieldType - name
    • dynamicField - name
    • copyField - source, dest
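
The required attributes listed above can be expressed as a small lookup that a client-side script could use to check a drop request before it builds the ALTER SEARCH INDEX SCHEMA statement. This is a sketch only. The function and constant names are hypothetical, and the check happens entirely on the client without calling DSE.

```python
# Required attributes per element, as listed for DSP-15947.
REQUIRED_DROP_ATTRIBUTES = {
    "field": ("name",),
    "fieldType": ("name",),
    "dynamicField": ("name",),
    "copyField": ("source", "dest"),
}


def missing_drop_attributes(element: str, attributes: dict) -> list:
    """Return the required attributes that are absent or empty for this element."""
    if element not in REQUIRED_DROP_ATTRIBUTES:
        raise KeyError(f"no required attributes are listed for element: {element}")
    required = REQUIRED_DROP_ATTRIBUTES[element]
    return [name for name in required if not attributes.get(name)]


if __name__ == "__main__":
    print(missing_drop_attributes("field", {"name": "title"}))      # nothing missing
    print(missing_drop_attributes("copyField", {"source": "title"}))  # dest is missing
```

Because a drop operation supports only one element at a time, a script would run this check once per element and issue one statement for each.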

Resolved issues:

  • Avoid accumulating redundant router state updates during schema disagreement. (DSP-15615)
  • Servlet container shutdown (Tomcat) prematurely stops logback context. (DSP-15807)
  • DSE should not start without appropriate Tomcat JAR scanning exclusions. (DSP-16841)
  • Node health score of 1 is not obtainable. Search node gets stuck at 0.00 node health score after replacing a node in a cluster. (DSP-17107)

5.1.11 Known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.

Cassandra enhancements for DSE 5.1.11

A list of DataStax Enterprise 5.1.11 enhancements to Apache Cassandra™ 3.11.0.

DataStax Enterprise (DSE) 5.1.11 includes all changes from previous releases. These production-certified changes are enhancements to Apache Cassandra™ 3.11.0. (For Cassandra updates, see CHANGES.txt.)

  • Fix static column order for SELECT * wildcard queries (CASSANDRA-14638)
  • sstableloader should use discovered broadcast address to connect intra-cluster (CASSANDRA-14522)
  • Fix reading columns with non-UTF names from schema (CASSANDRA-14468)
  • Validate supported column type with SASI analyzer (CASSANDRA-13669)
  • Remove BTree.Builder Recycler to reduce memory usage (CASSANDRA-13929)
  • Reduce nodetool GC thread count (CASSANDRA-14475)
  • Fix New SASI view creation during Index Redistribution (CASSANDRA-14055)
  • Remove string formatting lines from BufferPool hot path (CASSANDRA-14416)
  • Update metrics to 3.1.5 (CASSANDRA-12924)
  • Detect OpenJDK jvm type and architecture (CASSANDRA-12793)
  • Don't use guava collections in the non-system keyspace jmx attributes (CASSANDRA-12271)
  • Fix corrupted static collection deletions in 3.0 -> 2.{1,2} messages (CASSANDRA-14568)
  • Fix potential IndexOutOfBoundsException with counters (CASSANDRA-14167)
  • Always close RT markers returned by ReadCommand#executeLocally() (CASSANDRA-14515)
  • Reverse order queries with range tombstones can cause data loss (CASSANDRA-14513)
  • Fix regression of lagging commitlog flush log message (CASSANDRA-14451)
  • Add Missing dependencies in pom-all (CASSANDRA-14422)
  • Cleanup StartupClusterConnectivityChecker and PING Verb (CASSANDRA-14447)
  • Fix deprecated repair error notifications from 3.x clusters to legacy JMX clients (CASSANDRA-13121)
  • Cassandra not starting when using enhanced startup scripts in windows (CASSANDRA-14418)
  • Fix progress stats and units in compactionstats (CASSANDRA-12244)
  • Better handle missing partition columns in system_schema.columns (CASSANDRA-14379)
  • Delay hints store excise by write timeout to avoid race with decommission (CASSANDRA-13740)
  • Incorrect counting of pending messages in OutboundTcpConnection (CASSANDRA-11551)
  • Fix compaction failure caused by reading un-flushed data (CASSANDRA-12743)

General upgrade advice for DSE 5.1.11

General upgrade advice for DataStax Enterprise 5.1.11.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for DSE 5.1.11

A list of DataStax Enterprise 5.1.11 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.11 includes DataStax Spark Cassandra Connector 2.1.10 with all changes from earlier versions, and adds these production-certified changes.

2.0.9
  • All updates to 1.6.1
2.0.8
  • Allow non-cluster prefixed options in sqlConf (SPARKC-531)
  • Change Str Literal Match to Be Greedy (SPARKC-532)
  • Restore support for various timezone formats to TimestampParser (SPARKC-533)
  • UDT converters optimization (SPARKC-536)

TinkerPop changes for DSE 5.1.11

Enhancements to Apache TinkerPop 3.2.9.

DataStax Enterprise (DSE) 5.1.11 includes all changes from previous releases. There are no production-certified enhancements to Apache TinkerPop™ 3.2.9. For TinkerPop changes, see TinkerPop Upgrade Information.

DSE 5.1.10

Release notes for DataStax Enterprise 5.1.10.

dse.yaml

The location of the dse.yaml file depends on the type of installation:

  • Package installations and Installer-Services installations: /etc/dse/dse.yaml
  • Tarball installations and Installer-No Services installations: installation_location/resources/dse/conf/dse.yaml

  • Select Hadoop libraries

    Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0, and were removed in DSE 5.1. Hadoop removal from DSE 5.1 and later means that DSE does not allow for the startup of Hadoop services previously included in DSE, including MapReduce JobTracker and TaskTracker.

    However, DSE has supported built-in Spark since DSE 4.5 and Bring-Your-Own-Spark (BYOS) since DSE 5.0, and that support continues today. Because Spark depends on certain Hadoop libraries on the server and the client, DSE continues to ship with Hadoop libraries that are required for running Spark and BYOS.

    To view the included Hadoop libraries, see DataStax Enterprise 5.1.x third-party software.

Important: DataStax recommends the latest patch release for most environments.

5 June 2018

Table 6. DSE functionality

5.1.10 Components

All components from DSE 5.1.10 are listed. Components that are updated for DSE 5.1.10 are indicated with an asterisk (*).

  • Apache Cassandra™ 3.11.0.2323 *
  • Apache Solr™ 6.0.1.0.2284 *
  • Apache Spark™ 2.0.2.19 *
  • Apache TinkerPop™ 3.2.9-20180507-f6ead8b2 *
  • Apache Tomcat® 8.0.47
  • DataStax Spark Cassandra Connector 2.0.7
  • DSE Java Driver 1.2.6 *
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.238 requires compatible API *
  • Select Hadoop libraries

5.1.10 Highlights

Executive summary highlights for DSE 5.1.10: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.10 DSE Analytics and DSEFS highlights

  • Resolved an issue with reading corrupted data from DSEFS caused by incorrect handling of file offsets when the requested offset does not align exactly with a file block boundary. This critical issue was triggered by some Spark usages. (DSP-15907)
  • Rare problems with multiple Spark Masters are resolved. Improved Spark Master and Spark Worker stability. (DSP-15636, DSP-15906, DSP-14405, DSP-15801)
  • Resolved the missing /tmp directory in DSEFS after fresh cluster installation. (DSP-16058)
  • Handling of Parquet files with partitions is improved. (DSP-16067)
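To make the DSEFS offset issue (DSP-15907) concrete, the sketch below shows what an aligned and an unaligned offset look like in a block-based file layout. It is a generic illustration, not DSEFS internals: the locate function and the 64 MiB block size are assumptions chosen for the example.

```python
def locate(offset, block_size):
    """Map a byte offset in a file to (block index, offset inside that block)."""
    if offset < 0 or block_size <= 0:
        raise ValueError("offset must be >= 0 and block_size must be > 0")
    return divmod(offset, block_size)


BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size, for illustration only

# Aligned read: starts exactly at the beginning of block 1.
print(locate(BLOCK_SIZE, BLOCK_SIZE))

# Unaligned read: starts 10 bytes into block 1, so the reader must skip
# those 10 bytes inside the block instead of starting at its first byte.
print(locate(BLOCK_SIZE + 10, BLOCK_SIZE))
```

A reader that ignores the second element of the tuple returns data from the wrong position, which is the general class of error the fix addresses.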

5.1.10 DSE Graph highlights

  • Improved performance with DSE Graph fluent API. (DSP-15686)
  • Support for non-text IDs when using graph frames for bulk loading data. (DSP-15614)

5.1.10 DSE Search highlights

  • Search index TTL Expiration thread loops without effect with live indexing (RT indexing). (DSP-16038)
  • Solr 6.0.1 security upgrades. (DSP-15978)

5.1.10 DataStax Enterprise

Changes and enhancements:

Resolved issues:

  • CVE-2017-7525: FasterXML Jackson-databind is prone to a remote-code execution vulnerability. (DSP-14784)
  • Fix legacy complex range tombstone serialization+deserialization for static and regular columns. (DSP-15878)
  • Fix error in MVs referencing a function with uppercase letters in its name. (DSP-15878)
  • Ignore empty Counter cells on digest calculation. (DSP-16096)
  • Upgrade netty to 4.0.54. Ignore log spam for unclean client shutdown. (DSP-16096)
  • Avoid log spam for unclean client shutdown. (DSP-16096)
  • Reusing table ID with CREATE TABLE causes failure on restart. (DSP-16096)
  • Add getConcurrentCompactors to JMX to avoid loading DatabaseDescriptor to check its value in nodetool. (DSP-16096)
  • Fix binding JMX to any address. (DSP-16192)

5.1.10 DSE Advanced Replication

Resolved issues:

  • dse client-tool help doesn't work if ~/.dserc file exists. (DSP-15869)

5.1.10 DSE Analytics

Changes and enhancements:

  • Decreased the number of exceptions logged during master move from node to node. (DSP-14405)
  • Spark Master REST API is disabled. If enabled in spark-defaults.conf, the following error is logged: ERROR Spark Master REST API is not available in DSE. (DSP-15491)
  • DSEFS fetching a file from an offset returns empty content. (DSP-15907)
  • In Portfolio demo, pricer is no longer required to be run with sudo. (DSP-15970)

Resolved issues:

  • Running Spark processes as separate users does not work. (DSP-15723)
  • Multiple Spark masters can be started on the same machine. (DSP-15636)
  • DSE client tool returns wrong Spark Master address. (DSP-15801)
  • Unnecessary Spark Worker restarts. (DSP-15906)
  • Portfolio demo does not work on package installs. (DSP-15970)
  • During misconfigured cluster bootstrap, the AlwaysOn SqlServer does not start due to missing /tmp/hive directory in DSEFS. (DSP-16058)
  • CassandraHiveMetastore is prevented from adding multiple partitions for File based datasources. Fixes MSCK REPAIR TABLE command. (DSP-16067)

5.1.10 DSE Graph

Changes and enhancements:

  • DseGraphFrame performance improvement reduces number of joins for count() and other id-only queries. (DSP-15554)
  • Performance improvements for traversal execution with Fluent API and script-based executions. (DSP-15686)

Resolved issues:

  • GraphSON parsing error prevents proper type detection under certain conditions. (DSP-14066)
  • When using graph frames, cannot upload edges when ids for vertices are complex non-text ids. (DSP-15614)
  • DseGraphFrame fails with StackOverflowError if property is meta-property. (DSP-15939)

5.1.10 DSE Search

Changes and enhancements:

  • Solr 6.0.1 security upgrades. (DSP-15978)
  • Output Solr foreign filter cache warning only on classes other than DSE classes. (DSP-15625)

Resolved issues:

  • A shard request timeout caused an assertion error from Lucene getNumericDocValues in the log. (DSP-14216)
  • Offline sstable tools fail if a DSE Search index is present on a table. (DSP-15628)
  • HTTP read on solr_stress doesn't inject random data into placeholders. (DSP-15727)
  • ERROR 500 on distributed http json.facet with non-zero offset. (DSP-15946)
  • Search index TTL Expiration thread loops without effect with live indexing (RT indexing). (DSP-16038)

5.1.10 Known issues

DataStax Enterprise:
  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
DSE Analytics:
  • The Spark Jobserver demo has an incorrect version for the Spark Jobserver API. (DSP-15832)

    Workaround: In the demo's gradle.properties file, change the version from 0.6.2 to 0.6.2.238.

  • If manually deleted, the DSEFS keyspace (dsefs) is not automatically recreated by a node restart. DSE will not start if the DSEFS keyspace was dropped in a datacenter that was removed and then added back to a cluster as a new datacenter. (DSP-16785)
    Workaround:
    • To reuse a DSEFS keyspace that was manually deleted, you must manually create the DSEFS keyspace for the datacenter being added back to the cluster before starting DSE.
    • If you manually deleted the DSEFS keyspace named dsefs, you can define a new DSEFS keyspace with a different name. For example, if you deleted dsefs in the old datacenter, create a new DSEFS keyspace named dsefs2. Be sure to specify the case-sensitive DSEFS keyspace name in the dse.yaml file. See Using DSEFS.
    • Do not delete the DSEFS keyspace that points to the previously removed datacenter.
    • DataStax recommends not manually deleting the DSEFS keyspace or system keyspaces.

Cassandra enhancements for DSE 5.1.10

A list of DataStax Enterprise 5.1.10 enhancements to Apache Cassandra™ 3.11.0.

DataStax Enterprise (DSE) 5.1.10 includes all changes from earlier DSE releases. These production-certified changes are enhancements to Apache Cassandra™ 3.11.0. (For Cassandra updates, see CHANGES.txt.)

  • Allow existing nodes to use all peers in shadow round (CASSANDRA-13851)
  • Fix cqlsh to read connection.ssl cqlshrc option again (CASSANDRA-14299)
  • Downgrade log level to trace for CommitLogSegmentManager (CASSANDRA-14370)
  • CQL fromJson(null) throws NullPointerException (CASSANDRA-13891)
  • Serialize empty buffer as empty string for json output format (CASSANDRA-14245)
  • Deprecate background repair and probabilistic read_repair_chance table options (CASSANDRA-13910)
  • Add missed CQL keywords to documentation (CASSANDRA-14359)
  • Avoid deadlock when running nodetool refresh before node is fully up (CASSANDRA-14310)
  • Handle all exceptions when opening sstables (CASSANDRA-14202)
  • Handle incompletely written hint descriptors during startup (CASSANDRA-14080)
  • Handle repeat open bound from SRP in read repair (CASSANDRA-14330)
  • Fix JSON queries with IN restrictions and ORDER BY clause (CASSANDRA-14286)
  • Check checksum before decompressing data (CASSANDRA-14284)

General upgrade advice for DSE 5.1.10

General upgrade advice for DataStax Enterprise 5.1.10.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for DSE 5.1.10

A list of DataStax Enterprise 5.1.10 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.10 includes DataStax Spark Cassandra Connector 2.0.7 and all production-certified changes from earlier versions.

TinkerPop changes for DSE 5.1.10

Enhancements to Apache TinkerPop 3.2.9.

DataStax Enterprise (DSE) 5.1.10 includes all changes from previous releases. These production-certified changes are enhancements to Apache TinkerPop™ 3.2.9. For TinkerPop changes, see TinkerPop Upgrade Information.
  • Performance enhancement to Bytecode deserialization. (TINKERPOP-1936)
  • Path history isn't preserved for keys in mutations. (TINKERPOP-1947)
  • Traversal construction performance enhancements (TINKERPOP-1950)
  • Bump to Groovy 2.4.15 - resolves a Groovy bug preventing Lambda creation in GLVs in some cases. (TINKERPOP-1953)

DSE 5.1.9

Release notes for DataStax Enterprise 5.1.9.

Important: DataStax recommends the latest patch release for most environments.

Avoid upgrading to DSE 5.1.9 or DSE 5.1.8 if you use TTL (time-to-live) with DSE Search live indexing (RT indexing). (DSP-16038)

24 April 2018

5.1.9 Components

All components from DSE 5.1.9 are listed. Components that are updated for DSE 5.1.9 are indicated with an asterisk (*).

  • Apache Cassandra™ 3.11.0.2261
  • Apache Solr™ 6.0.1.0.2224
  • Apache Spark™ 2.0.2.17
  • Apache TinkerPop™ 3.2.8-20180327-292ccbfd
  • Apache Tomcat® 8.0.47
  • DataStax Spark Cassandra Connector 2.0.7
  • DSE Java Driver 1.2.6 *
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.237 requires compatible API
  • Select Hadoop libraries

DSE 5.1.9 includes Apache Cassandra 3.11 and includes all additional production-certified enhancements from earlier DSE versions.

5.1.9 Resolved issue

Fix LDAP library issue. (DSP-15927)

5.1.9 Known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
  • DSE Search: Search index TTL Expiration thread loops without effect with live indexing (RT indexing). (DSP-16038)
  • DSE Graph: LIMIT clause does not work in a graph traversal with search predicate TOKEN. (DSP-16292)

General upgrade advice for DSE 5.1.9

General upgrade advice for DataStax Enterprise 5.1.9.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

DSE 5.1.8

Release notes for DataStax Enterprise 5.1.8.

Important: DataStax recommends the latest patch release for most environments.

Avoid upgrading to DSE 5.1.9 or DSE 5.1.8 if you use TTL (time-to-live) with DSE Search live indexing (RT indexing). (DSP-16038)

5 April 2018

Table 7. DSE functionality

5.1.8 Components

All components from DSE 5.1.8 are listed. Components that are updated for DSE 5.1.8 are indicated with an asterisk (*).

  • Apache Cassandra™ 3.11.0.2261 *
  • Apache Solr™ 6.0.1.0.2224 *
  • Apache Spark™ 2.0.2.17 *
  • Apache TinkerPop™ 3.2.8-20180327-292ccbfd *
  • Apache Tomcat® 8.0.47 *
  • DataStax Spark Cassandra Connector 2.0.7
  • DSE Java Driver 1.2.6 *
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.237 requires compatible API *
  • Select Hadoop libraries

5.1.8 Highlights

Executive summary highlights for DSE 5.1.8: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.8 DSE Advanced Replication highlights

  • Fixed misleading warning messages about a non-replicating cluster in a multi-datacenter source cluster. (DSP-15808)

5.1.8 DSE Analytics and DSEFS highlights

  • Fixed a permissions issue affecting Spark History Server results visibility through the web UI. (DSP-15693)
  • Fixed a permission issue affecting non-superusers and DSEFS. (DSP-15276)

5.1.8 DSE Search highlights

  • Fixed reindexing and query performance regression for delete heavy workload. (DSP-15653, DSP-15667)

5.1.8 DataStax Enterprise

Changes and enhancements:

  • Automatic fallback of GossipingPropertyFileSnitch to PropertyFileSnitch (cassandra-topology.properties) is disabled by default and can be enabled by using the -Dcassandra.gpfs.enable_pfs_compatibility_mode=true startup flag. (DB-1663)
  • Improved security: Decimals with a scale > 100 are no longer converted to a plain string to prevent DecimalSerializer.toString() being used as an attack vector. (DB-1848)
  • DSE demos use Jetty Runner 9.4.8. (DSP-14772)
  • ANY, SUBMISSION, and WORKPOOL are unreserved keywords and can be used as keyspace, table, and column identifiers. (DSP-15353)
  • Improve replace fail messages when a replace is retried before QUARANTINE_DELAY. (DSP-15824)
  • Harden txn log files against exceptions when adding records and improve log messages. (DSP-15824)
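The decimal scale change (DB-1848) is easier to see with a concrete value. The sketch below uses Python's decimal module as a stand-in for Java's BigDecimal, which is an assumption made purely for illustration. It shows how a literal of only a few characters expands into a very long string once it is written in plain, non-scientific notation.

```python
from decimal import Decimal

# A nine-character literal whose scale (digits after the decimal point) is 100000.
value = Decimal("1E-100000")

scientific = str(value)      # compact scientific notation
plain = format(value, "f")   # plain notation: "0." followed by every digit

print(len(scientific))
print(len(plain))
```

The plain form is more than 100000 characters long, so a client that can choose the scale can force a server to build arbitrarily large strings. Keeping values with a scale above 100 out of plain-string conversion avoids that cost.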

Resolved issues:

  • The JVM version check in conf/cassandra-env.sh does not work. (DB-1882)
  • Enabling and disabling dbsummary and clustersummary performance objects through dsetool does not work. (DSP-15539)
  • Delay closing connection when nodes are removed to allow inflight commands to complete. (DSP-15824)
  • JVM startup check not working. (DSP-15824)
  • Materialized view schema file for snapshots is created as tables. (DSP-15486)
  • Init timestamp with Long.MIN_VALUE instead of -1. (DSP-15486)
  • AssertionError in ThrottledUnfilteredIterator due to empty UnfilteredRowIterator. (DSP-15486)
  • Make sstableloader use cassandra.config.loader instead of hard-coded YamlConfigurationLoader. (DSP-15486)
  • Backport CASSANDRA-9241, fix nodetool toppartitions. (DSP-15486)
  • Ignore lost+found directory on startup checks. (DSP-15486)
  • Protect against BigDecimals with large scale. (DSP-15486)

5.1.8 DSE Advanced Replication

Changes and enhancements:

  • To ensure tombstones are removed often by compaction, the default value for gc_grace_seconds is reduced from 864000 (10 days) to 600 (10 minutes) for the dse_advrep.transmissions_crc table. (DSP-15749)
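As a quick unit check on gc_grace_seconds values, the sketch below converts durations to seconds with the Python standard library. The helper name to_seconds is hypothetical.

```python
from datetime import timedelta


def to_seconds(**duration):
    """Convert a duration given as timedelta keyword arguments to whole seconds."""
    return int(timedelta(**duration).total_seconds())


print(to_seconds(days=10))     # 864000
print(to_seconds(minutes=10))  # 600
print(to_seconds(days=1))      # 86400
```

Ten days is 864000 seconds and ten minutes is 600 seconds; 86400 seconds is a single day.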

Resolved issues:

  • Replog count never goes down to zero in a multi-node source cluster. (DSP-15060)
  • Plugin error during shutdown: Error while fetching mutations. (DSP-15342)
  • Add support again for empty quoted name ("") as selectable to select SuperColumns. (DSP-15486)
  • Read connection ssl option from cqlshrc. (DSP-15486)
  • SASI AND/OR semantics are incorrect for StandardAnalyzer. (DSP-15486)
  • NPE Error whilst purging stale mutation files. (DSP-15502)
  • Channel creation fails with NPE when using mixed case destination name. (DSP-15538)
  • Unable to recover metadata from block file error due to NoSuchFileException. (DSP-15627)
  • Errors during shutdown. (DSP-15637)
  • advrep replog count command does not work with mixed case keyspace or table names. (DSP-15641)
  • AdvRep CommitLogConsumer logging NoSuchFileException. (DSP-15753)
  • Incorrect status that CDC was active when only a single advrep channel was defined in the datacenter. (DSP-15808)

5.1.8 DSE Analytics

Changes and enhancements:

  • Improve logging on unsupported operation failure and remove the failed mutation from replog. (DSP-15043)
  • Spark Master REST API is disabled. If enabled in spark-defaults.conf, the following error is logged: ERROR Spark Master REST API is not available in DSE. (DSP-15491)

Resolved issues:

  • JSch is susceptible to a path traversal vulnerability. (DSP-13961)
  • Worker UI does not display the actual class name of driver application running in cluster mode. (DSP-15028)
  • DSEFS transactions not always replayed at startup. (DSP-15462)
  • Running Spark processes as separate users does not work. (DSP-15723)

5.1.8 DSEFS

Changes and enhancements:

  • Improved security with default file permissions -770 for event log files. Change permissions with spark.eventLog.permissions. (DSP-15693)
  • DSEFS programmatic access demos are available. (DSP-13799)

Resolved issues:

  • InvalidTypeException is thrown while running DSEFS commands on node upgraded from 5.0.x to 5.1.x. (DSP-15266)
  • Timeout when trying to umount a dsefs location. (DSP-15453)
  • Exception is thrown by DseFsPlugin during shutdown and is not reported. (DSP-15474)
  • DSE might not shut down properly when DSEFS encounters a problem, and exceptions are not logged. (DSP-15482)
  • DSEFS programmatic access demo project is available. (DSP-13799)
  • SPARK/DSEFS non-super users are unable to run SQL queries in secured DSEFS. Spark SQL applications utilize a scratch directory in DSEFS. This scratch directory is automatically created in DSE 5.1.7 and later. (DSP-15276)
  • Insufficient permissions to path / error when putting a file with the dse hadoop -put command on secured DSEFS cluster. (DSP-15480)
  • Small probability of duplicated predefined directories (/tmp/hive) when bootstrapping cluster with multiple datacenters and incorrect NetworkTopologyStrategy (SimpleStrategy). (DSP-15639)

5.1.8 DSE Graph

Changes and enhancements:

  • Improved performance of anonymous traversals and bytecode-based traversals that made use of withStrategy() configurations. (DSP-15673)

Resolved issues:

  • 0 (zero) is not treated as unlimited abort of max num errors. (DGL-307)
  • Synchronization hurts graph OLAP on multi-core executors. Improve scalability of OLAP queries with remote traverses. (DSP-15068)
  • Failures reported from CassandraPersistenceEngine during upgrade, especially in Graph Analytics workloads. (DSP-15130)
  • DseGraphFrame timestamp-based queries do not work for both java.sql.Timestamp and String representations. (DSP-15146)
  • graph solr phrase() predicate shows IndexOutofBound error. (DSP-15408)
    • Single-character tokens used in search index queries, for example with predicate token("a") are erroneously dropped.
    • Search index queries using phrase(...) predicates fail exceptionally when processing values that end in a prefix of the search phrase.
  • DseGraphFrames throws InvalidQueryException when search index is enabled. (DSP-15411)
  • g.V().hasId([]) and g.V().has(id, []) query results are incorrect in DseGraphFrames. (DSP-15501)
  • toJSON() does not always work with geo types. (DSP-15650)
  • ObjectMapper contention for fluent API requests. (DSP-15732)

5.1.8 DSE Search

Changes and enhancements:

  • Reduce the overhead of DeleteByQueryWrapper used by Solr deleteByQuery(). (DSP-15667)
  • Streamline misleading Solr filter cache eviction logging. (DSP-15741)
  • Support for specifying different Solr field types for each CQL map key. (DSP-15622)

Resolved issues:

  • NPE during loading data with RT geonames. (DSP-12361)
  • Solr resource reading failure on init after copying data from another cluster. (DSP-15419)
  • Prohibit Solr timeAllowed use with partial results and allow it with deep paging. (DSP-15475)
  • deleteById and deleteByQuery overflow prepared statement cache. (DSP-15620)
  • ERROR 500 on distributed http json.facet with non-zero offset. (DSP-15633)
  • Reindex with tombstones in the data performs slower than earlier DSE versions. (DSP-15653)

5.1.8 DataStax Enterprise known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
  • DSE Analytics: Additional configuration is required when enabling context-per-jvm in the Spark Jobserver. (DSP-15163)
  • DSE Analytics: Spark Master does not launch successfully after upgrade from DSE 5.1.x to DSE 5.1.8. (DSP-15679)
    To resolve the issue:
    dsetool sparkmaster cleanup
    dsetool sparkworker restart
  • DSE Search: Search index TTL Expiration thread loops without effect with live indexing (RT indexing). (DSP-16038)

Cassandra enhancements for DSE 5.1.8

A list of DataStax Enterprise 5.1.8 enhancements to Apache Cassandra™ 3.11.0.

DataStax Enterprise (DSE) 5.1.8 includes all changes from earlier DSE releases. These production-certified changes are enhancements to Apache Cassandra™ 3.11.0. (For Cassandra updates, see CHANGES.txt.)

  • SASI tokenizer for simple delimiter based entries (CASSANDRA-14247)
  • Fix Loss of digits when doing CAST from varint/bigint to decimal (CASSANDRA-14170)
  • RateBasedBackPressure unnecessarily invokes a lock on the Guava RateLimiter (CASSANDRA-14163)
  • Fix wildcard GROUP BY queries (CASSANDRA-14209)
  • Use zero as default score in DynamicEndpointSnitch (CASSANDRA-14252)
  • Respect max hint window when hinting for LWT (CASSANDRA-14215)
  • Adding missing WriteType enum values to v3, v4, and v5 spec (CASSANDRA-13697)
  • Don't regenerate bloomfilter and summaries on startup (CASSANDRA-11163)
  • Fix NPE when performing comparison against a null frozen in LWT (CASSANDRA-14087)
  • Log when SSTables are deleted (CASSANDRA-14302)
  • Fix batch commitlog sync regression (CASSANDRA-14292)
  • Write to pending endpoint when view replica is also base replica (CASSANDRA-14251)
  • Chain commit log marker potential performance regression in batch commit mode (CASSANDRA-14194)
  • Fully utilise specified compaction threads (CASSANDRA-14210)
  • Pre-create deletion log records to finish compactions quicker (CASSANDRA-12763)
  • Backport circleci yaml (CASSANDRA-14240)
  • CVE-2017-5929 Security vulnerability in Logback warning in NEWS.txt (CASSANDRA-14183)
  • Fix ReadCommandTest (CASSANDRA-14234)
  • Remove trailing period from latency reports at keyspace level (CASSANDRA-14233)
  • Correctly count range tombstones in traces and tombstone thresholds (CASSANDRA-8527)
  • Add MinGW uname check to start scripts (CASSANDRA-12840)
  • Use the correct digest file and reload sstable metadata in nodetool verify (CASSANDRA-14217)
  • Handle failure when mutating repaired status in Verifier (CASSANDRA-13933)
  • Protect against overflow of local expiration time (CASSANDRA-14092)
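The last item in this list (CASSANDRA-14092) concerns cells whose expiration time no longer fits in a signed 32-bit count of seconds since the Unix epoch. The sketch below is a minimal illustration of that arithmetic, assuming the 20-year maximum TTL that Cassandra accepts. The function name expiration_overflows is hypothetical and this is not Cassandra's implementation.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1                # largest signed 32-bit value, reached on 2038-01-19 UTC
MAX_TTL_SECONDS = 20 * 365 * 86400   # 20-year maximum TTL, in seconds


def expiration_overflows(now, ttl_seconds):
    """Return True if now + TTL does not fit in a signed 32-bit epoch value."""
    local_expiration = int(now.timestamp()) + ttl_seconds
    return local_expiration > INT32_MAX


# With the maximum TTL, writes made early in January 2018 still fit,
# while writes made a month later do not.
print(expiration_overflows(datetime(2018, 1, 1, tzinfo=timezone.utc), MAX_TTL_SECONDS))
print(expiration_overflows(datetime(2018, 2, 1, tzinfo=timezone.utc), MAX_TTL_SECONDS))
```

Shorter TTLs push the problem further out, but any write whose expiration lands after 2038-01-19 is affected in the same way.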

General upgrade advice for DSE 5.1.8

General upgrade advice for DataStax Enterprise 5.1.8.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for DSE 5.1.8

A list of DataStax Enterprise 5.1.8 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.8 includes DataStax Spark Cassandra Connector 2.0.7 and all production-certified changes from earlier versions.

TinkerPop changes for DSE 5.1.8

Enhancements to Apache TinkerPop 3.2.8.

DataStax Enterprise (DSE) 5.1.8 includes all changes from previous releases. These production-certified changes are enhancements to Apache TinkerPop™ 3.2.8. For TinkerPop changes, see TinkerPop Upgrade Information.
  • Fixed a bug in NumberHelper that led to wrong min/max results if numbers exceeded the Integer limits. (TINKERPOP-1873)
  • Improved error messaging for failed serialization and deserialization of request/response messages.
  • Fixed bug in handling of Direction.BOTH in Messenger implementations to pass the message to the opposite side of the `StarGraph` in VertexPrograms for OLAP traversals. (TINKERPOP-1862)
  • Fixed a bug in Gremlin Console which prevented handling of gremlin.sh flags that had an equal sign (=) between the flag and its arguments. (TINKERPOP-1879)
  • Fixed a bug where SparkMessenger was not applying the `edgeFunction` from `MessageScope` in VertexPrograms for OLAP-based traversals. (TINKERPOP-1872)
  • TinkerPop drivers prior to 3.2.4 won't authenticate with Kerberos anymore. A long-deprecated option on the Gremlin Server protocol was removed.

DSE 5.1.7

Release notes for DataStax Enterprise 5.1.7.

  • Select Hadoop libraries

    Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0, and were removed in DSE 5.1. Hadoop removal from DSE 5.1 and later means that DSE does not allow for the startup of Hadoop services previously included in DSE, including MapReduce JobTracker and TaskTracker.

    However, DSE has supported built-in Spark since DSE 4.5 and Bring-Your-Own-Spark (BYOS) since DSE 5.0, and that support continues today. Because Spark depends on certain Hadoop libraries on the server and the client, DSE continues to ship with Hadoop libraries that are required for running Spark and BYOS.

    To view the included Hadoop libraries, see DataStax Enterprise 5.1.x third-party software.

Important: DataStax recommends the latest patch release for most environments.

15 February 2018

Table 8. DSE functionality

5.1.7 Components

All components from DSE 5.1.7 are listed. Components that are updated for DSE 5.1.7 are indicated with an asterisk (*).

  • Apache Cassandra™ 3.11.0.2130 *
  • Apache Solr™ 6.0.1.0.2139 *
  • Apache Spark™ 2.0.2.16 *
  • Apache TinkerPop™ 3.2.8-20180125-cd910875 *
  • Apache Tomcat® 8.0.47 *
  • DataStax Spark Cassandra Connector 2.0.7 *
  • DSE Java Driver 1.2.2
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.234 (requires compatible API)
  • Select Hadoop libraries

5.1.7 Highlights

Executive summary highlights for DSE 5.1.7: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.7 DataStax Enterprise highlights

  • Fix for the possible data loss scenario caused by the TTL expiration timestamps susceptible to the year 2038 problem. (DSP-15412)

    When using a long TTL, DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action.

5.1.7 DSE Search highlights

  • Better defaults when using JTS for polygon queries. (DSP-15182)
  • More responsive shutdown and index unloading while index rebuild is in progress. (DSP-12452)

5.1.7 DataStax Enterprise

Changes and enhancements:

  • Custom index and iTrigger implementations are not supported. Use only implementations bundled with DSE.
  • Default number of threads used by performance objects is increased from 1 to 4; configure threads with new dse.yaml performance_core_threads parameter. (DSP-14643)
  • New nodetool getseeds and reloadseeds commands. (DSP-15412)

Resolved issues:

  • dbsummary does not work with default performance_core_threads. (DSP-14643)
  • CVE-2017-15095 jackson-databind is vulnerable to remote code execution (RCE) attacks. (DSP-15096)
  • Fix for possible data loss scenario caused by the TTL expiration timestamps susceptible to the year 2038 problem. (DSP-15412)

    When using a long TTL, DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action.

  • Remove invalid path from compaction-stress script, populate data based on initial size. (DSP-15412)
  • Fix infinite loop when replaying a truncated commit log file and truncation is tolerated. (DSP-15412)
  • Kerberos protocol and QoP parameters are not correctly propagated. (DSP-15455)
  • Fetch/query no columns in priming connections to avoid errors if system.local columns are changed. (DSP-15484)
  • Upgrade from DSE 5.0.11 to DSE 5.1.6 fails with deserialization exception on column "workloads". (DSP-15484)
  • Fix connections per host in nodetool getstreamthroughput. (DSP-15412)
  • Avoid hibernate on startup for bootstrap node to avoid WTE due to not being marked alive. (DSP-15412)
  • Prevent received SSTables with tombstones during repair from being compacted. (DSP-15412)
  • Non-disruptive seed node list reload. (DSP-15412)
  • Make `ReservedKeywords` mutable. (DSP-15412)
  • Fix tpc connection being reset due to dc compression and flush socket before reset. (DSP-15412)
  • Skip legacy range tombstones if only their clustering is corrupted. (DSP-15412)
  • Fix AssertionError in ReadResponse$Serializer.serializedSize. (DSP-15412)
  • Allow ALTER of system_distributed keyspace tables. (DSP-15412)
  • Improve live-node-replacement. (DSP-15412)
  • Ensure that skipping commit log replay does not fail on descriptor errors. (DSP-15435)

5.1.7 DSE Analytics

Resolved issues:

  • Fix for possible scenario where newly-added nodes can have a schema mismatch for system keyspaces. (DSP-11787)
  • Message is not consistently displayed when SparkContext is created with different configuration. (DSP-14758)
  • Spark SQL applications with DSE authentication enabled will throw errors if the DSEFS scratch directory doesn't exist. (DSP-15276)

5.1.7 DSEFS

Resolved issues:

  • DSEFS does not use ssl_native_port for internal connections between DSEFS node and Cassandra when client encryption is enabled. (DSP-15029)
  • SPARK/DSEFS non-super users are unable to run sql queries in secured DSEFS. (DSP-15276)
  • Rare NullPointerException during DSEFS startup. (DSP-15289)
  • Occasional NoHostAvailable exceptions when shutting down DSE with DSEFS enabled. (DSP-15404)
  • Setting permissions/owner on a file in DSEFS through Hadoop's interfaces does not take effect. (DSP-15255)

5.1.7 DSE Graph

Resolved issues:

  • Do not log or send back full Groovy script when the script is too large. (DSP-14410)
  • Retryable failures have severity DEBUG. Only terminal failures have severity ERROR or WARN. (DSP-15045)

5.1.7 DSE Search

Changes and enhancements:

  • Wikipedia demo path error. (DSP-11327)
  • DeleteById is deprecated. (DSP-13436)

Resolved issues:

  • dsetool search commands should return non-zero if operation was not successful. (DSP-9631)
  • Add warnings to DSE Search reload and reindex that reloads impact entire datacenter and reindex is asynchronous. (DSP-9820)
  • CQL solr queries with JSON clause miss singlePass optimizations. (DSP-11407)
  • Inconsistent behavior from dsetool when SSL is enabled. (DSP-15171)
  • Default useJtsMulti to false to avoid performance issues with JTS multipolygon handling. (DSP-15182)
  • Incorrect connection limiter scheduler shutdown order for internode transport clients. (DSP-14256)
  • Avoid potentially indefinite shutdown delay with active reindexing. (DSP-12452)

5.1.7 DataStax Enterprise known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • CVE-2017-15095 jackson-databind is vulnerable to remote code execution (RCE) attacks. Applies only to workloads using --framework spark-2.0 spark-submit. (DSP-15441)
  • Potential data loss for INSERTs with very large TTLs. TTL expiration timestamps are susceptible to the year 2038 problem. (DSP-15412)
    The maximum expiration timestamp that can be represented by the storage engine is 2038-01-19T03:14:06+00:00, which means that INSERTs with a TTL that expires after this date are not currently supported. There is no protection against INSERTs with a TTL expiring after the maximum supported date, so the expiration time field overflows and the records expire immediately. TTLs are considered "very large" when close to the maximum allowed value of 630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL is gradually reduced as the maximum expiration date approaches. For instance, on 2028-01-19T03:14:06, an INSERT with a TTL of 10 years is impacted.
    Warning: With DSE 5.1.7 and later, DSE provides troubleshooting strategies to protect against overflow of local expiration time.
  • Spark Master might not recover after upgrades from DSE 5.1.0 through 5.1.5 to DSE 5.1.6 or 5.1.7. (DSP-15679)
    In some scenarios, the Spark Master might not recover directly after upgrade, and all the Spark applications must be stopped and restarted. Follow these steps to ensure Spark Master launches successfully for upgrades from any DSE 5.1.x to 5.1.8:
    dsetool sparkmaster cleanup
    dsetool sparkworker restart
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
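
The shrinking TTL window described in the year 2038 known issue above can be worked through numerically. The following is a minimal sketch, not DSE code: it uses only the two figures quoted in the note (the 2038-01-19T03:14:06+00:00 maximum expiration timestamp and the 630720000-second maximum TTL), and the function names max_safe_ttl and is_affected are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Maximum expiration timestamp quoted in the known issue above.
MAX_EXPIRATION = datetime(2038, 1, 19, 3, 14, 6, tzinfo=timezone.utc)
# Maximum allowed TTL quoted in the known issue above (20 years of 365 days).
MAX_TTL_SECONDS = 630_720_000


def max_safe_ttl(now: datetime) -> int:
    """Largest TTL in seconds that still expires on or before MAX_EXPIRATION."""
    remaining = int((MAX_EXPIRATION - now).total_seconds())
    return max(0, min(MAX_TTL_SECONDS, remaining))


def is_affected(now: datetime, ttl_seconds: int) -> bool:
    """True when an INSERT at `now` with this TTL would expire after MAX_EXPIRATION."""
    return now + timedelta(seconds=ttl_seconds) > MAX_EXPIRATION


if __name__ == "__main__":
    one_day_before = datetime(2038, 1, 18, 3, 14, 6, tzinfo=timezone.utc)
    print(max_safe_ttl(one_day_before))  # 86400: only one day of TTL is left
    start_of_2035 = datetime(2035, 1, 1, tzinfo=timezone.utc)
    print(is_affected(start_of_2035, MAX_TTL_SECONDS))  # True
```

A check like this could run in application code before issuing an INSERT with a long TTL. It does not replace the required action described in the upgrade documentation.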

Cassandra enhancements for DSE 5.1.7

A list of DataStax Enterprise 5.1.7 enhancements to Apache Cassandra™ 3.11.0.

DataStax Enterprise (DSE) 5.1.7 includes all changes from earlier DSE releases. These production-certified changes are enhancements to Apache Cassandra™ 3.11.0. (For Cassandra updates, see CHANGES.txt.)

  • Add DEFAULT, UNSET, MBEAN and MBEANS to `ReservedKeywords` (CASSANDRA-14205)
  • Add Unittest for schema migration fix (CASSANDRA-14140)
  • Print correct snitch info from nodetool describecluster (CASSANDRA-13528)
  • Close socket on error during connect on OutboundTcpConnection (CASSANDRA-9630)
  • Enable CDC unittest (CASSANDRA-14141)
  • Acquire read lock before accessing CompactionStrategyManager fields (CASSANDRA-14139)
  • Split CommitLogStressTest to avoid timeout (CASSANDRA-14143)
  • Set encoding for javadoc generation (CASSANDRA-14154)
  • RPM package spec: fix permissions for installed jars and config files (CASSANDRA-14181)
  • More PEP8 compliance for cqlsh (CASSANDRA-14021)

General upgrade advice for DSE 5.1.7

General upgrade advice for DataStax Enterprise 5.1.7.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

DSE 5.1.7

Upgrading

Automatic fallback of GossipingPropertyFileSnitch to PropertyFileSnitch (cassandra-topology.properties) is disabled by default and can be enabled via the -Dcassandra.gpfs.enable_pfs_compatibility_mode=true startup flag.

Spark Cassandra Connector changes for DSE 5.1.7

A list of DataStax Enterprise 5.1.7 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.7 includes DataStax Spark Cassandra Connector 2.0.7 with all changes from earlier versions, and adds these production-certified changes.
  • Adds Timestamp, Improve Conversion Perf (SPARKC-522)
  • Allow setting spark.cassandra.concurrent.reads (SPARKC-520)
  • Allow splitCount to be set for Dataframes (SPARKC-527)

TinkerPop changes for DSE 5.1.7

Enhancements to Apache TinkerPop 3.2.8.

DataStax Enterprise (DSE) 5.1.7 includes all changes from previous releases. These production-certified changes are enhancements to Apache TinkerPop™ 3.2.8. For TinkerPop changes, see TinkerPop Upgrade Information.
  • Performance enhancement for OLAP: n^2 synchronous operation in OLAP WorkerExecutor.execute() method. (TINKERPOP-1870)
  • union() can produce extra traversers. (TINKERPOP-1867)

DSE 5.1.6

Release notes for DataStax Enterprise 5.1.6.

  • Select Hadoop libraries

    Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0, and were removed in DSE 5.1. Hadoop removal from DSE 5.1 and later means that DSE does not allow for the startup of Hadoop services previously included in DSE, including MapReduce JobTracker and TaskTracker.

    However, DSE has supported built-in Spark since DSE 4.5 and Bring-Your-Own-Spark (BYOS) since DSE 5.0, and that support continues today. Because Spark depends on certain Hadoop libraries on the server and the client, DSE continues to ship with Hadoop libraries that are required for running Spark and BYOS.

    To view the included Hadoop libraries, see DataStax Enterprise 5.1.x third-party software.

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:

Package installations and Installer-Services installations: /etc/dse/cassandra/cassandra.yaml

Tarball installations and Installer-No Services installations: installation_location/resources/cassandra/conf/cassandra.yaml
Important: DataStax recommends the latest patch release for most environments.
Attention: TTL expiration timestamps are susceptible to the year 2038 problem. If the TTL value is long and an expiration date is greater than the maximum threshold of 2038-01-19T03:14:06+00:00, the data is immediately expired and purged on the next compaction. When using a long TTL, DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action.
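
Why an over-long TTL expires data immediately, rather than late, can be illustrated with a small simulation. This is a sketch under an assumption that the note implies but does not state: that the expiration time is held as a signed 32-bit count of seconds since the Unix epoch, as in the classic year 2038 problem. The helper names are hypothetical.

```python
from datetime import datetime, timezone


def to_int32(value: int) -> int:
    """Wrap an integer the way a signed 32-bit field does on overflow."""
    return (value + 2**31) % 2**32 - 2**31


def stored_expiration(now_epoch_s: int, ttl_s: int) -> int:
    """Expiration time as it would land in the assumed 32-bit seconds field."""
    return to_int32(now_epoch_s + ttl_s)


if __name__ == "__main__":
    now = int(datetime(2030, 1, 1, tzinfo=timezone.utc).timestamp())
    twenty_years = 630_720_000  # 20 years of 365 days, in seconds
    expires = stored_expiration(now, twenty_years)
    # The wrapped value is negative, i.e. earlier than `now`, so the data
    # already looks expired at the moment it is written.
    print(expires < now)  # True
```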

22 January 2018

Table 9. DSE functionality

5.1.6 Components

All components from DSE 5.1.6 are listed. Components that are updated for DSE 5.1.6 are indicated with an asterisk (*).

  • Apache Cassandra™ 3.11.0.2070 *
  • Apache Solr™ 6.0.1.0.2123 *
  • Apache Spark™ 2.0.2.6
  • Apache TinkerPop™ 3.2.7-20171213-77c0c764 *
  • Apache Tomcat® 8.0.44
  • DataStax Spark Cassandra Connector 2.0.6 *
  • DSE Java Driver 1.2.2
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.234 (requires compatible API)
  • Select Hadoop libraries

5.1.6 Highlights

Executive summary highlights for DSE 5.1.6: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.6 DataStax Enterprise highlights

DSE Advanced Replication highlights

  • Improved handling and bug fixes in scenarios where the source cluster has multiple logical data centers. (DSP-14767, DSP-14515, DSP-15121)

5.1.6 DSE Analytics and DSEFS highlights

  • Fixed a DSEFS issue that could prevent upgrades from 5.0.x to 5.1.5. (DSP-15237)
  • Fixed a bug in DSEFS that in rare circumstances could cause a live lock on the server when reading files, manifesting with high CPU usage and timeouts. (DSP-15107)
  • Fixed an infrequent bug where Spark worker directories could be deleted while the job is running. (DSP-15076, SPARK-22976)

5.1.6 DSE Graph highlights

  • Graph loader supports GraphSON V2.
  • Resolved issue of retrieving multiple edges by ID. (DSP-14580)
  • Allow vertex lookup through index on id property keys. (DSP-9028)

5.1.6 DSE Search highlights

  • Performance and corruption issues with encrypted indexes are addressed with a full reindex after upgrade. (DSP-14943, DSP-14485, DSP-15265)
  • All installations from DSE 5.0.x or earlier versions of DSE 5.1.x should upgrade to DSE 5.1.6 to avoid potentially incorrect queries while nodes are at different versions during upgrade. (DSP-14898, DSP-14993)
  • Improved protection against abusing the Solr filter cache with too many entries. (DSP-14534)
  • Performance improvements with RF=(# nodes) DCs. (DSP-12962)

5.1.6 DataStax Enterprise

Changes and enhancements:

  • New seed_gossip_probability property in cassandra.yaml reduces the time for gossip changes to propagate across the cluster. (DB-671)
  • New metric for replayed batchlogs; trace-level logging includes the age of the replayed batchlog. (DB-1314)
  • By default, enable heap histogram logging on OutOfMemoryError. To disable, set the cassandra.printHeapHistogramOnOutOfMemoryError system property to false. (DB-1498)
  • Generate Kerberos debug output. (DSP-12430)
  • JMX SSL is supported for use with dsetool and dse advrep. See Setting up SSL for nodetool, dsetool, and dse advrep. (DSP-14200)
  • New skip-read-validation flag for stress test error handling. (DSP-14775)
  • Ensure that the list and set selectors elements are all of the same type. (DSP-14775)
  • Do not leak body buffer in case of protocol exceptions and upgrade Netty to 4.0.52. (DSP-14775)
  • Added -Dcassandra.native_transport_startup_delay_seconds start-up parameter to delay startup of native transport. (DSP-14839)
  • Add nodetool rebuild mode reset-no-snapshot option. (DSP-14827)
  • Add nodetool abortrebuild command. (DSP-14827)
  • Add metrics on coordination of read commands; see type=ReadCoordination. (DSP-14775)
  • Add cross_dc_rtt_in_ms to cross dc requests, default 0. (DSP-14775)
  • New metrics for batchlog-replays. (DSP-14839)
  • New CQL ALTER TABLE DROP COMPACT STORAGE option to remove Thrift-compatibility from tables. (DSP-14839)
  • Handle continuous paging state for empty partitions with static rows. (DSP-14959)
  • Skip building views during base table streams on range movements. (DSP-14959)
  • Allow DiskBoundaryManager to cache different directories. (DSP-15024)
  • Do not apply read timeouts to aggregated queries and use a minimum internal page size. New cassandra.yaml aggregated_request_timeout_in_ms setting. (DSP-15024)
  • Only MODIFY permission is required on base when updating table with MV. (DSP-15087)
  • Generate LDAP debug output. (DSP-15176)

Resolved issues:

  • Audit logging does not support UNSET values from prepared statements. (DSP-13043)
  • dsetool does not work with JMX SSL. To use, follow steps in Setting up SSL for nodetool, dsetool, and dse advrep. (DSP-14200)
  • DataStax Installer upgrades within 5.1.x prevent Spark shell from working. (DSP-14637)
  • Memory leak causes executor descriptions to accumulate in DSE process. (DSP-14868)
  • Handle continuous paging state for empty partitions with static rows. (DSP-14959)
  • Skip building views during base table streams on range movements. (DSP-14959)
  • Add invalid-sstable-root JVM argument to all relevant test entries in build.xml. (DSP-14827)
  • Do not leak body buffer in case of protocol exceptions and upgrade Netty to 4.0.52 (DSP-14775)
  • Ensure that the list and set selectors elements are all of the same type. (DSP-14775)
  • nodetool arguments with spaces print script errors. (DSP-14959)
  • Change token allocation to use RF=1 method when RF equals rack count. (DSP-14959)
  • Failed bootstrap streaming leaves auth uninitialized. (DSP-14839)
  • Eliminate thread roundtrip for version handshake. (DSP-14827)
  • Make nodetool assassinate more resilient to missing tokens. (DSP-14827)
  • Throttle base partitions during MV repair streaming to prevent OOM. (DSP-14775)
  • Register SizeEstimatesRecorder earlier and enable cleanup of invalid entries. (DSP-15024)
  • Only serialize failed batchlog replay mutations to hints. (DSP-15024)
  • Allow selecting static column only when querying static index. (DSP-15087)
  • Force sstableloader exit to prevent hanging due to non-daemon threads running. (DSP-15087)
  • Add autoclosable to CompressionMetadata and fix leaks in SSTableMetadataViewer. (DSP-15087)
  • Use all columns to calculate estimatedRowSize for aggregation internal query. (DSP-15087)
  • Prevent continuous schema exchange between DSE 5.0 and DSE 5.1 nodes. (DSP-15087)
  • Separate commit log replay and commit throwable inspection and policy handling. (DSP-15087)
  • Fix for local DC when connections are compressed despite internode_compression: dc. (DSP-15087)
  • Expanded hinted handoff instrumentation. (DSP-15087)
  • Improve gossip dissemination time. (DSP-15087)
  • Use more intelligent level picking for non-l0 file. (DSP-15087)
  • LCS levels are not respected for nodetool refresh and replacing a node. (DSP-15087)
  • Keep SSTable level for decommission, remove, and move operations. (DSP-15087)
  • More quickly detect down nodes for batchlogs using the incoming connections. (DSP-15087)
  • Fixes for waitForGossiper. (DSP-15087)
  • Print heap histogram on OOM errors by default. (DSP-15087)
  • Support frozen collection list and set in stress. (DSP-15087)
  • Improved streams logging. (DSP-15087)
  • Make migration-delay configurable. (DSP-15087)
  • Improved schema migration logging. (DSP-15087)
  • Switch RMIExporter to dynamic proxy. (DSP-15277)
  • Do not fetch columns that are not in the filter fetched set. (DSP-15277)
  • DataStax Enterprise will not run with Java 1.8u161 or later. (DSP-15277)

5.1.6 DSE Advanced Replication

Changes and enhancements:

Resolved issues:

  • Datacenter not consistently passed into TokenService causes multi-datacenter replication errors. (DSP-14767)
  • Incompatibility with durable_writes=false, but no warning/error. (DSP-15205)
  • CDC on a table should be disabled only when no channels are enabled for that source table. (DSP-15121)
  • CDC files are left in a DC that's not collecting. (DSP-15105)

5.1.6 DSE Analytics

Changes and enhancements:

  • Default logging level for org.apache.spark.rpc has been changed to ERROR. (DSP-14651)
  • Improved Spark shell startup time. (DSP-14704)
  • Spark executors are not restarted if the driver port is closed or unreachable. (DSP-14824)
  • Notebooks and other third-party tool integration with Spark. (DSP-14489)

Resolved issues:

  • dse client-tool configuration export/import incorrectly uses cfs as the default file system. (DSP-14535)
  • Spark shuffle service fails to update secret on application re-attempts. (DSP-15038)
  • Need a dedicated user to run Graph OLAP Spark Driver. (DSP-14869)
  • Logs from Spark Jobserver job are missing. (DSP-14981)
  • Poor handling of task notifications in Spark Driver, including possible memory leak. (DSP-15044)
  • Cluster-deployed drivers are not cleaned up by the Spark Worker cleanup service. (DSP-15076)

5.1.6 DSEFS

Resolved issues:
  • "ERROR: Request body rejected, ConnectionClosedException" message is not logged in system.log if the client disconnects in the middle of the request. (DSP-14597)
  • Added getScheme, getDefaultPort, and concat method implementations to DseFileSystem Hadoop API. (DSP-14605)
  • Reads incorrectly show Response body rejected errors. (DSP-14615)
  • DSEFS authorization is enabled when DSE authorization is enabled. DSEFS supports DseAuthorizer transitional mode. (DSP-14616)
  • DSEFS does not retry queries. (DSP-14649)
  • Incorrect return of 0 exit code for failed command execution. (DSP-14652)
  • Performing cat operation on a directory is prohibited and causes a Not a regular file <path> message. (DSP-14696)
  • User name/password was not provided warning is in the DSEFS shell log when security is not enabled. (DSP-14708)
  • DSEFS fsck command does not fix File not found: / problem which can occur in rare cases after new cluster nodes are started in parallel. (DSP-15048)
  • A live lock on the server when reading files manifests with high CPU usage and timeouts. (DSP-15107)
  • DSEFS files created through Hadoop API do not properly inherit RF and block size from the parent directory. (DSP-15139)
  • "Promise already completed" error in DSEFS connection pool. (DSP-15122)
  • No check if parent element of a given target path is a directory for mkdir, put, move operations. (DSP-15100)

5.1.6 DSE Graph

Changes and enhancements:

  • Gremlin console plugins.txt is read-only by default. (DSP-13372)
  • Traversal does not timeout with the Fluent API. (DSP-13156)
  • Graph traversals over a vertex-centric index with an ordering and result limit are more efficient. (DSP-15191)
  • CQL Statement latency metrics. (DSP-15124)
  • Improve error messaging on failed bytecode translation. Long forms of -e and -i are working. (DSP-15091)

Resolved issues:

  • Whitelist org.apache.tinkerpop.gremlin.spark.structure.Spark in sandbox so that Apache TinkerPop Spark-Gremlin application can be stopped programmatically. (DSP-14678)
  • Queries with multiple conditions using heterogeneous operators that cover the same property value cause an error. (DSP-14623)
  • Error when retrieving multiple edges by edge IDs when the list of IDs is greater than 3. (DSP-14580)
  • Unlabelled index queries occur even when labels were indexed by the appropriate key. (DSP-14579)
  • graph.io read does not work with custom IDs. Limitations apply, intended for use with small graphs only. (DSP-14568)
  • Setting a TraversalSource option from the DSE Driver isn't effective. (DSP-14713)
  • QueryStrategy illegally moves HasStep condition across edge traversal. (DSP-15081)
  • Date, Time, Duration, Timestamp, and Blob Graph types are represented by incorrect Java types in OLAP. Converters were added to have the same types as in OLTP. (DSP-15104)

5.1.6 DSE Search

Changes and enhancements:

  • Avoid token filtering on single-node CQL solr_query. (DSP-12962)
  • Maximum number of entries in SolrFilterCache is limited to 32K. (DSP-14534)
  • CREATE SEARCH INDEX indexed true|false option for more performant indexes. (DSP-14364)
  • Eliminate delay for scheduled snapshot collection for DSE Search performance objects. (DSP-14561)
  • Added log message for filter cache evictions. (DSP-14944)
  • After compact storage is dropped from a table that also has a search index, HTTP writes and deletes-by-ID on the search index are disabled. (DSP-14966)

Resolved issues:

  • NPE when dropping the Solr core while indexing is in progress. (DSP-13252)
  • dsetool upgrade_index_files does not work with authentication enabled. (DSP-14114)
  • UpdateMetrics::Latency::Mean is "unavailable" when writes are in progress. (DSP-14392)
  • When executing CQL search queries against a keyspace with RF=(number of nodes), the token filter is no longer created, resulting in faster queries. (DSP-14468)
  • EncryptedFSDirectory#outputLengthCache corruption makes encrypted index files unreadable. (DSP-14485)
  • Solr filter cache fails after restart. (DSP-14608)
  • CREATE SEARCH INDEX does not have direct control over tuple and UDT fields. (DSP-14639)
  • Remove code execution vulnerability: CVE-2016-6809. (DSP-14747)
  • Infinite parsing loop possible with Extended DisMax (eDisMax) query parser and local parameters. (DSP-14748)
  • Internal server error 500 on solr/admin/cores?action=STATUS&memory=true. (DSP-14783)
  • ExtendedDismaxQParser (edismax) ignores Boolean OR when q.op=AND and mm is not explicitly set. (DSP-14799)
  • Grouping by TrieDateField and DatePointField fails. (DSP-14808)
  • Token filtering might be missed on mixed versions clusters. (DSP-14898)
  • Support the json.facet parameter in Solr UI. (DSP-14893)
  • Excessive time spent reading unencrypted segment sizes during search index (Solr core) loading. Slow startup on nodes with large encrypted indexes is resolved after upgrade to DSE 5.1.6 is completed with a full reindex for all search indexes using encryption. (DSP-14943, DSP-14485, DSP-15265)
  • Shutdown order in SolrCore causes RejectedExecutionExceptions around CommitTracker. (DSP-15040)
  • Cannot create core using HTTP due to missing create permission. (DSP-15046)

5.1.6 DataStax Enterprise known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
  • Potential data loss for INSERTs with very large TTLs. TTL expiration timestamps are susceptible to the year 2038 problem. (DSP-15412)
    The maximum expiration timestamp that can be represented by the storage engine is 2038-01-19T03:14:06+00:00, which means that INSERTs with a TTL that expires after this date are not currently supported. There is no protection against INSERTs with a TTL expiring after the maximum supported date, so the expiration time field overflows and the records expire immediately. TTLs are considered "very large" when close to the maximum allowed value of 630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL is gradually reduced as the maximum expiration date approaches. For instance, on 2028-01-19T03:14:06, an INSERT with a TTL of 10 years is impacted.
    Warning: Upgrade to DSE 5.1.7 or later and take required action to protect against overflow of local expiration time.
  • Spark SQL applications with DSE authentication enabled will throw errors if the DSEFS scratch directory doesn't exist. (DSP-15276)

    Spark SQL applications utilize a scratch directory located in DSEFS. Make sure the dsefs://tmp/hive directory exists and that it has 733 permissions. If dsefs://tmp/hive does not exist, it must be created by a role with superuser permissions. Create the scratch directory with proper permissions:

    dse fs 'mkdir -p -m 733 /tmp/hive'
  • Spark Master might not recover after upgrades from DSE 5.1.0 through 5.1.5 to DSE 5.1.6 or 5.1.7. (DSP-15679)
    In some scenarios, the Spark Master might not recover directly after upgrade, and all the Spark applications must be stopped and restarted. Follow these steps to ensure Spark Master launches successfully for upgrades from any DSE 5.1.x to 5.1.8:
    dsetool sparkmaster cleanup
    dsetool sparkworker restart
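
The 733 mode used for the scratch directory in the Spark SQL known issue above can be decoded as follows. This sketch assumes that DSEFS modes follow the usual POSIX octal convention, which the release note does not state explicitly. It only inspects the number locally and does not contact DSEFS; the function name describe_mode is hypothetical.

```python
import stat


def describe_mode(octal_mode: str) -> str:
    """Render an octal permission string such as '733' in ls-style form for a directory."""
    mode = stat.S_IFDIR | int(octal_mode, 8)
    return stat.filemode(mode)


if __name__ == "__main__":
    print(describe_mode("733"))  # drwx-wx-wx
```

Under that assumption, the owner has full access, while other roles can create and traverse entries in the directory but cannot list its contents.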

Cassandra enhancements for DSE 5.1.6

A list of DataStax Enterprise 5.1.6 enhancements to Apache Cassandra™ 3.11.0.

DataStax Enterprise (DSE) 5.1.6 includes all changes from earlier DSE releases. These production-certified changes are enhancements to Apache Cassandra™ 3.11.0. (For Cassandra updates, see CHANGES.txt.)

  • Switch RMIExporter to dynamic proxy. (DSP-15277)
  • Do not fetch columns that are not in the filter fetched set. (DSP-15277)
  • Allow selecting static column only when querying static index. (DSP-15087)
  • Require only MODIFY permission on base when updating table with MV. (DSP-15087)
  • Force sstableloader exit to prevent hanging due to non-daemon threads running. (DSP-15087)
  • Add autoclosable to CompressionMetadata and fix leaks in SSTableMetadataViewer. (DSP-15087)
  • Use all columns to calculate estimatedRowSize for aggregation internal query. (DSP-15087)
  • Prevent continuous schema exchange between DSE 5.0 and DSE 5.1 nodes (DSP-15087)
  • Allow DiskBoundaryManager to cache different Directories (DSP-15024)
  • Do not apply read timeouts to aggregated queries and use a minimum internal page size. (DSP-15024)
  • Handle cont paging state for empty partitions with static rows (DSP-14959)
  • Skip building views during base table streams on range movements. (DSP-14959)
  • Add invalid-sstable-root JVM argument to all relevant test entries in build.xml. (DSP-14827)
  • Do not leak body buffer in case of protocol exceptions and upgrade Netty to 4.0.52. (DSP-14775)
  • Ensure that the list and set selectors elements are all of the same type. (DSP-14775)

General upgrade advice for DSE 5.1.6

General upgrade advice for DataStax Enterprise 5.1.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

DSE 5.1.6

Upgrading
  • Upgrades from DSE 5.0 might have produced unnecessary schema migrations while there was at least one DSE 5.0 node in the cluster. It is therefore highly recommended to upgrade from DSE 5.0 to at least DSE 5.1.6. The root cause of this schema mismatch was a difference in how schema digests were computed in DSE 5.0 and DSE 5.1. To mitigate this issue, DSE 5.1.6 and later announce DSE 5.0-compatible digests as long as there is at least one DSE 5.0 node in the cluster. Once all nodes have been upgraded, the "real" schema version is announced. Note: This fix is required only for DSE 5.1. (DB-1477)
  • DSE now relies on JVM options to shut down properly on OutOfMemoryError. By default it relies on the OnOutOfMemoryError option, because the ExitOnOutOfMemoryError and CrashOnOutOfMemoryError options are not supported by the older 1.7 and 1.8 JVMs. A warning is logged at startup if none of those JVM options are used. See CASSANDRA-13006 for more details.
  • Improved gossip settling added. On startup, DSE waits until all nodes are seen before fully joining the cluster. This reduces latency spikes when restarting nodes.
  • LeveledCompactionStrategy SSTables keep their existing level on nodetool refresh, nodetool move, and nodetool decommission.
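
The digest-compatibility rule in the first Upgrading item can be pictured with a small sketch. This is an illustrative model only, not DSE source code: the function name, the version strings, and the two digest values are hypothetical.

```python
def digest_to_announce(node_versions, real_digest, compat_digest):
    """Pick the schema digest a DSE 5.1.6+ node announces (illustrative model)."""
    # Hypothetical check: a node counts as DSE 5.0 if its version starts with "5.0."
    has_dse50_node = any(v.startswith("5.0.") for v in node_versions)
    # While any DSE 5.0 node remains, announce the 5.0-compatible digest.
    return compat_digest if has_dse50_node else real_digest


mixed_cluster = ["5.0.11", "5.1.6", "5.1.6"]
upgraded_cluster = ["5.1.6", "5.1.6", "5.1.6"]

print(digest_to_announce(mixed_cluster, "real", "compat-5.0"))
print(digest_to_announce(upgraded_cluster, "real", "compat-5.0"))
```

In this model the announced digest switches to the "real" one only after the last DSE 5.0 node is upgraded, which is why a mixed cluster stops exchanging schema repeatedly.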

Metrics

New storage metrics were added:
  • TotalHintsReplayed: how many hints were successfully replayed on the target node.
  • HintsOnDisk: how many hints are currently persisted on disk on this node. The metric is updated by the number of hints contained in a hints file when that file is written or removed. The value is restored on node startup.

New features

  • A statistics file component was added to the Hint Store to provide the number of hints contained in a hints file without replaying it. The stats component is completely backward-compatible; hint files without this component are not counted. All new hint files are created with this component. See DB-853 for more details.

Spark Cassandra Connector changes for DSE 5.1.6

A list of DataStax Enterprise 5.1.6 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.6 includes DataStax Spark Cassandra Connector 2.0.6 with all changes from earlier versions, including these production-certified changes.
  • All patches up to 1.6.10

TinkerPop changes for DSE 5.1.6

Enhancements to Apache TinkerPop 3.2.7.

DataStax Enterprise (DSE) 5.1.6 includes all changes from previous releases. These production-certified changes are enhancements to Apache TinkerPop™ 3.2.7. For TinkerPop changes, see TinkerPop Upgrade Information.
  • Improve type-safety in Gremlin.Net methods. (TINKERPOP-1752)
  • Fix for hasId() failing on empty collections. (TINKERPOP-1802)
  • Python supports GraphSON types g:Date, g:Timestamp and g:UUID. (TINKERPOP-1807)
  • Improve error messaging on failed bytecode translation. (TINKERPOP-1811)
  • Graph API removed from usage in the process test suite. (TINKERPOP-1813/TINKERPOP-1814)
  • Consistent behavior of self-referencing edges. (TINKERPOP-1821)
  • Improve flexibility of detachment for EventStrategy. (TINKERPOP-1829)
  • Race condition in TinkerGraph index creation. (TINKERPOP-1830)
  • Bug fix in TraversalHelper.replaceStep. (TINKERPOP-1832)
  • API fix: DetachedEdge.Builder#setInV and setOutV did not return the builder. (TINKERPOP-1833)
  • Long forms of -e and -i are now working. (TINKERPOP-1851)

DSE 5.1.5

Release notes for DataStax Enterprise 5.1.5.

  • Select Hadoop libraries

    Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0, and were removed in DSE 5.1. Hadoop removal from DSE 5.1 and later means that DSE does not allow for the startup of Hadoop services previously included in DSE, including MapReduce JobTracker and TaskTracker.

    However, DSE has supported built-in Spark since DSE 4.5 and Bring-Your-Own-Spark (BYOS) since DSE 5.0, and that support continues today. Because Spark depends on certain Hadoop libraries on the server and the client, DSE continues to ship with Hadoop libraries that are required for running Spark and BYOS.

    To view the included Hadoop libraries, see DataStax Enterprise 5.1.x third-party software.

Important: DataStax recommends the latest patch release for most environments.
Attention: TTL expiration timestamps are susceptible to the year 2038 problem. If the TTL value is long and an expiration date is greater than the maximum threshold of 2038-01-19T03:14:06+00:00, the data is immediately expired and purged on the next compaction. When using a long TTL, DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action.

19 October 2017

5.1.5 Components

All components from DSE 5.1.5 are listed. Components that are updated for DSE 5.1.5 are indicated with an asterisk (*).

  • Apache Cassandra™ 3.11.0.1900
  • Apache Solr™ 6.0.1.0.1984 *
  • Apache Spark™ 2.0.2.6
  • Apache TinkerPop™ 3.2.7-20170926-2e5c13b7
  • Apache Tomcat® 8.0.44
  • DataStax Spark Cassandra Connector 2.0.5
  • DSE Java Driver 1.2.2
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.234 (requires compatible API)
  • Select Hadoop libraries

5.1.5 Highlight

A single change for DSE Search:
  • Due to CVE-2017-12629, added Solr XMLParser protection from XML External Entity (XXE) attacks and removed Solr RunExecutableListener to harden security for DSE Search enabled clusters. (DSP-14618)

5.1.5 DataStax Enterprise known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DataStax Enterprise will not run with Java 1.8u161 or later. (DSP-15277)
  • Spark SQL applications with DSE authentication enabled will throw errors if the DSEFS scratch directory doesn't exist. (DSP-15276)

    Spark SQL applications utilize a scratch directory located in DSEFS. Make sure the dsefs://tmp/hive directory exists and that it has 733 permissions. If dsefs://tmp/hive does not exist, it must be created by a role with superuser permissions. Create the scratch directory with proper permissions:

    dse fs 'mkdir -p -m 733 /tmp/hive'
  • Potential data loss for INSERTs with very large TTLs. TTL expiration timestamps are susceptible to the year 2038 problem. (DSP-15412)
    The maximum expiration timestamp that can be represented by the storage engine is 2038-01-19T03:14:06+00:00, which means that inserts with a TTL that expires after this date are not currently supported. There is no protection against INSERTs with a TTL expiring after the maximum supported date, causing the expiration time field to overflow and the records to expire immediately. TTLs are considered "very large" when close to the maximum allowed value of 630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL is gradually reduced as the maximum expiration date approaches. For instance, on 2028-01-19T03:14:06, an INSERT with a TTL of 10 years is impacted.
    Warning: Upgrade to DSE 5.1.7 or later and take required action to protect against overflow of local expiration time.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
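
The TTL overflow described in the known issues above comes down to one comparison: insert time plus TTL against the maximum expiration timestamp. The following is a minimal sketch of that check, using only the two constants quoted in the text. The function names are hypothetical, and this is not how DSE itself implements the protection added in later releases.

```python
from datetime import datetime, timezone

# Constants quoted in the release notes.
MAX_EXPIRATION = datetime(2038, 1, 19, 3, 14, 6, tzinfo=timezone.utc)
MAX_TTL_SECONDS = 630720000  # maximum allowed TTL (20 years of 365 days)


def expires_after_limit(insert_time, ttl_seconds):
    """Return True if insert_time + TTL lands after the maximum expiration."""
    return insert_time.timestamp() + ttl_seconds > MAX_EXPIRATION.timestamp()


def max_safe_ttl(insert_time):
    """Largest TTL (in seconds) that does not pass the maximum expiration."""
    remaining = int(MAX_EXPIRATION.timestamp() - insert_time.timestamp())
    return max(0, min(MAX_TTL_SECONDS, remaining))


now = datetime.now(timezone.utc)
print("maximum TTL overflows today:", expires_after_limit(now, MAX_TTL_SECONDS))
print("largest safe TTL today (s):", max_safe_ttl(now))
```

A client-side check like this could be run before issuing INSERTs with long TTLs on affected versions; upgrading to DSE 5.1.7 or later remains the recommended action.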

DSE 5.1.4

Release notes for DataStax Enterprise 5.1.4.

Important: DataStax recommends the latest patch release for most environments.
Attention: TTL expiration timestamps are susceptible to the year 2038 problem. If the TTL value is long and an expiration date is greater than the maximum threshold of 2038-01-19T03:14:06+00:00, the data is immediately expired and purged on the next compaction. When using a long TTL, DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action.

12 October 2017

Table 10. DSE functionality

5.1.4 Components

All components from DSE 5.1.4 are listed. Components that are updated for DSE 5.1.4 are indicated with an asterisk (*).

  • Apache Cassandra™ 3.11.0.1900 *
  • Apache Solr™ 6.0.1.0.1949 *
  • Apache Spark™ 2.0.2.6
  • Apache TinkerPop™ 3.2.7-20170926-2e5c13b7 *
  • Apache Tomcat® 8.0.44
  • DataStax Spark Cassandra Connector 2.0.5
  • DSE Java Driver 1.2.2
  • DSEFS 5.1.2 *
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.234 (requires compatible API)
  • Select Hadoop libraries

5.1.4 Highlights

Executive summary highlights for DSE 5.1.4: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.4 DSE Graph highlights

  • Security: Graph Sandbox is enabled and configured by default. (DSP-11679)
  • Vertices with custom IDs return ID components as properties. (DSP-14262)

5.1.4 DSE Search highlights

  • Improved stability and performance when dealing with non-indexed fields. (DSP-6501)

    Full validation on all schema fields might result in validation failures after upgrade. See the 5.1.4 DSE Search changes and enhancements.

  • Fixed the search performance object regression issues. (DSP-14241)
  • Fixed the memory leak issue when encrypting the index. (DSP-13826)

5.1.4 DataStax Enterprise

Changes and enhancements:

  • scrub validates the partition key. Validation is added to schema mutation creation. (DSP-14366)
  • Always define execution_profiles in cqlsh.py. (DSP-14494)
  • Issue warning before running full repair when increasing replication factor. (DSP-14494)
  • Add anti-compaction metrics and warn users when incremental repair is inefficient. (DSP-14494)

Resolved issues:

  • Node does not start, and reports an "unable to activate HistogramInfoPlugin" message, after upgrade to DSE 5.1. (DSP-13301)
  • Apache HttpClient directory traversal through malformed URI. (DSP-13580)
  • Token create, cancel, and renew security needs tightening. (DSP-14311)
  • stress-tool does not output rows. (DSP-14494)

5.1.4 DSE Advanced Replication

Changes and enhancements:

  • Command line interface should use non-zero exit code for unknown commands. (DSP-13590)

Resolved issues:

5.1.4 DSE Analytics

Resolved issues:

  • Session management in Hive metastore is broken. (DSP-12363)
  • When an application is submitted by a user without submit permission, exception message does not identify problem. (DSP-13234)
  • Spark shell not usable after standalone installation with services option. (DSP-14361)
  • Port setting not respected in DseCassandraConnectionFactory. (DSP-14442)
  • Spark Master/Worker Web UI should bind to RPC listen address and advertise RPC broadcast address by default. (DSP-14433)

5.1.4 DSEFS

Changes and enhancements:

  • Improved error message for DSEFS shell commands. (DSP-14157)
  • Improved error messages are passed to the DSEFS clients, including DSEFS shell, if an error occurs while reading a file. (DSP-14371)
  • Improved error message when Spark fails to connect to DSEFS server. (DSP-14388)
  • HTTP communication logging level changed from DEBUG to TRACE to improve filtering. (DSP-14400)
  • Improve DSEFS stability on large workloads: DSEFS is less likely to overload Java Cassandra driver and cause BusyPoolException. Fixed edge-cases that might cause StackOverflowException and DSEFS lockup. (DSP-14408)

Resolved issues:

  • The service dse stop command does not wait for the process to be completely stopped. (DSP-14014)
  • DSEFS does not support symlink for data directories. (DSP-14110)
  • DSEFS fsck always prints number of blocks processed, even if file system is empty. (DSP-14235)

5.1.4 DSE Graph

Changes and enhancements:

  • Enable and configure the graph sandbox by default to improve security. (DSP-11679)
  • GraphFrame 0.5 fixes graph frame algorithms. (DSP-14271)
  • Gremlin console uses the default plugins.txt in the DSE distribution. If a user home is specified with bin/dse gremlin-console ~/gremlin-console then extra checks are performed to ensure that plugins.txt is populated. (DSP-14286)
  • Prevent multi-properties for the partition/clustering key. (DSP-14300)
  • A leading graph.tx().commit(); call is not allowed, as in graph.tx().commit(); graph.tx().config().option("allow_scan", true).open(); g.V().count(). Instead, use graph.tx().config().option("allow_scan", true).open(); g.V().count(). (DSP-14482)

Resolved issues:

  • Vertex index on id property keys doesn't work. (DSP-9208)
  • Unnecessary INSERT and DELETE to dse_security.digest_tokens for every graph statement executed over native protocol. (DSP-13670)
  • Streamline configuration for gremlin-console connection to cluster with Kerberos authentication enabled. (DSP-14164)
  • DataFrames deletes do not leverage range or partition level tombstones. (DSP-14249)
  • Vertices with custom IDs do not return ID components as properties (as in g.V().properties() or g.V().values()) for OLTP, OLAP, and GraphFrames. (DSP-14262)
  • DseResourceManager warning message when shutting down Spark+Graph nodes. (DSP-14276)
  • Graph sandbox should have org.apache.tinkerpop.gremlin.structure.io whitelisted by default. (DSP-14540)

5.1.4 DSE Search

Changes and enhancements:

  • Full validation on all schema fields might result in validation failures after upgrade. (DSP-6501)
    • All field definitions in the schema are validated and must be DSE Search compatible, even if the fields are not indexed, do not have docValues applied, and are not used as a copy-field source.
    • Tune the schema before you upgrade. With the tuned index, performance gains are especially noticeable for unused large blobs.

Resolved issues:

  • Allow dynamic multi-valued fields without a corresponding CQL column. (DSP-13277)
  • Non-indexed frozen map column produces unexpected results without error message. (DSP-13997)
  • Non-indexed field prevents data from being indexed. (DSP-14001)
  • Single-pass CQL Solr queries cannot select some data types. (DSP-14022)
  • Text field does not work for group by operations; unexpected docvalues type SORTED_SET error message for text fields. (DSP-14106)
  • Parsing error on cleanup of Solr secondary index with empty string in partition ID. (DSP-14234)
  • Solr indexing statistics are not collected for solr_index_stats_options. (DSP-14241)
  • CPU layout assertions on startup should show in log file instead of stopping startup. (DSP-14281)
  • Cannot turn tracing off after running queries with tracing on. (DSP-14439)
  • Indexing wiki demo fails when solrslowlog is enabled. (DSP-14521)
  • Search performance objects are not working. (DSP-14241)
  • Memory leak during index encryption. (DSP-13826)

5.1.4 DataStax Enterprise known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DataStax Enterprise will not run with Java 1.8u161 or later. (DSP-15277)
  • Spark SQL applications with DSE authentication enabled will throw errors if the DSEFS scratch directory doesn't exist. (DSP-15276)

    Spark SQL applications utilize a scratch directory located in DSEFS. Make sure the dsefs://tmp/hive directory exists and that it has 733 permissions. If dsefs://tmp/hive does not exist, it must be created by a role with superuser permissions. Create the scratch directory with proper permissions:

    dse fs 'mkdir -p -m 733 /tmp/hive'
  • Potential data loss for INSERTs with very large TTLs. TTL expiration timestamps are susceptible to the year 2038 problem. (DSP-15412)
    The maximum expiration timestamp that can be represented by the storage engine is 2038-01-19T03:14:06+00:00, which means that inserts with a TTL that expires after this date are not currently supported. There is no protection against INSERTs with a TTL expiring after the maximum supported date, causing the expiration time field to overflow and the records to expire immediately. TTLs are considered "very large" when close to the maximum allowed value of 630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL is gradually reduced as the maximum expiration date approaches. For instance, on 2028-01-19T03:14:06, an INSERT with a TTL of 10 years is impacted.
    Warning: Upgrade to DSE 5.1.7 or later and take required action to protect against overflow of local expiration time.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
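
The 733 mode required for the dsefs://tmp/hive scratch directory, mentioned in the known issues above, can be decoded with Python's standard stat module. This sketch only interprets the octal value; it does not connect to DSEFS, and the helper name is hypothetical.

```python
import stat


def describe_dir_mode(octal_mode):
    """Render a directory's permission bits in ls-style notation."""
    return stat.filemode(stat.S_IFDIR | octal_mode)


print(describe_dir_mode(0o733))  # drwx-wx-wx
```

Read this way, the owner has full access, while group and other users can create and traverse entries in the scratch directory but cannot list its contents.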

Cassandra enhancements for DSE 5.1.4

A list of DataStax Enterprise 5.1.4 enhancements to Apache Cassandra™ 3.11.0.

DataStax Enterprise (DSE) 5.1.4 includes all changes from earlier DSE releases. These production-certified changes are enhancements to Apache Cassandra™ 3.11.0. (For Cassandra updates, see CHANGES.txt.)

  • Handle limit correctly on tables with strict liveness (CASSANDRA-13883)
  • AbstractTokenTreeBuilder#serializedSize returns wrong value when there is a single leaf and overflow collisions (CASSANDRA-13869)
  • BTree.Builder memory leak (CASSANDRA-13754)
  • Revert CASSANDRA-10368 of supporting non-pk column filtering due to correctness (CASSANDRA-13798)
  • Fix cassandra-stress hang issues when an error during cluster connection happens (CASSANDRA-12938)
  • Better bootstrap failure message when blocked by (potential) range movement (CASSANDRA-13744)
  • "ignore" option is ignored in sstableloader (CASSANDRA-13721)
  • Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
  • Duplicate the buffer before passing it to analyser in SASI operation (CASSANDRA-13512)
  • Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
  • Fix support for SuperColumn tables (CASSANDRA-12373)
  • Remove non-rpc-ready nodes from counter leader candidates (CASSANDRA-13043)
  • Improve short read protection performance (CASSANDRA-13794)
  • Fix sstable reader to support range-tombstone-marker for multi-slices (CASSANDRA-13787)
  • Fix short read protection for tables with no clustering columns (CASSANDRA-13880)
  • Make isBuilt volatile in PartitionUpdate (CASSANDRA-13619)
  • Prevent integer overflow of timestamps in CellTest and RowsTest (CASSANDRA-13866)
  • Fix counter application order in short read protection (CASSANDRA-12872)
  • Don't block RepairJob execution on validation futures (CASSANDRA-13797)
  • Wait for all management tasks to complete before shutting down CLSM (CASSANDRA-13123)
  • INSERT statement fails when Tuple type is used as clustering column with default DESC order (CASSANDRA-13717)
  • Fix pending view mutations handling and cleanup batchlog when there are local and remote paired mutations (CASSANDRA-13069)
  • Improve config validation and documentation on overflow and NPE (CASSANDRA-13622)
  • Range deletes in a CAS batch are ignored (CASSANDRA-13655)
  • Avoid assertion error when IndexSummary > 2G (CASSANDRA-12014)
  • Change repair midpoint logging for tiny ranges (CASSANDRA-13603)
  • Better handle corrupt final commitlog segment (CASSANDRA-11995)
  • StreamingHistogram is not thread safe (CASSANDRA-13756)
  • Fix MV timestamp issues (CASSANDRA-11500)
  • Better tolerate improperly formatted bcrypt hashes (CASSANDRA-13626)
  • Fix race condition in read command serialization (CASSANDRA-13363)
  • Fix AssertionError in short read protection (CASSANDRA-13747)
  • Don't skip corrupted sstables on startup (CASSANDRA-13620)
  • Fix the merging of cells with different user type versions (CASSANDRA-13776)
  • Copy session properties on cqlsh.py do_login (CASSANDRA-13640)
  • Potential AssertionError during ReadRepair of range tombstone and partition deletions (CASSANDRA-13719)
  • Don't let stress write warmup data if n=0 (CASSANDRA-13773)
  • Gossip thread slows down when using batch commit log (CASSANDRA-12966)
  • Randomize batchlog endpoint selection with only 1 or 2 racks (CASSANDRA-12884)
  • Fix digest calculation for counter cells (CASSANDRA-13750)
  • Fix ColumnDefinition.cellValueType() for non-frozen collection and change SSTabledump to use type.toJSONString() (CASSANDRA-13573)
  • Skip materialized view addition if the base table doesn't exist (CASSANDRA-13737)
  • Drop table should remove corresponding entries in dropped_columns table (CASSANDRA-13730)
  • Log warn message until legacy auth tables have been migrated (CASSANDRA-13371)
  • Fix incorrect [2.1 <- 3.0] serialization of counter cells created in 2.0 (CASSANDRA-13691)
  • Fix invalid writetime for null cells (CASSANDRA-13711)
  • Fix ALTER TABLE statement to atomically propagate changes to the table and its MVs (CASSANDRA-12952)
  • Fix Digest mismatch Exception if hints file has UnknownColumnFamily (CASSANDRA-13696)
  • Fixed ambiguous output of nodetool tablestats command (CASSANDRA-13722)
  • Purge tombstones created by expired cells (CASSANDRA-13643)
  • Make concat work with iterators that have different subsets of columns (CASSANDRA-13482)
  • Set test.runners based on cores and memory size (CASSANDRA-13078)
  • Allow different NUMACTL_ARGS to be passed in (CASSANDRA-13557)
  • Fix secondary index queries on COMPACT tables (CASSANDRA-13627)
  • Nodetool listsnapshots output is missing a newline, if there are no snapshots (CASSANDRA-13568)
  • sstabledump reports incorrect usage for argument order (CASSANDRA-13532)
  • Safely handle empty buffers when outputting to JSON (CASSANDRA-13868)
  • Copy session properties on cqlsh.py do_login (CASSANDRA-13847)
  • Fix load over calculated issue in IndexSummaryRedistribution (CASSANDRA-13738)
  • Fix compaction and flush exception not captured (CASSANDRA-13833)
  • Uncaught exceptions in Netty pipeline (CASSANDRA-13649)
  • Prevent integer overflow on exabyte filesystems (CASSANDRA-13067)
  • Fix queries with LIMIT and filtering on clustering columns (CASSANDRA-11223)
  • Fix potential NPE when resume bootstrap fails (CASSANDRA-13272)
  • Fix toJSONString for the UDT, tuple and collection types (CASSANDRA-13592)
  • Fix nested Tuples/UDTs validation (CASSANDRA-13646)
  • Clone HeartBeatState when building gossip messages. Make its generation/version volatile (CASSANDRA-13700)
  • Allow native function calls in CQLSSTableWriter (CASSANDRA-12606)
  • Replace string comparison with regex/number checks in MessagingService test (CASSANDRA-13216)
  • Fix formatting of duration columns in CQLSH (CASSANDRA-13549)
  • Fix the problem with duplicated rows when using paging with SASI (CASSANDRA-13302)
  • Allow CONTAINS statements filtering on the partition key and its parts (CASSANDRA-13275)
  • Fall back to even ranges calculation in clusters with vnodes when tokens are distributed unevenly (CASSANDRA-13229)
  • Fix duration type validation to prevent overflow (CASSANDRA-13218)
  • Forbid unsupported creation of SASI indexes over partition key columns (CASSANDRA-13228)
  • Reject multiple values for a key in CQL grammar. (CASSANDRA-13369)
  • UDA fails without input rows (CASSANDRA-13399)
  • Fix compaction-stress by using daemonInitialization (CASSANDRA-13188)
  • V5 protocol flags decoding broken (CASSANDRA-13443)
  • Use write lock not read lock for removing sstables from compaction strategies. (CASSANDRA-13422)
  • Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors (CASSANDRA-13329)
  • Avoid rebuilding SASI indexes containing no values (CASSANDRA-12962)
  • Add charset to Analyser input stream (CASSANDRA-13151)
  • Delete illegal character from StandardTokenizerImpl.jflex (CASSANDRA-13417)
  • Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
  • Tracing payload not passed from QueryMessage to tracing session (CASSANDRA-12835)
  • Ensure int overflow doesn't occur when calculating large partition warning size (CASSANDRA-13172)
  • Ensure consistent view of partition columns between coordinator and replica in ColumnFilter (CASSANDRA-13004)
  • Failed unregistering mbean during drop keyspace (CASSANDRA-13346)
  • nodetool scrub/cleanup/upgradesstables exit code is wrong (CASSANDRA-13542)
  • Fix the reported number of sstable data files accessed per read (CASSANDRA-13120)
  • Fix schema digest mismatch during rolling upgrades from versions before 3.0.12 (CASSANDRA-13559)
  • Upgrade JNA version to 4.4.0 (CASSANDRA-13072)
  • Interned ColumnIdentifiers should use minimal ByteBuffers (CASSANDRA-13533)
  • ReverseIndexedReader may drop rows during 2.1 to 3.0 upgrade (CASSANDRA-13525)
  • Fix repair process violating start/end token limits for small ranges (CASSANDRA-13052)
  • Add storage port options to sstableloader (CASSANDRA-13518)
  • Properly handle quoted index names in cqlsh DESCRIBE output (CASSANDRA-12847)
  • Avoid reading static row twice from old format sstables (CASSANDRA-13236)
  • Fix NPE in StorageService.excise() (CASSANDRA-13163)
  • Expire OutboundTcpConnection messages by a single Thread (CASSANDRA-13265)
  • Fail repair if insufficient responses received (CASSANDRA-13397)
  • Fix SSTableLoader fail when the loaded table contains dropped columns (CASSANDRA-13276)
  • Avoid name clashes in CassandraIndexTest (CASSANDRA-13427)
  • Handling partially written hint files (CASSANDRA-12728)
  • Interrupt replaying hints on decommission (CASSANDRA-13308)
  • Fix NPE issue in StorageService (CASSANDRA-13060)
  • Make reading of range tombstones more reliable (CASSANDRA-12811)
  • Fix startup problems due to schema tables not completely flushed (CASSANDRA-12213)
  • Fix view builder bug that can filter out data on restart (CASSANDRA-13405)
  • Fix 2i page size calculation when there are no regular columns (CASSANDRA-13400)
  • Fix the conversion of 2.X expired rows without regular column data (CASSANDRA-13395)
  • Fix hint delivery when using ext+internal IPs with prefer_local enabled (CASSANDRA-13020)
  • Nodetool upgradesstables/scrub/compact ignores system tables (CASSANDRA-13410)
  • Fix schema version calculation for rolling upgrades (CASSANDRA-13441)
  • Nodes started with join_ring=False should be able to serve requests when authentication is enabled (CASSANDRA-11381)
  • cqlsh COPY FROM: increment error count only for failures, not for attempts (CASSANDRA-13209)
  • Avoid starting gossiper in RemoveTest (CASSANDRA-13407)
  • Fix weightedSize() for row-cache reported by JMX and NodeTool (CASSANDRA-13393)
  • Fix JVM metric names (CASSANDRA-13103)
  • Coalescing strategy sleeps too much (CASSANDRA-13090)
  • Fix 2ndary index queries on partition keys for tables with static columns (CASSANDRA-13147)
  • Fix ParseError unhashable type list in cqlsh copy from (CASSANDRA-13364)

General upgrade advice for DSE 5.1.4

General upgrade advice for DataStax Enterprise 5.1.4.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for DSE 5.1.4

A list of DataStax Enterprise 5.1.4 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.4 includes DataStax Spark Cassandra Connector 2.0.5 with all changes from earlier versions.

DSE 5.1.3

Release notes for DataStax Enterprise 5.1.3.

Important: DataStax recommends the latest patch release for most environments.
Attention: TTL expiration timestamps are susceptible to the year 2038 problem. If the TTL value is long and an expiration date is greater than the maximum threshold of 2038-01-19T03:14:06+00:00, the data is immediately expired and purged on the next compaction. When using a long TTL, DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action.

6 September 2017

Table 11. DSE functionality

5.1.3 Components

All components from DSE 5.1.3 are listed. Components that are updated for DSE 5.1.3 are indicated with an asterisk (*).

  • Apache Cassandra™ 3.11.0.1855 *
  • Apache Solr™ 6.0.1.0.1833 *
  • Apache Spark™ 2.0.2.6
  • Apache TinkerPop™ 3.2.6-20170821-ac1bbb27 *
  • Apache Tomcat® 8.0.44 *
  • DataStax Spark Cassandra Connector 2.0.5 *
  • DSE Java Driver 1.2.2
  • DSEFS 5.1.2 *
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.234 (requires compatible API)
  • Select Hadoop libraries

5.1.3 Highlights

Executive summary highlights for DSE 5.1.3: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.3 DataStax Enterprise highlights

  • Incremental repairs are no longer the default for nodetool repair. In DSE 5.1.0-5.1.2, repairs ran as incremental even with nodetool repair -full or nodetool repair -pr, and marked SSTables as repaired, which caused anti-compaction. (DSP-14464)

    After upgrades from DSE 5.1.0-5.1.2 to DSE 5.1.3 or later, you must follow instructions in the upgrade guide to migrate off of incremental repairs. To continue running incremental repairs, use nodetool repair -inc.

5.1.3 DSE Analytics and DSEFS highlights

  • New -framework option for dse spark commands to accommodate applications that were originally written for open source Apache Spark. Specify which classpath is used, either the DSE version (default) or a similar path to open source Spark 2.0. (DSP-12954)
  • DSEFS includes several important stability fixes and performance improvements. To use DSEFS in production, DataStax strongly recommends upgrading to DSE 5.1.3 to leverage these improvements.

5.1.3 DSE Graph highlights

  • Significantly improved graph query performance. (DSP-11534)
  • Domain specific language support. (DSP-13545)
  • Graph custom id support for multiple keyed vertices. (DGL-258)

5.1.3 DSE Search highlights

5.1.3 DataStax Enterprise

Changes and enhancements:

  • nodetool rebuild and nodetool bootstrap improvements. (DSP-13870, DB-581)
    • New nodetool rebuild operations:
      • refetch - resets locally available ranges. Streams all ranges but leaves current data untouched.
      • reset - resets locally available ranges. Removes all locally present data (like a TRUNCATE). Streams all ranges.
  • Simplify role-permissions handling. (DSP-14159)

    The table system_auth.resource_role_permissons_index is no longer used. Drop this table after all nodes are upgraded to DSE 5.0.10. Upgrades from DSE 5.0.10+ to DSE versions earlier than 5.1.3 are not recommended. See Restrictions when upgrading to DSE 5.1.3.

  • New nodetool mark_unrepaired command unifies repaired and unrepaired compaction buckets. (DSP-14255)
  • Changes to nodetool repair. (DSP-14464)
    • When run without options on new tables, the default behavior is nodetool repair -full. (Earlier versions were incremental when no options were specified.)
    • When run without options on a keyspace or set of tables, nodetool repair runs incremental repair on tables previously repaired and full repair on new tables.
    • Anti-compaction is no longer run after full repairs. Use nodetool repair --run-anticompaction to restore the previous behavior.
    • Incremental repair is no longer supported on tables with MVs or CDC. An incremental repair executed on a table with MVs or CDC runs a full repair instead.

    After upgrades from DSE 5.1.0-5.1.2 to DSE 5.1.3 or later, you must follow instructions in the upgrade guide to migrate off of incremental repairs. To continue running incremental repairs, use nodetool repair -inc.
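
The repair rules listed above can be read as a small decision procedure. The following Python sketch is only an illustrative model of those rules, not DSE code: the function name effective_repair and its parameters are hypothetical, and it assumes that an incremental repair always involves anti-compaction, as the highlights imply.

```python
def effective_repair(option=None, previously_incremental=False,
                     has_mv_or_cdc=False, run_anticompaction=False):
    """Model which repair runs on one table under the rules listed above.

    option: None (no option given), "full" (-full), or "inc" (-inc).
    Returns a (mode, anticompaction) tuple.
    """
    if option not in (None, "full", "inc"):
        raise ValueError("option must be None, 'full', or 'inc'")

    if option == "inc":
        mode = "incremental"
    elif option == "full":
        mode = "full"
    else:
        # No option: tables that were already repaired incrementally stay
        # incremental; new tables get a full repair.
        mode = "incremental" if previously_incremental else "full"

    # Incremental repair is not supported on tables with MVs or CDC,
    # so a full repair runs instead.
    if mode == "incremental" and has_mv_or_cdc:
        mode = "full"

    if mode == "full":
        # Full repairs skip anti-compaction unless --run-anticompaction is set.
        anticompaction = run_anticompaction
    else:
        # Assumption: incremental repair marks SSTables as repaired,
        # which implies anti-compaction.
        anticompaction = True
    return mode, anticompaction


if __name__ == "__main__":
    print(effective_repair())                             # new table, no option
    print(effective_repair(previously_incremental=True))  # previously repaired table
    print(effective_repair("inc", has_mv_or_cdc=True))    # falls back to full
```

The model treats each table independently, which matches the statement that a keyspace-level repair can mix incremental and full repairs across tables.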

Resolved issues:

  • Adjust and check directory ownership when starting DSE. (DSP-13245)
  • CVE-2017-7957 xstream-core is vulnerable to Denial of Service (DoS) attacks. (DSP-13419)
  • After restore, data cannot be queried after streaming SSTables with sstableloader to tiered storage. (DSP-14188)
  • MemoryOnlyStrategy regions not immediately loaded into physical memory with new kernels. (DSP-14169)
  • Make full repair default and disallow incremental repair on MV/CDC tables. (DSP-14255)
  • Revert CASSANDRA-11223 behavior in AbstractReadCommandBuilder. (DSP-14135)
  • Prevent marking remote SSTables shadowing compacted data as repaired. (DSP-14141)
  • Rebuild logging always says 0 bytes. (DSP-13870)
  • Allow aggressive expiration of fully expired sstables without timestamp/key overlap checks. (DSP-13870)
  • SSTable index files can become corrupted due to StreamingHistogram bug. (DSP-14279)

5.1.3 DSE Advanced Replication

Resolved issues:
  • DataStax installer does not set up DSE Advanced Replication correctly. (DSP-13472)
  • Ingestion might miss or drop data at higher insertion rates. CDC log file might be deleted even if not processed. (DSP-14043)
  • DSEFS clients unnecessarily switch between remote nodes. (DSP-14108)
  • Race condition under heavy load sent confusing exceptions to the log file. (DSP-14180)

5.1.3 DSE Analytics

Changes and enhancements:
  • Improved error on Spark:// Master URLs. (DSP-13366)
  • New -framework option for dse spark commands to accommodate applications that were originally written for open source Apache Spark. Specify which classpath is used, either the DSE version (default) or a similar path to open source Spark 2.0. (DSP-12954)
  • Improved error messages when no target datacenter provided for Spark application. (DSP-13236)
Resolved issues:
  • Decrease logging level for RPC methods failures. (DSP-13282)
  • JoinWithCassandra and SaveToCassandra blocked on adding requests to the async execute pool. (DSP-14178)

5.1.3 DSEFS

Changes and enhancements:
  • Expand DSEFS repair capability. DSEFS fsck checks if data blocks exist on the remote node that claims to have them. Mixed versions during upgrades are not supported. Upgrade all nodes in the cluster before using DSEFS fsck. (DSP-13081)
  • DSEFS read performance is improved. (DSP-13309)
  • Launch DSEFS shell with precedence given to the specified hosts. (DSP-14108)
  • Connection reuse is improved. Closing idle connections is disabled by default. New idle_connection_timeout_ms option in dse.yaml defines how long to wait before an idle client-server connection is closed. (DSP-14010)
  • Protocol change improves efficiency of passing JSON arrays between DSEFS server and client. Mixed versions during upgrades are not supported. Upgrade all nodes in the cluster before using the DSEFS shell. (DSP-14107)
Resolved issues:
  • DataStax installer does not set up DSEFS correctly for No Services installations. (DSP-13473)
  • NullPointerException: Unexpected null value of column valid_from in <dse keyspace>.inodes while running fsck. (DSP-12615)
  • Memory leak occurs with incorrect use of WebHDFS API. (DSP-13813)
  • Rare client-side ParsingException. (DSP-14000)
  • Incorrect FileNotFound errors when using Spark with DSEFS. (DSP-14105)

5.1.3 DSE Graph

Changes and enhancements:
  • Improved and simplified data batch loading of pre-formatted data. (DGL-235)

    Supporting changes:

    • Schema discovery and schema generation are deprecated. (DGL-246)
    • Standard IDs are deprecated. (DGL-247)
    • Transformations are deprecated. (DGL-248)
    • Standard vertex IDs are deprecated. Use custom vertex IDs instead. (DSP-13485)
  • Graph custom id support for multiple keyed vertices. (DGL-258)
  • Query engine significantly improved to allow more queries to be satisfied by using indexes. In particular, AND and OR queries are now handled and translate transparently to multiple backend queries or, if possible, single search queries. (DSP-11534)
  • Allow for indexes to be used with ORDER BY clause. (DSP-11931)
  • Checking for edge connectedness no longer performs an unnecessary backend query. (DSP-12863)
  • Edge queries using between predicate now use an index, if available. (DSP-13541)
  • Improved support for domain-specific languages (DSL) in Gremlin enables the DataStax driver to specify TraversalSource. (DSP-13545)
  • cache=false at the transaction level now includes disabling AdjacencyListStoreImpl and IndexStoreImpl. (DSP-13560)
  • Vertices without multi-properties fetch all properties in a single query, rather than requesting properties one at a time. Using multi-properties as vertices is not recommended, because multiple cardinality properties (multi-properties) are retrieved in graph traversals more slowly than single cardinality properties. Vertices with multi-properties default to the previous behavior of requesting properties individually. (DSP-13646)
  • More Gremlin APIs are supported in DSEGraphFrames: dedup, sort, limit, filter, + as()/select(), or(). (DSP-13649)
  • Do partition deletes for the property/edge table entries if possible. (DSP-13671)
  • Timeouts for graph traversals now start from the time the request is received. Earlier releases started timeouts for graph traversals at processing start time. Timeouts will appear more readily on an overloaded server. (DSP-13828)
  • Numeric sack values no longer need to be explicitly typed (for example, 3.0D). You can still provide for greater specificity in the expected return type. (DSP-14026)
  • Lambdas provided to the sack() step are now recognized by the LambdaRestrictionStrategy. You must disable the restrict_lambda setting to call this method. (DSP-14118)
  • Support user-supplied IDs for edges and properties. ID must be Java UUID. (DSP-12932)
Resolved issues:
  • -help prints help twice. (DGL-257)
  • DGL prints warning excessively. (DGL-262)
  • The number of vertex labels is limited to 200 per graph. (DSP-11078)
  • Graph frames error if meta-property is not populated. (DSP-13063)
  • Gremlin server log directory setting doesn't work if default log location is moved. Use dse-env.sh to change log locations. (DSP-13508)
  • DseGraphFrame throws UnsupportedOperationException for graph with empty schema. (DSP-13858)
  • DseGraphRpc.getSchemaBlob should request EXECUTE permissions instead of SELECT. (DSP-13888)
  • Single cardinality edge updates work incorrectly. (DSP-14185)
  • DseGraphFrames.updateVertices() requires unnecessary ID columns. (DSP-14175)
  • The within predicate is not working for unindexed edges. (DSP-13209)

5.1.3 DSE Search changes and enhancements

  • OffheapPostings is present by default in demo and auto-generated solrconfig.xml files. (DSP-10088)
  • The default filter cache settings are changed. (DSP-13153)
  • Streamlined autoSolrConfig.xml template for auto-generated search indexes. CQL ALTER SEARCH INDEX CONFIG, ALTER SEARCH INDEX SCHEMA, and CREATE SEARCH INDEX shortcuts for TieredMergePolicyFactory. (DSP-13229)
  • DeleteById is deprecated. (DSP-13988)
  • Extend TieredMergePolicy to support automatic removal of deletes. (DSP-13626)
  • DSE Search indexing optimizes for SSDs by default. Spinning disk detection logic is removed. (DSP-13924)
  • Error messages for solr_query are more descriptive for invalid queries and syntax errors. (DSP-14003)
Resolved issues:
  • Shard request exceptions are not logged at the replica level. (DSP-12691)
  • Unnecessary double segment flushing on hard commit. (DSP-13971)
  • Reintroduce provisioning/dropping states for backward compatibility. Issue a warning when a graph is found. (DSP-14111)
  • Search permissions cannot be managed on non-search nodes in the cluster. (DSP-14242)

5.1.3 DataStax Enterprise known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DataStax Enterprise will not run with Java 1.8u161 or later. (DSP-15277)
  • Potential data loss for INSERTs with very large TTLs. TTL expiration timestamps are susceptible to the year 2038 problem. (DSP-15412)
    The maximum expiration timestamp that can be represented by the storage engine is 2038-01-19T03:14:06+00:00, which means that inserts with a TTL that expires after this date are not currently supported. There is no protection against INSERTs with a TTL expiring after the maximum supported date, so the expiration time field overflows and the records expire immediately. TTLs are considered "very large" when close to the maximum allowed value of 630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL is gradually reduced as the maximum expiration date approaches. For instance, on 2028-01-19T03:14:06, an insert with a TTL of 10 years is impacted.
    Warning: Upgrade to DSE 5.1.7 or later and take required action to protect against overflow of local expiration time.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
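
The TTL overflow condition described in the known issues above comes down to simple date arithmetic. The following Python sketch shows one way to check a TTL before issuing an INSERT. It uses the maximum expiration timestamp and maximum TTL quoted in these notes. The function names max_safe_ttl and ttl_overflows are hypothetical and are not part of DSE or any driver. The sketch uses exact second arithmetic, so boundary dates can differ by a few days from round-year approximations.

```python
from datetime import datetime, timezone

# Values quoted in the release notes above.
MAX_EXPIRATION = datetime(2038, 1, 19, 3, 14, 6, tzinfo=timezone.utc)
MAX_TTL_SECONDS = 630720000  # maximum allowed TTL (20 years of 365 days)


def max_safe_ttl(now):
    """Largest TTL in seconds that still expires on or before MAX_EXPIRATION."""
    remaining = int((MAX_EXPIRATION - now).total_seconds())
    return max(0, min(MAX_TTL_SECONDS, remaining))


def ttl_overflows(now, ttl_seconds):
    """True if an insert at `now` with this TTL would expire after MAX_EXPIRATION."""
    return ttl_seconds > int((MAX_EXPIRATION - now).total_seconds())


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("max safe TTL right now:", max_safe_ttl(now), "seconds")
    print("20-year TTL overflows:", ttl_overflows(now, MAX_TTL_SECONDS))
```

A client-side check like this is only a guard. The release notes recommend upgrading to DSE 5.1.7 or later and taking the required action there.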

Cassandra enhancements for DSE 5.1.3

A list of DataStax Enterprise 5.1.3 enhancements to Apache Cassandra™ 3.11.0.

DataStax Enterprise (DSE) 5.1.3 includes all changes from earlier DSE releases. These production-certified changes are enhancements to Apache Cassandra™ 3.11.0. (For Cassandra updates, see CHANGES.txt.)

  • Fix cassandra-stress hang issues when an error during cluster connection happens (CASSANDRA-12938)
  • Better bootstrap failure message when blocked by (potential) range movement (CASSANDRA-13744)
  • "ignore" option is ignored in sstableloader (CASSANDRA-13721)
  • Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
  • Duplicate the buffer before passing it to analyser in SASI operation (CASSANDRA-13512)
  • Copy session properties on cqlsh.py do_login (CASSANDRA-13640)
  • Potential AssertionError during ReadRepair of range tombstone and partition deletions (CASSANDRA-13719)
  • Don't let stress write warmup data if n=0 (CASSANDRA-13773)
  • Gossip thread slows down when using batch commit log (CASSANDRA-12966)
  • Randomize batchlog endpoint selection with only 1 or 2 racks (CASSANDRA-12884)
  • Fix digest calculation for counter cells (CASSANDRA-13750)
  • Fix ColumnDefinition.cellValueType() for non-frozen collection and change SSTabledump to use type.toJSONString() (CASSANDRA-13573)
  • Skip materialized view addition if the base table doesn't exist (CASSANDRA-13737)
  • Drop table should remove corresponding entries in dropped_columns table (CASSANDRA-13730)
  • Log warn message until legacy auth tables have been migrated (CASSANDRA-13371)
  • Fix incorrect [2.1 <- 3.0] serialization of counter cells created in 2.0 (CASSANDRA-13691)
  • Fix invalid writetime for null cells (CASSANDRA-13711)
  • Fix ALTER TABLE statement to atomically propagate changes to the table and its MVs (CASSANDRA-12952)
  • Fix Digest mismatch Exception if hints file has UnknownColumnFamily (CASSANDRA-13696)
  • Fixed ambiguous output of nodetool tablestats command (CASSANDRA-13722)
  • Purge tombstones created by expired cells (CASSANDRA-13643)
  • Make concat work with iterators that have different subsets of columns (CASSANDRA-13482)
  • Set test.runners based on cores and memory size (CASSANDRA-13078)
  • sstabledump reports incorrect usage for argument order (CASSANDRA-13532)
  • Uncaught exceptions in Netty pipeline (CASSANDRA-13649)
  • Prevent integer overflow on exabyte filesystems (CASSANDRA-13067)
  • Fix queries with LIMIT and filtering on clustering columns (CASSANDRA-11223)
  • Fix potential NPE when resume bootstrap fails (CASSANDRA-13272)
  • Fix toJSONString for the UDT, tuple and collection types (CASSANDRA-13592)
  • Clone HeartBeatState when building gossip messages. Make its generation/version volatile (CASSANDRA-13700)

General upgrade advice for DSE 5.1.3

General upgrade advice for DataStax Enterprise 5.1.3.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

DSE 5.1.3

Upgrading
  • Creating a Materialized View with filtering on a non-primary-key base column (added in CASSANDRA-10368) is disabled, because the liveness of a view row depends on multiple filtered base non-key columns and on the base non-key column used in the view primary key. These semantics cannot be supported without a storage format change; see CASSANDRA-13826. For append-only use cases, you can still use this feature with the startup flag -Dcassandra.mv.allow_filtering_nonkey_columns_unsafe=true.
  • The table system_auth.resource_role_permissons_index is no longer used and should be dropped after all nodes are on 5.1.3. Note that upgrades from DSE 5.0 series since 5.0.10 to DSE versions before 5.1.3 are not recommended.
  • Full repairs are now default if no option is specified on nodetool repair, unless incremental repair was already run on the table/keyspace being repaired, to maintain backward compatibility. Incremental repair may be run on new tables by using the -inc option.
  • Full repairs no longer run anti-compaction unless the --run-anticompaction option is specified.
  • Incremental repairs are no longer supported on tables with materialized views or CDC until the limitations are addressed. An incremental repair triggered on a base table or materialized view runs a full repair instead. See CASSANDRA-12888 for details.

Materialized Views

For upgrades from DSE 5.1.1 or 5.1.2 or any version earlier than DSE 5.0.10

  • Cassandra will no longer allow dropping columns on tables with Materialized Views.
  • A change was made in the way the Materialized View timestamp is computed. As a result, an old deletion of a base column that is a view primary key (PK) column might not be reflected in the view when repairing the base table after the upgrade. This condition is possible only when a deletion of an MV PK column that is not present in the base table PK (via UPDATE base SET view_pk_col = null or DELETE view_pk_col FROM base) is missed before the upgrade and received by repair after the upgrade. If such column deletions are done on a view PK column that is not a base PK column, run repair on the base table on all nodes before the upgrade. Alternatively, fix potential inconsistencies by running repair on the views after the upgrade, or by dropping and re-creating the views. See CASSANDRA-11500 for more details.
  • Removal of columns not selected in the Materialized View (via UPDATE base SET unselected_column = null or DELETE unselected_column FROM base) might not be properly reflected in the view in some situations. Avoid deletions on base columns that are not selected in views until this is fixed in CASSANDRA-13826.

Spark Cassandra Connector changes for DSE 5.1.3

A list of DataStax Enterprise 5.1.3 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.3 includes DataStax Spark Cassandra Connector 2.0.5 with all changes from earlier versions, and adds these production-certified changes:
  • Allow 'YYYY' format LocalDate
  • Add metrics for write batch Size (SPARKC-501)
  • Type Converters for java.time.localdate (SPARKC-495)

DSE 5.1.2

Release notes for DataStax Enterprise 5.1.2.

dse.yaml

The location of the dse.yaml file depends on the type of installation:

  • Package installations and Installer-Services installations: /etc/dse/dse.yaml
  • Tarball installations and Installer-No Services installations: installation_location/resources/dse/conf/dse.yaml

Important: DataStax recommends the latest patch release. The latest version of DataStax Enterprise 5.1 is 5.1.17. Because of potential data loss for INSERTs with very large TTLs (DSP-15412), DataStax does not recommend DSE 5.1.0-5.1.2 for production.
Attention: TTL expiration timestamps are susceptible to the year 2038 problem. If the TTL value is long and an expiration date is greater than the maximum threshold of 2038-01-19T03:14:06+00:00, the data is immediately expired and purged on the next compaction. When using a long TTL, DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action.

18 July 2017

5.1.2 Components

All components from DSE 5.1.2 are listed. Components that are updated for DSE 5.1.2 are indicated with an asterisk (*).

  • Apache Cassandra™ 3.11.0.1758 *
  • Apache Solr™ 6.0.1.0.1716 *
  • Apache Spark™ 2.0.2.6
  • Apache TinkerPop™ 3.2.6-20170623-d59f0b40 *
  • Apache Tomcat® 8.0.43 *
  • DataStax Spark Cassandra Connector 2.0.3 *
  • DSE Java Driver 1.2.2
  • DSEFS 5.1.2 *
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.234 (requires compatible API)
  • Select Hadoop libraries

5.1.2 Highlights

Executive summary highlights for DSE 5.1.2: The executive summary highlights are just a top-level view. Be sure to review all of the release notes.

5.1.2 DataStax Enterprise highlights

DataStax Enterprise 5.1.2 includes CASSANDRA-13004, which fixes possible corruption when adding a column to a table or removing a column from a table. (DSP-13684)

This fix requires a messaging protocol version change to VERSION_3014. DataStax strongly recommends additional steps for the following upgrade paths:
  • Upgrade from 5.0.0 through 5.0.8 to 5.1.2 and later: See the "Upgrades from DSE 5.0.0 to 5.0.8 and from DSE 5.1.0 and 5.1.1 to DSE 5.1.2 only" step in the Preparing to upgrade section in Upgrading from DataStax Enterprise 5.0 to 5.1.
  • Upgrade from 5.1.0 through 5.1.1 to 5.1.2 and later: See Preparing to upgrade in Upgrades for DataStax Enterprise patch releases.
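
The upgrade paths above can be expressed as a small version check, for example in a pre-upgrade script. This Python sketch is illustrative only. The function names are hypothetical, the returned strings are the document titles quoted above, and versions outside the listed ranges simply return None because these notes list no additional step for them.

```python
def parse_version(text):
    """Turn a version string such as '5.0.8' into a comparable tuple (5, 0, 8)."""
    return tuple(int(part) for part in text.split("."))


def extra_upgrade_steps(from_version):
    """Return the document that describes the additional steps, or None."""
    version = parse_version(from_version)
    if (5, 0, 0) <= version <= (5, 0, 8):
        return "Upgrading from DataStax Enterprise 5.0 to 5.1"
    if (5, 1, 0) <= version <= (5, 1, 1):
        return "Upgrades for DataStax Enterprise patch releases"
    return None


if __name__ == "__main__":
    for current in ("5.0.3", "5.1.1", "5.1.2"):
        print(current, "->", extra_upgrade_steps(current))
```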

5.1.2 DSE Analytics and DSEFS highlights

  • DSE will not start if DSEFS is enabled (which is the default for all Analytics nodes in 5.1) and the DSEFS work directory or data directories are missing and cannot be created. In earlier releases, DSE would start but the Analytics nodes would experience hard-to-detect problems later on. (DSP-13238)
  • DSEFS performance is improved when authorization is enabled. New dse.yaml advanced DSEFS options: query_cache_size and query_cache_expire_after_ms adjust the credential caching. (DSP-13107)

5.1.2 DSE Graph highlights

  • Performance improvement: Gremlin script compilation. (DSP-12789)
  • Significant improvement on vertex properties retrieval. (DSP-13467)
  • Partitioned vertex tables (PVT) are deprecated. (DSP-13501)
  • Graph Loader: Support loading geospatial data type. (DGL-225)

5.1.2 DSE Search highlights

  • Re-indexing performance improvements. (DSP-13751), (DSP-12923)
  • Fixes to Solr indexing management tasks. (DSP-13778), (DSP-10088), (DSP-13793)

5.1.2 DataStax Enterprise

Changes and enhancements:

  • Jackson Deserializer vulnerability. (DSP-13414)
  • New nodetool sjk command for troubleshooting and monitoring that runs Swiss Java Knife (SJK) on the local node. (DSP-13544)
  • Make o.a.c.metrics extend org.codahale.metrics to fix Metrics Reporter. (DSP-13840)
  • Make sure to handle range queries while filtering. (DSP-13840)
  • Allow mapping a single column to multiple SASI indexes. (DSP-13045)
  • Properly evict pstmts from prepared statements cache. (DSP-13770)
  • Add nodetool sequence batch functionality. (DSP-13770)
  • Show correct protocol version in cqlsh. (DSP-13544)
  • Null assertion in MemtablePostFlush. (DSP-13544)

Resolved issues:

  • CqlSlowLogPlugin can fail to determine the table name of a DropIndexStatement if the index was dropped already. (DSP-11811)
  • Installer overrides for workload don't work in No Services + Analytics. (DSP-13475)

5.1.2 DSE Analytics

Changes and enhancements:
  • When ALLOW_SPARK_HOME=true, you can specify a user-specific Spark home directory with the SPARK_HOME environment variable. (DSP-8100)
  • Change lease manager log message to improve Spark Master troubleshooting. (DSP-12846)
Resolved issues:
  • Default and provided Spark executor or driver JVM options could get jumbled. (DSP-12857)
  • DSEFS min_free_space default value in dse.yaml is changed to 5 GB. (DSP-13178)
  • Cannot interrupt Spark Shell when it is unable to connect to DSE and keeps retrying. (DSP-13339)
  • Configuration connection for Spark applications should use a load balancing policy to choose only nodes that are running Spark in the target DC. (DSP-13325)
  • When Spark drivers and executors are stopped after a supervising DSE process dies, Spark executors might stay alive even after worker death because of a race condition. (DSP-13688)
  • MultipleRetry policy may retry with an incorrect consistency level. (DSP-13542)

5.1.2 DSE Graph

Changes and enhancements:
  • Specify file matching pattern for directory load. (DGL-177)
  • Graph Loader: Support loading geospatial data type. (DGL-225)
  • Improved error message when Spark submit has connection problems on initialization. (DSP-12632)
  • Partitioned vertex tables (PVT) are deprecated. (DSP-13501)
  • A change is required if more than 256 parameters are passed on a graph query request for TinkerPop drivers and drivers using Cassandra native protocol. Passing very large numbers of parameters on requests is an anti-pattern, because the script evaluation time increases proportionally. DataStax recommends reducing the number of parameters to reduce script compilation times. Consider alternate methods for parameterizing scripts, like passing a single map. If the graph query request requires many arguments, pass a list. If you pass more than 256 parameters, increase the max_query_params option in dse.yaml. (DSP-12789)
  • Don't instantiate DseQueryHandler for each statement in graph. (DSP-13287)
  • GraphSON 2.0 serialization performance enhancements. (DSP-13467)
  • DSEFS keyspace visible in Spark SQL. (DSP-13510)
  • Remove provisioning state during graph creation. Graph is either live or non-existing. (DSP-13686)
  • Improve schema migration. Remove schema provisioning. (DSP-13665)
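
The recommendation above to pass a single map instead of many parameters can be sketched independently of any driver. The following Python helper only reshapes the parameter dictionary. The name bundle_params and the binding name p are hypothetical, and the default limit of 256 is taken from the note above. How the resulting dictionary is bound to a graph query depends on the driver in use, and the script itself would need to read values from the map (for example p.name in Groovy) rather than from top-level bindings.

```python
def bundle_params(params, name="p", limit=256):
    """Wrap a large parameter dict in a single map parameter.

    If the number of parameters is within the limit, a copy is returned
    unchanged. Otherwise all parameters are nested under one key, so the
    request carries a single parameter.
    """
    if len(params) <= limit:
        return dict(params)
    return {name: dict(params)}


if __name__ == "__main__":
    many = {"k%d" % i: i for i in range(300)}
    bundled = bundle_params(many)
    print(len(many), "parameters become", len(bundled), "top-level parameter")
```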

5.1.2 DSE Graph resolved issues

Resolved issues:
  • Graph Loader loads entire GraphSON and Gryo files into memory. (DGL-209)
  • Properly parse dates from strings. (DSP-12259)
  • Race condition can cause Spark Executor creation loop during DSE node shutdown. (DSP-12589)
  • Order propertyKeys correctly in schema.describe(). (DSP-12761)
  • Gremlin scripts taking a long time to compile. See required change if more than 256 parameters are passed on a graph query request. (DSP-12789)
  • gremlin-console isn't properly initialized when started in debug mode. (DSP-12900)
  • Change ranking of indices so that Search index < Secondary Index < MV index. (DSP-13212)
  • Graph profile() results should display CQL by default even in console. (DSP-13293)
  • Cache empty result sets for queries that didn't return elements. (DSP-13342)
  • GraphFrames allow grouping by properties which can potentially be null. (DSP-13406)
  • DseGraphFrame needs to be serializable for the spark-shell graph data export. (DSP-13427)
  • Backward compatibility issue with .select() .by() or local(). (DSP-13607)
  • DseGraphFrame.updateEdges() inserts single cardinality edges properly. (DSP-13865)
  • Spark shell seems to hang indefinitely when running graph frame drop command. (DSP-13795)

5.1.2 DSEFS

Changes and enhancements:
  • Improve authorization performance. New dse.yaml advanced DSEFS options: query_cache_size and query_cache_expire_after_ms. (DSP-13107)
  • Improve error message when DSEFS is low on storage space. (DSP-13324)
  • DSEFS keyspace creation uses SimpleStrategy with replication factor of 1. After starting the cluster for the first time, you must alter the keyspace to use NetworkTopologyStrategy with proper RF. (DSP-12662)
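
As a sketch of the keyspace change described above, the following Python helper builds the CQL statement that switches a keyspace to NetworkTopologyStrategy. It assumes that the DSEFS keyspace is named dsefs. The datacenter names and replication factors are placeholders, and the function name alter_keyspace_cql is hypothetical. Adjust all of these to match the cluster.

```python
def alter_keyspace_cql(keyspace, dc_replication):
    """Build an ALTER KEYSPACE statement that uses NetworkTopologyStrategy.

    dc_replication maps datacenter names to replication factors.
    """
    if not dc_replication:
        raise ValueError("at least one datacenter is required")
    options = ", ".join(
        "'%s': %d" % (dc, rf) for dc, rf in sorted(dc_replication.items())
    )
    return (
        "ALTER KEYSPACE %s WITH replication = "
        "{'class': 'NetworkTopologyStrategy', %s};" % (keyspace, options)
    )


if __name__ == "__main__":
    # Placeholder datacenter names and replication factors.
    print(alter_keyspace_cql("dsefs", {"DC1": 3, "DC2": 3}))
```

The generated statement can be run from cqlsh after the cluster starts for the first time.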
Resolved issues:
  • DSE will not start if DSEFS is enabled and fails to start due to a configuration problem. (DSP-13238)

5.1.2 DSE Search

Changes and enhancements:
  • rtOffheapPostings is present and true by default in demo and auto-generated solrconfig.xml files. (DSP-10088, DSP-13228)
  • Repair-driven re-indexing is significantly faster because individual partition indexing tasks are executed in parallel. Override Cassandra's default post-repair index builder. (DSP-12923)
  • The default filter cache settings are changed. (DSP-13153)
  • The Tika functionality that is bundled with Apache Solr is deprecated. Instead, use the stand-alone Apache Tika project. (DSP-14002)
Resolved issues:
  • Gremlin inside() function no longer uses search index. (DSP-13553)
  • CREATE SEARCH INDEX fails with custom resources. (DSP-13778)
  • Improved error message when running dse cassandra-stop when there are multiple DSE processes. (DSP-12938)
  • Solr 2i invalidation deadlocks if invalidation runs with index unregistered. (DSP-13751)
  • Auto-generation options need to be validated correctly. (DSP-13793)

5.1.2 DataStax Enterprise known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DataStax Enterprise will not run with Java 1.8u161 or later. (DSP-15277)
  • Potential data loss for INSERTs with very large TTLs. TTL expiration timestamps are susceptible to the year 2038 problem. (DSP-15412)
    The maximum expiration timestamp that can be represented by the storage engine is 2038-01-19T03:14:06+00:00, which means that inserts with a TTL that expires after this date are not currently supported. There is no protection against INSERTs with a TTL expiring after the maximum supported date, so the expiration time field overflows and the records expire immediately. TTLs are considered "very large" when close to the maximum allowed value of 630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL is gradually reduced as the maximum expiration date approaches. For instance, on 2028-01-19T03:14:06, an insert with a TTL of 10 years is impacted.
    Warning: Upgrade to DSE 5.1.7 or later and take required action to protect against overflow of local expiration time.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
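
The TTL overflow described above (DSP-15412) comes down to simple date arithmetic. The following is a minimal sketch, not DSE code: it only uses the two constants quoted in the known issue (the maximum expiration timestamp and the maximum TTL of 630720000 seconds), and the function names ttl_overflows and max_safe_ttl are hypothetical helpers introduced for illustration.

```python
from datetime import datetime, timezone

# Maximum expiration timestamp quoted in the known issue (UTC).
MAX_EXPIRATION = datetime(2038, 1, 19, 3, 14, 6, tzinfo=timezone.utc)
# Maximum allowed TTL quoted in the known issue, in seconds.
MAX_TTL_SECONDS = 630720000


def ttl_overflows(write_time: datetime, ttl_seconds: int) -> bool:
    """Return True if write_time + TTL falls after the maximum expiration.

    Hypothetical helper: mirrors the condition in the release note,
    not the server-side implementation.
    """
    expiration = int(write_time.timestamp()) + ttl_seconds
    return expiration > int(MAX_EXPIRATION.timestamp())


def max_safe_ttl(write_time: datetime) -> int:
    """Largest TTL (seconds) that still expires on or before the maximum."""
    remaining = int(MAX_EXPIRATION.timestamp()) - int(write_time.timestamp())
    return max(0, min(MAX_TTL_SECONDS, remaining))


if __name__ == "__main__":
    ten_years = 10 * 365 * 24 * 60 * 60
    for year in (2020, 2030):
        now = datetime(year, 1, 1, tzinfo=timezone.utc)
        print(year, ttl_overflows(now, ten_years), max_safe_ttl(now))
```

A client-side check of this kind can flag risky writes before they are sent, but it does not replace the upgrade and the required action called for in the warning.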

Cassandra enhancements for DSE 5.1.2

A list of DataStax Enterprise 5.1.2 enhancements to Apache Cassandra™ 3.11.0.

DataStax Enterprise (DSE) 5.1.2 includes all changes from earlier DSE releases. These production-certified changes are enhancements to Apache Cassandra™ 3.11.0. (For Cassandra updates, see CHANGES.txt.)

  • Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
  • Allow different NUMACTL_ARGS to be passed in (CASSANDRA-13557)
  • Fix secondary index queries on COMPACT tables (CASSANDRA-13627)
  • Nodetool listsnapshots output is missing a newline, if there are no snapshots (CASSANDRA-13568)
  • Fix toJSONString for the UDT, tuple and collection types (CASSANDRA-13592)
  • Fix nested Tuples/UDTs validation (CASSANDRA-13646)
  • Replace string comparison with regex/number checks in MessagingService test (CASSANDRA-13216)
  • Fix formatting of duration columns in CQLSH (CASSANDRA-13549)
  • Ensure int overflow doesn't occur when calculating large partition warning size (CASSANDRA-13172)
  • Ensure consistent view of partition columns between coordinator and replica in ColumnFilter (CASSANDRA-13004)
  • Failed unregistering mbean during drop keyspace (CASSANDRA-13346)
  • nodetool scrub/cleanup/upgradesstables exit code is wrong (CASSANDRA-13542)
  • Fix the reported number of sstable data files accessed per read (CASSANDRA-13120)
  • Fix schema digest mismatch during rolling upgrades from versions before 3.0.12 (CASSANDRA-13559)
  • Upgrade JNA version to 4.4.0 (CASSANDRA-13072)
  • Interned ColumnIdentifiers should use minimal ByteBuffers (CASSANDRA-13533)
  • ReverseIndexedReader may drop rows during 2.1 to 3.0 upgrade (CASSANDRA-13525)
  • Fix repair process violating start/end token limits for small ranges (CASSANDRA-13052)
  • Nodes started with join_ring=False should be able to serve requests when authentication is enabled (CASSANDRA-11381)
  • cqlsh COPY FROM: increment error count only for failures, not for attempts (CASSANDRA-13209)
  • Fix the problem with duplicated rows when using paging with SASI (CASSANDRA-13302)
  • Allow CONTAINS statements filtering on the partition key and its parts (CASSANDRA-13275)
  • Fall back to even ranges calculation in clusters with vnodes when tokens are distributed unevenly (CASSANDRA-13229)
  • Fix duration type validation to prevent overflow (CASSANDRA-13218)
  • Forbid unsupported creation of SASI indexes over partition key columns (CASSANDRA-13228)
  • Reject multiple values for a key in CQL grammar. (CASSANDRA-13369)
  • UDA fails without input rows (CASSANDRA-13399)
  • Fix compaction-stress by using daemonInitialization (CASSANDRA-13188)
  • V5 protocol flags decoding broken (CASSANDRA-13443)
  • Use write lock not read lock for removing sstables from compaction strategies. (CASSANDRA-13422)
  • Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors (CASSANDRA-13329)
  • Avoid rebuilding SASI indexes containing no values (CASSANDRA-12962)
  • Add charset to Analyser input stream (CASSANDRA-13151)
  • Delete illegal character from StandardTokenizerImpl.jflex (CASSANDRA-13417)
  • Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
  • Tracing payload not passed from QueryMessage to tracing session (CASSANDRA-12835)
  • Add storage port options to sstableloader (CASSANDRA-13518)
  • Properly handle quoted index names in cqlsh DESCRIBE output (CASSANDRA-12847)
  • Avoid reading static row twice from old format sstables (CASSANDRA-13236)
  • Fix NPE in StorageService.excise() (CASSANDRA-13163)
  • Expire OutboundTcpConnection messages by a single Thread (CASSANDRA-13265)
  • Fail repair if insufficient responses received (CASSANDRA-13397)
  • Fix SSTableLoader fail when the loaded table contains dropped columns (CASSANDRA-13276)
  • Avoid name clashes in CassandraIndexTest (CASSANDRA-13427)
  • Handling partially written hint files (CASSANDRA-12728)
  • Interrupt replaying hints on decommission (CASSANDRA-13308)
  • Fix NPE issue in StorageService (CASSANDRA-13060)
  • Make reading of range tombstones more reliable (CASSANDRA-12811)
  • Fix startup problems due to schema tables not completely flushed (CASSANDRA-12213)
  • Fix view builder bug that can filter out data on restart (CASSANDRA-13405)
  • Fix 2i page size calculation when there are no regular columns (CASSANDRA-13400)
  • Fix the conversion of 2.X expired rows without regular column data (CASSANDRA-13395)
  • Fix hint delivery when using ext+internal IPs with prefer_local enabled (CASSANDRA-13020)
  • Nodetool upgradesstables/scrub/compact ignores system tables (CASSANDRA-13410)
  • Fix schema version calculation for rolling upgrades (CASSANDRA-13441)
  • Avoid starting gossiper in RemoveTest (CASSANDRA-13407)
  • Fix weightedSize() for row-cache reported by JMX and NodeTool (CASSANDRA-13393)
  • Fix JVM metric names (CASSANDRA-13103)
  • Coalescing strategy sleeps too much (CASSANDRA-13090)
  • Fix 2ndary index queries on partition keys for tables with static columns (CASSANDRA-13147)
  • Fix ParseError unhashable type list in cqlsh copy from (CASSANDRA-13364)

General upgrade advice for DSE 5.1.2

General upgrade advice for DataStax Enterprise 5.1.2.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for 5.1.2

A list of DataStax Enterprise 5.1.2 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.2 includes DataStax Spark Cassandra Connector 2.0.3 with all changes from earlier versions, and adds these production-certified changes:
  • All patches up to 1.6.8.

DSE 5.1.1

Release notes for DataStax Enterprise 5.1.1.

  • Select Hadoop libraries

    Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0, and were removed in DSE 5.1. Hadoop removal from DSE 5.1 and later means that DSE does not allow for the startup of Hadoop services previously included in DSE, including MapReduce JobTracker and TaskTracker.

    However, DSE has supported built-in Spark since DSE 4.5 and Bring-Your-Own-Spark (BYOS) since DSE 5.0, and that support continues today. Because Spark depends on certain Hadoop libraries on the server and the client, DSE continues to ship with Hadoop libraries that are required for running Spark and BYOS.

    To view the included Hadoop libraries, see DataStax Enterprise 5.1.x third-party software.

Important: DataStax recommends the latest patch release. The latest version of DataStax Enterprise 5.1 is 5.1.17. Because of the potential data loss for INSERTs with very large TTLs (DSP-15412), DataStax does not recommend DSE 5.1.0 through 5.1.2 for production.
Attention: TTL expiration timestamps are susceptible to the year 2038 problem. If the TTL value is long and an expiration date is greater than the maximum threshold of 2038-01-19T03:14:06+00:00, the data is immediately expired and purged on the next compaction. When using a long TTL, DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action.

23 May 2017

Table 12. DSE functionality

5.1.1 Components

All components from DSE 5.1.1 are listed. Components that are updated for DSE 5.1.1 are indicated with an asterisk (*).
  • Apache Cassandra™ 3.10.0.1695 *
  • Apache Solr™ 6.0.1.0.1705 *
  • Apache Spark™ 2.0.2.6
  • Apache TinkerPop™ 3.2.5-20170321-f3032b39 *
  • Apache Tomcat® 8.0.43 *
  • DataStax Spark Cassandra Connector 2.0.2 *
  • DSE Java Driver 1.2.2
  • DSEFS 5.1.26 *
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.234 (requires compatible API)
  • Select Hadoop libraries

5.1.1 Highlights

Executive summary highlights for DSE 5.1.1: The executive summary highlights are only a top-level view. Be sure to review all of the release notes.

DSE Analytics and DSEFS highlights

DSE 5.1.1 improves the reliability of Spark workers reconnecting when the Spark Master changes to a different node, for example, when the current master node goes down. Although this scenario was rarely encountered, it sometimes required running a command to restart the Spark workers. The affected versions are DSE 5.0.7 and 5.1.0. (DSP-11306)

DSE Graph highlights

DSE 5.1.1 highlights include:
  • Fix for OLAP queries that failed if meta-properties were used in graph schema. (DSP-13016)
  • Script synchronization to prevent multiple threads from trying to compile the same Gremlin script. Previously, in multi-threaded scenarios, Gremlin scripts would hang. (DSP-12814)

DSE Search highlights

Skip DSE 5.1.0 and upgrade directly to DSE 5.1.1 if you:
  • Use the HTTP interface. (DSP-13318), (DSP-13270)
  • Have a Thrift column family backing an active Solr core. (DSP-13019)
  • Use TTL to expire data. (DSP-12960)
  • Use index encryption. (DSP-13155), (DSP-12620)
  • Use live indexing. (DSP-12040), (DSP-12941)

5.1.1 DataStax Enterprise

Changes and enhancements:

  • Security fix with commons-collections4 version 4.1 due to CVE-2015-6420. (DSP-13060)
  • Guard mapped memory accesses with an assertion instead of causing a segmentation fault in JVM. (DSP-13344)

Resolved issues:

  • dsetool writes credentials to the logs in clear text. (DSP-12985)
  • Plain text authentication handled incorrectly in DseAuthenticator causes performance degradation. (DSP-13201)
  • Installer deletes user directories under /etc/dse/conf during upgrade to 5.1. (DSP-13296)
  • SafeNet/KMIP authentication failure via LDAP. (DSP-12739)
  • CVE-2012-2098 vulnerability in Apache Ant Core 1.7.0. (DSP-12925)

5.1.1 DSE Advanced Replication

Changes and enhancements:
  • Increased robustness of CDC processor. (DSP-12852)
  • Add audit log compression parameter. (DSP-12949)
Resolved issues:
  • Error while refreshing configuration. (DSP-13148)
  • In flight Advanced Replication mutations are not encrypted when commitlog encryption is enabled. (DSP-12961)
  • MutationFileSource fails when a transmission file is not found. (DSP-11633)
  • AdvRep channel status NPE. (DSP-12522)
  • AdvRep CLI metrics list output showing negative message count. (DSP-12788)
  • advrep log count Serializer Not Defined Error MultiNode. (DSP-13032)

5.1.1 DSE Analytics

Changes and enhancements:
  • Spark Cassandra Connector should make DseSession compatible sessions. (DSP-12737)
Resolved issues:
  • On start, Spark worker registers with master that is then changed, but doesn't reregister with new master. (DSP-11306)
  • A new CQL type tinyint. (DSP-11940)
  • When DSE node with Spark Master gracefully shuts down at the same time that an application is submitted or stopped, Spark Master fails to save the recovery storage information. (DSP-12795)
  • Weather sensor demo website not graphing all data values. (DSP-13041)
  • Extra unnecessary messages when starting Spark shell. (DSP-13239)
  • The spark-submit --driver-class-path option does not place a jar only on the Driver Classpath. (DSP-13289)

5.1.1 DSE Graph

Changes and enhancements:
  • Make explicit parameter for setting tmp dir for mapdb and netty. (DGL-167)
  • Support recursive loading of directories. (DGL-172)
  • Remove double cluster client in ClusterBuilder. Instead, use a single client and configure the CL in a SimpleGraphStatement for creating the graph. (DGL-183)
  • VertexInputRDD.getOrCreateVertex method performance improvement; Graph OLAP query running time reduced by ~10%. (DSP-12782)
  • DseGraphFrames library is included in com.datastax.dse:dse-spark-dependencies to support application build. (DSP-13074)
Resolved issues:
  • Support secondary indexes. (DGL-202)
  • DGL creates duplicate edges on rerun when using custom IDs. (DGL-205)
  • Properties with empty strings are skipped. New graph loader -skip_blank_values option. (DGL-215)
  • Tab-delimited data cannot be read correctly with File.text. (DGL-222)
  • RangeStep fails when used with negative values. (DSP-11671)
  • Logging level in DigestTokensManager lowered from INFO to DEBUG. (DSP-12234)
  • Decimal type does not work, for both read and write, when reading a graph from Spark. (DSP-12299)
  • Comparing IDs of newly created elements with normal elements causes a class cast exception. (DSP-12738)
  • Allow graph.allow_scan to be set on tx level. (DSP-12794)
  • Improve handling of ASM "Method code too large" exception when processing large Gremlin script. (DSP-12802)
  • Many threads get stuck compiling the same script. (DSP-12814)
  • Check that a new ID given to a schema element has not already been used. (DSP-12826)
  • Optimize solr .within() queries correctly. (DSP-12830)
  • Vertex properties without meta-properties defined in schema create invalid RDD data. (DSP-13016)
  • OLAP case sensitivity for edges and meta-properties. (DSP-13085)
  • Exception thrown when attempting to read IDs of vertices retrieved through a full-graph scan. (DSP-13210)
  • Graph should start listening to schema updates only after DSE system keyspace is set up. (DSP-13251)
  • DseGraphFrame fails with UUID as a custom ID. (DSP-13302)

5.1.1 DSEFS

Changes and enhancements:
  • Local node is preferred for placing new data blocks to save network bandwidth usage by DSEFS. (DSP-12746)
Resolved issues:
  • DSEFS memory leaks. (DSP-13023)
  • Cannot write file to WebHDFS REST interface with Spark. (DSP-13154)

5.1.1 DSE Search

Changes and enhancements:
  • Solr demos updated to use CQL index management to create cores. (DSP-11451)
  • Runtime node blacklisting for distributed search queries; the EndpointStateTracker MBean now has a Blacklisted boolean attribute. (DSP-12965)
  • Display reindexing progress with dsetool core_indexing_status --progress option. (DSP-12617)
  • Support for indexing frozen sets and lists of native and user-defined (tuple/UDT) element types. Indexing frozen maps is not supported. (DSP-12983)
Resolved issues:
  • Remove <dataDir> option from solrConfig files in demo apps. (DSP-9402)
  • CQL Search queries time out when a column has a colon (:) in it. Solr field name policy applies to DSE Search. See the field name information in Apache Solr and Apache Lucene limitations. (DSP-11296)
  • Make TimeUUIDField epoch not platform-dependent. (DSP-11424)
  • Term vector (TV) file handles leak when an empty DWPT gets discarded in RT setup. (DSP-12040)
  • DistributedRequestException isn't created with a detail message. (DSP-12493)
  • BlockCache corruption with high concurrency. (DSP-12620)
  • Poor performance when searching with UDT sub-fields. (DSP-12812)
  • Better TTL logging. (DSP-12885)
  • Term frequency inconsistencies in RT. (DSP-12941)
  • The TTL task is never de-scheduled. (DSP-12960)
  • Cannot reload core after Thrift table upgrade. (DSP-13019)
  • Solr listens only on port 8080 regardless of configuration. (DSP-13187)
  • Solr is accepting HTTP requests before all cores have loaded. (DSP-13270)
  • Excessive StatefulEncryptorAdapter usage by evicting StatefulEncryptorAdapter cache when index output gets closed. (DSP-13155)
  • Upgrade Tomcat to 8.0.43 to fix CVE-2016-8735 and other security issues. (DSP-13318)

5.1.1 DataStax Enterprise known issues

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • DataStax Enterprise will not run with Java 1.8u161 or later. (DSP-15277)
  • Potential data loss for INSERTs with very large TTLs. TTL expiration timestamps are susceptible to the year 2038 problem. (DSP-15412)
    The maximum expiration timestamp that can be represented by the storage engine is 2038-01-19T03:14:06+00:00, which means that inserts with a TTL that expires after this date are not currently supported. There is no protection against INSERTs with a TTL expiring after the maximum supported date; such inserts cause the expiration time field to overflow and the records to expire immediately. TTLs are considered "very large" when close to the maximum allowed value of 630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL is gradually reduced as the maximum expiration date approaches. For instance, on 2028-01-19T03:14:06, an insert with a TTL of 10 years is impacted.
    Warning: Upgrade to DSE 5.1.7 or later and take required action to protect against overflow of local expiration time.
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.

Cassandra enhancements for DSE 5.1.1

A list of DataStax Enterprise 5.1.1 enhancements to Apache Cassandra™ 3.10.0.

DataStax Enterprise (DSE) 5.1.1 includes all changes from earlier DSE releases. These production-certified changes are enhancements to Apache Cassandra™ 3.10.0. (For Cassandra updates, see CHANGES.txt.)

  • Fix the problem with duplicated rows when using paging with SASI (CASSANDRA-13302)
  • Allow CONTAINS statements filtering on the partition key and its parts (CASSANDRA-13275)
  • Fall back to even ranges calculation in clusters with vnodes when tokens are distributed unevenly (CASSANDRA-13229)
  • Fix duration type validation to prevent overflow (CASSANDRA-13218)
  • Forbid unsupported creation of SASI indexes over partition key columns (CASSANDRA-13228)
  • Reject multiple values for a key in CQL grammar. (CASSANDRA-13369)
  • UDA fails without input rows (CASSANDRA-13399)
  • Fix compaction-stress by using daemonInitialization (CASSANDRA-13188)
  • V5 protocol flags decoding broken (CASSANDRA-13443)
  • Use write lock not read lock for removing sstables from compaction strategies. (CASSANDRA-13422)
  • Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors (CASSANDRA-13329)
  • Avoid rebuilding SASI indexes containing no values (CASSANDRA-12962)
  • Add charset to Analyser input stream (CASSANDRA-13151)
  • Delete illegal character from StandardTokenizerImpl.jflex (CASSANDRA-13417)
  • Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
  • Tracing payload not passed from QueryMessage to tracing session (CASSANDRA-12835)
  • Add storage port options to sstableloader (CASSANDRA-13518)
  • Properly handle quoted index names in cqlsh DESCRIBE output (CASSANDRA-12847)
  • Avoid reading static row twice from old format sstables (CASSANDRA-13236)
  • Fix NPE in StorageService.excise() (CASSANDRA-13163)
  • Expire OutboundTcpConnection messages by a single Thread (CASSANDRA-13265)
  • Fail repair if insufficient responses received (CASSANDRA-13397)
  • Fix SSTableLoader fail when the loaded table contains dropped columns (CASSANDRA-13276)
  • Avoid name clashes in CassandraIndexTest (CASSANDRA-13427)
  • Handling partially written hint files (CASSANDRA-12728)
  • Interrupt replaying hints on decommission (CASSANDRA-13308)
  • Fix NPE issue in StorageService (CASSANDRA-13060)
  • Make reading of range tombstones more reliable (CASSANDRA-12811)
  • Fix startup problems due to schema tables not completely flushed (CASSANDRA-12213)
  • Fix view builder bug that can filter out data on restart (CASSANDRA-13405)
  • Fix 2i page size calculation when there are no regular columns (CASSANDRA-13400)
  • Fix the conversion of 2.X expired rows without regular column data (CASSANDRA-13395)
  • Fix hint delivery when using ext+internal IPs with prefer_local enabled (CASSANDRA-13020)
  • Nodetool upgradesstables/scrub/compact ignores system tables (CASSANDRA-13410)
  • Fix schema version calculation for rolling upgrades (CASSANDRA-13441)
  • Avoid starting gossiper in RemoveTest (CASSANDRA-13407)
  • Fix weightedSize() for row-cache reported by JMX and NodeTool (CASSANDRA-13393)
  • Fix JVM metric names (CASSANDRA-13103)
  • Coalescing strategy sleeps too much (CASSANDRA-13090)
  • Fix 2ndary index queries on partition keys for tables with static columns (CASSANDRA-13147)
  • Fix ParseError unhashable type list in cqlsh copy from (CASSANDRA-13364)

General upgrade advice for DSE 5.1.1

General upgrade advice for DataStax Enterprise 5.1.1.

All upgrade advice from previous versions applies. Carefully review the Upgrading DataStax Enterprise planning and upgrade instructions to ensure a smooth upgrade and avoid pitfalls and frustrations. This general advice applies to the database upgrade and does not replace the upgrade documentation.
  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

Spark Cassandra Connector changes for DSE 5.1.1

A list of DataStax Enterprise 5.1.1 production-certified changes for the DataStax Spark Cassandra Connector.

DataStax Enterprise (DSE) 5.1.1 includes DataStax Spark Cassandra Connector 2.0.2 with all changes from earlier versions, and adds these production-certified changes:
  • Protect against Size Estimate Overflows (SPARKC-492)
  • Add java.time classes support to converters and sparkSQL (SPARKC-491)
  • Allow Writes to Static Columns and Partition Keys (SPARKC-470)

DSE 5.1.0

Release notes for DataStax Enterprise 5.1.0.

  • Select Hadoop libraries

    Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0, and were removed in DSE 5.1. Hadoop removal from DSE 5.1 and later means that DSE does not allow for the startup of Hadoop services previously included in DSE, including MapReduce JobTracker and TaskTracker.

    However, DSE has supported built-in Spark since DSE 4.5 and Bring-Your-Own-Spark (BYOS) since DSE 5.0, and that support continues today. Because Spark depends on certain Hadoop libraries on the server and the client, DSE continues to ship with Hadoop libraries that are required for running Spark and BYOS.

    To view the included Hadoop libraries, see DataStax Enterprise 5.1.x third-party software.

Important: DataStax recommends the latest patch release. The latest version of DataStax Enterprise 5.1 is 5.1.17. Because of the potential data loss for INSERTs with very large TTLs (DSP-15412), DataStax does not recommend DSE 5.1.0 through 5.1.2 for production.
Attention: TTL expiration timestamps are susceptible to the year 2038 problem. If the TTL value is long and an expiration date is greater than the maximum threshold of 2038-01-19T03:14:06+00:00, the data is immediately expired and purged on the next compaction. When using a long TTL, DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action.

18 April 2017

5.1.0 Components

All components from DSE 5.1.0 are listed.
  • Apache Cassandra™ 3.10.0.1652
  • Apache Solr™ 6.0.1.0.1596
  • Apache Spark™ 2.0.2.6
  • Apache TinkerPop™ 3.2.5-20170222-de2f4034
  • Apache Tomcat® 8.0.37
  • DataStax Spark Cassandra Connector 2.0.1
  • DSE Java Driver 1.2.2
  • DSEFS 5.1.24
  • Netty 4.0.42.Final
  • Spark Jobserver 0.6.2.234 (requires compatible API)
  • Select Hadoop libraries

5.1.0 New features

See DataStax Enterprise 5.1 new features.

5.1.0 Experimental features

These features are experimental. DataStax does not support these experimental features for production:
  • Partitioned vertex tables (PVT) for handling supernodes in DSE Graph.

    Used for vertices that have a very large number of edges, a partitioned vertex consists of a portion of a vertex's data that results from dividing the vertex into smaller components for graph database storage.

  • Importing graphs using DseGraphFrame.
  • The dsetool index_checks use an Apache Lucene® experimental feature.
  • SASI indexes.
  • Structured streaming operations to and from DSEFS use a Spark ALPHA feature.
  • A DSEFS file system that spans multiple data centers.
  • Labs features in OpsCenter.

5.1.0 Changes and enhancements

5.1.0 DataStax Enterprise changes and enhancements

  • Add proxy authentication to DSE authentication model. (DSP-3800), (DSP-8467)
  • TimeWindowCompactionStrategy (TWCS) is set on dse_perf tables. To use TWCS on tables that were created in earlier releases, alter the tables after upgrade to DSE 5.1. (DSP-5560)
  • MemoryOnlyStrategy works with compression. (DSP-6715)
  • Add metrics for dropped mutations in Performance Object. (DSP-7936)
  • DSE server startup time is improved. (DSP-9545)
  • DateTieredCompactionStrategy (DTCS) is deprecated. Use TimeWindowCompactionStrategy (TWCS) instead. (DSP-9740)
  • Add tab completion to cqlsh for DSE custom compaction strategies. (DSP-9864)
  • Slow query log includes trace ID. (DSP-10055)
  • Support for setting row-level permissions. Setting row-level permissions with row-level access control (RLAC) is not supported for use with DSE Search or DSE Graph. (DSP-10093)
  • For G1GC the max heap size cap increased from 8192 MB to 32765 MB. See also Java performance tuning. (DSP-10459)
  • Change compaction strategy used by CassandraAuditWriter. (DSP-11508)
  • Implement dsetool command for printing most recent slowest queries. (DSP-11152)
  • Improved performance and changed defaults for CQL slow query logs. (DSP-11171)
  • Upgrades to DataStax Enterprise 5.1 are supported only from DataStax Enterprise 5.0. Upgrades from earlier versions require an interim upgrade to DSE 5.0. (DSP-11281)
  • The default authenticator is DseAuthenticator and default authorizer is DseAuthorizer in cassandra.yaml. Review and adjust your security settings after upgrading to DSE 5.1. (DSP-12211)
  • Authenticators other than DseAuthenticator and authorizers other than DseAuthorizer were deprecated in DSE 5.0; in DSE 5.1 some security features might not work correctly if other authenticators or authorizers are used. (DSP-12542)
  • Improved help for CQL and cqlsh commands. (DSP-12845)

    In cqlsh, type help to list all available topics. Type help name to find out more about the name command. For example, help CAPTURE or help ALTER_KEYSPACE.

  • Only perform drop below RF check on decommission for non-partitioned keyspaces. (DSP-13054)
  • Fix SmallInt and TinyInt serialization. (DSP-12916)
  • Check for null/empty password before calling legacyAuthenticate from CassandraLoginModule. (DSP-8573)
  • Allow registering user expression on SELECT statement. (DSP-12549)
  • Apply request timeout in cqlsh COPY correctly, after upgrading to execution profiles. (DSP-12698)
  • Update Java driver to DSE driver version 1.2.0-eap5. (DSP-11964)
  • Fix AssertionError in continuous paging request on select count(*) query. (DSP-11964)
  • Update internal DSE driver and fix formatting for Duration type. (DSP-11964)
  • Replace open source Python driver with DataStax Enterprise driver. (DSP-11964)
  • Fix OutOfSpaceTest. (DSP-12239)
  • Allow to add index restrictions to SELECT in an immutable way. (DSP-12239)
  • Allow grammar extensions to be added to cqlsh for tab completion. (DSP-12150)
  • Improve compaction performance. (DSP-11695)
  • Add client warning to SASI index. (DSP-11695)
  • Add support for UNSET values to cqlsh COPY FROM command. (DSP-11695)
  • Improve error message for incompatible authentication and authorization configuration. (DSP-11695)
  • Implement optimized continuous paging. (DSP-11695)
  • Added show-queries, query-log-file, and no-progress log options to cassandra-stress. (DSP-9476)
  • Allow large partition generation in cassandra-stress user mode. (DSP-9476)
  • Optimize variable sized integer (VIntCoding) and DataOutputStreamPlus interface using a ByteBuffer to stage writes (BufferedDataOutputStreamPlus). (DSP-9476)
  • Improve metrics and reduce overhead under resource contention. (DSP-9476)
  • Performance improvement: Make SinglePartitionReadCommand::queriesMulticellType() faster. (DSP-9476)
  • Accept internal resource name in GRANT/REVOKE statements. (DSP-11746)
  • Improve StatementRestrictions::getPartitionKeys() execution speed. (DSP-11724)
  • Move responsibility for qualifying keyspace in authorization statements to IResource. (DSP-11588)
  • Insert default superuser role with fixed timestamp. (DSP-11600)
  • Make permissions extensible. (DSP-11600)
  • Make IResource more easily extensible. (DSP-11600)
  • Add method to IAuthenticator to login by user, as well as by role. (DSP-11600)
  • Add private protocol version. (DSP-11535)

5.1.0 DSE Advanced Replication changes and enhancements

DSE Advanced Replication (V2) is CDC-based and provides substantial improvements. CDC must be enabled in Cassandra. Migration from DSE 5.0 Advanced Replication (V1) to DSE 5.1 Advanced Replication (V2) is required.
  • DSE Advanced Replication certified for use with DSE Multi-Instance. (DSP-10738)
  • Support replication to multiple clusters. (DSP-8352)
  • Support multi-DC edge (source) cluster configurations. (DSP-8744)
  • Implement DSE Advanced Replication using Cassandra CDC (Change data capture). (DSP-9822)
  • Support for setting row-level permissions. (DSP-10727)

    Row-level access control (RLAC) security on the destination cluster. (DSP-10893)

  • Added support for migration. Migration from DSE 5.0 Advanced Replication (V1) to DSE 5.1 Advanced Replication (V2). (DSP-12280)
  • Performance metrics enhancements, including gauge metric type and Transmission group metrics. (DSP-12922)

5.1.0 DSE Analytics changes and enhancements

  • Implement WebHDFS REST interface on DSEFS. (DSP-2347)
  • Enable optional running Spark executor as a separate user. (DSP-4252)
  • Opaquely use Solr indexes to optimize SparkSQL queries. (DSP-5028)
  • The dsetool listjt command is removed and replaced with Automatic Spark Master election. (DSP-5944)
  • DSEFS support in BYOS. (DSP-8888)
  • Support SSL in the Spark Master and Worker UI. (DSP-9928)

    In dse.yaml, the spark_encryption_options are no longer valid.

  • Hive connector is removed. CassandraHive Metastore is used by Spark SQL. Hive cql/cassandra handler are removed. (DSP-10333)
  • BYOHadoop and DSE Hadoop are removed. (Deprecated in DSE 5.0) (DSP-10408)
  • Faster locking in DSEFS and support for shared locks. (DSP-11145)
  • Geo types are supported in DSE SparkSQL and represented as Well-Known Text (WKT). (DSP-11173)
  • A new CQL-based Resource Manager for Spark manages communication between the client application and the server to provide more secure communication. See Monitoring Spark with the web interface. (DSP-11331)
    Note: If authentication is enabled, Spark Master web UI will prompt for credentials after upgrading to DSE 5.1.0 or later. See Security changes in the Upgrading from DataStax Enterprise 5.0 to 5.1 documentation.
  • Analytics jobs run through dse spark-submit can take advantage of continuous paging for performance gains. See Enabling continuous paging. (DSP-11343)
  • Access DSEGraphFrame tables through SparkSQL. (DSP-11898)
  • Enable authentication for server side Spark UIs. (DSP-11955)
  • Enhanced dse client-tool spark. (DSP-12048)
  • Programmatically setting the shuffle parameter using conf.set("spark.shuffle.service.port", port) is not supported. Instead, use dse spark-submit, which automatically sets the correct service port based on the authentication state. (DSP-12471)
  • Spark Jobserver has been upgraded to 0.6.2.234. This custom version requires applications to be recompiled using the compatible DataStax Spark Jobserver API (recommended) or jobserver 0.7.0. (DSP-12478)

5.1.0 DSE Graph changes and enhancements

  • The default number of threads used for loading vertices (load_vertex_threads) or edges (load_edge_threads) is changed from 1 to 0. (DGL-124)
  • When a query fails due to a timeout, the error message states which timeout was exceeded. (DSP-9393)
  • Add ifExists to drop graph. (DSP-9511)
  • Database errors related to graph queries go directly to drivers. (DSP-9567)
  • The format of edge IDs changed. There is no user impact. (DSP-10566)
  • Reject out of bounds geo data. (DSP-10748)
  • Disable graph#io. (DSP-10804)
  • Improve Graph and Spark integration for performance and usability with DSEGraphFrame framework for batch graph queries. (DSP-11104)
  • Prevent external Solr schema changes from being overwritten by DSE Graph. (DSP-11226)
  • Support Date type in Graph. (DSP-11287)
  • Graph-specific MBeans moved from datastore-latencies to request-latencies category. (DSP-11521)
  • Support for Solr-based fuzzy search in graph. (DSP-11273)
  • DSE Graph API support for edit distance queries. (DSP-11880)
  • Search regex '.' now matches all whitespace. (DSP-11952)
  • Kryo version conflict. (DSP-11984)
  • Add DSEG snapshot config mutator. (DSP-12072)
  • Setting Spark properties from Gremlin. (DSP-12296)
  • The Geo interfaces for distance and polygon queries are changed in the driver. (DSP-12710)
  • Changes in Geo predicates. (DSP-12467)

5.1.0 DSEFS changes and enhancements

  • DSEFS commands for controlling file permissions and ownership. (DSP-10582)
  • Tab autocompletion is supported. (DSP-10584)
  • Support for file compression. (DSP-10655)
  • Enhanced local file system operations in DSEFS shell. (DSP-10933)
  • Add comment (#) support in DSEFS shell. (DSP-10935)
  • Expose DSEFS metrics via JMX. (DSP-11375)
  • Improve DSEFS user experience: human readable sizes (-h) and single column output (-1). (DSP-11675)
  • Fix recursive ls parameter name: change -r to -R. (DSP-12016)
  • Make name_id part of primary key in names table. Improved DSEFS Cassandra schema to improve recovery of all metadata from inconsistency caused by concurrent writes. Upgrades to DataStax Enterprise 5.1 require steps to get new schema. (DSP-12450)
  • Although DSEFS is enabled by default in DSE 5.1.0, the dsefs_options.enabled setting is commented out in the new DSE 5.1.0 dse.yaml file. To enable DSEFS, uncomment the dsefs_options.enabled setting after upgrade to DSE 5.1.0. (DSP-13310)

5.1.0 DSE Search changes and enhancements

DSE Search in DataStax Enterprise 5.1 uses Apache Solr 6.0. (DSP-9748) This significant change requires advanced planning and specific actions before and after the upgrade.
Important: To upgrade DSE Search and SearchAnalytics workloads, you must follow the specific steps in upgrading to DSE 5.1.
  • DataImportHandler is no longer supported. The import handler tab is removed from Solr Admin UI. Before upgrading to DSE 5.1, remove all data import handlers from solrconfig files. (DSP-6266)
  • Remove the legacy netty-based inter-node communication protocol. See /en/upgrade/doc/upgrade/datastax_enterprise/upgdDSE51.html#upgdDSE51__prepUpg51Search. The timeout for non-query search requests like core creation and distributed deletes is set in internode_messaging_options with the client_request_timeout_seconds option. (DSP-6933)
  • Automatically index both analyzed and non-analyzed versions of textual vertex properties. (DSP-7633)
  • Check for index integrity with dsetool using lucene CheckIndex. (DSP-8875)
  • New DSE Search index management commands to manage cluster-wide search indexes. (DSP-9204)
  • Lucene merge scheduling and lack of parallelism cause periods of 0 throughput. (DSP-9325)

    In earlier releases, the default mergeScheduler settings in solrconfig.xml were not set appropriately. The default settings are now set automatically and appropriately, unless a custom mergeScheduler configuration is provided.

  • Deprecated Solr field types require action before upgrade to DSE 5.1. (DSP-9509)
  • HTTP writes are deprecated. Insert data into DSE by using CQL. (DSP-9540)
  • dsetool search commands use the CQL index management commands. dsetool create_core no longer supports deleteAll. (DSP-9762)
  • DateRangeField support with new DateRangeType data type. (DSP-10225)
  • Improved asynchronous indexing performance. (DSP-10617)
  • Add more checks to CassandraSolrConfig for unwanted config elements. (DSP-10677)
  • LUCENE-7299 Optimized segment flushing with radix sort. (DSP-10685)
  • Changes in default behavior for auto-generated schemas to enable DocValues. (DSP-10690)
  • XML correctly indented to improve readability for auto-generated resources. (DSP-10795)
  • When using SpatialRecursivePrefixTreeFieldType (RPT) in search schemas, replace the units field type with distanceUnits after upgrading to DSE 5.1. (DSP-10802)
  • Optimize Solr query parser to use filter boolean queries. (DSP-10916)
  • Stored=true copy fields are not supported and cause schema validation to fail. Before upgrading to 5.1, you must change the stored attribute value of a copyField directive from true to false in the schema.xml file and reload the core. (DSP-11087)
  • PER PARTITION clause is not supported for DSE Search solr_query queries. (DSP-11050)
  • Support limiting queries by time with the Solr timeAllowed parameter. DSE Search differences apply. (DSP-11165)
  • Improve client-side mapping of DSE Search exceptions. (DSP-11315)
  • Default batch size for the search TTL Process is changed. (DSP-11493)

    When a value is not specified for ttl_index_rebuild_options.max_docs_per_batch in dse.yaml, the default is changed from 100 to 4096.

  • DSE Search does not support the duration Cassandra data type. (DSP-11825)
  • Improved error handling for authentication and authorization of Solr HTTP requests and Solr Admin UI. (DSP-12550)

    Requests that fail due to lack of permissions return a 403 error, not a 401 error that was returned in earlier versions.

  • Add support for unfrozen tuples. (DSP-12347)
  • Improve default selection for dse.yaml and solrconfig.xml write path configuration. See Tuning search for maximum indexing throughput. (DSP-12491)

5.1.0 Known issues

Known issues for DSE:
  • sstableloader incorrectly detects keyspace when working with snapshots. (DB-2649)
    Workaround: Create a directory that matches the keyspace name, and then create a symbolic link in that directory that points to the snapshot directory and is named after the destination table. For example:
    mkdir -p /var/tmp/keyspace1
    ln -s <path>/cassandra/data/keyspace1/standard1-0e65b961deb311e88daf5581c30c2cd4/snapshots/data-load /var/tmp/keyspace1/standard1
  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.8, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • Potential data loss for INSERTs with very large TTLs. TTL expiration timestamps are susceptible to the year 2038 problem. (DSP-15412)
    The maximum expiration timestamp that can be represented by the storage engine is 2038-01-19T03:14:06+00:00, which means that inserts with TTLs that expire after this date are not currently supported. There is no protection against INSERTs with a TTL expiring after the maximum supported date, so the expiration time field overflows and the records expire immediately. TTLs are considered "very large" when close to the maximum allowed value of 630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL is gradually reduced as the maximum expiration date approaches. For instance, an insert on 2028-01-19T03:14:06 with a TTL of 10 years is impacted.
    Warning: Upgrade to DSE 5.1.7 or later and take required action to protect against overflow of local expiration time.
  • Even with nodetool repair -full or nodetool repair -pr, repairs in DSE 5.1.0-5.1.2 are run as incremental and mark SSTables as repaired, causing anti-compaction. (DSP-14464)
  • DataStax Enterprise will not run with Java 1.8u161 or later. (DSP-15277)
  • Potential data loss for INSERTs with very large TTLs, where "very large" is close to the maximum allowed value of 630720000 seconds (20 years), starting from 2018-01-19T03:14:06+00:00. As time progresses, the maximum supported TTL is gradually reduced as the maximum expiration date approaches. For instance, an insert on 2028-01-19T03:14:06 with a TTL of 10 years is impacted. If you use very large TTLs, DataStax strongly recommends upgrading to 5.1.7 or later. (DSP-15412)
  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
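
The TTL overflow condition described for DSP-15412 above can be checked before writing. The following is a minimal Python sketch, not DSE code: the function names ttl_overflows and max_safe_ttl are hypothetical, and the only inputs taken from the release note are the maximum expiration timestamp and the maximum TTL. All arithmetic is in seconds.

```python
from datetime import datetime, timedelta, timezone

# Values quoted in the release note above.
MAX_EXPIRATION = datetime(2038, 1, 19, 3, 14, 6, tzinfo=timezone.utc)
MAX_TTL_SECONDS = 630720000  # maximum allowed TTL (20 years)


def ttl_overflows(insert_time, ttl_seconds):
    """Return True if insert_time + TTL lands after the maximum expiration.

    Hypothetical helper: insert_time must be a timezone-aware datetime.
    """
    if not 0 <= ttl_seconds <= MAX_TTL_SECONDS:
        raise ValueError("TTL must be between 0 and 630720000 seconds")
    return insert_time + timedelta(seconds=ttl_seconds) > MAX_EXPIRATION


def max_safe_ttl(insert_time):
    """Return the largest TTL (in seconds) that does not overflow."""
    remaining = int((MAX_EXPIRATION - insert_time).total_seconds())
    return max(0, min(remaining, MAX_TTL_SECONDS))
```

A client could call max_safe_ttl(datetime.now(timezone.utc)) and cap its TTLs at the returned value. This only guards the application side; it does not replace the upgrade and the required action named in the warning.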

Known issue for DSE Analytics:

  • The "remember me" feature used by the Shiro 1.2.4 library and also used by the Spark Job Server is vulnerable to malicious attackers. Do not enable the "remember me" feature in a custom shiro.ini file if you defined one in application.conf.

    DSE does not enable the "remember me" feature by default. (DSP-11072)

  • Apache Spark local privilege escalation vulnerability: CVE-2018-11760. DataStax recommends not using PySpark in multi-user environments. (DSP-18225)
Known issues for DSE Search:
  • Upgrades from 5.0.x to DSE 5.1.0-5.1.5 continuously exchange schema, which can possibly lead to compactions backing up. DataStax recommends upgrading to the latest version, 5.1.17. (DB-1477)
  • DateRange parsing improperly rolls over month, day, hour, min, seconds when invalid dates in a date range are specified. (DSP-12480)
  • DSE Search might miss token filtering on mixed versions clusters. Upgrade all nodes to DSE 5.1.6 or later for correct token filtering. (DSP-14998)
  • Skip DSE 5.1.0 and upgrade directly to DSE 5.1.1 if you:
    • Use the HTTP interface. (DSP-13318), (DSP-13270)
    • Have a Thrift column family backing an active Solr core. (DSP-13019)
    • Use TTL to expire data. (DSP-12960)
    • Use index encryption. (DSP-13155), (DSP-12620)
    • Use live indexing. (DSP-12040), (DSP-12941)
  • Solr listens only on port 8080 regardless of configuration. (DSP-13187)
  • Auto generated solrconfig.xml has invalid requestHandler for JSON core creations after upgrade to 5.1.0. (DSP-13188)
    If you make HTTP writes with JSON documents (deprecated), then change the auto generated solrconfig.xml:
    <requestHandler name="/update/json" class="solr.UpdateUpdateRequestHandler" startup="lazy"/>
    to
    <requestHandler name="/update/json" class="solr.UpdateRequestHandler" startup="lazy"/>
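
When several auto-generated solrconfig.xml resources need the correction above, the substitution can be scripted. This is a minimal Python sketch under stated assumptions: fix_update_json_handler is a hypothetical helper that operates on the config text only, and how the file is retrieved from and applied back to the search core is outside its scope.

```python
# The invalid and corrected class attributes, as shown in the known issue.
BAD_CLASS = 'class="solr.UpdateUpdateRequestHandler"'
GOOD_CLASS = 'class="solr.UpdateRequestHandler"'


def fix_update_json_handler(solrconfig_text):
    """Return the config text with the invalid handler class corrected.

    Plain string substitution; text that is already correct is returned
    unchanged, so the function is safe to run more than once.
    """
    return solrconfig_text.replace(BAD_CLASS, GOOD_CLASS)
```

Plain text replacement is used instead of an XML parser so that comments and formatting in the generated file are left untouched.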

5.1.0 Resolved issues

5.1.0 DataStax Enterprise core resolved issues

  • Recent worst queries for slow query log. (DSP-5088)

    New configurable cql_slow_log_options.

  • dse lib has old metrics core version. (DSP-11389)
  • cqlsh SOURCE command shouldn't assume PlainTextAuthenticator. (DSP-12773)

5.1.0 DSE Advanced Replication resolved issues

  • Fix authentication and encryption settings for SSL remote cluster connections. (DSP-9470)

5.1.0 DSE Analytics resolved issues

  • Make dse client-tool sql-schema command consistent with double-dash parameters. (DSP-10557)
  • CFS repair can repair only the default file system as defined in Hadoop configuration. (DSP-12481)

5.1.0 DSE Graph resolved issues

  • Search.tokenRegex() is case sensitive. (DSP-9425)
  • Graph not working properly with Kerberos with serializeResultToString: true. (DSP-12201)
  • Enable split-DC graph ID allocation. (DSP-12516)
  • geo.distance(lng,lat,radius) expresses radius in degrees rather than kilometers. (DSP-12415)
  • Align distance query behavior between vertex properties with and without search indexes. (DSP-12673)

5.1.0 DSE Search resolved issues

  • Solr range facets before, after, and between return incorrect and inconsistent results on multinode clusters. (DSP-4485)
  • Validate auto generated resources before writing them. (DSP-7638)
  • Support for non-frozen UDTs. Solr field name policy applies to DSE Search. See the field name information in Apache Solr and Apache Lucene limitations. (DSP-11412)
  • Users require SELECT permissions on any search index that they view. Specific permissions are required for all core operations when using the Solr Admin UI. (DSP-11910)
  • QueryUtils#getStandardVertexIdComponents is not thread safe. (DSP-12254)
  • Core is not correctly unloaded on restarted nodes. (DSP-12434)
  • Native driver connections in dsetool aren't isolated to specified host. (DSP-12438)
  • Heap is exhausted while search reindexes very wide partitions. New IndexPool MBean attributes. (DSP-12547)
  • Concurrent sorting issue with RT. (DSP-12600)
  • Disable redundant, experimental, and other Solr 6 features. (DSP-13093)

Cassandra enhancements for DSE 5.1.0

A list of DataStax Enterprise 5.1.0 enhancements to Apache Cassandra™ 3.10.0.

DataStax Enterprise (DSE) 5.1.0 includes all changes from earlier DSE releases. These production-certified changes are enhancements to Apache Cassandra™ 3.10.0. (For Cassandra updates, see CHANGES.txt.)

  • Fix testLimitSSTables flake caused by concurrent flush (CASSANDRA-12820)
  • cdc column addition strikes again (CASSANDRA-13382)
  • Fix static column indexes (CASSANDRA-13277)
  • DataOutputBuffer.asNewBuffer broken (CASSANDRA-13298)
  • unittest CipherFactoryTest failed on MacOS (CASSANDRA-13370)
  • Forbid SELECT restrictions and CREATE INDEX over non-frozen UDT columns (CASSANDRA-13247)
  • Default logging we ship will incorrectly print "?:?" for "%F:%L" pattern (CASSANDRA-13317)
  • Possible AssertionError in UnfilteredRowIteratorWithLowerBound (CASSANDRA-13366)
  • Support unaligned memory access for AArch64 (CASSANDRA-13326)
  • Improve SASI range iterator efficiency on intersection with an empty range (CASSANDRA-12915)
  • Fix equality comparisons of columns using the duration type (CASSANDRA-13174)
  • Obfuscate password in stress-graphs (CASSANDRA-12233)
  • Move to FastThreadLocalThread and FastThreadLocal (CASSANDRA-13034)
  • nodetool stopdaemon errors out (CASSANDRA-13030)
  • Tables in system_distributed should not use gcgs of 0 (CASSANDRA-12954)
  • Fix primary index calculation for SASI (CASSANDRA-12910)
  • More fixes to the TokenAllocator (CASSANDRA-12990)
  • NoReplicationTokenAllocator should work with zero replication factor (CASSANDRA-12983)
  • Address message coalescing regression (CASSANDRA-12676)
  • Fix possible NPE on upgrade to 3.0/3.X in case of IO errors (CASSANDRA-13389)
  • Legacy deserializer can create empty range tombstones (CASSANDRA-13341)
  • Legacy caching options can prevent 3.0 upgrade (CASSANDRA-13384)
  • Use the Kernel32 library to retrieve the PID on Windows and fix startup checks (CASSANDRA-13333)
  • Fix code to not exchange schema across major versions (CASSANDRA-13274)
  • Dropping column results in "corrupt" SSTable (CASSANDRA-13337)
  • Bugs handling range tombstones in the sstable iterators (CASSANDRA-13340)
  • Fix CONTAINS filtering for null collections (CASSANDRA-13246)
  • Use a unique metric reservoir per test run when using Cassandra-wide metrics residing in MBeans (CASSANDRA-13216)
  • Propagate row deletions in 2i tables on upgrade (CASSANDRA-13320)
  • Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305)
  • Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238)
  • Prevent data loss on upgrade 2.1 - 3.0 by adding component separator to LogRecord absolute path (CASSANDRA-13294)
  • Improve testing on macOS by eliminating sigar logging (CASSANDRA-13233)
  • Cqlsh copy-from should error out when csv contains invalid data for collections (CASSANDRA-13071)
  • Update c.yaml doc for offheap memtables (CASSANDRA-13179)
  • Faster StreamingHistogram (CASSANDRA-13038)
  • Legacy deserializer can create unexpected boundary range tombstones (CASSANDRA-13237)
  • Remove unnecessary assertion from AntiCompactionTest (CASSANDRA-13070)
  • Fix cqlsh COPY for dates before 1900 (CASSANDRA-13185)
  • Use keyspace replication settings on system.size_estimates table (CASSANDRA-9639)
  • Add vm.max_map_count StartupCheck (CASSANDRA-13008)
  • Hint related logging should include the IP address of the destination in addition to host ID (CASSANDRA-13205)
  • Reloading logback.xml does not work (CASSANDRA-13173)
  • Lightweight transactions temporarily fail after upgrade from 2.1 to 3.0 (CASSANDRA-13109)
  • Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9 (CASSANDRA-13125)
  • Fix UPDATE queries with empty IN restrictions (CASSANDRA-13152)
  • Fix handling of partition with partition-level deletion plus live rows in sstabledump (CASSANDRA-13177)
  • Provide user workaround when system_schema.columns does not contain entries for a table that's in system_schema.tables (CASSANDRA-13180)
  • Honor truststore-password parameter in cassandra-stress (CASSANDRA-12773)
  • Discard in-flight shadow round responses (CASSANDRA-12653)
  • Don't anti-compact repaired data to avoid inconsistencies (CASSANDRA-13153)
  • Wrong logger name in AnticompactionTask (CASSANDRA-13343)
  • Commitlog replay may fail if last mutation is within 4 bytes of end of segment (CASSANDRA-13282)
  • Fix queries updating the same list multiple times (CASSANDRA-13130)
  • Fix GRANT/REVOKE when keyspace isn't specified (CASSANDRA-13053)
  • Avoid race on receiver by starting streaming sender thread after sending init message (CASSANDRA-12886)
  • Fix "multiple versions of ant detected..." when running ant test (CASSANDRA-13232)
  • Coalescing strategy sleeps too much (CASSANDRA-1309)
  • Fix flaky LongLeveledCompactionStrategyTest (CASSANDRA-12202)
  • Fix failing COPY TO STDOUT (CASSANDRA-12497)
  • Fix ColumnCounter::countAll behaviour for reverse queries (CASSANDRA-13222)
  • Exceptions encountered calling getSeeds() breaks OTC thread (CASSANDRA-13018)
  • Fix negative mean latency metric (CASSANDRA-12876)
  • Use only one file pointer when creating commitlog segments (CASSANDRA-12539)
  • Remove unused repositories (CASSANDRA-13278)
  • Log stacktrace of uncaught exceptions (CASSANDRA-13108)
  • Use portable stderr for java error in startup (CASSANDRA-13211)
  • Fix Thread Leak in OutboundTcpConnection (CASSANDRA-13204)
  • Coalescing strategy can enter infinite loop (CASSANDRA-13159)

General upgrade advice for DSE 5.1.0

General upgrade advice for DataStax Enterprise 5.1.0.

Carefully review all planning and upgrade documentation in the Upgrading DataStax Enterprise guide. This general advice applies to the database upgrade and does not replace the upgrade documentation.

  • General upgrading advice for any version and new features for Apache Cassandra are in NEWS.txt. Be sure to read NEWS.txt all the way back to your current version.
  • See also the Apache Cassandra changes in CHANGES.txt.

DataStax Enterprise 5.1.0 includes Apache Cassandra™ 3.10.0.

New features in Cassandra 3.10

  • New `DurationType` (cql duration). See CASSANDRA-11873.
  • Runtime modification of concurrent_compactors is now available via nodetool.
  • Support for the assignment operators +=/-= has been added for update queries.
  • An Index implementation may now provide a task which runs prior to joining the ring. See CASSANDRA-12039
  • Filtering on partition key columns is now also supported for queries without secondary indexes.
  • A slow query log has been added: slow queries will be logged at DEBUG level. For more details refer to CASSANDRA-12403 and slow_query_log_timeout_in_ms in cassandra.yaml.
  • Support for GROUP BY queries has been added.
  • A new compaction-stress tool has been added to test the throughput of compaction for any cassandra-stress user schema. See compaction-stress help for usage.
  • Compaction can now take into account overlapping tables that don't take part in the compaction to look for deleted or overwritten data in the compacted tables. When such data is found, it can be safely discarded, which in turn should enable the removal of tombstones over that data.
    The behavior can be engaged in two ways:
    • as a "nodetool garbagecollect -g CELL/ROW" operation, which applies single-table compaction on all sstables to discard deleted data in one step
    • as a "provide_overlapping_tombstones:CELL/ROW/NONE" compaction strategy flag, which uses overlapping tables as a source of deletions/overwrites during all compactions.
    The argument specifies the granularity at which deleted data is to be found:
    • If ROW is specified, only whole deleted rows (or sets of rows) will be discarded.
    • If CELL is specified, any columns whose value is overwritten or deleted will also be discarded.
    • NONE (default) specifies the old behavior, overlapping tables are not used to decide when to discard data.
    Which option to use depends on your workload. Both ROW and CELL increase the disk load on compaction (especially with the size-tiered compaction strategy), with CELL being more resource-intensive. Both should lead to better read performance if deleting rows (or, for CELL, overwriting or deleting cells) is common.
  • Prepared statements are now persisted in the table prepared_statements in the system keyspace. Upon startup, this table is used to preload all previously prepared statements - i.e. in many cases clients do not need to re-prepare statements against restarted nodes.
  • cqlsh can now connect to older Cassandra versions by downgrading the native protocol version. Please note that this is currently not part of our release testing and, as a consequence, it is not guaranteed to work in all cases. See CASSANDRA-12150 for more details.
  • Snapshots that are automatically taken before a table is dropped or truncated will have a "dropped" or "truncated" prefix on their snapshot tag name.
  • Metrics are exposed for successful and failed authentication attempts. These can be located using the object names org.apache.cassandra.metrics:type=Client,name=AuthSuccess and org.apache.cassandra.metrics:type=Client,name=AuthFailure respectively.
  • Add support to "unset" JSON fields in prepared statements by specifying DEFAULT UNSET. See CASSANDRA-11424 for details.
  • Allow TTL with null value on insert and update. It will be treated as equivalent to inserting a 0.
  • Removed outboundBindAny configuration property. See CASSANDRA-12673 for details.
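
The two ways of engaging the overlapping-tombstone behavior listed above can be sketched as generated statements. This Python sketch is illustrative only: the helper names, keyspace, table, and strategy class are hypothetical placeholders, while the option name provide_overlapping_tombstones, its CELL/ROW/NONE values, and the nodetool garbagecollect -g flag come from the notes above.

```python
VALID_GRANULARITIES = ("NONE", "ROW", "CELL")


def alter_table_overlapping_tombstones(keyspace, table, strategy_class, granularity):
    """Build a CQL statement that sets the compaction strategy flag.

    Hypothetical helper. Setting the compaction map replaces the whole map,
    so any other compaction options in use would have to be added as well.
    """
    granularity = granularity.upper()
    if granularity not in VALID_GRANULARITIES:
        raise ValueError("granularity must be NONE, ROW, or CELL")
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = "
        f"{{'class': '{strategy_class}', "
        f"'provide_overlapping_tombstones': '{granularity}'}};"
    )


def garbagecollect_command(keyspace, table, granularity):
    """Build the one-off nodetool command for a single table."""
    granularity = granularity.upper()
    if granularity not in ("ROW", "CELL"):
        raise ValueError("granularity must be ROW or CELL")
    return f"nodetool garbagecollect -g {granularity} {keyspace} {table}"
```

For example, alter_table_overlapping_tombstones("ks1", "events", "SizeTieredCompactionStrategy", "row") yields a statement that enables row-level discarding during all compactions of that table, whereas garbagecollect_command("ks1", "events", "CELL") yields a one-step cleanup command.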

Advice for upgrades to Cassandra 3.10

  • Support for altering the types of columns in already defined tables and of UDT fields has been disabled. If it is necessary to return a different type, please use casting instead. See CASSANDRA-12443 for more details.
  • Specifying the default_time_to_live option when creating or altering a materialized view was erroneously accepted (and ignored). It is now properly rejected.
  • Only Java and JavaScript are now supported UDF languages. The sandbox in 3.0 already prevented the use of script languages except Java and JavaScript.
  • Compaction now correctly drops sstables out of CompactionTask when there isn't enough disk space to perform the full compaction. This should reduce pending compaction tasks on systems with little remaining disk space.
  • Request timeouts in cassandra.yaml (read_request_timeout_in_ms, etc) now apply to the "full" request time on the coordinator. Previously, they only covered the time from when the coordinator sent a message to a replica until the time that the replica responded. Additionally, the previous behavior was to reset the timeout when performing a read repair, making a second read to fix a short read, and when subranges were read as part of a range scan or secondary index query. In 3.10 and higher, the timeout is no longer reset for these "subqueries". The entire request must complete within the specified timeout. As a consequence, your timeouts may need to be adjusted to account for this. See CASSANDRA-12256 for more details.
  • Logs written to stdout are now consistent with logs written to files. Time is now local (it was UTC on the console and local in files). Date, thread, file, and line info were added to stdout. (See CASSANDRA-12004.)
  • The 'clientutil' jar, which has been somewhat broken on the 3.x branch, is no longer provided. The features provided by that jar are provided by any good Java driver, and we advise relying on drivers rather than on that jar. If you need that jar for backward compatibility until you do so, use the version provided on a previous Cassandra branch, like the 3.0 branch (by design, the functionality provided by that jar is stable across versions, so using the 3.0 jar for a client connecting to 3.x should work without issues).
  • (Tools development) DatabaseDescriptor no longer implicitly starts up components/services like commit log replay. This may break existing 3rd party tools and clients. To start up a standalone tool or client application, use the DatabaseDescriptor.toolInitialization() or DatabaseDescriptor.clientInitialization() methods. Tool initialization sets up partitioner, snitch, and encryption context. Client initialization just applies the configuration but does not set up anything. Instead of using Config.setClientMode() or Config.isClientMode(), which are now deprecated, use one of the appropriate new methods in DatabaseDescriptor.
  • Application layer keep-alives were added to the streaming protocol to prevent idle incoming connections from timing out and failing the stream session (CASSANDRA-11839). This effectively deprecates the streaming_socket_timeout_in_ms property in favor of streaming_keep_alive_period_in_secs. See cassandra.yaml for more details about this property.
  • Duration literals support the ISO 8601 format. As a consequence, identifiers matching that format (e.g. P2Y or P1MT6H) are no longer supported (CASSANDRA-11873).
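
The request timeout change described above (CASSANDRA-12256) can be illustrated with a deliberately simplified model. In this Python sketch, both function names are hypothetical and a request is reduced to a list of sequential subquery durations in milliseconds (for example an initial read followed by a read repair); it is not how the coordinator is implemented.

```python
def completes_before_310(subquery_ms, timeout_ms):
    """Simplified pre-3.10 model: the timeout restarts for each subquery."""
    return all(duration <= timeout_ms for duration in subquery_ms)


def completes_from_310(subquery_ms, timeout_ms):
    """Simplified 3.10+ model: one deadline covers the entire request."""
    return sum(subquery_ms) <= timeout_ms
```

With a 5000 ms timeout and two subqueries of 3000 ms each, completes_before_310 returns True while completes_from_310 returns False, which is why timeouts may need to be raised after the upgrade.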

New features in Cassandra 3.8

  • Shared pool threads are now named according to the stage they are executing tasks for. Thread names mentioned in traced queries change accordingly.
  • A new option has been added to cassandra-stress "-rate fixed={number}/s" that forces a scheduled rate of operations/sec over time. Using this, stress can accurately account for coordinated omission from the stress process.
  • The cassandra-stress "-rate limit=" option has been renamed to "-rate throttle="
  • HDR histograms have been added to stress runs. Their output can be saved to disk using the "-log hdrfile=" option. This histogram includes response/service/wait times when used with the fixed or throttle rate options. The histogram file can be plotted on http://hdrhistogram.github.io/HdrHistogram/plotFiles.html
  • TimeWindowCompactionStrategy has been added. This has proven to be a better approach to time series compaction and new tables should use this instead of DTCS. See CASSANDRA-9666 for details.
  • Change-Data-Capture is now available. See cassandra.yaml for cdc-specific flags and a brief explanation of on-disk locations for archived data in CommitLog form. This can be enabled via ALTER TABLE ... WITH cdc=true. Upon flush, CommitLogSegments containing data for CDC-enabled tables are moved to the data/cdc_raw directory until removed by the user, and writes to CDC-enabled tables will be rejected with a WriteTimeoutException once cdc_total_space_in_mb is reached between unflushed CommitLogSegments and cdc_raw. NOTE: CDC is disabled by default in the .yaml file. Do not enable CDC on a mixed-version cluster as it will lead to exceptions which can interrupt traffic. Once all nodes have been upgraded to 3.8 it is safe to enable this feature and restart the cluster.
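
Because writes to CDC-enabled tables are rejected once the CDC space limit is reached, it can help to watch how full the cdc_raw directory is. The following Python sketch makes several assumptions: the function names are hypothetical, the directory path and the configured cdc_total_space_in_mb value are supplied by the caller, and only files in cdc_raw are counted. Unflushed CommitLogSegments also count toward the limit, so the real headroom is smaller than reported here.

```python
import os


def cdc_raw_usage_mb(cdc_raw_dir):
    """Return the total size of all files under cdc_raw_dir in megabytes."""
    total_bytes = 0
    for root, _dirs, files in os.walk(cdc_raw_dir):
        for name in files:
            total_bytes += os.path.getsize(os.path.join(root, name))
    return total_bytes / (1024 * 1024)


def cdc_headroom_fraction(cdc_raw_dir, cdc_total_space_in_mb):
    """Return the fraction of the CDC space limit still free (0.0 to 1.0).

    Optimistic estimate: unflushed commit log segments are not included.
    """
    used_mb = cdc_raw_usage_mb(cdc_raw_dir)
    return max(0.0, 1.0 - used_mb / cdc_total_space_in_mb)
```

A monitoring job could alert when the returned fraction drops below a chosen threshold, prompting the consumer to process and remove segments from cdc_raw.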

Advice for upgrades to Apache Cassandra 3.8

  • The ReversedType behaviour has been corrected for clustering columns of BYTES type containing an empty value. Scrub should be run on the existing SSTables containing a descending clustering column of BYTES type to correct their ordering. See CASSANDRA-12127 for more details.
  • Ec2MultiRegionSnitch will no longer automatically set broadcast_rpc_address to the public instance IP if this property is defined in cassandra.yaml.
  • The names "json" and "distinct" are no longer valid as user-defined function names (they are still valid as column names, however). In the unlikely case where you had defined functions with such names, you will need to recreate those under a different name, change your code to use the new names, and drop the old versions, and this _before_ upgrade (see CASSANDRA-10783 for more details).
  • DateTieredCompactionStrategy has been deprecated - new tables should use TimeWindowCompactionStrategy. Note that migrating an existing DTCS-table to TWCS might cause increased compaction load for a while after the migration so make sure you run tests before migrating. Read CASSANDRA-9666 for background on this.

Advice for upgrades to Apache Cassandra 3.7

  • A maximum size for SSTable values has been introduced, to prevent out of memory exceptions when reading corrupt SSTables. This maximum size can be set via max_value_size_in_mb in cassandra.yaml. The default is 256MB, which matches the default value of native_transport_max_frame_size_in_mb. SSTables will be considered corrupt if they contain values whose size exceeds this limit. See CASSANDRA-9530 for more details.

New features in Apache Cassandra 3.6

  • JMX connections can now use the same auth mechanisms as CQL clients. New options in cassandra-env.(sh|ps1) enable JMX authentication and authorization to be delegated to the IAuthenticator and IAuthorizer configured in cassandra.yaml. The default settings still only expose JMX locally, and use the JVM's own security mechanisms when remote connections are permitted. For more details on how to enable the new options, see the comments in cassandra-env.sh. A new class of IResource, JMXResource, is provided for the purposes of GRANT/REVOKE via CQL. See CASSANDRA-10091 for more details. Also, directly setting JMX remote port via the com.sun.management.jmxremote.port system property at startup is deprecated. See CASSANDRA-11725 for more details.
  • JSON timestamps are now in UTC and contain the timezone information, see CASSANDRA-11137 for more details.
  • Collision checks are performed when joining the token ring, regardless of whether the node should bootstrap. Additionally, replace_address can legitimately be used without bootstrapping to help with recovery of nodes with partially failed disks. See CASSANDRA-10134 for more details.
  • Key cache will only hold indexed entries up to the size configured by column_index_cache_size_in_kb in cassandra.yaml in memory. Larger indexed entries will never go into memory. See CASSANDRA-11206 for more details.
  • For tables that have a default_time_to_live, specifying a TTL of 0 removes the TTL from the inserted or updated values.
  • Startup is now aborted if corrupted transaction log files are found. The details of the affected log files are now logged, allowing the operator to decide how to resolve the situation.
  • Filtering expressions are made more pluggable and can be added programmatically via a QueryHandler implementation. See CASSANDRA-11295 for more details.
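
The TTL rule for tables with a default_time_to_live, listed above, can be sketched as a small decision function. This is a simplified model of the described behavior, not Cassandra code: the function name `effective_ttl` and the use of `None` to mean "no expiry" or "no TTL given in the statement" are assumptions made for illustration.

```python
def effective_ttl(default_time_to_live, statement_ttl=None):
    """Return the TTL in seconds applied to a write, or None for no expiry.

    Hypothetical helper. default_time_to_live is the table option
    (0 means the table has no default); statement_ttl is the TTL given
    with the write, or None when the statement specifies no TTL.
    """
    if statement_ttl is None:
        # No TTL in the statement: fall back to the table default, if any.
        return default_time_to_live if default_time_to_live > 0 else None
    if statement_ttl == 0:
        # An explicit TTL of 0 removes the TTL, overriding the table default.
        return None
    # Any other explicit TTL takes precedence over the table default.
    return statement_ttl
```

For example, with a table default of 3600 seconds, a write without a TTL expires after 3600 seconds, while a write with a TTL of 0 does not expire.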

New features in Apache Cassandra 3.4

  • Internal authentication now supports caching of encrypted credentials. See credentials_validity_in_ms in cassandra.yaml.
  • Remote configuration of auth caches via JMX can be disabled using the system property cassandra.disable_auth_caches_remote_configuration.
  • The sstabledump tool has been added as the 3.0 replacement for the former sstable2json. The tool supports only v3.0+ SSTables. See the tool's help for more detail.
  • The mbean interfaces org.apache.cassandra.auth.PermissionsCacheMBean and org.apache.cassandra.auth.RolesCacheMBean are deprecated in favor of org.apache.cassandra.auth.AuthCacheMBean. This generalized interface is common across all caches in the auth subsystem. The specific mbean interfaces for each individual cache will be removed in a subsequent major version.

New features in Apache Cassandra 3.2

  • We now make sure that a token does not exist in several data directories. This means that we run one compaction strategy per data_file_directory and we use one thread per directory to flush. Use nodetool relocatesstables to make sure your tokens are in the correct place, or just wait and compaction will handle it. See CASSANDRA-6696 for more details.
  • The maximum number of in-flight commit log replay mutation bytes is now bounded to 64 megabytes, tunable via cassandra.commitlog_max_outstanding_replay_bytes.
  • Support for type casting has been added to the selection clause.
  • Hinted handoff now supports compression. Reference cassandra.yaml:hints_compression. Note: hints compression is currently disabled by default.

Advice for upgrades to Apache Cassandra 3.2

  • The compression ratio metrics computation has been modified to be more accurate.
  • Running Cassandra as root is prevented by default.
  • JVM options have been moved from cassandra-env.(sh|ps1) to the jvm.options file.
  • The Thrift API is deprecated and will be removed in Cassandra 4.0.

Advice for upgrades to Apache Cassandra 3.1

  • The return value of SelectStatement::getLimit has been changed from DataLimits to int.
  • Custom index implementations should be aware that the method Indexer::indexes() has been removed, as its contract was misleading and all custom implementations should almost surely have returned true unconditionally for that method.
  • GC logging is now enabled by default (you can disable it in the jvm.options file if you prefer).

New features in Apache Cassandra 3.0

  • EACH_QUORUM is now a supported consistency level for read requests.
  • Support for IN restrictions on any partition key component or clustering key, as well as support for EQ and IN multicolumn restrictions, has been added to UPDATE and DELETE statements.
  • Support for single-column and multi-column slice restrictions (>, >=, <= and <) has been added to DELETE statements.
  • nodetool rebuild_index accepts the index argument without the redundant table name.
  • Materialized views, which allow for server-side denormalization, are now available. Materialized views provide an alternative to secondary indexes for non-primary key queries, and perform much better for indexing high cardinality columns. See http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
  • Hinted handoff has been completely rewritten. Hints are now stored in flat files, with less overhead for storage and more efficient dispatch. See CASSANDRA-6230 for full details.
  • Option to not purge unrepaired tombstones. To avoid users having data resurrected if repair has not been run within gc_grace_seconds, an option has been added to only allow tombstones from repaired sstables to be purged. To enable, set the compaction option 'only_purge_repaired_tombstones':true but keep in mind that if you do not run repair for a long time, you will keep all tombstones around which can cause other problems.
  • Enabled warning on GC taking longer than 1000ms. See cassandra.yaml:gc_warn_threshold_in_ms
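
The effect of the only_purge_repaired_tombstones option described in this list can be illustrated with a small decision function. This is a deliberately simplified sketch, not Cassandra's compaction logic: the function and parameter names are hypothetical, times are plain epoch seconds, and checks that a real compaction performs (for example, whether other sstables still contain data shadowed by the tombstone) are left out.

```python
def can_purge_tombstone(deletion_time, now, gc_grace_seconds,
                        sstable_is_repaired,
                        only_purge_repaired_tombstones=False):
    """Return True if a tombstone may be dropped in this simplified model.

    Hypothetical helper; all times are in seconds since the epoch.
    """
    # A tombstone must first be older than gc_grace_seconds.
    if deletion_time >= now - gc_grace_seconds:
        return False
    # With the option enabled, only tombstones in repaired sstables qualify.
    if only_purge_repaired_tombstones and not sstable_is_repaired:
        return False
    return True
```

In this model, a tombstone in an unrepaired sstable is kept indefinitely while the option is enabled, which matches the caution above about tombstones accumulating when repair is not run.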

Advice for upgrades to Apache Cassandra 3.0

  • Clients must use the native protocol version 3 when upgrading from 2.2.X as the native protocol version 4 is not compatible between 2.2.X and 3.Y. See https://www.mail-archive.com/user@cassandra.apache.org/msg45381.html for details.
  • A new argument of type InetAddress has been added to IAuthenticator::newSaslNegotiator, representing the IP address of the client attempting authentication. This is a breaking change for any custom implementations.
  • token-generator tool has been removed.
  • Upgrade to 3.0 is supported from Cassandra 2.1 versions greater than or equal to 2.1.9, or Cassandra 2.2 versions greater than or equal to 2.2.2. Upgrade from Cassandra 2.0 and older versions is not supported.
  • The 'memtable_allocation_type: offheap_objects' option has been removed. It should be re-introduced in a future release and you can follow CASSANDRA-9472 to know more.
  • Configuration parameter memory_allocator in cassandra.yaml has been removed.
  • The native protocol versions 1 and 2 are not supported anymore.
  • Max mutation size is now configurable via the max_mutation_size_in_kb setting in cassandra.yaml; the default is half of commitlog_segment_size_in_mb * 1024.
  • 3.0 requires Java 8u40 or later.
  • Garbage collection options were moved from cassandra-env to jvm.options file.
  • New transaction log files have been introduced to replace the compactions_in_progress system table, temporary file markers (tmp and tmplink), and sstable ancestors. Therefore, compaction metadata no longer contains ancestors. Transaction log files list sstable descriptors involved in compactions and other operations such as flushing and streaming. Use the sstableutil tool to list any sstable files currently involved in operations not yet completed, which previously would have been marked as temporary. A transaction log file contains one sstable per line, with the prefix "add:" or "remove:". It also contains a special line "commit", inserted only at the end when the transaction is committed. On startup, these files are used to clean up any partial transactions that were in progress when the process exited. If the commit line is found, the new sstables (those with the "add" prefix) are kept and the old sstables (those with the "remove" prefix) are deleted; if the commit line is missing, the reverse applies. Should you lose or delete these log files, both old and new sstable files will be kept as live files, which will result in duplicated sstables. These files are protected by incremental checksums, so you should not manually edit them. When restoring a full backup or moving sstable files, first clean up any leftover transactions and their temporary files with this command: `sstableutil -c ks table`. See CASSANDRA-7066 for full details.
  • New write stages have been added for batchlog and materialized view mutations; you can set their size in cassandra.yaml.
  • User defined functions are now executed in a sandbox. To use UDFs and UDAs, you have to enable them in cassandra.yaml.
  • New SSTable version 'la' with improved bloom-filter false-positive handling compared to previous version 'ka' used in 2.2 and 2.1. Running sstableupgrade is not necessary but recommended.
  • Before upgrading to 3.0, make sure that your cluster is in complete agreement (schema versions reported by `nodetool describecluster` are all the same).
  • Schema metadata is now stored in the new `system_schema` keyspace, and legacy `system.schema_*` tables are now gone; see CASSANDRA-6717 for details.
  • Pig support has been removed.
  • Hadoop BulkOutputFormat and BulkRecordWriter have been removed; use CqlBulkOutputFormat and CqlBulkRecordWriter instead.
  • Hadoop ColumnFamilyInputFormat and ColumnFamilyOutputFormat have been removed; use CqlInputFormat and CqlOutputFormat instead.
  • Hadoop ColumnFamilyRecordReader and ColumnFamilyRecordWriter have been removed; use CqlRecordReader and CqlRecordWriter instead.
  • hinted_handoff_enabled in cassandra.yaml no longer supports a list of data centers. To specify a list of excluded data centers when hinted_handoff_enabled is set to true, use hinted_handoff_disabled_datacenters. See CASSANDRA-9035 for details.
  • The `sstable_compression` and `chunk_length_kb` compression options have been deprecated. The new options are `class` and `chunk_length_in_kb`. Disabling compression should now be done by setting the new option `enabled` to `false`.
  • The compression option `crc_check_chance` became a top-level table option, but is currently enforced only against tables with enabled compression.
  • Only map syntax is now allowed for caching options. ALL/NONE/KEYS_ONLY/ROWS_ONLY syntax has been deprecated since 2.1.0 and is removed in 3.0.0.
  • The 'index_interval' option for 'CREATE TABLE' statements, which has been deprecated since 2.1 and replaced with the 'min_index_interval' and 'max_index_interval' options, has now been removed.
  • Batchlog entries are now stored in a new table - system.batches. The old one has been deprecated.
  • JMX methods set/getCompactionStrategyClass have been removed; use set/getCompactionParameters or set/getCompactionParametersJson instead.
  • SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
  • The secondary index API has been comprehensively reworked. This will be a breaking change for any custom index implementations, which should now look to implement the new org.apache.cassandra.index.Index interface. New syntax has been added to create and query row-based indexes, which are not explicitly linked to a single column in the base table.
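
The startup recovery rule for transaction log files described in this list can be sketched as follows. This is a minimal model under stated assumptions, not Cassandra's implementation: the function name `resolve_transaction` is hypothetical, and the line format is reduced to the "add:", "remove:", and "commit" markers mentioned above, without the checksums that protect the real files.

```python
def resolve_transaction(log_lines):
    """Return (keep, delete) sets of sstable names for one transaction log.

    Hypothetical helper using a simplified line format:
    "add:<sstable>", "remove:<sstable>", and a final "commit" line.
    """
    added, removed = set(), set()
    committed = False
    for raw in log_lines:
        line = raw.strip()
        if not line:
            continue
        if line == "commit":
            committed = True
        elif line.startswith("add:"):
            added.add(line[len("add:"):])
        elif line.startswith("remove:"):
            removed.add(line[len("remove:"):])
        else:
            raise ValueError(f"unrecognized log line: {line!r}")
    if committed:
        # Transaction completed: keep the new sstables, delete the old ones.
        return added, removed
    # Transaction did not complete: keep the old sstables, delete the new ones.
    return removed, added
```

The sketch also shows why losing a log file leads to duplicated sstables: without the log, nothing identifies which of the old and new files should be deleted, so both sets stay live.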

Spark Cassandra Connector changes for DSE 5.1.0

DataStax Enterprise (DSE) 5.1.0 includes DataStax Spark Cassandra Connector 2.0.1 with this production-certified change:
  • Refactor Custom Scan Method (SPARKC-481)
DSE 5.1.0 includes these production-certified changes from earlier versions of the DataStax Spark Cassandra Connector:

2.0.0

  • Upgrade driver version for 2.0.0 Release to 3.1.4 (SPARKC-474)
  • Extend SPARKC-383 to All Row Readers (SPARKC-473)

2.0.0 RC1

  • Includes all patches up to 1.6.5
  • Automatic adjustment of Max Connections (SPARKC-471)
  • Allow for Custom Table Scan Method (SPARKC-459)
  • Enable PerPartitionLimit (SPARKC-446)
  • Support client certificate authentication for two-way SSL Encryption (SPARKC-359)
  • Change Config Generation for Cassandra Runners (SPARKC-424)
  • Remove deprecated QueryRetryDelay parameter (SPARKC-423)
  • Use ConnectionHostParam.default as default hosts String
  • Update usages of deprecated SQLContext so that SparkSession is used instead (SPARKC-400)
  • Test Reused Exchange SPARK-17673 (SPARKC-429)
  • Module refactoring (SPARKC-398)
  • Recognition of Java Driver Annotated Classes (SPARKC-427)
  • RDD.deleteFromCassandra (SPARKC-349)
  • Coalesce Pushdown to Cassandra (SPARKC-161)
  • Custom Conf options in Custom Pushdowns (SPARKC-435)
  • Upgrade Commons BeanUtils to 1.9.3 to Avoid SID-760 (SPARKC-457)

2.0.0 M3

  • Includes all patches up to 1.6.2

2.0.0 M2

  • Includes all patches up to 1.6.1

2.0.0 M1

  • Added support for left outer joins with Cassandra table (SPARKC-181)
  • Removed CassandraSqlContext and underscore based options (SPARKC-399)
  • Upgrade to Spark 2.0.0-preview (SPARKC-396)
    • Removed Twitter demo because there is no spark-streaming-twitter package available anymore
    • Removed Akka Actor demo because there is no support for such streams anymore
    • Bring back Kafka project and make it compile
    • Update several classes to use our Logging instead of Spark Logging because Spark Logging became private
    • Update plugins and Scala version