DataStax Enterprise 6.7 release notes

DataStax Enterprise release notes cover cluster requirements, upgrade guidance, components, security updates, changes and enhancements, issues, and resolved issues for DataStax Enterprise (DSE) 6.7.x.

Requirement for Uniform Licensing

All nodes in each cluster must be uniformly licensed to use the same subscription. For example, if a cluster contains 5 nodes, all 5 nodes within that cluster must be either DataStax Distribution of Apache Cassandra™, or all 5 nodes must be DataStax Enterprise. Mixing different subscriptions within a cluster is not permitted. The DataStax Advanced Workloads Pack may be added to any DataStax Enterprise (not DataStax Distribution of Apache Cassandra) cluster in an incremental fashion. For example, a 10-node DSE cluster may be extended to include 3 nodes of the Advanced Workloads Pack. “Cluster” means a collection of nodes running the software which communicate with one another using gossip. See Enterprise Terms.

Before you upgrade

Upgrade advice: Before you upgrade to a later major version, upgrade to the latest patch release (6.7.4) on your current version. Be sure to read the relevant upgrade documentation. Upgrades to DSE 6.7 are supported from:
Compatibility: Check the compatibility page for your products. DSE 6.7 product compatibility:
DataStax Drivers: You may need to recompile your client application code. See Upgrading DataStax drivers.
DataStax Bulk Loader: Use DataStax Bulk Loader for loading and unloading data. It loads data into DSE 5.0 or later and unloads data from any Apache Cassandra™ 2.1 or later data source.

DSE 6.7.4 release notes

dse.yaml

The location of the dse.yaml file depends on the type of installation:
Package installations: /etc/dse/dse.yaml
Tarball installations: installation_location/resources/dse/conf/dse.yaml

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:
Package installations: /etc/dse/cassandra/cassandra.yaml
Tarball installations: installation_location/resources/cassandra/conf/cassandra.yaml

27 June 2019

Table 1. DSE functionality
6.7.4 DSE core 6.7.4 DSE Graph
6.7.4 DSE Analytics 6.7.4 DSE Search
6.7.4 DSEFS

6.7.4 Components

All components from DSE 6.7.4 are listed. Components that are updated for DSE 6.7.4 are indicated with an asterisk (*).
  • Apache Solr™ 6.0.1.1.2472 *
  • Apache Spark™ 2.2.3.4
  • Apache TinkerPop™ 3.3.7 with additional production-certified changes *
  • Apache Tomcat® 8.0.53
  • DSE Java Driver 1.7.0
  • Netty 4.1.25.4.dse
  • Spark Jobserver 0.8.0.45 DSE custom version

DSE 6.7.4 is compatible with Apache Cassandra™ 3.11 and adds production-certified changes.

DSE 6.7.4 Highlights

High-value benefits of upgrading to DSE 6.7.4 include these highlights:

DSE Database (DSE core) highlights

  • Significant fixes and improvements for native memory, the chunk cache, and async read timeouts.
  • Improved logging identifies the client, keyspace, table, and partition key involved when a mutation is rejected for exceeding the size threshold. (DB-1051)
  • Improved lightweight transactions (LWT) handling. (DB-3018)
  • Prevent hang during startup when reading counter cache. (DB-3050)
  • New configurable memory leak tracking. (DB-3123)
  • Fixed an issue by incrementing pending echos when sending gossip echo requests. (DB-3187)
  • Fixed an issue that caused an error when using tarball installs to create two instances on the same physical server with remote JMX access. (DB-2483)
  • Improved status reporting for nodesync validation list. (DB-2707)
  • Bootstrap now fails when the node is not able to fetch the schema from other nodes in the cluster. (DB-3186)
  • Fixed an issue to prevent deadlock when replaying schema mutations from commit log during DSE startup. (DB-3190)

DSE Analytics highlights

  • When DSE authentication is enabled, Spark security is forced to be enabled. (DSP-17274)
  • Spark security is turned on in dse.yaml configuration file. (DSP-17271)

DSE Graph highlights

  • Improved gremlin-console startup time. (DSP-11550)
  • Fixed an issue where DseGraphFrame cannot directly copy graph from one cluster to another. You can now dynamically pass cluster and connection configuration for different graph objects. (DSP-18605)
  • Fixed an issue where UnsatisfiedLinkError occurs when inserting a multi edge with DseGraphFrame in BYOS (Bring Your Own Spark). (DSP-18916)
  • Fixed an issue where DSE Graph does not use the primary key predicate in a Search/.has() predicate. (DSP-18993)
  • Fixed an issue where T values get hidden by property keys of the same name in valueMap(). (DSP-19261)
  • Fixed an issue where graph Solr queries might not be run on a TPC Scheduler. (DSP-18898)

DSE Search highlights

  • Performance improvements and overload protection for search queries. (DSP-15875)
  • Performance improvements to Solr deletes that correspond to Cassandra rows. (DSP-17419)
  • Fixed an issue where FQ was broken with queryExecutorThreads and timeAllowed set. (DSP-18717)
  • Changes to correct uneven distribution of shard requests with the STATIC set cover finder. (DSP-18197)
  • New recommended method for case-insensitive text search, faceting, grouping, and sorting with new LowerCaseStrField Solr field type. This type sets field values as lowercase and stores them as lowercase in docValues. (DSP-18763)

6.7.4 DSE core

Changes and enhancements:

  • Improved logging identifies the client, keyspace, table, and partition key involved when a mutation is rejected for exceeding the size threshold. (DB-1051)
  • Improved status reporting for nodesync validation list. (DB-2707)
  • Enable upgrading and downgrading SSTables using a CQL file that contains DDL statements to recreate the schema. (DB-2951)
  • Improved lightweight transactions (LWT) performance. New cassandra.yaml LWT configuration options. (DB-3018)
  • Prevent deadlock when replaying schema mutations from the commit log during DSE startup. (DB-3190)
  • Reject requests from the TPC backpressure queue when they have been on the queue for too long. (DSP-15875)

Resolved issues:

  • When tarball installs are used to create two instances on the same physical server with remote JMX access, binding separate IP addresses to port 7199 causes the JMX error Address already in use (Bind failed) because com.sun.management.jmxremote.host is ignored. (DB-2483)
  • Slow startup or node hangs when encryption is used. (DB-3050)
  • cqlsh EXECUTE AS command does not work. (DB-3098)
  • DSE fails to start with ERROR Attempted serializing to buffer exceeded maximum of 65535 bytes. Improved error to identify a workaround for commitlog corruption. (DB-3162)
  • AssertionError in temporary buffer pool causes CorruptSSTableException. (DB-3172, DB-3174)
  • Memory leak on errors when reading. (DB-3175)
  • Bootstrap should fail when the node is not able to fetch the schema from other nodes in the cluster. (DB-3186)
  • Increment pending echos when sending gossip echo requests. (DB-3187)
  • sstabledowngrade needs write access to the snapshot folder for a different output location. (DB-3231)
  • Fixed possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.9, or DSE 6.7.4. Be sure to follow the upgrade instructions.

6.7.4 DSE Analytics

Changes and enhancements:

  • A warning message is displayed when DSE authentication is enabled, but Spark security is not enabled. (DSP-17273)
  • When DSE authentication is enabled, Spark security is forced to be enabled. (DSP-17274)
    In dse.yaml, Spark security is enforced when authentication_options has enabled: true. In that case, these settings are ignored:
    • spark_security_enabled
    • spark_security_encryption_enabled

Known issues:

  • When the Spark security options are not configured in dse.yaml, the native CQL Protocol authentication can be sidestepped with direct access to the Netty RPC client. Although this access should fail to run Spark applications, the CQL authentication can be bypassed on systems with an open Netty port 7077 using Spark RPC. (DSP-17271)
    Solution: Configure the Spark security options in dse.yaml:
    spark_shared_secret_bit_length: 256
    spark_security_enabled: true
    spark_security_encryption_enabled: true
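
As a rough illustration of the solution above, the following sketch reports which of the three Spark security options in a dse.yaml file differ from the values shown. It is not a DataStax tool. The function names are hypothetical, and the parser assumes the options appear as flat "key: value" lines, so it ignores nested YAML structure.

```python
# Minimal sketch, not a DataStax utility: compare flat "key: value" lines in
# dse.yaml text against the three option values shown in the solution above.

EXPECTED = {
    "spark_shared_secret_bit_length": "256",
    "spark_security_enabled": "true",
    "spark_security_encryption_enabled": "true",
}


def read_flat_settings(yaml_text):
    """Collect 'key: value' pairs, dropping comments and lines without a colon."""
    settings = {}
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def spark_security_gaps(yaml_text):
    """Return {option: (found, expected)} for every option that differs."""
    found = read_flat_settings(yaml_text)
    return {
        key: (found.get(key), wanted)
        for key, wanted in EXPECTED.items()
        if found.get(key) != wanted
    }


if __name__ == "__main__":
    sample = "spark_shared_secret_bit_length: 256\nspark_security_enabled: false\n"
    for option, (found, wanted) in spark_security_gaps(sample).items():
        print(f"{option}: found {found!r}, expected {wanted!r}")
```

An empty result means all three options match the values in the solution. A value of None in the "found" position means the option was not present as a flat line.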

Resolved issues:

  • Accessing files from Spark through WebHDFS interface fails with message: java.io.IOException: Content-Length is missing. (DSP-18559)
  • BYOS DSEFS access fails with AuthenticationException with dseauth_internal_no_otherschemes. (DSP-18822)
  • Submitting many Spark apps will reach the default tombstone_failure_threshold before the default 90 days gc_grace_seconds defined for the system_auth.role_permissions table. (DSP-19098)

6.7.4 DSEFS

Resolved issues:

6.7.4 DSE Graph

Changes and enhancements:

Resolved issues:

  • DseGraphFrame cannot directly copy graph from one cluster to another. You can now dynamically pass cluster and connection configuration for different graph objects. (DSP-18605)
    Workaround for earlier versions:
    1. Export graph to DSEFS:
      g.V.write.format("csv").save("dsefs://cluster1/tmp/vertices")
      g.E.write.format("csv").save("dsefs://cluster1/tmp/edges")
    2. Import graph to the other cluster:
      g.updateVertices(spark.read.format("csv").load("dsefs://cluster1/tmp/vertices"))
      g.updateEdges(spark.read.format("csv").load("dsefs://cluster1/tmp/edges"))
  • Issue querying a search index when the vertex label is set to cache properties. (DSP-18898)
  • UnsatisfiedLinkError when inserting a multi edge with DseGraphFrame in BYOS (Bring Your Own Spark). (DSP-18916)
  • DSE Graph does not use primary key predicate in Search/.has() predicate. (DSP-18993)
  • T values get hidden by property keys of the same name in valueMap(). (DSP-19261)

6.7.4 DSE Search

Changes and enhancements:

  • Changes to correct uneven distribution of shard requests with the STATIC set cover finder. (DSP-18197)
    • The shard.set.cover.finder default is changed from DYNAMIC to STATIC. After upgrading, you can restore the original behavior with:
      dsetool set_core_property keyspace_name.table_name shard.set.cover.finder=DYNAMIC
    • A new inertia parameter for dsetool set_core_property supports fine tuning. The default value of 1 can be adjusted for environments with vnodes and more than 10 nodes.
  • New recommended method for case-insensitive text search, faceting, grouping, and sorting with new LowerCaseStrField custom Solr field type. This type sets field values as lowercase and stores them as lowercase in docValues. (DSP-18763)
    Note: DataStax does not support using the TextField Solr field type with solr.KeywordTokenizer and solr.LowerCaseFilterFactory to achieve single-token, case-insensitive indexing on a CQL text field.
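
If several search cores need the original DYNAMIC behavior after an upgrade, the dsetool command shown above has to be repeated once per core. The sketch below only builds those command lines. The helper name and the example core names are hypothetical, and nothing is executed.

```python
import shlex


def restore_dynamic_commands(cores):
    """Build one dsetool argument list per search core (keyspace_name.table_name)."""
    commands = []
    for core in cores:
        keyspace, dot, table = core.partition(".")
        if not (keyspace and dot and table):
            raise ValueError(f"expected keyspace_name.table_name, got {core!r}")
        commands.append(
            ["dsetool", "set_core_property", core, "shard.set.cover.finder=DYNAMIC"]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical core names, used only for illustration.
    for argv in restore_dynamic_commands(["demo_ks.table_a", "demo_ks.table_b"]):
        print(shlex.join(argv))
```

Returning argument lists rather than strings keeps the result usable with subprocess.run if you choose to execute the commands yourself.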

Resolved issues:

  • SASI queries don't work on tables with row level access control (RLAC). (DB-3082)
  • Documents might not be removed from the index when a key element has value equal to a Solr reserved word. (DSP-17419)
  • FQ broken with queryExecutorThreads and timeAllowed set. (DSP-18717)
  • Search should error out, rather than timeout, on Solr query with non-existing field list (fl) fields. (DSP-18218)
  • While using live indexing, also known as RT or real-time indexing, a race condition can be triggered when concurrently indexing and running heavy facet queries. The race condition fails an assertion that, in turn, fails searcher opening and leaves the index in an inconsistent state. (DSP-18786)

Cassandra enhancements for DSE 6.7.4

DataStax Enterprise 6.7.4 is compatible with Apache Cassandra™ 3.11 and includes all production-certified enhancements from previous versions.

General upgrade advice for DSE 6.7.4

DataStax Enterprise 6.7.4 is compatible with Apache Cassandra™ 3.11. All upgrade advice from previous versions applies. Carefully reviewing the DataStax Enterprise upgrade planning and upgrade instructions can ensure a smooth upgrade and avoid pitfalls and frustrations.

TinkerPop changes for DSE 6.7.4

DataStax Enterprise (DSE) 6.7.4 includes all changes from previous DSE versions plus these production-certified changes that are in addition to Apache TinkerPop™ 3.3.7. See TinkerPop upgrade documentation for all changes.

  • Developed DSL pattern for gremlin-javascript.
  • Generated uberjar artifact for Gremlin Console.
  • Improved folding of property() step into related mutating steps.
  • Added inject() to steps generated on the DSL TraversalSource.
  • Removed gperfutils dependencies from Gremlin Console.
  • Fixed PartitionStrategy when setting vertex label and having includeMetaProperties configured to true.
  • Prevented client-side hangs if metadata generation fails on the server.
  • Fixed bug with EventStrategy in relation to addE() where detachment was not happening properly.
  • Ensured that gremlin.sh works when directories contain spaces.
  • Fixed bug in detachment of Path where embedded collection objects would prevent that process.
  • Enabled ctrl+c to interrupt long running processes in Gremlin Console.
  • Quieted "host unavailable" warnings for both the driver and Gremlin Console.
  • Fixed construction of g:List from arrays in gremlin-javascript.
  • Fixed bug in GremlinGroovyScriptEngine interpreter mode around class definitions.
  • Implemented EdgeLabelVerificationStrategy.
  • Fixed behavior of P for within() and without() in GLVs to be consistent with Java when using varargs.
  • Cleared the input buffer after exceptions in Gremlin Console.
  • Added parameter to configure the processor in the gremlin-javascript client constructor.
  • Docker images now use gremlin user instead of root user.
  • Refactored use of commons-lang to use commons-lang3 only. Dependencies may still use commons-lang.
  • Bumped commons-lang3 to 3.8.1.
  • Added GraphSON serialization support for Duration, Char, ByteBuffer, Byte, BigInteger, and BigDecimal in gremlin-python.
  • Added ProfilingAware interface to allow steps to be notified that profile() was being called.
  • Fixed bug where profile() could produce negative timings when group() contained a reducing barrier.
  • Improved logic determining the dead or alive state of a Java driver Connection.
  • Improved handling of dead connections and the availability of hosts.
  • Bumped httpclient to 4.5.7.
  • Bumped slf4j to 1.7.25.
  • Bumped commons-codec to 1.12.
  • Fixed partial response failures when using authentication in gremlin-python.
  • Fixed a bug in PartitionStrategy where addE() as a start step was not applying the partition.
  • Improved performance of JavaTranslator by reducing calls to Method.getParameters().
  • Implemented EarlyLimitStrategy which is supposed to significantly reduce backend operations for queries that use range().
  • Reduced chance of hash collisions in Bytecode and its inner classes.
  • Added Symbol.asyncIterator member to the Traversal class to provide support for for await ... of loops (async iterables).

Bug fixes:

  • TINKERPOP-2081 PersistedOutputRDD materialises rdd lazily with Spark 2.x.
  • TINKERPOP-2091 Wrong/Missing feature requirements in StructureStandardTestSuite.
  • TINKERPOP-2094 Gremlin Driver Cluster Builder serializer method does not use mimeType as suggested.
  • TINKERPOP-2095 GroupStep looks for irrelevant barrier steps.
  • TINKERPOP-2096 gremlinpython: AttributeError when connection is closed before result is received.
  • TINKERPOP-2100 coalesce() creating unexpected results when used with order().
  • TINKERPOP-2105 Gremlin-Python connection not returned back to the pool on exception from gremlin server.
  • TINKERPOP-2113 P.Within() doesn't work when given a List argument.

Improvements:

  • TINKERPOP-1889 JavaScript GLV: Use heartbeat to prevent connection timeout.
  • TINKERPOP-2010 Generate jsdoc for gremlin-javascript.
  • TINKERPOP-2013 Process tests that are auto-ignored stink.
  • TINKERPOP-2018 Generate API docs for Gremlin.Net.
  • TINKERPOP-2038 Make groovy script cache size configurable.
  • TINKERPOP-2050 Add a :bytecode command to Gremlin Console.
  • TINKERPOP-2062 Add Traversal class to CoreImports.
  • TINKERPOP-2065 Optimize iterate() for remote traversals.
  • TINKERPOP-2067 Allow getting raw data from Gremlin.Net.Driver.IGremlinClient.
  • TINKERPOP-2068 Bump Jackson Databind 2.9.7.
  • TINKERPOP-2069 Document configuration of Gremlin.Net.
  • TINKERPOP-2070 gremlin-javascript: Introduce Connection representation.
  • TINKERPOP-2071 gremlin-python: the graphson deserializer for g:Set should return a python set.
  • TINKERPOP-2073 Generate tabs for static code blocks.
  • TINKERPOP-2074 Ensure that only NuGet packages for the current version are pushed.
  • TINKERPOP-2077 VertexProgram.Builder should have a default create() method with no Graph.
  • TINKERPOP-2078 Hide use of EmptyGraph or RemoteGraph behind a more unified method for TraversalSource construction.
  • TINKERPOP-2084 For remote requests in console display the remote stack trace.
  • TINKERPOP-2092 Deprecate default GraphSON serializer fields.
  • TINKERPOP-2097 Create a DriverRemoteConnection with an initialized Client.
  • TINKERPOP-2102 Deprecate static fields on TraversalSource related to remoting.
  • TINKERPOP-2106 When gremlin executes timeout, throw TimeoutException instead of TraversalInterruptedException/InterruptedIOException.
  • TINKERPOP-2110 Allow Connection on Different Path (from /gremlin).
  • TINKERPOP-2114 Document common Gremlin anti-patterns.
  • TINKERPOP-2118 Bump to Groovy 2.4.16.
  • TINKERPOP-2121 Bump Jackson Databind 2.9.8.

DSE 6.7.3 release notes

23 April 2019

Table 2. DSE functionality
6.7.3 DSE core 6.7.3 DSE Graph
6.7.3 DSE Analytics 6.7.3 DSE Search
6.7.3 DSEFS

6.7.3 Components

  • Apache Solr™ 6.0.1.1.2408 (updated)
  • Apache Spark™ 2.2.3.4 (updated)
  • Apache TinkerPop™ 3.3.6 with additional production-certified changes
  • Apache Tomcat® 8.0.53
  • DSE Java Driver 1.7.0
  • Netty 4.1.25.4.dse
  • Spark Jobserver 0.8.0.45 DSE custom version

DSE 6.7.3 is compatible with Apache Cassandra™ 3.11 and adds production-certified changes.

DSE 6.7.3 Highlights

High-value benefits of upgrading to DSE 6.7.3 include these highlights:

DSE Database (DSE core) highlights

  • Compaction performance improvement with new cassandra.yaml pick_level_on_streaming option. (DB-1658)
  • The sstableloader downgrade from DSE to OSS Apache Cassandra is supported with new sstabledowngrade tool. (DB-2756)
    Important: The sstabledowngrade command cannot be used to downgrade system tables or downgrade DSE versions.
  • Fix NodeSync failing when validating MV row with empty partition key. (DB-2823)
  • nodesync can be enabled and disabled on materialized views (MV). (DB-3008)
  • Fixed anti-compaction transaction for atomicity and index building. (DB-3016)
  • Remedy deadlock during node startup when calculating disk boundaries. (DB-3028)
  • Correct handling of dropped UDT columns in SSTables. (DB-3031)

    Workaround: If issues with UDTs in SSTables exist after upgrade from DSE 6.0.x or DSE 5.0.x, run sstablescrub -e fix-only offline on the SSTables that have or had UDTs that were created in DSE 5.0.x.

  • Fix unclosed range tombstones in read response. See Cassandra enhancements for DSE 6.7.3.
  • The frame decoding off-heap queue size is configurable and smaller by default. (DB-3047)

DSE Analytics highlights

  • AlwaysOn SQL (AOSS) log files, including service.log, are consolidated in system.log. (DSP-18261)
  • Support configuration to connect to multiple hosts from BYOS connector with multiple hostnames in DseByosAuthConfFactory. (DSP-18231)
  • Fixed a leak in BulkTableWriter. (DSP-18513)

DSE Graph highlights

  • Time, date, inet, and duration data types are now supported in graph search indexes. (DSP-17694)
  • Some minor DSE GraphFrame code fixes. (DSP-18215)
  • Improved usability with simplified vertex and edge loading for single label update. (DSP-18404)
  • Operations through gremlin-console run with anonymous permissions. (DSP-18471)

DSE Search highlights

Upgrade if:
  • You use timestamps as primary keys or primary key elements. (DSP-18223)
  • You are hitting LUCENE-8262 and having to reload Solr cores. (DSP-18211)
  • You use queryExecutorThreads with facet queries. (DSP-18237)

6.7.3 DSE core

Changes and enhancements:

  • Compaction performance improvement with new cassandra.yaml pick_level_on_streaming option. (DB-1658)

    Streamed-in SSTables of tables using LCS (leveled compaction strategy) are placed in the same level as the source node, with possible up-leveling. Set to true to save compaction work for operations like nodetool refresh and replacing a node.

  • The sstableloader downgrade from DSE to OSS Apache Cassandra is supported with new sstabledowngrade tool. (DB-2756)
  • Unused memory in buffer pool. (DB-2788)
  • The memory in use in the buffer pool is not identical to the memory allocated. (DB-2904)
  • Support for using sstableloader to stream OSS Cassandra 3.x and DSE 5.x data to DSE 6.0 and later. (DB-2909)
  • Improved user tools for SSTable upgrades (sstableupgrade) and downgrades (sstabledowngrade). (DB-2950)
  • Memory improvements with these supported changes:
    • Configurable memory is supported for offline sstable tools. (DB-2955)

      You can use these environment variables with the tools:

      • MAX_HEAP_SIZE - defaults to 256 MB
      • MAX_DIRECT_MEMORY - defaults to ((system_memory - heap_size) / 4) with a minimum of 1 GB and a maximum of 8 GB.

      To specify memory on the command line:

      MAX_HEAP_SIZE=2g MAX_DIRECT_MEMORY=10g sstabledowngrade keyspace table
    • The buffer pool, and the metrics for the buffer pool, are now split into two pools. In cassandra.yaml, the file_cache_size_in_mb option sets the size of the file cache (or chunk cache), and the new direct_reads_size_in_mb option sets the size of the pool for all other short-lived read operations. (DB-2958)

      To retrieve the buffer pool metrics:

      nodetool sjk mxdump -q "org.apache.cassandra.metrics:type=CachedReadsBufferPool,name=*"
      nodetool sjk mxdump -q "org.apache.cassandra.metrics:type=DirectReadsBufferPool,name=*"

      For legacy compatibility, org.apache.cassandra.metrics:type=BufferPool still exists and is the same as org.apache.cassandra.metrics:type=CachedReadsBufferPool.

    • cassandra-env.sh respects heap and direct memory values set in jvm.options or as environment variables. (DB-2973)
      The precedence for heap and direct memory is:
      • Environment variables
      • jvm.options
      • calculations in cassandra-env.sh
  • AIO is automatically disabled if the chunk cache size is small enough: less than or equal to system RAM / 8. (DB-2997)
  • nodesync can now be enabled or disabled on materialized views (MV). (DB-3008)
  • Optimized memory usage for direct reads pool when using a high number of LWTs. (DB-3124)

    When not set in cassandra.yaml, the default calculated size of direct_reads_size_in_mb changed from 128 MB to 2 MB per TPC core thread, plus 2 MB shared by non-TPC threads, with a maximum value of 128 MB.
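
The memory defaults described in this list can be worked through numerically. The sketch below encodes only the formulas stated above: the MAX_DIRECT_MEMORY default for the offline tools, the calculated direct_reads_size_in_mb default, the AIO threshold, and the precedence list. The function names are hypothetical, reading the precedence list as highest priority first is an assumption, and the actual scripts may round differently.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def default_max_direct_memory(system_memory, heap_size):
    """(system_memory - heap_size) / 4, clamped to the range 1 GB to 8 GB (bytes)."""
    quarter = (system_memory - heap_size) // 4
    return max(1 * GB, min(8 * GB, quarter))


def default_direct_reads_size_mb(tpc_cores):
    """2 MB per TPC core thread plus 2 MB for non-TPC threads, capped at 128 MB."""
    return min(128, 2 * tpc_cores + 2)


def aio_auto_disabled(chunk_cache_size, system_memory):
    """AIO is disabled when the chunk cache is at most one eighth of system RAM."""
    return chunk_cache_size <= system_memory / 8


def resolve_memory_setting(env_value, jvm_options_value, calculated_value):
    """Assumed order: environment variable, then jvm.options, then the calculation."""
    for candidate in (env_value, jvm_options_value):
        if candidate is not None:
            return candidate
    return calculated_value


if __name__ == "__main__":
    direct = default_max_direct_memory(16 * GB, 8 * GB)
    print("MAX_DIRECT_MEMORY default (GB):", direct / GB)
    print("direct_reads_size_in_mb default for 8 TPC cores:", default_direct_reads_size_mb(8))
    print("AIO disabled for a 2 GB chunk cache on 16 GB RAM:", aio_auto_disabled(2 * GB, 16 * GB))
```

Note that the 8 GB ceiling applies only to the calculated default. An explicit value such as MAX_DIRECT_MEMORY=10g in the command-line example above is passed through unchanged by resolve_memory_setting.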

Resolved issues:

  • Native server Message.Dispatcher.Flusher task stalls under heavy load. (DB-1814)
  • Race in CommitLog can cause failed force-flush-all. (DB-2542)
  • Unclosed range tombstones in read response. (DB-2601)
  • The sstableloader downgrade from DSE to OSS Apache Cassandra is not supported. New sstabledowngrade tool is required. (DB-2756)
  • NodeSync fails when validating MV row with empty partition key. (DB-2823)
  • TupleType values with null fields NPE when being made byte-comparable. (DB-2872)
  • Nodes in a cluster continue trying to connect to a decommissioned node. (DB-2886)
  • Reference leak in SSTableRewriter in sstableupgrade when keepOriginals is true. (DB-2944)
  • Hint-dispatcher file-channel not closed, if open() fails with OOM. (DB-2947)
  • Hints and metadata should not use buffer pool. (DB-2958)
  • Lightweight transactions contention may cause IO thread exhaustion. (DB-2965)
  • DIRECT_MEMORY is being calculated using 25% of total system memory if -Xmx is set in jvm.options. (DB-2973)
  • Netty direct buffers can potentially double the -XX:MaxDirectMemorySize limit. (DB-2993)
  • Increased NIO direct memory because the buffers are not cleaned until GC is run. (DB-2996)
  • Unable to upgrade SSTables from DSE 5.0.14 to DSE 6.0.5. (DB-3014)
  • Anti-compaction transaction causes temporary data loss. (DB-3016)
  • Deadlock during node startup when calculating disk boundaries. (DB-3028)
  • Dropped UDT columns in SSTables deserialization are broken after upgrading from DSE 5.0. (DB-3031)
  • Limit off-heap frame queues by configurable number of frames and total number of bytes. (DB-3047)
  • Mishandling of frozen in complex nested types. (DB-3081)
  • cqlsh EXECUTE AS command does not work. (DB-3098)
  • 32-bit int overflow in StreamingTombstoneHistogramBuilder during compaction. (DB-3108)
  • Too many NotInCacheExceptions (12) in trie index flow. (DB-3120)
  • Possible direct memory leak when part of bulk allocation fails. (DB-3125)
  • Counters in memtable allocators and buffer pool metrics can be incorrect when out of memory (OOM) failures occur. (DB-3126)
  • Memory leak occurs when a read from disk times out. (DB-3127)
  • RpcExecutionException does not print the user who is not authorized to perform a certain action. (DSP-15895)
  • Leak in BulkTableWriter. (DSP-18513)
  • Make the remote host visible in the error message for failed magic number verification. (DSP-18645)

Known issue:

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.9, or DSE 6.7.4. Be sure to follow the upgrade instructions.

6.7.3 DSE Analytics

Changes and enhancements:

  • Configurable heartbeat detection for AlwaysOn SQL (AOSS). (DSP-16391)
  • AlwaysOn SQL (AOSS) log files, including service.log, are consolidated in system.log. (DSP-18261)
  • Support configuration to connect to multiple hosts from BYOS connector. (DSP-18231)
  • CQL syntax error when a single quote is not correctly escaped before being included in the save cache query to the AOSS cache table. (DSP-18418)
  • Improved error messaging for AlwaysOn SQL (AOSS) client tool. (DSP-18409)
  • dse spark-submit --status <driver_ID> command fails. (DSP-18616)

Resolved issues:

  • After client-to-node SSL is enabled, all Spark nodes must also listen on port 7480. (DSP-15744)
  • dse client-tool configuration byos-export does not export required Spark properties. (DSP-15938)
  • Issue with viewing information for completed jobs when authentication is enabled. (DSP-17854)
  • Downloaded Spark JAR files are executable for all users. (DSP-17692)
  • Cassandra Spark Connector rejects nested UDT when null. (DSP-17965)
  • CassandraHiveMetastore does not unquote predicates for server side filtering. (DSP-18017)
  • Spark Cassandra Connector now properly caches manually prepared RegularStatements; see SPARKC-558. (DSP-18075)
  • Apache Spark local privilege escalation vulnerability: CVE-2018-11760. (DB-18225)
  • Can't access AlwaysOn SQL (AOSS) UI when authorization is enabled. (DSP-18236)
  • Invalid options show for dse spark-submit command line help. (DSP-18293)
  • Remove class DGFCleanerInterceptor from byos.jar. (DSP-18445)
  • GBTClassifier in Spark ML fails when periodic checkpointing is on. (DSP-18450)

Known issue:

  • DSE 6.7.3 is not compatible with Zeppelin 0.8.1 when using SparkR and PySpark. (DSP-18777)

    The Apache Spark™ 2.2.3.4 that is included with DSE 6.7.3 contains the patched protocol, and all versions of DSE are compatible with the Scala interpreter.

    However, SparkR and PySpark use a separate channel for communication with Zeppelin. This protocol was vulnerable to attack from other users on the system and was secured in the fix for CVE-2018-11760. Zeppelin 0.8.1 fails with SparkR and PySpark because it does not recognize that Spark 2.2.2 and later contain this patched protocol, so it attempts to use the old protocol. The Zeppelin patch to recognize this protocol is not available in a released Zeppelin build.

    Solution: Do not upgrade to DSE 6.7.3 if you use SparkR or PySpark. Wait for a Zeppelin release later than 0.8.1 that recognizes that DSE-packaged Spark can use the secured protocol.

6.7.3 DSEFS

Resolved issues:

  • Change dsefs:// default port when the DSEFS setting public_port is changed in dse.yaml. (DSP-17962)

    The shortcut dsefs:/// now automatically resolves to broadcastaddress:dsefs.public_port, instead of incorrectly using broadcastaddress:5598 regardless of the configured port.

  • weather_sensors demo is updated to use native DSEFS commands instead of dse fs hadoop. (DSP-17708)
  • Fix handling of path alternatives in DSEFS shell to provide wildcard support for mkdir and ls commands. (DSP-17768)
    For example, to make several subdirectories with a single command:
    dse fs mkdir -p /datastax/demos/weather_sensors/{byos-daily,byos-monthly,byos-station}
    dse fs mkdir -p {path1,path2}/dir
  • Problem changing group ownership of files with the fileSystem.setOwner method. (DSP-18052)
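
To make the path alternatives in the mkdir examples above concrete, the following sketch expands a {a,b,c} group into the individual paths it stands for. This illustrates the notation only and is not the DSEFS shell implementation. The function name is hypothetical, and nested groups are not handled the way a full shell would handle them.

```python
import re

# Matches one brace group that contains no further braces, such as {a,b,c}.
_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_alternatives(path):
    """Expand comma-separated alternatives inside braces into a list of paths."""
    match = _GROUP.search(path)
    if match is None:
        return [path]
    head, tail = path[:match.start()], path[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        # Recurse so that a path with several groups is fully expanded.
        expanded.extend(expand_alternatives(head + alternative + tail))
    return expanded


if __name__ == "__main__":
    for p in expand_alternatives("{path1,path2}/dir"):
        print(p)
```

Under this reading, the second mkdir example above refers to path1/dir and path2/dir.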

6.7.3 DSE Graph

Changes and enhancements:

  • The default for the spark.cassandra.output.ignoreNulls parameter is now true for DSE Graph Frames edge updates. To override this setting, set the spark.cassandra.output.ignoreNulls property to false. (DSP-17377)
  • Vertex and especially edge loading is simplified. idColumn function is no longer required. (DSP-18404)

Known issues:

  • Improved error reporting for errors during cache-based vertex lookup. AssertionError: Should not happen errors are properly reported depending on the root cause error, for example as a timeout exception. (DSP-18254)
  • Potential performance drop involving large table scans with DSE Analytics. DSE Graph OLAP operations such as V().count... and V().groupCount()... may be trivially affected (< 10%). (DSP-18683)

Resolved issues:

  • NPE when dropping a graph with an alias in gremlin console. (DSP-13387)
  • OLAP traversal duplicates the partition key properties: OLAP g.V().properties() prints 'first' vertex n times with custom ids. (DSP-15688)
  • Time, date, inet, and duration data types are not supported in graph search indexes. (DSP-17694)
  • AND operator is ignored in combination with OR operator in graph searches. (DSP-18061)
  • Should prevent sharing Gremlin Groovy closures between scripts that are submitted through session-less connections, like DSE drivers. (DSP-18146)
  • Some minor DSE GraphFrame code fixes. (DSP-18215)
  • Reduce probability of hitting max_concurrent_sessions limit for OLAP workloads with BYOS. (DSP-18280)
    Tip: For OLAP workloads with BYOS, DataStax recommends increasing the max_concurrent_sessions using this formula as a guideline:
    max_concurrent_sessions = spark_executors_threads_per_node x reliability_coefficient
    where reliability_coefficient must be greater than 1, with a minimum reliability_coefficient value between 2 and RF x 2.
  • Operations through gremlin-console run with system permissions, but should run with anonymous permissions. (DSP-18471)
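
The max_concurrent_sessions guideline for OLAP workloads with BYOS (DSP-18280) can be worked through numerically. The following Python sketch is illustrative only: the function name and the sample inputs (8 executor threads per node, RF = 3) are assumptions, and it reads the tip as suggesting a reliability_coefficient somewhere between 2 and RF x 2.

```python
def recommended_max_concurrent_sessions(executor_threads_per_node, replication_factor):
    """Return a (low, high) pair of candidate max_concurrent_sessions values.

    Applies max_concurrent_sessions =
        spark_executors_threads_per_node x reliability_coefficient
    with the coefficient set to 2 (low) and to RF x 2 (high).
    """
    if executor_threads_per_node < 1 or replication_factor < 1:
        raise ValueError("thread count and replication factor must be positive")
    low_coefficient = 2
    high_coefficient = max(2, replication_factor * 2)
    return (executor_threads_per_node * low_coefficient,
            executor_threads_per_node * high_coefficient)


# Hypothetical cluster: 8 Spark executor threads per node, RF = 3.
print(recommended_max_concurrent_sessions(8, 3))  # (16, 48)
```

Treat the result as a starting range to tune against the actual workload, not as a prescribed setting.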

6.7.3 DSE Search

Resolved issues:

  • SASI should discard stale static row. (DB-2956)
  • Edges are inserted with tombstone values set when inserting a recursive edge with multiple cardinality. When calling g.updateEdges(df), any null entries in the provided data frame should be ignored. (DSP-17377)
  • Solr HTTP request for CSV output is blank. The CSVResponseWriter returns only stored fields if a field list is not provided in the URL. (DSP-18029)
    Workaround: Specify a field list in the URL:
    /select?q=*%3A*&sort=lst_updt_gdttm+desc&rows=10&fl=field1,field2&wt=csv&indent=true
  • Avoid interrupting request threads when an internode handshake fails so that the Lucene file channel lock cannot be interrupted. (DSP-18211)
  • Timestamp PK routing on solr_query fails. (DSP-18223)
  • Facets and stats queries broken when using queryExecutorThreads. (DSP-18237)
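
For the CSV workaround in DSP-18029, the request path with an explicit field list can be assembled programmatically. This is a minimal Python sketch using only the standard library; the helper name csv_select_path is hypothetical, and the field names and sort column are the placeholders from the example URL above.

```python
from urllib.parse import urlencode


def csv_select_path(fields, query="*:*", rows=10, sort=None):
    """Build a /select path that always carries an explicit fl parameter."""
    if not fields:
        raise ValueError("provide at least one field so the CSV output is not blank")
    params = [("q", query)]
    if sort:
        params.append(("sort", sort))
    params += [
        ("rows", rows),
        ("fl", ",".join(fields)),  # explicit field list
        ("wt", "csv"),
        ("indent", "true"),
    ]
    # Keep '*' and ',' unescaped so the result is easy to compare with the example URL.
    return "/select?" + urlencode(params, safe="*,")


print(csv_select_path(["field1", "field2"], sort="lst_updt_gdttm desc"))
```

The helper refuses an empty field list because that is the case in which the CSV response comes back blank.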

Cassandra enhancements for DSE 6.7.3

DataStax Enterprise 6.7.3 is compatible with Apache Cassandra™ 3.11 and adds these production-certified enhancements:

  • Fix unclosed range tombstones in read response. Always close RT markers returned by ReadCommand#executeLocally() (CASSANDRA-14515)
  • Severe concurrency issues in STCS,DTCS,TWCS,TMD.Topology,TypeParser (CASSANDRA-14781)

General upgrade advice for DSE 6.7.3

DataStax Enterprise 6.7.3 is compatible with Apache Cassandra™ 3.11. All upgrade advice from previous versions applies. Carefully reviewing the DataStax Enterprise upgrade planning and upgrade instructions can ensure a smooth upgrade and avoid pitfalls and frustrations.

TinkerPop changes for DSE 6.7.3

DataStax Enterprise (DSE) 6.7.3 includes all changes from previous DSE releases plus these production-certified changes that are in addition to Apache TinkerPop™ 3.3.6. See TinkerPop upgrade documentation for all changes.

  • Disables the ScriptEngine global function cache which can hold on to references to "g" along with some other minor bug fixes/enhancements.

DSE 6.7.2 release notes

Release notes for DataStax Enterprise 6.7.2.

Important: DataStax recommends the latest patch release for most environments.

27 February 2019

6.7.2 Components

All components from DSE 6.7.2 are listed.

  • Apache Solr™ 6.0.1.1.2381
  • Apache Spark™ 2.2.2.8
  • Apache TinkerPop™ 3.3.5 with additional production-certified changes
  • Apache Tomcat® 8.0.53
  • DSE Java Driver 1.7.0
  • Netty 4.1.25.4.dse
  • Spark Jobserver 0.8.0.45 DSE custom version

DSE 6.7.2 is compatible with Apache Cassandra™ 3.11 and includes all production-certified changes from earlier versions.

6.7.2 Resolved issue

  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)

    If the DSE 5.0.x schema contains user-defined types (UDTs), the SSTable serialization headers are fixed when DSE is started with DSE 6.7.2 or later.

6.7.2 Known issue

  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.9, or DSE 6.7.4. Be sure to follow the upgrade instructions.

Cassandra enhancements for DSE 6.7.2

DataStax Enterprise 6.7.2 is compatible with Apache Cassandra™ 3.11 and includes all production-certified enhancements from previous versions.

General upgrade advice for DSE 6.7.2

DataStax Enterprise 6.7.2 is compatible with Apache Cassandra™ 3.11. All upgrade advice from previous versions applies. Carefully reviewing the DataStax Enterprise upgrade planning and upgrade instructions can ensure a smooth upgrade and avoid pitfalls and frustrations.

TinkerPop changes for DSE 6.7.2

DataStax Enterprise (DSE) 6.7.2 includes Apache TinkerPop™ 3.3.5 and all production-certified changes from previous DSE releases. See TinkerPop upgrade documentation for all changes.

DSE 6.7.1 release notes

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:
Package installations /etc/dse/cassandra/cassandra.yaml
Tarball installations installation_location/resources/cassandra/conf/cassandra.yaml

11 February 2019

Table 3. DSE functionality
  • 6.7.1 DSE database
  • 6.7.1 DSE Graph
  • 6.7.1 DSE Analytics
  • 6.7.1 DSE Search
  • 6.7.1 DSEFS

6.7.1 Components

All components from DSE 6.7.1 are listed. Components that are updated for DSE 6.7.1 are indicated.

  • Apache Solr™ 6.0.1.1.2381 (updated)
  • Apache Spark™ 2.2.2.8 (updated)
  • Apache TinkerPop™ 3.3.5 with additional production-certified changes
  • Apache Tomcat® 8.0.53 (updated)
  • DSE Java Driver 1.7.0 (updated)
  • Netty 4.1.25.4.dse (updated)
  • Spark Jobserver 0.8.0.45 DSE custom version (updated)

DSE 6.7.1 is compatible with Apache Cassandra™ 3.11 and adds production-certified changes.

DSE 6.7.1 Highlights

High-value benefits of upgrading to DSE 6.7.1 include these highlights:

DSE Database (DSE core) highlights

Improvements:

  • DSE Metrics Collector aggregates DSE metrics and integrates with existing monitoring solutions to facilitate problem resolution and remediation. (DSP-17319)

Important bug fixes:

  • Fixed an issue where heap memory usage seems higher with default file cache settings. (DB-2865)
  • Fixed resource leak related to streaming operations that affects tiered storage users. Excessive number of TieredRowWriter threads causing java.lang.OutOfMemoryError. (DB-2463)

DSE Analytics highlights

Upgrade if:
  • You want improved error reporting during Spark job submission. (DSP-16359)
  • You are having issues with search improvements for analytics when search is not enabled. (DSP-16465)
  • You are moving directories in DSEFS. (DSP-17347)

DSE Graph highlights

Upgrade if:
  • You want to run the DSEFS auth demo. (DSP-17700)
  • You want to disable and configure DSEFS internode (node-to-node) authentication. (DSP-17721)
  • You have slow gremlin script compilation times. (DSP-14132)
  • You get errors for OLAP traversals after dropping schema elements. (DSP-15884)
  • You want new JMX operations for graph MBeans. (DSP-15928)
  • You want server side error messages for remote exceptions reported in Gremlin console. (DSP-16375)
  • You occasionally get inconsistent query results. (DSP-18005)
  • You use graph OLAP and want secret tokens redacted in log files. (DSP-18074)
  • You want to build fuzzy-text search indexes on string properties that form part of a vertex label ID. (DSP-17386)

DSE Search highlights

Upgrade if:
  • You index timestamp partition keys. (DSP-17761)
  • You do a lot of reindexing. (DSP-17975)
  • You use frozen maps in a table with DSE Search. (DSP-18073)
  • You have non-ASCII characters in your indexed columns. (DSP-17816, DSP-17961)

6.7.1 DSE core

Changes and enhancements:

  • The overuse bounds for token allocation are improved when RF = 1 or RF = number of racks. (DB-1552)
  • New nodetool rebuild_view command rebuilds materialized views for local data. Existing view data is not cleared. (DB-2451)
  • Improved messages for nodetool nodesyncservice ratesimulator command include explanation for single node clusters and when no tables have NodeSync enabled. (DB-2468)
  • Improved error message when Netty Epoll library cannot be loaded. (DB-2579)
  • New environment variable MAX_DIRECT_MEMORY overrides the cassandra.yaml value for how much direct memory (NIO direct buffers) the JVM can use. (DB-2919)
  • New JMX operations for graph MBeans. (DSP-15928)
    • adjacency-cache.size - adjacency cache size attribute
    • adjacency-cache.clear - operation to clean adjacency cache
    • index-cache.size - vertex cache size attribute
    • index-cache.clear - operation to clean vertex cache
    JMX operations are not cluster-aware. Invoke on each node as appropriate to your environment.
  • Improved encryption key error reporting. (DSP-17723)
  • Changed the default watermark values for DSE Metrics Collector. The values are configurable with JVM startup properties. (DSP-17733)

Resolved issues:

  • Running the nodetool nodesyncservice enable command reports the error NodeSyncRecord constructor assertion failed. (DB-2280)

    Workaround: Before DSE 6.7.1, restart DSE to resolve the issue so that you can execute the command and enable NodeSync without error.

  • Read and compaction errors with levelled compaction strategy (LCS). (DB-2446)
  • Excessive number of TieredRowWriter threads causing java.lang.OutOfMemoryError (DB-2463)
  • The nodetool nodesyncservice ratesimulator -deadline-overrides option is not supported. (DB-2468)
  • The nodetool gcstats command output incorrectly reports the GC reclaimed metric in bytes, instead of the expected MB. (DB-2598)
  • TypeParser is not thread safe. (DB-2602)
  • Possible corruption in compressed files with uncompressed chunks. (DB-2634)
  • Incorrect order of application of nodetool garbagecollect leaves tombstones that should be deleted. (DB-2658)
  • DSE fails to start with an Unable to gossip with any peers error when cross_node_timeout is true. (DB-2670)
  • Heap and CPU quota exceeded detection for asynchronous JavaScript UDFs is not reliable. (DB-2645)
  • A query by a user with no permissions returns no rows on a restricted table when an exception should occur instead. (DB-2668)
  • User-defined aggregates (UDAs) that instantiate user-defined types (UDTs) break after restart. (DB-2771)
  • Memory leak on unfetched continuous paging requests. (DB-2851)
  • Batch replay is interrupted and good batches are skipped when a mutation of an unknown table is found. (DB-2855)
  • Late continuous paging errors can leave unreleased buffers behind. (DB-2862)
  • Heap memory usage is higher with default file cache settings. (DB-2865)
  • Prepared statement cache issues when using row-level access control (RLAC) permissions. Existing prepared statements are not correctly invalidated. (DB-2867)
  • dsetool does not work when native_transport_interface is set in cassandra.yaml. (DSP-16796)

    Workaround for DSE 6.7.0: Use native_transport_interface_prefer_ipv6 instead.

  • Security: java-xmlbuilder is vulnerable to XML external entities (XXE). (DSP-13962)
  • Kerberos protocol and QoP parameters are not correctly propagated. (DSP-15455)

Known issue:

  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.9, or DSE 6.7.4. Be sure to follow the upgrade instructions.
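
The minimum versions named in the two known issues above (DB-2954 and DB-3404) can be checked against a node's version string. The following Python sketch is one way to do that; the table layout, issue keys, and function names are assumptions made for illustration, and a release line that the notes do not list is reported as unknown (None) rather than guessed.

```python
# Minimum patch level per release line, taken from the known issues above.
MINIMUM_FIXED = {
    "udt_sstable_headers": {(5, 1): 13, (6, 0): 6, (6, 7): 2},   # DB-2954
    "tiered_storage": {(5, 1): 16, (6, 0): 9, (6, 7): 4},        # DB-3404
}


def parse_version(text):
    """Split a version such as '6.7.1' into integer (major, minor, patch)."""
    major, minor, patch = (int(part) for part in text.split(".")[:3])
    return major, minor, patch


def has_fix(version, issue):
    """Return True or False for listed release lines, None when the line is not listed."""
    major, minor, patch = parse_version(version)
    minimum_patch = MINIMUM_FIXED[issue].get((major, minor))
    if minimum_patch is None:
        return None
    return patch >= minimum_patch


print(has_fix("6.7.1", "tiered_storage"))       # False
print(has_fix("6.7.4", "tiered_storage"))       # True
print(has_fix("6.0.6", "udt_sstable_headers"))  # True
```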

6.7.1 DSE Analytics

Changes and enhancements:
  • Memory leak in Spark Thrift Server. (DSP-17433)
  • Structured Streaming support for Bring Your Own Spark (BYOS) with Spark 2.3. (DSP-17593)
  • Add the ability to disable and configure DSEFS internode (node-to-node) authentication. (DSP-17721)
Resolved issues:
  • Improved error handling: only submission-related error exceptions from Spark submitted applications are wrapped in a Dse Spark Submit Bootstrapper Failed to Submit error. (DSP-16359)
  • Search optimizations for search analytics Spark SQL queries are applied to a datacenter that no longer has search enabled. Queries launched from a search-enabled datacenter cause search optimizations even when the target datacenter does not have search enabled. (DSP-16465)
  • Submission in client mode does not support specifying remote jars (DSEFS) for main application resource (main jar) and jars specified with --jars / spark.jars. (DSP-17382)
  • Incorrect conversions in DirectJoin Spark SQL operations for timestamps, UDTs, and collections. (DSP-17444)
  • dse spark-sql-metastore-migrate does not work with DSE Unified Authentication and internal authentication. (DSP-17632)
  • Spark Web UI redirection drops path component. (DSP-17877)
  • AlwaysOn SQL (AOSS) shutdown service is not called so the driver processes are not killed. (DSP-18039)

6.7.1 DSEFS

Resolved issues:

  • DSEFS does not support listen_on_broadcast_address as configured in cassandra.yaml. (DSP-17363)
  • Moving a directory under itself causes data loss and orphan data structures. (DSP-17347)
  • DSEFS retries resolving corrupted paths. (DSP-17379)
  • DSEFS auth demo does not work. (DSP-17700)

6.7.1 DSE Graph

Changes and enhancements:
  • New tool fixes inconsistencies in graph data that are caused by schema changes, like label delete, or improper data loading. (DSP-15884)
    • DSE Graph Gremlin console: graph.cleanUp()
    • Spark: spark.dseGraph("name").cleanUp()
  • Server side error messages for remote exceptions are reported in Gremlin console. (DSP-16375)
Resolved issues:
  • Properties unattached to vertex show up with null values. (DSP-12300)
  • Graph OLTP: Slow gremlin script compilation times. (DSP-14132)
  • Graph/Search escaping fixes. (DSP-17216, DSP-17277, DSP-17816)
  • Search indexes on key fields work only with non-tokenized queries. (DSP-17386)
  • Graph OLTP: Potential ThreadLocal resource leak. (DSP-17808)
  • DseGraphFrame fails to read properties with symbols, like period (.), in names. (DSP-17818)
  • DSE GraphFrame operations cache but do not explicitly uncache. (DSP-17870)
  • g.V().repeat(...).until(...).path() returns incomplete path without edges. (DSP-17933)
  • gf.V().id().next() causes data to get mismatched with properties in legacy DseGraphFrame. (DSP-17979)
  • Inconsistent results when using gremlin on static data. (DSP-18005)
  • Graph OLAP: secret tokens are unmasked in log files. (DSP-18074)

6.7.1 DSE Search

Changes and enhancements:
  • The default for auto-generated schemas is useJtsMulti="false". See Spatial queries with polygons require JTS. (DSP-17764)
  • Requesting a core reindex with dsetool reload_core or REBUILD SEARCH INDEX no longer builds up a queue of reindexing tasks on a node. Instead, a single starting reindexing task handles all reindex requests that are already submitted to that node. (DSP-17045, DSP-13030)
  • CQL timestamp field can be part of a Solr unique key. (DSP-17761)
Resolved issues:
  • java.lang.AssertionError: rtDocValues.maxDoc=5230 maxDoc=4488 error is thrown in the system.log during indexing and reindexing. (DSP-17529)
  • Unexpected search index errors occur when non-ASCII characters, like the U+3000 (ideographic space) character, are in indexed columns. (DSP-17816, DSP-17961)
  • TextField type in search index schema should be case-sensitive if created when using copyField. (DSP-17817)
  • Strong self-ref loop detected error occurs after reindexing. (DSP-17975)
  • Loading frozen map columns fails during search read-before-write. (DSP-18073)

Cassandra enhancements for DSE 6.7.1

DataStax Enterprise 6.7.1 is compatible with Apache Cassandra™ 3.11 and adds these production-certified enhancements:

  • Pad uncompressed chunks when they would be interpreted as compressed (CASSANDRA-14892)
  • Correct SSTable sorting for garbagecollect and levelled compaction (CASSANDRA-14870)
  • Avoid calling iter.next() in a loop when notifying indexers about range tombstones (CASSANDRA-14794)
  • Fix purging semi-expired RT boundaries in reversed iterators (CASSANDRA-14672)
  • DESC order reads can fail to return the last Unfiltered in the partition (CASSANDRA-14766)
  • Fix corrupted collection deletions for dropped columns in 3.0 <-> 2.{1,2} messages (CASSANDRA-14568)
  • Fix corrupted static collection deletions in 3.0 <-> 2.{1,2} messages (CASSANDRA-14568)
  • Handle failures in parallelAllSSTableOperation (cleanup/upgradesstables/etc) (CASSANDRA-14657)
  • Improve TokenMetaData cache populating performance avoid long locking (CASSANDRA-14660)
  • Fix static column order for SELECT * wildcard queries (CASSANDRA-14638)
  • sstableloader should use discovered broadcast address to connect intra-cluster (CASSANDRA-14522)
  • Fix reading columns with non-UTF names from schema (CASSANDRA-14468)

General upgrade advice for DSE 6.7.1

DataStax Enterprise 6.7.1 is compatible with Apache Cassandra™ 3.11. All upgrade advice from previous versions applies. Carefully reviewing the DataStax Enterprise upgrade planning and upgrade instructions can ensure a smooth upgrade and avoid pitfalls and frustrations.

TinkerPop changes for DSE 6.7.1

DataStax Enterprise (DSE) 6.7.1 includes all changes from previous DSE releases plus these production-certified changes that are in addition to Apache TinkerPop™ 3.3.5. See TinkerPop upgrade documentation for all changes.

Resolved issues:
  • Masked sensitive configuration options in the KryoShimServiceLoader logs.
  • Fixed a concurrency issue in TraverserSet.

DSE 6.7.0 release notes

Components, changes and enhancements, resolved issues, and known issues for DSE 6.7.0.

5 December 2018

6.7.0 Components

  • Apache Solr™ 6.0.1.1.2356
  • Apache Spark™ 2.2.2.5
  • Apache TinkerPop™ 3.3.3 with additional production-certified changes
  • Apache Tomcat® 8.0.53
  • DSE Java Driver 1.7.0
  • Netty 4.1.25.4.dse
  • Spark Jobserver 0.8.0.44 (DSE custom version)

DSE 6.7.0 is compatible with Apache Cassandra™ 3.11 and adds production-certified changes.

6.7.0 New features

See DataStax Enterprise 6.7 new features.

6.7.0 DSE database

Experimental features. These features are experimental and are not supported for production:
Changes and enhancements:
  • Improved Java user-defined functions (UDF). (DB-1049)
  • Improvements with new engine for materialized views (MV). (DB-1060)
    • Supports multiple non-base primary key in view clustering key when partition key is the same as base table.
    • Supports multiple filter expressions on non-primary key columns.
    • Supports out-of-order modification on columns not selected in MV, see CASSANDRA-11500.
    • Allow dropping base columns that are not part of MV primary key and not filtered.
      Important: Base column data and corresponding MV column data will be dropped.
    • Only MVs created with DSE 6.7 or later use the new MV format. To upgrade legacy MVs, create and build a new MV with the same schema and then point applications to use the new MV. See Known limitations of materialized views.
  • Ability to read the TTL and WRITE TIME of an element in a collection. (DB-1289)
  • Centralized handling of system properties to allow for logging of final values. Print all configuration flags values in system log. (DB-1556)
  • Support is added for cryptographic token interface standard PKCS#11 keystores. New cassandra.yaml and dse.yaml options for server and client encryption. (DB-1629)
  • CQL CAST function supports INSERT INTO and UPDATE statements, and can be used in WHERE clause. (DB-1837)
  • Reduced allocations when using offheap objects. (DB-2095, DSP-17054)
  • Improved protocol version presentation and setting in cqlsh. (DB-2096)
  • When DSE 6.7 starts, it automatically begins sending metrics and other structured events to DSE Metrics Collector. DSE Metrics Collector is enabled by default. To disable, see Disabling DSE Metrics Collector. (DSP-15910)
  • JTS (Java Topology Suite) is distributed with DSE. Remove any previously installed JTS JAR files from DSE installation classpath. (DSP-16086)
  • Changes in cassandra.yaml and dse.yaml. (DB-2095, DSP-17054)

    Upgrade impact: Make changes to configuration files after the upgrade and before restarting with 6.7.0. As always, carefully review and follow the Upgrading DataStax Enterprise recommendations.

    After the upgrade and before restarting with DSE 6.7.0, remove deprecated settings and use new settings.

    cassandra.yaml changes

    Memtable settings
    Deprecated cassandra.yaml settings
    memtable_heap_space_in_mb
    memtable_offheap_space_in_mb
    Replace with this setting
    memtable_space_in_mb

    Governs heap and offheap space allocation to set a threshold for automatic memtable flush. The calculated default is 1/4 of the heap size.

    Changed setting
    memtable_allocation_type: offheap_objects

    The default method the database uses to allocate and manage memtable memory is offheap_objects.

    User-defined functions (UDF) properties
    Deprecated cassandra.yaml settings
    user_defined_function_warn_timeout
    user_defined_function_fail_timeout
    Replace with these settings
    user_defined_function_warn_micros: 500
    user_defined_function_fail_micros: 10000
    user_defined_function_warn_heap_mb: 200
    user_defined_function_fail_heap_mb: 500
    user_function_timeout_policy: die

    The timeout settings are in microseconds because Java UDFs run faster. The new timeouts are not equivalent to the deprecated settings.

    Internode encryption settings
    Deprecated cassandra.yaml setting
    server_encryption_options:
        store_type: JKS
    Replace with these settings
    server_encryption_options:
        keystore_type: JKS
        truststore_type: JKS

    Valid type options are JKS, JCEKS, PKCS12, or PKCS11.

    Client-to-node encryption options
    Deprecated cassandra.yaml setting
    client_encryption_options:
        store_type: JKS
    Replace with these settings
    client_encryption_options:
        keystore_type: JKS
        truststore_type: JKS

    Valid type options are JKS, JCEKS, PKCS12, or PKCS11.

    dse.yaml changes

    Spark resource and encryption options
    Deprecated dse.yaml setting
    spark_ui_options:
        server_encryption_options:
            store_type: JKS
    Replace with these settings
    spark_ui_options:
        server_encryption_options:
            keystore_type: JKS
            truststore_type: JKS

    Valid options are JKS, JCEKS, PKCS12, or PKCS11.
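
Before restarting with DSE 6.7.0, it can help to scan the existing configuration for the deprecated settings listed above. The following Python sketch assumes the YAML file has already been parsed into nested dictionaries (for example with a YAML library of your choice); the function name, the mapping table, and the sample configuration are illustrative assumptions. The sketch only reports replacements. It does not convert values, because the new UDF timeouts are not equivalent to the deprecated ones.

```python
UDF_REPLACEMENTS = [
    "user_defined_function_warn_micros",
    "user_defined_function_fail_micros",
    "user_defined_function_warn_heap_mb",
    "user_defined_function_fail_heap_mb",
    "user_function_timeout_policy",
]

# Deprecated setting name -> settings that replace it, per the lists above.
DEPRECATED = {
    "memtable_heap_space_in_mb": ["memtable_space_in_mb"],
    "memtable_offheap_space_in_mb": ["memtable_space_in_mb"],
    "user_defined_function_warn_timeout": UDF_REPLACEMENTS,
    "user_defined_function_fail_timeout": UDF_REPLACEMENTS,
    "store_type": ["keystore_type", "truststore_type"],
}


def find_deprecated(config, path=""):
    """Walk a parsed config and return (dotted_location, replacements) pairs."""
    findings = []
    for key, value in config.items():
        location = f"{path}.{key}" if path else key
        if key in DEPRECATED:
            findings.append((location, DEPRECATED[key]))
        if isinstance(value, dict):
            findings.extend(find_deprecated(value, location))
    return findings


# Hypothetical pre-upgrade settings, already parsed from cassandra.yaml.
sample = {
    "memtable_heap_space_in_mb": 2048,
    "server_encryption_options": {"store_type": "JKS"},
    "client_encryption_options": {"keystore_type": "JKS", "truststore_type": "JKS"},
}

for location, replacements in find_deprecated(sample):
    print(f"{location}: replace with {', '.join(replacements)}")
```

Because store_type is matched at any nesting depth, the same check covers server_encryption_options, client_encryption_options, and the spark_ui_options block in dse.yaml.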

Known issues:

  • DSE 5.0 SSTables with UDTs will be corrupted after migrating to DSE 5.1, DSE 6.0, and DSE 6.7. (DB-2954, CASSANDRA-15035)
    Important: If the DSE 5.0.x schema contains user-defined types (UDTs), upgrade to at least DSE 5.1.13, DSE 6.0.6, or DSE 6.7.2. The SSTable serialization headers are fixed when DSE is started with the upgraded versions.
  • Possible data loss when using DSE Tiered Storage. (DB-3404)
    Warning: If using DSE Tiered Storage, you must immediately upgrade to at least DSE 5.1.16, DSE 6.0.9, or DSE 6.7.4. Be sure to follow the upgrade instructions.
  • dsetool does not work when native_transport_interface is set in cassandra.yaml. (DSP-16796)

    Workaround: Use native_transport_interface_prefer_ipv6 instead.

6.7.0 DSE Advanced Replication

No updates.

6.7.0 DSE Analytics

Experimental features. These features are experimental and are not supported for production:
Changes and enhancements:
  • The default logging behavior of the Spark-SQL shell does not log to STDOUT. All information is in the spark-shell log file. (DSP-16969)
  • DSEFS REST interface supports Kerberos authentication with SPNEGO and Kerberos delegation token authentication. (DSP-13102)
  • Spark Cassandra Connector: New parameter to set a read throttle per task to manage resources when multiple jobs are running in parallel. (DSP-14523)
  • Improved logging messages with recommended resolutions for AlwaysOn SQL (AOSS). (DSP-17326, DSP-17358, DSP-17533)
  • AlwaysOn SQL (AOSS): Set default for spark.sql.thriftServer.incrementalCollect to true. (DSP-17428)
  • Provide a way for clients to determine if AlwaysOn SQL (AOSS) is enabled in DSE. (DSP-17180)
Resolved issues:
  • Unresolved dependency with dse-core when running Spark Application tests with dse-connector. (DSP-17232)
  • AlwaysOn SQL (AOSS) should attempt to auto start again on datacenter restart, regardless of the previous status. (DSP-17359)

6.7.0 DSEFS

Changes and enhancements:
  • The dsefs cp -r shell command adds support for recursive copying. (DSP-10579)
  • The dse hadoop fs command is removed. Use the dsefs commands instead. (DSP-16063, DSP-16594)

6.7.0 DSE Graph

Changes and enhancements:
  • New DSE start-up parameter -Ddse.consistent_replace improves LOCAL_QUORUM and QUORUM consistency on new node after node replacement. (DB-1577)

6.7.0 DSE Search

Experimental features. These features are experimental and are not supported for production:
  • The dsetool index_checks use an Apache Lucene® experimental feature.
Changes and enhancements:
  • Search index schema auto generation supports PolygonType. Lenient mode is no longer required. (DSP-16480)
  • The stored flag in search index schemas is deprecated and is no longer added to auto-generated schemas. If the flag exists in custom schemas, it is ignored. (DSP-14425)

    Workaround: Because the stored=false flag is ignored, queries return more columns than expected. To ensure queries return expected results, specify the fields to return with fl=field1,field2 and so on.

Resolved issues:
  • Search indexes automatically configure geospatial fields of Point or LineString types. (DSP-15811)

DataStax Studio

DataStax Bulk Loader

Cassandra enhancements for DSE 6.7.0

DataStax Enterprise 6.7.0 is compatible with Apache Cassandra™ 3.11 and adds these production-certified enhancements:

  • Add DEFAULT, UNSET, MBEAN and MBEANS to `ReservedKeywords`. (CASSANDRA-14205)
  • Add Unittest for schema migration fix (CASSANDRA-14140)
  • Print correct snitch info from nodetool describecluster (CASSANDRA-13528)
  • Close socket on error during connect on OutboundTcpConnection (CASSANDRA-9630)
  • Enable CDC unittest (CASSANDRA-14141)
  • Split CommitLogStressTest to avoid timeout (CASSANDRA-14143)
  • Improve commit log chain marker updating (CASSANDRA-14108)
  • Fix updating base table rows with TTL not removing view entries (CASSANDRA-14071)
  • Reduce garbage created by DynamicSnitch (CASSANDRA-14091)
  • More frequent commitlog chained markers (CASSANDRA-13987)
  • RPM package spec: fix permissions for installed jars and config files (CASSANDRA-14181)
  • More PEP8 compliance for cqlsh (CASSANDRA-14021)
  • Fix support for SuperColumn tables (CASSANDRA-12373)
  • Fix missing original update in TriggerExecutor (CASSANDRA-13894)
  • Improve short read protection performance (CASSANDRA-13794)
  • Fix counter application order in short read protection (CASSANDRA-12872)
  • Fix MV timestamp issues (CASSANDRA-11500)
  • Fix AssertionError in short read protection (CASSANDRA-13747)
  • Gossip thread slows down when using batch commit log (CASSANDRA-12966)
  • Allow native function calls in CQLSSTableWriter (CASSANDRA-12606)
  • Copy session properties on cqlsh.py do_login (CASSANDRA-13847)
  • Fix load over calculated issue in IndexSummaryRedistribution (CASSANDRA-13738)
  • Obfuscate password in stress-graphs (CASSANDRA-12233)
  • ReverseIndexedReader may drop rows during 2.1 to 3.0 upgrade (CASSANDRA-13525)
  • Avoid reading static row twice from old format sstables (CASSANDRA-13236)
  • Fix possible NPE on upgrade to 3.0/3.X in case of IO errors (CASSANDRA-13389)
  • Add duration data type (CASSANDRA-11873)
  • Properly report LWT contention (CASSANDRA-12626)
  • Stress daemon help is incorrect (CASSANDRA-12563)
  • Remove ALTER TYPE support (CASSANDRA-12443)
  • Fix assertion for certain legacy range tombstone pattern (CASSANDRA-12203)
  • Remove support for non-JavaScript UDFs (CASSANDRA-12883)
  • Better handle invalid system roles table (CASSANDRA-12700)
  • Upgrade netty version to fix memory leak with client encryption (CASSANDRA-13114)
  • Fix trivial log format error (CASSANDRA-14015)
  • Allow SSTabledump to do a JSON object per partition (CASSANDRA-13848)
  • Remove unused and deprecated methods from AbstractCompactionStrategy (CASSANDRA-14081)
  • Fix Distribution.average in cassandra-stress (CASSANDRA-14090)
  • Presize collections (CASSANDRA-13760)
  • Add GroupCommitLogService (CASSANDRA-13530)
  • Parallelize initial materialized view build (CASSANDRA-12245)
  • Fix flaky SecondaryIndexManagerTest.assert[Not]MarkedAsBuilt (CASSANDRA-13965)
  • Make LWTs send resultset metadata on every request (CASSANDRA-13992)
  • Fix flaky indexWithFailedInitializationIsNotQueryableAfterPartialRebuild (CASSANDRA-13963)
  • Introduce leaf-only iterator (CASSANDRA-9988)
  • Allow only one concurrent call to StatusLogger (CASSANDRA-12182)
  • Refactoring to specialised functional interfaces (CASSANDRA-13982)
  • Speculative retry should allow more friendly parameters (CASSANDRA-13876)
  • Throw exception if we send/receive repair messages to incompatible nodes (CASSANDRA-13944)
  • Replace usages of MessageDigest with Guava's Hasher (CASSANDRA-13291)
  • Add nodetool command to print hinted handoff window (CASSANDRA-13728)
  • Fix some alerts raised by static analysis (CASSANDRA-13799)
  • Checksum SSTable metadata (CASSANDRA-13321, CASSANDRA-13593)
  • Add result set metadata to prepared statement MD5 hash calculation (CASSANDRA-10786)
  • Add incremental repair support for --hosts, --force, and subrange repair (CASSANDRA-13818)
  • Refactor GcCompactionTest to avoid boxing (CASSANDRA-13941)
  • Expose recent histograms in JmxHistograms (CASSANDRA-13642)
  • Add SERIAL and LOCAL_SERIAL support for cassandra-stress (CASSANDRA-13925)
  • LCS needlessly checks for L0 STCS candidates multiple times (CASSANDRA-12961)
  • Correctly close netty channels when a stream session ends (CASSANDRA-13905)
  • Update lz4 to 1.4.0 (CASSANDRA-13741)
  • Throttle base partitions during MV repair streaming to prevent OOM (CASSANDRA-13299)
  • Use compaction threshold for STCS in L0 (CASSANDRA-13861)
  • Fix problem with min_compress_ratio: 1 and disallow ratio < 1 (CASSANDRA-13703)
  • Add extra information to SASI timeout exception (CASSANDRA-13677)
  • Rework CompactionStrategyManager.getScanners synchronization (CASSANDRA-13786)
  • Add additional unit tests for batch behavior, TTLs, Timestamps (CASSANDRA-13846)
  • Add keyspace and table name in schema validation exception (CASSANDRA-13845)
  • Emit metrics whenever we hit tombstone failures and warn thresholds (CASSANDRA-13771)
  • Allow changing log levels via nodetool for related classes (CASSANDRA-12696)
  • Add stress profile yaml with LWT (CASSANDRA-7960)
  • Reduce memory copies and object creations when acting on ByteBufs (CASSANDRA-13789)
  • Simplify mx4j configuration (CASSANDRA-13578)
  • Fix trigger example on 4.0 (CASSANDRA-13796)
  • Force minimum timeout value (CASSANDRA-9375)
  • Add bytes repaired/unrepaired to nodetool tablestats (CASSANDRA-13774)
  • Don't delete incremental repair sessions if they still have sstables (CASSANDRA-13758)
  • Fix pending repair manager index out of bounds check (CASSANDRA-13769)
  • Don't use RangeFetchMapCalculator when RF=1 (CASSANDRA-13576)
  • Don't optimise trivial ranges in RangeFetchMapCalculator (CASSANDRA-13664)
  • Use an ExecutorService for repair commands instead of new Thread(..).start() (CASSANDRA-13594)
  • Fix race / ref leak in anticompaction (CASSANDRA-13688)
  • Fix race / ref leak in PendingRepairManager (CASSANDRA-13751)
  • Enable ppc64le runtime as unsupported architecture (CASSANDRA-13615)
  • Improve sstablemetadata output (CASSANDRA-11483)
  • Support for migrating legacy users to roles has been dropped (CASSANDRA-13371)
  • Introduce error metrics for repair (CASSANDRA-13387)
  • Refactoring to primitive functional interfaces in AuthCache (CASSANDRA-13732)
  • Update metrics to 3.1.5 (CASSANDRA-13648)
  • batch_size_warn_threshold_in_kb can now be set at runtime (CASSANDRA-13699)
  • Avoid always rebuilding secondary indexes at startup (CASSANDRA-13725)
  • Upgrade JMH from 1.13 to 1.19 (CASSANDRA-13727)
  • Upgrade SLF4J from 1.7.7 to 1.7.25 (CASSANDRA-12996)
  • Default for start_native_transport now true if not set in config (CASSANDRA-13656)
  • Don't add localhost to the graph when calculating where to stream from (CASSANDRA-13583)
  • Allow skipping equality-restricted clustering columns in ORDER BY clause (CASSANDRA-10271)
  • Use common nowInSec for validation compactions (CASSANDRA-13671)
  • Improve handling of IR prepare failures (CASSANDRA-13672)
  • Send IR coordinator messages synchronously (CASSANDRA-13673)
  • Flush system.repair table before IR finalize promise (CASSANDRA-13660)
  • Fix column filter creation for wildcard queries (CASSANDRA-13650)
  • Add 'nodetool getbatchlogreplaythrottle' and 'nodetool setbatchlogreplaythrottle' (CASSANDRA-13614)
  • Fix race condition in PendingRepairManager (CASSANDRA-13659)
  • Allow noop incremental repair state transitions (CASSANDRA-13658)
  • Run repair with down replicas (CASSANDRA-10446)
  • Added started & completed repair metrics (CASSANDRA-13598)
  • Improve secondary index (re)build failure and concurrency handling (CASSANDRA-10130)
  • Improve calculation of available disk space for compaction (CASSANDRA-13068)
  • Change the accessibility of RowCacheSerializer for third party row cache plugins (CASSANDRA-13579)
  • Allow sub-range repairs for a preview of repaired data (CASSANDRA-13570)
  • NPE in IR cleanup when columnfamily has no sstables (CASSANDRA-13585)
  • Fix Randomness of stress values (CASSANDRA-12744)
  • Allow selecting Map values and Set elements (CASSANDRA-7396)
  • Fast and garbage-free Streaming Histogram (CASSANDRA-13444)
  • Update repairTime for keyspaces on completion (CASSANDRA-13539)
  • Add configurable upper bound for validation executor threads (CASSANDRA-13521)
  • Bring back maxHintTTL property (CASSANDRA-12982)
  • Add testing guidelines (CASSANDRA-13497)
  • Add more repair metrics (CASSANDRA-13531)
  • RangeStreamer should be smarter when picking endpoints for streaming (CASSANDRA-4650)
  • Avoid rewrapping an exception thrown for cache load functions (CASSANDRA-13367)
  • Log time elapsed for each incremental repair phase (CASSANDRA-13498)
  • Add multiple table operation support to cassandra-stress (CASSANDRA-8780)
  • Fix incorrect cqlsh results when selecting same columns multiple times (CASSANDRA-13262)
  • Fix WriteResponseHandlerTest is sensitive to test execution order (CASSANDRA-13421)
  • Improve incremental repair logging (CASSANDRA-13468)
  • Start compaction when incremental repair finishes (CASSANDRA-13454)
  • Add repair streaming preview (CASSANDRA-13257)
  • Cleanup isIncremental/repairedAt usage (CASSANDRA-13430)
  • Change protocol to allow sending keyspace independent of query string (CASSANDRA-10145)
  • Make gc_log and gc_warn settable at runtime (CASSANDRA-12661)
  • Take number of files in L0 into account when estimating remaining compaction tasks (CASSANDRA-13354)
  • Skip building views during base table streams on range movements (CASSANDRA-13065)
  • Improve error messages for +/- operations on maps and tuples (CASSANDRA-13197)
  • Remove deprecated repair JMX APIs (CASSANDRA-11530)
  • Fix version check to enable streaming keep-alive (CASSANDRA-12929)
  • Make it possible to monitor an ideal consistency level separate from actual consistency level (CASSANDRA-13289)
  • Outbound TCP connections ignore internode authenticator (CASSANDRA-13324)
  • Upgrade junit from 4.6 to 4.12 (CASSANDRA-13360)
  • Cleanup ParentRepairSession after repairs (CASSANDRA-13359)
  • Upgrade snappy-java to 1.1.2.6 (CASSANDRA-13336)
  • Incremental repair not streaming correct sstables (CASSANDRA-13328)
  • Upgrade the JNA version to 4.3.0 (CASSANDRA-13300)
  • Add the currentTimestamp, currentDate, currentTime and currentTimeUUID functions (CASSANDRA-13132)
  • Remove config option index_interval (CASSANDRA-10671)
  • Reduce lock contention for collection types and serializers (CASSANDRA-13271)
  • Make it possible to override MessagingService.Verb ids (CASSANDRA-13283)
  • Avoid synchronized on prepareForRepair in ActiveRepairService (CASSANDRA-9292)
  • Adds the ability to use uncompressed chunks in compressed files (CASSANDRA-10520)
  • Don't flush sstables when streaming for incremental repair (CASSANDRA-13226)
  • Remove unused method (CASSANDRA-13227)
  • Fix minor bugs related to #9143 (CASSANDRA-13217)
  • Output warning if user increases RF (CASSANDRA-13079)
  • Remove pre-3.0 streaming compatibility code for 4.0 (CASSANDRA-13081)
  • Add support for + and - operations on dates (CASSANDRA-11936)
  • Fix consistency of incrementally repaired data (CASSANDRA-9143)
  • Increase commitlog version (CASSANDRA-13161)
  • Make TableMetadata immutable, optimize Schema (CASSANDRA-9425)
  • Refactor ColumnCondition (CASSANDRA-12981)
  • Parallelize streaming of different keyspaces (CASSANDRA-4663)
  • Improved compactions metrics (CASSANDRA-13015)
  • Speed-up start-up sequence by avoiding un-needed flushes (CASSANDRA-13031)
  • Use Caffeine (W-TinyLFU) for on-heap caches (CASSANDRA-10855)
  • Thrift removal (CASSANDRA-11115)
  • Remove pre-3.0 compatibility code for 4.0 (CASSANDRA-12716)
  • Add column definition kind to dropped columns in schema (CASSANDRA-12705)
  • Add (automate) Nodetool Documentation (CASSANDRA-12672)
  • Update bundled cqlsh python driver to 3.7.0 (CASSANDRA-12736)
  • Reject invalid replication settings when creating or altering a keyspace (CASSANDRA-12681)
  • Clean up the SSTableReader#getScanner API wrt removal of RateLimiter (CASSANDRA-12422)
  • Use new token allocation for non bootstrap case as well (CASSANDRA-13080)
  • Avoid byte-array copy when key cache is disabled (CASSANDRA-13084)
  • Require forceful decommission if number of nodes is less than replication factor (CASSANDRA-12510)
  • Allow IN restrictions on column families with collections (CASSANDRA-12654)
  • Log message size in trace message in OutboundTcpConnection (CASSANDRA-13028)
  • Add timeUnit Days for cassandra-stress (CASSANDRA-13029)
  • Add mutation size and batch metrics (CASSANDRA-12649)
  • Add method to get size of endpoints to TokenMetadata (CASSANDRA-12999)
  • Expose time spent waiting in thread pool queue (CASSANDRA-8398)
  • Conditionally update index built status to avoid unnecessary flushes (CASSANDRA-12969)
  • cqlsh auto completion: refactor definition of compaction strategy options (CASSANDRA-12946)
  • Add support for arithmetic operators (CASSANDRA-11935)
  • Add histogram for delay to deliver hints (CASSANDRA-13234)
  • Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
  • Changing `max_hint_window_in_ms` at runtime (CASSANDRA-11720)
  • Nodetool repair can hang forever if we lose the notification for the repair completing/failing (CASSANDRA-13480)
  • Anticompaction can cause noisy log messages (CASSANDRA-13684)
  • Switch to client init for sstabledump (CASSANDRA-13683)
  • CQLSH: Don't pause when capturing data (CASSANDRA-13743)

General upgrade advice for DSE 6.7.0

General upgrade advice for DataStax Enterprise 6.7.0

DataStax Enterprise 6.7.0 is compatible with Apache Cassandra™ 3.11. All upgrade advice from previous versions applies. Carefully reviewing the DataStax Enterprise upgrade planning and upgrade instructions can ensure a smooth upgrade and avoid pitfalls and frustrations.

DataStax Enterprise 6.7.0 is compatible with Apache Cassandra™ 3.11 and adds Cassandra enhancements for DSE 6.7.0.

Additional advice for upgrading between versions of Apache Cassandra™ includes:

Cassandra 4.0 changes

  • Support for COMPACT STORAGE tables is removed. Follow the DataStax Enterprise upgrade documentation for upgrades from DSE 5.1 to DSE 6.7 and DSE 5.0 to DSE 6.7 for instructions on migrating all tables with COMPACT STORAGE to CQL table format.
  • Fixed a problem with incremental repair which caused repaired data to be inconsistent between nodes. The fix changes the behavior of both full and incremental repairs. For full repairs, data is no longer marked repaired. For incremental repairs, anticompaction is run at the beginning of the repair, instead of at the end. If incremental repair was being used prior to upgrading, a full repair should be run after upgrading to resolve any inconsistencies. The DataStax Enterprise upgrade documentation includes instructions to run nodetool repair.
  • Deprecated config option index_interval is removed (it had been deprecated since 2.0).
  • Deprecated repair JMX APIs are removed.
  • The version of snappy-java has been upgraded to 1.1.2.6.
  • Config option commitlog_sync_batch_window_in_ms is deprecated. Batch mode remains a valid commit log mode, however.
  • A new commit log sync mode, group, is similar to batch mode but blocks for up to a configurable number of milliseconds between disk flushes.
  • Due to the parallelization of the initial build of materialized views, the per token range view building status is stored in the new table `system.view_builds_in_progress`. The old table `system.views_builds_in_progress` is no longer used and can be removed. See CASSANDRA-12245 for more details.
  • nodetool clearsnapshot now requires the --all flag to remove all snapshots. Previous behavior would delete all snapshots by default.
  • Background read repair has been deprecated. dclocal_read_repair_chance and read_repair_chance table options have been deprecated, and will be removed entirely in 4.0. See CASSANDRA-13910 for details.
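The clearsnapshot change above matters most for automation that previously relied on the delete-everything default. As a minimal sketch, the hypothetical helper below (`clearsnapshot_args` is an illustrative name, not part of any DSE or Cassandra tooling) builds the nodetool argument list and refuses to produce a bare clearsnapshot call, so the caller must choose between a named snapshot (`-t`) and `--all`. It only constructs the arguments; running them is left to the caller.

```python
def clearsnapshot_args(tag=None, remove_all=False):
    """Build an argument list for `nodetool clearsnapshot`.

    Hypothetical helper for illustration. Exactly one of `tag` or
    `remove_all` must be given, mirroring the requirement that removing
    every snapshot now needs an explicit --all flag.
    """
    if (tag is None) == (not remove_all):
        raise ValueError("specify either a snapshot tag or remove_all=True")
    args = ["nodetool", "clearsnapshot"]
    if remove_all:
        args.append("--all")      # remove every snapshot
    else:
        args += ["-t", tag]       # remove only the named snapshot
    return args
```

The returned list can be passed to `subprocess.run` on a node where nodetool is on the PATH.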

Cassandra 3.11.2 changes

  • Cassandra now relies on JVM options to shut down properly on OutOfMemoryError. By default, it relies on the OnOutOfMemoryError option, because the ExitOnOutOfMemoryError and CrashOnOutOfMemoryError options are not supported by the older 1.7 and 1.8 JVMs. A warning is logged at startup if none of those JVM options is used. See CASSANDRA-13006 for more details.

Cassandra 3.11.2 upgrade considerations

  • Creating a Materialized View that filters on a non-primary-key base column (added in CASSANDRA-10368) is disabled, because the liveness of a view row depends on multiple filtered base non-key columns and on the base non-key column used in the view primary key. These semantics cannot be supported without a storage format change; see CASSANDRA-13826. For append-only use cases, you can still use this feature with the startup flag "-Dcassandra.mv.allow_filtering_nonkey_columns_unsafe=true".
  • The NativeAccessMBean isAvailable method will only return true if the native library has been successfully linked. Previously it was returning true if JNA could be found but was not taking into account link failures.
  • Primary ranges in the system.size_estimates table are now based on the keyspace replication settings and adjacent ranges are no longer merged (CASSANDRA-9639).
  • In 2.1, the default for otc_coalescing_strategy was 'DISABLED'. In 2.2 and 3.0, it was changed to 'TIMEHORIZON', but that value was shown to be a performance regression. The default for 3.11.0 and newer has been reverted to 'DISABLED'. Users upgrading from Cassandra 2.2 or 3.0 should be aware that the default has changed.
  • The StorageHook interface has been modified to allow retrieving read information from SSTableReader (CASSANDRA-13120).
  • Materialized Views for upgrades from DSE 5.1.1 or 5.1.2 or any version DSE 5.0.10 or later:
    • Cassandra will no longer allow dropping columns on tables with Materialized Views.
    • A change was made in the way the Materialized View timestamp is computed, which may cause an old deletion of a base column that is a view primary key (PK) column not to be reflected in the view when repairing the base table after the upgrade. This condition is only possible when a deletion of an MV primary key (PK) column not present in the base table PK (via UPDATE base SET view_pk_col = null or DELETE view_pk_col FROM base) is missed before the upgrade and received by repair after the upgrade. If such column deletions are done on a view PK column that is not a base PK column, run repair on the base table on all nodes before the upgrade. Alternatively, fix potential inconsistencies by running repair on the views after the upgrade, or by dropping and re-creating the views. See CASSANDRA-11500 for more details.
    • Removal of columns not selected in the Materialized View (via UPDATE base SET unselected_column = null or DELETE unselected_column FROM base) may not be properly reflected in the view in some situations. Avoid deletions on base columns not selected in views until this is fixed in CASSANDRA-13826.

Cassandra 3.10 changes

  • Runtime modification of concurrent_compactors is now available via nodetool concurrent_compactors.
  • Support for the assignment operators +=/-= has been added for update queries.
  • An Index implementation may now provide a task which runs prior to joining the ring. See CASSANDRA-12039.
  • Filtering on partition key columns is now also supported for queries without secondary indexes.
  • A slow query log has been added: slow queries will be logged at DEBUG level. For more details refer to CASSANDRA-12403 and slow_query_log_timeout_in_ms in cassandra.yaml.
  • Support for GROUP BY queries has been added.
  • A new compaction-stress tool has been added to test the throughput of compaction for any cassandra-stress user schema. See compaction-stress help for how to use it.
  • Prepared statements are now persisted in the table prepared_statements in the system keyspace. Upon startup, this table is used to preload all previously prepared statements - i.e. in many cases clients do not need to re-prepare statements against restarted nodes.
  • cqlsh can now connect to older Cassandra versions by downgrading the native protocol version. Please note that this is currently not part of our release testing and, as a consequence, it is not guaranteed to work in all cases. See CASSANDRA-12150 for more details.
  • Snapshots that are automatically taken before a table is dropped or truncated will have a "dropped" or "truncated" prefix on their snapshot tag name.
  • Metrics are exposed for successful and failed authentication attempts. These can be located using the object names org.apache.cassandra.metrics:type=Client,name=AuthSuccess and org.apache.cassandra.metrics:type=Client,name=AuthFailure respectively.
  • Support has been added to "unset" JSON fields in prepared statements by specifying DEFAULT UNSET. See CASSANDRA-11424 for details.
  • Allow TTL with null value on insert and update. It will be treated as equivalent to inserting a 0.
  • Removed outboundBindAny configuration property. See CASSANDRA-12673 for details.

Cassandra 3.10 upgrade considerations

  • Support for altering the types of columns in already defined tables and of UDT fields has been disabled. If it is necessary to return a different type, use casting instead. See CASSANDRA-12443 for more details.
  • Specifying the default_time_to_live option when creating or altering a materialized view was erroneously accepted (and ignored). It is now properly rejected.
  • Only Java and JavaScript are now supported UDF languages. The sandbox in 3.0 already prevented the use of script languages except Java and JavaScript.
  • Compaction now correctly drops sstables out of CompactionTask when there isn't enough disk space to perform the full compaction. This should reduce pending compaction tasks on systems with little remaining disk space.
  • Request timeouts in cassandra.yaml (read_request_timeout_in_ms, etc) now apply to the "full" request time on the coordinator. Previously, they only covered the time from when the coordinator sent a message to a replica until the time that the replica responded. Additionally, the previous behavior was to reset the timeout when performing a read repair, making a second read to fix a short read, and when subranges were read as part of a range scan or secondary index query. In 3.10 and higher, the timeout is no longer reset for these "subqueries". The entire request must complete within the specified timeout. As a consequence, your timeouts may need to be adjusted to account for this. See CASSANDRA-12256 for more details.
  • Logs written to stdout are now consistent with logs written to files. Time is now local (it was UTC on the console and local in files). Date, thread, file, and line info were added to stdout. See CASSANDRA-12004.
  • The 'clientutil' jar, which has been somewhat broken on the 3.x branch, is no longer provided. The features provided by that jar are provided by any good Java driver, and we advise relying on drivers rather than on that jar. If you need that jar for backward compatibility until you do so, use the version provided on a previous Cassandra branch, such as the 3.0 branch. By design, the functionality provided by that jar is stable across versions, so using the 3.0 jar for a client connecting to 3.x should work without issues.
  • (Tools development) DatabaseDescriptor no longer implicitly starts up components and services such as commit log replay. This may break existing third-party tools and clients. To start up a standalone tool or client application, use the DatabaseDescriptor.toolInitialization() or DatabaseDescriptor.clientInitialization() methods. Tool initialization sets up the partitioner, snitch, and encryption context. Client initialization just applies the configuration but does not set up anything. Instead of using Config.setClientMode() or Config.isClientMode(), which are now deprecated, use one of the appropriate new methods in DatabaseDescriptor.
  • Application layer keep-alives were added to the streaming protocol to prevent idle incoming connections from timing out and failing the stream session (CASSANDRA-11839). This effectively deprecates the streaming_socket_timeout_in_ms property in favor of streaming_keep_alive_period_in_secs. See cassandra.yaml for more details about this property.
  • Duration literals support the ISO 8601 format. As a consequence, identifiers matching that format (for example, P2Y or P1MT6H) are no longer supported (CASSANDRA-11873).
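The request timeout change in the list above can be easier to reason about with a toy model. The sketch below is a simplification, not Cassandra code: it treats a request as a sequence of coordinator-side steps (for example an initial read followed by a short-read retry) with made-up durations, and compares the earlier per-step reset with the single overall budget used in 3.10 and later. The function name and parameters are illustrative assumptions.

```python
def completes_within_timeout(step_durations_ms, timeout_ms, reset_per_step):
    """Return True if a request made of sequential steps finishes in time.

    Toy model only. `step_durations_ms` lists how long each step
    (initial read, read repair, short-read retry, ...) takes.
    """
    if reset_per_step:
        # Earlier behavior as described above: every step gets a fresh timeout.
        return all(d <= timeout_ms for d in step_durations_ms)
    # 3.10 and later: one budget covers the whole request on the coordinator.
    return sum(step_durations_ms) <= timeout_ms


steps = [3000, 3000]  # two steps of 3 seconds each (made-up numbers)
old_ok = completes_within_timeout(steps, timeout_ms=5000, reset_per_step=True)
new_ok = completes_within_timeout(steps, timeout_ms=5000, reset_per_step=False)
```

With these numbers the request succeeds under the per-step model and times out under the whole-request model, which is why timeouts tuned for earlier versions may need to be raised.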

Cassandra 3.8 changes

  • Shared pool threads are now named according to the stage they are executing tasks for. Thread names mentioned in traced queries change accordingly.
  • A new option has been added to cassandra-stress, "-rate fixed={number}/s", that forces a scheduled rate of operations/sec over time. Using this, stress can accurately account for coordinated omission from the stress process.
  • The cassandra-stress "-rate limit=" option has been renamed to "-rate throttle=".
  • HDR histograms have been added to stress runs. Their output can be saved to disk using the "-log hdrfile=" option. The histogram includes response/service/wait times when used with the fixed or throttle rate options. The histogram file can be plotted at http://hdrhistogram.github.io/HdrHistogram/plotFiles.html
  • TimeWindowCompactionStrategy has been added. This has proven to be a better approach to time series compaction and new tables should use this instead of DTCS. See CASSANDRA-9666 for details.
  • DateTieredCompactionStrategy has been deprecated - new tables should use TimeWindowCompactionStrategy. Note that migrating an existing DTCS-table to TWCS might cause increased compaction load for a while after the migration so make sure you run tests before migrating. Read CASSANDRA-9666 for background on this.
  • Change-Data-Capture (CDC) is now available. See cassandra.yaml for CDC-specific flags and a brief explanation of on-disk locations for archived data in CommitLog form. CDC can be enabled via ALTER TABLE ... WITH cdc=true. Upon flush, CommitLogSegments containing data for CDC-enabled tables are moved to the data/cdc_raw directory until removed by the user. Writes to CDC-enabled tables are rejected with a WriteTimeoutException once cdc_total_space_in_mb is reached between unflushed CommitLogSegments and cdc_raw. NOTE: CDC is disabled by default in the .yaml file. Do not enable CDC on a mixed-version cluster, as it will lead to exceptions which can interrupt traffic. Once all nodes have been upgraded to 3.8, it is safe to enable this feature and restart the cluster.
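Because segments stay in data/cdc_raw until the user removes them, and writes are rejected once the space limit is reached, a CDC deployment needs some consumer that processes and deletes those files. The following is a minimal sketch of such a loop, assuming a caller-supplied `handle_segment` callback; the function name, the callback, and the file names in the demo are hypothetical, and parsing the commit log format is out of scope here.

```python
import tempfile
from pathlib import Path


def drain_cdc_raw(cdc_raw_dir, handle_segment):
    """Hand every file in cdc_raw_dir to handle_segment, then delete it.

    Files are visited in name order. A file is deleted only after
    handle_segment returns without raising, so a segment that fails
    stays on disk for a later retry.
    """
    handled = []
    for segment in sorted(Path(cdc_raw_dir).iterdir()):
        if not segment.is_file():
            continue
        handle_segment(segment)
        segment.unlink()
        handled.append(segment.name)
    return handled


# Demo against a temporary directory standing in for data/cdc_raw.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("segment-2.log", "segment-1.log"):  # made-up file names
        (Path(tmp) / name).write_bytes(b"\x00")
    handled = drain_cdc_raw(tmp, handle_segment=lambda path: None)
    remaining = list(Path(tmp).iterdir())
```

Deleting only after the callback succeeds trades some disk usage for not losing unprocessed change data.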

Cassandra 3.8 upgrade considerations

  • The ReversedType behaviour has been corrected for clustering columns of BYTES type containing empty value. Scrub should be run on the existing SSTables containing a descending clustering column of BYTES type to correct their ordering. See CASSANDRA-12127 for more details.
  • Ec2MultiRegionSnitch will no longer automatically set broadcast_rpc_address to the public instance IP if this property is defined on cassandra.yaml.
  • The names "json" and "distinct" are no longer valid as user-defined function names (they are still valid as column names, however). In the unlikely case that you have defined functions with such names, you must recreate them under a different name, change your code to use the new names, and drop the old versions, all _before_ upgrading (see CASSANDRA-10783 for more details).

Cassandra 3.7 upgrade considerations

  • A maximum size for SSTables values has been introduced, to prevent out of memory exceptions when reading corrupt SSTables. This maximum size can be set via max_value_size_in_mb in cassandra.yaml. The default is 256MB, which matches the default value of native_transport_max_frame_size_in_mb. SSTables will be considered corrupt if they contain values whose size exceeds this limit. See CASSANDRA-9530 for more details.
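Since values above the limit cause an SSTable to be treated as corrupt, an application that writes large blobs may want to check sizes before writing. A minimal sketch of such a client-side guard follows; the function name is hypothetical, and treating one megabyte as 1024 * 1024 bytes is an assumption made for this example.

```python
def exceeds_max_value_size(value, max_value_size_in_mb=256):
    """Return True if a bytes value is larger than the configured limit.

    Assumes 1 MB = 1024 * 1024 bytes. The default of 256 matches the
    documented default of max_value_size_in_mb.
    """
    return len(value) > max_value_size_in_mb * 1024 * 1024
```

Pass the same value configured for max_value_size_in_mb in cassandra.yaml if it differs from the default.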

Cassandra 3.6 changes

  • JMX connections can now use the same auth mechanisms as CQL clients. New options in cassandra-env.(sh|ps1) enable JMX authentication and authorization to be delegated to the IAuthenticator and IAuthorizer configured in cassandra.yaml. The default settings still only expose JMX locally, and use the JVM's own security mechanisms when remote connections are permitted. For more details on how to enable the new options, see the comments in cassandra-env.sh. A new class of IResource, JMXResource, is provided for the purposes of GRANT/REVOKE via CQL. See CASSANDRA-10091 for more details. Also, directly setting JMX remote port via the com.sun.management.jmxremote.port system property at startup is deprecated. See CASSANDRA-11725 for more details.
  • JSON timestamps are now in UTC and contain the timezone information, see CASSANDRA-11137 for more details.
  • Collision checks are performed when joining the token ring, regardless of whether the node should bootstrap. Additionally, replace_address can legitimately be used without bootstrapping to help with recovery of nodes with partially failed disks. See CASSANDRA-10134 for more details.
  • Key cache will only hold indexed entries up to the size configured by column_index_cache_size_in_kb in cassandra.yaml in memory. Larger indexed entries will never go into memory. See CASSANDRA-11206 for more details.
  • For tables that have a default_time_to_live, specifying a TTL of 0 removes the TTL from the inserted or updated values.
  • Startup is now aborted if corrupted transaction log files are found. The details of the affected log files are now logged, allowing the operator to decide how to resolve the situation.
  • Filtering expressions are made more pluggable and can be added programmatically via a QueryHandler implementation. See CASSANDRA-11295 for more details.
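The interaction between a table's default_time_to_live and an explicit TTL of 0, described in the list above, can be summarized as a small decision function. This is an illustrative model rather than Cassandra code: `effective_ttl` is a hypothetical name, and `using_ttl=None` stands for a statement that has no USING TTL clause at all.

```python
def effective_ttl(default_time_to_live, using_ttl=None):
    """Return the TTL in seconds applied to a write, or None for no expiry.

    default_time_to_live: the table option (0 means no default TTL).
    using_ttl: the value from USING TTL, or None when the clause is absent.
    """
    if using_ttl is None:
        ttl = default_time_to_live   # no clause: the table default applies
    else:
        ttl = using_ttl              # explicit value overrides the default
    # A TTL of 0, from either source, means the value never expires.
    return ttl if ttl > 0 else None
```

For a table with a one-hour default, `effective_ttl(3600)` yields 3600, while `effective_ttl(3600, 0)` yields None because the explicit 0 removes the TTL.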

Cassandra 3.4 changes

  • Internal authentication now supports caching of encrypted credentials. Reference cassandra.yaml:credentials_validity_in_ms.
  • Remote configuration of auth caches via JMX can be disabled using the system property cassandra.disable_auth_caches_remote_configuration.
  • The sstabledump tool has been added as the 3.0 version of the former sstable2json. The tool only supports v3.0+ SSTables. See the tool's help for more detail.
  • The mbean interfaces org.apache.cassandra.auth.PermissionsCacheMBean and org.apache.cassandra.auth.RolesCacheMBean are deprecated in favor of org.apache.cassandra.auth.AuthCacheMBean. This generalized interface is common across all caches in the auth subsystem. The specific mbean interfaces for each individual cache will be removed in a subsequent major version.

Cassandra 3.2 changes

  • We now make sure that a token does not exist in several data directories. This means that we run one compaction strategy per data_file_directory and we use one thread per directory to flush. Use nodetool relocatesstables to make sure your tokens are in the correct place, or just wait and compaction will handle it. See CASSANDRA-6696 for more details.
  • Maximum in-flight commit log replay mutation bytes are now bounded to 64 megabytes, tunable via cassandra.commitlog_max_outstanding_replay_bytes.
  • Support for type casting has been added to the selection clause.
  • Hinted handoff now supports compression. Reference cassandra.yaml:hints_compression. Note: hints compression is currently disabled by default.
  • The Thrift API is deprecated and will be removed in Cassandra 4.0.

Cassandra 3.2 upgrade considerations

  • The compression ratio metrics computation has been modified to be more accurate.
  • Running Cassandra as root is prevented by default.
  • JVM options are moved from cassandra-env.(sh|ps1) to the jvm.options file.

Cassandra 3.1 upgrade considerations

  • The return value of SelectStatement::getLimit has been changed from DataLimits to int.
  • Custom index implementations should be aware that the method Indexer::indexes() has been removed, as its contract was misleading and all custom implementations should almost surely have returned true unconditionally for that method.
  • GC logging is now enabled by default (you can disable it in the jvm.options file if you prefer).

TinkerPop changes for DSE 6.7.0

A list of DataStax Enterprise 6.7.0 production-certified changes in addition to Apache TinkerPop 3.3.3.

DataStax Enterprise (DSE) 6.7.0 includes all changes from previous DSE releases plus these production-certified changes that are in addition to Apache TinkerPop™ 3.3.3:
  • Upgrade to Groovy 2.4.15 - resolves a Groovy bug preventing Lambda creation in GLVs in some cases. (TINKERPOP-1953)
  • Implement TraversalSelectStep - expands the capability of the select() step by allowing nesting as in select("a").select(select("n")) which thus allows for dynamic keys for select() (6.0.1+). (TINKERPOP-1628)
  • Performance enhancement to Bytecode deserialization. (TINKERPOP-1936)
  • Traversal construction performance enhancements. (TINKERPOP-1950)
  • Path history isn't preserved for keys in mutations. (TINKERPOP-1947)
  • Profile step and iterate do not play nicely with each other (6.0.1+). (TINKERPOP-1869)