OpsCenter Release Notes

OpsCenter release notes provide information about new and improved features, known and resolved issues, and bug fixes.

Release impacts

Understand upgrade impacts, compatibility with DSE versions, and known issues.

Before reading release notes, review the following information to understand upgrade impacts, compatibility with DataStax Enterprise (DSE) versions, and known issues for the OpsCenter version.

Upgrade Information

Important: Review the New features in DSE OpsCenter 6.5 pertinent to the release. Additionally, review upgrade considerations as noted in Upgrading DSE OpsCenter. Configuration and other notable changes are provided in detail.

Compatibility

To see which versions of DataStax Enterprise (DSE) are supported with OpsCenter 6.x, see the OpsCenter Compatibility chart.

Known Issues

Important: Review the list of known issues before running a new OpsCenter version on a production DSE cluster.

OpsCenter 6.5.8 release notes

Release notes for the OpsCenter and Lifecycle Manager version 6.5.8 release.

7 May 2020

Changes in 6.5.8

The following changes are included in this release.

Core
  • Permissions are now more restrictive on the SSL directory for package installs. (OPSC-16136)
  • Added the ability to enable hostname verification on LDAP SSL/TLS connections. (OPSC-16354)
  • Changed rpc address logging from info to debug. (OPSC-16456)
  • Removed IP constraint on [agents] reported_interface to allow use of hostnames for failover. (OPSC-16470)
  • Prevented OpsCenter from recommending upgrading to OpsCenter 6.8.x if there is a DSE 5.1.x cluster. (OPSC-16577)
  • Improved failover logging. (OPSC-13463)
  • Changed OpsCenter disconnect cluster so that it does not automatically remove the associated LCM cluster. (OPSC-15026)
  • Updated the JGroups dependency to prevent vulnerabilities in OpsCenter. (OPSC-16105)
  • Updated c3p0 library to prevent vulnerabilities in OpsCenter. (OPSC-16109)
  • Fixed vulnerabilities in the OpsCenter user interface. (OPSC-16221)
  • Corrected an issue which caused the nodes in the ring view to appear gray if the client-to-node encryption section was missing from cassandra.yaml. (OPSC-16255)
  • Fixed an issue that caused a 404 error when retrieving a DataStax Agent configuration value of false. (OPSC-16384)
  • DataStax Agent now cleans up diagnostic tarball files on disk after transfer. (OPSC-16507)
  • Fixed an issue where OpsCenter user interface session caches were invalidated but not cleaned up properly. (OPSC-16572)
  • OpsCenter now cleans up directories created during a diagnostic. (OPSC-16300)
  • Fixed issues with the DataStax agent diagnostic download after failover. (OPSC-16166)
  • Fixed issues in how OpsCenter generates redirect responses during login. (OPSC-16011)
  • Improved the documentation for [ui] storagemap_ttl in opscenterd.conf to help users correctly configure the parameter. (OPSC-14320)
Backup Service
  • Fixed an issue which could cause a StackOverflowException during backups, restores, and metrics gathering. (OPSC-15977)
  • Added datacenter selection to restore. (OPSC-16369)
  • Fixed several issues that could prevent destinations from getting cleaned up from the cluster configuration file after a restore. (OPSC-13253)
  • Corrected an issue that prevented a datacenter snapshot from being taken when nodes outside of that datacenter had problems with their DataStax Agents. (OPSC-16338)
  • When running a backup for select datacenters, the disk space check no longer queries nodes not included in the backup. (OPSC-16407)
  • Fixed an issue causing unneeded checks of remote file sizes during restore from a destination. (OPSC-16431)
Monitoring
  • Added chrony output to the diagnostic tarball. (OPSC-16560)
  • Fixed an issue where the diagnostic tarball download link would be incorrect when behind a proxy with a subpath. (OPSC-15566)
  • Fixed check-2i-cardinality warning caused by OpsCenter backup_reports index. (OPSC-15895)
  • Improved configuration of percentile alerts with a configurable duration and separate histogram aggregation window for the calculation. (OPSC-16115)
  • Fixed iostat errors that appeared in agent.log when iostat does not support the -s flag. (OPSC-16181)
  • Improved the performance of the metric fetcher when querying values from the storage cluster. (OPSC-16559)
  • Fixed an issue with the All Graphs option when adding a graph metric to a dashboard graph. (OPSC-16034)
  • Fixed an issue that could prevent alert rules from populating in the user interface after enabling or disabling an alert rule. (OPSC-16233)
Provisioning
  • Fixed LCM jobs hanging under certain conditions. Such conditions now result in job failure, with details in the opscenterd logs. (OPSC-16176)
  • Fixed OpsCenter and LCM workflows that resulted in seemingly identical clusters. (OPSC-16520)
Repair Service
  • Repair Service temporary files are cleaned up more quickly. (OPSC-15982)
Platform
  • Corrected an issue that caused the DataStax Agent to rapidly spawn new threads when trying to restart Repair Service while OpsCenter is down. This issue caused the DataStax Agent to reach the maximum operating system thread limit. (OPSC-16213)

OpsCenter 6.5.7 release notes

Release notes for the OpsCenter and Lifecycle Manager version 6.5.7 release.

21 October 2019

Changes in 6.5.7

The following changes are included in this release.

Backup Service
  • Corrected an issue that prevented the search index from rebuilding after a point-in-time restore. (OPSC-15809)
  • Fixed a bug in the commitlog cleanup throttle that prevented future cleanups from running. (OPSC-15869)
Core
  • If a keystore or truststore file fails to load, OpsCenter logs the keystore or truststore file that failed to load. (OPSC-13632)
  • Sensitive information, including passwords and S3 tokens, is now omitted from diagnostic tarball collection. (OPSC-14760)
  • The OpsCenter user interface now properly displays responses returned from its API. (OPSC-15815)
  • Upgraded OpsCenter and DSE Agent dependencies to address security vulnerabilities. (OPSC-16090, OPSC-16148)
Best Practice Service
  • Changed default Best Practice rule schedules to execute over an hour rather than all at once. (OPSC-16023)
Monitoring
  • When trying to view NodeSync metrics for a table that is ignored by the metrics system, a warning displays. When enabling NodeSync for an OpsCenter rollups table, a warning displays. (OPSC-14614)
  • Provides new Insights diagnostic data tarball for download only as requested by DataStax Support. (OPSC-15945)
Provisioning
  • Improved usability for using a configured HTTP proxy when adding a repository in LCM. (OPSC-15526)
  • Fixed a bug that showed duplicate clusters in OpsCenter when re-running an install job on an existing cluster. (OPSC-15888)
  • Fixed a bug that prevented changes to the LCM datacenter model after the first complete install job. (OPSC-15892)
  • Allows changes to WAIT_FOR_START and WAIT_FOR_STOP defaults when adding a configuration profile. (OPSC-16155)
Repair Service
  • Improved logging statements in Repair Service to clarify which type of repair job is being logged. (OPSC-15913)
  • Removed URI length restriction for [repair_service] ignore_keyspaces and [repair_service] ignore_tables to ensure specified keyspaces and tables are excluded from subrange repairs. To use this improvement, upgrade opscenterd and all the DataStax Agents. (OPSC-13245)

OpsCenter 6.5.6 release notes

Release notes for the OpsCenter and Lifecycle Manager version 6.5.6 release.

cassandra-env.sh

The location of the cassandra-env.sh file depends on the type of installation:
  • Package installations: /etc/dse/cassandra/cassandra-env.sh
  • Tarball installations: installation_location/resources/cassandra/conf/cassandra-env.sh

15 May 2019

Highlights

  • Upgraded DSE Java Driver to 1.8.1.
  • Fixed an issue where the default locations for commit log backups could not be entered in the OpsCenter interface.
  • Fixed an issue where users could not interact with the Cannot connect to cluster screen.
  • Removed automatic downloads of the Oracle JRE due to Oracle licensing changes.

Changes in 6.5.6

The following changes are included in this release.

Backup Service
  • Improved metric queries to use Best Practice rules by using prepared statements with parameters for queries from rollup tables. (OPSC-13149)
  • Fixed an issue where alert types that permit immediate notification could not be edited if immediate notification was selected. (OPSC-13704)
  • Changed commit log archiving when starting the DataStax Agent to handle all commit logs as a batch instead of processing each file individually. (OPSC-13782)
  • When recreating a keyspace with nested UDTs, fixed dependency order of UDTs to control the order OpsCenter restores the UDTs. (OPSC-15127)
  • Fixed an issue that prevented some detailed information about backups from displaying in the OpsCenter interface. (OPSC-15200)
  • Fixed missing destination UI bug by keeping UI cache of destinations in sync with the server when a backup is run. (OPSC-15206)
  • Added logging to clarify errors that can occur when creating a snapshot with DSE 6.0.3 or DSE 6.0.4. (OPSC-15309)
  • Backup Service now uses the LZ4 algorithm for compression, which makes compression four times faster but results in 10% larger files. Older backups that use gzip can still be restored with OpsCenter. (OPSC-15633)
  • Fixed an issue where the default locations for commit log backups could not be entered in the OpsCenter interface. (OPSC-15683)
  • Upgraded DSE Java Driver to 1.8.1. (OPSC-15719)
  • Fixed bug in parsing tiered_storage_options. (OPSC-15724)
  • Fixed an issue that caused the minimum percentile alerts to return a high value instead of 0 when given only zeros for data. (OPSC-15762)
Core
  • Improved memory handling of tooltips. (OPSC-6524)
  • The OpsCenter UI now properly escapes all JSON responses returned from its API. (OPSC-11508)
  • Rollover log files for opscenterd and other configured rollover log files are now included in the diagnostic tarball. (OPSC-12141)
  • Added documentation for the /logout API. (OPSC-13147)
  • Fixed an issue where the wrong log4j.properties configuration file was included in the installed_location/agent/conf directory of opscenterd tarball distributions. (OPSC-14729)
  • Updated DataStax Agent key generation to utilize RSA instead of DSA and updated documentation. (OPSC-15123)
  • Fixed an issue where OpsCenter returned an error page instead of redirecting to the login page when authentication was enabled. (OPSC-15630)
  • Fixed an issue where users could not interact with the Cannot connect to cluster screen, including being able to select text or click hyperlinks, when configured clusters are unavailable. (OPSC-15767)
Monitoring
  • Corrected an issue where an alert could trigger emails even after it was deleted. (OPSC-13861)
NodeSync
  • Enhanced link styling on the NodeSync status page to make clickable entities more distinct and obvious. (OPSC-15514)
Performance Service
  • Removed blocking CQL queries to improve application performance. (OPSC-15574)
Best Practice Service
  • Include all queries and tables in the error message relating to the Use prepared statements Best Practice rule. (OPSC-15647)
Provisioning
  • Definitions added to allow configuration of CASSANDRA_HEAPDUMP_DIR in cassandra-env.sh. (OPSC-12377)
  • Added more logging to the post-install script for the Debian package installation. (OPSC-15106)
  • Fixed a bug preventing node_install_idle_timeout from being respected in LCM. (OPSC-15376)
  • LCM now displays a useful error and link to documentation when trying to edit or use a Config Profile with an unsupported DSE version. (OPSC-15381)
  • Fixed LCM API uniqueness check for entities on PUT requests when one or more unique key fields are missing from the request. The LCM API now gives an error when the user submits a change for a read-only field. Previously, the LCM API ignored these changes with no error. (OPSC-15656)
  • LCM cluster model and related resources are now immutable while there are associated jobs in the queue. (OPSC-15714)
  • LCM import cluster and DataStax Agent installation jobs will no longer run concurrently with any other job type. This change prevents the troublesome situation where one of these job types is run on a cluster that is already managed while another job type is active on that cluster. (OPSC-15716)
  • Improved the error message for LCM job failures caused by the end of public updates for Oracle Java. (OPSC-15845)
  • Removed automatic downloads of the Oracle JRE due to Oracle licensing changes. (OPSC-15871)
  • After three unsuccessful attempts to update definitions, OpsCenter prints a log message: Experienced 3 consecutive failures downloading definitions, disabling updates until the next restart. Instead of continuing to check for definitions and logging a stacktrace error, OpsCenter does not attempt to update definitions again until OpsCenter restarts. (OPSC-10468)
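
The definitions-update behavior described in OPSC-10468 resembles a simple consecutive-failure cutoff. The following is a minimal sketch of that pattern, not OpsCenter's actual implementation: the class name DefinitionUpdater, the download callable, and the rule that a success resets the failure count are all assumptions made for illustration. Only the threshold of three and the log message come from the release note.

```python
from typing import Callable

MAX_CONSECUTIVE_FAILURES = 3  # threshold stated in the release note


class DefinitionUpdater:
    """Hypothetical illustration of a consecutive-failure cutoff."""

    def __init__(self, download: Callable[[], None]) -> None:
        self._download = download  # assumed callable that raises on failure
        self._failures = 0
        self.disabled = False  # stays True until the process is restarted

    def try_update(self) -> bool:
        """Attempt one update; return True only if the download succeeded."""
        if self.disabled:
            return False  # no further attempts until restart
        try:
            self._download()
        except Exception:
            self._failures += 1
            if self._failures >= MAX_CONSECUTIVE_FAILURES:
                self.disabled = True
                print(f"Experienced {self._failures} consecutive failures "
                      "downloading definitions, disabling updates until "
                      "the next restart.")
            return False
        self._failures = 0  # assumption: a success resets the counter
        return True
```

With a download callable that always fails, the first three calls to try_update() attempt the download, and every later call returns immediately without attempting it.
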
Repair Service
  • Fixed an issue where the current-task file was not removed from the DataStax Agent file system when a distributed subrange repair was paused. (OPSC-14612)
  • Prevented the Repair Service from crashing due to a Java long overflow when a repair task is persisted in SQLite. (OPSC-15225)
  • Implemented a fix to prevent leaking file descriptors by cleaning up open resources after each repair job completes. (OPSC-15466)
  • Fixed Repair Service alert regression when opscenterd is restarted during repairs. (OPSC-15746)

OpsCenter 6.5.5 release notes

Release notes for the OpsCenter and Lifecycle Manager version 6.5.5 release.

cluster_name.conf

The location of the cluster_name.conf file depends on the type of installation:
  • Package installations: /etc/opscenter/clusters/cluster_name.conf
  • Tarball installations: install_location/conf/clusters/cluster_name.conf
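
The two layouts above can be captured in a small helper. This is a minimal sketch and not part of OpsCenter: the function name cluster_conf_path and its parameters are hypothetical, and install_location stands for wherever the tarball was unpacked.

```python
from pathlib import PurePosixPath
from typing import Optional


def cluster_conf_path(cluster_name: str, install_type: str,
                      install_location: Optional[str] = None) -> PurePosixPath:
    """Return the expected location of cluster_name.conf (hypothetical helper)."""
    filename = f"{cluster_name}.conf"
    if install_type == "package":
        # Package installations use a fixed directory under /etc.
        return PurePosixPath("/etc/opscenter/clusters") / filename
    if install_type == "tarball":
        # Tarball installations are relative to the unpacked directory.
        if install_location is None:
            raise ValueError("tarball installations need install_location")
        return PurePosixPath(install_location) / "conf" / "clusters" / filename
    raise ValueError(f"unknown install_type: {install_type!r}")
```

For example, cluster_conf_path("prod", "package") points at /etc/opscenter/clusters/prod.conf, where "prod" is a made-up cluster name.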

31 January 2019

Highlights

  • Added KMS Managed Encryption (SSE-KMS) as an option for Amazon S3 backups.
  • Commit logs can now be archived while backing up a snapshot. The execution of the commit log retention policy has been refactored and will now be more consistent.
  • Increased the speed when backing up to a local file system and fixed a bug with throttling speed to the local file system.
  • Fixed a critical bug that caused OpsCenter to hang at regular intervals.

Changes in 6.5.5

The following changes are included in this release.

Backup Service
  • Commit logs can now be archived while backing up a snapshot. The execution of the commit log retention policy has been refactored and will now be more consistent. (OPSC-14126)
  • Fixed the destination pre-check to fail the entire backup if the pre-check fails. (OPSC-14508)
  • Increased the speed when backing up to a local file system and fixed a bug with throttling speed to the local file system. (OPSC-14876)
  • Added a fix to sort keyspaces in the selection list. (OPSC-15114)
  • Added node IP in Destination validation error messages. (OPSC-15166)
  • Added KMS Managed Encryption (SSE-KMS) as an option for Amazon S3 backups. (OPSC-15170)
  • Fixed an issue causing restores to fail when restoring a keyspace containing user-defined types (UDTs). (OPSC-15308)
  • Changed permissions so that the diagnostic tarball only requires read permissions on files. (OPSC-15382)
  • Fixed a bug with region selection so that the Amazon aws-cli uses the region specified for the destination. (OPSC-15435)
  • Fixed an issue with the Backup Service and SSTable attached secondary indexes. (OPSC-15489)
  • Improved the speed of backups to local file system destinations. (OPSC-15530)
  • Improved handling of exceptions when trying to back up to an Amazon S3 bucket that does not exist. (OPSC-15544)
  • Fixed a bug where the Location form does not close when adding a new location for a Point In Time restore. (OPSC-15573)
Core
  • Disconnecting a cluster no longer fails if the cluster configuration file was already removed. (OPSC-11318)
  • Removed non-determinism from RollupReporter restart. (OPSC-13798)
  • DNS names no longer try to resolve during configuration validation. (OPSC-14181)
  • Made a change to always remove the server response header from opscenterd web server responses for security purposes to combat vulnerabilities in a known version of Twisted web server. (OPSC-14866)
  • Fixed an issue where a benign warning message would be logged when opscenterd started. (OPSC-14912)
  • Fixed an issue where STOMP would not come up on some platforms using the LANG=C.UTF-8 variable. (OPSC-15251)
  • Fixed an issue where STOMP is attempting to reconnect, causing OpsCenter to hang. (OPSC-15357)
  • Fixed an issue where OpsCenter generates too many asynchronous CQL queries, which results in a NoHostAvailableException. (OPSC-15461)
Monitoring
  • Corrected an unhandled exception when retrieving metrics from clusters with datacenters that contain hyphens in their names. (OPSC-14747)
Performance Service
  • Fixed an issue where some configuration parameters in the [agent_config] section of cluster_name.conf could not be parsed by the DataStax Agents. (OPSC-12258)
Provisioning
  • Added a banner notification concerning the end of public availability of Oracle Java 8. (OPSC-14679)
Solr
  • Removed uses of ALLOW FILTERING from queries executed by OpsCenter to avoid triggering Best Practice Service rules. (OPSC-12992)

OpsCenter 6.5.4 release notes

Release notes for the OpsCenter and Lifecycle Manager version 6.5.4 release.

4 December 2018

Highlights

  • Implemented a fix for a critical bug that caused all active, compressed SSTable backups to be cleaned up unnecessarily, resulting in incomplete backups. Active, uncompressed SSTable backup files were unaffected.
  • Enabled support for restores on DSE clusters using configuration encryption and client-to-node encryption.

Changes in 6.5.4

The following changes are included in this release.

Core
  • Made changes to include the output of DESCRIBE FULL SCHEMA in the diagnostic tarball downloaded from OpsCenter. (OPSC-13290)
  • Enhanced OpsCenter to support LDAP searches for users without specifying an Organizational Unit (OU). Also added the ability to follow LDAP referrals. (OPSC-13384)
  • Fixed an issue where OpsCenter indicated that a change to the OpsCenter keyspace replication strategy failed, when selecting the link from the notification about the OpsCenter keyspace using SimpleStrategy for replication in a multi-datacenter environment. (OPSC-14406)
  • OpsCenter now adds the HttpOnly flag to its login session cookie to help prevent XSS attacks. (OPSC-14868)
  • Packages now include extra build information in the following files to aid in troubleshooting and support: ds_branch.txt, ds_version.txt, and ds_timestamp.txt. These files now include branch, commit, version, and timestamp information. (OPSC-15201)
  • Upgraded Dojo to version 1.14, which includes security patches. See the NIST website for more information. (OPSC-15327)
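
For readers unfamiliar with the HttpOnly flag mentioned in OPSC-14868, the sketch below shows what such a cookie header looks like, using Python's standard http.cookies module. It is an illustration only: the cookie name opsc_session and its value are made up, and this is not how opscenterd itself builds the header.

```python
from http.cookies import SimpleCookie


def build_session_cookie(name: str, value: str) -> str:
    """Return a Set-Cookie header line with the HttpOnly flag set."""
    cookie = SimpleCookie()
    cookie[name] = value
    # HttpOnly tells browsers not to expose the cookie to page scripts.
    cookie[name]["httponly"] = True
    return cookie.output()


# Hypothetical cookie name and value, for illustration only.
header = build_session_cookie("opsc_session", "abc123")
print(header)
```
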
Monitoring
  • Implemented a change so that data for average time and average request for Solr cores comes from QueryMetrics MBean rather than older Solr MBeans. (OPSC-14845)
Backup Service
  • Enabled support for restores on DSE clusters using configuration encryption and client-to-node encryption. (OPSC-12312)
  • Improved exception handling relating to periodic failures before and after running the Backup script. (OPSC-12405)
  • Removed requirement that the backup_storage_dir must be on the same partition as the DataStax Agent tmp_dir. (OPSC-13108)
  • Fixed a small rendering issue in the Restore from Backup: Other Location form. (OPSC-14226)
  • Fixed an error that displayed when clicking Cancel after OpsCenter prompts whether you want to delete a scheduled job. (OPSC-14715)
  • Fixed a bug that could cause problems when restoring materialized views. (OPSC-14727)
  • Reduced memory required when Backup Service is taking a snapshot. (OPSC-15046)
  • Fixed an issue with point-in-time restores when an On Server destination is the only destination defined. (OPSC-15052)
  • Fixed Solr restore handling to be case sensitive. (OPSC-15117)
  • Fixed a bug that generated an error indicating that a Solr core could not be created because the associated table did not exist. This error occurred when tables backed by Solr cores were dropped before running the restore, but the keyspace was not dropped. (OPSC-15187)
  • Prevented errors about missing schema.cql for system tables when taking a backup. (OPSC-15198)
  • Fixed a bug where the text value of the button label was passed in the parameter to remove the selected backup destination when selecting Delete Backup Data. (OPSC-15215)
Best Practice Service
  • Fixed an error in the Secondary indexes cardinality Best Practice rule where a list of nodes displayed instead of information about too many secondary indexes in keyspaces and tables. (OPSC-15209)
Provisioning
  • Improved error message returned when the $JAVA_HOME environment variable is invalid. (OPSC-14390)
  • LCM now performs client-side health checks against each node in the job by executing a local query before the job is considered successful on that node. (OPSC-14848)
  • The DSE health check timeout (dse_healthcheck_startup_timeout) can now be set to configure how long LCM will wait for DSE to start up. (OPSC-15014)
  • LCM health checks for DSE startup now retry if the service script says the service is not running. It was observed that the status might be inaccurate early on during service start. (OPSC-15043)
  • Added the ability to have LCM transfer meld to a node using SCP rather than SFTP. The default behavior is to try SFTP first and fall back to SCP if a failure occurs. This option can be configured using the meld_upload_method parameter. (OPSC-15221)
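
The default transfer behavior described in OPSC-15221 (try SFTP first, fall back to SCP on failure) follows a common try-then-fall-back pattern. A minimal sketch, assuming two hypothetical upload callables that raise an exception on failure. It does not model the meld_upload_method parameter itself, whose accepted values are not listed here.

```python
from typing import Callable


def upload_with_fallback(sftp_upload: Callable[[str], None],
                         scp_upload: Callable[[str], None],
                         path: str) -> str:
    """Try SFTP first; if it fails, fall back to SCP.

    Returns the name of the method that succeeded. Both callables are
    hypothetical stand-ins for real transfer functions.
    """
    try:
        sftp_upload(path)
        return "sftp"
    except Exception:
        # SFTP failed (for example, the subsystem is disabled on the node).
        scp_upload(path)  # let an SCP failure propagate to the caller
        return "scp"
```
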
Repair Service
  • Implemented a fix to prevent the first Repair Service progress alert from being triggered before the period (in seconds) configured by error_logging_window elapses. (OPSC-13166)
  • Added safeguards to prevent orphaned repair tasks from affecting the currently running repair jobs, which could have caused Repair Service jobs to deadlock. (OPSC-14218)
  • Fixed a bug for Distributed Subrange Repair (DSR) to honor the max_parallel_repairs property, which was remaining at a value of 1 regardless of the specified value. (OPSC-14947)
  • Implemented a change to use Java long to prevent the Repair Service from crashing due to Java int overflow during subrange repairs. (OPSC-15182, OPSC-15255)
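
The int overflow behind OPSC-15182 and OPSC-15255 can be reproduced in miniature. Python integers do not overflow, so this sketch uses ctypes fixed-width types to mimic a 32-bit Java int and a 64-bit Java long. The helper names are hypothetical and the values are arbitrary, not taken from the Repair Service.

```python
import ctypes

INT_MAX = 2**31 - 1  # largest value a 32-bit signed integer can hold


def as_java_int(value: int) -> int:
    """Wrap value the way a 32-bit signed integer would."""
    return ctypes.c_int32(value).value


def as_java_long(value: int) -> int:
    """Wrap value the way a 64-bit signed integer would."""
    return ctypes.c_int64(value).value


# One past INT_MAX wraps to a negative number in 32 bits,
# but is represented exactly in 64 bits.
print(as_java_int(INT_MAX + 1))
print(as_java_long(INT_MAX + 1))
```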

OpsCenter 6.5.3 release notes

Release notes for the OpsCenter and Lifecycle Manager version 6.5.3 release.

10 September 2018

Highlights

Implemented a fix for a critical bug that caused all active, compressed SSTable backups to be cleaned up unnecessarily, resulting in incomplete backups. Active, uncompressed SSTable backup files were unaffected.

Implemented DSR (Distributed Subrange Repair) as an alternative implementation of subrange repairs within the OpsCenter Repair Service, intended to better scale for large clusters. See Enabling distributed subrange repairs.

See New features for more details.

Changes in 6.5.3

The following changes are included in this release.

Core
  • Logging levels for OpsCenter and all DataStax agents in a cluster can now be set with a curl command. (OPSC-7105)
  • The DataStax agent now supports Transport Layer Security (TLS) with remote JMX. (OPSC-8375)
  • Added multi-role support for LDAP authentication. Added additional 'roles' field to '/users' and '/users/{username}' GET responses for getting all roles that a user belongs to. (OPSC-12740)
  • Corrected an issue that prevented the failover OpsCenter instance from connecting to the DataStax agents during failover. (OPSC-11742)
  • Improved favicon display in several browsers. (OPSC-13788)
  • Reduced memory usage in opscenterd when requests are made to agents. (OPSC-15037)
  • Added an authentication plugin framework to allow custom authentication strategies. (OPSC-14507)
Backup Service
  • Fixed an issue when using multi-level prefix paths in Backup Service. (OPSC-14687)
  • Restored marker in the backups location dialogue indicating that AWS key and secret are required for user supplied credentials. (OPSC-14702)
  • Fixed an issue where remote_backup_region values specified in the cluster configuration file were not used as bucket defaults. (OPSC-14775)
  • Fixed an issue with AWS Credentials Provider Chain related to IAM Roles. (OPSC-14939)
  • Fixed an issue in the UI where editing an Amazon S3 destination after restarting OpsCenter shows Enable S3 server-side encryption and Enable S3 transfer acceleration enabled when they are not. (OPSC-14982)
  • Fixed an issue that caused schema files to be repeatedly sent to a destination during a backup. (OPSC-15009)
  • Fixed a memory leak in the backup job execution cache. (OPSC-15015)
Repair Service
  • Implemented the DSR (Distributed Subrange Repair) feature, which is an alternative implementation of subrange repairs for the OpsCenter Repair Service. DSR is designed to scale for larger clusters by distributing more work to the agents. (OPSC-14283)
  • Omitted verbose C3P0 logging from the agent log file. (OPSC-14176)
  • Statistics of all DSR tasks are now reported by the API, not just stats of tasks that are completed or in-progress. (OPSC-14873)
Restore Service
  • Amazon S3 destinations now support selecting a region from all currently available regions in the UI. (OPSC-14692)
  • Destination validation logic now happens in a DataStax agent. (OPSC-14611)
  • OpsCenter will now properly log exceptions from LDAP containing Unicode characters. (OPSC-14452)
  • Corrected an issue that caused the restore status to initially show 100% then reset to 0%. (OPSC-14995)
Monitoring
  • Improved the color scheme in the node status UI. (OPSC-12618)
Provisioning
  • Added information about upgrade jobs to the cluster workspace tooltip. (OPSC-13107)
  • Enlarged the SSH Private Key field in LCM UI to improve readability when entering SSH keys. (OPSC-13509)
  • Improved LCM error messages when invalid characters are submitted for usernames or entity names. (OPSC-14411)
  • Improved error message when LCM attempts to update the default administrator password in the cassandra CQL account, but no new password has been specified on the LCM cluster model. (OPSC-14586)
  • Performing a minor upgrade on yum-based systems when dse-demos is installed no longer upgrades DSE to the latest available version. (OPSC-14608)

OpsCenter 6.5.2 release notes

Release notes for the OpsCenter and Lifecycle Manager version 6.5.2 release.

8 August 2018

Highlights

Implemented a fix for a critical bug that caused all active, compressed SSTable backups to be cleaned up unnecessarily, resulting in incomplete backups. Active, uncompressed SSTable backup files were unaffected.

See New features for more details.

Changes in 6.5.2

The following changes are included in this release.

Backup Service
  • Fixed a critical bug that caused all active, compressed SSTable backup files (.gz) to be cleaned up unnecessarily, resulting in incomplete backups. (OPSC-14880)

OpsCenter 6.5.1 release notes

Release notes for the OpsCenter and Lifecycle Manager version 6.5.1 release.

5 July 2018

Highlights

  • OpsCenter now drops compact storage option from all tables inside the configured OpsCenter keyspace.
  • Fixed an issue where LCM jobs would fail to terminate when aborted.
  • Fixed a bug in the repair service parallel repairs calculation for subrange repairs.

See New features for more details.

Changes in 6.5.1

The following changes are included in this release.

Core
  • Removed arrow from Refresh label in Event log. (OPSC-14316)
  • OpsCenter now drops compact storage option from all tables inside the configured OpsCenter keyspace. (OPSC-14442)
  • Fixed an issue where the DataStax agents would always verify subject alternative names in certificates if the STOMP address was a hostname. (OPSC-14551)
Backup Service
  • Added support to use system default credentials for Amazon S3 backups as described in https://docs.aws.amazon.com/sdk-for-java/v2/developer-guide/credentials.html. (OPSC-5161)
  • Enabled support for restores on clusters with Kerberos. (OPSC-14236)
  • Fixed an issue where a restore would fail if the backup was taken shortly after dropping a column from a table. (OPSC-13029)
  • Added support to configure the backup storage directory (backup_storage_dir) using the commit log backup settings. (OPSC-14496)
  • Optimized backup file comparison synchronization. (OPSC-14559)
Performance Service
  • Fixed Read Stage and Mutation Stage best practice rules when running on DSE 6.0. (OPSC-14430)
Repair Service
  • Fixed a bug in the repair service parallel repairs calculation for subrange repairs. Previously, while calculating the required number of parallel repairs, the current number of parallel repairs allowed was not taken into account. This led to the repair service running significantly fewer repairs in parallel than the configured maximum, even when repair throughput was lower than needed to complete the full repair cycle by the completion deadline. The updated algorithm resolves the issue. (OPSC-13172)
  • Eliminated unnecessary logging when the repair service is running tasks at the max rate. (OPSC-14476)
  • Fixed a bug that prevented a repair cycle if the Repair Service was unable to find a task to run for over max_down_node attempts. (OPSC-14733)
Provisioning
  • Fixed an issue where LCM jobs would fail to terminate when aborted. (OPSC-14410)
Dashboard
  • Fixed an issue where cluster overview sparklines would load, but not update. (OPSC-13913)

OpsCenter 6.5.0 release notes

Release notes for the OpsCenter and Lifecycle Manager version 6.5.0 release.

17 April 2018

Highlights

  • Support for DSE 6.0
  • NodeSync Service
  • Lifecycle Manager improvements: support upgrading DSE versions on a datacenter or node within a supported release series; cloning and error validation in config profiles.

See New features for more details.

Changes in 6.5.0

The following changes are included in this release.

Core
  • Updated to the 1.6.2 version of the DSE driver. (OPSC-12624)
  • The agent can now be configured to enforce that certificate subjects match the opscenterd server. (OPSC-11806)
  • Opscenterd now only allows the TLSv1.2 protocol when HTTPS is enabled. (OPSC-11981)
  • Removed support for DSE versions older than 5.0. (OPSC-12784)
  • Improved default algorithm (rsa 2048) and hash (SHA256) for automatically generated self-signed TLS certificates. (OPSC-9941)
  • OpsCenter will fail to start if config_encryption_active is True and the system key file is missing. (OPSC-10284)
  • The first cluster that a role has permissions for is now automatically selected in the Cluster list of the Edit Role dialog. (OPSC-11759)
  • The default max heap size of the DataStax Agents has been increased to 1GB so that it better supports metrics collections in mid-large size clusters. The default heap size of opscenterd has also been increased to 2GB to better support the Repair Service. (OPSC-12313)
  • OpsCenter uses the G1 garbage collector instead of the Concurrent Mark-Sweep (CMS) garbage collector. (OPSC-12548)
  • OpsCenter and the DataStax Agent have added support for the new native_transport_* parameters introduced inside of cassandra.yaml in DSE 6.0. (OPSC-13181)
  • Tables in the OpsCenter schema are no longer created with compact storage because this feature has been removed in DSE 6.0. For more information, see the DSE upgrade guide (5.1 to 6.0) and the CQL commands for migrating from compact storage (DSE 5.1.x), or the DSE upgrade guide (5.0 to 6.0) and the CQL commands for migrating from compact storage (DSE 5.0.x) to CQL table format, depending on the DSE version. (OPSC-13527)
  • Fixed handling of API errors when enabling or disabling Alert Rules. (OPSC-12814)
  • Fixed timeout that caused the OpsCenter UI to fail to load. (OPSC-13053)
  • For DSE version 6.0 and later, the DataStax Agent no longer collects information about nodetool thriftstats in diagnostic tarball generation because it has been removed from DSE. (OPSC-13560)
  • Updated opscenterd to use CQL rather than HTTP to create Solr cores when necessary. (OPSC-12621)
  • The primary OpsCenter instance URL for failover can now be configured in the opscenterd.conf file. (OPSC-5409)
  • The cluster configuration and the DataStax Agents now have a configurable timeout for read requests made through the DataStax Java driver to both monitored and storage clusters. (OPSC-13919)
  • The agent now sends its agent_rpc_broadcast_address to opscenterd at the longtime_interval configured in address.yaml. This improves the agent's ability to connect to opscenterd automatically when using dense nodes. (OPSC-14049)
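
Because opscenterd allows only the TLSv1.2 protocol when HTTPS is enabled, scripts that call OpsCenter over HTTPS need a client that can negotiate that version. The following is a minimal Python sketch, not taken from the OpsCenter documentation, of a client-side SSL context pinned to TLS 1.2 using the standard library. The function name tls12_client_context and the ca_file parameter are illustrative assumptions.

```python
import ssl


def tls12_client_context(ca_file=None):
    """Build a client SSL context that negotiates TLS 1.2 only.

    ca_file is an optional path to a CA bundle (hypothetical parameter);
    when it is None, the system trust store is used.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    # Pin both bounds so the handshake can only settle on TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = tls12_client_context()
```

The resulting context can be passed to a standard-library HTTPS client, for example through the context argument of urllib.request.urlopen. For a self-signed opscenterd certificate, supply that certificate (or its CA) as ca_file rather than disabling verification.
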
Monitoring
  • Fixed an issue with scrolling to a highlighted activity during bulk operations in the Activities area of OpsCenter Monitoring. (OPSC-5651)
  • Added support for NodeSync metrics. (OPSC-10611)
  • New dropped message metrics have been added, and the TP: Dropped Paged Range Reads and TP: Dropped Request Responses metrics have been removed for DSE 6.0 and later. Several dropped message metrics have had their labels changed from TP: <message type> to Dropped Messages: <message type>. See Dropped Messages metrics. (OPSC-12777)
  • Additional thread pool metrics have been added for monitoring DSE 6.0 and later. (OPSC-12839)
  • Pagination in the Activity Event Log now works correctly when navigating back to the first events page. New events can be fetched from the server by clicking Refresh on the first page. (OPSC-4937)
  • Fixed an issue with Dashboard graph line colors when switching from a percentile graph in OpsCenter Monitoring. (OPSC-5258)
  • Updated metric descriptions to warn that key cache metrics only apply to pre-DSE 6.0 SSTables. (OPSC-13368)
Backup Service
  • OpsCenter restore will now recreate a graph if a graph was present at backup. (OPSC-11505)
  • Fixed an issue with restore that prevented additional indexes from being recreated when the restored table also included a Solr core. (OPSC-12756)
  • Graphs are now represented as top level objects in the restore workflow. Keyspaces that are part of a graph are now bundled together and are no longer shown individually when restoring a graph backup. (OPSC-12989)
  • Added an endpoint to the API that fetches a dictionary of the graphs and their associated keyspaces that are present in a particular snapshot. (OPSC-13007)
  • The AWS CLI feature for bulk uploading backups to Amazon S3 has been promoted from an OpsCenter Labs feature to an official production feature. Move your use_s3_cli configuration setting from the [labs] section to the [backups] section. (OPSC-13165)
  • Changed how the On Server commit log storage works. Commit logs are still initially moved into the backup_staging_dir, but after the commit logs have been sent to any other configured locations, the commit logs are moved to the directory specified by a backup_storage_dir defined in address.yaml. This approach should resolve a number of problems customers have encountered when restarting agents due to large numbers of On Server commit logs being reprocessed. See Configuring commit log backups for details. (OPSC-14073)
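
The use_s3_cli move from [labs] to [backups] noted in this list can be scripted. The following Python sketch assumes the configuration file is INI-style and can be parsed by the standard configparser module. The function name move_use_s3_cli is hypothetical, and which file holds the setting in a given installation is not assumed here.

```python
import configparser
import io


def move_use_s3_cli(conf_text):
    """Return conf_text with use_s3_cli moved from [labs] to [backups].

    Assumes INI-style input. Text without the option in [labs] is
    re-serialized unchanged in content.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)

    if parser.has_option("labs", "use_s3_cli"):
        value = parser.get("labs", "use_s3_cli")
        parser.remove_option("labs", "use_s3_cli")
        if not parser.has_section("backups"):
            parser.add_section("backups")
        parser.set("backups", "use_s3_cli", value)
        # Drop [labs] if the move left it empty.
        if not parser.options("labs"):
            parser.remove_section("labs")

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Note that configparser discards comments when it rewrites a file, so treat the output as a draft to review against the original rather than a drop-in replacement.
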
NodeSync Service
  • Added support for DSE NodeSync. (OPSC-12602)
  • NodeSync status has been added to the Nodes Services panels in OpsCenter Monitoring. (OPSC-13301)
  • Added a NodeSync Service section to enable NodeSync and view NodeSync status for a cluster's keyspaces and tables. (OPSC-13281)
  • Added a Best Practice rule to validate that the NodeSync Service is running on every cluster node. (OPSC-13064)
Repair Service
  • The parallel_tasks_update_interval configuration option has been added to the Repair Service. (OPSC-14252)
Best Practice Service
  • Added a Best Practice rule to validate that the NodeSync Service is running on every cluster node. (OPSC-13064)
Lifecycle Manager (LCM) Provisioning
  • Added the ability to clone Config Profiles in Lifecycle Manager. (OPSC-6428, OPSC-12595)
  • Repository setup in Lifecycle Manager is now optional for those who manually configure and manage their package repos externally from LCM. (OPSC-12343)
  • LCM now drains nodes (stops accepting writes and flushes memtables) before restarting them. This provides additional safety and speeds up restarts. (OPSC-12581)
  • The edit config profile page in the LCM UI can now display multiple field validation errors and makes it easy to navigate between them. (OPSC-12615)
  • LCM can now upgrade DSE on existing (previously installed) datacenters and nodes when a new config profile with a higher DSE patch version is assigned. Upgrading the DSE version in LCM is supported within a patch release series of DSE versions 5.0.x, 5.1.x, and later. (OPSC-9570, OPSC-12721)
  • LCM is now using React 15.6.2, up from React 0.14.8. (OPSC-12808)
  • Added concurrency options when executing a job on a cluster. (OPSC-5120)
  • When authentication is enabled, it is no longer necessary to enter credentials in the Job dialogs every time a job is run. After enabling authentication in a Config Profile, entering credentials at the cluster level is required only one time. Future credential changes are allowed. (OPSC-9439)
  • Added job events to the job detail screen and made the job detail node list more informative. (OPSC-9572)
  • Improved the ability of LCM to recover from some Java installation errors. (OPSC-10400)
  • LCM now includes all explicit node IP addresses when attempting to connect to the cluster for changing the default cassandra user password. This should fix password change for certain network setups. (OPSC-11096)
  • Job termination in LCM now waits about 10 seconds for PID file removal before giving up. This should reduce the frequency of the abort ending before Meld has had a chance to shut down cleanly. (OPSC-11099)
  • LCM API requests can now return multiple errors. (OPSC-11131)
  • A Show/Hide field descriptions button has been added to LCM for those fields that have descriptions or tooltips available. (OPSC-12283)
  • LCM job event detail may now contain a 'source' field containing the property path of the data related to the event. (OPSC-12340)
  • Configuration data in LCM is now validated against the definition files. This ensures proper structure and value types for config profile data. (OPSC-12433)
  • Configuration profile data is now validated against field dependencies in the definitions. This prevents inconsistent configuration data. (OPSC-12434)
  • LCM error responses now consistently use the message key for error text. (OPSC-12438)
  • The LCM API base URL has changed from v1 to v2. For additional details regarding LCM API updates, see Base URL version change in the OpsCenter New features. (OPSC-12500)
  • All passwords in Lifecycle Manager now require double entry for confirmation. (OPSC-12563)
  • Added OpsCenter and Lifecycle Manager support for DSE version 6.0.0. (OPSC-12634)
  • Updated serial install logic in converged DCs to be specific to the Automatic concurrency level. (OPSC-12780)
  • Added definition to support DSE startup property -Dcassandra.force_3_0_protocol_version=true. (OPSC-12944)
  • Added ability to filter on null values to LCM API. (OPSC-12990)
  • Updated dependency versions. (OPSC-13141)
  • Updated Korma version to latest. (OPSC-13170)
  • Lifecycle Manager edit dialogs for SSH credentials, repositories, and clusters now clearly indicate whether passwords have been set and provide change options. Removing a stored password is now an explicit option; inadvertently removing a stored password is no longer possible. (OPSC-13239)
  • Rearranged the Edit Credential comment field so that it is displayed after the required fields in the LCM UI. (OPSC-13621)
  • Improved the UI in LCM config profiles by providing additional room for custom list items. (OPSC-13680)
  • The workload options in the LCM Add Datacenter dialog have been renamed to align with DataStax products: Solr is now DSE Search; Spark is now DSE Analytics. (OPSC-13758)
  • Updated the LCM UI and API to reflect the DSE 6.0 rename of rpc_address and broadcast_rpc_address to native_transport_address and native_transport_broadcast_address respectively. (OPSC-13843)
  • Fixed a bug in LCM node IP address inheritance for broadcast_rpc_address. It now defaults to the native_transport_address (formerly rpc_address) setting, as indicated by the UI form. (OPSC-11970)
  • LCM will no longer inherit profile properties from a parent level when there are config profiles in a cluster topology for different DSE versions. Instead, the profile at the lowest topology level takes precedence and a warning is posted in the job events. (OPSC-12314)
  • Fixed a bug in LCM cluster import when there are datacenter- or node-specific config options. (OPSC-13546)
  • Fixed an issue with configuring Graph serializers in dse.yaml. (OPSC-13955)
  • Added warning about using a remote JMX connection without authentication. (OPSC-13982)
  • Fixed a bug where LCM UI form dialogs would reset values to original state while editing. (OPSC-14025)
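
Scripts that call the LCM API need their base URL updated for the v1 to v2 change listed above. The sketch below is a hypothetical helper, assuming the version appears as a standalone v1 path segment. The function name to_lcm_v2 is illustrative; consult the LCM API documentation for the confirmed endpoint layout.

```python
from urllib.parse import urlsplit, urlunsplit


def to_lcm_v2(url):
    """Rewrite the first standalone 'v1' path segment of url to 'v2'.

    Assumption: the API version is its own path segment. URLs without
    a 'v1' segment are returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")
    for i, segment in enumerate(segments):
        if segment == "v1":
            segments[i] = "v2"
            break  # only the first match is the version segment
    return urlunsplit(parts._replace(path="/".join(segments)))
```

Matching whole path segments rather than doing a plain string replace avoids rewriting a v1 that happens to appear inside a hostname, resource name, or query string.
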

Known and resolved issues for OpsCenter 6.1 and later

Known issues, workarounds, and resolved issues for the OpsCenter and Lifecycle Manager 6.1 and later versions.

The following are known issues that exist in OpsCenter 6.1 and later versions. Each item has a link to more details including workarounds when available. These issues will be addressed in future releases where possible. If you have any questions, contact DataStax Support for assistance.

OpsCenter 6.5.0
  • When running an LCM job and attempting to abort or terminate the job while it is in progress, termination fails to stop the job unless the abort request is issued prior to the first node completing. Issuing a terminate or abort command after the first node has finished running has no effect. The job will continue to run to completion as if the terminate command had not been issued. (OPSC-14410)
OpsCenter 6.1.x and 6.5.0
  • OpsCenter does not automatically remove compact storage from its keyspaces when upgrading to OpsCenter 6.5.0. For important details, see Compact storage no longer supported. (OPSC-14442)
  • When restoring materialized views, OpsCenter does not correctly wait for the cluster schema to settle, which can cause errors when data is restored to the table on which the view is based. (OPSC-13029)
OpsCenter 6.1 and later
  • For DSE versions 5.1 and later, slow query data is only available since the last time the DataStax Agent was restarted. (OPSC-11702)
  • If there are approximately 75 or more keyspaces, the DataStax Agent /tokenranges API call runs out of memory with the default heap size. As a temporary workaround, adjust the agent heap size. (OPSC-11975)
  • When using OpsCenter to restore a backup that contains multiple SASI indexes, some or all of these indexes might not restore correctly. The indexes appear in the table schema but might not function correctly. Indexes should be validated at restore time and rebuilt if errors are detected. For more information, see CREATE CUSTOM INDEX (SASI). (OPSC-11746)
    Note: SASI indexes are experimental for DSE. DataStax does not support SASI indexes for production.
OpsCenter 6.1
  • A large number of log messages might display regarding requests to /pit-cleanup if there are a large number of existing commit logs in the staging directory. (OPSC-8349)
  • Insufficient permissions on the staging directory can cause the agent to exhaust inotify watches on the system over time. (OPSC-10732)
  • An error and stack trace appear in opscenterd.log when a cluster that no longer exists is accessed through the UI or API. The error message contains ERROR: Unhandled error in Deferred: There are no clusters with name or ID.... This error message is harmless. (OPSC-8819)
  • Enabling SNMP alerts may cause opscenterd to hang on startup in some slower environments. (OPSC-9314; see More Details)
  • For DSE versions earlier than 5.0.7, the DataStax Agent can only estimate partition sizes and counts per node or keyspace for repairs by using JMX stats. For DSE versions 5.0.7 and later, the DataStax Agent queries the system size_estimates table for a more precise estimate of partition sizes and counts per range. (OPSC-11417, OPSC-11590)
  • For DSE versions 5.0 and later, object permissions currently are not persisted with an OpsCenter backup and thus are not re-applied when that backup is restored. As a result, users must manually manage object permissions externally from OpsCenter. For more details (no workaround available at this time), see the KB support article. (OPSC-11015)
  • The solr-index-size (displayed as Search: Core Size) metric in the OpsCenter Monitoring UI is unavailable for DSE versions 5.1.0 through 5.1.3. (OPSC-12267)
  • Lifecycle Manager (LCM)

    • Lifecycle Manager is not currently compatible with DSE Transparent data encryption. See Encrypted DSE configuration values for more details. (OPSC-7529)
    • DSE Graph properties: DSE Graph configuration in dse.yaml is configurable through LCM Config Profiles. All Graph properties in dse.yaml can be managed through the LCM UI with the exception of gremlin_server.serializers and gremlin_server.scriptEngines. If you are using LCM and need to customize these properties, use the LCM API to make the changes. Future changes to the Config Profile using the LCM UI will retain properties set through the API.
    • When configuring credentials in a Repository, special characters such as #, $, and so forth are supported, but non-ASCII Unicode characters are not. (OPSC-8921)