Adding an Amazon S3 backup location

Set a retention policy for the backup location.

About this task

Add an Amazon S3 or S3-compatible backup location. For more details, see backing up to Amazon S3 and the Amazon S3 transfer acceleration documentation.

Moving backup files from Amazon S3 to Amazon Glacier is not supported by the OpsCenter Backup Service.

The Backup Service requires control over the data and structure of its destination locations. The backup destinations must be dedicated for use only by OpsCenter. Any additional directories or files in those destinations can prevent the Backup Service from properly conducting a Backup or Restore operation.

Prerequisites

  • Locate the cluster_name.conf configuration file. The location of this file depends on the type of installation:

    • Package installations: /etc/opscenter/clusters/cluster_name.conf

    • Tarball installations: install_location/conf/clusters/cluster_name.conf

  • Ensure Java 8 is installed on every machine where DataStax Agents are running. Agents require Java 8 to store backups at an S3 location.

  • Make sure you have the proper AWS IAM privileges for the AWS account that the S3 bucket is linked to.

  • Ensure that the selected Amazon S3 bucket meets the Amazon S3 bucket requirements.

The AWS credentials and bucket names are stored in cluster_name.conf (with the exception of ad hoc backups). Use proper security precautions to ensure that this file is not readable by unauthorized users.
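One way to act on this precaution is to limit the file's permissions to its owner. The sketch below assumes a Linux host and that cluster_name.conf is already owned by the account that runs the OpsCenter daemon. The helper name restrict_conf is hypothetical, and mode 600 is an illustrative choice rather than a documented OpsCenter requirement.

```shell
# Restrict a cluster configuration file so that only its owner can read or write it.
# Assumption: the file is already owned by the account that runs the OpsCenter daemon.
restrict_conf() {
  chmod 600 "$1"
}

# Example for a package installation (may require sudo; substitute your cluster name):
# restrict_conf /etc/opscenter/clusters/cluster_name.conf
```

After running the helper, `ls -l` on the file should show read and write permission for the owner only.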

Procedure

  1. Access the Create (or Edit) Backup dialog:

  2. In the Create or Edit Backup dialog, under Location, click +Add Location.

    The Add Location dialog appears.

    [Figure: Add Location dialog showing an S3 location with a Retention Policy for scheduled backups]

  3. Select Amazon S3 or S3 Compatible as the backup Location.

  4. Enter the location of the S3 bucket so that OpsCenter can locate it.

    • Amazon S3: Enter the Region where the S3 bucket is located. If blank, OpsCenter will try to query S3 for the bucket region or use the remote_backup_region as a default. Some regions, such as China (Beijing), require a region to be specified and cannot be queried.

    • S3 Compatible: Enter a URL that points to an S3 Endpoint. For example, mys3endpoint:9000.

  5. Enter the S3 Bucket name.

    The bucket name must be at least 4 characters long and must contain only lowercase letters, numbers, and hyphens. Additionally, OpsCenter requires that bucket prefixes contain only lowercase letters, numbers, and safe characters. See the S3 guidelines for more details about bucket naming restrictions.

    To indicate a bucket subfolder location, delineate the bucket name from the folder name with a forward slash (/) character.

    Example: mybucket/myfolder/mysubfolder. Remember that slashes are not allowed within bucket or folder names themselves.

  6. Select the source type of your AWS credentials.

    The AWS credentials and bucket names are stored in cluster_name.conf (with the exception of ad hoc backups). Be sure to use proper security precautions to ensure that this file is not readable by unauthorized users.

    • User-Supplied Credentials: Enter your AWS Key and AWS Secret.

    • AWS Credential Provider chain: Use the default credential provider chain to locate AWS credentials. See Working with AWS Credentials on the AWS website.

  7. Select any throttling, compression, encryption, or acceleration options for the data:

    For S3 Compatible backups, Throttle S3 transfer rate is the only option available.

    1. To avoid saturating your network, set a maximum upload rate. Select Throttle transfer rate and set the maximum MB per second.

      When the AWS CLI feature is enabled, the S3 throttle is ignored. A tooltip also mentions this current limitation. See Tuning throttling when using AWS CLI.

    2. To compress the backup data, select Enable compression. Compression reduces the amount of data going through your network and reduces the disk and data usage but increases the CPU load for the server.

    3. To enable server-side S3 encryption, select Enable S3 server-side encryption. Enabling server-side encryption increases the security of your backup files, but increases the time it takes to complete a backup. For more information on S3 server-side encryption, see Using Server Side Encryption on the AWS website.

      Choose the type of encryption you want to use:

      • 256-bit Advanced Encryption Standard: SSE-S3 encryption encrypts each file in the backup set with a unique key and also encrypts the key itself, using a 256-bit AES cipher.

      • KMS Managed Encryption: SSE-KMS encryption uses customer master keys (CMKs) to encrypt Amazon S3 objects. Enter a KMS Key ID that is associated with your AWS account.

    4. To back up nodes running in multiple regions to a single bucket, select Enable S3 transfer acceleration. Instead of traffic crossing over the internet, acceleration mode uses Amazon CloudFront to cache S3 requests. Because the CloudFront servers are closer to the nodes in each region, the backup latency is reduced.

      Enabling S3 transfer acceleration can cause performance degradation, and might slow a standard backup configuration. Use this option only if backing up nodes in multiple regions to a single bucket.

  8. Optional: For scheduled backups, indicate how long the snapshot data should be retained by selecting a Retention Policy. Retain All (default) saves the snapshot data indefinitely. Or, define a set period of time. After the snapshot data is older than the time set in Retention Policy, the snapshot data is deleted.

    DataStax strongly recommends setting a retention policy to periodically remove backups. This practice helps to avoid long-term performance issues caused by an excessive number of backups.

    Setting a Retention Policy is not available for an ad hoc (Run Now) backup.

  9. Click Save Location.

    The newly added S3 location displays in the Location pane of the Create or Edit Backup dialog.


    Click the edit icon to edit a location and its retention policy, if applicable.

    Click the delete icon to delete a location. The On Server location cannot be deleted.

  10. Click Save Backup or Create Backup as applicable.
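The bucket naming rules in step 5 can be checked before typing a location into the dialog. The following is a minimal sketch that covers only the rules stated on this page (at least 4 characters; lowercase letters, numbers, and hyphens; a forward slash separating the bucket from its subfolders). It does not implement the full Amazon S3 naming rules, and the function names split_location and valid_bucket_name are hypothetical.

```shell
# Split a location such as mybucket/myfolder/mysubfolder into a bucket name
# and a folder path. Results are left in the variables "bucket" and "folder".
split_location() {
  loc="$1"
  bucket="${loc%%/*}"            # everything before the first slash
  case "$loc" in
    */*) folder="${loc#*/}" ;;   # everything after the first slash
    *)   folder="" ;;
  esac
}

# Succeed only if the name has at least 4 characters and contains only
# lowercase letters, numbers, and hyphens.
valid_bucket_name() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9-]{4,}$'
}

split_location "mybucket/myfolder/mysubfolder"
if valid_bucket_name "$bucket"; then
  echo "bucket=$bucket folder=$folder"
else
  echo "invalid bucket name: $bucket"
fi
```

Passing this check does not guarantee that Amazon S3 accepts the name. The S3 guidelines remain the authority on naming restrictions.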

Bulk uploading S3 backups using the AWS CLI

Bulk uploading through the AWS CLI is a feature that must be enabled.

About this task

Use the AWS CLI instead of the AWS SDK when bulk loading backups to Amazon S3 locations. Using the AWS CLI rather than the AWS SDK can result in a performance increase, with a noticeable decrease in the time it takes to complete a backup.

This feature is available in OpsCenter versions 6.1.3 and later as an OpsCenter Labs feature. As of OpsCenter version 6.5, the AWS CLI feature is officially a production feature.

For more information, see AWS CLI in the Amazon documentation.

When the AWS CLI feature is enabled, OpsCenter ignores the S3 throttling setting during backups. See Tuning throttling when using AWS CLI.

Prerequisites

  1. Install the AWS CLI package on every node. DataStax recommends using the Amazon bundled installer method and upgrading to the latest version of AWS CLI if it is already installed. See Install the AWS CLI using the bundled installer in the Amazon documentation for installation procedures.

    As a recommended best practice for OpsCenter, install the AWS CLI bundle as follows (the first command uses APT to install unzip):

    sudo apt-get install -y unzip
    curl 'https://s3.amazonaws.com/aws-cli/awscli-bundle.zip' -o awscli-bundle.zip
    unzip awscli-bundle.zip
    sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws

    Regardless of the install procedure used, make sure that the aws executable is on the PATH of the cassandra user, or whichever user the DataStax agent runs as.

  2. Add an S3 location for backups.
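To confirm the PATH requirement from prerequisite 1, a quick check such as the following may help. It is a sketch only: aws_on_path is a hypothetical helper name, and the check reflects the PATH of whichever user runs it, so it is only meaningful when run as the user the DataStax agent runs as.

```shell
# Succeed if an "aws" executable can be resolved on the current PATH.
aws_on_path() {
  command -v aws >/dev/null 2>&1
}

if aws_on_path; then
  echo "aws found at $(command -v aws)"
else
  echo "aws not found on PATH"
fi
```

For example, assuming the agent runs as the cassandra user, `sudo -u cassandra sh -c 'command -v aws'` performs the same lookup under that account.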

Procedure

  1. Locate the cluster_name.conf file. The location of this file depends on the type of installation:

    • Package installations: /etc/opscenter/clusters/cluster_name.conf

    • Tarball installations: install_location/conf/clusters/cluster_name.conf

  2. Open cluster_name.conf for editing. Substitute cluster_name with the name of your cluster. Setting agent options through the cluster configuration file sets the corresponding property in address.yaml on every node.

    To configure the setting for all clusters managed by an OpsCenter instance, open opscenterd.conf for editing.

    The location of this file depends on the type of installation:

    • Package installations: /etc/opscenter/opscenterd.conf

    • Tarball installations: install_location/conf/opscenterd.conf

    If necessitated by your environment, open address.yaml for editing and configuring at the node level. The location of this file depends on the type of installation:

    • Package installations: /var/lib/datastax-agent/conf/address.yaml

    • Tarball installations: install_location/conf/address.yaml

    Do so for every node that requires a specific configuration override.

  3. Add the following configuration option:

    [backups]
    use_s3_cli = True

  4. Save the configuration file or files.

  5. Restart the OpsCenter daemon.

  6. If you made changes to address.yaml, restart the DataStax agents.
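Before restarting, it can be useful to confirm that the option landed in the intended section of cluster_name.conf or opscenterd.conf. The sketch below is an assumption-laden helper, not part of OpsCenter: s3_cli_enabled is a hypothetical name, and it only recognizes the simple INI layout shown in the procedure.

```shell
# Succeed if FILE sets use_s3_cli to True inside its [backups] section.
# Hypothetical helper; it handles only the plain "key = value" layout shown above.
s3_cli_enabled() {
  awk '
    /^\[/ { in_backups = ($0 ~ /^\[backups\]/) }   # track the current section
    in_backups && /^[ \t]*use_s3_cli[ \t]*=[ \t]*[Tt]rue[ \t]*$/ { found = 1 }
    END { exit !found }
  ' "$1"
}

# Example for a package installation (substitute your cluster name):
# s3_cli_enabled /etc/opscenter/clusters/cluster_name.conf && echo "AWS CLI uploads enabled"
```

The helper returns a nonzero status when the option is missing, set to another value, or placed under a different section.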

Tuning throttling when using AWS CLI

About this task

Use alternative throttle options when using the AWS CLI for bulk uploads because the OpsCenter S3 throttle is ignored when the OpsCenter AWS CLI for S3 feature is enabled.

Procedure

  1. Adjust the max_concurrent_requests setting available in the AWS CLI S3 configuration. Refer to the AWS CLI S3 configuration documentation for details.

  2. If necessary, use a tool such as Trickle to limit bandwidth.
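Step 1 can be carried out with the AWS CLI's own `aws configure set` command. The following is a minimal sketch: set_s3_concurrency is a hypothetical wrapper name, and the value 5 in the example is illustrative only, not a recommendation (the AWS CLI default is 10). The setting is written to the AWS CLI configuration of the user who runs the command, so the assumption here is that it should be run as the user the DataStax agent runs as.

```shell
# Set the number of concurrent S3 requests for the default AWS CLI profile.
# Lower values reduce how aggressively the CLI uses the network.
set_s3_concurrency() {
  aws configure set default.s3.max_concurrent_requests "$1"
}

# Example (illustrative value):
# set_s3_concurrency 5
```

Running `aws configure get default.s3.max_concurrent_requests` afterward shows the value currently in effect for that user.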


© 2024 DataStax | Privacy policy | Terms of use
