Getting started with DataStax Enterprise 6.9
This topic provides basic information and a roadmap to documentation for system administrators and developers new to DataStax Enterprise.
Which product?
To help you choose which DataStax products best fit your requirements, see Products on the DataStax website. DataStax Enterprise (DSE) 6.9 provides all the capabilities of Apache Cassandra™ plus a vector search add-on, Data API, enhanced storage-attached indexes (SAI), and advanced functionality.
Licenses
- Requirement for uniform licensing
-
All nodes in each DataStax licensed cluster must be uniformly licensed to use the same subscription. For example, if a cluster contains five nodes, all five nodes within that cluster must be DSE. Mixing different subscriptions within a cluster is not permitted. The DataStax Advanced Workloads Pack may be added to any DSE cluster in an incremental fashion. For example, extend three nodes in a 10-node DSE cluster to include the Advanced Workloads Pack. "Cluster" means a collection of nodes running the software which communicate with one another using gossip. See Enterprise Terms.
- DataStax Enterprise database
-
The DSE 6.9 bundle includes DSE Vector Add-on, Data API, and Mission Control that you may optionally set up and use. While Data API and Mission Control are free of charge with a paid DSE license, you must purchase the DSE Vector Add-on for use in production. Evaluate DSE 6.9 with Vector Add-on free of charge for 180 days. Note that non-production issues raised with DataStax support will incur a charge.
You can upgrade an existing DSE 6.8.x installation to DSE 6.9.
Learn
Before you begin administration tasks, learn a few basics. Doing so saves time when you set up and operate DataStax Enterprise (DSE) in a production environment:
- Differences between Cassandra or DSE and relational databases
-
Cassandra and DSE databases are very different from relational databases. Their data model is designed around the queries the application runs, not around entities and relationships.
DataStax highly recommends reading Architecture in brief. It contains key concepts and terminology for understanding the database.
- Choose between two operational manager tools to set up, configure, and manage your environment
-
-
Mission Control replaces DSE OpsCenter and Lifecycle Manager, reducing the complexity of operations and deployment with cloud-native Kubernetes backed by enterprise-grade security, monitoring, and support.
-
DSE OpsCenter and Lifecycle Manager automate and simplify many administrative tasks.
-
If you prefer to have DataStax manage your database, you can use DataStax Astra, a cloud-native database-as-a-service built on Apache Cassandra.
- Learning resources
-
-
DataStax sample code and examples - Help for getting things done faster.
-
Learn menu - Available on every page where you can quickly access other resources such as blogs and DataStax Academy.
-
While not specific to administrators, these topics provide more database details:
-
Cassandra Query Language (CQL) is the query language for DataStax Enterprise.
-
DataStax provides drivers in several programming languages to connect client applications to the database.
-
APIs are available to interface with OpsCenter, DseGraphFrame, DataStax Spark Cassandra Connector, and the drivers.
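The query-first data model described under "Differences between Cassandra or DSE and relational databases" can be illustrated without a database. The following is a minimal sketch in plain Python, not DSE code: the sample records and the names `videos_by_user` and `video_by_id` are hypothetical. Each dictionary stands in for a table whose partition key is chosen to serve one query.

```python
from collections import defaultdict

# Hypothetical sample data: (user_id, video_id, title, added_on).
events = [
    ("u1", "v10", "Intro to CQL", "2024-01-03"),
    ("u2", "v11", "Data modeling", "2024-01-05"),
    ("u1", "v12", "Vector search", "2024-02-01"),
]

# Query 1: "all videos added by a user" -> partition by user_id.
videos_by_user = defaultdict(list)
# Query 2: "look up one video by its id" -> partition by video_id.
video_by_id = {}

for user_id, video_id, title, added_on in events:
    # The same fact is written twice, once per query pattern (denormalization).
    videos_by_user[user_id].append((added_on, video_id, title))
    video_by_id[video_id] = (user_id, title, added_on)

# Keep the rows inside each partition ordered by date,
# roughly what a clustering column does in a real table.
for rows in videos_by_user.values():
    rows.sort()

titles_for_u1 = [title for _, _, title in videos_by_user["u1"]]
print(titles_for_u1)  # ['Intro to CQL', 'Vector search']
```

A relational design would typically store users and videos once and join them at read time. Here each query reads a single partition, at the cost of writing the data more than once.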
Plan
Self-managed clusters require planning to ensure that you have the right hardware and software to support your workload. The Planning and testing guide contains guidelines for capacity planning and hardware selection in production environments. Explore the following topics to help you plan your cluster:
Install
Before starting development, you need to deploy a DataStax Enterprise (DSE) 6.9 database cluster. DataStax offers a variety of ways to set up a cluster. Select the method below that best suits your environment.
Method | Description |
---|---|
Mission Control - preferred method | Install DataStax Enterprise (DSE) 6.9 with Mission Control. Mission Control is a new cloud-based service that provides a unified management console for DSE and Apache Cassandra™. |
Tarball | Install DataStax Enterprise (DSE) 6.9 with a tarball image. |
Docker - for test and development purposes | Install DataStax Enterprise (DSE) 6.9 with Docker containers. |
Packages for Linux-based platforms | |
OpsCenter with Lifecycle Manager | Install and deploy DSE using OpsCenter and Lifecycle Manager. |
For help with choosing an install type, see Which install method should I use?
Choose API and connect
DataStax Enterprise (DSE) 6.9 provides a variety of APIs for developing applications:
API | Description |
---|---|
Data API with clients | The Data API is the newest DataStax API for writing applications that store and query unstructured document data. The main development tools for writing applications that use the Data API are the clients, currently available in three languages: Python, TypeScript, and Java. |
CQL API with drivers | The Cassandra Query Language (CQL) is a SQL-like language for querying and managing databases. It stores structured data in tables and uses primary keys to index data. The main development tools for writing applications that use CQL are the community-supported and DataStax-supported drivers, which are available in various languages. For example, see the Python, Node.js, or Java quickstarts. |
CQL API with cqlsh | The Cassandra Query Language Shell (cqlsh) is a command-line utility for executing CQL commands. If you are in the development phase and want to quickly test queries, use cqlsh. |
Read Connection methods comparison for details.
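As an illustration of the "CQL API with drivers" option, here is a minimal sketch that uses the open-source Python driver (the `cassandra-driver` package). The contact point `127.0.0.1`, the port `9042`, and the helper names `fetch_release_version` and `connect_and_check` are assumptions for this example. A cluster with authentication or TLS enabled needs additional driver configuration that is not shown here.

```python
def fetch_release_version(session):
    """Return the release_version reported by the connected node, or None."""
    # system.local is a built-in table that describes the local node.
    row = session.execute("SELECT release_version FROM system.local").one()
    return row.release_version if row is not None else None


def connect_and_check(contact_points=("127.0.0.1",), port=9042):
    """Connect to a cluster (assumed reachable without auth) and run one query."""
    # Imported lazily so fetch_release_version can be exercised
    # without the driver installed: pip install cassandra-driver
    from cassandra.cluster import Cluster

    cluster = Cluster(contact_points=list(contact_points), port=port)
    try:
        session = cluster.connect()
        return fetch_release_version(session)
    finally:
        # Always release the driver's connections and threads.
        cluster.shutdown()
```

Calling `connect_and_check()` against a running node returns that node's version string. Keeping the query in a function that only needs a session object makes it easy to test with a stand-in session.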
Secure
DSE Advanced Security provides detailed user access controls to keep application data protected and compliant with regulatory standards like PCI, SOX, HIPAA, and the European Union’s General Data Protection Regulation (GDPR). Key topics include:
The DSE database includes the default role |
Tune
Important tasks for optimizing the performance of the database include:
-
Enable the NodeSync service, which provides continuous background repair.
-
Load test your cluster before deployment.
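NodeSync is enabled per table through a CQL table option. The sketch below only builds the statement text; it does not connect to a cluster. The helper name `nodesync_statement` and the keyspace and table names are hypothetical, and the statement form should be checked against the CQL reference for your DSE version before use.

```python
import re

# Accept only simple unquoted identifiers so that nothing unexpected
# is interpolated into the statement.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def nodesync_statement(keyspace, table, enabled=True):
    """Build a CQL statement that turns NodeSync on or off for one table."""
    for name in (keyspace, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    flag = "true" if enabled else "false"
    return f"ALTER TABLE {keyspace}.{table} WITH nodesync = {{'enabled': '{flag}'}};"


# Hypothetical keyspace and table names.
print(nodesync_statement("cycling", "comments"))
# ALTER TABLE cycling.comments WITH nodesync = {'enabled': 'true'};
```

The resulting string can be run in cqlsh or passed to a driver session.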
Develop
-
DSE Vector Add-on: Vector Quickstart with CQL
-
Use Retrieval Augmented Generation (RAG) with your vector-enabled DSE 6.9 database: RAGStack Quickstart
-
DataStax Studio is an interactive developer tool for CQL (Cassandra Query Language), Spark SQL, and DSE Graph.
-
A variety of videos:
Operations
Operations covers the tasks needed to keep the database running smoothly. Mission Control automates and simplifies these tasks. You must have installed either Mission Control or DSE OpsCenter as your operational manager. The most common operations include:
Mission Control | DSE / OpsCenter |
---|---|
Adding or removing nodes, datacenters, or clusters with OpsCenter |
Load data
The primary tools for getting data into and out of the database are:
-
DataStax Bulk Loader, for data in compressed or uncompressed CSV or JSON format.
-
DataStax Apache Kafka™ Connector, which synchronizes records from a Kafka topic to table rows in supported databases.
For other methods, see Migrating data to DataStax Enterprise.
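Because DataStax Bulk Loader reads CSV, a common first step is to produce a well-formed CSV file with a header row. The following is a minimal sketch using only the Python standard library. The records and column names are hypothetical, and the `dsbulk` command in the comment is an assumption about typical usage that should be verified against the DataStax Bulk Loader reference.

```python
import csv
import io

# Hypothetical records to load; keys are assumed to match table column names.
rows = [
    {"id": 1, "name": "Alice", "city": "Lyon"},
    {"id": 2, "name": "Bob", "city": "Oslo, Norway"},
]


def to_csv(records, fieldnames):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # values containing commas are quoted automatically
    return buffer.getvalue()


csv_text = to_csv(rows, ["id", "name", "city"])
print(csv_text)

# After saving csv_text to a file such as users.csv, a load might look like
# this (assumed invocation; keyspace and table are placeholders):
#   dsbulk load -url users.csv -k <keyspace> -t <table> -header true
```

The header row lets the loader map CSV fields to columns by name, which is why the field order is set explicitly here.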
Monitor clusters
DataStax provides the following tools to monitor clusters and view metrics:
Troubleshooting and help resources
DataStax provides a wide variety of resources for troubleshooting and other types of help:
Upgrade your database
Key topics in the Upgrade Guide include:
Advanced Functionality
In addition to all the capabilities of Apache Cassandra, DataStax Enterprise offers the following capabilities:
- DSE Analytics
-
Built on a production-certified version of Apache Spark™, with enhanced capabilities like AlwaysOn SQL for processing streaming and historical data at scale.
- DSE Graph
-
DSE Graph is optimized for storing billions of items and their relationships. This enables you to identify and analyze hidden relationships between connected data and build powerful modern applications for real-time use cases: fraud detection, customer 360, social networks, IoT, and recommendation systems. The DSE Graph Quick Start is a great place to get started.
- DSE Search
-
Provides powerful search and indexing capabilities, including support for full-text, relevancy, sub-string, and fuzzy queries over large data sets, aggregation, and geospatial matchups.
- Mission Control
-
Focuses on the lifecycle management, security, operations, and observability of DataStax Enterprise (DSE) and Apache Cassandra™ clusters. Mission Control handles the orchestration of automation across regional cluster boundaries. This centralizes management of globally deployed clusters in a single location.
- DSE OpsCenter
-
Provides visual management and monitoring for DataStax Enterprise, including automatic backups, reduced manual operations, automatic failover, patch release upgrades, and secure management of DSE clusters.
- Lifecycle Manager
-
A visual provisioning and monitoring tool for DSE clusters. LCM allows you to define the cluster configuration including datacenter, node topology, and security. LCM monitoring helps you troubleshoot installation, configuration, and upgrade jobs.
- DSE Advanced Security
-
Provides fine-grained user and access controls to keep application data protected and compliant with regulatory standards like PCI, SOX, HIPAA, and the European Union’s General Data Protection Regulation (GDPR).
- DSE Metrics Collector
-
Aggregates DSE metrics and integrates with existing monitoring solutions to facilitate problem resolution and remediation.
- DSE Management Services
-
DSE Management Services automatically handle administration and maintenance tasks and assist with overall database cluster management.
- NodeSync service
-
Continuous background repair that virtually eliminates manual efforts to run repair operations in a DataStax cluster.
- Advanced Replication
-
Advanced Replication allows a single cluster to have a primary hub with multiple spokes. This allows configurable, bi-directional distributed data replication to and from source and destination clusters.