Architecture FAQ

Frequently asked questions about Hyper-Converged Database (HCD).

What is NoSQL?

The term NoSQL originally referred to a new generation of databases that shunned SQL in favor of other interfaces. It has since become a catch-all for post-relational, "not only SQL" databases that store data differently from a relational (SQL) database.

How is HCD different from relational databases?

HCD is a distributed, highly available database that uses peer-to-peer communication. Data modeling in HCD resembles relational modeling but differs in key areas to keep reads and writes fast. Relational databases express relationships through joins between tables. HCD instead uses denormalization, storing data in the shape each query needs so that reads do not depend on joins.
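
The difference between joining at read time and denormalizing at write time can be sketched in plain Python. This is only an illustration with in-memory structures, not an HCD API: the sample data and the names `relational_lookup`, `build_videos_by_user`, and `denormalized_lookup` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: users and the videos they uploaded.
users = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
videos = [
    {"video_id": 10, "user_id": 1, "title": "Intro to CQL"},
    {"video_id": 11, "user_id": 2, "title": "Compaction basics"},
    {"video_id": 12, "user_id": 1, "title": "Data modeling"},
]


def relational_lookup(user_id):
    """Relational style: combine two normalized tables at read time."""
    name = users[user_id]["name"]
    return [(name, v["title"]) for v in videos if v["user_id"] == user_id]


def build_videos_by_user():
    """Denormalized style: copy the user name into every row at write time
    and group rows by the key the query filters on."""
    table = defaultdict(list)
    for v in videos:
        name = users[v["user_id"]]["name"]
        table[v["user_id"]].append((name, v["title"]))
    return table


videos_by_user = build_videos_by_user()


def denormalized_lookup(user_id):
    """A single keyed read; no join is needed."""
    return videos_by_user[user_id]
```

Both lookups return the same rows, but the denormalized version answers the query with one keyed read, at the cost of storing the user name redundantly.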

What kind of hardware do I need to run HCD?

The HCD Capacity Planning Guide provides recommendations for choosing hardware, including the number of CPUs, the amount of RAM, and the type of disks appropriate for the workload and environment.

What operational tools are included with HCD?

Mission Control is the next-generation operations platform for deploying, managing, and maintaining HCD, DataStax Enterprise (DSE), and Apache Cassandra® clusters. It provides a web-based interface that simplifies the management and operation of HCD clusters.

How do I install HCD?

The Docker image can be deployed as a standalone container for development environments.

The recommended installation method is to deploy HCD using Mission Control. See Installing HCD.

How do I interact with HCD?

The HCD architecture allows any authorized user to connect to any node in any datacenter and access data using the Cassandra Query Language (CQL 3.4.5, native protocol v5). For ease of use, CQL uses a syntax similar to SQL. The most basic way to interact with HCD is the CQL shell, cqlsh. Using cqlsh, you can create keyspaces and tables, insert records, query tables, and much more. Other ways to interact with HCD are:

  • Clients that interact with the Data API, a schema-less, document-based, modern API that provides easy and intuitive access to both structured and unstructured data. It leverages the scalability, performance, and real-time indexing capabilities of Apache Cassandra® to support GenAI application development. There are clients for Python, TypeScript, and Java.

  • Drivers that interact with CQL for developers who want to take advantage of specific Cassandra features. There are a number of drivers in various programming languages. See DataStax drivers.
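
As a rough illustration of the driver route, the sketch below uses the open-source Python driver (the `cassandra-driver` package) to create a keyspace and a table, insert a record, and read it back. The contact point `127.0.0.1`, port `9042`, the `demo` keyspace, the `users` table, and the replication settings are all assumptions chosen for a single-node development setup; they do not come from the HCD documentation.

```python
def connect(contact_points=("127.0.0.1",), port=9042):
    """Open a session with the Python driver (assumes a node is reachable)."""
    # Imported lazily so run_demo() can be exercised with any object that
    # exposes an execute() method, even without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(contact_points=list(contact_points), port=port)
    return cluster.connect()


# Hypothetical schema for illustration; SimpleStrategy with a replication
# factor of 1 is only suitable for a single-node development environment.
CREATE_KEYSPACE = (
    "CREATE KEYSPACE IF NOT EXISTS demo "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}"
)
CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS demo.users ("
    "user_id int PRIMARY KEY, name text)"
)
INSERT_USER = "INSERT INTO demo.users (user_id, name) VALUES (%s, %s)"
SELECT_USER = "SELECT user_id, name FROM demo.users WHERE user_id = %s"


def run_demo(session):
    """Create the schema, insert one record, and return the matching rows."""
    session.execute(CREATE_KEYSPACE)
    session.execute(CREATE_TABLE)
    session.execute(INSERT_USER, (1, "Ada"))
    return list(session.execute(SELECT_USER, (1,)))
```

With a reachable node, `run_demo(connect())` would return the inserted row. The same four statements, with the `%s` placeholders replaced by literal values, can also be typed directly into cqlsh.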

How do I move data to and from HCD?

HCD provides several solutions for migrating data from other databases:

  • DataStax Bulk Loader (DSBulk) is an open-source utility for loading or unloading data in CSV or JSON format into and out of HCD databases.

  • The cqlsh COPY FROM and COPY TO commands mirror what the PostgreSQL RDBMS uses for file import/export.

  • The sstableloader utility allows you to bulk load SSTables into HCD from the snapshots of another Cassandra cluster.

  • For more sophisticated use cases that require transformation and manipulation of the source data, extract-transform-load (ETL) solutions that support Cassandra are available from vendors including Talend, Informatica, and StreamSets.
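
For the CSV-based options above, a minimal sketch of preparing a load might look like the following. It writes a small CSV file, then builds (without running) a cqlsh `COPY ... FROM` statement and a `dsbulk load` argument list for it. The `demo.users` table, the sample rows, and the helper names are hypothetical, and the options shown are only a small subset of what either tool accepts.

```python
import csv
import os
import tempfile


def write_csv(path, header, rows):
    """Write a header line followed by the data rows."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def copy_from_statement(keyspace, table, columns, path):
    """Build a cqlsh COPY FROM statement for a CSV file with a header row."""
    cols = ", ".join(columns)
    return f"COPY {keyspace}.{table} ({cols}) FROM '{path}' WITH HEADER = TRUE;"


def dsbulk_load_args(keyspace, table, path):
    """Build the argument list for an equivalent DSBulk load (not executed)."""
    return ["dsbulk", "load", "-url", path,
            "-k", keyspace, "-t", table, "-header", "true"]


# Hypothetical sample data written to a temporary directory.
csv_path = os.path.join(tempfile.mkdtemp(), "users.csv")
write_csv(csv_path, ["user_id", "name"], [(1, "Ada"), (2, "Grace")])

statement = copy_from_statement("demo", "users", ["user_id", "name"], csv_path)
dsbulk_args = dsbulk_load_args("demo", "users", csv_path)
```

The generated statement can be pasted into cqlsh, and the argument list can be passed to `subprocess.run` on a machine where DSBulk is installed. Both assume the target table already exists.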
