Develop applications with Django and Astra DB Classic
Django is a popular framework for building web applications in Python.
This guide provides best practices and examples for using Astra databases in your Django applications through the django-cassandra-engine package.
This package essentially provides Django object models on a Cassandra backend.
It uses the Apache Cassandra® Python driver to connect to Cassandra-based databases, including Astra databases.
Django, the django-cassandra-engine package, and the Cassandra Python driver are designed for use with fixed-schema tables.
Prerequisites
-
Familiarity with Django and Python
-
An Astra DB Classic database with at least one keyspace
With the Django Cassandra package, you don’t need to manually create the tables for your Django application. If you run the sync_cassandra command before running your application for the first time, tables are created automatically based on your model definitions. For an example, see the sample application.
-
Your database’s Secure Connect Bundle (SCB)
Try the sample application
If you prefer to try the django-cassandra-engine package in the context of a sample application, download and extract the Partyfinder sample application.
This sample demonstrates basic data access using the model paradigm as well as direct access to the underlying database connection for advanced, Cassandra-specific use cases. To fully incorporate Cassandra patterns into your Django applications, you need to understand both approaches.
For setup and usage instructions, see the sample application’s README.md.
Snippets of this sample application are used throughout this guide to illustrate key concepts.
Use object mappers and models with the Django Cassandra package
With Django’s built-in Object-Relational Mapper (ORM), you define models that map Python classes to database tables and objects to rows.
The django-cassandra-engine package applies Django’s object-mapper paradigm to the context of Cassandra-based databases, and it provides access to the underlying database connection for use cases that aren’t available at the Django model layer.
This means you have the flexibility to work with Python classes and objects as well as raw CQL queries when necessary.
Instead of using the built-in django.db.models.Model to define your models, you subclass django_cassandra_engine.models.DjangoCassandraModel.
The following example defines a Party model for a table with a composite primary key consisting of a partition key (city), a clustering column (id), and additional columns (name, people, and date):
import uuid

from django.utils import timezone
from cassandra.cqlengine import columns
from django_cassandra_engine.models import DjangoCassandraModel


# A model for this app
class Party(DjangoCassandraModel):
    city = columns.Text(
        primary_key=True,
    )
    id = columns.UUID(
        primary_key=True,
        clustering_order='asc',  # (allowed: 'asc' , 'desc')
        default=uuid.uuid4,
    )
    name = columns.Text()
    people = columns.Integer(default=0)
    date = columns.DateTime(default=timezone.now)

    class Meta:
        get_pk_field = 'id'
CQL data types and primary keys in Django models
-
For compatibility with CQL data types, model fields must be defined using column classes from the cassandra.cqlengine.columns package.
-
In Django Cassandra models, there is no max_length parameter for text fields. This aligns with the absence of such a property for the CQL TEXT data type.
-
For tables with composite primary keys, use dedicated syntax to specify clustering columns in your models. For an example, see the Party model in the sample application.
-
Don’t add the editable=False parameter to primary key columns in your models.
-
For models with multi-column primary keys, regardless of the partition/clustering distinction, you must provide a get_pk_field attribute through a Meta class. This allows the Django engine to resolve queries such as MODEL_CLASS.objects.get(pk=…). For an example, see the Party model in the sample application.
If you don’t include this attribute, your application won’t start. However, this attribute’s functionality is of little importance to applications with well-designed data models, Django models, and application logic where such queries are avoided entirely, implicitly and explicitly, as a best practice.
Avoid inefficient queries with Django models
Models that are poorly defined or implemented can produce incorrect or inefficient queries.
For example, using models with broad methods, such as .all(), can lead to ALLOW FILTERING clauses in CQL queries, which is generally considered an anti-pattern.
Similarly, it is possible for the .filter(…) method to include conditions that don’t align with the table’s logical data model, which can cause performance issues or timeouts.
Queries that follow anti-patterns don’t always fail outright.
Instead, your applications might experience degraded performance, such as timeouts and latency spikes.
Make sure your models produce queries that follow CQL conventions and best practices. To be performant in production, your models must respect your underlying data model and Cassandra access patterns.
For example, the Party table from the preceding section has a composite primary key, expressed in CQL as PRIMARY KEY (( city ), id).
Ideally, queries on this table should include both a city and an ID:
parties = Party.objects.filter(city=city, id=id)
However, code that omits the city, such as parties = Party.objects.filter(id=id), doesn’t inherently produce an error.
Instead, the underlying CQL query runs with an ALLOW FILTERING clause, leading to poor performance and potential timeouts in production environments.
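One way to keep such queries out of application code is to build the filter arguments through a small helper that refuses to run without the partition key. The following is a minimal sketch, not part of django-cassandra-engine: the helper name party_filter_kwargs is hypothetical, and it assumes the Party model defined earlier in this guide.

```python
import uuid


def party_filter_kwargs(city, party_id=None):
    """Build keyword arguments for Party.objects.filter() (hypothetical helper).

    Always requires the partition key (city) so that a caller can't
    accidentally issue a query that omits it.
    """
    if not city:
        raise ValueError("city (the partition key) is required")
    kwargs = {"city": city}
    if party_id is not None:
        # Accept either a uuid.UUID or its string form, as received from a URL.
        if not isinstance(party_id, uuid.UUID):
            party_id = uuid.UUID(str(party_id))
        kwargs["id"] = party_id
    return kwargs


# Intended usage in a view (requires the Party model and a configured database):
# parties = Party.objects.filter(**party_filter_kwargs(city, id))
```

Centralizing the check this way means a missing city fails immediately in development instead of surfacing later as a slow query.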
Adjust your data model when migrating RDBMS-based applications
If your applications use capabilities of relational databases, such as foreign keys, you must make adjustments to your data model because Cassandra data modeling is fundamentally different from relational data modeling. For more information, see Data modeling methodology for Cassandra-based databases and the Cassandra data modeling workshop.
If your pre-existing relational-based applications contain RDBMS-related specifications, such as joins, foreign keys, and cascading deletes, you must make structural changes to your data model, application, or both to be compatible with the NoSQL architecture of Cassandra-based databases.
Example RDBMS-based model that is incompatible with NoSQL
The following example uses foreign keys and cascading deletes, making it incompatible with Cassandra-based databases like Astra:
from django.db import models
from whatever import AnotherModel


class MyEntity(models.Model):
    fkField = models.ForeignKey(AnotherModel, on_delete=models.CASCADE)
To adapt this model for use with Cassandra-based databases, you must remove the foreign key and cascading delete specifications by restructuring your data model or offloading these responsibilities to your application logic (instead of building them into your data model).
DataStax recommends restructuring your data model to avoid operations that are inefficient or incompatible with NoSQL databases. This might involve denormalizing your data or using alternative patterns to achieve similar functionality. For example, consider the following strategies:
-
Use application-level logic to enforce relationships instead of relying on database-enforced foreign keys.
-
Use manual references by storing related entity IDs, and then implement logic in your application to manage these relationships.
-
Redesign your data model to embed related data within rows, reducing the need for joins.
-
Use batch operations to handle related data updates atomically, if supported by your use case.
-
Use secondary indexes to facilitate efficient querying of related data, if applicable.
Alternatively, you could offload RDBMS-related responsibilities to your application logic, making minimal or no changes to your data model. This approach is only recommended for data models that are already generally compatible with Cassandra access patterns.
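As an illustration of enforcing relationships in application logic, the following sketch replaces on_delete=models.CASCADE with an explicit function. It is an outline under stated assumptions rather than a prescribed pattern: delete_with_dependents and its three callables are hypothetical names, and the steps aren't atomic, so the application must tolerate a failure between them.

```python
def delete_with_dependents(parent_id, find_child_ids, delete_child, delete_parent):
    """Delete a parent row and the rows that reference it (hypothetical helper).

    find_child_ids(parent_id) returns the IDs of rows that store parent_id
    as a manual reference; delete_child and delete_parent each remove one row.
    Children are removed first, so a failure partway through never leaves
    children that point at a parent that no longer exists.
    """
    # Materialize the IDs before deleting so the lookup isn't affected by deletes.
    child_ids = list(find_child_ids(parent_id))
    for child_id in child_ids:
        delete_child(child_id)
    delete_parent(parent_id)
    return len(child_ids)
```

In a Django application, the callables would wrap model queries. For the child lookup to be efficient, the referencing column should be part of the child table's primary key.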
FileField and ImageField aren’t supported
Models based on Django’s RDBMS-based ORM support fields of type FileField, which pair with a form field of the same name and upload files by storing the actual file content on disk and the filepath in the backend storage engine.
The Cassandra engine has no such facility.
Although you can still use the form field, your application code must manually handle file uploads sent to the application’s endpoint by form POST calls.
This pattern also applies to the more specific ImageField model field.
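The manual handling can be as small as a function that writes the upload to disk and returns a path to store in a text column. The following is a minimal sketch: the save_upload name is hypothetical, and it assumes the uploaded object behaves like Django's UploadedFile, which exposes a name attribute and a chunks() method.

```python
import os
import uuid


def save_upload(uploaded_file, upload_dir):
    """Write an uploaded file to disk and return the stored path.

    The returned path can be saved in a columns.Text() field of a
    Cassandra model in place of a FileField.
    """
    os.makedirs(upload_dir, exist_ok=True)
    # Keep only the extension of the client-supplied name and generate the
    # rest, so uploads can't overwrite each other or escape upload_dir.
    extension = os.path.splitext(os.path.basename(uploaded_file.name))[1]
    stored_path = os.path.join(upload_dir, uuid.uuid4().hex + extension)
    with open(stored_path, "wb") as destination:
        # Stream the content in chunks instead of loading it all into memory.
        for chunk in uploaded_file.chunks():
            destination.write(chunk)
    return stored_path
```

In a view, this might be called as save_upload(request.FILES['photo'], settings.MEDIA_ROOT), where the photo form field name is an assumption.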
When to bypass the model layer
Although the django-cassandra-engine package isn’t as feature rich as Django’s built-in Object-Relational Mapper (ORM) for relational databases, it covers most common use cases at the model layer.
For certain NoSQL-specific use cases that don’t fit into the object-mapper paradigm, you can bypass the model layer. Examples of these use cases include batches, lightweight transactions (LWTs), and TTLs.
To do this, you can pass a complete CQL statement directly to the underlying Session object, which is the Cassandra Python driver’s database connection.
The driver executes your CQL statement as is, rather than running a CQL statement constructed by the Django model and object-mapper.
Then, you add logic to handle the results, particularly for read queries where the result is one or more rows of data.
The following example (from the sample application) demonstrates how to use a Django view function to pass a CQL statement directly to the Session object:
import uuid

from django.db import connection


def change_party_people(request, city, id, prev_value, delta):
    delta_num = int(delta)
    # Bypass the model layer and run CQL on the underlying connection
    cursor = connection.cursor()
    # LWT: the update is applied only if people still equals prev_value
    change_applied = cursor.execute(
        'UPDATE party SET people = %s WHERE city=%s AND id=%s IF people = %s',
        (
            delta_num + prev_value,
            city,
            uuid.UUID(id),  # must respect Cassandra type system
            prev_value,
        ),
    ).one()['[applied]']
    if not change_applied:
        lwt_message = '?LWT_FAILED=1'
    else:
        lwt_message = ''
This example uses LWTs to increment and decrement the people column, which tracks the number of people attending a particular event.
LWTs are one way to prevent concurrent updates from producing inconsistent or invalid data states, such as a negative number of attendees.
However, your applications might use other concurrency solutions that are more efficient and performant for your use case.
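For read queries, the same cursor returns rows that your code must shape into whatever the view or template needs. The following sketch assumes that rows are dictionary-like, as the ['[applied]'] lookup in the preceding example suggests. The function name rows_to_party_summaries and the selected columns are illustrative.

```python
def rows_to_party_summaries(rows):
    """Convert dictionary-like result rows into plain dictionaries for a template."""
    return [
        {
            "id": str(row["id"]),  # render the UUID as a string for use in URLs
            "name": row["name"],
            "people": row["people"],
        }
        for row in rows
    ]


# Intended usage in a view (requires a configured database connection):
# cursor = connection.cursor()
# rows = cursor.execute(
#     'SELECT id, name, people FROM party WHERE city=%s',
#     (city,),
# )
# parties = rows_to_party_summaries(rows)
```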
Use Astra as the storage engine for your Django application
In your project’s Django settings, use the following parameters and secrets to connect your Django application to Astra.
In the sample application, you can find the general project-level settings in parties/parties/settings.py, and you can find the dependencies in requirements.txt.
PlainTextAuthProvider
At the top of the settings file, import the PlainTextAuthProvider to handle authentication to Astra:
from cassandra.auth import PlainTextAuthProvider
INSTALLED_APPS
In the INSTALLED_APPS list, add django_cassandra_engine to the top of the list:
INSTALLED_APPS = [
    'django_cassandra_engine',
    'partyfinder.apps.PartyfinderConfig',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]
DATABASES
Configure the DATABASES dictionary to use the django_cassandra_engine backend and the connection details for your Astra database:
DATABASES = {
    'default': {
        'ENGINE': 'django_cassandra_engine',
        'NAME': KEYSPACE_NAME,
        'OPTIONS': {
            'connection': {
                'auth_provider': PlainTextAuthProvider(
                    AUTH_USERNAME,
                    AUTH_PASSWORD,
                ),
                'cloud': {
                    'secure_connect_bundle': SECURE_BUNDLE_PATH,
                },
            }
        }
    }
}
DataStax recommends that you use environment variables or other secure references for your database connection details:
-
KEYSPACE_NAME: The name of the keyspace within your database that you want to use in your Django application
-
AUTH_USERNAME: The literal string token
-
AUTH_PASSWORD: An Astra application token with access to the database that you want to use in your Django application
-
SECURE_BUNDLE_PATH: The full path to your database’s Secure Connect Bundle (SCB)
For more information about these values, see the Prerequisites.
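A minimal sketch of reading these values from environment variables follows. The environment variable names (ASTRA_DB_KEYSPACE and so on) and the load_astra_settings helper are assumptions for illustration, not names required by Django or Astra.

```python
import os


def load_astra_settings(env=None):
    """Read Astra connection details from environment variables (illustrative names)."""
    env = os.environ if env is None else env
    required = (
        "ASTRA_DB_KEYSPACE",
        "ASTRA_DB_APPLICATION_TOKEN",
        "ASTRA_DB_SECURE_BUNDLE_PATH",
    )
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail at startup with a clear message instead of a late connection error.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "KEYSPACE_NAME": env["ASTRA_DB_KEYSPACE"],
        "AUTH_USERNAME": "token",  # always the literal string 'token'
        "AUTH_PASSWORD": env["ASTRA_DB_APPLICATION_TOKEN"],
        "SECURE_BUNDLE_PATH": env["ASTRA_DB_SECURE_BUNDLE_PATH"],
    }
```

In settings.py, the returned values can then be assigned to the names used in the DATABASES dictionary.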
CASSANDRA_FALLBACK_ORDER_BY_PYTHON
Optionally, you can add CASSANDRA_FALLBACK_ORDER_BY_PYTHON = True to enable in-code sorting for order_by() directives that cannot be mapped to CQL according to the table’s clustering.
Only use this option for small datasets, because in-code sorting can be suboptimal, especially for large result sets.
If you enable this option, it is normal for application logs to contain warnings such as the following. If you don’t enable the fallback option, the same condition is raised as an exception instead of being logged as a warning:
UserWarning: .order_by() with column "-date" failed!
Falling back to ordering in python.
Exception was:
Can't order on 'date', can only order on (clustered) primary keys
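If you prefer not to rely on the fallback, you can sort a small result set explicitly in your own code. The following sketch uses stand-in objects in place of Party rows fetched from a single partition. It illustrates in-code sorting in general and isn't the package's internal implementation.

```python
from datetime import datetime
from types import SimpleNamespace

# Stand-ins for Party objects returned by a query such as
# Party.objects.filter(city=city); real rows come from the database.
parties = [
    SimpleNamespace(name="Book club", date=datetime(2024, 5, 1)),
    SimpleNamespace(name="Launch party", date=datetime(2024, 7, 15)),
    SimpleNamespace(name="Meetup", date=datetime(2024, 6, 10)),
]

# Same intent as order_by('-date'): newest first, sorted in memory.
newest_first = sorted(parties, key=lambda party: party.date, reverse=True)
```

Because the whole result set is held in memory while sorting, the same small-dataset caveat applies.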
Dependencies and drivers
Django applications backed by Astra require, at minimum, the Django and django-cassandra-engine packages.
The django-cassandra-engine package automatically installs ScyllaDB’s version of the Cassandra Python driver, scylla-driver.
This isn’t the DataStax-supported cassandra-driver.
The drivers are extremely similar, but you should be aware of this difference because scylla-driver doesn’t officially support Astra, and its development might diverge from cassandra-driver.
Both drivers are imported with statements like from cassandra.cluster import Cluster, so installing both drivers creates namespace collisions.
If you prefer to use cassandra-driver, uninstall the Scylla driver (pip uninstall scylla-driver), and then install the DataStax-supported Python driver (pip install cassandra-driver).
All other code remains the same unless you were previously using ScyllaDB with ScyllaDB-specific driver code.
In addition to the required dependencies, your application likely has additional dependencies.
For example, the sample application’s requirements.txt file also declares the python-dotenv package, which is used by that project to read secrets from the .env file in settings.py.
Troubleshoot Django applications backed by Astra
- Unapplied migrations
-
Unapplied migration warnings can be ignored as long as you have run sync_cassandra before starting your application for the first time and after making changes to models.
You can ignore these warnings because the migrate command is replaced by sync_cassandra in the Cassandra engine.
- Crash with no error or Segmentation fault
-
If the application crashes, particularly at startup, with no error message or an unhelpful message like Segmentation fault (core dumped), check the following:
-
Make sure your database is not in Hibernated status. If it is, activate the database and then try starting your application.
-
After you change a model, run sync_cassandra before starting the application.
-
Before you start an application for the first time with a new model, run sync_cassandra to create the necessary tables in the database.
- Poor performance or timeouts
-
If your Django applications exhibit poor performance when running queries against your database, review the advice provided throughout this guide to ensure that your models aren’t producing inefficient CQL queries.