cassandra.query - Prepared Statements, Batch Statements, Tracing, and Row Factories
Functions
tuple_factory(colnames, rows)
Returns each row as a tuple.
Example:
>>> from cassandra.query import tuple_factory
>>> session = cluster.connect('mykeyspace')
>>> session.row_factory = tuple_factory
>>> rows = session.execute("SELECT name, age FROM users LIMIT 1")
>>> print(rows[0])
('Bob', 42)
Changed in version 2.0.0: moved from cassandra.decoder to cassandra.query
named_tuple_factory(colnames, rows)
Returns each row as a namedtuple. This is the default row factory.
Example:
>>> from cassandra.query import named_tuple_factory
>>> session = cluster.connect('mykeyspace')
>>> session.row_factory = named_tuple_factory
>>> rows = session.execute("SELECT name, age FROM users LIMIT 1")
>>> user = rows[0]
>>> # you can access fields by their name:
>>> print("name: %s, age: %d" % (user.name, user.age))
name: Bob, age: 42
>>> # or you can access fields by their position (like a tuple)
>>> name, age = user
>>> print("name: %s, age: %d" % (name, age))
name: Bob, age: 42
>>> name = user[0]
>>> age = user[1]
>>> print("name: %s, age: %d" % (name, age))
name: Bob, age: 42
Changed in version 2.0.0: moved from cassandra.decoder to cassandra.query
dict_factory(colnames, rows)
Returns each row as a dict.
Example:
>>> from cassandra.query import dict_factory
>>> session = cluster.connect('mykeyspace')
>>> session.row_factory = dict_factory
>>> rows = session.execute("SELECT name, age FROM users LIMIT 1")
>>> print(rows[0])
{'name': 'Bob', 'age': 42}
Changed in version 2.0.0: moved from cassandra.decoder to cassandra.query
ordered_dict_factory(colnames, rows)
Like dict_factory(), but returns each row as an OrderedDict, so the order of the columns is preserved.
Changed in version 2.0.0: moved from cassandra.decoder to cassandra.query
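All four factories share one signature: each receives the column names and the raw rows, and returns a list of converted rows. The sketch below mimics that shape in plain Python so the differences are visible without a cluster. The sketch_* functions and the sample data are illustrative stand-ins, not the driver's own implementations.

```python
from collections import OrderedDict, namedtuple

# Sample data in the shape a row factory receives (hypothetical values).
colnames = ["name", "age"]
rows = [("Bob", 42)]

def sketch_tuple_factory(colnames, rows):
    # Rows stay positional; the column names are not used.
    return [tuple(row) for row in rows]

def sketch_named_tuple_factory(colnames, rows):
    # Fields are reachable by name or by position.
    Row = namedtuple("Row", colnames)
    return [Row(*row) for row in rows]

def sketch_dict_factory(colnames, rows):
    # Each row becomes a column-name -> value mapping.
    return [dict(zip(colnames, row)) for row in rows]

def sketch_ordered_dict_factory(colnames, rows):
    # Same mapping, with column order preserved explicitly.
    return [OrderedDict(zip(colnames, row)) for row in rows]

as_tuple = sketch_tuple_factory(colnames, rows)[0]
as_named = sketch_named_tuple_factory(colnames, rows)[0]
as_dict = sketch_dict_factory(colnames, rows)[0]
as_ordered = sketch_ordered_dict_factory(colnames, rows)[0]
```

Any callable with the same (colnames, rows) signature can presumably be assigned to session.row_factory in the same way as the built-in factories shown above.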
class SimpleStatement
A simple, un-prepared query.
query_string should be a literal CQL statement with the exception of parameter placeholders that will be filled through the parameters argument of Session.execute().
See Statement attributes for a description of the other parameters.
class PreparedStatement
A statement that has been prepared against at least one Cassandra node. Instances of this class should not be created directly, but through Session.prepare().
A PreparedStatement should be prepared only once. Re-preparing a statement may affect performance (as the operation requires a network roundtrip).
A note about * in prepared statements: Do not use * in prepared statements if you might change the schema of the table being queried. The driver and server each maintain a map between metadata for a schema and statements that were prepared against that schema. When a user changes a schema, e.g. by adding or removing a column, the server invalidates its mappings involving that schema. However, there is currently no way to propagate that invalidation to drivers. Thus, after a schema change, the driver will incorrectly interpret the results of SELECT * queries prepared before the schema change. This is currently being addressed in CASSANDRA-10786.
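The stale-metadata problem can be pictured with a toy model. This is a deliberately simplified sketch, not the driver's decoding logic: it assumes the driver labels each returned value using column names cached at prepare time, and the column names, values, and the position of the new column are all hypothetical.

```python
# Toy model: values in a row are labeled with column metadata
# that was cached when the statement was prepared.
cached_columns = ["name", "age"]      # cached for a prepared "SELECT *"

def decode(columns, row):
    # zip() stops at the shorter input, so surplus values are dropped silently.
    return dict(zip(columns, row))

# Before the schema change the cached metadata matches the row.
before = decode(cached_columns, ("Bob", 42))

# After a column (here "address") is added, the server returns three values,
# but the cached metadata still describes two columns.
after = decode(cached_columns, ("Bob", "1 Main St", 42))

# Naming the columns in the query keeps the result shape fixed,
# so the same metadata stays valid after the schema change.
explicit = decode(["name", "age"], ("Bob", 42))
```

In this model the value labeled age after the schema change is actually the address, which is the kind of misinterpretation the note warns about. Listing columns explicitly, as in SELECT name, age, avoids it.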
Methods
bind(values)
Creates and returns a BoundStatement instance using values.
See BoundStatement.bind() for rules on input values.
class BoundStatement
A prepared statement that has been bound to a particular set of values. These may be created directly or through PreparedStatement.bind().
prepared_statement should be an instance of PreparedStatement.
See Statement attributes for a description of the other parameters.
Attributes
prepared_statement = None
The PreparedStatement instance that this was created from.
values = None
The sequence of values that were bound to the prepared statement.
Methods
bind(values)
Binds a sequence of values for the prepared statement parameters and returns this instance. Note that values must be:
- a sequence, even if you are only binding one value, or
- a dict that relates 1-to-1 between dict keys and columns
Changed in version 2.6.0: UNSET_VALUE was introduced. These can be bound as positional parameters in a sequence, or by name in a dict. Additionally, when using protocol v4+:
- short sequences will be extended to match bind parameters with UNSET_VALUE
- names may be omitted from a dict with UNSET_VALUE implied.
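The protocol v4+ padding rules can be sketched in plain Python. This is an illustration under stated assumptions, not the driver's code: UNSET is a local stand-in for cassandra.query.UNSET_VALUE, sketch_bind and the parameter names are hypothetical, and raising ValueError for surplus values is an assumption.

```python
# Local stand-in sentinel; the real one is cassandra.query.UNSET_VALUE.
UNSET = object()

def sketch_bind(param_names, values):
    """Return one value per bind parameter, filling gaps with UNSET."""
    if isinstance(values, dict):
        # Names missing from the dict are treated as unset.
        return [values.get(name, UNSET) for name in param_names]
    values = list(values)
    if len(values) > len(param_names):
        # Assumed behavior: surplus positional values are an error.
        raise ValueError("too many values for the prepared statement")
    # Short sequences are padded on the right with the sentinel.
    return values + [UNSET] * (len(param_names) - len(values))

params = ["name", "age", "email"]
from_sequence = sketch_bind(params, ("Bob",))   # note the one-element tuple
from_dict = sketch_bind(params, {"name": "Bob", "email": "bob@example.com"})
```

Even a single value is passed as a sequence, ("Bob",), in line with the first rule in the list above.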
Attributes
routing_key
The partition_key portion of the primary key, which can be used to determine which nodes are replicas for the query.
If the partition key is a composite, a list or tuple must be passed in. Each key component should be in its packed (binary) format, so all components should be strings.
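As an illustration of "packed (binary) format", the sketch below packs two key components by hand. The composite key (an int user_id and a text region) is hypothetical. It relies on the CQL int type being serialized as 4 big-endian bytes and text as UTF-8; under Python 3 the packed "strings" are bytes objects.

```python
import struct

# Hypothetical composite partition key: (user_id int, region text).
user_id = 42
region = "emea"

packed_user_id = struct.pack(">i", user_id)   # CQL int: 4 bytes, big-endian
packed_region = region.encode("utf-8")        # CQL text: UTF-8 bytes

# For a composite partition key, the packed components go in a list or tuple.
routing_key_components = [packed_user_id, packed_region]
```

A list like routing_key_components is the kind of value the routing_key attribute expects for a composite key.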
class Statement
An abstract class representing a single query. There are three subclasses: SimpleStatement, BoundStatement, and BatchStatement. These can be passed to Session.execute().
Attributes
retry_policy = None
An instance of a cassandra.policies.RetryPolicy or one of its subclasses. This controls when a query will be retried and how it will be retried.
consistency_level = None
The ConsistencyLevel to be used for this operation. Defaults to None, which means that the default consistency level for the Session this is executed in will be used.
fetch_size = <object object>
How many rows will be fetched at a time. This overrides the default of Session.default_fetch_size. This only takes effect when protocol version 2 or higher is used. See Cluster.protocol_version for details.
keyspace = None
The string name of the keyspace this query acts on. This is used when TokenAwarePolicy is configured for Cluster.load_balancing_policy.
It is set implicitly on BoundStatement and BatchStatement, but must be set explicitly on SimpleStatement.
custom_payload = None
Custom Payloads to be passed to the server.
These are only allowed when using protocol version 4 or higher.
New in version 2.6.0.
routing_key
The partition_key portion of the primary key, which can be used to determine which nodes are replicas for the query.
If the partition key is a composite, a list or tuple must be passed in. Each key component should be in its packed (binary) format, so all components should be strings.
serial_consistency_level
The serial consistency level is only used by conditional updates (INSERT, UPDATE and DELETE with an IF condition). For those, the serial_consistency_level defines the consistency level of the serial phase (or “paxos” phase) while the normal consistency_level defines the consistency for the “learn” phase, i.e. what type of reads will be guaranteed to see the update right away. For example, if a conditional write has a consistency_level of QUORUM (and is successful), then a QUORUM read is guaranteed to see that write. But if the regular consistency_level of that write is ANY, then only a read with a consistency_level of SERIAL is guaranteed to see it (even a read with consistency ALL is not guaranteed to be enough).
The serial consistency can only be one of SERIAL or LOCAL_SERIAL. While SERIAL guarantees full linearizability (with other SERIAL updates), LOCAL_SERIAL only guarantees it in the local data center.
The serial consistency level is ignored for any query that is not a conditional update. Serial reads should use the regular consistency_level.
Serial consistency levels may only be used against Cassandra 2.0+ and the protocol_version must be set to 2 or higher.
See Lightweight Transactions (Compare-and-set) for a discussion on how to work with results returned from conditional statements.
New in version 2.0.0.
Module Data
UNSET_VALUE
Specifies an unset value when binding a prepared statement.
Unset values are ignored, allowing prepared statements to be used without specifying a value for every parameter.
See https://issues.apache.org/jira/browse/CASSANDRA-7304 for further details on semantics.
New in version 2.6.0.
Only valid when using native protocol v4+
class BatchStatement
A protocol-level batch of operations which are applied atomically by default.
New in version 2.0.0.
batch_type specifies the BatchType for the batch operation. Defaults to BatchType.LOGGED.
retry_policy should be a RetryPolicy instance for controlling retries on the operation.
consistency_level should be a ConsistencyLevel value to be used for all operations in the batch.
custom_payload is a Custom Payloads map passed to the server. Note: as Statement objects are added to the batch, this map is updated with any values found in their custom payloads. These are only allowed when using protocol version 4 or higher.
Example usage:
from cassandra import ConsistencyLevel
from cassandra.query import BatchStatement
insert_user = session.prepare("INSERT INTO users (name, age) VALUES (?, ?)")
batch = BatchStatement(consistency_level=ConsistencyLevel.QUORUM)
for (name, age) in users_to_insert:
    batch.add(insert_user, (name, age))
session.execute(batch)
You can also mix different types of operations within a batch:
batch = BatchStatement()
batch.add(SimpleStatement("INSERT INTO users (name, age) VALUES (%s, %s)"), (name, age))
batch.add(SimpleStatement("DELETE FROM pending_users WHERE name=%s"), (name,))
session.execute(batch)
New in version 2.0.0.
Changed in version 2.1.0: Added serial_consistency_level as a parameter
Changed in version 2.6.0: Added custom_payload as a parameter
Attributes
serial_consistency_level = None
The same as Statement.serial_consistency_level, but is only supported when using protocol version 3 or higher.
batch_type = None
The BatchType for the batch operation. Defaults to BatchType.LOGGED.
Methods
clear()
This is a convenience method to clear a batch statement for reuse.
Note: it should not be used concurrently with uncompleted execution futures executing the same BatchStatement.
add(statement, parameters=None)
Adds a Statement and optional sequence of parameters to be used with the statement to the batch.
As with other statements, parameters must be a sequence, even if there is only one item.
add_all(statements, parameters)
Adds a sequence of Statement objects and a matching sequence of parameters to the batch. Statement and parameter sequences must be of equal length or one will be truncated. None can be used in the parameters position where no parameters are needed.
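The truncation described for add_all can be pictured with Python's zip(), which stops at the shorter of its inputs. The sketch below assumes the pairing behaves that way. The names are hypothetical, and plain strings stand in for Statement objects.

```python
# Plain strings stand in for Statement objects (hypothetical names).
statements = ["insert_user", "delete_pending", "update_stats"]

# One entry short; None marks a statement that takes no parameters.
parameters = [("Bob", 42), None]

# zip()-style pairing: the longer sequence is cut to the length of the shorter.
pairs = list(zip(statements, parameters))

# Statements that were left without a partner and would be skipped.
dropped = statements[len(pairs):]
```

Here update_stats ends up in dropped, so checking that both sequences have the same length before calling add_all avoids silently losing statements.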
class BatchType
A BatchType is used with BatchStatement instances to control the atomicity of the batch operation.
Attributes
LOGGED = BatchType.LOGGED
Atomic batch operation.
UNLOGGED = BatchType.UNLOGGED
Non-atomic batch operation.
COUNTER = BatchType.COUNTER
Batches of counter operations.
class ValueSequence
A wrapper class that is used to specify that a sequence of values should be treated as a CQL list of values instead of a single column collection when used as part of the parameters argument for Session.execute().
This is typically needed when supplying a list of keys to select. For example:
>>> my_user_ids = ('alice', 'bob', 'charles')
>>> query = "SELECT * FROM users WHERE user_id IN %s"
>>> session.execute(query, parameters=[ValueSequence(my_user_ids)])
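The distinction between a collection value and a list of values is easier to see in the rendered CQL text. The sketch below is a rough illustration only: the helper functions are hypothetical, the quoting is naive, and it does not reproduce the driver's actual encoding.

```python
def quote(value):
    # Naive CQL string quoting for illustration (doubles single quotes).
    return "'" + value.replace("'", "''") + "'"

def as_collection_literal(values):
    # A single list-typed column value is written with brackets: ['a', 'b']
    return "[" + ", ".join(quote(v) for v in values) + "]"

def as_value_sequence(values):
    # An IN clause takes a parenthesized list of values: ('a', 'b')
    return "(" + ", ".join(quote(v) for v in values) + ")"

my_user_ids = ["alice", "bob", "charles"]
template = "SELECT * FROM users WHERE user_id IN %s"

with_collection = template % as_collection_literal(my_user_ids)
with_value_sequence = template % as_value_sequence(my_user_ids)
```

The parenthesized form in with_value_sequence is the shape an IN clause needs, which is why the sequence is wrapped in ValueSequence rather than passed as an ordinary collection.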
class QueryTrace
A trace of the duration and events that occurred when executing an operation.
Attributes
request_type = None
A string that very generally describes the traced operation.
duration = None
A datetime.timedelta measure of the duration of the query.
client = None
The IP address of the client that issued this request. This is only available when using Cassandra 2.2+
coordinator = None
The IP address of the host that acted as coordinator for this request.
parameters = None
A dict of parameters for the traced operation, such as the specific query string.
started_at = None
A UTC datetime.datetime object describing when the operation was started.
events = None
A chronologically sorted list of TraceEvent instances representing the steps the traced operation went through. This corresponds to the rows in system_traces.events for this tracing session.
trace_id = None
uuid.UUID unique identifier for this tracing session. Matches the session_id column in system_traces.sessions and system_traces.events.
Methods
populate(max_wait=2.0, wait_for_complete=True, query_cl=None)
Retrieves the actual tracing details from Cassandra and populates the attributes of this instance. Because tracing details are stored asynchronously by Cassandra, this may need to retry the session detail fetch. If the trace is still not available after max_wait seconds, TraceUnavailable will be raised; if max_wait is None, this will retry forever.
wait_for_complete=False bypasses the wait for duration to be populated. This can be used to query events from partial sessions.
query_cl specifies a consistency level to use for polling the trace tables, if it should be different than the session default.
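The retry behavior described for populate can be sketched as a polling loop. This is a minimal model, not the driver's implementation: poll_trace, the fetch callable, the interval parameter, and SketchTraceUnavailable are hypothetical names, and the session row is modeled as a dict whose "duration" stays None until the trace is complete.

```python
import time

class SketchTraceUnavailable(Exception):
    """Local stand-in for TraceUnavailable."""

def poll_trace(fetch, max_wait=2.0, wait_for_complete=True, interval=0.01):
    """Call fetch() until it yields a usable session row or max_wait elapses.

    fetch returns None while no row exists yet, or a dict with a
    "duration" key that is None until the trace has been completed.
    """
    start = time.monotonic()
    while True:
        row = fetch()
        if row is not None and (not wait_for_complete or row["duration"] is not None):
            return row
        # max_wait=None means retry forever.
        if max_wait is not None and time.monotonic() - start >= max_wait:
            raise SketchTraceUnavailable(
                "trace not available after %s seconds" % max_wait)
        time.sleep(interval)

# Fake backend: the row appears on the 2nd call, its duration on the 3rd.
responses = iter([None, {"duration": None}, {"duration": 1500}])
complete = poll_trace(lambda: next(responses))

# With wait_for_complete=False the partial row is accepted as soon as it exists.
partial_responses = iter([None, {"duration": None}])
partial = poll_trace(lambda: next(partial_responses), wait_for_complete=False)
```

A caller that only needs the events recorded so far would use the wait_for_complete=False path, accepting that duration may still be unset.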
class TraceEvent
Representation of a single event within a query trace.
Attributes
description = None
A brief description of the event.
datetime = None
A UTC datetime.datetime marking when the event occurred.
source = None
The IP address of the node this event occurred on.
source_elapsed = None
A datetime.timedelta measuring the amount of time until this event occurred starting from when source first received the query.
thread_name = None
The name of the thread that this event occurred on.
exception TraceUnavailable
Raised when complete trace details cannot be fetched from Cassandra.