cassandra-stress write

Multiple concurrent writes against the cluster.

Synopsis

cassandra-stress write [arguments]
Table 1. Legend
Syntax conventions Description
UPPERCASE Literal keyword.
Lowercase Not literal.
Italics Variable value. Replace with a valid option or user-defined value.
[ ] Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets.
( ) Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
| Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
... Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string' Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.
{ key:value } Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.
<datatype1,datatype2> Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement; End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[ -- ] Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> ' Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type' Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files.

Definition

Command options

cl=?
Set the consistency level to use during cassandra-stress. Options include ONE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL, ANY, and LOCAL_ONE. Default is LOCAL_ONE.
clustering=DIST(?)
Distribution that controls how runs of operations of the same kind are clustered together.
duration=?
Specify the time to run, in seconds, minutes or hours.
err<?
Specify a standard error of the mean; when this value is reached, cassandra-stress will end. Default is 0.02.
n>?
Specify a minimum number of iterations to run before accepting uncertainty convergence.
n<?
Specify a maximum number of iterations to run before accepting uncertainty convergence.
n=?
Specify the number of operations to run.
no-warmup
Do not warm up the process; do a cold start.
ops(?)
Specify what operations to run and the number of each. (only with the user option)
profile=?
Designate the YAML file to use with cassandra-stress. (only with the user option)
truncate=?
Truncate the table created during cassandra-stress. Options are never, once, or always. Default is never.
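
The err<?, n>?, and n<? options (and the threads>=N and threads<=N settings of -rate) contain characters that most shells treat as redirection operators, so they generally need to be quoted on the command line. The following is a minimal sketch of one way to do that; the numeric values are illustrative assumptions, not recommended settings, and the command is only printed rather than run.

```shell
#!/bin/sh
# Illustrative values only (assumptions, not tuning advice).
# Each argument containing < or > is quoted so the shell passes it through
# to cassandra-stress instead of opening or creating files named 1000, 0.01, etc.
set -- write "n>1000" "n<100000" "err<0.01" -rate "threads>=8" "threads<=64" auto

# Print the assembled command for review.
# To run it against a cluster, call: cassandra-stress "$@"
printf '%s\n' "cassandra-stress $*"
```

Without the quotes, a shell would typically parse n>1000 as the argument n with standard output redirected to a file named 1000.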

Command arguments

-col
Column details, such as size and count distribution, data generator, names, and comparator.
Usage:
-col names=? [slice] [super=?] [comparator=?] [timestamp=?] [size=DIST(?)]
 or 
-col [n=DIST(?)] [slice] [super=?] [comparator=?] [timestamp=?] [size=DIST(?)]
-errors
How to handle errors when encountered during stress testing.
Usage:
-errors [retries=N] [ignore] [skip-read-validation]
  • retries=N Number of times to try each operation before failing.
  • ignore Do not fail on errors.
  • skip-read-validation Skip read validation and message output.
-graph
Graph results of cassandra-stress tests. Multiple tests can be graphed together.
Usage:
-graph file=? [revision=?] [title=?] [op=?]
-insert
Insert-specific options relating to various methods for batching and splitting partition updates.
Usage:
-insert [revisit=DIST(?)] [visits=DIST(?)] partitions=DIST(?) [batchtype=?] select-ratio=DIST(?) row-population-ratio=DIST(?)
-log
Where to log progress and the interval to use.
Usage:
-log [level=?] [no-summary] [file=?] [hdrfile=?] [interval=?] [no-settings] [no-progress] [show-queries] [query-log-file=?]
-mode
Thrift or CQL with options.
Usage:
-mode thrift [smart] [user=?] [password=?]
  or 
-mode native [unprepared] cql3 [compression=?] [port=?] [user=?] [password=?] [auth-provider=?] [maxPending=?] [connectionsPerHost=?] [protocolVersion=?]
  or
-mode simplenative [prepared] cql3 [port=?]
-node
Nodes to connect to.
Usage:
-node [datacenter=?] [whitelist] [file=?] []
-pop
Population distribution and intra-partition visit order.
Usage:
-pop seq=? [no-wrap] [read-lookback=DIST(?)] [contents=?]
  or
-pop [dist=DIST(?)] [contents=?]
-port
Specify the port for connecting to Cassandra nodes. A port can be specified for the Cassandra native protocol, the Thrift protocol, or JMX (for retrieving statistics).
Usage:
-port [native=?] [thrift=?] [jmx=?]
-rate
Set the rate using one of the following forms.
Usage:
-rate threads=N [throttle=N] [fixed=N]
  • threads=N Number of clients to run concurrently.
  • throttle=N Throttle operations per second across all clients to a maximum rate (or less) with no implied schedule. Default is 0.
  • fixed=N Expect a fixed rate of operations per second across all clients with an implied schedule. Default is 0.
  or
-rate [threads>=N] [threads<=N] [auto]
  • threads>=N Run at least this many clients concurrently. Default is 4.
  • threads<=N Run at most this many clients concurrently. Default is 1000.
  • auto Stop increasing threads once throughput saturates.
-schema
Replication settings, compression, compaction, and so on.
Usage:
-schema [replication(?)] [keyspace=?] [compaction(?)] [compression=?]
-sendto
Specify a server to send the stress command to.
Usage:
-sendto <host>
-tokenrange
Token range settings.
Usage:
-tokenrange [no-wrap] [split-factor=?] [savedata=?]
-transport
Custom transport factories.
Usage:
-transport [factory=?] [truststore=?] [truststore-password=?] [keystore=?] [keystore-password=?] [ssl-protocol=?] [ssl-alg=?] [store-type=?] [ssl-ciphers=?]

Simple write example

# Insert (write) one million rows
$ cassandra-stress write n=1000000 -rate threads=50

Populate the database

Generally it is easier to let cassandra-stress create the basic schema and then modify it in CQL:

# Load one row with default schema
$ cassandra-stress write n=1 cl=one -mode native cql3 -log file=create_schema.log

# Modify schema in CQL
$ cqlsh

# Run a real write workload
$ cassandra-stress write n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -log file=load_1M_rows.log

Change the replication strategy

Changes the replication strategy to NetworkTopologyStrategy and targets one node named existing.

$ cassandra-stress write n=500000 no-warmup -node existing -schema "replication(strategy=NetworkTopologyStrategy, existing=2)"

Split up a load over multiple cassandra-stress instances on different nodes

This example demonstrates loading into large clusters, where a single cassandra-stress load generator node cannot saturate the cluster. Each instance writes a separate, non-overlapping range of the population, set with -pop seq. In this example, $NODES is a variable whose value is a comma-delimited list of IP addresses such as 10.0.0.1, 10.0.0.2, and so on.

# On Node1
$ cassandra-stress write n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=1..1000000 -log file=~/node1_load.log -node $NODES

# On Node2
$ cassandra-stress write n=1000000 cl=one -mode native cql3 -schema keyspace="keyspace1" -pop seq=1000001..2000000 -log file=~/node2_load.log -node $NODES
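
The -pop seq ranges in the commands above follow a simple pattern: generator i writes rows (i-1)*n+1 through i*n. As a rough sketch, the ranges for any number of load generators could be computed with a small helper like the one below. The names seq_range, ROWS_PER_NODE, and GENERATORS are hypothetical and not part of cassandra-stress; the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper, not part of cassandra-stress.
ROWS_PER_NODE=1000000   # rows written by each instance (assumed, matches n= above)
GENERATORS=2            # number of load generator nodes (assumed)

# Print the non-overlapping -pop seq range for the 1-based generator index $1.
seq_range() {
  start=$(( ($1 - 1) * ROWS_PER_NODE + 1 ))
  end=$(( $1 * ROWS_PER_NODE ))
  echo "${start}..${end}"
}

# Print (do not run) the command intended for each generator node.
i=1
while [ "$i" -le "$GENERATORS" ]; do
  echo "# On Node${i}"
  echo "cassandra-stress write n=${ROWS_PER_NODE} cl=one -mode native cql3 -schema keyspace=\"keyspace1\" -pop seq=$(seq_range "$i") -log file=~/node${i}_load.log -node \$NODES"
  i=$((i + 1))
done
```

With the assumed values, the helper yields 1..1000000 for the first generator and 1000001..2000000 for the second, matching the two commands shown above.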

Run cassandra-stress with authentication and SSL encryption

The following example uses the -mode option to supply a username and password, and the -transport option to supply SSL parameters:

$ cassandra-stress write n=100k cl=ONE no-warmup -mode native cql3 user=cassandra password=cassandra \
  -transport truststore=/usr/local/lib/dsc-cassandra/conf/server-truststore.jks truststore-password=truststorePass \
  factory=org.apache.cassandra.thrift.SSLTransportFactory \
  keystore=/usr/local/lib/dsc-cassandra/conf/server-keystore.jks keystore-password=myKeyPass
Note: Cassandra authentication and SSL encryption must already be configured before executing cassandra-stress with these options. The example shown above uses self-signed CA certificates.

Run cassandra-stress using the truncate option

Place the truncate option before the -mode option; otherwise, cassandra-stress does not apply truncation as specified.

The following example shows the truncate option:

$ cassandra-stress write n=100000000 cl=QUORUM truncate=always -schema keyspace=keyspace -rate threads=200 -log file=write_$NOW.log
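
The command above references $NOW, which this page does not define. One plausible reading is that it is a shell variable holding a timestamp, used to give each run a unique log file name. Under that assumption, it could be set before running the command, for example:

```shell
#!/bin/sh
# Assumption: NOW is a timestamp used only to make the log file name unique.
# The format string is an arbitrary choice.
NOW=$(date +%Y%m%d_%H%M%S)

# LOG_FILE is a hypothetical variable showing the resulting file name.
LOG_FILE="write_${NOW}.log"
echo "$LOG_FILE"
```

If NOW is left unset, the shell expands write_$NOW.log to write_.log, so every run would log to the same file.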