Count options

Options for the dsbulk count command.

Use these options with the dsbulk count command to control how DataStax Bulk Loader counts rows.

Databases supported by DataStax Bulk Loader

DataStax Bulk Loader supports the use of the dsbulk load, dsbulk unload, and dsbulk count commands with:

--executor.continuousPaging.enabled, --dsbulk.executor.continuousPaging.enabled {true | false}
Enable or disable continuous paging. If the target cluster does not support continuous paging, or if datastax-java-driver.basic.request.consistency is not ONE or LOCAL_ONE, traditional paging is used regardless of this setting. Can be used with unload and count operations. Not applicable for load.

Default: true
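
As a minimal sketch, a count that forces traditional paging might look like the following. The keyspace and table names (ks1, table1) are hypothetical placeholders, and the script only prints the command unless dsbulk is found on the PATH.

```shell
# Hypothetical keyspace and table names; substitute your own.
keyspace="ks1"
table="table1"

# Disable continuous paging so the count uses traditional paging.
cmd="dsbulk count -k $keyspace -t $table --executor.continuousPaging.enabled false"
echo "$cmd"

# Run only when dsbulk is installed; $cmd is left unquoted on purpose
# so the string is split into separate arguments.
if command -v dsbulk >/dev/null 2>&1; then
  $cmd
fi
```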

--executor.continuousPaging.maxConcurrentQueries, --dsbulk.executor.continuousPaging.maxConcurrentQueries number
The maximum number of continuous paging queries to run in parallel. Set this option to a value less than or equal to the value configured server-side for continuous_paging.max_concurrent_sessions in the cassandra.yaml configuration file; otherwise, some requests may be rejected. Setting this option to zero or any negative value disables it. Can be used with unload and count operations. Not applicable for load.

Default: 60
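
The sketch below illustrates matching this option to the server-side limit. It assumes, purely for illustration, that continuous_paging.max_concurrent_sessions is set to 30 in cassandra.yaml; the keyspace and table names are also hypothetical.

```shell
# Hypothetical values; substitute your own.
keyspace="ks1"
table="table1"
# Assumed server-side continuous_paging.max_concurrent_sessions value.
max_sessions=30

# Keep the client-side limit at or below the server-side limit.
cmd="dsbulk count -k $keyspace -t $table --executor.continuousPaging.maxConcurrentQueries $max_sessions"
echo "$cmd"

# Run only when dsbulk is installed; $cmd is left unquoted on purpose
# so the string is split into separate arguments.
if command -v dsbulk >/dev/null 2>&1; then
  $cmd
fi
```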

--stats.modes { global | ranges | hosts | partitions }
Kind(s) of statistics to compute. Only applicable for count, ignored otherwise. Valid values are:
  • global: Count the total number of rows in the table.
  • ranges: Count the total number of rows per token range in the table.
  • hosts: Count the total number of rows per host in the table.
  • partitions: Count the total number of rows in the N biggest partitions in the table. Choose how many partitions to track with the stats.numPartitions option. For partitions, the results are organized as follows:
    1. Left column: partition key value
    2. Middle column: number of rows using that partition key value
    3. Right column: the partition's percentage of rows compared to the total number of rows in the table

Default: global
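
To illustrate selecting a non-default mode, the following sketch requests per-host counts. The keyspace and table names are hypothetical, and the command is printed first and executed only if dsbulk is available.

```shell
# Hypothetical keyspace and table names; substitute your own.
keyspace="ks1"
table="table1"

# Count rows per host instead of a single global total.
cmd="dsbulk count -k $keyspace -t $table --stats.modes hosts"
echo "$cmd"

# Run only when dsbulk is installed; $cmd is left unquoted on purpose
# so the string is split into separate arguments.
if command -v dsbulk >/dev/null 2>&1; then
  $cmd
fi
```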

--stats.numPartitions number
The number of distinct partitions for which to count rows. Only applicable for count, ignored otherwise.

Default: 10
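
A possible invocation that combines the partitions mode with this option is sketched below. It tracks the 5 biggest partitions rather than the default 10; the keyspace and table names and the value 5 are illustrative assumptions.

```shell
# Hypothetical keyspace and table names; substitute your own.
keyspace="ks1"
table="table1"

# Report row counts for the 5 biggest partitions only.
cmd="dsbulk count -k $keyspace -t $table --stats.modes partitions --stats.numPartitions 5"
echo "$cmd"

# Run only when dsbulk is installed; $cmd is left unquoted on purpose
# so the string is split into separate arguments.
if command -v dsbulk >/dev/null 2>&1; then
  $cmd
fi
```

Each result line would then follow the three-column layout described for the partitions mode: partition key value, row count, and percentage of the table's total rows.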