Logging options

Specify logging and error options for the dsbulk command. Log messages are written only to the main log file (operation.log) and to standard error; nothing is printed to standard output.

The options can be used in short form (-k keyspace_name) or long form (--schema.keyspace keyspace_name).

-ansiMode, --log.ansiMode, --dsbulk.log.ansiMode { normal | force | disable }

Whether to use ANSI colors and other escape sequences in log messages printed to the console. By default (normal), dsbulk uses colored output when the terminal is either:
  • Compatible with ANSI escape sequences. All common terminals on *nix and BSD systems, including macOS, are ANSI-compatible, as are some popular terminals for Windows (Mintty, MinGW).
  • A standard Windows DOS command prompt, where ANSI sequences are translated on the fly.

The force value makes dsbulk use ANSI colors even when the terminal is not detected as ANSI-compatible. There should be no reason to disable ANSI escape sequences, but if colored messages are not desired or are not printed correctly, the disable value turns off ANSI support altogether.

On Windows, ANSI support works best with the Microsoft Visual C++ 2008 SP1 Redistributable Package installed.

Default: normal

-maxErrors, --log.maxErrors, --dsbulk.log.maxErrors { number | "N%" }

The maximum number of errors to tolerate before aborting the entire operation. Set to either a number or a string of the form N% where N is a decimal number between 0 and 100. Setting this value to -1 disables this feature (not recommended).

Default: 10
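
The two accepted forms of this setting (an absolute count or an "N%" string) can be illustrated with a minimal Python sketch. This is not dsbulk's implementation: the function names are hypothetical, and the exact comparison (abort when the error count or ratio strictly exceeds the threshold) and the inclusive 0 to 100 range are assumptions made for illustration.

```python
def parse_max_errors(value):
    """Interpret a maxErrors-style setting (hypothetical helper).

    Returns ("disabled", None), ("absolute", int) or ("ratio", float).
    """
    text = str(value).strip()
    if text.endswith("%"):
        percent = float(text[:-1])
        # Assumption: both bounds are treated as inclusive here.
        if not 0 <= percent <= 100:
            raise ValueError("percentage must be between 0 and 100: " + text)
        return ("ratio", percent / 100.0)
    number = int(text)
    if number == -1:
        # -1 disables the limit (not recommended, per the option description).
        return ("disabled", None)
    if number < 0:
        raise ValueError("maxErrors must be -1 or a non-negative number: " + text)
    return ("absolute", number)


def should_abort(errors, total, setting):
    """Assumed rule: abort once errors strictly exceed the threshold."""
    kind, threshold = parse_max_errors(setting)
    if kind == "disabled":
        return False
    if kind == "absolute":
        return errors > threshold
    return total > 0 and errors / total > threshold
```

With the default of 10, this sketch tolerates ten errors and aborts on the eleventh; with "5%", it aborts once more than 5% of the records processed so far have failed.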

-logDir, --log.directory, --dsbulk.log.directory path_to_directory

The writable directory where all log files will be stored; if the directory specified does not exist, it will be created. URLs are not acceptable (not even file:/ URLs). Log files for a specific run, or execution, will be located in a sub-directory under the specified directory. Each execution generates a sub-directory identified by an "execution ID". See engine.executionId for more information about execution IDs. Relative paths will be resolved against the current working directory. Also, for convenience, if the path begins with a tilde (~), that symbol will be expanded to the current user's home directory.

Default: ./logs
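
The path rules described for this option (no URLs, tilde expansion, relative paths resolved against the working directory, one sub-directory per execution) can be sketched in Python as follows. The function names and the URL-detection pattern are hypothetical, and the execution ID in the usage note is a made-up placeholder; dsbulk's own resolution logic may differ in detail.

```python
import os
import re

# Assumption: anything starting with "scheme:/" (scheme of 2+ characters, so
# Windows drive letters such as "C:/" are not matched) is treated as a URL.
_URL_PREFIX = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]+:/")


def resolve_log_directory(path, cwd=None):
    """Resolve a log directory setting to an absolute path (sketch)."""
    if _URL_PREFIX.match(path):
        raise ValueError("URLs are not accepted as log directories: " + path)
    # A leading "~" expands to the current user's home directory.
    expanded = os.path.expanduser(path)
    # Relative paths are resolved against the current working directory;
    # os.path.join returns `expanded` unchanged if it is already absolute.
    base = cwd if cwd is not None else os.getcwd()
    return os.path.normpath(os.path.join(base, expanded))


def execution_directory(log_directory, execution_id, cwd=None):
    """Each execution writes to a sub-directory named after its execution ID."""
    return os.path.join(resolve_log_directory(log_directory, cwd), execution_id)
```

For example, with a working directory of /tmp/work, execution_directory("./logs", "EXAMPLE_ID", cwd="/tmp/work") yields /tmp/work/logs/EXAMPLE_ID on a POSIX system.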

--log.sources, --dsbulk.log.sources boolean

Whether to print record sources in debug files. When set to true (the default), debug files contain – for each record that failed to be processed – its original source, such as the text line that the record was parsed from.

When loading, enabling this option also enables the creation of so-called "bad files": files containing the original lines that could not be inserted. These files can then be used as the data source of a subsequent load operation that loads only the failed records.

This feature is useful to locate failed records more easily and diagnose processing failures – especially if the original data source is a remote one, such as an FTP or HTTP URL.

For this feature to work, record sources must be kept in memory until the record is fully processed. For large records (over 1 megabyte per record), retaining record sources in memory can put high pressure on the JVM heap and expose the operation to out-of-memory errors. Batching exacerbates the problem. If you experience such errors, consider disabling this option.

Note: DataStax Bulk Loader for Apache Cassandra® always prints the record's resource, which is the file name or database table that the record came from. When available, it also prints the record's position, which is the ordinal position of the record inside the resource. For example, the position could be the record's line number in a CSV file.

Default: true

--log.stmt.level, --dsbulk.log.stmt.level { ABRIDGED | NORMAL | EXTENDED }
The desired level of detail when printing statements to log files. Valid values are:
  • ABRIDGED: Print only basic information in summarized form.
  • NORMAL: Print basic information in summarized form, and the statement's query string, if available. For batch statements, this verbosity level also prints information about the batch's inner statements.
  • EXTENDED: Print full information, including the statement's query string, if available, and the statement's bound values, if available. For batch statements, this verbosity level also prints all information available about the batch's inner statements.

Default: EXTENDED

--log.stmt.maxBoundValueLength, --dsbulk.log.stmt.maxBoundValueLength number
The maximum length for a bound value. Bound values longer than this value will be truncated.
Important: Setting this value to -1 disables this feature (not recommended).

Default: 50

--log.stmt.maxBoundValues, --dsbulk.log.stmt.maxBoundValues number
The maximum number of bound values to print. If the statement has more bound values than this limit, the excess values will not be printed.
Important: Setting this value to -1 disables this feature (not recommended).

Default: 50

--log.stmt.maxInnerStatements, --dsbulk.log.stmt.maxInnerStatements number
The maximum number of inner statements to print for a batch statement. Only applicable to batch statements; ignored otherwise. If the batch statement has more children than this value, the excess child statements will not be printed.
Important: Setting this value to -1 disables this feature (not recommended).

Default: 10

--log.stmt.maxQueryStringLength, --dsbulk.log.stmt.maxQueryStringLength number
The maximum length for a query string. Query strings longer than this value will be truncated.
Important: Setting this value to -1 disables this feature (not recommended).

Default: 500
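
The four log.stmt.max* limits described above share the same pattern: content beyond the limit is dropped, and -1 turns the limit off. A minimal Python sketch of that pattern follows. The helper names are hypothetical, and the sketch simply cuts the content; whether dsbulk appends a truncation marker to shortened output is not specified here and is not modeled.

```python
def truncate_text(text, max_length):
    """Cut a bound value or query string to max_length characters.

    A max_length of -1 disables truncation (not recommended for large values).
    """
    if max_length == -1:
        return text
    return text[:max_length]


def limit_items(items, max_items):
    """Keep at most max_items bound values or inner statements.

    A max_items of -1 disables the limit.
    """
    items = list(items)
    if max_items == -1:
        return items
    return items[:max_items]


# Example using the documented defaults: 50 characters per bound value,
# 500 characters per query string.
bound_value = truncate_text("x" * 80, 50)
query = truncate_text("SELECT * FROM ks.t WHERE id = ?", 500)
```

Under these assumptions, an 80-character bound value is logged as its first 50 characters, while the short query string is printed unchanged.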

-verbosity, --log.verbosity, --dsbulk.log.verbosity { 0 | 1 | 2 }
Desired level of verbosity. Valid values are:
  • 0 (quiet): Only log WARN and ERROR messages.
  • 1 (normal): Log INFO, WARN and ERROR messages.
  • 2 (verbose): Log DEBUG, INFO, WARN and ERROR messages.

Default: 1
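
The verbosity levels listed above amount to a lookup from the setting to the set of message levels that are logged. The following Python sketch restates that table; the names VERBOSITY_LEVELS and is_logged are hypothetical and serve only as an illustration of the mapping.

```python
# Message levels logged at each verbosity setting, as listed above.
VERBOSITY_LEVELS = {
    0: ("WARN", "ERROR"),                   # quiet
    1: ("INFO", "WARN", "ERROR"),           # normal (default)
    2: ("DEBUG", "INFO", "WARN", "ERROR"),  # verbose
}


def is_logged(level, verbosity=1):
    """Return True if a message at `level` is logged at this verbosity."""
    if verbosity not in VERBOSITY_LEVELS:
        raise ValueError("verbosity must be 0, 1 or 2")
    return level in VERBOSITY_LEVELS[verbosity]
```

For instance, is_logged("INFO", 0) is False because the quiet setting only keeps WARN and ERROR messages.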