DataStax Enterprise 5.1 release notes
The DataStax Enterprise release notes cover cluster requirements, upgrade guidance, components, changes and enhancements, issues, and resolved issues for DataStax Enterprise 5.1.
Uniform licensing requirement
All nodes in each cluster must be uniformly licensed to use the same subscription. For example, if a cluster contains 5 nodes, all 5 nodes in that cluster must be either DataStax Distribution of Apache Cassandra™ or DataStax Enterprise. Different subscriptions cannot be mixed within a cluster. The DataStax Advanced Workloads Pack can be added incrementally to DataStax Enterprise (but not to DataStax Distribution of Apache Cassandra). For example, a 10-node DSE cluster can be extended by adding the Advanced Workloads Pack to 3 of its nodes. "Cluster" means a collection of nodes that run the software and communicate with one another using the gossip protocol. See Enterprise Terms.
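Before you upgrade
The latest version of DataStax Enterprise is 5.1.6.
- Be sure to read the DataStax Upgrade Guide. Upgrading to DSE 5.1 is supported only from DSE 5.0.
- Always upgrade to the latest patch release of your current version before upgrading to a new version. Fixes included in the latest patch release can improve or smooth the upgrade process.
- Be sure to check driver compatibility. Your driver might not be compatible with this version, or might need to be recompiled.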
DSE 5.1.6 release notes
Release notes for DSE 5.1.6.
November/December 2017?
- 5.1.6 Components
- 5.1.6 Highlights
- 5.1.6 Changes and enhancements
- 5.1.6 Resolved issues
- 5.1.6 CHANGES.txt
- 5.1.6 NEWS.txt
5.1.6 Components
- Apache Cassandra™ 3.11.1.1939
- Apache Solr™ 6.0.1.0.1949
- Apache Tomcat® 8.0.44
- DataStax Spark Cassandra Connector 2.0.5
- TinkerPop 3.2.7-20171101-6b46fc5e
5.1.6 Highlights
5.1.6 DataStax Enterprise core highlights
5.1.6 DSE Advanced Replication highlights
5.1.6 DSE Analytics and DSEFS highlights
5.1.6 DSE Graph highlights
5.1.6 DSE Search highlights
5.1.6 Changes and enhancements
5.1.6 DataStax Enterprise core changes and enhancements
- Added the -Dcassandra.replace_consistency start-up parameter to support replacing from multiple sources. (DSP-14775)
- Added the -Dcassandra.native_transport_startup_delay_seconds start-up parameter to delay the start of the native transport. (DSP-14839)
- Added the reset-no-snapshot mode option to nodetool rebuild. (DSP-14827)
- Added the nodetool abortrebuild command. (DSP-14827)
- Added a metric for local non-replica requests for read coordination (org.apache.cassandra.metrics:type=ReadCoordination,name=LocalNodeNonreplicaRequests). (DSP-14775)
- Added cross_dc_rtt_in_ms to cross data center requests (default is 0). (DSP-14775)
- Added new metrics for batchlog-replays. (DSP-14839)
- Added the new CQL ALTER TABLE DROP COMPACT STORAGE option to remove Thrift compatibility from a table. (DSP-14839)
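As a rough illustration of two of the items above, the following sketch assembles the native transport start-up delay option and a DROP COMPACT STORAGE statement. The delay value, keyspace name, and table name are hypothetical placeholders, and the script only prints what it builds; applying it with cqlsh is left as a comment because that assumes a reachable node.

```shell
#!/bin/sh
# Hypothetical values, for illustration only.
DELAY_SECONDS=60          # assumed delay; pick a value that suits your cluster
KEYSPACE="my_keyspace"    # hypothetical keyspace name
TABLE="my_table"          # hypothetical table name

# JVM system property named in the release notes (DSP-14839).
JVM_FLAG="-Dcassandra.native_transport_startup_delay_seconds=${DELAY_SECONDS}"

# CQL statement that removes Thrift compatibility from the table (DSP-14839).
CQL_STMT="ALTER TABLE ${KEYSPACE}.${TABLE} DROP COMPACT STORAGE;"

printf 'JVM option: %s\n' "$JVM_FLAG"
printf 'CQL:        %s\n' "$CQL_STMT"

# To apply the statement on a live cluster (not run here):
#   cqlsh -e "$CQL_STMT"
```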
5.1.6 DSE Advanced Replication changes and enhancements
- Gremlin Console command line option for connecting to a host. (DSP-12726)
5.1.6 DSE Analytics changes and enhancements
- The default logging level for org.apache.spark.rpc was changed to ERROR. (DSP-14651)
- All keyspaces created at startup (spark_system, dse_leases, cfs, cfs_archive) must use SimpleStrategy or EverywhereStrategy. Automatically created keyspaces cannot use NetworkTopologyStrategy. To upgrade to DSE 5.1.6 or later, these keyspaces must use the same settings on all nodes. (DSP-11787)
- Improved Spark shell startup time. (DSP-14704)
- Spark executors are not restarted when the driver port is closed or unreachable. (DSP-14824)
5.1.6 DSE Graph changes and enhancements
- The Gremlin Console plugins.txt is read-only by default. (DSP-13372)
5.1.6 DSEFS changes and enhancements
5.1.6 DSE Search changes and enhancements
- New indexed true|false option for CREATE SEARCH INDEX to improve indexing performance. (DSP-14364)
- The number of filter cache entries is limited to 32,000 when SolrFilterCache is used. (DSP-14534)
- Eliminated the delay in scheduled snapshot collection for DSE Search performance objects. (DSP-14561)
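One possible way to review the replication settings of the automatically created keyspaces named above before upgrading is to query the system_schema.keyspaces table. The sketch below only builds and prints the query; running it through cqlsh is shown as a comment because it assumes access to a running node.

```shell
#!/bin/sh
# Keyspaces listed in the release note (DSP-11787).
KEYSPACES="'spark_system', 'dse_leases', 'cfs', 'cfs_archive'"

# keyspace_name is the partition key of system_schema.keyspaces,
# so an IN restriction on it is allowed.
CQL_STMT="SELECT keyspace_name, replication FROM system_schema.keyspaces WHERE keyspace_name IN (${KEYSPACES});"

printf '%s\n' "$CQL_STMT"

# To run the query on each node and compare the results (not run here):
#   cqlsh -e "$CQL_STMT"
```

The replication column shows the strategy class for each keyspace, which can then be checked against the SimpleStrategy or EverywhereStrategy requirement.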
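A minimal sketch of the indexed option on CREATE SEARCH INDEX follows. The keyspace, table, and column names (title, payload) are hypothetical, and the script prints the statement instead of executing it; the cqlsh invocation in the comment assumes a DSE Search node is reachable.

```shell
#!/bin/sh
KEYSPACE="my_keyspace"   # hypothetical keyspace name
TABLE="my_table"         # hypothetical table name

# The hypothetical "payload" column is included in the search index
# but marked as not indexed; "title" keeps the default behavior.
CQL_STMT="CREATE SEARCH INDEX IF NOT EXISTS ON ${KEYSPACE}.${TABLE} WITH COLUMNS title, payload { indexed:false };"

printf '%s\n' "$CQL_STMT"

# To create the index on a live cluster (not run here):
#   cqlsh -e "$CQL_STMT"
```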
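5.1.6 Resolved issues
5.1.6 DataStax Enterprise core resolved issues
- Audit logging does not support UNSET values in prepared statements. (DSP-13043)
- A memory leak in the DSE process causes executor descriptions to accumulate. (DSP-14868)
- Continuous paging state is now handled for empty partitions with static rows. (DSP-14959)
- View building is now skipped during base table streaming for range movements. (DSP-14959)
- Added the invalid-sstable-root JVM argument to all relevant test entries in build.xml. (DSP-14827)
- The body buffer no longer leaks on protocol exceptions, and Netty was upgraded to 4.0.52. (DSP-14775)
- Elements of list and set selectors are now guaranteed to all be of the same type. (DSP-14775)
- A script error is printed when nodetool arguments contain spaces. (DSP-14959)
- Token allocation now uses the RF=1 method when RF equals the number of racks. (DSP-14959)
- Authentication is left in an uninitialized state when bootstrap streaming fails. (DSP-14839)
- Eliminated a thread round trip during the version handshake. (DSP-14827)
- Improved the resilience of assassinate to missing tokens. (DSP-14827)
- Base partitions are now throttled during MV repair streaming to prevent OOM. (DSP-14775)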
5.1.6 DSE Advanced Replication resolved issues
- Inconsistent passing of the data center to TokenService causes multi-data-center replication errors. (DSP-14767)
5.1.6 DSE Analytics resolved issues
- dse client-tool configuration export/import incorrectly uses cfs as the default file system. (DSP-14535)
- The Spark shuffle service cannot update the secret on application retry. (DSP-15038)
- A dedicated user is required to run the Graph OLAP Spark driver. (DSP-14869)
- Logs for Spark Jobserver jobs are missing. (DSP-14981)
5.1.6 DSEFS resolved issues
- Implementations of the getScheme, getDefaultPort, and concat methods were added to the DseFileSystem Hadoop API. (DSP-14605)
- A Response body rejected error is incorrectly displayed on reads. (DSP-14615)
- Enabling DSE authorization also enables DSEFS authorization. DSEFS now also supports the DseAuthorizer transitional mode. (DSP-14616)
- DSEFS does not retry queries. (DSP-14649)
- Exit code 0 is incorrectly returned when command execution fails. (DSP-14652)
- The cat operation is not permitted on directories; attempting it displays the Not a regular file <path> message. (DSP-14696)
- The User name/password was not provided warning is logged to the DSEFS shell log when security is not enabled. (DSP-14708)
5.1.6 DSE Graph resolved issues
5.1.6 DSE Search resolved issues
- dsetool upgrade_index_files does not work when authentication is enabled. (DSP-14114)
- UpdateMetrics::Latency::Mean becomes "Unavailable" while writes are in progress. (DSP-14392)
- CQL search queries run with keyspace RF equal to the number of nodes no longer create token filters, which makes the queries faster. (DSP-14468)
- CREATE SEARCH INDEX does not give direct control over tuple and UDT fields. (DSP-14639)
- Removed the CVE-2016-6809 code execution vulnerability. (DSP-14747)
- An infinite parsing loop can occur with the Extended DisMax (eDisMax) query parser and local parameters. (DSP-14748)
- Internal server error 500 occurs for solr/admin/cores?action=STATUS&memory=true. (DSP-14783)
- ExtendedDismaxQParser (edismax) ignores Boolean OR when q.op=AND and mm is not explicitly set. (DSP-14799)
- Grouping by TrieDateField and DatePointField fails. (DSP-14808)
- Token filtering can fail in mixed-version clusters. (DSP-14898)
- The Solr UI now supports the json.facet parameter. (DSP-14893)
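Hadoop libraries
Integrated Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0 and removed in DSE 5.1. Because Hadoop was removed in DSE 5.1 and later, the Hadoop services that were previously included in DSE, such as the MapReduce JobTracker and TaskTracker, can no longer be started in DSE.
However, DSE still supports integrated Spark (since DSE 4.5) and Bring-Your-Own-Spark (BYOS, since DSE 5.0). Because Spark uses certain Hadoop libraries on the server and the client, DSE continues to ship the Hadoop libraries that Spark and BYOS need to work.
To view the bundled Hadoop libraries, see DataStax Enterprise 5.1.x third-party software.
Location of dse.yaml by installation type:
Package installations and Installer-Services installations | /etc/dse/dse.yaml
Tarball installations and Installer-No Services installations | installation_location/resources/dse/conf/dse.yaml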
DSE 5.1.6 CHANGES.txt
A list of the production-certified changes to Apache Cassandra™ 3.11.1 that are included in DataStax Enterprise 5.1.6.
DataStax Enterprise (DSE) 5.1.6 includes all changes from earlier DSE releases plus the production-certified changes made to Apache Cassandra™ 3.11.1. These changes are listed in CHANGES.txt.
DSE 5.1.6 NEWS.txt
General upgrading advice for DataStax Enterprise 5.1
DSE 5.1.5 release notes
Release notes for DataStax Enterprise 5.1.5.
October 19, 2017
5.1.5 Components
- Apache Solr™ 6.0.1.0.1984
5.1.5 Highlights
- In response to CVE-2017-12629, and to harden security for DSE Search-enabled clusters, Solr XMLParser protection against XML External Entity (XXE) attacks was added and the Solr RunExecutableListener was removed. (DSP-14618)
DSE 5.1.4 release notes
Release notes for DataStax Enterprise 5.1.4.
October 12, 2017
- 5.1.4 Components
- 5.1.4 Highlights
- 5.1.4 Changes and enhancements
- 5.1.4 Resolved issues
- 5.1.4 CHANGES.txt
- 5.1.4 NEWS.txt
5.1.4 Components
- Apache Cassandra™ 3.11.0.1900
- Apache Solr™ 6.0.1.0.1949
- Apache Tomcat® 8.0.44
- DataStax Spark Cassandra Connector 2.0.5
- TinkerPop 3.2.7-20170926-2e5c13b7
5.1.4 Highlights
5.1.4 DSE Graph highlights
- Security: The Graph sandbox is enabled and configured by default. (DSP-11679)
- Vertices with custom IDs return ID components as properties. (DSP-14262)
5.1.4 DSE Search highlights
- Improved stability and performance when handling non-indexed fields. (DSP-6501)
Because all schema fields are now fully validated, validation errors might occur after upgrading. See 5.1.4 DSE Search changes and enhancements.
- Fixed a regression in the search performance objects. (DSP-14241)
- Fixed a memory leak when encrypting indexes. (DSP-13826)
5.1.4 Changes and enhancements
5.1.4 DataStax Enterprise core changes and enhancements
- scrub validates partition keys. Validation was added to schema mutation creation. (DSP-14366)
- Always define execution_profiles in cqlsh.py. (DSP-14494)
- Increasing the replication factor now generates a warning about running a full repair. (DSP-14494)
- Anticompaction metrics were added, and a warning is now generated when incremental repair is inefficient. (DSP-14494)
5.1.4 DSE Advanced Replication changes and enhancements
- The command line interface must use a non-zero exit code for unknown commands. (DSP-13590)
5.1.4 DSE Graph changes and enhancements
- To improve security, the Graph sandbox is enabled and configured by default. (DSP-11679)
- Graph frame algorithms were fixed with GraphFrame 0.5. (DSP-14271)
- The Gremlin Console uses the default plugins.txt in the DSE distribution. When the user home is specified with bin/dse gremlin-console ~/gremlin-console, an additional check is made to ensure that plugins.txt is populated. (DSP-14286)
- Multi-properties are prevented for partition and clustering keys. (DSP-14300)
- The graph.tx().commit(); call cannot be made in graph.tx().commit(); graph.tx().config().option("allow_scan", true).open(); g.V().count(). Instead, use graph.tx().config().option("allow_scan", true).open(); g.V().count(). (DSP-14482)
5.1.4 DSEFS changes and enhancements
- Improved error messages for DSEFS shell commands. (DSP-14157)
- Improved the error messages passed to DSEFS clients, such as the DSEFS shell, when an error occurs while reading a file. (DSP-14371)
- Improved the error message shown when Spark cannot connect to the DSEFS server. (DSP-14388)
- The logging level for HTTP communication was changed from DEBUG to TRACE for better filtering. (DSP-14400)
- Improved DSEFS stability under heavy workloads. DSEFS is less likely to overload the Java Cassandra driver and cause a BusyPoolException. Edge cases that could cause a StackOverflowException and DSEFS lockups were fixed. (DSP-14408)
5.1.4 DSE Search changes and enhancements
- Because all schema fields are now fully validated, validation errors might occur after upgrading. (DSP-6501)
- All field definitions in the schema are validated and must be compatible with DSE Search, even if a field is not indexed, has docValues applied, or is used as a copy field source.
- Adjust your schema before you upgrade so that every field definition passes this validation. Adjusting the index can also improve performance, particularly with large unused BLOBs.
5.1.4 Resolved issues
5.1.4 DataStax Enterprise core resolved issues
- After upgrading to DSE 5.1, nodes fail to start and the unable to activate HistogramInfoPlugin message is displayed. (DSP-13301)
- Apache HttpClient directory traversal using a malformed URI. (DSP-13580)
- Security for creating, canceling, and renewing tokens needs to be hardened. (DSP-14311)
- stress-tool does not output rows. (DSP-14494)
5.1.4 DSE Analytics resolved issues
- Hive metastore session management is broken. (DSP-12363)
- The exception message does not indicate the problem when a user without submit permission submits an application. (DSP-13234)
- The Spark shell cannot be used after a standalone installation with the services option. (DSP-14361)
- The port setting is not honored in DseCassandraConnectionFactory. (DSP-14442)
- The Spark Master and Worker web UIs should bind to the RPC listen address by default and expose the RPC broadcast address. (DSP-14433)
5.1.4 DSEFS resolved issues
- The service dse stop command does not wait for the process to terminate completely. (DSP-14014)
- DSEFS does not support symlinks for data directories. (DSP-14110)
- DSEFS fsck always prints the number of processed blocks, even when the file system is empty. (DSP-14235)
5.1.4 DSE Graph resolved issues
- Vertex indexes on ID property keys do not work correctly. (DSP-9208)
- Unnecessary INSERTs and DELETEs against dse_security.digest_tokens are run for every graph statement over the native protocol. (DSP-13670)
- Streamlined the configuration of gremlin-console connections to Kerberos-enabled clusters. (DSP-14164)
- DataFrames deletes do not use range or partition-level tombstones. (DSP-14249)
- Vertices with custom IDs do not return ID components as properties (for example, g.V().properties() and g.V().values() in OLTP, OLAP, and GraphFrame). (DSP-14262)
- DseResourceManager warning message on shutdown of a Spark+Graph node. (DSP-14276)
- The Graph sandbox should whitelist org.apache.tinkerpop.gremlin.structure.io by default. (DSP-14540)
5.1.4 DSE Search resolved issues
- Dynamic multi-valued fields are now possible without a corresponding CQL column. (DSP-13277)
- Non-indexed frozen map columns produce unexpected results with no error message. (DSP-13997)
- Data cannot be indexed for non-indexed fields. (DSP-14001)
- Some data types cannot be selected in single-pass CQL Solr queries. (DSP-14022)
- Text fields do not work in group-by operations, and an unexpected docvalues type SORTED_SET error message is displayed for the text field. (DSP-14106)
- A parse error occurs when cleaning up a Solr secondary index that has an empty string in the partition ID. (DSP-14234)
- Solr index statistics are not collected for solr_index_stats_options. (DSP-14241)
- A CPU layout assertion at startup is not written to the log file, and startup halts. (DSP-14281)
- Tracing cannot be turned off after running a query with tracing on. (DSP-14439)
- The wiki demo cannot be indexed when solrslowlog is enabled. (DSP-14521)
- Search performance objects do not work. (DSP-14241)
- Memory leak during index encryption. (DSP-13826)
DSE 5.1.4 CHANGES.txt
A list of the production-certified changes to Apache Cassandra™ 3.11.0 that are included in DataStax Enterprise 5.1.4.
- Handle limit correctly on tables with strict liveness (CASSANDRA-13883)
- AbstractTokenTreeBuilder#serializedSize returns wrong value when there is a single leaf and overflow collisions (CASSANDRA-13869)
- BTree.Builder memory leak (CASSANDRA-13754)
- Revert CASSANDRA-10368 of supporting non-PK column filtering due to correctness (CASSANDRA-13798)
- Fix cassandra-stress hang issues when an error during cluster connection happens (CASSANDRA-12938)
- Better bootstrap failure message when blocked by (potential) range movement (CASSANDRA-13744)
- "ignore" option is ignored in sstableloader (CASSANDRA-13721)
- Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
- Duplicate the buffer before passing it to the analyzer in SASI operation (CASSANDRA-13512)
- Properly evict pstmts from the prepared statements cache (CASSANDRA-13641)
- Fix support for SuperColumn tables (CASSANDRA-12373)
- Remove non-RPC-ready nodes from counter leader candidates (CASSANDRA-13043)
- Improve short read protection performance (CASSANDRA-13794)
- Fix sstable reader to support range-tombstone-marker for multi-slices (CASSANDRA-13787)
- Fix short read protection for tables with no clustering columns (CASSANDRA-13880)
- Make isBuilt volatile in PartitionUpdate (CASSANDRA-13619)
- Prevent integer overflow of timestamps in CellTest and RowsTest (CASSANDRA-13866)
- Fix counter application order in short read protection (CASSANDRA-12872)
- Don't block RepairJob execution on validation futures (CASSANDRA-13797)
- Wait for all management tasks to complete before shutting down CLSM (CASSANDRA-13123)
- INSERT statement fails when a tuple type is used as a clustering column with default DESC order (CASSANDRA-13717)
- Fix pending view mutations handling and clean up batchlog when there are local and remote paired mutations (CASSANDRA-13069)
- Improve config validation and documentation on overflow and NPE (CASSANDRA-13622)
- Range deletes in a CAS batch are ignored (CASSANDRA-13655)
- Avoid assertion error when IndexSummary > 2G (CASSANDRA-12014)
- Change repair midpoint logging for tiny ranges (CASSANDRA-13603)
- Better handle corrupt final commitlog segment (CASSANDRA-11995)
- StreamingHistogram is not thread safe (CASSANDRA-13756)
- Fix MV timestamp issues (CASSANDRA-11500)
- Better tolerate improperly formatted bcrypt hashes (CASSANDRA-13626)
- Fix race condition in read command serialization (CASSANDRA-13363)
- Fix AssertionError in short read protection (CASSANDRA-13747)
- Don't skip corrupted sstables on startup (CASSANDRA-13620)
- Fix the merging of cells with different user type versions (CASSANDRA-13776)
- Copy session properties on cqlsh.py do_login (CASSANDRA-13640)
- Potential AssertionError during ReadRepair of range tombstone and partition deletions (CASSANDRA-13719)
- Don't let stress write warmup data if n=0 (CASSANDRA-13773)
- Gossip thread slows down when using batch commit log (CASSANDRA-12966)
- Randomize batchlog endpoint selection with only 1 or 2 racks (CASSANDRA-12884)
- Fix digest calculation for counter cells (CASSANDRA-13750)
- Fix ColumnDefinition.cellValueType() for non-frozen collections and change SSTabledump to use type.toJSONString() (CASSANDRA-13573)
- Skip materialized view addition if the base table doesn't exist (CASSANDRA-13737)
- Drop table should remove corresponding entries in the dropped_columns table (CASSANDRA-13730)
- Log warn message until legacy auth tables have been migrated (CASSANDRA-13371)
- Fix incorrect [2.1 <- 3.0] serialization of counter cells created in 2.0 (CASSANDRA-13691)
- Fix invalid writetime for null cells (CASSANDRA-13711)
- Fix ALTER TABLE statement to atomically propagate changes to the table and its MVs (CASSANDRA-12952)
- Fix digest mismatch exception if hints file has UnknownColumnFamily (CASSANDRA-13696)
- Fix ambiguous output of nodetool tablestats command (CASSANDRA-13722)
- Purge tombstones created by expired cells (CASSANDRA-13643)
- Make concat work with iterators that have different subsets of columns (CASSANDRA-13482)
- Set test.runners based on cores and memory size (CASSANDRA-13078)
- Allow different NUMACTL_ARGS to be passed in (CASSANDRA-13557)
- Fix secondary index queries on COMPACT tables (CASSANDRA-13627)
- Nodetool listsnapshots output is missing a newline if there are no snapshots (CASSANDRA-13568)
- sstabledump reports incorrect usage for argument order (CASSANDRA-13532)
- Safely handle empty buffers when outputting to JSON (CASSANDRA-13868)
- Copy session properties on cqlsh.py do_login (CASSANDRA-13847)
- Fix load over-calculation issue in IndexSummaryRedistribution (CASSANDRA-13738)
- Fix compaction and flush exceptions not captured (CASSANDRA-13833)
- Uncaught exceptions in Netty pipeline (CASSANDRA-13649)
- Prevent integer overflow on exabyte filesystems (CASSANDRA-13067)
- Fix queries with LIMIT and filtering on clustering columns (CASSANDRA-11223)
- Fix potential NPE when resume bootstrap fails (CASSANDRA-13272)
- Fix toJSONString for the UDT, tuple, and collection types (CASSANDRA-13592)
- Fix nested tuples/UDTs validation (CASSANDRA-13646)
- Clone HeartBeatState when building gossip messages and make its generation/version volatile (CASSANDRA-13700)
- Allow native function calls in CQLSSTableWriter (CASSANDRA-12606)
- Replace string comparison with regex/number checks in MessagingService test (CASSANDRA-13216)
- Fix formatting of duration columns in CQLSH (CASSANDRA-13549)
- Fix the problem with duplicated rows when using paging with SASI (CASSANDRA-13302)
- Allow CONTAINS statements filtering on the partition key and its parts (CASSANDRA-13275)
- Fall back to even ranges calculation in clusters with vnodes when tokens are distributed unevenly (CASSANDRA-13229)
- Fix duration type validation to prevent overflow (CASSANDRA-13218)
- Forbid unsupported creation of SASI indexes over partition key columns (CASSANDRA-13228)
- Reject multiple values for a key in CQL grammar (CASSANDRA-13369)
- UDA fails without input rows (CASSANDRA-13399)
- Fix compaction-stress by using daemonInitialization (CASSANDRA-13188)
- V5 protocol flags decoding broken (CASSANDRA-13443)
- Use write lock, not read lock, for removing sstables from compaction strategies (CASSANDRA-13422)
- Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors (CASSANDRA-13329)
- Avoid rebuilding SASI indexes containing no values (CASSANDRA-12962)
- Add charset to analyzer input stream (CASSANDRA-13151)
- Remove invalid characters from StandardTokenizerImpl.jflex (CASSANDRA-13417)
- Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
- Tracing payload not passed from QueryMessage to tracing session (CASSANDRA-12835)
- Prevent int overflow when calculating the large partition warning size (CASSANDRA-13172)
- Ensure consistent view of partition columns between coordinator and replica in ColumnFilter (CASSANDRA-13004)
- Failed unregistering mbean during drop keyspace (CASSANDRA-13346)
- nodetool scrub/cleanup/upgradesstables exit code is wrong (CASSANDRA-13542)
- Fix the reported number of sstable data files accessed per read (CASSANDRA-13120)
- Fix schema digest mismatch during rolling upgrades from versions before 3.0.12 (CASSANDRA-13559)
- Upgrade JNA version to 4.4.0 (CASSANDRA-13072)
- Interned ColumnIdentifiers should use minimal ByteBuffers (CASSANDRA-13533)
- ReverseIndexedReader may drop rows during 2.1 to 3.0 upgrade (CASSANDRA-13525)
- Fix repair process violating start/end token limits for small ranges (CASSANDRA-13052)
- Add storage port options to sstableloader (CASSANDRA-13518)
- Properly handle quoted index names in cqlsh DESCRIBE output (CASSANDRA-12847)
- Avoid reading static row twice from old format sstables (CASSANDRA-13236)
- Fix NPE in StorageService.excise() (CASSANDRA-13163)
- Expire OutboundTcpConnection messages by a single thread (CASSANDRA-13265)
- Fail repair if insufficient responses received (CASSANDRA-13397)
- Fix SSTableLoader failure when the loaded table contains dropped columns (CASSANDRA-13276)
- Avoid name clashes in CassandraIndexTest (CASSANDRA-13427)
- Handling partially written hint files (CASSANDRA-12728)
- Interrupt replaying hints on decommission (CASSANDRA-13308)
- Fix NPE issue in StorageService (CASSANDRA-13060)
- Make reading of range tombstones more reliable (CASSANDRA-12811)
- Fix startup problems due to schema tables not completely flushed (CASSANDRA-12213)
- Fix view builder bug that can filter out data on restart (CASSANDRA-13405)
- Fix 2i page size calculation when there are no regular columns (CASSANDRA-13400)
- Fix the conversion of 2.X expired rows without regular column data (CASSANDRA-13395)
- Fix hint delivery when using ext+internal IPs with prefer_local enabled (CASSANDRA-13020)
- Nodetool upgradesstables/scrub/compact ignores system tables (CASSANDRA-13410)
- Fix schema version calculation for rolling upgrades (CASSANDRA-13441)
- Nodes started with join_ring=False should be able to serve requests when authentication is enabled (CASSANDRA-11381)
- cqlsh COPY FROM: increment error count only for failures, not for attempts (CASSANDRA-13209)
- Avoid starting gossiper in RemoveTest (CASSANDRA-13407)
- Fix weightedSize() for row-cache reported by JMX and NodeTool (CASSANDRA-13393)
- Fix JVM metric names (CASSANDRA-13103)
- Coalescing strategy sleeps too much (CASSANDRA-13090)
- Fix 2ndary index queries on partition keys for tables with static columns (CASSANDRA-13147)
- Fix ParseError unhashable type list in cqlsh copy from (CASSANDRA-13364)
DSE 5.1.4 NEWS.txt
General upgrading advice for DataStax Enterprise 5.1
GENERAL UPGRADING ADVICE FOR ANY VERSION
========================================
Snapshotting is fast (especially if you have JNA installed) and takes
effectively zero disk space until you start compacting the live data
files again. Thus, best practice is to ALWAYS snapshot before any
upgrade, just in case you need to roll back to the previous version.
(Cassandra version X + 1 will always be able to read data files created
by version X, but the inverse is not necessarily the case.)
When upgrading major versions of Cassandra, you will be unable to
restore snapshots created with the previous major version using the
'sstableloader' tool. You can upgrade the file format of your snapshots
using the provided 'sstableupgrade' tool.
DSE 5.1.3
=========
Upgrading
---------
- Creating Materialized View with filtering on non-primary-key base column
(added in CASSANDRA-10368) is disabled, because the liveness of view row
is depending on multiple filtered base non-key columns and base non-key
column used in view primary-key. This semantic cannot be supported without
storage format change, see CASSANDRA-13826. For append-only use case, you
may still use this feature with a startup flag: "-Dcassandra.mv.allow_filtering_nonkey_columns_unsafe=true"
- The table system_auth.resource_role_permissons_index is no longer used and should be dropped
after all nodes are on 5.1.3. Note that upgrades from DSE 5.0 series since 5.0.10 to DSE
versions before 5.1.3 are not recommended.
- Full repairs are now default if no option is specified on nodetool repair, unless
incremental repair was already run on the table/keyspace being repaired, to maintain
backward compatibility. Incremental repair may be run on new tables by using the -inc option.
- Full repairs will no longer run anticompaction unless the --run-anticompaction option is specified
- Incremental repairs are no longer supported on tables with materialized views or CDC until
its limitations are addressed. An incremental repair triggered on a base table or
materialized view runs a full repair instead. See CASSANDRA-12888 for details.
Materialized Views (only when upgrading from DSE 5.1.1 or 5.1.2 or any version lower than DSE 5.0.10)
---------------------------------------------------------------------------------------
- Cassandra will no longer allow dropping columns on tables with Materialized Views.
- A change was made in the way the Materialized View timestamp is computed, which
may cause an old deletion to a base column which is view primary key (PK) column
to not be reflected in the view when repairing the base table post-upgrade. This
condition is only possible when a column deletion to an MV primary key (PK) column
not present in the base table PK (via UPDATE base SET view_pk_col = null or DELETE
view_pk_col FROM base) is missed before the upgrade and received by repair after the upgrade.
If such column deletions are done on a view PK column which is not a base PK, it's advisable
to run repair on the base table of all nodes prior to the upgrade. Alternatively it's possible
to fix potential inconsistencies by running repair on the views after upgrade or drop and
re-create the views. See CASSANDRA-11500 for more details.
- Removal of columns not selected in the Materialized View (via UPDATE base SET unselected_column
= null or DELETE unselected_column FROM base) may not be properly reflected in the view in some
situations so we advise against doing deletions on base columns not selected in views
until this is fixed on CASSANDRA-13826.
3.11.0
======
Upgrading
---------
- Creating Materialized View with filtering on non-primary-key base column
(added in CASSANDRA-10368) is disabled, because the liveness of view row
is depending on multiple filtered base non-key columns and base non-key
column used in view primary-key. This semantic cannot be supported without
storage format change, see CASSANDRA-13826. For append-only use case, you
may still use this feature with a startup flag: "-Dcassandra.mv.allow_filtering_nonkey_columns_unsafe=true"
- ALTER TABLE (ADD/DROP COLUMN) operations concurrent with a read might
result into data corruption (see CASSANDRA-13004 for more details).
Fixing this bug required a messaging protocol version bump. By default,
Cassandra 3.11 will use 3014 version for messaging.
Since Schema Migrations rely on the exact messaging protocol version
match between nodes, if you need schema changes during the upgrade
process, you have to start your nodes with `-Dcassandra.force_3_0_protocol_version=true`
first, in order to temporarily force a backwards compatible protocol.
After the whole cluster is upgraded to 3.11, do a rolling
restart of the cluster without setting that flag.
3.11 nodes with and without the flag set will be able to do schema
migrations with other 3.x and 3.0.x releases.
While running the cluster with the flag set to true on 3.11 (in
compatibility mode), avoid adding or removing any columns to/from
existing tables.
If your cluster can do without schema migrations during the upgrade
time, just start the cluster normally without setting aforementioned
flag.
If you are upgrading from 3.0.14+ (of 3.0.x branch), you do not have
to set a flag while upgrading to ensure schema migrations.
- The NativeAccessMBean isAvailable method will only return true if the
native library has been successfully linked. Previously it was returning
true if JNA could be found but was not taking into account link failures.
- Primary ranges in the system.size_estimates table are now based on the keyspace
replication settings and adjacent ranges are no longer merged (CASSANDRA-9639).
- In 2.1, the default for otc_coalescing_strategy was 'DISABLED'.
In 2.2 and 3.0, it was changed to 'TIMEHORIZON', but that value was shown
to be a performance regression. The default for 3.11.0 and newer has
been reverted to 'DISABLED'. Users upgrading from Cassandra 2.2 or 3.0 should
be aware that the default has changed.
- The StorageHook interface has been modified to allow to retrieve read information from
SSTableReader (CASSANDRA-13120).
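The general upgrading advice in NEWS.txt recommends always taking a snapshot before an upgrade. A minimal sketch of that step follows; the tag name is a hypothetical convention, and the script prints the nodetool command instead of running it, since running it assumes a local node with nodetool on the PATH.

```shell
#!/bin/sh
# Hypothetical tag convention: a fixed prefix plus today's date.
SNAPSHOT_TAG="pre_upgrade_$(date +%Y%m%d)"

# nodetool snapshot with -t assigns a name (tag) to the snapshot.
SNAPSHOT_CMD="nodetool snapshot -t ${SNAPSHOT_TAG}"

printf '%s\n' "$SNAPSHOT_CMD"

# To take the snapshot on each node before upgrading (not run here):
#   $SNAPSHOT_CMD
```

A named tag makes it easier to find the snapshot later if a rollback to the previous version becomes necessary.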
DSE 5.1.3 release notes
Release notes for DataStax Enterprise 5.1.3.
September 6, 2017
- 5.1.3 Components
- 5.1.3 Highlights
- 5.1.3 Changes and enhancements
- 5.1.3 Resolved issues
- 5.1.3 CHANGES.txt
- 5.1.3 NEWS.txt
5.1.3 Components
- Apache Cassandra™ 3.11.0.1855
- Apache Solr™ 6.0.1.0.1833
- Apache Tomcat® 8.0.44
- DataStax Spark Cassandra Connector 2.0.5
- TinkerPop 3.2.6-20170821-ac1bbb27
- Certain Hadoop libraries
5.1.3 Highlights
5.1.3 DataStax Enterprise core highlights
- Incremental repair is no longer the default for nodetool repair. In DSE 5.1.0 through 5.1.2, repairs run as incremental even when nodetool repair -full or nodetool repair -pr is used; they mark sstables as repaired and therefore trigger anticompaction. (DSP-14464)
After upgrading from DSE 5.1.0 through 5.1.2 to DSE 5.1.3 or later, you must follow the steps in the upgrade guide to migrate off incremental repair. To continue running incremental repairs, use nodetool repair -inc.
5.1.3 DSE Analytics and DSEFS highlights
- The -framework option was added to the dse spark commands to accommodate applications originally written for open source Apache Spark. It specifies whether to use the DSE classpath (default) or a classpath with paths similar to open source Spark 2.0. (DSP-12954)
- DSEFS received several important stability fixes and performance improvements. If you use DSEFS in production, upgrading to DSE 5.1.3 is strongly recommended to benefit from these improvements.
5.1.3 DSE Graph highlights
- Graph query performance improved significantly. (DSP-11534)
- Domain-specific languages are now supported. (DSP-13545)
- Graph custom IDs for multi-keyed vertices are now supported. (DGL-258)
5.1.3 DSE Search highlights
- TieredMergePolicy was extended to support automatic removal of deletes. (DSP-13626)
5.1.3 Changes and enhancements
5.1.3 DataStax Enterprise core changes and enhancements
- Improved nodetool rebuild and bootstrap. (DSP-13870)
- Simplified role permission handling. (DSP-14159)
The table system_auth.resource_role_permissons_index is no longer used. Drop this table after all nodes are upgraded to DSE 5.1.3. Upgrades from DSE 5.0.10 and later to DSE versions before 5.1.3 are not recommended. See the restrictions for upgrading to DSE 5.1.3.
- The new nodetool mark_unrepaired command merges the repaired and unrepaired compaction buckets into one. (DSP-14255)
- nodetool repair changed. (DSP-14464)
  - The default behavior when run without options on new tables is now nodetool repair -full. (In earlier versions, the default with no options was incremental.)
  - When run without options on a keyspace or a set of tables, nodetool repair runs incremental repair on previously repaired tables and full repair on new tables.
  - Anticompaction is no longer run after full repair. To restore the previous behavior, use nodetool repair --run-anticompaction.
  - Incremental repair is no longer supported on tables that use MVs or CDC. Running incremental repair on a table that uses MVs or CDC runs a full repair instead.
  After upgrading from DSE 5.1.0 through 5.1.2 to DSE 5.1.3 or later, you must follow the steps in the upgrade guide to migrate off incremental repair. To continue running incremental repairs, use nodetool repair -inc.
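The repair invocations named in these notes can be summarized in a small helper. This is only a sketch: the function name and the mode labels are hypothetical, and it prints the command for a given mode instead of running nodetool.

```shell
#!/bin/sh
# Map a hypothetical mode label to the nodetool repair invocation
# described in the 5.1.3 release notes.
repair_command() {
  case "$1" in
    full)                echo "nodetool repair -full" ;;
    incremental)         echo "nodetool repair -inc" ;;
    full-anticompaction) echo "nodetool repair --run-anticompaction" ;;
    *)                   echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Print the command for each mode.
for mode in full incremental full-anticompaction; do
  repair_command "$mode"
done
```

Keeping the mode explicit in scripts avoids relying on the default, which differs between DSE 5.1.0 through 5.1.2 and DSE 5.1.3 or later.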
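5.1.3 DSE Analytics changes and enhancements
- Improved the error for Spark://Master URLs. (DSP-13366)
- The -framework option was added to the dse spark commands to accommodate applications originally written for open source Apache Spark. It specifies whether to use the DSE classpath (default) or a classpath with paths similar to open source Spark 2.0. (DSP-12954)
- Improved the error message shown when no target data center is specified for a Spark application. (DSP-13236)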
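A sketch of selecting the classpath framework when starting the Spark shell follows. The double-dash spelling of the option and the value spark-2.0 are assumptions not spelled out in these notes; check the dse spark command reference for your release before relying on them. The script prints the command instead of launching Spark.

```shell
#!/bin/sh
# Assumed option value: "dse" for the DSE classpath (the default) or
# "spark-2.0" for open source Spark 2.0 style paths.
FRAMEWORK="spark-2.0"

# Assumed spelling of the option; the release note calls it the
# -framework option.
SPARK_CMD="dse spark --framework ${FRAMEWORK}"

printf '%s\n' "$SPARK_CMD"

# To start the Spark shell with this classpath (not run here):
#   $SPARK_CMD
```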
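5.1.3 DSE Graph changes and enhancements
- Batch loading of pre-formatted data was improved and simplified. (DGL-235)
Support changes:
- Schema discovery and schema generation are deprecated. (DGL-246)
- Standard IDs are deprecated. (DGL-247)
- Transformations are deprecated. (DGL-248)
- Standard vertex IDs are deprecated. Use custom vertex IDs instead. (DSP-13485)
- Graph custom IDs for multi-keyed vertices are now supported. (DGL-258)
- The query engine was significantly improved so that many queries can be satisfied by using indexes. In particular, AND and OR queries are now handled and are transparently translated into multiple backend queries or, where possible, a single search query. (DSP-11534)
- ORDER BY clauses can now use indexes. (DSP-11931)
- Unnecessary backend queries are no longer run when checking edge adjacency. (DSP-12863)
- Edge queries that use the between predicate can now use indexes where possible. (DSP-13541)
- Support for domain-specific languages (DSLs) in Gremlin was improved, and a TraversalSource can now be specified in the DataStax drivers. (DSP-13545)
- Transaction-level cache=false now includes disabling AdjacencyListStoreImpl and IndexStoreImpl. (DSP-13560)
- Vertices without multi-properties fetch all properties in a single query instead of requesting properties one at a time. Using multi-properties on vertices is not recommended, because graph traversals retrieve multi-cardinality properties (multi-properties) more slowly than single-cardinality properties. For vertices with multi-properties, the behavior defaults to the previous behavior of requesting properties individually. (DSP-13646)
- DSEGraphFrames now supports the dedup, sort, limit, filter, as()/select(), and or() Gremlin APIs. (DSP-13649)
- Partition deletes are used for property and edge table entries where possible. (DSP-13671)
- Graph traversal timeouts now start when the request is received. In earlier releases, they started when processing began. As a result, timeouts are more likely on overloaded servers. (DSP-13828)
- Numeric sack values no longer need to be explicitly typed (for example, 3.0D), although explicit typing can still be used to be more specific about the expected return type. (DSP-14026)
- Lambdas passed to the sack() step are now recognized by LambdaRestrictionStrategy. The restrict_lambda setting must be disabled to call this method. (DSP-14118)
- User-specified IDs are now supported for edges and properties. IDs must be Java UUIDs. (DSP-12932)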
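5.1.3 DSEFS changes and enhancements
- The DSEFS repair capability was extended. DSEFS fsck checks whether data blocks exist on the remote nodes that claim to hold them. Mixed versions during an upgrade are not supported. Upgrade all nodes in the cluster before using DSEFS fsck. (DSP-13081)
- Improved DSEFS read performance. (DSP-13309)
- The DSEFS shell now starts with a preference for the specified host. (DSP-14108)
- Added the new idle_connection_timeout_ms option to dse.yaml, which defines how long to wait before closing idle connections between client and server. This allows connections to be reused efficiently. (DSP-14010)
- A protocol change made the exchange of JSON arrays between DSEFS server and client more efficient. Mixed versions during an upgrade are not supported. Upgrade all nodes in the cluster before using the DSEFS shell. (DSP-14107)
5.1.3 DSE Search changes and enhancements
- OffheapPostings is included by default in the demos and in auto-generated solrconfig.xml files. (DSP-10088)
- Changed the default filter cache settings. (DSP-13153)
- Streamlined the autoSolrConfig.xml template for auto-generated search indexes. Added shortcuts for TieredMergePolicyFactory in CQL ALTER SEARCH INDEX CONFIG, ALTER SEARCH INDEX SCHEMA, and CREATE SEARCH INDEX. (DSP-13229)
- DeleteById is deprecated. (DSP-13988)
- TieredMergePolicy was extended to support automatic removal of deletes. (DSP-13626)
- By default, DSE Search indexing is optimized for SSDs. The rotational disk detection logic was removed. (DSP-13924)
- Improved the error messages for an invalid solr_query, with more specific descriptions of invalid queries and syntax errors. (DSP-14003)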
5.1.3の解決済みの問題点
5.1.3 DataStax Enterpriseの解決済みの問題点
- DSEの起動時に、ディレクトリーの所有権を調整しチェックします。(DSP-13245)
- CVE-2017-7957 xstream-coreにサービス拒否(DoS)攻撃に対する脆弱性がある。(DSP-13419)
- 復元後に、sstableloaderを使用してSSTableを階層化ストレージにストリーミングすると、データをクエリー処理できない。(DSP-14188)
- 新しいカーネルを使用した場合にMemoryOnlyStrategyリージョンが物理メモリーにすぐに読み込まれない。(DSP-14169)
- フル・リペアがデフォルトになり、MV/CDCテーブルでのインクリメンタル・リペアが無効になりました。(DSP-14255)
- AbstractReadCommandBuilderのCASSANDRA-11223動作が元に戻りました。(DSP-14135)
- 圧縮されたデータをシャドウ化するリモートSSTableがリペア済みとしてマークされなくなりました。(DSP-14141)
- リビルド・ロギングが常に0バイトを示している。(DSP-13870)
- タイムスタンプ/キーの重複チェックを行わずに、完全に期限が切れているsstableを強制的に期限切れにできるようになりました。(DSP-13870)
- StreamingHistogramのバグのために、SSTableインデックス・ファイルが破損する可能性がある。(DSP-14279)
5.1.3 DSE Advanced Replication(DSE拡張レプリケーション)の解決済みの問題点
- DataStaxインストーラーで、DSE Advanced Replication(DSE拡張レプリケーション)が正しく設定されない。(DSP-13472)
- 高い挿入速度で取り込むとデータの損失や削除が発生する可能性がある。CDCログ・ファイルは、処理対象でなくても削除される場合がありました。(DSP-14043)
- DSEFSクライアントが不必要にリモート・ノードを切り替える。(DSP-14108)
- 競合状態で負荷が大きいと、ログ・ファイルに紛らわしい例外が送信される。(DSP-14180)
5.1.3 DSE Analyticsの解決済みの問題点
- RPCメソッドの失敗のロギング・レベルが下がりました。(DSP-13282)
- 要求を非同期実行プールに追加するときに、JoinWithCassandraおよびSaveToCassandraがブロックされる。(DSP-14178)
5.1.3 DSEFSの解決済みの問題点
- No Servicesインストール時にDataStaxインストーラーによってDSEFSが正しく設定されない。(DSP-13473)
- NullPointerException:fsck実行時の<dse keyspace>.inodesのカラムvalid_fromに予期しないnull値が含まれる。(DSP-12615)
- WebHDFS APIを誤って使用するとメモリー・リークが発生する。(DSP-13813)
- まれにクライアント側でParsingExceptionが発生する。(DSP-14000)
- DSEFSでSparkを使用すると誤ったFileNotFoundエラーが発生する。(DSP-14105)
5.1.3 DSE Graph resolved issues
- -help prints the help text twice. (DGL-257)
- DGL prints excessive warnings. (DGL-262)
- The number of vertex labels is limited to 200 per graph. (DSP-11078)
- Graph frame error when no data is inserted into a meta-property. (DSP-13063)
- The Gremlin Server log directory setting does not work when the default log location is moved. Use dse-env.sh to change the log location. (DSP-13508)
- DseGraphFrame throws UnsupportedOperationException for a graph with an empty schema. (DSP-13858)
- DseGraphRpc.getSchemaBlob should require EXECUTE permission, not SELECT. (DSP-13888)
- Updating single-cardinality edges does not work correctly. (DSP-14185)
- DseGraphFrames.updateVertices() requires an unnecessary ID column. (DSP-14175)
- The within predicate does not work correctly on edges that are not indexed. (DSP-13209)
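5.1.3 DSE Search resolved issues
- Shard request exceptions are not logged at the replica level. (DSP-12691)
- An unnecessary double segment flush occurs on hard commit. (DSP-13971)
- Reintroduced the provisioning/deleting states for backward compatibility. A warning is generated when a graph is found. (DSP-14111)
- Search permissions cannot be managed on non-Search nodes in the cluster. (DSP-14242)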
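Hadoop libraries
Integrated Hadoop and Bring Your Own Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0 and removed in DSE 5.1. Because Hadoop was removed in DSE 5.1 and later, Hadoop services that were previously included in DSE, such as the MapReduce JobTracker and TaskTracker, can no longer be started in DSE.
However, DSE continues to support integrated Spark (DSE 4.5 and later) and Bring Your Own Spark (BYOS) (DSE 5.0 and later). Because Spark uses certain Hadoop libraries on the server and the client, DSE still ships the Hadoop libraries that Spark and BYOS need in order to work.
To view the bundled Hadoop libraries, see "DataStax Enterprise 5.1.x third-party software".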
Package installations and Installer-Services installations | /etc/dse/dse.yaml |
Tarball installations and Installer-No Services installations | installation_location/resources/dse/conf/dse.yaml |
DSE 5.1.3 CHANGES.txt
The list of production-certified changes for Apache Cassandra™ 3.11.0 that are included in DataStax Enterprise 5.1.3.
- Fix cassandra-stress hang issues when an error occurs during cluster connection (CASSANDRA-12938)
- Better bootstrap failure message when blocked by a (potential) range movement (CASSANDRA-13744)
- "ignore" option is ignored in sstableloader (CASSANDRA-13721)
- Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
- Duplicate the buffer before passing it to the analyzer in SASI operations (CASSANDRA-13512)
- Copy session properties on cqlsh.py do_login (CASSANDRA-13640)
- Potential AssertionError during ReadRepair of range tombstones and partition deletions (CASSANDRA-13719)
- Do not let stress write warmup data if n=0 (CASSANDRA-13773)
- Gossip thread slows down when using the batch commit log (CASSANDRA-12966)
- Randomize batchlog endpoint selection with only one or two racks (CASSANDRA-12884)
- Fix digest calculation for counter cells (CASSANDRA-13750)
- Fix ColumnDefinition.cellValueType() for non-frozen collections and change SSTabledump to use type.toJSONString() (CASSANDRA-13573)
- Skip materialized view addition if the base table does not exist (CASSANDRA-13737)
- Dropping a table should remove the corresponding entries in the dropped_columns table (CASSANDRA-13730)
- Log a warning message until legacy auth tables have been migrated (CASSANDRA-13371)
- Fix incorrect [2.1 <- 3.0] serialization of counter cells created in 2.0 (CASSANDRA-13691)
- Fix invalid writetime for null cells (CASSANDRA-13711)
- Fix ALTER TABLE statement to atomically propagate changes to the table and its MVs (CASSANDRA-12952)
- Fix digest mismatch exception when a hints file has UnknownColumnFamily (CASSANDRA-13696)
- Fix ambiguous output of the nodetool tablestats command (CASSANDRA-13722)
- Purge tombstones created by expired cells (CASSANDRA-13643)
- Make concat work with iterators that have different subsets of columns (CASSANDRA-13482)
- Set test.runners based on cores and memory size (CASSANDRA-13078)
- sstabledump reports incorrect usage for argument order (CASSANDRA-13532)
- Uncaught exceptions in the Netty pipeline (CASSANDRA-13649)
- Prevent integer overflow on exabyte filesystems (CASSANDRA-13067)
- Fix queries with LIMIT and filtering on clustering columns (CASSANDRA-11223)
- Fix potential NPE when resuming bootstrap fails (CASSANDRA-13272)
- Fix toJSONString for the UDT, tuple, and collection types (CASSANDRA-13592)
- Clone HeartBeatState when building gossip messages and make its generation/version volatile (CASSANDRA-13700)
DSE 5.1.3 NEWS.txt
General upgrade advice for DataStax Enterprise 5.1
GENERAL UPGRADING ADVICE FOR ANY VERSION
========================================
Snapshotting is fast (especially if you have JNA installed) and takes
effectively zero disk space until you start compacting the live data
files again. Thus, best practice is to ALWAYS snapshot before any
upgrade, just in case you need to roll back to the previous version.
(Cassandra version X + 1 will always be able to read data files created
by version X, but the inverse is not necessarily the case.)
When upgrading major versions of Cassandra, you will be unable to
restore snapshots created with the previous major version using the
'sstableloader' tool. You can upgrade the file format of your snapshots
using the provided 'sstableupgrade' tool.
DSE 5.1.3
=========
Upgrading
---------
- Creating Materialized View with filtering on non-primary-key base column
(added in CASSANDRA-10368) is disabled, because the liveness of a view row
depends on multiple filtered base non-key columns and the base non-key
column used in view primary-key. This semantic cannot be supported without
storage format change, see CASSANDRA-13826. For append-only use case, you
may still use this feature with a startup flag: "-Dcassandra.mv.allow_filtering_nonkey_columns_unsafe=true"
- The table system_auth.resource_role_permissons_index is no longer used and should be dropped
after all nodes are on 5.1.3. Note that upgrades from DSE 5.0 series since 5.0.10 to DSE
versions before 5.1.3 are not recommended.
- Full repairs are now default if no option is specified on nodetool repair, unless
incremental repair was already run on the table/keyspace being repaired, to maintain
backward compatibility. Incremental repair may be run on new tables by using the -inc option.
- Full repairs will no longer run anticompaction unless the --run-anticompaction option is specified
- Incremental repairs are no longer supported on tables with materialized views or CDC until
their limitations are addressed. An incremental repair triggered on a base table or
materialized view will run a full repair instead. See CASSANDRA-12888 for details.
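The repair defaults described in the list above can be summarized as a small decision function. This is only an illustrative sketch of the rules as worded in these notes, not DSE code: the function name and its parameters are hypothetical, and an explicit request for a full repair on a previously incrementally repaired table is not modeled.

```python
def effective_repair_mode(inc_option: bool,
                          previously_incremental: bool,
                          has_mv_or_cdc: bool) -> str:
    """Return "full" or "incremental" for a `nodetool repair` call.

    Hypothetical helper that mirrors the wording of the upgrade notes:
    - tables with materialized views or CDC always get a full repair;
    - the -inc option, or an earlier incremental repair on the same
      table/keyspace, keeps the repair incremental;
    - otherwise the new default (full repair) applies.
    """
    if has_mv_or_cdc:
        return "full"
    if inc_option or previously_incremental:
        return "incremental"
    return "full"


# A new table repaired without options now gets a full repair.
print(effective_repair_mode(False, False, False))
# A table with a materialized view falls back to full even with -inc.
print(effective_repair_mode(True, False, True))
```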
Materialized Views (only when upgrading from DSE 5.1.1 or 5.1.2 or any version lower than DSE 5.0.10)
---------------------------------------------------------------------------------------
- Cassandra will no longer allow dropping columns on tables with Materialized Views.
- A change was made in the way the Materialized View timestamp is computed, which
may cause an old deletion to a base column which is a view primary key (PK) column
to not be reflected in the view when repairing the base table post-upgrade. This
condition is only possible when a column deletion to an MV primary key (PK) column
not present in the base table PK (via UPDATE base SET view_pk_col = null or DELETE
view_pk_col FROM base) is missed before the upgrade and received by repair after the upgrade.
If such column deletions are done on a view PK column which is not a base PK, it's advisable
to run repair on the base table of all nodes prior to the upgrade. Alternatively it's possible
to fix potential inconsistencies by running repair on the views after the upgrade, or by dropping
and re-creating the views. See CASSANDRA-11500 for more details.
- Removal of columns not selected in the Materialized View (via UPDATE base SET unselected_column
= null or DELETE unselected_column FROM base) may not be properly reflected in the view in some
situations so we advise against doing deletions on base columns not selected in views
until this is fixed on CASSANDRA-13826.
3.11.0
======
Upgrading
---------
- ALTER TABLE (ADD/DROP COLUMN) operations concurrent with a read might
result in data corruption (see CASSANDRA-13004 for more details).
Fixing this bug required a messaging protocol version bump. By default,
Cassandra 3.11 will use 3014 version for messaging.
Since Schema Migrations rely on the exact messaging protocol version
match between nodes, if you need schema changes during the upgrade
process, you have to start your nodes with `-Dcassandra.force_3_0_protocol_version=true`
first, in order to temporarily force a backwards compatible protocol.
After the whole cluster is upgraded to 3.11, do a rolling
restart of the cluster without setting that flag.
3.11 nodes with and without the flag set will be able to do schema
migrations with other 3.x and 3.0.x releases.
While running the cluster with the flag set to true on 3.11 (in
compatibility mode), avoid adding or removing any columns to/from
existing tables.
If your cluster can do without schema migrations during the upgrade
time, just start the cluster normally without setting the aforementioned
flag.
If you are upgrading from 3.0.14+ (of 3.0.x branch), you do not have
to set any flag while upgrading to ensure schema migrations.
- The NativeAccessMBean isAvailable method will only return true if the
native library has been successfully linked. Previously it was returning
true if JNA could be found but was not taking into account link failures.
- Primary ranges in the system.size_estimates table are now based on the keyspace
replication settings and adjacent ranges are no longer merged (CASSANDRA-9639).
- In 2.1, the default for otc_coalescing_strategy was 'DISABLED'.
In 2.2 and 3.0, it was changed to 'TIMEHORIZON', but that value was shown
to be a performance regression. The default for 3.11.0 and newer has
been reverted to 'DISABLED'. Users upgrading from Cassandra 2.2 or 3.0 should
be aware that the default has changed.
- The StorageHook interface has been modified to allow retrieving read information from
SSTableReader (CASSANDRA-13120).
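One way to read the protocol-version guidance in the first item above is as a rule for choosing JVM startup flags during a rolling upgrade to 3.11. The sketch below encodes that reading only; the function names are hypothetical, and the only flag used is the one quoted in the note.

```python
FORCE_3_0_FLAG = "-Dcassandra.force_3_0_protocol_version=true"


def parse_version(version: str) -> tuple:
    """Turn "3.0.13" into (3, 0, 13); missing parts default to 0."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def startup_flags_for_upgrade(from_version: str,
                              schema_changes_during_upgrade: bool) -> list:
    """Hypothetical helper: extra JVM flags for nodes being upgraded.

    Assumed reading of the note: the compatibility flag is only needed
    when schema changes must happen mid-upgrade, and not at all when
    upgrading from 3.0.14 or later on the 3.0.x branch.
    """
    major, minor, patch = parse_version(from_version)
    from_3_0_14_plus = (major, minor) == (3, 0) and patch >= 14
    if schema_changes_during_upgrade and not from_3_0_14_plus:
        return [FORCE_3_0_FLAG]
    return []


print(startup_flags_for_upgrade("3.0.13", True))
print(startup_flags_for_upgrade("3.0.14", True))
```

After the whole cluster is on 3.11, the note calls for a rolling restart without the flag, which corresponds to starting every node with an empty flag list.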
3.10
====
New features
------------
- New `DurationType` (cql duration). See CASSANDRA-11873
- Runtime modification of concurrent_compactors is now available via nodetool
- Support for the assignment operators +=/-= has been added for update queries.
- An Index implementation may now provide a task which runs prior to joining
the ring. See CASSANDRA-12039
- Filtering on partition key columns is now also supported for queries without
secondary indexes.
- A slow query log has been added: slow queries will be logged at DEBUG level.
For more details refer to CASSANDRA-12403 and slow_query_log_timeout_in_ms
in cassandra.yaml.
- Support for GROUP BY queries has been added.
- A new compaction-stress tool has been added to test the throughput of compaction
for any cassandra-stress user schema. See compaction-stress help for how to use it.
- Compaction can now take into account overlapping tables that don't take part
in the compaction to look for deleted or overwritten data in the compacted tables.
When such data is found, it can be safely discarded, which in turn should enable
the removal of tombstones over that data.
The behavior can be engaged in two ways:
- as a "nodetool garbagecollect -g CELL/ROW" operation, which applies
single-table compaction on all sstables to discard deleted data in one step.
- as a "provide_overlapping_tombstones:CELL/ROW/NONE" compaction strategy flag,
which uses overlapping tables as a source of deletions/overwrites during all
compactions.
The argument specifies the granularity at which deleted data is to be found:
- If ROW is specified, only whole deleted rows (or sets of rows) will be
discarded.
- If CELL is specified, any columns whose value is overwritten or deleted
will also be discarded.
- NONE (default) specifies the old behavior, overlapping tables are not used to
decide when to discard data.
Which option to use depends on your workload, both ROW and CELL increase the
disk load on compaction (especially with the size-tiered compaction strategy),
with CELL being more resource-intensive. Both should lead to better read
performance if deleting rows (resp. overwriting or deleting cells) is common.
- Prepared statements are now persisted in the table prepared_statements in
the system keyspace. Upon startup, this table is used to preload all
previously prepared statements - i.e. in many cases clients do not need to
re-prepare statements against restarted nodes.
- cqlsh can now connect to older Cassandra versions by downgrading the native
protocol version. Please note that this is currently not part of our release
testing and, as a consequence, it is not guaranteed to work in all cases.
See CASSANDRA-12150 for more details.
- Snapshots that are automatically taken before a table is dropped or truncated
will have a "dropped" or "truncated" prefix on their snapshot tag name.
- Metrics are exposed for successful and failed authentication attempts.
These can be located using the object names org.apache.cassandra.metrics:type=Client,name=AuthSuccess
and org.apache.cassandra.metrics:type=Client,name=AuthFailure respectively.
- Add support to "unset" JSON fields in prepared statements by specifying DEFAULT UNSET.
See CASSANDRA-11424 for details
- Allow TTL with null value on insert and update. It will be treated as equivalent to inserting a 0.
- Removed outboundBindAny configuration property. See CASSANDRA-12673 for details.
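As an illustration of the "provide_overlapping_tombstones" compaction flag described above, the following sketch builds the corresponding CQL statement as a string. The helper name, the keyspace and table names, and the choice of SizeTieredCompactionStrategy as the compaction class are assumptions made for the example. Note that setting the compaction map this way replaces any other compaction options already set on the table.

```python
VALID_GRANULARITIES = {"CELL", "ROW", "NONE"}


def overlapping_tombstones_cql(keyspace: str, table: str, granularity: str,
                               compaction_class: str = "SizeTieredCompactionStrategy") -> str:
    """Build an ALTER TABLE statement that sets provide_overlapping_tombstones.

    Hypothetical helper: it only formats a string and does not talk to a
    cluster. The granularity must be one of CELL, ROW, or NONE.
    """
    granularity = granularity.upper()
    if granularity not in VALID_GRANULARITIES:
        raise ValueError(f"granularity must be one of {sorted(VALID_GRANULARITIES)}")
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = "
        f"{{'class': '{compaction_class}', "
        f"'provide_overlapping_tombstones': '{granularity}'}};"
    )


# Example with a hypothetical keyspace "ks" and table "events".
print(overlapping_tombstones_cql("ks", "events", "row"))
```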
Upgrading
---------
- Support for altering the types of existing table columns and of UDT fields has been disabled.
If it is necessary to return a different type, please use casting instead. See
CASSANDRA-12443 for more details.
- Specifying the default_time_to_live option when creating or altering a
materialized view was erroneously accepted (and ignored). It is now
properly rejected.
- Only Java and JavaScript are now supported UDF languages.
The sandbox in 3.0 already prevented the use of script languages except Java
and JavaScript.
- Compaction now correctly drops sstables out of CompactionTask when there
isn't enough disk space to perform the full compaction. This should reduce
pending compaction tasks on systems with little remaining disk space.
- Request timeouts in cassandra.yaml (read_request_timeout_in_ms, etc) now apply to the
"full" request time on the coordinator. Previously, they only covered the time from
when the coordinator sent a message to a replica until the time that the replica
responded. Additionally, the previous behavior was to reset the timeout when performing
a read repair, making a second read to fix a short read, and when subranges were read
as part of a range scan or secondary index query. In 3.10 and higher, the timeout
is no longer reset for these "subqueries". The entire request must complete within
the specified timeout. As a consequence, your timeouts may need to be adjusted
to account for this. See CASSANDRA-12256 for more details.
- Logs written to stdout are now consistent with logs written to files.
Time is now local (it was UTC on the console and local in files). Date, thread, file
and line info were added to stdout. (see CASSANDRA-12004)
- The 'clientutil' jar, which has been somewhat broken on the 3.x branch, is no longer provided.
The features provided by that jar are provided by any good Java driver and we advise relying on drivers rather than on
that jar, but if you need that jar for backward compatibility until you do so, you should use the version provided
on a previous Cassandra branch, like the 3.0 branch (by design, the functionality provided by that jar is stable
across versions so using the 3.0 jar for a client connecting to 3.x should work without issues).
- (Tools development) DatabaseDescriptor no longer implicitly starts up components/services like
commit log replay. This may break existing 3rd party tools and clients. In order to start up
a standalone tool or client application, use the DatabaseDescriptor.toolInitialization() or
DatabaseDescriptor.clientInitialization() methods. Tool initialization sets up partitioner,
snitch, encryption context. Client initialization just applies the configuration but does not
set up anything. Instead of using Config.setClientMode() or Config.isClientMode(), which are
deprecated now, use one of the appropriate new methods in DatabaseDescriptor.
- Application layer keep-alives were added to the streaming protocol to prevent idle incoming connections from
timing out and failing the stream session (CASSANDRA-11839). This effectively deprecates the streaming_socket_timeout_in_ms
property in favor of streaming_keep_alive_period_in_secs. See cassandra.yaml for more details about this property.
- Duration literals support the ISO 8601 format. As a consequence, identifiers matching that format
(e.g. P2Y or P1MT6H) will not be supported anymore (CASSANDRA-11873).
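The change to request timeouts described in this list (timeouts now cover the whole request and are no longer reset for "subqueries") can be pictured with a small model. This is a simplified sketch with hypothetical names: it treats a request as a sequence of subquery durations in milliseconds and ignores any other time spent on the coordinator.

```python
def request_times_out(subquery_ms: list, timeout_ms: int,
                      reset_per_subquery: bool) -> bool:
    """Return True if the modeled request would time out.

    reset_per_subquery=True  models the pre-3.10 behavior, where the
                             timer restarts for each subquery (read
                             repair, short-read retry, subrange read).
    reset_per_subquery=False models 3.10 and higher, where the entire
                             request shares a single timeout budget.
    """
    if reset_per_subquery:
        return any(duration > timeout_ms for duration in subquery_ms)
    return sum(subquery_ms) > timeout_ms


# Two 3-second subqueries against a 5-second timeout.
durations = [3000, 3000]
print(request_times_out(durations, 5000, reset_per_subquery=True))
print(request_times_out(durations, 5000, reset_per_subquery=False))
```

In this model the same request succeeds under the old rule and times out under the new one, which is why the note suggests reviewing timeout values after the upgrade.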
3.8
===
New features
------------
- Shared pool threads are now named according to the stage they are executing
tasks for. Thread names mentioned in traced queries change accordingly.
- A new option has been added to cassandra-stress "-rate fixed={number}/s"
that forces a scheduled rate of operations/sec over time. Using this, stress can
accurately account for coordinated omission from the stress process.
- The cassandra-stress "-rate limit=" option has been renamed to "-rate throttle="
- hdr histograms have been added to stress runs; their output can be saved to disk using the
"-log hdrfile=" option. This histogram includes response/service/wait times when used with the
fixed or throttle rate options. The histogram file can be plotted on
http://hdrhistogram.github.io/HdrHistogram/plotFiles.html
- TimeWindowCompactionStrategy has been added. This has proven to be a better approach
to time series compaction and new tables should use this instead of DTCS. See
CASSANDRA-9666 for details.
- Change-Data-Capture is now available. See cassandra.yaml for CDC-specific flags and
a brief explanation of on-disk locations for archived data in CommitLog form. This can
be enabled via ALTER TABLE ... WITH cdc=true.
Upon flush, CommitLogSegments containing data for CDC-enabled tables are moved to
the data/cdc_raw directory until removed by the user and writes to CDC-enabled tables
will be rejected with a WriteTimeoutException once cdc_total_space_in_mb is reached
between unflushed CommitLogSegments and cdc_raw.
NOTE: CDC is disabled by default in the .yaml file. Do not enable CDC on a mixed-version
cluster as it will lead to exceptions which can interrupt traffic. Once all nodes
have been upgraded to 3.8 it is safe to enable this feature and restart the cluster.
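The CDC space limit described above can be expressed as a simple check. The sketch below is an assumption-laden model, not Cassandra code: the function name is hypothetical, sizes are given in megabytes, and "reached" is interpreted as greater than or equal to the limit.

```python
def cdc_write_rejected(unflushed_cdc_mb: float, cdc_raw_mb: float,
                       cdc_total_space_in_mb: float) -> bool:
    """Model of when writes to CDC-enabled tables start being rejected.

    unflushed_cdc_mb      -- space held by unflushed CommitLogSegments
                             containing data for CDC-enabled tables
    cdc_raw_mb            -- space used by segments already moved to
                             the data/cdc_raw directory
    cdc_total_space_in_mb -- the configured limit from cassandra.yaml
    """
    return unflushed_cdc_mb + cdc_raw_mb >= cdc_total_space_in_mb


# With a hypothetical 4096 MB limit, a consumer that never removes files
# from cdc_raw eventually causes writes to be rejected.
print(cdc_write_rejected(unflushed_cdc_mb=64, cdc_raw_mb=1024, cdc_total_space_in_mb=4096))
print(cdc_write_rejected(unflushed_cdc_mb=64, cdc_raw_mb=4032, cdc_total_space_in_mb=4096))
```

Because segments stay in data/cdc_raw until the user removes them, whatever process consumes the CDC data also has to delete the files it has finished with.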
Upgrading
---------
- The ReversedType behaviour has been corrected for clustering columns of
BYTES type containing empty value. Scrub should be run on the existing
SSTables containing a descending clustering column of BYTES type to correct
their ordering. See CASSANDRA-12127 for more details.
- Ec2MultiRegionSnitch will no longer automatically set broadcast_rpc_address
to the public instance IP if this property is defined on cassandra.yaml.
- The names "json" and "distinct" are no longer valid as user-defined function
names (they are still valid as column names, however). In the unlikely case where
you had defined functions with such names, you will need to recreate
those under a different name, change your code to use the new names and
drop the old versions, and this _before_ upgrade (see CASSANDRA-10783 for more
details).
Deprecation
-----------
- DateTieredCompactionStrategy has been deprecated - new tables should use
TimeWindowCompactionStrategy. Note that migrating an existing DTCS-table to TWCS might
cause increased compaction load for a while after the migration so make sure you run
tests before migrating. Read CASSANDRA-9666 for background on this.
3.7
===
Upgrading
---------
- A maximum size for SSTables values has been introduced, to prevent out of memory
exceptions when reading corrupt SSTables. This maximum size can be set via
max_value_size_in_mb in cassandra.yaml. The default is 256MB, which matches the default
value of native_transport_max_frame_size_in_mb. SSTables will be considered corrupt if
they contain values whose size exceeds this limit. See CASSANDRA-9530 for more details.
3.6
=====
New features
------------
- JMX connections can now use the same auth mechanisms as CQL clients. New options
in cassandra-env.(sh|ps1) enable JMX authentication and authorization to be delegated
to the IAuthenticator and IAuthorizer configured in cassandra.yaml. The default settings
still only expose JMX locally, and use the JVM's own security mechanisms when remote
connections are permitted. For more details on how to enable the new options, see the
comments in cassandra-env.sh. A new class of IResource, JMXResource, is provided for
the purposes of GRANT/REVOKE via CQL. See CASSANDRA-10091 for more details.
Also, directly setting JMX remote port via the com.sun.management.jmxremote.port system
property at startup is deprecated. See CASSANDRA-11725 for more details.
- JSON timestamps are now in UTC and contain the timezone information, see CASSANDRA-11137 for more details.
- Collision checks are performed when joining the token ring, regardless of whether
the node should bootstrap. Additionally, replace_address can legitimately be used
without bootstrapping to help with recovery of nodes with partially failed disks.
See CASSANDRA-10134 for more details.
- Key cache will only hold indexed entries up to the size configured by
column_index_cache_size_in_kb in cassandra.yaml in memory. Larger indexed entries
will never go into memory. See CASSANDRA-11206 for more details.
- For tables having a default_time_to_live, specifying a TTL of 0 will remove the TTL
from the inserted or updated values.
- Startup is now aborted if corrupted transaction log files are found. The details
of the affected log files are now logged, allowing the operator to decide how
to resolve the situation.
- Filtering expressions are made more pluggable and can be added programmatically via
a QueryHandler implementation. See CASSANDRA-11295 for more details.
3.4
===
New features
------------
- Internal authentication now supports caching of encrypted credentials.
Reference cassandra.yaml:credentials_validity_in_ms
- Remote configuration of auth caches via JMX can be disabled using
the system property cassandra.disable_auth_caches_remote_configuration
- The sstabledump tool has been added as the 3.0 version of the former sstable2json. The tool only
supports v3.0+ SSTables. See the tool's help for more detail.
Upgrading
---------
- Nothing specific to 3.4, but please see the upgrading sections of previous versions,
especially if you are upgrading from 2.2.
Deprecation
-----------
- The mbean interfaces org.apache.cassandra.auth.PermissionsCacheMBean and
org.apache.cassandra.auth.RolesCacheMBean are deprecated in favor of
org.apache.cassandra.auth.AuthCacheMBean. This generalized interface is
common across all caches in the auth subsystem. The specific mbean interfaces
for each individual cache will be removed in a subsequent major version.
3.2
===
New features
------------
- We now make sure that a token does not exist in several data directories. This
means that we run one compaction strategy per data_file_directory and we use
one thread per directory to flush. Use nodetool relocatesstables to make sure your
tokens are in the correct place, or just wait and compaction will handle it. See
CASSANDRA-6696 for more details.
- bound maximum in-flight commit log replay mutation bytes to 64 megabytes
tunable via cassandra.commitlog_max_outstanding_replay_bytes
- Support for type casting has been added to the selection clause.
- Hinted handoff now supports compression. Reference cassandra.yaml:hints_compression.
Note: hints compression is currently disabled by default.
Upgrading
---------
- The compression ratio metrics computation has been modified to be more accurate.
- Running Cassandra as root is prevented by default.
- JVM options are moved from cassandra-env.(sh|ps1) to jvm.options file
Deprecation
-----------
- The Thrift API is deprecated and will be removed in Cassandra 4.0.
3.1
=====
Upgrading
---------
- The return value of SelectStatement::getLimit has been changed from DataLimits
to int.
- Custom index implementations should be aware that the method Indexer::indexes()
has been removed as its contract was misleading and all custom implementations
should have almost surely returned true unconditionally for that method.
- GC logging is now enabled by default (you can disable it in the jvm.options
file if you prefer).
3.0
===
New features
------------
- EACH_QUORUM is now a supported consistency level for read requests.
- Support for IN restrictions on any partition key component or clustering key
as well as support for EQ and IN multicolumn restrictions has been added to
UPDATE and DELETE statement.
- Support for single-column and multi-column slice restrictions (>, >=, <= and <)
has been added to DELETE statements.
- nodetool rebuild_index accepts the index argument without
the redundant table name
- Materialized Views, which allow for server-side denormalization, are now
available. Materialized views provide an alternative to secondary indexes
for non-primary key queries, and perform much better for indexing high
cardinality columns.
See http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
- Hinted handoff has been completely rewritten. Hints are now stored in flat
files, with less overhead for storage and more efficient dispatch.
See CASSANDRA-6230 for full details.
- Option to not purge unrepaired tombstones. To avoid users having data resurrected
if repair has not been run within gc_grace_seconds, an option has been added to
only allow tombstones from repaired sstables to be purged. To enable, set the
compaction option 'only_purge_repaired_tombstones':true but keep in mind that if
you do not run repair for a long time, you will keep all tombstones around which
can cause other problems.
- Enabled warning on GC taking longer than 1000ms. See
cassandra.yaml:gc_warn_threshold_in_ms
Upgrading
---------
- Clients must use the native protocol version 3 when upgrading from 2.2.X as
the native protocol version 4 is not compatible between 2.2.X and 3.Y. See
https://www.mail-archive.com/user@cassandra.apache.org/msg45381.html for details.
- A new argument of type InetAddress has been added to IAuthenticator::newSaslNegotiator,
representing the IP address of the client attempting authentication. It will be a breaking
change for any custom implementations.
- token-generator tool has been removed.
- Upgrade to 3.0 is supported from Cassandra 2.1 versions greater than or equal to 2.1.9,
or Cassandra 2.2 versions greater than or equal to 2.2.2. Upgrade from Cassandra 2.0 and
older versions is not supported.
- The 'memtable_allocation_type: offheap_objects' option has been removed. It should
be re-introduced in a future release and you can follow CASSANDRA-9472 to know more.
- Configuration parameter memory_allocator in cassandra.yaml has been removed.
- The native protocol versions 1 and 2 are not supported anymore.
- Max mutation size is now configurable via max_mutation_size_in_kb setting in
cassandra.yaml; the default is half of commitlog_segment_size_in_mb * 1024.
- 3.0 requires Java 8u40 or later.
- Garbage collection options were moved from cassandra-env to jvm.options file.
- New transaction log files have been introduced to replace the compactions_in_progress
system table, temporary file markers (tmp and tmplink) and sstable ancestors.
Therefore, compaction metadata no longer contains ancestors. Transaction log files
list sstable descriptors involved in compactions and other operations such as flushing
and streaming. Use the sstableutil tool to list any sstable files currently involved
in operations not yet completed, which previously would have been marked as temporary.
A transaction log file contains one sstable per line, with the prefix "add:" or "remove:".
They also contain a special line "commit", only inserted at the end when the transaction
is committed. On startup we use these files to clean up any partial transactions that were
in progress when the process exited. If the commit line is found, we keep new sstables
(those with the "add" prefix) and delete the old sstables (those with the "remove" prefix),
vice-versa if the commit line is missing. Should you lose or delete these log files,
both old and new sstable files will be kept as live files, which will result in duplicated
sstables. These files are protected by incremental checksums so you should not manually
edit them. When restoring a full backup or moving sstable files, you should clean up
any leftover transactions and their temporary files first. You can use this command:
===> sstableutil -c ks table
See CASSANDRA-7066 for full details.
- New write stages have been added for batchlog and materialized view mutations;
you can set their size in cassandra.yaml.
- User defined functions are now executed in a sandbox.
To use UDFs and UDAs, you have to enable them in cassandra.yaml.
- New SSTable version 'la' with improved bloom-filter false-positive handling
compared to previous version 'ka' used in 2.2 and 2.1. Running sstableupgrade
is not necessary but recommended.
- Before upgrading to 3.0, make sure that your cluster is in complete agreement
(schema versions outputted by `nodetool describecluster` are all the same).
- Schema metadata is now stored in the new `system_schema` keyspace, and
legacy `system.schema_*` tables are now gone; see CASSANDRA-6717 for details.
- Pig support has been removed.
- Hadoop BulkOutputFormat and BulkRecordWriter have been removed; use
CqlBulkOutputFormat and CqlBulkRecordWriter instead.
- Hadoop ColumnFamilyInputFormat and ColumnFamilyOutputFormat have been removed;
use CqlInputFormat and CqlOutputFormat instead.
- Hadoop ColumnFamilyRecordReader and ColumnFamilyRecordWriter have been removed;
use CqlRecordReader and CqlRecordWriter instead.
- hinted_handoff_enabled in cassandra.yaml no longer supports a list of data centers.
To specify a list of excluded data centers when hinted_handoff_enabled is set to true,
use hinted_handoff_disabled_datacenters, see CASSANDRA-9035 for details.
- The `sstable_compression` and `chunk_length_kb` compression options have been deprecated.
The new options are `class` and `chunk_length_in_kb`. Disabling compression should now
be done by setting the new option `enabled` to `false`.
- The compression option `crc_check_chance` became a top-level table option, but is currently
enforced only against tables with enabled compression.
- Only map syntax is now allowed for caching options. ALL/NONE/KEYS_ONLY/ROWS_ONLY syntax
has been deprecated since 2.1.0 and is being removed in 3.0.0.
- The 'index_interval' option for 'CREATE TABLE' statements, which has been deprecated
since 2.1 and replaced with the 'min_index_interval' and 'max_index_interval' options,
has now been removed.
- Batchlog entries are now stored in a new table - system.batches.
The old one has been deprecated.
- JMX methods set/getCompactionStrategyClass have been removed, use
set/getCompactionParameters or set/getCompactionParametersJson instead.
- SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
- The secondary index API has been comprehensively reworked. This will be a breaking
change for any custom index implementations, which should now look to implement
the new org.apache.cassandra.index.Index interface. New syntax has been added to create
and query row-based indexes, which are not explicitly linked to a single column in the
base table.
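The startup handling of transaction log files described in this list (keep the "add" sstables and delete the "remove" sstables when the commit line is present, and the reverse when it is missing) can be sketched as follows. This is a deliberately simplified model with a hypothetical function name: real transaction log files are protected by incremental checksums and should be inspected with sstableutil, never parsed or edited by hand.

```python
def resolve_transaction(log_lines: list) -> dict:
    """Decide which sstables survive, following the rule in the note.

    Each line is assumed to be "add:<sstable>", "remove:<sstable>" or
    the special line "commit". This simplified line format is an
    assumption for illustration, not the actual on-disk layout.
    """
    added, removed, committed = [], [], False
    for raw in log_lines:
        line = raw.strip()
        if not line:
            continue
        if line == "commit":
            committed = True
        elif line.startswith("add:"):
            added.append(line[len("add:"):])
        elif line.startswith("remove:"):
            removed.append(line[len("remove:"):])
        else:
            raise ValueError(f"unrecognised transaction log line: {line!r}")
    if committed:
        # Transaction finished: new sstables are live, old ones go away.
        return {"keep": added, "delete": removed}
    # Transaction incomplete: roll back to the old sstables.
    return {"keep": removed, "delete": added}


# Hypothetical sstable names; the second call models a crash before commit.
finished = ["add:new-1", "remove:old-1", "remove:old-2", "commit"]
print(resolve_transaction(finished))
print(resolve_transaction(finished[:-1]))
```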
2.2.4
=====
Deprecation
-----------
- Pig support has been deprecated, and will be removed in 3.0.
Please see CASSANDRA-10542 for more details.
- Configuration parameter memory_allocator in cassandra.yaml has been deprecated
and will be removed in 3.0.0. As mentioned below for 2.2.0, jemalloc is
automatically preloaded on Unix platforms.
Operations
----------
- Switching data center or racks is no longer an allowed operation on a node
which has data. Instead, the node will need to be decommissioned and
rebootstrapped. If moving from the SimpleSnitch, make sure that the data
center and rack containing all current nodes are named "datacenter1" and
"rack1". To override this behaviour use -Dcassandra.ignore_rack=true and/or
-Dcassandra.ignore_dc=true.
- Reloading the configuration file of GossipingPropertyFileSnitch has been disabled.
Upgrading
---------
- The default for the inter-DC stream throughput setting
(inter_dc_stream_throughput_outbound_megabits_per_sec in cassandra.yaml) is
now the same as the default for the intra-DC one (200Mbps) instead of being unlimited.
Having it unlimited was never intended and was a bug.
New features
------------
- Time windows in DTCS are now limited to 1 day by default to be able to
handle bootstrap and repair in a better way. To get the old behaviour,
increase max_window_size_seconds.
- DTCS option max_sstable_age_days is now deprecated and defaults to 1000 days.
- Native protocol server now allows both SSL and non-SSL connections on
the same port.
2.2.3
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.2 if you are upgrading
from a previous version.
2.2.2
=====
Changed Defaults
----------------
- commitlog_total_space_in_mb will use the smaller of 8192 and 1/4
of the total space of the commitlog volume. (Before: always used
8192)
- The following INFO logs were reduced to DEBUG level and will now show
on debug.log instead of system.log:
- Memtable flushing actions
- Commit log replayed files
- Compacted sstables
- SStable opening (SSTableReader)
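
The new commitlog_total_space_in_mb default described above can be sketched as follows. This is only an illustration of the "smaller of 8192 and 1/4 of the volume" rule; the function names are hypothetical, and the use of integer division is an assumption rather than something the release note states.

```python
import shutil


def volume_total_mb(path: str) -> int:
    """Total size in MB of the volume holding `path` (hypothetical helper)."""
    return shutil.disk_usage(path).total // (1024 * 1024)


def default_commitlog_total_space_mb(total_mb: int) -> int:
    """Smaller of 8192 MB and one quarter of the commitlog volume.

    Integer division is an assumption made for this sketch.
    """
    return min(8192, total_mb // 4)


if __name__ == "__main__":
    # A 20 GB volume yields 5000 MB; a large volume is capped at 8192 MB.
    print(default_commitlog_total_space_mb(20_000))
    print(default_commitlog_total_space_mb(1_000_000))
```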
New features
------------
- Custom QueryHandlers can retrieve the column specifications for the bound
variables from QueryOptions by using the hasColumnSpecifications()
and getColumnSpecifications() methods.
- A new default asynchronous log appender debug.log was created in addition
to the system.log appender in order to provide more detailed log debugging.
To disable debug logging, comment out the ASYNCDEBUGLOG
appender in conf/logback.xml. See CASSANDRA-10241 for more information.
2.2.1
=====
New features
------------
- COUNT(*) and COUNT(1) can be selected with other columns or functions
2.2
===
Upgrading
---------
- The authentication & authorization subsystems have been redesigned to
support role based access control (RBAC), resulting in a change to the
schema of the system_auth keyspace. See below for more detail.
For systems already using the internal auth implementations, the process
for converting existing data during a rolling upgrade is straightforward.
As each node is restarted, it will attempt to convert any data in the
legacy tables into the new schema. Until enough nodes to satisfy the
replication strategy for the system_auth keyspace are upgraded and so have
the new schema, this conversion will fail with the failure being reported
in the system log.
During the upgrade, Cassandra's internal auth classes will continue to use
the legacy tables, so clients experience no disruption. Issuing DCL
statements during an upgrade is not supported.
Once all nodes are upgraded, an operator with superuser privileges should
drop the legacy tables, system_auth.users, system_auth.credentials and
system_auth.permissions. Doing so will prompt Cassandra to switch over to
the new tables without requiring any further intervention.
While the legacy tables are present a restarted node will re-run the data
conversion and report the outcome so that operators can verify that it is
safe to drop them.
New features
------------
- The LIMIT clause now applies only to the number of rows returned to the user,
not to the number of rows queried. As a consequence, queries using aggregates are
no longer affected by the LIMIT clause.
- Very large batches will now be rejected (defaults to 50kb). This
can be customized by modifying batch_size_fail_threshold_in_kb.
- Selecting columns, scalar functions, UDT fields, writetime or ttl together
with aggregates is now possible. The values returned for the columns,
scalar functions, UDT fields, writetime and ttl will be those of
the first row matching the query.
- Windows is now a supported platform. Powershell execution for startup scripts
is highly recommended and can be enabled via an administrator command-prompt
with: 'powershell set-executionpolicy unrestricted'
- It is now possible to do major compactions when using leveled compaction.
Doing that will take all sstables and compact them out in levels. The
levels will be non overlapping so doing this will still not be something
you want to do very often since it might cause more compactions for a while.
It is also possible to split output when doing a major compaction with
STCS - files will be split in sizes 50%, 25%, 12.5% etc of the total size.
This might be a bit better than old major compactions which created one big
file on disk.
- A new tool, bin/sstableverify, has been added that checks for errors/bitrot
in all sstables. Unlike scrub, this is a non-invasive tool.
- Authentication & Authorization APIs have been updated to introduce
roles. Roles and Permissions granted to them are inherited, supporting
role based access control. The role concept supersedes that of users
and CQL constructs such as CREATE USER are deprecated but retained for
compatibility. The requirement to explicitly create Roles in Cassandra
even when auth is handled by an external system has been removed, so
authentication & authorization can be delegated to such systems in their
entirety.
- In addition to the above, Roles are also first class resources and can be the
subject of permissions. Users (roles) can now be granted permissions on other
roles, including CREATE, ALTER, DROP & AUTHORIZE, which removes the need for
superuser privileges in order to perform user/role management operations.
- Creators of database resources (Keyspaces, Tables, Roles) are now automatically
granted all permissions on them (if the IAuthorizer implementation supports
this).
- The SSTable file name format has changed: the Keyspace/CF name is no longer
part of the file name. Also, each secondary index now has its own directory
under its parent's directory.
- Support for user-defined functions and user-defined aggregates has
been added to CQL.
************************************************************************
IMPORTANT NOTE: user-defined functions can be used to execute
arbitrary and possibly evil code in Cassandra 2.2, and are
therefore disabled by default. To enable UDFs edit
cassandra.yaml and set enable_user_defined_functions to true.
CASSANDRA-9402 will add a security manager for UDFs in Cassandra
3.0. This will inherently be backwards-incompatible with any 2.2
UDF that performs insecure operations such as opening a socket or
writing to the filesystem.
************************************************************************
- Row-cache is now fully off-heap.
- jemalloc is now automatically preloaded and used on Linux and OS-X if
installed.
- Please ensure on Unix platforms that there is no libjnadispath.so
installed which is accessible by Cassandra. Old versions of
libjna packages (< 4.0.0) will cause problems - e.g. Debian Wheezy
contains libjna version 3.2.x.
- The node now stays up when streaming fails during bootstrapping. You can
use the new `nodetool bootstrap resume` command to continue streaming after resolving
the issue.
- Protocol version 4 specifies that bind variables do not require having a
value when executing a statement. Bind variables without a value are
called 'unset'. The 'unset' bind variable is serialized as the int
value '-2' without following bytes.
In an EXECUTE or BATCH request an unset bind value does not modify the value and
does not create a tombstone, an unset bind ttl is treated as 'unlimited',
an unset bind timestamp is treated as 'now', an unset bind counter operation
does not change the counter value.
Unset tuple field, UDT field and map key are not allowed.
In a QUERY request an unset limit is treated as 'unlimited'.
Unset WHERE clauses with unset partition column, clustering column
or index column are not allowed.
- New `ByteType` (cql tinyint). 1-byte signed integer
- New `ShortType` (cql smallint). 2-byte signed integer
- New `SimpleDateType` (cql date). 4-byte unsigned integer
- New `TimeType` (cql time). 8-byte long
- The toDate(timeuuid), toTimestamp(timeuuid) and toUnixTimestamp(timeuuid) functions have been added
to convert a timeuuid into a date, a timestamp and a bigint raw value, respectively.
The functions unixTimestampOf(timeuuid) and dateOf(timeuuid) have been deprecated.
- The toDate(timestamp) and toUnixTimestamp(timestamp) functions have been added
to convert a timestamp into a date and a bigint raw value, respectively.
- The toTimestamp(date) and toUnixTimestamp(date) functions have been added
to convert a date into a timestamp and a bigint raw value, respectively.
- SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
- The default JVM flag -XX:+PerfDisableSharedMem will cause the following JVM tools
to stop working: jps, jstack, jinfo, jmc, jcmd as well as 3rd party tools like Jolokia.
If you wish to use these tools you can comment this flag out in cassandra-env.{sh,ps1}
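
The protocol version 4 entry above says an 'unset' bind variable is serialized as the int value -2 with no following bytes. A minimal sketch of that encoding is shown below. The UNSET sentinel and the encode_value function are hypothetical names for illustration, not a driver API; the null encoding (-1) comes from the native protocol specification rather than from this entry.

```python
import struct

# Hypothetical sentinel marking a bind variable that was given no value.
UNSET = object()

NULL_LENGTH = -1   # length prefix for a null value (native protocol spec)
UNSET_LENGTH = -2  # length prefix for an 'unset' value (protocol v4)


def encode_value(value) -> bytes:
    """Encode a bound value as a 4-byte big-endian length plus the bytes.

    `value` is expected to be UNSET, None, or an already-serialized bytes
    object.
    """
    if value is UNSET:
        # No bytes follow the -2 marker.
        return struct.pack(">i", UNSET_LENGTH)
    if value is None:
        return struct.pack(">i", NULL_LENGTH)
    return struct.pack(">i", len(value)) + value


if __name__ == "__main__":
    print(encode_value(UNSET).hex())
    print(encode_value(b"ab").hex())
```

Because an unset value writes nothing and creates no tombstone, a prepared INSERT can be reused for rows where some columns are simply not supplied.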
Upgrading
---------
- Thrift rpc is no longer being started by default.
Set `start_rpc` parameter to `true` to enable it.
- Pig's CqlStorage has been removed, use CqlNativeStorage instead
- Pig's CassandraStorage has been deprecated. CassandraStorage
should only be used against tables created via thrift.
Use CqlNativeStorage for all other tables.
- IAuthenticator has been updated to remove responsibility for user/role
maintenance and is now solely responsible for validating credentials.
This is primarily done via SASL, though an optional method exists for
systems which need support for the Thrift login() method.
- IRoleManager interface has been added which takes over the maintenance
functions from IAuthenticator. IAuthorizer is mainly unchanged. Auth data
in systems using the stock internal implementations PasswordAuthenticator
& CassandraAuthorizer will be automatically converted during upgrade,
with minimal operator intervention required. Custom implementations will
require modification, though these can be used in conjunction with the
stock CassandraRoleManager so providing an IRoleManager implementation
should not usually be necessary.
- Fat client support has been removed since we have push notifications to clients
- cassandra-cli has been removed. Please use cqlsh instead.
- YamlFileNetworkTopologySnitch has been removed; switch to
GossipingPropertyFileSnitch instead.
- CQL2 has been removed entirely in this release (previously deprecated
in 2.0.0). Please switch to CQL3 if you haven't already done so.
- The results of CQL3 queries containing an IN restriction will be returned
in the normal order and no longer in the order in which the column values were
specified in the IN restriction.
- Some secondary index queries with restrictions on non-indexed clustering
columns were not requiring ALLOW FILTERING as they should. This has been
fixed, and those queries now require ALLOW FILTERING (see CASSANDRA-8418
for details).
- The SSTableSimpleWriter and SSTableSimpleUnsortedWriter classes have been
deprecated and will be removed in the next major Cassandra release. You
should use the CQLSSTableWriter class instead.
- The sstable2json and json2sstable tools have been deprecated and will be
removed in the next major Cassandra release. See CASSANDRA-9618
(https://issues.apache.org/jira/browse/CASSANDRA-9618) for details.
- nodetool enablehandoff will no longer support a list of data centers starting
with the next major release. Two new commands will be added, enablehintsfordc and disablehintsfordc,
to exclude data centers from using hinted handoff when the global status is enabled.
In cassandra.yaml, hinted_handoff_enabled will no longer support a list of data centers starting
with the next major release. A new setting will be added, hinted_handoff_disabled_datacenters,
to exclude data centers when the global status is enabled, see CASSANDRA-9035 for details.
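
If application code relied on the old behaviour described in the IN restriction entry above, one option is to restore that order on the client. The sketch below assumes rows are available as plain dictionaries; the function name and the row representation are hypothetical and not tied to any particular driver.

```python
def reorder_by_in_list(rows, column, in_values):
    """Return rows sorted by the position of row[column] in in_values.

    sorted() is stable, so rows that share the same value keep the
    relative order in which the server returned them.
    """
    position = {value: index for index, value in enumerate(in_values)}
    return sorted(rows, key=lambda row: position[row[column]])


if __name__ == "__main__":
    # Rows as returned in the normal order.
    rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
    # Order the values were listed in the IN restriction.
    print(reorder_by_in_list(rows, "id", [3, 1, 2]))
```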
2.1.13
======
New features
------------
- New options for cqlsh COPY FROM and COPY TO, see CASSANDRA-9303 for details.
2.1.10
======
New features
------------
- The syntax TRUNCATE TABLE X is now accepted as an alias for TRUNCATE X
2.1.9
=====
Upgrading
---------
- cqlsh will now display timestamps with a UTC timezone. Previously,
timestamps were displayed with the local timezone.
- Commit log files are no longer recycled by default, due to negative
performance implications. This can be enabled again with the
commitlog_segment_recycling option in your cassandra.yaml
- JMX methods set/getCompactionStrategyClass have been deprecated, use
set/getCompactionParameters/set/getCompactionParametersJson instead
2.1.8
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.7
=====
2.1.6
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.5
=====
Upgrading
---------
- The option to omit cold sstables with size tiered compaction has been
removed - it is almost always better to use date tiered compaction for
workloads that have cold data.
2.1.4
=====
Upgrading
---------
- The default JMX config now listens to localhost only. You must enable
the other JMX flags in cassandra-env.sh manually.
2.1.3
=====
Upgrading
---------
- Prepending a list to a list collection was erroneously resulting in
the prepended list being reversed upon insertion. If you were depending
on this buggy behavior, note that it has been corrected.
- Incremental replacement of compacted SSTables has been disabled for this
release.
2.1.2
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.1
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
New features
------------
- Netty support for epoll on Linux is now enabled. If for some
reason you want to disable it, pass the following system property:
-Dcassandra.native.epoll.enabled=false
2.1
===
New features
------------
- Default data and log locations have changed. If not set in
cassandra.yaml, the data file directory, commitlog directory,
and saved caches directory will default to $CASSANDRA_HOME/data/data,
$CASSANDRA_HOME/data/commitlog, and $CASSANDRA_HOME/data/saved_caches,
respectively. The log directory now defaults to $CASSANDRA_HOME/logs.
If not set, $CASSANDRA_HOME, defaults to the top-level directory of
the installation.
Note that this should only affect source checkouts and tarballs.
Deb and RPM packages will continue to use /var/lib/cassandra and
/var/log/cassandra in cassandra.yaml.
- The SSTable data directory name has changed slightly. Each directory will
have a hex string appended after the CF name, e.g.
ks/cf-5be396077b811e3a3ab9dc4b9ac088d/
This hex string represents the unique ColumnFamily ID.
Note that existing directories are used as is, so only directories created
after the upgrade use the new directory name format.
- Saved key cache files also have ColumnFamily ID in their file name.
- It is now possible to do incremental repairs, sstables that have been
repaired are marked with a timestamp and not included in the next
repair session. Use nodetool repair -par -inc to use this feature.
A tool to manually mark/unmark sstables as repaired is available in
tools/bin/sstablerepairedset. This is particularly important when
using LCS, or any data not repaired in your first incremental repair
will be put back in L0.
- Bootstrapping now ensures that range movements are consistent,
meaning the data for the new node is taken from the node that is no
longer responsible for that range of keys.
If you want the old behavior (due to a lost node perhaps)
you can set the following property (-Dcassandra.consistent.rangemovement=false).
- It is now possible to use quoted identifiers in triggers' names.
WARNING: if you previously used triggers with capital letters in their
names, then you must quote them from now on.
- Improved stress tool (http://goo.gl/OTNqiQ)
- New incremental repair option (http://goo.gl/MjohJp, http://goo.gl/f8jSme)
- Incremental replacement of compacted SSTables (http://goo.gl/JfDBGW)
- The row cache can now cache only the head of partitions (http://goo.gl/6TJPH6)
- Off-heap memtables (http://goo.gl/YT7znJ)
- CQL improvements and additions: User-defined types, tuple types, 2ndary
indexing of collections, ... (http://goo.gl/kQl7GW)
Upgrading
---------
- commitlog_sync_batch_window_in_ms behavior has changed from the
maximum time to wait between fsync to the minimum time. We are
working on making this more user-friendly (see CASSANDRA-9533) but in the
meantime, this means 2.1 needs a much smaller batch window to keep
writer threads from starving. The suggested default is now 2ms.
- Rolling upgrades from anything pre-2.0.7 are not supported. Furthermore,
pre-2.0 sstables are not supported. This means that before upgrading
a node to 2.1, this node must be started on 2.0 and
'nodetool upgradesstables' must be run (even in the case
of non-rolling upgrades).
- For size-tiered compaction users, Cassandra now defaults to ignoring
the coldest 5% of sstables. This can be customized with the
cold_reads_to_omit compaction option; 0.0 omits nothing (the old
behavior) and 1.0 omits everything.
- Multithreaded compaction has been removed.
- The counters implementation has been replaced by a safer one with
fewer caveats, but different performance characteristics. You might have
to change your data model to accommodate the new implementation.
(See https://issues.apache.org/jira/browse/CASSANDRA-6504 and the
blog post at http://goo.gl/qj8iQl for details).
- The (per-table) index_interval parameter has been replaced with the
min_index_interval and max_index_interval parameters. index_interval
has been deprecated.
- support for supercolumns has been removed from json2sstable
2.0.11
======
Upgrading
---------
- Nothing specific to this release, but refer to previous entries if you
are upgrading from a previous version.
New features
------------
- DateTieredCompactionStrategy has been added. It is optimized for time series data and
groups data that is written close together in time (see CASSANDRA-6602 for details).
Consider this experimental for now.
2.0.10
======
New features
------------
- CqlPagingRecordReader and CqlPagingInputFormat have both been removed.
Use CqlInputFormat instead.
- If you are using Leveled Compaction, you can now disable doing size-tiered
compaction in L0 by starting Cassandra with -Dcassandra.disable_stcs_in_l0
(see CASSANDRA-6621 for details).
- Shuffle and taketoken have been removed. For clusters that choose to
upgrade to vnodes, creating a new datacenter with vnodes and migrating is
recommended. See http://goo.gl/Sna2S1 for further information.
2.0.9
=====
Upgrading
---------
- Default values for read_repair_chance and local_read_repair_chance have been
swapped. Namely, default read_repair_chance is now set to 0.0, and default
local_read_repair_chance to 0.1.
- Queries selecting only CQL static columns were (mistakenly) not returning one
result per row in the partition. This has been fixed, and a SELECT DISTINCT
can be used when only the static column of a partition needs to be fetched
without fetching the whole partition. But if you use static columns, please
make sure this won't affect you (see CASSANDRA-7305 for details).
2.0.8
=====
New features
------------
- New snitches have been added for users of Google Compute Engine and of
Cloudstack.
Upgrading
---------
- Nothing specific to this release, but please see 2.0.7 if you are upgrading
from a previous version.
2.0.7
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.0.6 if you are upgrading
from a previous version.
2.0.6
=====
New features
------------
- CQL now supports static columns, allows batching multiple conditional updates,
and has a new syntax for slicing over multiple clustering columns
(http://goo.gl/B6qz4j).
- Repair can be restricted to a set of nodes using the -hosts option in nodetool.
- A new 'nodetool taketoken' command relocates tokens with vnodes.
- Hinted handoff can be enabled only for some data-centers (see
hinted_handoff_enabled in cassandra.yaml)
Upgrading
---------
- Nothing specific to this release, but please see 2.0.5 if you are upgrading
from a previous version.
2.0.5
=====
New features
------------
- Batchlog replay can now be throttled, and is throttled by default.
See the batchlog_replay_throttle_in_kb setting in cassandra.yaml.
- Scrub can now optionally skip corrupt counter partitions. Please note
that this will lead to the loss of all the counter updates in the skipped
partition. See the --skip-corrupted option.
Upgrading
---------
- If your cluster began on a version before 1.2, check that your secondary
index SSTables are on version 'ic' before upgrading. If not, run
'nodetool upgradesstables' if on 1.2.14 or later, or run 'nodetool
upgradesstables ks cf' with the keyspace and secondary index named
explicitly otherwise. If you don't do this and upgrade to 2.0.x and it
refuses to start because of 'hf' version files in the secondary index,
you will need to delete/move them out of the way and recreate the index
when 2.0.x starts.
2.0.3
=====
New features
------------
- It's now possible to configure the maximum allowed size of the native
protocol frames (native_transport_max_frame_size_in_mb in the yaml file).
Upgrading
---------
- NaN and Infinity are new valid floating point constants in CQL3 and are now reserved
keywords. In the unlikely case you were using one of them as an identifier (for a
column, a keyspace or a table), you will now have to double-quote them (see
http://cassandra.apache.org/doc/cql3/CQL.html#identifiers for "quoted identifiers").
- The IEndpointStateChangeSubscriber has a new method, beforeChange, that
any custom implementations using the class will need to implement.
2.0.2
=====
New features
------------
- Speculative retry defaults to 99th percentile
(See blog post at http://www.datastax.com/dev/blog/rapid-read-protection-in-cassandra-2-0-2)
- Configurable metrics reporting
(see conf/metrics-reporter-config-sample.yaml)
- Compaction history and stats are now saved to the system keyspace
(system.compaction_history table). You can access the history via the
new 'nodetool compactionhistory' command or CQL.
Upgrading
---------
- Nodetool defaults to Sequential mode for repair operations
2.0.1
=====
Upgrading
---------
- The default memtable allocation has changed from 1/3 of heap to 1/4
of heap. Also, default (single-partition) read and write timeouts
have been reduced from 10s to 5s and 2s, respectively.
2.0.0
=====
Upgrading
---------
- Java 7 is now *required*!
- Upgrading is ONLY supported from Cassandra 1.2.9 or later. This
goes for sstable compatibility as well as network. When
upgrading from an earlier release, upgrade to 1.2.9 first and
run upgradesstables before proceeding to 2.0.
- CAS and new features in CQL such as DROP COLUMN assume that cell
timestamps are microseconds-since-epoch. Do not use these
features if you are using client-specified timestamps with some
other source.
- Replication and strategy options do not accept unknown options anymore.
This was already the case for CQL3 in 1.2 but this is now the case for
thrift too.
- auto_bootstrap of a single-token node with no initial_token will
now pick a random token instead of bisecting an existing token
range. We recommend upgrading to vnodes; failing that, we
recommend specifying initial_token.
- reduce_cache_sizes_at, reduce_cache_capacity_to, and
flush_largest_memtables_at options have been removed from cassandra.yaml.
- CacheServiceMBean.reduceCacheSizes() has been removed.
Use CacheServiceMBean.set{Key,Row}CacheCapacityInMB() instead.
- authority option in cassandra.yaml has been deprecated since 1.2.0,
but it has been completely removed in 2.0. Please use 'authorizer' option.
- ASSUME command has been removed from cqlsh. Use CQL3 blobAsType() and
typeAsBlob() conversion functions instead.
See https://cassandra.apache.org/doc/cql3/CQL.html#blobFun for details.
- Inputting blobs as string constants is now fully deprecated in
favor of blob constants. Make sure to update your applications to use
the new syntax while you are still on 1.2 (which supports both string
and blob constants for blob input) before upgrading to 2.0.
- index_interval has been moved to a ColumnFamily property. You can change the value
with an ALTER TABLE ... WITH statement, and SSTables written after that will
have the new value. When upgrading, Cassandra will pick up the value defined in
cassandra.yaml as the default for existing ColumnFamilies, until you explicitly
set the value for those.
- The deprecated native_transport_min_threads option has been removed from
cassandra.yaml.
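
The entry above on CAS and DROP COLUMN assumes cell timestamps are microseconds-since-epoch. For applications that supply client-specified timestamps, a conversion along these lines produces values in that unit. The function name is hypothetical, and requiring timezone-aware datetimes is a choice made for this sketch.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_cell_timestamp_micros(moment: datetime) -> int:
    """Convert a timezone-aware datetime to microseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = moment - EPOCH
    # Integer arithmetic avoids floating-point rounding of large values.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


if __name__ == "__main__":
    print(to_cell_timestamp_micros(datetime.now(timezone.utc)))
```

A timestamp source that produces milliseconds or seconds would need to be scaled to microseconds before it is mixed with these features.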
Operations
----------
- VNodes are enabled by default in cassandra.yaml. initial_token
for non-vnode deployments has been removed from the example
yaml, but is still respected if specified.
- Major compactions, cleanup, scrub, and upgradesstables will interrupt
any in-progress compactions (but not repair validations) when invoked.
- Disabling autocompactions by setting min/max compaction threshold to 0
has been deprecated; instead, use the nodetool commands 'disableautocompaction'
and 'enableautocompaction' or set the compaction strategy option enabled = false
- ALTER TABLE DROP has been reenabled for CQL3 tables and has new semantics now.
See https://cassandra.apache.org/doc/cql3/CQL.html#alterTableStmt and
https://issues.apache.org/jira/browse/CASSANDRA-3919 for details.
- CAS uses gc_grace_seconds to determine how long to keep unused paxos
state around, with a minimum of three hours.
- A new hints created metric is tracked per target, replacing countPendingHints
- After performance testing for CASSANDRA-5727, the default LCS filesize
has been changed from 5MB to 160MB.
- cqlsh DESCRIBE SCHEMA no longer outputs the schema of system_* keyspaces;
use DESCRIBE FULL SCHEMA if you need the schema of system_* keyspaces.
- CQL2 has been deprecated, and will be removed entirely in 2.2. See
CASSANDRA-5918 for details.
- Commit log archiver now assumes the client time stamp to be in microsecond
precision, during restore. Please refer to commitlog_archiving.properties.
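
The paxos state retention rule in the CAS entry above can be read as taking the larger of gc_grace_seconds and three hours. The sketch below shows that reading only; the function name is hypothetical, and the interpretation as a simple maximum is an assumption based on the wording of the entry.

```python
THREE_HOURS_SECONDS = 3 * 60 * 60


def paxos_state_retention_seconds(gc_grace_seconds: int) -> int:
    """Retention for unused paxos state: gc_grace_seconds, but never
    less than three hours (assumed reading of the release note)."""
    return max(gc_grace_seconds, THREE_HOURS_SECONDS)


if __name__ == "__main__":
    print(paxos_state_retention_seconds(864_000))  # a 10-day gc_grace_seconds
    print(paxos_state_retention_seconds(60))       # a very short gc_grace_seconds
```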
Features
--------
- Lightweight transactions
(http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0)
- Alias support has been added to CQL3 SELECT statement. Refer to
CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html) for details.
- JEMalloc support (see memory_allocator in cassandra.yaml)
- Experimental triggers support. See examples/ for how to use. "Experimental"
means "tied closely to internal data structures; we plan to decouple this in
the future, which will probably break triggers written against this initial
API."
- Numerous improvements to CQL3 and a new version of the native protocol. See
http://www.datastax.com/dev/blog/cql-in-cassandra-2-0 for details.
1.2.11
======
Features
--------
- Added a new consistency level, LOCAL_ONE, that forces all CL.ONE operations to
execute only in the local datacenter.
- New replace_address to supplant the (now removed) replace_token and
replace_node workflows to replace a dead node in place. Works like the
old options, but takes the IP address of the node to be replaced.
1.2.9
=====
Features
--------
- A history of executed nodetool commands is now captured.
It can be found in ~/.cassandra/nodetool.history. Other tools output files
(cli and cqlsh history, .cqlshrc) are now centralized in ~/.cassandra, as well.
- A new sstablesplit utility allows splitting large sstables offline.
1.2.8
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.2.7 if you are upgrading
from a previous version.
1.2.7
=====
Upgrading
---------
- If you have decommissioned a node in the past 72 hours, it is imperative
that you not upgrade until such time has passed, or do a full cluster
restart (not rolling) before beginning the upgrade. This only applies to
decommission, not removetoken.
1.2.6
=====
Upgrading
---------
- hinted_handoff_throttle_in_kb is now reduced by a factor
proportional to the number of nodes in the cluster (see
https://issues.apache.org/jira/browse/CASSANDRA-5272).
- CQL3 syntax for CREATE CUSTOM INDEX has been updated. See CQL3
documentation for details.
1.2.5
=====
Features
--------
- Custom secondary index support has been added to CQL3. Refer to
CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for details and examples.
Upgrading
---------
- The native CQL transport is enabled by default on port 9042.
1.2.4
=====
Upgrading
---------
- 'nodetool upgradesstables' now only upgrades/rewrites sstables that are
not on the current version (which is usually what you want). Use the new
-a flag to recover the old behavior of rewriting all sstables.
Features
--------
- superuser setup delay (10 seconds) can now be overridden using
'cassandra.superuser_setup_delay_ms' property.
1.2.3
=====
Upgrading
---------
- CQL3 used to be case-insensitive for property map keys in ALTER and CREATE
statements. In other words:
CREATE KEYSPACE test WITH replication = { 'CLASS' : 'SimpleStrategy',
'REPLICATION_FACTOR' : '1' }
was allowed. However, this was not consistent with the fact that string
literals are case sensitive everywhere else and, more importantly, it
broke NetworkTopologyStrategy, for which DC names are case sensitive.
Property map keys are now case sensitive, so the statement above should be
changed to:
CREATE KEYSPACE test WITH replication = { 'class' : 'SimpleStrategy',
'replication_factor' : '1' }
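
For tooling that generates such statements, the change above amounts to emitting the well-known option keys in lower case while leaving data center names untouched. The following is a rough sketch under that assumption; the function name and the set of known keys are illustrative only.

```python
# Option keys that must be lower case; any other key (for example a DC
# name used with NetworkTopologyStrategy) is case sensitive and is kept.
KNOWN_OPTION_KEYS = {"class", "replication_factor"}


def normalize_replication_options(options: dict) -> dict:
    """Lower-case the known option keys and keep every other key as-is."""
    normalized = {}
    for key, value in options.items():
        lowered = key.lower()
        normalized[lowered if lowered in KNOWN_OPTION_KEYS else key] = value
    return normalized


if __name__ == "__main__":
    old_style = {"CLASS": "SimpleStrategy", "REPLICATION_FACTOR": "1"}
    print(normalize_replication_options(old_style))
```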
1.2.2
=====
Upgrading
---------
- CQL3 type validation for constants has been fixed, which may require
fixing queries that were relying on the previous loose validation. Please
refer to the CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
and in particular the changelog section for more details. Please note in
particular that inputting blobs as string constants is now deprecated (in
favor of blob constants) and its support will be removed in a future
version.
Features
--------
- Built-in CQL3-based implementations of IAuthenticator (PasswordAuthenticator)
and IAuthorizer (CassandraAuthorizer) have been added. PasswordAuthenticator
stores usernames and hashed passwords in system_auth.credentials table;
CassandraAuthorizer stores permissions in system_auth.permissions table.
- system_auth keyspace is now alterable via ALTER KEYSPACE queries.
The default is SimpleStrategy with replication_factor of 1, but it's
advised to raise RF to at least 3 or 5, since CL.QUORUM is used for all
auth-related queries. It's also possible to change the strategy to NTS.
- Permissions caching with time-based expiration policy has been added to reduce
performance impact of authorization. Permission validity can be configured
using 'permissions_validity_in_ms' setting in cassandra.yaml. The default
is 2000 (2 seconds).
- SimpleAuthenticator and SimpleAuthorizer examples have been removed. Please
look at CassandraAuthorizer/PasswordAuthenticator instead.
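
The permissions caching entry above describes a time-based expiration policy with a default validity of 2000 ms. The sketch below illustrates the general idea only; it is not Cassandra's implementation, and the class name, the loader callback and the injectable clock are all hypothetical.

```python
import time


class PermissionsCache:
    """Cache permission lookups and reload them after a validity period."""

    def __init__(self, loader, validity_ms=2000, clock=time.monotonic):
        self._loader = loader              # callable(user, resource) -> permissions
        self._validity_s = validity_ms / 1000.0
        self._clock = clock                # injectable so expiry can be tested
        self._entries = {}                 # (user, resource) -> (loaded_at, permissions)

    def get(self, user, resource):
        key = (user, resource)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._validity_s:
            return entry[1]                # still valid, skip the lookup
        permissions = self._loader(user, resource)
        self._entries[key] = (now, permissions)
        return permissions


if __name__ == "__main__":
    cache = PermissionsCache(lambda user, resource: {"SELECT"})
    print(cache.get("alice", "ks.table"))
```

The trade-off is that a revoked permission can remain effective until the cached entry expires, which is why the validity period is kept short.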
1.2.1
=====
Upgrading
---------
- In CQL3, date strings are no longer accepted as timeuuid values since a
date string is not a correct representation of a timeuuid. Instead, new
methods (minTimeuuid, maxTimeuuid, now, dateOf, unixTimestampOf) have been
introduced to make working with timeuuids from date strings easy. cqlsh also
no longer displays a timeuuid as a date string (since this is a lossy
representation), but the new dateOf method can be used instead. Please
refer to the reference documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for more detail.
- For client implementors: CQL3 clients using the thrift interface should
have been using the new execute_cql3_query, prepare_cql3_query and execute_prepared_cql3_query
methods since 1.2.0. However, Cassandra 1.2.0 did not complain if CQL3 was set
through set_cql_version but the now CQL2-only methods were used. It now
does.
- Queries that use unrecognized or bad compaction or replication strategy
options are now refused (instead of simply logging a warning).
1.2
===
Upgrading
---------
- IAuthenticator interface has been updated to support dynamic
user creation, modification and removal. Users, even when stored
externally, now have to be explicitly created using
CREATE USER query first. AllowAllAuthenticator and SimpleAuthenticator
have been updated for the new interface, but you'll have to update
your old IAuthenticator implementations for 1.2. To ease this process,
a new abstract LegacyAuthenticator class has been added - subclass it
in your old IAuthenticator implementation and everything should just work
(this only affects users who implemented custom authenticators).
- IAuthority interface has been deprecated in favor of IAuthorizer.
AllowAllAuthority and SimpleAuthority have been renamed to
AllowAllAuthorizer and SimpleAuthorizer, respectively. In order to
simplify the upgrade to the new interface, a new abstract
LegacyAuthorizer has been added - you should subclass it in your
old IAuthority implementation and everything should just work
(this only affects users who implemented custom authorities).
'authority' setting in cassandra.yaml has been renamed to 'authorizer',
'authority' is no longer recognized. This affects all upgrading users.
- 1.2 is NOT network-compatible with versions older than 1.0. That
means if you want to do a rolling, zero-downtime upgrade, you'll need
to upgrade first to 1.0.x or 1.1.x, and then to 1.2. 1.2 retains
the ability to read data files from Cassandra versions at least
back to 0.6, so a non-rolling upgrade remains possible with just
one step.
- The default partitioner for new clusters is Murmur3Partitioner,
which is about 10% faster for index-intensive workloads. Partitioners
cannot be changed once data is in the cluster, however, so if you are
switching to the 1.2 cassandra.yaml, you should change this to
RandomPartitioner or whatever your old partitioner was.
- If you are using counters and upgrading from a version prior to
1.1.6, you should drain existing Cassandra nodes prior to the
upgrade to prevent overcount during commitlog replay (see
CASSANDRA-4782). For non-counter uses, drain is not required
but is a good practice to minimize restart time.
- Tables using LeveledCompactionStrategy will default to not
creating a row-level bloom filter. The default in older versions
of Cassandra differs; you should manually set the false positive
rate to 1.0 (to disable) or 0.01 (to enable, if you make many
requests for rows that do not exist).
- The hints schema was changed from 1.1 to 1.2. Cassandra automatically
snapshots and then truncates the hints column family as part of
starting up 1.2 for the first time. Additionally, upgraded nodes
will not store new hints destined for older (pre-1.2) nodes. It is
therefore recommended that you perform a cluster upgrade when all
nodes are up. Because hints will be lost, a cluster-wide repair (with
-pr) is recommended after upgrade of all nodes.
- The `nodetool removetoken` command (and the corresponding JMX operation)
has been renamed to `nodetool removenode`. This function is
incompatible with the earlier `nodetool removetoken`, and attempting to
remove nodes in this way in a mixed 1.1 (or lower) / 1.2 cluster
is not supported.
- The somewhat ill-conceived CollatingOrderPreservingPartitioner
has been removed. Use Murmur3Partitioner (recommended) or
ByteOrderedPartitioner instead.
- Global option hinted_handoff_throttle_delay_in_ms has been removed.
hinted_handoff_throttle_in_kb has been added instead.
- The default bloom filter fp chance has been increased to 1%.
This will save about 30% of the memory used by the old default.
Existing columnfamilies will retain their old setting.
- The default partitioner (for new clusters; the partitioner cannot be
changed in existing clusters) was changed from RandomPartitioner to
Murmur3Partitioner which provides faster hashing as well as improved
performance with secondary indexes.
- The default version of CQL (and cqlsh) is now CQL3. CQL2 is still
available but you will have to use the thrift set_cql_version method
(that is already supported in 1.1) to use CQL2. For cqlsh, you will need
to use 'cqlsh -2'.
- CQL3 is now considered final in this release. Compared to the beta
version that is part of 1.1, this final version has a few additions
(collections), but also some (incompatible) changes in the syntax for the
options of the create/alter keyspace/table statements. Typically, the
syntax to create a keyspace is now:
CREATE KEYSPACE ks WITH replication = { 'class' : 'SimpleStrategy',
'replication_factor' : 2 };
Also, the consistency level cannot be set in the language anymore, but is
at the protocol level.
Please refer to the CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for details.
- In CQL3, the DROP behavior from ALTER TABLE has currently been removed
(because it was not correctly implemented). We hope to add it back soon
(Cassandra 1.2.1 or 1.2.2)
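The bloom filter notes in the list above trade memory for false positives. As a rough illustration, the sketch below uses the textbook sizing formula for an ideal Bloom filter (bits per key = -ln(p) / (ln 2)^2). This is an assumption for illustration only: Cassandra's own sizing may differ, and the old default false positive chance is not stated here, so it is left as a parameter.

```python
import math


def bits_per_key(fp_chance):
    """Bits per key for an ideally sized Bloom filter with this false positive chance."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def memory_saving(old_fp_chance, new_fp_chance):
    """Fraction of filter memory saved by moving from the old chance to the new one."""
    return 1.0 - bits_per_key(new_fp_chance) / bits_per_key(old_fp_chance)


# Bits per key at the new 1% default under the textbook formula.
print(round(bits_per_key(0.01), 2))
```

Under this formula the saving depends only on the ratio of the logarithms of the two chances, so memory_saving(old, 0.01) can be evaluated for whatever value a given table previously used.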
Features
--------
- Cassandra can now handle concurrent CREATE TABLE schema changes
as well as other updates
- rpc_timeout has been split up to allow finer-grained control
on timeouts for different operation types
- num_tokens can now be specified in cassandra.yaml. This defines the
number of tokens assigned to the host on the ring (default: 1).
Also specifying initial_token will override any num_tokens setting.
- disk_failure_policy allows blacklisting failed disks in JBOD
configuration instead of erroring out indefinitely
- event tracing can be configured per-connection ("trace_next_query")
or globally/probabilistically ("nodetool settraceprobability")
- Atomic batches are now supported server side, where Cassandra will
guarantee that (at the price of pre-writing the batch to another node
first), all mutations in the batch will be applied, even if the
coordinator fails mid-batch.
- new IAuthorizer interface has replaced the old IAuthority. IAuthorizer
allows dynamic permission management via new CQL3 statements:
GRANT, REVOKE, LIST PERMISSIONS. A native implementation storing
the permissions in Cassandra is being worked on and we expect to
include it in 1.2.1 or 1.2.2.
- IAuthenticator interface has been updated to support dynamic user
creation, modification and removal via new CQL3 statements:
CREATE USER, ALTER USER, DROP USER, LIST USERS. A native implementation
that stores users in Cassandra itself is being worked on and is expected to
become part of 1.2.1 or 1.2.2.
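The num_tokens and initial_token interaction described above can be sketched as follows. This is a minimal illustration, not Cassandra's code: choose_tokens is a hypothetical helper, a signed 64-bit token space is assumed, and picking tokens uniformly at random is an assumption about how multiple tokens might be assigned.

```python
import random

# Assumed token space: signed 64-bit integers.
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def choose_tokens(num_tokens=1, initial_token=None, rng=None):
    """Return the sorted tokens a node would claim (hypothetical helper)."""
    if initial_token is not None:
        # An explicit initial_token overrides any num_tokens setting.
        return [initial_token]
    rng = rng or random.Random()
    tokens = set()
    while len(tokens) < num_tokens:
        tokens.add(rng.randint(MIN_TOKEN, MAX_TOKEN))
    return sorted(tokens)


print(choose_tokens(num_tokens=256, initial_token=42))
print(len(choose_tokens(num_tokens=256, rng=random.Random(0))))
```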
1.1.5
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
1.1.4
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
1.1.3
=====
Upgrading
---------
- Running "nodetool upgradesstables" after upgrading is recommended
if you use Counter columnfamilies.
Features
--------
- the cqlsh COPY command can now export to CSV flat files
- added a new tools/bin/token-generator to facilitate generating evenly distributed tokens
1.1.2
=====
Upgrading
---------
- If you have column families using the LeveledCompactionStrategy, you should run scrub on those column families.
Features
--------
- cqlsh has a new COPY command to load data from CSV flat files
1.1.1
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
Features
--------
- Continuous commitlog archiving and point-in-time recovery.
See conf/commitlog_archiving.properties
- Incremental repair by token range, exposed over JMX
1.1
===
Upgrading
---------
- Compression is enabled by default on newly created ColumnFamilies
(and unchanged for ColumnFamilies created prior to upgrading).
- If you are running a multi datacenter setup, you should upgrade to
the latest 1.0.x (or 0.8.x) release before upgrading. Versions
0.8.8 and 1.0.3-1.0.5 generate cross-dc forwarding that is incompatible
with 1.1.
- EACH_QUORUM ConsistencyLevel is only supported for writes and will now
throw an InvalidRequestException when used for reads. (Previous
versions would silently perform a LOCAL_QUORUM read instead.)
- ANY ConsistencyLevel is only supported for writes and will now
throw an InvalidRequestException when used for reads. (Previous
versions would silently perform a ONE read for range queries;
single-row and multiget reads already rejected ANY.)
- The largest mutation batch accepted by the commitlog is now 128MB.
(In practice, batches larger than ~10MB always caused poor
performance due to load volatility and GC promotion failures.)
Larger batches will continue to be accepted but will not be
durable. Consider setting durable_writes=false if you really
want to use such large batches.
- Make sure that global settings: key_cache_{size_in_mb, save_period}
and row_cache_{size_in_mb, save_period} in conf/cassandra.yaml are
used instead of per-ColumnFamily options.
- JMX methods no longer return custom Cassandra objects. Any such methods
will now return standard Maps, Lists, etc.
- Hadoop input and output details are now separated. If you were
previously using methods such as getRpcPort you now need to use
getInputRpcPort or getOutputRpcPort depending on the circumstance.
- CQL changes:
+ Prior to 1.1, you could use KEY as the primary key name in some
select statements, even if the PK was actually given a different
name. In 1.1+ you must use the defined PK name.
- The sliced_buffer_size_in_kb option has been removed from the
cassandra.yaml config file (this option was a no-op since 1.0).
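Since EACH_QUORUM and ANY reads are now rejected by the server instead of being silently downgraded, a client may want to fail fast before sending the request. The sketch below is a hypothetical client-side guard; the function name and the local InvalidRequestException class are stand-ins for illustration and are not part of any Cassandra client API.

```python
# Consistency levels that, per the notes above, are only supported for writes.
WRITE_ONLY_LEVELS = {"ANY", "EACH_QUORUM"}


class InvalidRequestException(Exception):
    """Local stand-in for the error the server would return."""


def validate_read_consistency(level):
    """Return the normalized level, or raise if it cannot be used for reads."""
    normalized = level.upper()
    if normalized in WRITE_ONLY_LEVELS:
        raise InvalidRequestException(
            f"{normalized} is only supported for writes"
        )
    return normalized


print(validate_read_consistency("local_quorum"))
```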
Features
--------
- Concurrent schema updates are now supported, with any conflicts
automatically resolved. Please note that simultaneously running
'CREATE COLUMN FAMILY' operations on different nodes will not
be safe until version 1.2 due to the nature of ColumnFamily
identifier generation; for more details see CASSANDRA-3794.
- The CQL language has undergone a major revision, CQL3, the
highlights of which are covered at [1]. CQL3 is not
backwards-compatible with CQL2, so we've introduced a
set_cql_version Thrift method to specify which version you want.
(The default remains CQL2 at least until Cassandra 1.2.) cqlsh
adds a --cql3 flag to enable this.
[1] http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
- Row-level isolation: multi-column updates to a single row have
always been *atomic* (either all will be applied, or none)
thanks to the CommitLog, but until 1.1 they were not *isolated*
-- a reader may see mixed old and new values while the update
happens.
- Finer-grained control over data directories, allowing a ColumnFamily to
be pinned to a specific volume, e.g. one backed by SSD.
- The bulk loader is no longer a fat client; it can be run from an
existing machine in a cluster.
- A new write survey mode has been added, similar to bootstrap (enabled via
-Dcassandra.write_survey=true), but the node will not automatically join
the cluster. This is useful for cases such as testing different
compaction strategies with live traffic without affecting the cluster.
- Key and row caches are now global, similar to the global memtable
threshold. Manual tuning of cache sizes per-columnfamily is no longer
required.
- Off-heap caches no longer require JNA, and will work out of the box
on Windows as well as Unix platforms.
- Streaming is now multithreaded.
- Compactions may now be aborted via JMX or nodetool.
- The stress tool is not new in 1.1, but it is newly included in
binary builds as well as the source tree
- Hadoop: a new BulkOutputFormat is included which will directly write
SSTables locally and then stream them into the cluster.
YOU SHOULD USE BulkOutputFormat BY DEFAULT. ColumnFamilyOutputFormat
is still around in case for some strange reason you want results
trickling out over Thrift, but BulkOutputFormat is significantly
more efficient.
- Hadoop: KeyRange.filter is now supported with ColumnFamilyInputFormat,
allowing index expressions to be evaluated server-side to reduce
the amount of data sent to Hadoop.
- Hadoop: ColumnFamilyRecordReader has a wide-row mode, enabled via
a boolean parameter to setInputColumnFamily, that pages through
data column-at-a-time instead of row-at-a-time.
- Pig: can use the wide-row Hadoop support, by setting PIG_WIDEROW_INPUT
to true. This will produce each row's columns in a bag.
1.0.8
=====
Upgrading
---------
- Nothing specific to 1.0.8
Other
-----
- Allow configuring socket timeout for streaming
1.0.7
=====
Upgrading
---------
- Nothing specific to 1.0.7; please refer to the instructions for 1.0.6
Other
-----
- Adds new setstreamthroughput to nodetool to configure streaming
throttling
- Adds JMX property to get/set rpc_timeout_in_ms at runtime
- Allow configuring (per-CF) bloom_filter_fp_chance
1.0.6
=====
Upgrading
---------
- This release fixes an issue related to the chunk_length_kb option for
compressed sstables. If you use compression on some column families, it
is recommended after the upgrade to check the value for this option on
these column families (the default value is 64). If the option is
not set correctly, you should update the column family definition,
setting the right value, and then run scrub on the column family.
- Please refer to the instructions for 1.0.5 if coming from an older version.
1.0.5
=====
Upgrading
---------
- 1.0.5 fixes two important regressions in 1.0.4. All information
concerning 1.0.4 is valid for this release, but please avoid upgrading
to 1.0.4.
1.0.4
=====
Upgrading
---------
- Nothing specific to 1.0.4 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- A new upgradesstables command has been added to nodetool. It is very
similar to scrub but without the ability to discard corrupted rows (and
as a consequence it does not snapshot automatically before). This new
command is to be preferred to scrub in all cases where sstables should be
rewritten to the current format for upgrade purposes.
JMX
---
- The paths of the data, commit log and saved cache directories are now
exposed through JMX
- The in-memory bloom filter sizes are now exposed through JMX
1.0.3
=====
Upgrading
---------
- Nothing specific to 1.0.3 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- For non-compressed sstables (compressed sstables already include more
fine-grained checksums), a sha1 for the full sstable is now automatically
created (in a file with suffix -Digest.sha1). It can be used to check the
sstable integrity with sha1sum.
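Where sha1sum is not available, the same check can be sketched in Python with hashlib. The layout of the -Digest.sha1 file is an assumption here: the sketch expects the file to start with the hexadecimal SHA-1, optionally followed by a file name as sha1sum output would be. Both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Hex SHA-1 of a file, read in chunks so large sstables need little memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(data_path, digest_path):
    """Compare a data file against the first token of its digest file (assumed layout)."""
    tokens = Path(digest_path).read_text().split()
    if not tokens:
        return False
    return sha1_of_file(data_path) == tokens[0].lower()
```

A caller would pass the path of the sstable data file and the path of the matching -Digest.sha1 file; a False result means the file content no longer matches the recorded hash.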
1.0.2
=====
Upgrading
---------
- Nothing specific to 1.0.2 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- Cassandra CLI queries now have timing information
1.0.1
=====
Upgrading
---------
- If upgrading from a version prior to 1.0.0, please see the 1.0 Upgrading
section
- For running on Windows as a Service, procrun is no longer distributed
with Cassandra, see README.txt for more information on how to download
it if necessary.
- The names given to snapshot directories have been improved for human
readability. If you had scripts relying on it, you may need to update
them.
1.0
===
Upgrading
---------
- Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling
restart, one node at a time. (0.8.0 or 0.8.1 are NOT network-compatible
with 1.0: upgrade to the most recent 0.8 release first.)
You do not need to bring down the whole cluster at once.
- After upgrading, run nodetool scrub against each node before running
repair, moving nodes, or adding new ones.
- CQL inserts/updates now generate microsecond resolution timestamps
by default, instead of millisecond. THIS MEANS A ROLLING UPGRADE COULD
MIX milliseconds and microseconds, with clients talking to servers
generating milliseconds unable to overwrite the larger microsecond
timestamps. If you are using CQL and this is important for your
application, you can either perform a non-rolling upgrade to 1.0, or
update your application first to use explicit timestamps with the "USING
timestamp=X" syntax.
- The BinaryMemtable bulk-load interface has been removed (use the
sstableloader tool instead).
- The compaction_thread_priority setting has been removed from
cassandra.yaml (use compaction_throughput_mb_per_sec to throttle
compaction instead).
- CQL types bytea and date were renamed to blob and timestamp, respectively,
to conform with SQL norms. CQL type int is now a 4-byte int, not 8
(which is still available as bigint).
- Cassandra 1.0 uses arena allocation to reduce old generation
fragmentation. This means there is a minimum overhead of 1MB
per ColumnFamily plus 1MB per index.
- The SimpleAuthenticator and SimpleAuthority classes have been moved to
the example directory (and are thus not available from the binary
distribution). They never provided actual security and in their current
state are only meant as examples.
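The timestamp warning above follows from last-write-wins conflict resolution: the column with the larger timestamp is kept, whatever unit the number was meant to be in. The sketch below is a simplified model with hypothetical names that ignores tie-breaking; it shows a write made one minute later with a millisecond timestamp losing to an earlier write that used microseconds.

```python
def resolve(existing, incoming):
    """Last-write-wins: keep the (timestamp, value) pair with the larger timestamp."""
    return incoming if incoming[0] > existing[0] else existing


wall_clock_seconds = 1_300_000_000

# Written through a 1.0 node: microsecond resolution.
written_by_new_node = (wall_clock_seconds * 1_000_000, "new")

# Written one minute later through a pre-1.0 node: millisecond resolution.
later_write_by_old_node = ((wall_clock_seconds + 60) * 1_000, "newer")

winner = resolve(written_by_new_node, later_write_by_old_node)
print(winner[1])  # the earlier microsecond write still wins
```

Supplying explicit timestamps in a single unit from the application, as the note suggests, removes the mismatch.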
Features
--------
- SSTable compression is supported through the 'compression_options'
parameter when creating/updating a column family. For instance, you can
create a column family Cf using compression (through the Snappy library)
in the CLI with:
create column family Cf with compression_options={sstable_compression: SnappyCompressor}
SSTable compression is not activated by default but can be activated or
deactivated at any time.
- Compressed SSTable blocks are checksummed to protect against bitrot
- New LevelDB-inspired compaction algorithm can be enabled by setting the
Columnfamily compaction_strategy=LeveledCompactionStrategy option.
Leveled compaction means you only need to keep a few MB of space free for
compaction instead of (in the worst case) 50%.
- Ability to use multiple threads during a single compaction. See
multithreaded_compaction in cassandra.yaml for more details.
- Windows Service ("cassandra.bat install" to enable)
- A dead node may be replaced in a single step by starting a new node
with -Dcassandra.replace_token=<token>. More details can be found at
http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
- It is now possible to repair only the first range returned by the
partitioner for a node with `nodetool repair -pr`. It makes it
easier/possible to repair a full cluster without any work duplication by
running this command on every node of the cluster.
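The reason `nodetool repair -pr` avoids duplicated work can be seen in a small model of the ring. The sketch assumes one token per node and that a node's first (primary) range runs from the previous token on the ring, exclusive, to its own token, inclusive; primary_ranges is a hypothetical helper, not a Cassandra API.

```python
def primary_ranges(tokens):
    """Map each node token to its assumed primary range (previous_token, token]."""
    ring = sorted(tokens)
    # ring[i - 1] wraps around to the last token for the first node.
    return {tok: (ring[i - 1], tok) for i, tok in enumerate(ring)}


ranges = primary_ranges([100, 0, 50])
for token, (start, end) in sorted(ranges.items()):
    print(f"node at {token}: ({start}, {end}]")
```

Each range starts where another one ends, so repairing only the primary range on every node covers the whole ring exactly once.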
New data types
--------------
- decimal
Other
-----
- Hinted Handoff has two major improvements:
- Hint replay is much more efficient thanks to a change in the data model
- Hints are created for all replicas that do not ack a write. (Formerly,
only replicas known to be down when the write started were hinted.)
This means that running with read repair completely off is much more
viable than before, and the default read_repair_chance is reduced from 1.0
("always repair") to 0.1 ("repair 10% of the time").
- The old per-ColumnFamily memtable thresholds
(memtable_throughput_in_mb, memtable_operations_in_millions,
memtable_flush_after_mins) are ignored, in favor of the global
memtable_total_space_in_mb and commitlog_total_space_in_mb settings.
This does not affect client compatibility -- the old options are
still allowed, but have no effect. These options may be removed
entirely in a future release.
- Backlogged compactions will begin five minutes after startup. The 0.8
behavior of never starting compaction until a flush happens is usually
not what is desired, but a short grace period is useful to allow caches
to warm up first.
- The deletion of compacted data files is not performed during Garbage
Collection anymore. This means compacted files will now be deleted
without delay.
0.8.5
=====
Features
--------
- SSTables copied to a data directory can be loaded by a live node through
nodetool refresh (may be handy to load snapshots).
- The configured compaction throughput is exposed through JMX.
Other
-----
- The sstableloader is now bundled with the debian package.
- Repair detects when a participating node is dead and fails instead of
hanging forever.
0.8.4
=====
Upgrading
---------
- Nothing specific to 0.8.4
Other
-----
- This release fixes a bug in counters that could lead to
(significant) over-counts.
- It also fixes a slight upgrade regression from 0.8.3. It is thus advised
to jump directly to 0.8.4 if upgrading from before 0.8.3.
0.8.3
=====
Upgrading
---------
- Token removal has been revamped. Removing tokens in a mixed cluster with
0.8.3 will not work, so the entire cluster will need to be running 0.8.3
first, except for the dead node.
Features
--------
- It is now possible to use thrift asynchronous and
half-synchronous/half-asynchronous servers (see cassandra.yaml for more
details).
- It is now possible to access counter columns through Hadoop.
Other
-----
- This release fixes a regression in 0.8 that could cause a commit log segment to
be deleted even though not all the data it contains has been flushed.
Upgrading from 0.8.* is strongly encouraged.
0.8.2
=====
Upgrading
---------
- 0.8.0 and 0.8.1 shipped with a bug that was setting the
replicate_on_write option for counter column families to false (this
option has no effect on non-counter column families). This is an unsafe
default and 0.8.2 corrects this; the default for replicate_on_write is
now true. It is advised to update your counter column family definitions
if replicate_on_write was incorrectly set to false (before or after
upgrade).
0.8.1
=====
Upgrading
---------
- 0.8.1 is backwards compatible with 0.8, upgrade can be achieved by a
simple rolling restart.
- If upgrading from an earlier version (0.7), please refer to the 0.8 section
for instructions.
Features
--------
- Numerous additions/improvements to CQL (support for counters, TTL, batch
inserts/deletes, index dropping, ...).
- Add two new AbstractTypes (comparator) to support compound keys
(CompositeType and DynamicCompositeType), as well as a ReverseType to
reverse the order of any existing comparator.
- New option to bypass the commit log on some keyspaces (for advanced
users).
Tools
-----
- Add new data bulk loading utility (sstableloader).
0.8
===
Upgrading
---------
- Upgrading from version 0.7.1 or later can be done with a rolling
restart, one node at a time. You do not need to bring down the
whole cluster at once.
- After upgrading, run nodetool scrub against each node before running
repair, moving nodes, or adding new ones.
- Running nodetool drain before shutting down the 0.7 node is
recommended but not required. (Skipping this will result in
replay of entire commitlog, so it will take longer to restart but
is otherwise harmless.)
- 0.8 is fully API-compatible with 0.7. You can continue
to use your 0.7 clients.
- Avro record classes used in map/reduce and Hadoop streaming code have
been removed. Map/reduce can be switched to Thrift by changing
org.apache.cassandra.avro in import statements to
org.apache.cassandra.thrift (no class names change). Streaming support
has been removed for the time being.
- The loadbalance command has been removed from nodetool. For similar
behavior, decommission then rebootstrap with empty initial_token.
- Thrift unframed mode has been removed.
- The addition of key_validation_class means the cli will assume keys
are bytes, instead of strings, in the absence of other information.
See http://wiki.apache.org/cassandra/FAQ#cli_keys for more details.
Features
--------
- added CQL client API and JDBC/DBAPI2-compliant drivers for Java and
Python, respectively (see: drivers/ subdirectory and doc/cql)
- added distributed Counters feature;
see http://wiki.apache.org/cassandra/Counters
- optional intranode encryption; see comments around 'encryption_options'
in cassandra.yaml
- compaction multithreading and rate-limiting; see
'concurrent_compactors' and 'compaction_throughput_mb_per_sec' in
cassandra.yaml
- cassandra will limit total memtable memory usage to 1/3 of the heap
by default. This can be adjusted or disabled with the
memtable_total_space_in_mb option. The old per-ColumnFamily
throughput, operations, and age settings are still respected but
will be removed in a future major release once we are satisfied that
memtable_total_space_in_mb works adequately.
Tools
-----
- stress and py_stress moved from contrib/ to tools/
- clustertool was removed (see
https://issues.apache.org/jira/browse/CASSANDRA-2607 for examples
of how to script nodetool across the cluster instead)
Other
-----
- In the past, sstable2json would write column names and values as
hex strings, and now creates human readable values based on the
comparator/validator. As a result, JSON dumps created with
older versions of sstable2json are no longer compatible with
json2sstable, and imports must be made with a configuration that
is identical to the export.
- manually-forced compactions ("nodetool compact") will do nothing
if only a single SSTable remains for a ColumnFamily. To force it
to compact that anyway (which will free up space if there are
a lot of expired tombstones), use the new forceUserDefinedCompaction
JMX method on CompactionManager.
- most of contrib/ (which was not part of the binary releases)
has been moved either to examples/ or tools/. We plan to move the
rest for 0.8.1.
JMX
---
- By default, JMX now listens on port 7199.
0.7.6
=====
Upgrading
---------
- Nothing specific to 0.7.6, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
0.7.5
=====
Upgrading
---------
- Nothing specific to 0.7.5, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
Changes
-------
- system_update_column_family no longer snapshots before applying
the schema change. (_update_keyspace never did. _drop_keyspace
and _drop_column_family continue to snapshot.)
- added memtable_flush_queue_size option to cassandra.yaml to
avoid blocking writes when multiple column families (or a column
family with indexes) are flushed at the same time.
- allow overriding initial_token, storage_port and rpc_port using
system properties
0.7.4
=====
Upgrading
---------
- Nothing specific to 0.7.4, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
Features
--------
- Output to Pig is now supported as well as input
0.7.3
=====
Upgrading
---------
- 0.7.1 and 0.7.2 shipped with a bug that caused incorrect row-level
bloom filters to be generated when compacting sstables generated
with earlier versions. This would manifest in IOExceptions during
column name-based queries. 0.7.3 provides "nodetool scrub" to
rebuild sstables with correct bloom filters, with no data lost.
(If your cluster was never on 0.7.0 or earlier, you don't have to
worry about this.) Note that nodetool scrub will snapshot your
data files before rebuilding, just in case.
0.7.1
=====
Upgrading
---------
- 0.7.1 is completely backwards compatible with 0.7.0. Just restart
each node with the new version, one at a time. (The cluster does
not all need to be upgraded simultaneously.)
Features
--------
- added flush_largest_memtables_at and reduce_cache_sizes_at options
to cassandra.yaml as an escape valve for memory pressure
- added option to specify -Dcassandra.join_ring=false on startup
to allow "warm spare" nodes or performing JMX maintenance before
joining the ring
Performance
-----------
- Disk writes and sequential scans avoid polluting page cache
(requires JNA to be enabled)
- Cassandra performs writes efficiently across datacenters by
sending a single copy of the mutation and having the recipient
forward that to other replicas in its datacenter.
- Improved network buffering
- Reduced lock contention on memtable flush
- Optimized supercolumn deserialization
- Zero-copy reads from mmapped sstable files
- Explicitly set higher JVM new generation size
- Reduced i/o contention during saving of caches
0.7.0
=====
Features
--------
- Secondary indexes (indexes on column values) are now supported
- Row size limit increased from 2GB to 2 billion columns. Rows
are no longer read into memory during compaction.
- Keyspace and ColumnFamily definitions may be added and modified live
- Streaming data for repair or node movement no longer requires
anticompaction step first
- NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for
use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC. See comments
in `cassandra.yaml`.
- Optional per-Column time-to-live field allows expiring data without
having to issue explicit remove commands
- `truncate` thrift method allows clearing an entire ColumnFamily at once
- Hadoop OutputFormat and Streaming [non-jvm map/reduce via stdin/out]
support
- Up to 8x faster reads from row cache
- A new ByteOrderedPartitioner supports bytes keys with arbitrary content,
and orders keys by their byte value. This should be used in new
deployments instead of OrderPreservingPartitioner.
- Optional round-robin scheduling between keyspaces for multitenant
clusters
- Dynamic endpoint snitch mitigates the impact of impaired nodes
- New `IntegerType`, faster than LongType, which allows integers with
both fewer and more bits than Long's 64
- A revamped authentication system that decouples authorization and
allows finer-grained control of resources.
Upgrading
---------
The Thrift API has changed in incompatible ways; see below, and refer
to http://wiki.apache.org/cassandra/ClientOptions for a list of
higher-level clients that have been updated to support the 0.7 API.
The Cassandra inter-node protocol is incompatible with 0.6.x
releases (and with 0.7 beta1), meaning you will have to bring your
cluster down prior to upgrading: you cannot mix 0.6 and 0.7 nodes.
The hints schema was changed from 0.6 to 0.7. Cassandra automatically
snapshots and then truncates the hints column family as part of
starting up 0.7 for the first time.
Keyspace and ColumnFamily definitions are stored in the system
keyspace, rather than the configuration file.
The process to upgrade is:
1) run "nodetool drain" on _each_ 0.6 node. When drain finishes (log
message "Node is drained" appears), stop the process.
2) Convert your storage-conf.xml to the new cassandra.yaml using
"bin/config-converter".
3) Rename any of your keyspace or column family names that do not adhere
to the '^\w+' regex convention.
4) Start up your cluster with the 0.7 version.
5) Initialize your Keyspace and ColumnFamily definitions using
"bin/schematool <host> <jmxport> import". _You only need to do
this to one node_.
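Step 3 of the upgrade process can be checked ahead of time with a short script. The sketch below rests on two assumptions: that the intent of the '^\w+' convention is for the whole name to consist of word characters (as written, the pattern only anchors the start of the name), and that only ASCII word characters are meant. illegal_names is a hypothetical helper and the example names are made up.

```python
import re

# Assumed reading of the '^\w+' convention: the entire name is ASCII word characters.
LEGAL_NAME = re.compile(r"\w+", re.ASCII)


def illegal_names(names):
    """Return the names that would need renaming before the upgrade."""
    return [name for name in names if LEGAL_NAME.fullmatch(name) is None]


print(illegal_names(["Keyspace1", "my-keyspace", "user data", "Standard_1"]))
```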
Thrift API
----------
- The Cassandra server now defaults to framed mode, rather than
unframed. Unframed is obsolete and will be removed in the next
major release.
- The Cassandra Thrift interface file has been updated for Thrift 0.5.
If you are compiling your own client code from the interface, you
will need to upgrade the Thrift compiler to match.
- Row keys are now bytes: keys stored by versions prior to 0.7.0 will be
returned as UTF-8 encoded bytes. OrderPreservingPartitioner and
CollatingOrderPreservingPartitioner continue to expect that keys contain
UTF-8 encoded strings, but RandomPartitioner now works on any key data.
- keyspace parameters have been replaced with the per-connection
set_keyspace method.
- The return type for login() is now AccessLevel.
- The get_string_property() method has been removed.
- The get_string_list_property() method has been removed.
Configuration
-------------
- Configuration file renamed to cassandra.yaml and log4j.properties to
log4j-server.properties
- PropertyFileSnitch configuration file renamed to
cassandra-topology.properties
- The ThriftAddress and ThriftPort directives have been renamed to
RPCAddress and RPCPort respectively.
- EndPointSnitch was renamed to RackInferringSnitch. A new SimpleSnitch
has been added.
- RackUnawareStrategy and RackAwareStrategy have been renamed to
SimpleStrategy and OldNetworkTopologyStrategy, respectively.
- RowWarningThresholdInMB replaced with in_memory_compaction_limit_in_mb
- GCGraceSeconds is now per-ColumnFamily instead of global
- Keyspace and column family names that do not conform to a '^\w+' regex
are considered illegal.
- Keyspace and column family definitions will need to be loaded via
"bin/schematool <host> <jmxport> import". _You only need to do this to
one node_.
- In addition to an authenticator, an authority must be configured as
well. Users of SimpleAuthenticator should use SimpleAuthority for this
value (the default is AllowAllAuthority, which corresponds with
AllowAllAuthenticator).
- The format of access.properties has changed, see the sample configuration
conf/access.properties for documentation on the new format.
JMX
---
- StreamingService moved from o.a.c.streaming to o.a.c.service
- GMFD renamed to GOSSIP_STAGE
- {Min,Mean,Max}RowCompactedSize renamed to {Min,Mean,Max}RowSize
since it no longer has to wait until compaction to be computed
Other
-----
- If extending AbstractType, make sure you follow the singleton pattern
followed by Cassandra core AbstractType classes: provide a public
static final variable called 'instance'.
0.6.6
=====
Upgrading
---------
- As part of the cache-saving feature, a third directory
(along with data and commitlog) has been added to the config
file. You will need to set and create this directory
when restarting your node into 0.6.6.
0.6.1
=====
Upgrading
---------
- We try to keep minor versions 100% compatible (data format,
commitlog format, network format) within the major series, but
we introduced a network-level incompatibility in 0.6.1.
Thus, if you are upgrading from 0.6.0 to any higher version
(0.6.1, 0.6.2, etc.) then you will need to restart your entire
cluster with the new version, instead of being able to do a
rolling restart.
0.6.0
=====
Features
--------
- row caching: configure with the RowsCached attribute in
ColumnFamily definition
- Hadoop map/reduce support: see contrib/word_count for an example
- experimental authentication support, described under
Authenticator in storage.conf
Configuration
-------------
- MemtableSizeInMB has been replaced by MemtableThroughputInMB which
triggers a memtable flush when the specified amount of data has
been written, including overwrites.
- MemtableObjectCountInMillions has been replaced by the
MemtableOperationsInMillions directive which causes a memtable flush
to occur after the specified number of operations.
- Like MemtableSizeInMB, BinaryMemtableSizeInMB has been replaced by
BinaryMemtableThroughputInMB.
- Replication factor is now per-keyspace, rather than global.
- KeysCachedFraction is deprecated in favor of KeysCached
- RowWarningThresholdInMB added, to warn before very large rows
get big enough to threaten node stability
Thrift API
----------
- removed deprecated get_key_range method
- added batch_mutate method
- deprecated multiget and batch_insert methods in favor of
multiget_slice and batch_mutate, respectively
- added ConsistencyLevel.ANY, for when you want write
availability even when it may not be readable immediately.
Unlike CL.ZERO, though, it will throw an exception if
it cannot be written *somewhere*.
JMX metrics
-----------
- read and write statistics are reported as lifetime totals,
instead of averages over the last minute. Averages since the last
request are also available for convenience.
- cache hit rate statistics are now available from JMX under
org.apache.cassandra.db.Caches
- compaction JMX metrics are moved to
org.apache.cassandra.db.CompactionManager. PendingTasks is now
a much better estimate of compactions remaining, and the
progress of the current compaction has been added.
- commitlog JMX metrics are moved to org.apache.cassandra.db.Commitlog
- progress of data streaming during bootstrap, loadbalance, or other
data migration, is available under
org.apache.cassandra.streaming.StreamingService.
See http://wiki.apache.org/cassandra/Streaming for details.
Installation/Upgrade
--------------------
- 0.6 network traffic is not compatible with earlier versions. You
will need to shut down all your nodes at once, upgrade, then restart.
0.5.0
=====
0. The commitlog format has changed (but sstable format has not).
When upgrading from 0.4, empty the commitlog either by running
bin/nodeprobe flush on each machine and waiting for the flush to finish,
or simply remove the commitlog directory if you only have test data.
(If more writes come in after the flush command, starting 0.5 will error
out; if that happens, just go back to 0.4 and flush again.)
The format changed twice: from 0.4 to beta1, and from beta2 to RC1.
.5 The gossip protocol has changed, meaning 0.5 nodes cannot coexist
in a cluster of 0.4 nodes or vice versa; you must upgrade your
whole cluster at the same time.
1. Bootstrap, move, load balancing, and active repair have been added.
See http://wiki.apache.org/cassandra/Operations. When upgrading
from 0.4, leave autobootstrap set to false for the first restart
of your old nodes.
2. Performance improvements across the board, especially on the write
path (over 100% improvement in stress.py throughput).
3. Configuration:
- Added "comment" field to ColumnFamily definition.
- Added MemtableFlushAfterMinutes, a global replacement for the
old per-CF FlushPeriodInMinutes setting
- Key cache settings
4. Thrift:
- Added get_range_slice, deprecating get_key_range
0.4.2
=====
1. Improve default garbage collector options significantly --
throughput will be 30% higher or more.
0.4.1
=====
1. SnapshotBeforeCompaction configuration option allows snapshotting
before each compaction, which allows rolling back to any version
of the data.
0.4.0
=====
1. On-disk data format has changed to allow billions of keys/rows per
node instead of only millions. The new format is incompatible with 0.3;
see 0.3 notes below for how to import data from a 0.3 install.
2. Cassandra now supports multiple keyspaces. Typically you will have
one keyspace per application, allowing applications to be able to
create and modify ColumnFamilies at will without worrying about
collisions with others in the same cluster.
3. Many Thrift API changes and documentation. See
http://wiki.apache.org/cassandra/API
4. Removed the web interface in favor of JMX and bin/nodeprobe, which
has significantly enhanced functionality.
5. Renamed configuration "<Table>" to "<Keyspace>".
6. Added commitlog fsync; see "<CommitLogSync>" in configuration.
0.3.0
=====
1. With enough and large enough keys in a ColumnFamily, Cassandra will
run out of memory trying to perform compactions (data file merges).
The size of what is stored in memory is (S + 16) * (N + M) where S
is the size of the key (usually 2 bytes per character), N is the
number of keys, and M is the map overhead (which can be estimated
at around 32 bytes per key).
So, if you have 10-character keys and 1GB of headroom in your heap
space for compaction, you can expect to store about 17M keys
before running into problems.
See https://issues.apache.org/jira/browse/CASSANDRA-208
2. Because fixing #1 requires a data file format change, 0.4 will not
be binary-compatible with 0.3 data files. A client-side upgrade
can be done relatively easily with the following algorithm:
    for key in old_client.get_key_range(everything):
        columns = old_client.get_slice or get_slice_super(key, all columns)
        new_client.batch_insert or batch_insert_super(key, columns)
The inner loop can be trivially parallelized for speed.
3. Commitlog does not fsync before reporting a write successful.
Using blocking writes mitigates this to some degree, since all
nodes that were part of the write quorum would have to fail
before sync for data to be lost.
See https://issues.apache.org/jira/browse/CASSANDRA-182
Additionally, row size (that is, all the data associated with a single
key in a given ColumnFamily) is limited by available memory, because
compaction deserializes each row before merging.
See https://issues.apache.org/jira/browse/CASSANDRA-16
DSE 5.1.2 release notes
Release notes for DataStax Enterprise 5.1.2.
July 18, 2017
- 5.1.2 components
- 5.1.2 highlights
- 5.1.2 changes and enhancements
- 5.1.2 resolved issues
- 5.1.2 CHANGES.txt
- 5.1.2 NEWS.txt
5.1.2 components
- Apache Cassandra™ 3.11.0.1758
- Apache Solr™ 6.0.1.0.1716
- DataStax Spark Cassandra Connector 2.0.3
- DSEFS 5.1.2
- TinkerPop 3.2.6-20170623-d59f0b40
- Certain Hadoop libraries
5.1.2 highlights
5.1.2 DataStax Enterprise core highlights
DataStax Enterprise 5.1.2 includes CASSANDRA-13004, which fixes corruption that can occur when a column is added to or dropped from a table. (DSP-13684)
Upgrading from | Upgrading to | Upgrade instructions |
---|---|---|
5.0.0 to 5.0.8 | 5.1.2 or later | See the instructions under "Upgrading from DSE 5.0.0 to 5.0.8" and "Upgrading from DSE 5.1.0 and 5.1.1 to DSE 5.1.2 only" in the "Preparing to upgrade" section of "Upgrading from DataStax Enterprise 5.0 to 5.1". |
5.1.0 to 5.1.1 | 5.1.2 or later | See "Preparing to upgrade" in "Upgrading DataStax Enterprise patch releases". |
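5.1.2 DSE Analytics and DSEFS highlights
- When DSEFS is enabled (the default on all Analytics nodes in 5.1) and the DSEFS working directory or data directories cannot be found and cannot be created, DSE does not start. In earlier releases DSE started, but Analytics nodes later developed problems that were hard to detect. (DSP-13238)
- DSEFS performance with authorization enabled is improved. The new advanced DSEFS options in dse.yaml, query_cache_size and query_cache_expire_after_ms, tune the caching of authentication information. (DSP-13107)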
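5.1.2 DSE Graph highlights
- Performance improvement: Gremlin script compilation. (DSP-12789)
- Vertex property retrieval is significantly improved. (DSP-13467)
- Partitioned vertex tables (PVT) are deprecated. (DSP-13501)
- Graph Loader: loading geospatial data types is now supported. (DGL-225)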
5.1.2 DSE Search highlights
- Reindexing performance is improved. (DSP-13751), (DSP-12923)
- Solr indexing management tasks were fixed. (DSP-13778), (DSP-10088), (DSP-13793)
5.1.2 changes and enhancements
5.1.2 DataStax Enterprise changes and enhancements
- Jackson deserializer vulnerability. (DSP-13414)
- The new nodetool sjk command was added for troubleshooting and monitoring. It runs Swiss Java Knife (SJK) on the local node. (DSP-13544)
- org.codahale.metrics was extended in o.a.c.metrics to fix metrics reporters. (DSP-13840)
- Range queries are now handled reliably during filtering. (DSP-13840)
- A single column can now be mapped to multiple SASI indexes. (DSP-13045)
- pstmts are now properly evicted from the prepared statements cache. (DSP-13770)
- The nodetool sequence batch capability was added. (DSP-13770)
- cqlsh now shows the correct protocol version. (DSP-13544)
- Null assertion in MemtablePostFlush. (DSP-13544)
5.1.2 DSE Analytics changes and enhancements
- When ALLOW_SPARK_HOME=true, the SPARK_HOME environment variable can be used to specify a user-specific Spark home directory. (DSP-8100)
- Lease manager log messages were changed to improve Spark Master troubleshooting. (DSP-12846)
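5.1.2 DSE Graph changes and enhancements
- A file match pattern can now be specified when loading a directory. (DGL-177)
- Graph Loader: loading geospatial data types is now supported. (DGL-225)
- The error message shown when Spark submit hits a connection problem during initialization is improved. (DSP-12632)
- Partitioned vertex tables (PVT) are deprecated. (DSP-13501)
- Graph query requests sent through TinkerPop drivers or drivers that use the Cassandra native protocol require a change if they pass more than 256 parameters. Passing many parameters in a request is an anti-pattern because script evaluation time grows proportionally. DataStax recommends reducing the number of parameters to shorten script compilation time. Consider other ways to parameterize the script, such as passing a single map. If a graph query request needs many arguments, pass a list. To pass more than 256 parameters, increase the max_query_params option in dse.yaml. (DSP-12789)
- DseQueryHandler is no longer instantiated for every statement in a graph. (DSP-13287)
- GraphSON 2.0 serialization performance is improved. (DSP-13467)
- DSEFS keyspaces are now shown in Spark SQL. (DSP-13510)
- The provisioning state during graph creation was eliminated. A graph is either up or does not exist. (DSP-13686)
- Schema migration is improved. Schema provisioning was eliminated. (DSP-13665)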
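The guidance on graph query requests with more than 256 parameters (DSP-12789) suggests passing a single map instead of many individual bindings. The following is a minimal client-side sketch of that idea, not a DataStax driver API: the helper `pack_bindings` and the binding name `params` are hypothetical, and the limit of 256 is simply the default mentioned in the note (the effective value comes from max_query_params in dse.yaml).

```python
from typing import Any, Dict, Tuple

# Default limit named in the release note; the effective value is whatever
# max_query_params is set to in dse.yaml.
MAX_QUERY_PARAMS = 256


def pack_bindings(
    bindings: Dict[str, Any], limit: int = MAX_QUERY_PARAMS
) -> Tuple[Dict[str, Any], bool]:
    """Collapse many named bindings into one map binding when over the limit.

    Returns (bindings_to_send, packed). When packed is True, the script has to
    read each value from the single map binding (hypothetically named
    'params') instead of from a top-level binding.
    """
    if len(bindings) <= limit:
        return dict(bindings), False
    return {"params": dict(bindings)}, True


if __name__ == "__main__":
    many = {f"p{i}": i for i in range(300)}
    to_send, packed = pack_bindings(many)
    print(len(to_send), packed)
```

With this shape the request carries one parameter regardless of how many values it holds, so the script text stays the same from request to request.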
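5.1.2 DSEFS changes and enhancements
- Authorization performance is improved. The new advanced DSEFS options query_cache_size and query_cache_expire_after_ms were added to dse.yaml. (DSP-13107)
- The error message shown when DSEFS storage space runs low is improved. (DSP-13324)
- The DSEFS keyspace is created with SimpleStrategy and a replication factor of 1. After the cluster starts for the first time, you must alter the keyspace to use NetworkTopologyStrategy with an appropriate RF. (DSP-12662)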
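5.1.2 DSE Search changes and enhancements
- rtOffheapPostings is included by default, set to true, in the demos and in auto-generated solrconfig.xml files. (DSP-10088, DSP-13228)
- Reindexing during repair is significantly faster because individual partition indexing tasks run in parallel. The default Cassandra post-repair index builder is overridden. (DSP-12923)
- The default filter cache settings were changed. (DSP-13153)
- The Tika functionality bundled with Apache Solr is deprecated. Use the standalone Apache Tika project instead. (DSP-14002)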
5.1.2 resolved issues
5.1.2 DataStax Enterprise resolved issues
- CqlSlowLogPlugin sometimes cannot determine the table name for a DropIndexStatement when the index has already been dropped. (DSP-11811)
- The installer workload override does not work with No Services + Analytics. (DSP-13475)
5.1.2 DSE Advanced Replication resolved issues
None.
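5.1.2 DSE Analytics resolved issues
- Default and user-specified JVM options for Spark executors and drivers are easy to confuse. (DSP-12857)
- The default value of DSEFS min_free_space in dse.yaml changes to 5 GB. (DSP-13178)
- The Spark shell cannot be interrupted and keeps retrying when it cannot connect to DSE. (DSP-13339)
- Spark application configuration connections need to use a load balancing policy that selects only nodes running Spark in the target DC. (DSP-13325)
- If the monitoring DSE process exits while Spark drivers and executors are being stopped, a race condition leaves Spark executors running after the worker has stopped. (DSP-13688)
- The consistency level is sometimes incorrect on MultipleRetry policy retries. (DSP-13542)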
5.1.2 DSEFS resolved issues
- When DSEFS is enabled and fails to start because of a configuration problem, DSE also does not start. (DSP-13238)
5.1.2 DSE Graph resolved issues
- Graph Loader loads entire graphson and gryo files into memory. (DGL-209)
- String dates are now parsed correctly. (DSP-12259)
- A race condition can cause a Spark executor creation loop during DSE node shutdown. (DSP-12589)
- schema.describe() now orders propertyKeys correctly. (DSP-12761)
- Gremlin script compilation is slow. If a graph query request passes more than 256 parameters, review the required changes. (DSP-12789)
- gremlin-console does not initialize correctly when started in debug mode. (DSP-12900)
- Index ranking changed to the order search index < secondary index < MV index. (DSP-13212)
- Graph profile() results should show CQL by default in the console as well. (DSP-13293)
- Empty result sets are cached for queries that returned no elements. (DSP-13342)
- GraphFrame can group by properties that might be null. (DSP-13406)
- DseGraphFrame must be serializable for graph data export in the Spark shell. (DSP-13427)
- Backward compatibility issue with .select() .by() or local(). (DSP-13607)
- DseGraphFrame.updateEdges() now inserts single cardinality edges correctly. (DSP-13865)
- The Spark shell appears to hang indefinitely when running a graph frame delete command. (DSP-13795)
5.1.2 DSE Search resolved issues
- The Gremlin inside() function no longer uses the search index. (DSP-13553)
- CREATE SEARCH INDEX fails with custom resources. (DSP-13778)
- The error message shown when running dse cassandra-stop with multiple DSE processes is improved. (DSP-12938)
- Solr 2i invalidation deadlocks when invalidation runs while no index is registered. (DSP-13751)
- Auto-generation options need to be validated correctly. (DSP-13793)
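Hadoop libraries
Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0 and removed in DSE 5.1. Because Hadoop was removed in DSE 5.1 and later, Hadoop services that were previously included in DSE, such as the MapReduce JobTracker and TaskTracker, can no longer be started in DSE.
However, DSE still supports built-in Spark from DSE 4.5 onward and Bring-Your-Own-Spark (BYOS) from DSE 5.0 onward. Because Spark uses certain Hadoop libraries on the server and on the client, DSE continues to ship the Hadoop libraries that Spark and BYOS need.
To view the bundled Hadoop libraries, see "DataStax Enterprise 5.1.x third-party software".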
Package installations and Installer-Services installations | /etc/dse/dse.yaml |
Tarball installations and Installer-No Services installations | installation_location/resources/dse/conf/dse.yaml |
DSE 5.1.2 CHANGES.txt
List of production-certified changes to Apache Cassandra™ 3.11.0 that are included in DataStax Enterprise 5.1.2.
DataStax Enterprise (DSE) 5.1.2 includes all changes from earlier DSE releases plus the following production-certified changes made to Apache Cassandra™ 3.11.0. These changes are listed in CHANGES.txt.
- Properly evict pstmts from the prepared statements cache (CASSANDRA-13641)
- Allow different NUMACTL_ARGS to be passed in (CASSANDRA-13557)
- Fix secondary index queries on COMPACT tables (CASSANDRA-13627)
- Nodetool listsnapshots output is missing a newline when there are no snapshots (CASSANDRA-13568)
- Fix toJSONString for the UDT, tuple, and collection types (CASSANDRA-13592)
- Fix validation of nested tuples/UDTs (CASSANDRA-13646)
- Replace string comparison with regex/number checks in the MessagingService test (CASSANDRA-13216)
- Fix formatting of duration columns in CQLSH (CASSANDRA-13549)
- Prevent int overflow when calculating the large partition warning size (CASSANDRA-13172)
- Ensure a consistent view of partition columns between coordinator and replica in ColumnFilter (CASSANDRA-13004)
- Failed unregistering of mbean during drop keyspace (CASSANDRA-13346)
- nodetool scrub/cleanup/upgradesstables exit code is wrong (CASSANDRA-13542)
- Fix the reported number of sstable data files accessed per read (CASSANDRA-13120)
- Fix schema digest mismatch during rolling upgrades from versions before 3.0.12 (CASSANDRA-13559)
- Upgrade JNA version to 4.4.0 (CASSANDRA-13072)
- Interned ColumnIdentifiers should use minimal ByteBuffers (CASSANDRA-13533)
- ReverseIndexedReader may drop rows during a 2.1 to 3.0 upgrade (CASSANDRA-13525)
- Fix repair process violating start/end token limits for small ranges (CASSANDRA-13052)
- Nodes started with join_ring=False should be able to serve requests when authentication is enabled (CASSANDRA-11381)
- cqlsh COPY FROM: increment the error count only for failures, not for attempts (CASSANDRA-13209)
- Fix the problem with duplicated rows when using paging with SASI (CASSANDRA-13302)
- Allow CONTAINS statements to filter on the partition key and its parts (CASSANDRA-13275)
- Fall back to even range calculation in clusters with vnodes when tokens are distributed unevenly (CASSANDRA-13229)
- Fix duration type validation to prevent overflow (CASSANDRA-13218)
- Forbid unsupported creation of SASI indexes over partition key columns (CASSANDRA-13228)
- Reject multiple values for a key in the CQL grammar (CASSANDRA-13369)
- UDA fails without input rows (CASSANDRA-13399)
- Fix compaction-stress by using daemonInitialization (CASSANDRA-13188)
- V5 protocol flags decoding is broken (CASSANDRA-13443)
- Use a write lock, not a read lock, for removing sstables from compaction strategies (CASSANDRA-13422)
- Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors (CASSANDRA-13329)
- Avoid rebuilding SASI indexes that contain no values (CASSANDRA-12962)
- Add charset to the analyzer input stream (CASSANDRA-13151)
- Remove invalid characters from StandardTokenizerImpl.jflex (CASSANDRA-13417)
- Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
- Tracing payload is not passed from QueryMessage to the tracing session (CASSANDRA-12835)
- Add storage port options to sstableloader (CASSANDRA-13518)
- Properly handle quoted index names in cqlsh DESCRIBE output (CASSANDRA-12847)
- Avoid reading the static row twice from old-format sstables (CASSANDRA-13236)
- Fix NPE in StorageService.excise() (CASSANDRA-13163)
- Expire OutboundTcpConnection messages by a single thread (CASSANDRA-13265)
- Fail repair if insufficient responses are received (CASSANDRA-13397)
- Fix SSTableLoader failure when the loaded table contains dropped columns (CASSANDRA-13276)
- Avoid name clashes in CassandraIndexTest (CASSANDRA-13427)
- Handle partially written hint files (CASSANDRA-12728)
- Interrupt replaying hints on decommission (CASSANDRA-13308)
- Fix NPE issue in StorageService (CASSANDRA-13060)
- Make reading of range tombstones more reliable (CASSANDRA-12811)
- Fix startup problems caused by schema tables not being completely flushed (CASSANDRA-12213)
- Fix view builder bug that can filter out data on restart (CASSANDRA-13405)
- Fix 2i page size calculation when there are no regular columns (CASSANDRA-13400)
- Fix the conversion of 2.X expired rows without regular column data (CASSANDRA-13395)
- Fix hint delivery when using ext+internal IPs with prefer_local enabled (CASSANDRA-13020)
- Nodetool upgradesstables/scrub/compact ignores system tables (CASSANDRA-13410)
- Fix schema version calculation for rolling upgrades (CASSANDRA-13441)
- Avoid starting the gossiper in RemoveTest (CASSANDRA-13407)
- Fix weightedSize() for row-cache reported by JMX and NodeTool (CASSANDRA-13393)
- Fix JVM metric names (CASSANDRA-13103)
- Coalescing strategy sleeps too much (CASSANDRA-13090)
- Fix 2ndary index queries on partition keys for tables with static columns (CASSANDRA-13147)
- Fix ParseError unhashable type list in cqlsh copy from (CASSANDRA-13364)
DSE 5.1.2 NEWS.txt
General upgrade advice for DataStax Enterprise 5.1
GENERAL UPGRADING ADVICE FOR ANY VERSION
========================================
Snapshotting is fast (especially if you have JNA installed) and takes
effectively zero disk space until you start compacting the live data
files again. Thus, best practice is to ALWAYS snapshot before any
upgrade, just in case you need to roll back to the previous version.
(Cassandra version X + 1 will always be able to read data files created
by version X, but the inverse is not necessarily the case.)
When upgrading major versions of Cassandra, you will be unable to
restore snapshots created with the previous major version using the
'sstableloader' tool. You can upgrade the file format of your snapshots
using the provided 'sstableupgrade' tool.
3.11.1
======
Upgrading
---------
- Nothing specific to this version but please see previous upgrading sections,
especially if you are upgrading from 2.2.
3.11.0
======
Upgrading
---------
- ALTER TABLE (ADD/DROP COLUMN) operations concurrent with a read might
result in data corruption (see CASSANDRA-13004 for more details).
Fixing this bug required a messaging protocol version bump. By default,
Cassandra 3.11 will use version 3014 for messaging.
Since Schema Migrations rely on the exact messaging protocol version
match between nodes, if you need schema changes during the upgrade
process, you have to start your nodes with `-Dcassandra.force_3_0_protocol_version=true`
first, in order to temporarily force a backwards compatible protocol.
After the whole cluster is upgraded to 3.11, do a rolling
restart of the cluster without setting that flag.
3.11 nodes with and without the flag set will be able to do schema
migrations with other 3.x and 3.0.x releases.
While running the cluster with the flag set to true on 3.11 (in
compatibility mode), avoid adding or removing any columns to/from
existing tables.
If your cluster can do without schema migrations during the upgrade
time, just start the cluster normally without setting aforementioned
flag.
If you are upgrading from 3.0.14+ (of 3.0.x branch), you do not have
to set any flag while upgrading to ensure schema migrations.
- The NativeAccessMBean isAvailable method will only return true if the
native library has been successfully linked. Previously it was returning
true if JNA could be found but was not taking into account link failures.
- Primary ranges in the system.size_estimates table are now based on the keyspace
replication settings and adjacent ranges are no longer merged (CASSANDRA-9639).
- In 2.1, the default for otc_coalescing_strategy was 'DISABLED'.
In 2.2 and 3.0, it was changed to 'TIMEHORIZON', but that value was shown
to be a performance regression. The default for 3.11.0 and newer has
been reverted to 'DISABLED'. Users upgrading from Cassandra 2.2 or 3.0 should
be aware that the default has changed.
- The StorageHook interface has been modified to allow to retrieve read information from
SSTableReader (CASSANDRA-13120).
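The rules in the 3.11.0 upgrade note about `-Dcassandra.force_3_0_protocol_version=true` can be summarized as a small decision helper. This is only a sketch of one reading of the note: the function names are hypothetical, and it assumes version strings are plain dotted numbers such as "3.0.13".

```python
from typing import List, Tuple

# The flag quoted in the 3.11.0 upgrade note.
COMPAT_FLAG = "-Dcassandra.force_3_0_protocol_version=true"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.0.14' into (3, 0, 14). Assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def upgrade_jvm_flags(from_version: str, schema_changes_during_upgrade: bool) -> List[str]:
    """Return the extra JVM flags to start a 3.11 node with during the upgrade."""
    v = parse_version(from_version)
    # Upgrading from 3.0.14+ on the 3.0.x branch: no flag is needed.
    if v[:2] == (3, 0) and len(v) >= 3 and v[2] >= 14:
        return []
    # Schema changes are needed while the cluster is mixed-version:
    # temporarily force the backwards compatible protocol.
    if schema_changes_during_upgrade:
        return [COMPAT_FLAG]
    # No schema migrations during the upgrade: start normally.
    return []


if __name__ == "__main__":
    print(upgrade_jvm_flags("3.0.13", schema_changes_during_upgrade=True))
```

Once every node runs 3.11, the note calls for a rolling restart without the flag, which corresponds to starting each node with an empty flag list.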
3.10
====
New features
------------
- New `DurationType` (cql duration). See CASSANDRA-11873
- Runtime modification of concurrent_compactors is now available via nodetool
- Support for the assignment operators +=/-= has been added for update queries.
- An Index implementation may now provide a task which runs prior to joining
the ring. See CASSANDRA-12039
- Filtering on partition key columns is now also supported for queries without
secondary indexes.
- A slow query log has been added: slow queries will be logged at DEBUG level.
For more details refer to CASSANDRA-12403 and slow_query_log_timeout_in_ms
in cassandra.yaml.
- Support for GROUP BY queries has been added.
- A new compaction-stress tool has been added to test the throughput of compaction
for any cassandra-stress user schema. See compaction-stress help for how to use.
- Compaction can now take into account overlapping tables that don't take part
in the compaction to look for deleted or overwritten data in the compacted tables.
When such data is found, it can be safely discarded, which in turn should enable
the removal of tombstones over that data.
The behavior can be engaged in two ways:
- as a "nodetool garbagecollect -g CELL/ROW" operation, which applies
single-table compaction on all sstables to discard deleted data in one step.
- as a "provide_overlapping_tombstones:CELL/ROW/NONE" compaction strategy flag,
which uses overlapping tables as a source of deletions/overwrites during all
compactions.
The argument specifies the granularity at which deleted data is to be found:
- If ROW is specified, only whole deleted rows (or sets of rows) will be
discarded.
- If CELL is specified, any columns whose value is overwritten or deleted
will also be discarded.
- NONE (default) specifies the old behavior, overlapping tables are not used to
decide when to discard data.
Which option to use depends on your workload, both ROW and CELL increase the
disk load on compaction (especially with the size-tiered compaction strategy),
with CELL being more resource-intensive. Both should lead to better read
performance if deleting rows (resp. overwriting or deleting cells) is common.
- Prepared statements are now persisted in the table prepared_statements in
the system keyspace. Upon startup, this table is used to preload all
previously prepared statements - i.e. in many cases clients do not need to
re-prepare statements against restarted nodes.
- cqlsh can now connect to older Cassandra versions by downgrading the native
protocol version. Please note that this is currently not part of our release
testing and, as a consequence, it is not guaranteed to work in all cases.
See CASSANDRA-12150 for more details.
- Snapshots that are automatically taken before a table is dropped or truncated
will have a "dropped" or "truncated" prefix on their snapshot tag name.
- Metrics are exposed for successful and failed authentication attempts.
These can be located using the object names org.apache.cassandra.metrics:type=Client,name=AuthSuccess
and org.apache.cassandra.metrics:type=Client,name=AuthFailure respectively.
- Add support to "unset" JSON fields in prepared statements by specifying DEFAULT UNSET.
See CASSANDRA-11424 for details
- Allow TTL with null value on insert and update. It will be treated as equivalent to inserting a 0.
- Removed outboundBindAny configuration property. See CASSANDRA-12673 for details.
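The two ways of engaging tombstone-aware compaction described in the 3.10 new features (the "nodetool garbagecollect -g CELL/ROW" operation and the "provide_overlapping_tombstones" compaction flag) can be sketched as string builders. The helper names are hypothetical, and the keyspace, table, and compaction class arguments are placeholders supplied by the caller; the sketch only assembles the command and the CQL statement, it does not run them.

```python
# Granularities named in the release note. NONE is only meaningful for the
# compaction strategy flag, where it keeps the old behavior.
GC_GRANULARITIES = ("CELL", "ROW")
FLAG_GRANULARITIES = ("CELL", "ROW", "NONE")


def garbagecollect_command(granularity: str, keyspace: str, table: str) -> str:
    """Build the one-off 'nodetool garbagecollect' invocation."""
    if granularity not in GC_GRANULARITIES:
        raise ValueError(f"granularity must be one of {GC_GRANULARITIES}")
    return f"nodetool garbagecollect -g {granularity} {keyspace} {table}"


def overlapping_tombstones_cql(
    keyspace: str, table: str, compaction_class: str, granularity: str
) -> str:
    """Build an ALTER TABLE statement that sets the compaction strategy flag."""
    if granularity not in FLAG_GRANULARITIES:
        raise ValueError(f"granularity must be one of {FLAG_GRANULARITIES}")
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = "
        f"{{'class': '{compaction_class}', "
        f"'provide_overlapping_tombstones': '{granularity}'}};"
    )


if __name__ == "__main__":
    print(garbagecollect_command("ROW", "ks", "events"))
    print(overlapping_tombstones_cql("ks", "events", "SizeTieredCompactionStrategy", "CELL"))
```

As the note says, CELL is the more resource-intensive choice, so a reasonable starting point is ROW unless overwritten cells are common in the workload.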
Upgrading
---------
- Support for altering the types of already defined tables and of UDT fields has been disabled.
If it is necessary to return a different type, please use casting instead. See
CASSANDRA-12443 for more details.
- Specifying the default_time_to_live option when creating or altering a
materialized view was erroneously accepted (and ignored). It is now
properly rejected.
- Only Java and JavaScript are now supported UDF languages.
The sandbox in 3.0 already prevented the use of script languages except Java
and JavaScript.
- Compaction now correctly drops sstables out of CompactionTask when there
isn't enough disk space to perform the full compaction. This should reduce
pending compaction tasks on systems with little remaining disk space.
- Request timeouts in cassandra.yaml (read_request_timeout_in_ms, etc) now apply to the
"full" request time on the coordinator. Previously, they only covered the time from
when the coordinator sent a message to a replica until the time that the replica
responded. Additionally, the previous behavior was to reset the timeout when performing
a read repair, making a second read to fix a short read, and when subranges were read
as part of a range scan or secondary index query. In 3.10 and higher, the timeout
is no longer reset for these "subqueries". The entire request must complete within
the specified timeout. As a consequence, your timeouts may need to be adjusted
to account for this. See CASSANDRA-12256 for more details.
- Logs written to stdout are now consistent with logs written to files.
Time is now local (it was UTC on the console and local in files). Date, thread, file
and line info were added to stdout. (see CASSANDRA-12004)
- The 'clientutil' jar, which has been somewhat broken on the 3.x branch, is no longer provided.
The features provided by that jar are provided by any good Java driver, and we advise relying on drivers rather than on
that jar. If you need the jar for backward compatibility in the meantime, use the version provided
on a previous Cassandra branch, such as the 3.0 branch (by design, the functionality provided by that jar is stable
across versions, so using the 3.0 jar for a client connecting to 3.x should work without issues).
- (Tools development) DatabaseDescriptor no longer implicitly starts up components/services like
commit log replay. This may break existing 3rd party tools and clients. In order to start up
a standalone tool or client application, use the DatabaseDescriptor.toolInitialization() or
DatabaseDescriptor.clientInitialization() methods. Tool initialization sets up partitioner,
snitch, encryption context. Client initialization just applies the configuration but does not
set up anything. Instead of using Config.setClientMode() or Config.isClientMode(), which are
deprecated now, use one of the appropriate new methods in DatabaseDescriptor.
- Application layer keep-alives were added to the streaming protocol to prevent idle incoming connections from
timing out and failing the stream session (CASSANDRA-11839). This effectively deprecates the streaming_socket_timeout_in_ms
property in favor of streaming_keep_alive_period_in_secs. See cassandra.yaml for more details about this property.
- Duration literals support the ISO 8601 format. As a consequence, identifiers matching that format
(e.g. P2Y or P1MT6H) are no longer supported (CASSANDRA-11873).
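The 3.10 change to request timeouts (CASSANDRA-12256) is easier to see with numbers. The sketch below is a deliberately simplified model with a hypothetical function name: it treats a request as a list of subquery durations and ignores the other difference described in the note, namely that the old timeout only covered the replica round trip.

```python
from typing import Sequence


def completes_before_timeout(
    subquery_ms: Sequence[float], timeout_ms: float, reset_per_subquery: bool
) -> bool:
    """Simplified model of coordinator timeouts before and after 3.10.

    reset_per_subquery=True models the old behavior, where the timer was
    reset for read repairs, short-read retries, and range subqueries.
    reset_per_subquery=False models 3.10 and higher, where the entire
    request must complete within the timeout.
    """
    if reset_per_subquery:
        return all(duration <= timeout_ms for duration in subquery_ms)
    return sum(subquery_ms) <= timeout_ms


if __name__ == "__main__":
    durations = [3000.0, 3000.0]  # two subqueries of 3 s each
    print(completes_before_timeout(durations, 5000.0, reset_per_subquery=True))
    print(completes_before_timeout(durations, 5000.0, reset_per_subquery=False))
```

In this model a request made of two 3-second subqueries passes a 5-second timeout under the old behavior and fails under the new one, which is why the note says timeouts may need to be adjusted.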
3.8
===
New features
------------
- Shared pool threads are now named according to the stage they are executing
tasks for. Thread names mentioned in traced queries change accordingly.
- A new option has been added to cassandra-stress "-rate fixed={number}/s"
that forces a scheduled rate of operations/sec over time. Using this, stress can
accurately account for coordinated omission from the stress process.
- The cassandra-stress "-rate limit=" option has been renamed to "-rate throttle="
- HDR histograms have been added to stress runs. Their output can be saved to disk using the
"-log hdrfile=" option. This histogram includes response/service/wait times when used with the
fixed or throttle rate options. The histogram file can be plotted on
http://hdrhistogram.github.io/HdrHistogram/plotFiles.html
- TimeWindowCompactionStrategy has been added. This has proven to be a better approach
to time series compaction and new tables should use this instead of DTCS. See
CASSANDRA-9666 for details.
- Change-Data-Capture is now available. See cassandra.yaml for CDC-specific flags and
a brief explanation of on-disk locations for archived data in CommitLog form. This can
be enabled via ALTER TABLE ... WITH cdc=true.
Upon flush, CommitLogSegments containing data for CDC-enabled tables are moved to
the data/cdc_raw directory until removed by the user and writes to CDC-enabled tables
will be rejected with a WriteTimeoutException once cdc_total_space_in_mb is reached
between unflushed CommitLogSegments and cdc_raw.
NOTE: CDC is disabled by default in the .yaml file. Do not enable CDC on a mixed-version
cluster as it will lead to exceptions which can interrupt traffic. Once all nodes
have been upgraded to 3.8 it is safe to enable this feature and restart the cluster.
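The CDC behavior described above can be sketched in two small helpers: one builds the ALTER TABLE statement that enables CDC on a table, and one models the rule that writes to CDC-enabled tables are rejected once cdc_total_space_in_mb is reached. The helper names are hypothetical, and the space check is a simplified model of the rule as stated in the note, not Cassandra's implementation.

```python
def enable_cdc_statement(keyspace: str, table: str) -> str:
    """Build the CQL statement that turns CDC on for one table."""
    return f"ALTER TABLE {keyspace}.{table} WITH cdc=true;"


def cdc_write_accepted(
    unflushed_segments_mb: float, cdc_raw_mb: float, cdc_total_space_in_mb: float
) -> bool:
    """Model the CDC space limit.

    Space is counted across unflushed CommitLogSegments and the cdc_raw
    directory. Once the total reaches cdc_total_space_in_mb, writes to
    CDC-enabled tables are rejected (with a WriteTimeoutException in
    Cassandra) until the user removes files from cdc_raw.
    """
    return unflushed_segments_mb + cdc_raw_mb < cdc_total_space_in_mb


if __name__ == "__main__":
    print(enable_cdc_statement("ks", "events"))
    print(cdc_write_accepted(1024.0, 3072.0, 4096.0))
```

Because the segments stay in data/cdc_raw until the user removes them, whatever consumes the CDC data also has to delete it, or writes to the CDC-enabled tables will eventually stop.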
Upgrading
---------
- The ReversedType behaviour has been corrected for clustering columns of
BYTES type containing empty value. Scrub should be run on the existing
SSTables containing a descending clustering column of BYTES type to correct
their ordering. See CASSANDRA-12127 for more details.
- Ec2MultiRegionSnitch will no longer automatically set broadcast_rpc_address
to the public instance IP if this property is defined on cassandra.yaml.
- The names "json" and "distinct" are no longer valid as user-defined function
names (they are still valid as column names, however). In the unlikely case where
you had defined functions with such names, you will need to recreate
those under a different name, change your code to use the new names and
drop the old versions, and do this _before_ upgrading (see CASSANDRA-10783 for more
details).
Deprecation
-----------
- DateTieredCompactionStrategy has been deprecated - new tables should use
TimeWindowCompactionStrategy. Note that migrating an existing DTCS-table to TWCS might
cause increased compaction load for a while after the migration so make sure you run
tests before migrating. Read CASSANDRA-9666 for background on this.
3.7
===
Upgrading
---------
- A maximum size for SSTables values has been introduced, to prevent out of memory
exceptions when reading corrupt SSTables. This maximum size can be set via
max_value_size_in_mb in cassandra.yaml. The default is 256MB, which matches the default
value of native_transport_max_frame_size_in_mb. SSTables will be considered corrupt if
they contain values whose size exceeds this limit. See CASSANDRA-9530 for more details.
3.6
=====
New features
------------
- JMX connections can now use the same auth mechanisms as CQL clients. New options
in cassandra-env.(sh|ps1) enable JMX authentication and authorization to be delegated
to the IAuthenticator and IAuthorizer configured in cassandra.yaml. The default settings
still only expose JMX locally, and use the JVM's own security mechanisms when remote
connections are permitted. For more details on how to enable the new options, see the
comments in cassandra-env.sh. A new class of IResource, JMXResource, is provided for
the purposes of GRANT/REVOKE via CQL. See CASSANDRA-10091 for more details.
Also, directly setting JMX remote port via the com.sun.management.jmxremote.port system
property at startup is deprecated. See CASSANDRA-11725 for more details.
- JSON timestamps are now in UTC and contain the timezone information, see CASSANDRA-11137 for more details.
- Collision checks are performed when joining the token ring, regardless of whether
the node should bootstrap. Additionally, replace_address can legitimately be used
without bootstrapping to help with recovery of nodes with partially failed disks.
See CASSANDRA-10134 for more details.
- Key cache will only hold indexed entries up to the size configured by
column_index_cache_size_in_kb in cassandra.yaml in memory. Larger indexed entries
will never go into memory. See CASSANDRA-11206 for more details.
- For tables having a default_time_to_live, specifying a TTL of 0 will remove the TTL
from the inserted or updated values.
- Startup is now aborted if corrupted transaction log files are found. The details
of the affected log files are now logged, allowing the operator to decide how
to resolve the situation.
- Filtering expressions are made more pluggable and can be added programmatically via
a QueryHandler implementation. See CASSANDRA-11295 for more details.
3.4
===
New features
------------
- Internal authentication now supports caching of encrypted credentials.
Reference cassandra.yaml:credentials_validity_in_ms
- Remote configuration of auth caches via JMX can be disabled using the
system property cassandra.disable_auth_caches_remote_configuration
- The sstabledump tool has been added as the 3.0 version of the former sstable2json. The tool only
supports v3.0+ SSTables. See the tool's help for more detail.
Upgrading
---------
- Nothing specific to 3.4 but please see previous versions upgrading section,
especially if you are upgrading from 2.2.
Deprecation
-----------
- The mbean interfaces org.apache.cassandra.auth.PermissionsCacheMBean and
org.apache.cassandra.auth.RolesCacheMBean are deprecated in favor of
org.apache.cassandra.auth.AuthCacheMBean. This generalized interface is
common across all caches in the auth subsystem. The specific mbean interfaces
for each individual cache will be removed in a subsequent major version.
3.2
===
New features
------------
- We now make sure that a token does not exist in several data directories. This
means that we run one compaction strategy per data_file_directory and we use
one thread per directory to flush. Use nodetool relocatesstables to make sure your
tokens are in the correct place, or just wait and compaction will handle it. See
CASSANDRA-6696 for more details.
- Maximum in-flight commit log replay mutation bytes are now bounded to 64 megabytes,
tunable via cassandra.commitlog_max_outstanding_replay_bytes
- Support for type casting has been added to the selection clause.
- Hinted handoff now supports compression. Reference cassandra.yaml:hints_compression.
Note: hints compression is currently disabled by default.
Upgrading
---------
- The compression ratio metrics computation has been modified to be more accurate.
- Running Cassandra as root is prevented by default.
- JVM options are moved from cassandra-env.(sh|ps1) to jvm.options file
Deprecation
-----------
- The Thrift API is deprecated and will be removed in Cassandra 4.0.
3.1
===
Upgrading
---------
- The return value of SelectStatement::getLimit has been changed from DataLimits
to int.
- Custom index implementations should be aware that the method Indexer::indexes()
has been removed, as its contract was misleading and all custom implementations
should almost surely have returned true unconditionally for that method.
- GC logging is now enabled by default (you can disable it in the jvm.options
file if you prefer).
3.0
===
New features
------------
- EACH_QUORUM is now a supported consistency level for read requests.
- Support for IN restrictions on any partition key component or clustering key
as well as support for EQ and IN multicolumn restrictions has been added to
UPDATE and DELETE statement.
- Support for single-column and multi-column slice restrictions (>, >=, <= and <)
has been added to DELETE statements
- nodetool rebuild_index accepts the index argument without
the redundant table name
- Materialized Views, which allow for server-side denormalization, are now
available. Materialized views provide an alternative to secondary indexes
for non-primary key queries, and perform much better for indexing high
cardinality columns.
See http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
- Hinted handoff has been completely rewritten. Hints are now stored in flat
files, with less overhead for storage and more efficient dispatch.
See CASSANDRA-6230 for full details.
- Option to not purge unrepaired tombstones. To avoid users having data resurrected
if repair has not been run within gc_grace_seconds, an option has been added to
only allow tombstones from repaired sstables to be purged. To enable, set the
compaction option 'only_purge_repaired_tombstones':true but keep in mind that if
you do not run repair for a long time, you will keep all tombstones around which
can cause other problems.
- Enabled warning on GC taking longer than 1000ms. See
cassandra.yaml:gc_warn_threshold_in_ms
Upgrading
---------
- Clients must use the native protocol version 3 when upgrading from 2.2.X as
the native protocol version 4 is not compatible between 2.2.X and 3.Y. See
https://www.mail-archive.com/user@cassandra.apache.org/msg45381.html for details.
- A new argument of type InetAddress has been added to IAuthenticator::newSaslNegotiator,
representing the IP address of the client attempting authentication. It will be a breaking
change for any custom implementations.
- token-generator tool has been removed.
- Upgrade to 3.0 is supported from Cassandra 2.1 versions greater than or equal to 2.1.9,
or Cassandra 2.2 versions greater than or equal to 2.2.2. Upgrade from Cassandra 2.0 and
older versions is not supported.
- The 'memtable_allocation_type: offheap_objects' option has been removed. It should
be re-introduced in a future release and you can follow CASSANDRA-9472 to know more.
- Configuration parameter memory_allocator in cassandra.yaml has been removed.
- The native protocol versions 1 and 2 are not supported anymore.
- Max mutation size is now configurable via the max_mutation_size_in_kb setting in
cassandra.yaml; the default is half of commitlog_segment_size_in_mb * 1024.
- 3.0 requires Java 8u40 or later.
- Garbage collection options were moved from cassandra-env to jvm.options file.
- New transaction log files have been introduced to replace the compactions_in_progress
system table, temporary file markers (tmp and tmplink) and sstable ancestors.
Therefore, compaction metadata no longer contains ancestors. Transaction log files
list sstable descriptors involved in compactions and other operations such as flushing
and streaming. Use the sstableutil tool to list any sstable files currently involved
in operations not yet completed, which previously would have been marked as temporary.
A transaction log file contains one sstable per line, with the prefix "add:" or "remove:".
They also contain a special line "commit", only inserted at the end when the transaction
is committed. On startup we use these files to cleanup any partial transactions that were
in progress when the process exited. If the commit line is found, we keep new sstables
(those with the "add" prefix) and delete the old sstables (those with the "remove" prefix),
vice-versa if the commit line is missing. Should you lose or delete these log files,
both old and new sstable files will be kept as live files, which will result in duplicated
sstables. These files are protected by incremental checksums so you should not manually
edit them. When restoring a full backup or moving sstable files, you should clean-up
any left over transactions and their temporary files first. You can use this command:
===> sstableutil -c ks table
See CASSANDRA-7066 for full details.
- New write stages have been added for batchlog and materialized view mutations;
you can set their size in cassandra.yaml.
- User defined functions are now executed in a sandbox.
To use UDFs and UDAs, you have to enable them in cassandra.yaml.
- New SSTable version 'la' with improved bloom-filter false-positive handling
compared to previous version 'ka' used in 2.2 and 2.1. Running sstableupgrade
is not necessary but recommended.
- Before upgrading to 3.0, make sure that your cluster is in complete agreement
(schema versions outputted by `nodetool describecluster` are all the same).
- Schema metadata is now stored in the new `system_schema` keyspace, and
legacy `system.schema_*` tables are now gone; see CASSANDRA-6717 for details.
- Pig's support has been removed.
- Hadoop BulkOutputFormat and BulkRecordWriter have been removed; use
CqlBulkOutputFormat and CqlBulkRecordWriter instead.
- Hadoop ColumnFamilyInputFormat and ColumnFamilyOutputFormat have been removed;
use CqlInputFormat and CqlOutputFormat instead.
- Hadoop ColumnFamilyRecordReader and ColumnFamilyRecordWriter have been removed;
use CqlRecordReader and CqlRecordWriter instead.
- hinted_handoff_enabled in cassandra.yaml no longer supports a list of data centers.
To specify a list of excluded data centers when hinted_handoff_enabled is set to true,
use hinted_handoff_disabled_datacenters, see CASSANDRA-9035 for details.
- The `sstable_compression` and `chunk_length_kb` compression options have been deprecated.
The new options are `class` and `chunk_length_in_kb`. Disabling compression should now
be done by setting the new option `enabled` to `false`.
- The compression option `crc_check_chance` became a top-level table option, but is currently
enforced only against tables with enabled compression.
- Only map syntax is now allowed for caching options. ALL/NONE/KEYS_ONLY/ROWS_ONLY syntax
has been deprecated since 2.1.0 and is being removed in 3.0.0.
- The 'index_interval' option for 'CREATE TABLE' statements, which has been deprecated
since 2.1 and replaced with the 'min_index_interval' and 'max_index_interval' options,
has now been removed.
- Batchlog entries are now stored in a new table - system.batches.
The old one has been deprecated.
- JMX methods set/getCompactionStrategyClass have been removed, use
set/getCompactionParameters or set/getCompactionParametersJson instead.
- SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
- The secondary index API has been comprehensively reworked. This will be a breaking
change for any custom index implementations, which should now look to implement
the new org.apache.cassandra.index.Index interface. New syntax has been added to create
and query row-based indexes, which are not explicitly linked to a single column in the
base table.
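The transaction log recovery rule described in the 3.0 upgrading notes above (CASSANDRA-7066) can be sketched in Python. This is a simplified illustration only: it assumes each log line is exactly `add:<name>`, `remove:<name>`, or `commit`, and it ignores the incremental checksums that protect real log files. The function name `resolve_transaction` is hypothetical and is not part of Cassandra.

```python
def resolve_transaction(log_lines):
    """Decide which sstables survive for one (simplified) transaction log.

    Returns a (keep, delete) pair of lists of sstable names.
    """
    added, removed, committed = [], [], False
    for raw in log_lines:
        line = raw.strip()
        if not line:
            continue
        if line == "commit":
            # The commit marker is only written once the transaction completes.
            committed = True
        elif line.startswith("add:"):
            added.append(line[len("add:"):])
        elif line.startswith("remove:"):
            removed.append(line[len("remove:"):])

    if committed:
        # Committed: keep the new sstables, delete the old ones.
        return added, removed
    # Not committed: roll back by keeping the old sstables and deleting the new ones.
    return removed, added


keep, delete = resolve_transaction(["add:new-1", "remove:old-1", "commit"])
print("keep:", keep, "delete:", delete)
```

The sketch only models the decision; for real clusters, use `sstableutil` as described above rather than parsing or editing the log files by hand.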
2.2.4
=====
Deprecation
-----------
- Pig support has been deprecated, and will be removed in 3.0.
Please see CASSANDRA-10542 for more details.
- Configuration parameter memory_allocator in cassandra.yaml has been deprecated
and will be removed in 3.0.0. As mentioned below for 2.2.0, jemalloc is
automatically preloaded on Unix platforms.
Operations
----------
- Switching data center or racks is no longer an allowed operation on a node
which has data. Instead, the node will need to be decommissioned and
rebootstrapped. If moving from the SimpleSnitch, make sure that the data
center and rack containing all current nodes is named "datacenter1" and
"rack1". To override this behaviour use -Dcassandra.ignore_rack=true and/or
-Dcassandra.ignore_dc=true.
- Reloading the configuration file of GossipingPropertyFileSnitch has been disabled.
Upgrading
---------
- The default for the inter-DC stream throughput setting
(inter_dc_stream_throughput_outbound_megabits_per_sec in cassandra.yaml) is
now the same as the intra-DC one (200Mbps) instead of being unlimited.
Having it unlimited was never intended and was a bug.
New features
------------
- Time windows in DTCS are now limited to 1 day by default to be able to
handle bootstrap and repair in a better way. To get the old behaviour,
increase max_window_size_seconds.
- DTCS option max_sstable_age_days is now deprecated and defaults to 1000 days.
- Native protocol server now allows both SSL and non-SSL connections on
the same port.
2.2.3
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.2 if you are upgrading
from a previous version.
2.2.2
=====
Changed Defaults
----------------
- commitlog_total_space_in_mb will use the smaller of 8192, and 1/4
of the total space of the commitlog volume. (Before: always used
8192)
- The following INFO logs were reduced to DEBUG level and will now show
on debug.log instead of system.log:
- Memtable flushing actions
- Commit log replayed files
- Compacted sstables
- SSTable opening (SSTableReader)
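The new commitlog_total_space_in_mb default described under Changed Defaults above is a simple minimum. A minimal sketch, assuming the volume size is already known in megabytes and that the quarter is rounded down (the rounding is an assumption; the function name is hypothetical):

```python
def default_commitlog_total_space_mb(volume_total_mb):
    """Smaller of 8192 MB and one quarter of the commitlog volume (2.2.2+ default)."""
    # Integer division is an assumption made for this illustration.
    return min(8192, volume_total_mb // 4)


# A 20000 MB volume is capped by the 1/4 rule; a 100000 MB volume by the 8192 MB limit.
print(default_commitlog_total_space_mb(20000))
print(default_commitlog_total_space_mb(100000))
```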
New features
------------
- Custom QueryHandlers can retrieve the column specifications for the bound
variables from QueryOptions by using the hasColumnSpecifications()
and getColumnSpecifications() methods.
- A new default asynchronous log appender debug.log was created in addition
to the system.log appender in order to provide more detailed log debugging.
In order to disable debug logging, you must comment out the ASYNCDEBUGLOG
appender in conf/logback.xml. See CASSANDRA-10241 for more information.
2.2.1
=====
New features
------------
- COUNT(*) and COUNT(1) can be selected with other columns or functions
2.2
===
Upgrading
---------
- The authentication & authorization subsystems have been redesigned to
support role based access control (RBAC), resulting in a change to the
schema of the system_auth keyspace. See below for more detail.
For systems already using the internal auth implementations, the process
for converting existing data during a rolling upgrade is straightforward.
As each node is restarted, it will attempt to convert any data in the
legacy tables into the new schema. Until enough nodes to satisfy the
replication strategy for the system_auth keyspace are upgraded and so have
the new schema, this conversion will fail with the failure being reported
in the system log.
During the upgrade, Cassandra's internal auth classes will continue to use
the legacy tables, so clients experience no disruption. Issuing DCL
statements during an upgrade is not supported.
Once all nodes are upgraded, an operator with superuser privileges should
drop the legacy tables, system_auth.users, system_auth.credentials and
system_auth.permissions. Doing so will prompt Cassandra to switch over to
the new tables without requiring any further intervention.
While the legacy tables are present a restarted node will re-run the data
conversion and report the outcome so that operators can verify that it is
safe to drop them.
New features
------------
- The LIMIT clause now applies only to the number of rows returned to the user,
not to the number of rows queried. As a consequence, queries using aggregates are no
longer affected by the LIMIT clause.
- Very large batches will now be rejected (defaults to 50kb). This
can be customized by modifying batch_size_fail_threshold_in_kb.
- Selecting columns, scalar functions, UDT fields, writetime or ttl together
with aggregates is now possible. The values returned for the columns,
scalar functions, UDT fields, writetime and ttl will be the ones for
the first row matching the query.
- Windows is now a supported platform. Powershell execution for startup scripts
is highly recommended and can be enabled via an administrator command-prompt
with: 'powershell set-executionpolicy unrestricted'
- It is now possible to do major compactions when using leveled compaction.
Doing that will take all sstables and compact them out in levels. The
levels will be non overlapping so doing this will still not be something
you want to do very often since it might cause more compactions for a while.
It is also possible to split output when doing a major compaction with
STCS - files will be split in sizes 50%, 25%, 12.5% etc of the total size.
This might be a bit better than old major compactions which created one big
file on disk.
- A new tool, bin/sstableverify, has been added that checks for errors/bitrot
in all sstables. Unlike scrub, this is a non-invasive tool.
- Authentication & Authorization APIs have been updated to introduce
roles. Roles and Permissions granted to them are inherited, supporting
role based access control. The role concept supersedes that of users
and CQL constructs such as CREATE USER are deprecated but retained for
compatibility. The requirement to explicitly create Roles in Cassandra
even when auth is handled by an external system has been removed, so
authentication & authorization can be delegated to such systems in their
entirety.
- In addition to the above, Roles are also first class resources and can be the
subject of permissions. Users (roles) can now be granted permissions on other
roles, including CREATE, ALTER, DROP & AUTHORIZE, which removes the need for
superuser privileges in order to perform user/role management operations.
- Creators of database resources (Keyspaces, Tables, Roles) are now automatically
granted all permissions on them (if the IAuthorizer implementation supports
this).
- The SSTable file name has changed: it no longer contains the Keyspace/CF
name. Also, each secondary index has its own directory under the parent's
directory.
- Support for user-defined functions and user-defined aggregates has
been added to CQL.
************************************************************************
IMPORTANT NOTE: user-defined functions can be used to execute
arbitrary and possibly evil code in Cassandra 2.2, and are
therefore disabled by default. To enable UDFs edit
cassandra.yaml and set enable_user_defined_functions to true.
CASSANDRA-9402 will add a security manager for UDFs in Cassandra
3.0. This will inherently be backwards-incompatible with any 2.2
UDF that perform insecure operations such as opening a socket or
writing to the filesystem.
************************************************************************
- Row-cache is now fully off-heap.
- jemalloc is now automatically preloaded and used on Linux and OS-X if
installed.
- Please ensure on Unix platforms that there is no libjnadispath.so
installed which is accessible by Cassandra. Old versions of
libjna packages (< 4.0.0) will cause problems - e.g. Debian Wheezy
contains libjna version 3.2.x.
- The node now stays up when streaming fails during bootstrapping. You can
use the new `nodetool bootstrap resume` command to continue streaming after resolving
the issue.
- Protocol version 4 specifies that bind variables do not require having a
value when executing a statement. Bind variables without a value are
called 'unset'. The 'unset' bind variable is serialized as the int
value '-2' without following bytes.
In an EXECUTE or BATCH request an unset bind value does not modify the value and
does not create a tombstone, an unset bind ttl is treated as 'unlimited',
an unset bind timestamp is treated as 'now', an unset bind counter operation
does not change the counter value.
Unset tuple field, UDT field and map key are not allowed.
In a QUERY request an unset limit is treated as 'unlimited'.
Unset WHERE clauses with unset partition column, clustering column
or index column are not allowed.
- New `ByteType` (cql tinyint). 1-byte signed integer
- New `ShortType` (cql smallint). 2-byte signed integer
- New `SimpleDateType` (cql date). 4-byte unsigned integer
- New `TimeType` (cql time). 8-byte long
- The toDate(timeuuid), toTimestamp(timeuuid) and toUnixTimestamp(timeuuid) functions have been added to allow
conversion from timeuuid into date type, timestamp type and bigint raw value.
The functions unixTimestampOf(timeuuid) and dateOf(timeuuid) have been deprecated.
- The toDate(timestamp) and toUnixTimestamp(timestamp) functions have been added to allow
conversion from timestamp into date type and bigint raw value.
- The toTimestamp(date) and toUnixTimestamp(date) functions have been added to allow
conversion from date into timestamp type and bigint raw value.
- SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
- The default JVM flag -XX:+PerfDisableSharedMem will cause the following JVM tools
to stop working: jps, jstack, jinfo, jmc, jcmd as well as 3rd party tools like Jolokia.
If you wish to use these tools you can comment this flag out in cassandra-env.{sh,ps1}
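The 'unset' bind variable encoding described in the 2.2 new features above can be illustrated with a short Python sketch. It assumes the native protocol's value layout of a 4-byte big-endian signed length followed by that many bytes, with -1 denoting null (from the protocol specification, not from the notes above) and -2 denoting unset. The `UNSET` sentinel and `encode_value` helper are hypothetical names for illustration and are not a driver API.

```python
import struct

UNSET = object()  # hypothetical sentinel standing in for an unset bind variable


def encode_value(value):
    """Encode one bound value as [int length][bytes], protocol v4 style."""
    if value is UNSET:
        # 'unset': the int value -2 with no following bytes.
        return struct.pack(">i", -2)
    if value is None:
        # null: the int value -1 with no following bytes.
        return struct.pack(">i", -1)
    return struct.pack(">i", len(value)) + value


print(encode_value(UNSET).hex())
print(encode_value(None).hex())
print(encode_value(b"ab").hex())
```

Distinguishing unset from null matters because, as noted above, a null value writes a tombstone while an unset value leaves the existing data untouched.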
Upgrading
---------
- Thrift rpc is no longer being started by default.
Set `start_rpc` parameter to `true` to enable it.
- Pig's CqlStorage has been removed, use CqlNativeStorage instead
- Pig's CassandraStorage has been deprecated. CassandraStorage
should only be used against tables created via thrift.
Use CqlNativeStorage for all other tables.
- IAuthenticator has been updated to remove responsibility for user/role
maintenance and is now solely responsible for validating credentials.
This is primarily done via SASL, though an optional method exists for
systems which need support for the Thrift login() method.
- IRoleManager interface has been added which takes over the maintenance
functions from IAuthenticator. IAuthorizer is mainly unchanged. Auth data
in systems using the stock internal implementations PasswordAuthenticator
& CassandraAuthorizer will be automatically converted during upgrade,
with minimal operator intervention required. Custom implementations will
require modification, though these can be used in conjunction with the
stock CassandraRoleManager so providing an IRoleManager implementation
should not usually be necessary.
- Fat client support has been removed since we have push notifications to clients
- cassandra-cli has been removed. Please use cqlsh instead.
- YamlFileNetworkTopologySnitch has been removed; switch to
GossipingPropertyFileSnitch instead.
- CQL2 has been removed entirely in this release (previously deprecated
in 2.0.0). Please switch to CQL3 if you haven't already done so.
- The results of CQL3 queries containing an IN restriction are now returned
in the normal order and no longer in the order in which the column values were
specified in the IN restriction.
- Some secondary index queries with restrictions on non-indexed clustering
columns were not requiring ALLOW FILTERING as they should. This has been
fixed, and those queries now require ALLOW FILTERING (see CASSANDRA-8418
for details).
- The SSTableSimpleWriter and SSTableSimpleUnsortedWriter classes have been
deprecated and will be removed in the next major Cassandra release. You
should use the CQLSSTableWriter class instead.
- The sstable2json and json2sstable tools have been deprecated and will be
removed in the next major Cassandra release. See CASSANDRA-9618
(https://issues.apache.org/jira/browse/CASSANDRA-9618) for details.
- nodetool enablehandoff will no longer support a list of data centers starting
with the next major release. Two new commands will be added, enablehintsfordc and disablehintsfordc,
to exclude data centers from using hinted handoff when the global status is enabled.
In cassandra.yaml, hinted_handoff_enabled will no longer support a list of data centers starting
with the next major release. A new setting will be added, hinted_handoff_disabled_datacenters,
to exclude data centers when the global status is enabled, see CASSANDRA-9035 for details.
2.1.13
======
New features
------------
- New options for cqlsh COPY FROM and COPY TO, see CASSANDRA-9303 for details.
2.1.10
======
New features
------------
- The syntax TRUNCATE TABLE X is now accepted as an alias for TRUNCATE X
2.1.9
=====
Upgrading
---------
- cqlsh will now display timestamps with a UTC timezone. Previously,
timestamps were displayed with the local timezone.
- Commit log files are no longer recycled by default, due to negative
performance implications. This can be enabled again with the
commitlog_segment_recycling option in your cassandra.yaml
- JMX methods set/getCompactionStrategyClass have been deprecated, use
set/getCompactionParameters or set/getCompactionParametersJson instead
2.1.8
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.7
=====
2.1.6
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.5
=====
Upgrading
---------
- The option to omit cold sstables with size tiered compaction has been
removed - it is almost always better to use date tiered compaction for
workloads that have cold data.
2.1.4
=====
Upgrading
---------
- The default JMX config now listens to localhost only. You must enable
the other JMX flags in cassandra-env.sh manually.
2.1.3
=====
Upgrading
---------
- Prepending a list to a list collection was erroneously resulting in
the prepended list being reversed upon insertion. If you were depending
on this buggy behavior, note that it has been corrected.
- Incremental replacement of compacted SSTables has been disabled for this
release.
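The list-prepend fix noted in the 2.1.3 upgrading notes above can be illustrated with plain Python lists. This is only a model of the observable behavior described in the note, not of Cassandra's storage code, and both function names are hypothetical.

```python
def prepend_fixed(existing, new_items):
    """Corrected behavior: the prepended list keeps its own order."""
    return list(new_items) + list(existing)


def prepend_buggy(existing, new_items):
    """Earlier behavior per the note: the prepended list was reversed on insertion."""
    return list(reversed(new_items)) + list(existing)


print(prepend_fixed([3, 4], [1, 2]))
print(prepend_buggy([3, 4], [1, 2]))
```

Applications that compensated for the old ordering, for example by reversing the list before prepending, would need that workaround removed after upgrading.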
2.1.2
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.1
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
New features
------------
- Netty support for epoll on linux is now enabled. If for some
reason you want to disable it, pass the following system property:
-Dcassandra.native.epoll.enabled=false
2.1
===
New features
------------
- Default data and log locations have changed. If not set in
cassandra.yaml, the data file directory, commitlog directory,
and saved caches directory will default to $CASSANDRA_HOME/data/data,
$CASSANDRA_HOME/data/commitlog, and $CASSANDRA_HOME/data/saved_caches,
respectively. The log directory now defaults to $CASSANDRA_HOME/logs.
If not set, $CASSANDRA_HOME defaults to the top-level directory of
the installation.
Note that this should only affect source checkouts and tarballs.
Deb and RPM packages will continue to use /var/lib/cassandra and
/var/log/cassandra in cassandra.yaml.
- The SSTable data directory name has changed slightly. Each directory will
have a hex string appended after the CF name, e.g.
ks/cf-5be396077b811e3a3ab9dc4b9ac088d/
This hex string represents the unique ColumnFamily ID.
Note that existing directories are used as is, so only directories created
after the upgrade use the new directory name format.
- Saved key cache files also have ColumnFamily ID in their file name.
- It is now possible to do incremental repairs: sstables that have been
repaired are marked with a timestamp and not included in the next
repair session. Use nodetool repair -par -inc to use this feature.
A tool to manually mark/unmark sstables as repaired is available in
tools/bin/sstablerepairedset. This is particularly important when
using LCS, or any data not repaired in your first incremental repair
will be put back in L0.
- Bootstrapping now ensures that range movements are consistent,
meaning the data for the new node is taken from the node that is no
longer responsible for that range of keys.
If you want the old behavior (due to a lost node perhaps)
you can set the following property (-Dcassandra.consistent.rangemovement=false)
- It is now possible to use quoted identifiers in triggers' names.
WARNING: if you previously used triggers with capital letters in their
names, then you must quote them from now on.
- Improved stress tool (http://goo.gl/OTNqiQ)
- New incremental repair option (http://goo.gl/MjohJp, http://goo.gl/f8jSme)
- Incremental replacement of compacted SSTables (http://goo.gl/JfDBGW)
- The row cache can now cache only the head of partitions (http://goo.gl/6TJPH6)
- Off-heap memtables (http://goo.gl/YT7znJ)
- CQL improvements and additions: User-defined types, tuple types, 2ndary
indexing of collections, ... (http://goo.gl/kQl7GW)
Upgrading
---------
- commitlog_sync_batch_window_in_ms behavior has changed from the
maximum time to wait between fsync to the minimum time. We are
working on making this more user-friendly (see CASSANDRA-9533) but in the
meantime, this means 2.1 needs a much smaller batch window to keep
writer threads from starving. The suggested default is now 2ms.
- Rolling upgrades from anything pre-2.0.7 are not supported. Furthermore,
pre-2.0 sstables are not supported. This means that before upgrading
a node to 2.1, this node must be started on 2.0 and
'nodetool upgradesstables' must be run (and this even in the case
of non-rolling upgrades).
- For size-tiered compaction users, Cassandra now defaults to ignoring
the coldest 5% of sstables. This can be customized with the
cold_reads_to_omit compaction option; 0.0 omits nothing (the old
behavior) and 1.0 omits everything.
- Multithreaded compaction has been removed.
- The counters implementation has been changed, replaced by a safer one with
fewer caveats, but different performance characteristics. You might have
to change your data model to accommodate the new implementation.
(See https://issues.apache.org/jira/browse/CASSANDRA-6504 and the
blog post at http://goo.gl/qj8iQl for details).
- The (per-table) index_interval parameter has been replaced with the
min_index_interval and max_index_interval parameters. index_interval
has been deprecated.
- support for supercolumns has been removed from json2sstable
2.0.11
======
Upgrading
---------
- Nothing specific to this release, but refer to previous entries if you
are upgrading from a previous version.
New features
------------
- DateTieredCompactionStrategy added, optimized for time series data and groups
data that is written closely in time (CASSANDRA-6602 for details). Consider
this experimental for now.
2.0.10
======
New features
------------
- CqlPagingRecordReader and CqlPagingInputFormat have both been removed.
Use CqlInputFormat instead.
- If you are using Leveled Compaction, you can now disable doing size-tiered
compaction in L0 by starting Cassandra with -Dcassandra.disable_stcs_in_l0
(see CASSANDRA-6621 for details).
- Shuffle and taketoken have been removed. For clusters that choose to
upgrade to vnodes, creating a new datacenter with vnodes and migrating is
recommended. See http://goo.gl/Sna2S1 for further information.
2.0.9
=====
Upgrading
---------
- Default values for read_repair_chance and local_read_repair_chance have been
swapped. Namely, default read_repair_chance is now set to 0.0, and default
local_read_repair_chance to 0.1.
- Queries selecting only CQL static columns were (mistakenly) not returning one
result per row in the partition. This has been fixed and a SELECT DISTINCT
can be used when only the static column of a partition needs to be fetched
without fetching the whole partition. But if you use static columns, please
make sure this won't affect you (see CASSANDRA-7305 for details).
2.0.8
=====
New features
------------
- New snitches have been added for users of Google Compute Engine and of
Cloudstack.
Upgrading
---------
- Nothing specific to this release, but please see 2.0.7 if you are upgrading
from a previous version.
2.0.7
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.0.6 if you are upgrading
from a previous version.
2.0.6
=====
New features
------------
- CQL now supports static columns, allows batching multiple conditional updates
and has a new syntax for slicing over multiple clustering columns
(http://goo.gl/B6qz4j).
- Repair can be restricted to a set of nodes using the -hosts option in nodetool.
- A new 'nodetool taketoken' command relocates tokens with vnodes.
- Hinted handoff can be enabled only for some data-centers (see
hinted_handoff_enabled in cassandra.yaml)
Upgrading
---------
- Nothing specific to this release, but please see 2.0.5 if you are upgrading
from a previous version.
2.0.5
=====
New features
------------
- Batchlog replay can now be throttled, and is throttled by default.
See batchlog_replay_throttle_in_kb setting in cassandra.yaml.
- Scrub can now optionally skip corrupt counter partitions. Please note
that this will lead to the loss of all the counter updates in the skipped
partition. See the --skip-corrupted option.
Upgrading
---------
- If your cluster began on a version before 1.2, check that your secondary
index SSTables are on version 'ic' before upgrading. If not, run
'nodetool upgradesstables' if on 1.2.14 or later, or run 'nodetool
upgradesstables ks cf' with the keyspace and secondary index named
explicitly otherwise. If you don't do this and upgrade to 2.0.x and it
refuses to start because of 'hf' version files in the secondary index,
you will need to delete/move them out of the way and recreate the index
when 2.0.x starts.
2.0.3
=====
New features
------------
- It's now possible to configure the maximum allowed size of the native
protocol frames (native_transport_max_frame_size_in_mb in the yaml file).
Upgrading
---------
- NaN and Infinity are new valid floating point constants in CQL3 and are now reserved
keywords. In the unlikely case you were using one of them as an identifier (for a
column, a keyspace or a table), you will now have to double-quote them (see
http://cassandra.apache.org/doc/cql3/CQL.html#identifiers for "quoted identifiers").
- The IEndpointStateChangeSubscriber has a new method, beforeChange, that
any custom implementations using the class will need to implement.
2.0.2
=====
New features
------------
- Speculative retry defaults to 99th percentile
(See blog post at http://www.datastax.com/dev/blog/rapid-read-protection-in-cassandra-2-0-2)
- Configurable metrics reporting
(see conf/metrics-reporter-config-sample.yaml)
- Compaction history and stats are now saved to system keyspace
(system.compaction_history table). You can access the history via
the new 'nodetool compactionhistory' command or CQL.
Upgrading
---------
- Nodetool defaults to Sequential mode for repair operations
2.0.1
=====
Upgrading
---------
- The default memtable allocation has changed from 1/3 of heap to 1/4
of heap. Also, default (single-partition) read and write timeouts
have been reduced from 10s to 5s and 2s, respectively.
2.0.0
=====
Upgrading
---------
- Java 7 is now *required*!
- Upgrading is ONLY supported from Cassandra 1.2.9 or later. This
goes for sstable compatibility as well as network. When
upgrading from an earlier release, upgrade to 1.2.9 first and
run upgradesstables before proceeding to 2.0.
- CAS and new features in CQL such as DROP COLUMN assume that cell
timestamps are microseconds-since-epoch. Do not use these
features if you are using client-specified timestamps with some
other source.
- Replication and strategy options do not accept unknown options anymore.
This was already the case for CQL3 in 1.2 but this is now the case for
thrift too.
- auto_bootstrap of a single-token node with no initial_token will
now pick a random token instead of bisecting an existing token
range. We recommend upgrading to vnodes; failing that, we
recommend specifying initial_token.
- reduce_cache_sizes_at, reduce_cache_capacity_to, and
flush_largest_memtables_at options have been removed from cassandra.yaml.
- CacheServiceMBean.reduceCacheSizes() has been removed.
Use CacheServiceMBean.set{Key,Row}CacheCapacityInMB() instead.
- authority option in cassandra.yaml has been deprecated since 1.2.0,
but it has been completely removed in 2.0. Please use 'authorizer' option.
- ASSUME command has been removed from cqlsh. Use CQL3 blobAsType() and
typeAsBlob() conversion functions instead.
See https://cassandra.apache.org/doc/cql3/CQL.html#blobFun for details.
- Inputting blobs as string constants is now fully deprecated in
favor of blob constants. Make sure to update your applications to use
the new syntax while you are still on 1.2 (which supports both string
and blob constants for blob input) before upgrading to 2.0.
- index_interval is now a ColumnFamily property. You can change its value
with an ALTER TABLE ... WITH statement, and SSTables written after that will
use the new value. When upgrading, Cassandra will pick up the value defined in
cassandra.yaml as the default for existing ColumnFamilies, until you explicitly
set the value for those.
- The deprecated native_transport_min_threads option has been removed from
cassandra.yaml.
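The note above about CAS and DROP COLUMN assumes cell timestamps expressed in microseconds since the epoch. A minimal Python sketch of producing such a value on the client side follows; the function names are illustrative only and do not come from any Cassandra driver.

```python
import time
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def now_micros() -> int:
    """Current wall-clock time as integer microseconds since the epoch."""
    return time.time_ns() // 1_000


def to_micros(dt: datetime) -> int:
    """Convert a timezone-aware datetime to microseconds since the epoch.

    Integer arithmetic is used throughout to avoid float rounding.
    """
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = dt - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


if __name__ == "__main__":
    print(now_micros())
    print(to_micros(datetime(2013, 9, 3, tzinfo=timezone.utc)))
```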
Operations
----------
- VNodes are enabled by default in cassandra.yaml. initial_token
for non-vnode deployments has been removed from the example
yaml, but is still respected if specified.
- Major compactions, cleanup, scrub, and upgradesstables will interrupt
any in-progress compactions (but not repair validations) when invoked.
- Disabling autocompactions by setting min/max compaction threshold to 0
has been deprecated; instead, use the nodetool commands 'disableautocompaction'
and 'enableautocompaction' or set the compaction strategy option enabled = false.
- ALTER TABLE DROP has been reenabled for CQL3 tables and has new semantics now.
See https://cassandra.apache.org/doc/cql3/CQL.html#alterTableStmt and
https://issues.apache.org/jira/browse/CASSANDRA-3919 for details.
- CAS uses gc_grace_seconds to determine how long to keep unused paxos
state around for, or a minimum of three hours.
- A new hints created metric is tracked per target, replacing countPendingHints
- After performance testing for CASSANDRA-5727, the default LCS filesize
has been changed from 5MB to 160MB.
- cqlsh DESCRIBE SCHEMA no longer outputs the schema of system_* keyspaces;
use DESCRIBE FULL SCHEMA if you need the schema of system_* keyspaces.
- CQL2 has been deprecated, and will be removed entirely in 2.2. See
CASSANDRA-5918 for details.
- Commit log archiver now assumes the client time stamp to be in microsecond
precision, during restore. Please refer to commitlog_archiving.properties.
Features
--------
- Lightweight transactions
(http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0)
- Alias support has been added to CQL3 SELECT statement. Refer to
CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html) for details.
- JEMalloc support (see memory_allocator in cassandra.yaml)
- Experimental triggers support. See examples/ for how to use. "Experimental"
means "tied closely to internal data structures; we plan to decouple this in
the future, which will probably break triggers written against this initial
API."
- Numerous improvements to CQL3 and a new version of the native protocol. See
http://www.datastax.com/dev/blog/cql-in-cassandra-2-0 for details.
1.2.11
======
Features
--------
- Added a new consistency level, LOCAL_ONE, that forces all CL.ONE operations to
execute only in the local datacenter.
- New replace_address to supplant the (now removed) replace_token and
replace_node workflows to replace a dead node in place. Works like the
old options, but takes the IP address of the node to be replaced.
1.2.9
=====
Features
--------
- A history of executed nodetool commands is now captured.
It can be found in ~/.cassandra/nodetool.history. Other tools output files
(cli and cqlsh history, .cqlshrc) are now centralized in ~/.cassandra, as well.
- A new sstablesplit utility allows splitting large sstables offline.
1.2.8
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.2.7 if you are upgrading
from a previous version.
1.2.7
=====
Upgrading
---------
- If you have decommissioned a node in the past 72 hours, it is imperative
that you not upgrade until such time has passed, or do a full cluster
restart (not rolling) before beginning the upgrade. This only applies to
decommission, not removetoken.
1.2.6
=====
Upgrading
---------
- hinted_handoff_throttle_in_kb is now reduced by a factor
proportional to the number of nodes in the cluster (see
https://issues.apache.org/jira/browse/CASSANDRA-5272).
- CQL3 syntax for CREATE CUSTOM INDEX has been updated. See CQL3
documentation for details.
1.2.5
=====
Features
--------
- Custom secondary index support has been added to CQL3. Refer to
CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for details and examples.
Upgrading
---------
- The native CQL transport is enabled by default on port 9042.
1.2.4
=====
Upgrading
---------
- 'nodetool upgradesstables' now only upgrades/rewrites sstables that are
not on the current version (which is usually what you want). Use the new
-a flag to recover the old behavior of rewriting all sstables.
Features
--------
- superuser setup delay (10 seconds) can now be overridden using
'cassandra.superuser_setup_delay_ms' property.
1.2.3
=====
Upgrading
---------
- CQL3 used to be case-insensitive for property map keys in ALTER and CREATE
statements. In other words:
CREATE KEYSPACE test WITH replication = { 'CLASS' : 'SimpleStrategy',
'REPLICATION_FACTOR' : '1' }
was allowed. However, this was inconsistent with the fact that string
literals are case sensitive everywhere else and, more importantly, it
broke NetworkTopologyStrategy, for which DC names are case sensitive. Property
map keys are now case sensitive, so the statement above should be
changed to:
CREATE KEYSPACE test WITH replication = { 'class' : 'SimpleStrategy',
'replication_factor' : '1' }
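For applications that build replication options programmatically, the change above can be handled by lowercasing only the well-known option names while leaving everything else (such as data center names) untouched. The following Python sketch is one hypothetical way to do that; the function name and the KNOWN_OPTIONS set are assumptions for illustration.

```python
# Hypothetical helper; only the two option names shown in this note are handled.
KNOWN_OPTIONS = {"class", "replication_factor"}


def normalize_replication(options: dict) -> dict:
    """Lowercase known option names; keep all other keys (e.g. DC names) as-is."""
    normalized = {}
    for key, value in options.items():
        lowered = key.lower()
        normalized[lowered if lowered in KNOWN_OPTIONS else key] = value
    return normalized


if __name__ == "__main__":
    old_style = {"CLASS": "SimpleStrategy", "REPLICATION_FACTOR": "1"}
    print(normalize_replication(old_style))
```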
1.2.2
=====
Upgrading
---------
- CQL3 type validation for constants has been fixed, which may require
fixing queries that were relying on the previous loose validation. Please
refer to the CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
and in particular the changelog section for more details. Please note in
particular that inputting blobs as string constants is now deprecated (in
favor of blob constants) and its support will be removed in a future
version.
Features
--------
- Built-in CQL3-based implementations of IAuthenticator (PasswordAuthenticator)
and IAuthorizer (CassandraAuthorizer) have been added. PasswordAuthenticator
stores usernames and hashed passwords in system_auth.credentials table;
CassandraAuthorizer stores permissions in system_auth.permissions table.
- system_auth keyspace is now alterable via ALTER KEYSPACE queries.
The default is SimpleStrategy with replication_factor of 1, but it's
advised to raise RF to at least 3 or 5, since CL.QUORUM is used for all
auth-related queries. It's also possible to change the strategy to NTS.
- Permissions caching with time-based expiration policy has been added to reduce
performance impact of authorization. Permission validity can be configured
using 'permissions_validity_in_ms' setting in cassandra.yaml. The default
is 2000 (2 seconds).
- SimpleAuthenticator and SimpleAuthorizer examples have been removed. Please
look at CassandraAuthorizer/PasswordAuthenticator instead.
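The permissions caching described above relies on time-based expiration. The following Python sketch shows the general technique only; it is not Cassandra's implementation. The class name, the loader callback, and the injectable clock are all assumptions made for illustration, with the default validity mirroring the 2000 ms value mentioned above.

```python
import time


class PermissionsCache:
    """Illustrative cache whose entries expire a fixed time after loading."""

    def __init__(self, loader, validity_ms=2000, clock=time.monotonic):
        self._loader = loader            # called on a miss or after expiry
        self._validity_s = validity_ms / 1000.0
        self._clock = clock              # injectable so tests can control time
        self._entries = {}               # (user, resource) -> (loaded_at, perms)

    def get(self, user, resource):
        key = (user, resource)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._validity_s:
            return hit[1]                # still valid: skip the expensive lookup
        perms = self._loader(user, resource)
        self._entries[key] = (now, perms)
        return perms


if __name__ == "__main__":
    cache = PermissionsCache(lambda user, resource: {"SELECT"})
    print(cache.get("alice", "ks.table"))
```

A shorter validity period picks up permission changes sooner at the cost of more lookups; a longer one does the opposite.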
1.2.1
=====
Upgrading
---------
- In CQL3, date strings are no longer accepted as timeuuid values, since a
date string is not a correct representation of a timeuuid. Instead, new
methods (minTimeuuid, maxTimeuuid, now, dateOf, unixTimestampOf) have been
introduced to make it easy to work with timeuuids from date strings. cqlsh also
no longer displays timeuuids as date strings (since this is a lossy
representation), but the new dateOf method can be used instead. Please
refer to the reference documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for more detail.
- For client implementors: CQL3 clients using the thrift interface should
have been using the new execute_cql3_query, prepare_cql3_query and
execute_prepared_cql3_query methods since 1.2.0. However, Cassandra 1.2.0 did
not complain if CQL3 was set through set_cql_version but the now CQL2-only
methods were used. It now does.
- Queries that use unrecognized or bad compaction or replication strategy
options are now refused (instead of simply logging a warning).
1.2
===
Upgrading
---------
- IAuthenticator interface has been updated to support dynamic
user creation, modification and removal. Users, even when stored
externally, now have to be explicitly created using
CREATE USER query first. AllowAllAuthenticator and SimpleAuthenticator
have been updated for the new interface, but you'll have to update
your old IAuthenticator implementations for 1.2. To ease this process,
a new abstract LegacyAuthenticator class has been added - subclass it
in your old IAuthenticator implementation and everything should just work
(this only affects users who implemented custom authenticators).
- IAuthority interface has been deprecated in favor of IAuthorizer.
AllowAllAuthority and SimpleAuthority have been renamed to
AllowAllAuthorizer and SimpleAuthorizer, respectively. In order to
simplify the upgrade to the new interface, a new abstract
LegacyAuthorizer has been added - you should subclass it in your
old IAuthority implementation and everything should just work
(this only affects users who implemented custom authorities).
'authority' setting in cassandra.yaml has been renamed to 'authorizer',
'authority' is no longer recognized. This affects all upgrading users.
- 1.2 is NOT network-compatible with versions older than 1.0. That
means if you want to do a rolling, zero-downtime upgrade, you'll need
to upgrade first to 1.0.x or 1.1.x, and then to 1.2. 1.2 retains
the ability to read data files from Cassandra versions at least
back to 0.6, so a non-rolling upgrade remains possible with just
one step.
- The default partitioner for new clusters is Murmur3Partitioner,
which is about 10% faster for index-intensive workloads. Partitioners
cannot be changed once data is in the cluster, however, so if you are
switching to the 1.2 cassandra.yaml, you should change this to
RandomPartitioner or whatever your old partitioner was.
- If you are using counters and upgrading from a version prior to
1.1.6, you should drain existing Cassandra nodes prior to the
upgrade to prevent overcount during commitlog replay (see
CASSANDRA-4782). For non-counter uses, drain is not required
but is a good practice to minimize restart time.
- Tables using LeveledCompactionStrategy will default to not
creating a row-level bloom filter. The default in older versions
of Cassandra differs; you should manually set the false positive
rate to 1.0 (to disable) or 0.01 (to enable, if you make many
requests for rows that do not exist).
- The hints schema was changed from 1.1 to 1.2. Cassandra automatically
snapshots and then truncates the hints column family as part of
starting up 1.2 for the first time. Additionally, upgraded nodes
will not store new hints destined for older (pre-1.2) nodes. It is
therefore recommended that you perform a cluster upgrade when all
nodes are up. Because hints will be lost, a cluster-wide repair (with
-pr) is recommended after upgrade of all nodes.
- The `nodetool removetoken` command (and the corresponding JMX operation)
has been renamed to `nodetool removenode`. This function is
incompatible with the earlier `nodetool removetoken`, and attempting to
remove nodes in this way in a mixed 1.1 (or lower) / 1.2 cluster
is not supported.
- The somewhat ill-conceived CollatingOrderPreservingPartitioner
has been removed. Use Murmur3Partitioner (recommended) or
ByteOrderedPartitioner instead.
- Global option hinted_handoff_throttle_delay_in_ms has been removed.
hinted_handoff_throttle_in_kb has been added instead.
- The default bloom filter fp chance has been increased to 1%.
This will save about 30% of the memory used by the old default.
Existing columnfamilies will retain their old setting.
- The default partitioner (for new clusters; the partitioner cannot be
changed in existing clusters) was changed from RandomPartitioner to
Murmur3Partitioner which provides faster hashing as well as improved
performance with secondary indexes.
- The default version of CQL (and cqlsh) is now CQL3. CQL2 is still
available but you will have to use the thrift set_cql_version method
(that is already supported in 1.1) to use CQL2. For cqlsh, you will need
to use 'cqlsh -2'.
- CQL3 is now considered final in this release. Compared to the beta
version that is part of 1.1, this final version has a few additions
(collections), but also some (incompatible) changes in the syntax for the
options of the create/alter keyspace/table statements. Typically, the
syntax to create a keyspace is now:
CREATE KEYSPACE ks WITH replication = { 'class' : 'SimpleStrategy',
'replication_factor' : 2 };
Also, the consistency level cannot be set in the language anymore, but is
at the protocol level.
Please refer to the CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for details.
- In CQL3, the DROP behavior from ALTER TABLE has currently been removed
(because it was not correctly implemented). We hope to add it back soon
(Cassandra 1.2.1 or 1.2.2)
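To give a feel for how the bloom filter false positive chance mentioned in the list above relates to memory use, the following Python sketch applies the textbook sizing formula for an optimally tuned Bloom filter, bits per key = -ln(p) / (ln 2)^2. This is the standard formula, not necessarily the exact calculation Cassandra performs, and the function name is illustrative.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Textbook bits-per-key for an optimal Bloom filter with the given
    false positive chance. Not Cassandra's exact sizing logic."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


if __name__ == "__main__":
    for p in (0.001, 0.01, 0.1):
        print(f"fp_chance={p}: {bits_per_key(p):.2f} bits per key")
```

Under this formula, a lower false positive chance always costs more memory per key, which is why raising the default reduces the filter's footprint.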
Features
--------
- Cassandra can now handle concurrent CREATE TABLE schema changes
as well as other updates
- rpc_timeout has been split up to allow finer-grained control
on timeouts for different operation types
- num_tokens can now be specified in cassandra.yaml. This defines the
number of tokens assigned to the host on the ring (default: 1).
Also specifying initial_token will override any num_tokens setting.
- disk_failure_policy allows blacklisting failed disks in JBOD
configuration instead of erroring out indefinitely
- event tracing can be configured per-connection ("trace_next_query")
or globally/probabilistically ("nodetool settraceprobability")
- Atomic batches are now supported server side, where Cassandra will
guarantee that (at the price of pre-writing the batch to another node
first), all mutations in the batch will be applied, even if the
coordinator fails mid-batch.
- new IAuthorizer interface has replaced the old IAuthority. IAuthorizer
allows dynamic permission management via new CQL3 statements:
GRANT, REVOKE, LIST PERMISSIONS. A native implementation storing
the permissions in Cassandra is being worked on and we expect to
include it in 1.2.1 or 1.2.2.
- IAuthenticator interface has been updated to support dynamic user
creation, modification and removal via new CQL3 statements:
CREATE USER, ALTER USER, DROP USER, LIST USERS. A native implementation
that stores users in Cassandra itself is being worked on and is expected to
become part of 1.2.1 or 1.2.2.
1.1.5
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
1.1.4
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
1.1.3
=====
Upgrading
---------
- Running "nodetool upgradesstables" after upgrading is recommended
if you use Counter columnfamilies.
Features
--------
- the cqlsh COPY command can now export to CSV flat files
- added a new tools/bin/token-generator to facilitate generating evenly distributed tokens
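As an illustration of what "evenly distributed tokens" means, the following Python sketch spaces tokens uniformly over a 0 to 2**127 range, which is the range commonly used with RandomPartitioner. It is not the code of tools/bin/token-generator, and the function name is an assumption.

```python
def even_tokens(node_count: int) -> list[int]:
    """Return node_count tokens spaced evenly over [0, 2**127).

    Assumes a RandomPartitioner-style token range; illustrative only.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * (2 ** 127) // node_count for i in range(node_count)]


if __name__ == "__main__":
    for node, token in enumerate(even_tokens(4)):
        print(f"node {node}: initial_token = {token}")
```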
1.1.2
=====
Upgrading
---------
- If you have column families using the LeveledCompactionStrategy, you should run scrub on those column families.
Features
--------
- cqlsh has a new COPY command to load data from CSV flat files
1.1.1
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
Features
--------
- Continuous commitlog archiving and point-in-time recovery.
See conf/commitlog_archiving.properties
- Incremental repair by token range, exposed over JMX
1.1
===
Upgrading
---------
- Compression is enabled by default on newly created ColumnFamilies
(and unchanged for ColumnFamilies created prior to upgrading).
- If you are running a multi datacenter setup, you should upgrade to
the latest 1.0.x (or 0.8.x) release before upgrading. Versions
0.8.8 and 1.0.3-1.0.5 generate cross-dc forwarding that is incompatible
with 1.1.
- EACH_QUORUM ConsistencyLevel is only supported for writes and will now
throw an InvalidRequestException when used for reads. (Previous
versions would silently perform a LOCAL_QUORUM read instead.)
- ANY ConsistencyLevel is only supported for writes and will now
throw an InvalidRequestException when used for reads. (Previous
versions would silently perform a ONE read for range queries;
single-row and multiget reads already rejected ANY.)
- The largest mutation batch accepted by the commitlog is now 128MB.
(In practice, batches larger than ~10MB always caused poor
performance due to load volatility and GC promotion failures.)
Larger batches will continue to be accepted but will not be
durable. Consider setting durable_writes=false if you really
want to use such large batches.
- Make sure that global settings: key_cache_{size_in_mb, save_period}
and row_cache_{size_in_mb, save_period} in conf/cassandra.yaml are
used instead of per-ColumnFamily options.
- JMX methods no longer return custom Cassandra objects. Any such methods
will now return standard Maps, Lists, etc.
- Hadoop input and output details are now separated. If you were
previously using methods such as getRpcPort you now need to use
getInputRpcPort or getOutputRpcPort depending on the circumstance.
- CQL changes:
+ Prior to 1.1, you could use KEY as the primary key name in some
select statements, even if the PK was actually given a different
name. In 1.1+ you must use the defined PK name.
- The sliced_buffer_size_in_kb option has been removed from the
cassandra.yaml config file (this option was a no-op since 1.0).
Features
--------
- Concurrent schema updates are now supported, with any conflicts
automatically resolved. Please note that simultaneously running
'CREATE COLUMN FAMILY' operations on different nodes will not
be safe until version 1.2 due to the nature of ColumnFamily
identifier generation; for more details see CASSANDRA-3794.
- The CQL language has undergone a major revision, CQL3, the
highlights of which are covered at [1]. CQL3 is not
backwards-compatible with CQL2, so we've introduced a
set_cql_version Thrift method to specify which version you want.
(The default remains CQL2 at least until Cassandra 1.2.) cqlsh
adds a --cql3 flag to enable this.
[1] http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
- Row-level isolation: multi-column updates to a single row have
always been *atomic* (either all will be applied, or none)
thanks to the CommitLog, but until 1.1 they were not *isolated*
-- a reader may see mixed old and new values while the update
happens.
- Finer-grained control over data directories, allowing a ColumnFamily to
be pinned to a specific volume, e.g. one backed by SSD.
- The bulk loader is no longer a fat client; it can be run from an
existing machine in a cluster.
- A new write survey mode has been added, similar to bootstrap (enabled via
-Dcassandra.write_survey=true), but the node will not automatically join
the cluster. This is useful for cases such as testing different
compaction strategies with live traffic without affecting the cluster.
- Key and row caches are now global, similar to the global memtable
threshold. Manual tuning of cache sizes per-columnfamily is no longer
required.
- Off-heap caches no longer require JNA, and will work out of the box
on Windows as well as Unix platforms.
- Streaming is now multithreaded.
- Compactions may now be aborted via JMX or nodetool.
- The stress tool is not new in 1.1, but it is newly included in
binary builds as well as the source tree
- Hadoop: a new BulkOutputFormat is included which will directly write
SSTables locally and then stream them into the cluster.
YOU SHOULD USE BulkOutputFormat BY DEFAULT. ColumnFamilyOutputFormat
is still around in case for some strange reason you want results
trickling out over Thrift, but BulkOutputFormat is significantly
more efficient.
- Hadoop: KeyRange.filter is now supported with ColumnFamilyInputFormat,
allowing index expressions to be evaluated server-side to reduce
the amount of data sent to Hadoop.
- Hadoop: ColumnFamilyRecordReader has a wide-row mode, enabled via
a boolean parameter to setInputColumnFamily, that pages through
data column-at-a-time instead of row-at-a-time.
- Pig: can use the wide-row Hadoop support, by setting PIG_WIDEROW_INPUT
to true. This will produce each row's columns in a bag.
1.0.8
=====
Upgrading
---------
- Nothing specific to 1.0.8
Other
-----
- Allow configuring socket timeout for streaming
1.0.7
=====
Upgrading
---------
- Nothing specific to 1.0.7; please refer to the instructions for 1.0.6.
Other
-----
- Adds new setstreamthroughput to nodetool to configure streaming
throttling
- Adds JMX property to get/set rpc_timeout_in_ms at runtime
- Allow configuring (per-CF) bloom_filter_fp_chance
1.0.6
=====
Upgrading
---------
- This release fixes an issue related to the chunk_length_kb option for
compressed sstables. If you use compression on some column families, it
is recommended after the upgrade to check the value for this option on
these column families (the default value is 64). If the option is
not set correctly, you should update the column family definition,
setting the right value, and then run scrub on the column family.
- Please refer to the instructions for 1.0.5 if coming from an older version.
1.0.5
=====
Upgrading
---------
- 1.0.5 fixes two important regressions in 1.0.4. All information
concerning 1.0.4 is valid for this release, but please avoid upgrading
to 1.0.4.
1.0.4
=====
Upgrading
---------
- Nothing specific to 1.0.4 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- A new upgradesstables command has been added to nodetool. It is very
similar to scrub but without the ability to discard corrupted rows (and
as a consequence it does not automatically snapshot beforehand). This new
command is to be preferred to scrub in all cases where sstables should be
rewritten to the current format for upgrade purposes.
JMX
---
- The paths for the data, commit log and saved cache directories are now
exposed through JMX
- The in-memory bloom filter sizes are now exposed through JMX
1.0.3
=====
Upgrading
---------
- Nothing specific to 1.0.3 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- For non-compressed sstables (compressed sstables already include more
fine-grained checksums), a sha1 for the full sstable is now automatically
created (in a file with suffix -Digest.sha1). It can be used to check the
sstable integrity with sha1sum.
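Where sha1sum is not available, the same check can be sketched in Python with the standard hashlib module. The sketch assumes the digest file starts with the hexadecimal SHA-1 value, as sha1sum output does; that layout and the function names are assumptions, so verify them against an actual -Digest.sha1 file before relying on this.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(data_path: str, digest_path: str) -> bool:
    """Compare a file's SHA-1 with the first token of a digest file.

    Assumption: the digest file begins with the hex digest, optionally
    followed by whitespace and a file name.
    """
    with open(digest_path, "r") as handle:
        tokens = handle.read().split()
    if not tokens:
        return False
    return tokens[0].lower() == sha1_of_file(data_path)
```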
1.0.2
=====
Upgrading
---------
- Nothing specific to 1.0.2 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- Cassandra CLI queries now have timing information
1.0.1
=====
Upgrading
---------
- If upgrading from a version prior to 1.0.0, please see the 1.0 Upgrading
section
- For running on Windows as a Service, procrun is no longer distributed
with Cassandra; see README.txt for more information on how to download
it if necessary.
- The names given to snapshot directories have been improved for human
readability. If you had scripts relying on them, you may need to update
those scripts.
1.0
===
Upgrading
---------
- Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling
restart, one node at a time. (0.8.0 or 0.8.1 are NOT network-compatible
with 1.0: upgrade to the most recent 0.8 release first.)
You do not need to bring down the whole cluster at once.
- After upgrading, run nodetool scrub against each node before running
repair, moving nodes, or adding new ones.
- CQL inserts/updates now generate microsecond resolution timestamps
by default, instead of millisecond. THIS MEANS A ROLLING UPGRADE COULD
MIX milliseconds and microseconds, with clients talking to servers
generating milliseconds unable to overwrite the larger microsecond
timestamps. If you are using CQL and this is important for your
application, you can either perform a non-rolling upgrade to 1.0, or
update your application first to use explicit timestamps with the "USING
timestamp=X" syntax.
- The BinaryMemtable bulk-load interface has been removed (use the
sstableloader tool instead).
- The compaction_thread_priority setting has been removed from
cassandra.yaml (use compaction_throughput_mb_per_sec to throttle
compaction instead).
- CQL types bytea and date were renamed to blob and timestamp, respectively,
to conform with SQL norms. CQL type int is now a 4-byte int, not 8
(which is still available as bigint).
- Cassandra 1.0 uses arena allocation to reduce old generation
fragmentation. This means there is a minimum overhead of 1MB
per ColumnFamily plus 1MB per index.
- The SimpleAuthenticator and SimpleAuthority classes have been moved to
the example directory (and are thus not available from the binary
distribution). They never provided actual security and in their current
state are only meant as examples.
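The timestamp-resolution warning in the list above can be made concrete with a small Python sketch. It models conflict resolution as "the write with the larger timestamp wins", which is a simplification for illustration (tie-breaking is ignored), and the values used are made up.

```python
def winning_write(a, b):
    """Each write is a (timestamp, value) pair; the larger timestamp wins.

    Simplified model for illustration; ties are resolved in favor of a.
    """
    return a if a[0] >= b[0] else b


# A write stamped in microseconds since the epoch.
micro_write = (1_317_000_000_000_000, "old value")
# A write issued one minute later, but stamped in milliseconds.
milli_write = (1_317_000_060_000, "new value")

if __name__ == "__main__":
    # The later millisecond-stamped write loses because its number is smaller.
    print(winning_write(micro_write, milli_write))
```

This is why mixing the two resolutions during a rolling upgrade can leave newer data unable to overwrite older data.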
Features
--------
- SSTable compression is supported through the 'compression_options'
parameter when creating/updating a column family. For instance, you can
create a column family Cf using compression (through the Snappy library)
in the CLI with:
create column family Cf with compression_options={sstable_compression: SnappyCompressor}
SSTable compression is not activated by default but can be activated or
deactivated at any time.
- Compressed SSTable blocks are checksummed to protect against bitrot
- New LevelDB-inspired compaction algorithm can be enabled by setting the
Columnfamily compaction_strategy=LeveledCompactionStrategy option.
Leveled compaction means you only need to keep a few MB of space free for
compaction instead of (in the worst case) 50%.
- Ability to use multiple threads during a single compaction. See
multithreaded_compaction in cassandra.yaml for more details.
- Windows Service ("cassandra.bat install" to enable)
- A dead node may be replaced in a single step by starting a new node
with -Dcassandra.replace_token=<token>. More details can be found at
http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
- It is now possible to repair only the first range returned by the
partitioner for a node with `nodetool repair -pr`. It makes it
easier/possible to repair a full cluster without any work duplication by
running this command on every node of the cluster.
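The reason `nodetool repair -pr` avoids duplicated work can be sketched as follows: if each node repairs only the range that ends at its own token, the ranges of all nodes together cover the ring exactly once. The Python below is a simplified model with one token per node and small made-up token values; the function name is an assumption.

```python
def primary_ranges(tokens):
    """Map each node token to its primary range (start, end].

    The range for a token starts just after the previous token on the
    sorted ring; the first token wraps around from the last one.
    """
    ring = sorted(tokens)
    return {token: (ring[i - 1], token) for i, token in enumerate(ring)}


if __name__ == "__main__":
    for token, (start, end) in primary_ranges([50, 0, 25, 75]).items():
        print(f"node at token {token}: repairs ({start}, {end}]")
```

Because every range ends where the next begins, running the command on every node repairs each range once rather than once per replica.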
New data types
--------------
- decimal
Other
-----
- Hinted Handoff has two major improvements:
- Hint replay is much more efficient thanks to a change in the data model
- Hints are created for all replicas that do not ack a write. (Formerly,
only replicas known to be down when the write started were hinted.)
This means that running with read repair completely off is much more
viable than before, and the default read_repair_chance is reduced from 1.0
("always repair") to 0.1 ("repair 10% of the time").
- The old per-ColumnFamily memtable thresholds
(memtable_throughput_in_mb, memtable_operations_in_millions,
memtable_flush_after_mins) are ignored, in favor of the global
memtable_total_space_in_mb and commitlog_total_space_in_mb settings.
This does not affect client compatibility -- the old options are
still allowed, but have no effect. These options may be removed
entirely in a future release.
- Backlogged compactions will begin five minutes after startup. The 0.8
behavior of never starting compaction until a flush happens is usually
not what is desired, but a short grace period is useful to allow caches
to warm up first.
- The deletion of compacted data files is not performed during Garbage
Collection anymore. This means compacted files will now be deleted
without delay.
0.8.5
=====
Features
--------
- SSTables copied to a data directory can be loaded by a live node through
nodetool refresh (may be handy to load snapshots).
- The configured compaction throughput is exposed through JMX.
Other
-----
- The sstableloader is now bundled with the debian package.
- Repair detects when a participating node is dead and fails instead of
hanging forever.
0.8.4
=====
Upgrading
---------
- Nothing specific to 0.8.4
Other
-----
- This release fixes a bug in counters that could lead to
(significant) over-counting.
- It also fixes a slight upgrade regression from 0.8.3. It is thus advised
to jump directly to 0.8.4 if upgrading from before 0.8.3.
0.8.3
=====
Upgrading
---------
- Token removal has been revamped. Removing tokens in a mixed cluster with
0.8.3 will not work, so the entire cluster will need to be running 0.8.3
first, except for the dead node.
Features
--------
- It is now possible to use thrift asynchronous and
half-synchronous/half-asynchronous servers (see cassandra.yaml for more
details).
- It is now possible to access counter columns through Hadoop.
Other
-----
- This release fixes a regression in 0.8 that can cause a commit log segment to
be deleted even though not all the data it contains has been flushed.
Upgrading from 0.8.* is very much encouraged.
0.8.2
=====
Upgrading
---------
- 0.8.0 and 0.8.1 shipped with a bug that was setting the
replicate_on_write option for counter column families to false (this
option has no effect on non-counter column families). This is an unsafe
default and 0.8.2 corrects this; the default for replicate_on_write is
now true. It is advised to update your counter column family definitions
if replicate_on_write was incorrectly set to false (before or after
upgrade).
0.8.1
=====
Upgrading
---------
- 0.8.1 is backwards compatible with 0.8, upgrade can be achieved by a
simple rolling restart.
- If upgrading from an earlier version (0.7), please refer to the 0.8 section
for instructions.
Features
--------
- Numerous additions/improvements to CQL (support for counters, TTL, batch
inserts/deletes, index dropping, ...).
- Add two new AbstractTypes (comparator) to support compound keys
(CompositeType and DynamicCompositeType), as well as a ReverseType to
reverse the order of any existing comparator.
- New option to bypass the commit log on some keyspaces (for advanced
users).
Tools
-----
- Add new data bulk loading utility (sstableloader).
0.8
===
Upgrading
---------
- Upgrading from version 0.7.1 or later can be done with a rolling
restart, one node at a time. You do not need to bring down the
whole cluster at once.
- After upgrading, run nodetool scrub against each node before running
repair, moving nodes, or adding new ones.
- Running nodetool drain before shutting down the 0.7 node is
recommended but not required. (Skipping this will result in
replay of entire commitlog, so it will take longer to restart but
is otherwise harmless.)
- 0.8 is fully API-compatible with 0.7. You can continue
to use your 0.7 clients.
- Avro record classes used in map/reduce and Hadoop streaming code have
been removed. Map/reduce can be switched to Thrift by changing
org.apache.cassandra.avro in import statements to
org.apache.cassandra.thrift (no class names change). Streaming support
has been removed for the time being.
- The loadbalance command has been removed from nodetool. For similar
behavior, decommission then rebootstrap with empty initial_token.
- Thrift unframed mode has been removed.
- The addition of key_validation_class means the cli will assume keys
are bytes, instead of strings, in the absence of other information.
See http://wiki.apache.org/cassandra/FAQ#cli_keys for more details.
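As a rough illustration of the Avro-to-Thrift import change described in the 0.8 upgrade notes above, the following Python sketch rewrites import lines in Java source text. The function name and the sample class names are illustrative assumptions and are not part of Cassandra; review any automated rewrite before committing it.

```python
import re

# Matches Java import lines that reference the removed Avro record package.
_AVRO_IMPORT = re.compile(
    r"^([ \t]*import[ \t]+(?:static[ \t]+)?)org\.apache\.cassandra\.avro\.",
    re.MULTILINE,
)


def switch_avro_imports_to_thrift(java_source: str) -> str:
    """Point org.apache.cassandra.avro imports at org.apache.cassandra.thrift.

    Only the package prefix changes; class names are left untouched, matching
    the note that no class names change.
    """
    return _AVRO_IMPORT.sub(r"\1org.apache.cassandra.thrift.", java_source)


# Illustrative input; the imported class names are examples only.
example = (
    "import org.apache.cassandra.avro.Column;\n"
    "import org.apache.cassandra.avro.Mutation;\n"
    "import org.apache.hadoop.mapreduce.Job;\n"
)
print(switch_avro_imports_to_thrift(example))
```

Imports from other packages, such as the Hadoop import in the sample, pass through unchanged.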
Features
--------
- added CQL client API and JDBC/DBAPI2-compliant drivers for Java and
Python, respectively (see: drivers/ subdirectory and doc/cql)
- added distributed Counters feature;
see http://wiki.apache.org/cassandra/Counters
- optional intranode encryption; see comments around 'encryption_options'
in cassandra.yaml
- compaction multithreading and rate-limiting; see
'concurrent_compactors' and 'compaction_throughput_mb_per_sec' in
cassandra.yaml
- cassandra will limit total memtable memory usage to 1/3 of the heap
by default. This can be adjusted or disabled with the
memtable_total_space_in_mb option. The old per-ColumnFamily
throughput, operations, and age settings are still respected but
will be removed in a future major release once we are satisfied that
memtable_total_space_in_mb works adequately.
Tools
-----
- stress and py_stress moved from contrib/ to tools/
- clustertool was removed (see
https://issues.apache.org/jira/browse/CASSANDRA-2607 for examples
of how to script nodetool across the cluster instead)
Other
-----
- In the past, sstable2json would write column names and values as
hex strings, and now creates human readable values based on the
comparator/validator. As a result, JSON dumps created with
older versions of sstable2json are no longer compatible with
json2sstable, and imports must be made with a configuration that
is identical to the export.
- manually-forced compactions ("nodetool compact") will do nothing
if only a single SSTable remains for a ColumnFamily. To force it
to compact that anyway (which will free up space if there are
a lot of expired tombstones), use the new forceUserDefinedCompaction
JMX method on CompactionManager.
- most of contrib/ (which was not part of the binary releases)
has been moved either to examples/ or tools/. We plan to move the
rest for 0.8.1.
JMX
---
- By default, JMX now listens on port 7199.
0.7.6
=====
Upgrading
---------
- Nothing specific to 0.7.6, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
0.7.5
=====
Upgrading
---------
- Nothing specific to 0.7.5, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
Changes
-------
- system_update_column_family no longer snapshots before applying
the schema change. (_update_keyspace never did. _drop_keyspace
and _drop_column_family continue to snapshot.)
- added memtable_flush_queue_size option to cassandra.yaml to
avoid blocking writes when multiple column families (or a column
family with indexes) are flushed at the same time.
- allow overriding initial_token, storage_port and rpc_port using
system properties
0.7.4
=====
Upgrading
---------
- Nothing specific to 0.7.4, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
Features
--------
- Output to Pig is now supported as well as input
0.7.3
=====
Upgrading
---------
- 0.7.1 and 0.7.2 shipped with a bug that caused incorrect row-level
bloom filters to be generated when compacting sstables generated
with earlier versions. This would manifest in IOExceptions during
column name-based queries. 0.7.3 provides "nodetool scrub" to
rebuild sstables with correct bloom filters, with no data lost.
(If your cluster was never on 0.7.0 or earlier, you don't have to
worry about this.) Note that nodetool scrub will snapshot your
data files before rebuilding, just in case.
0.7.1
=====
Upgrading
---------
- 0.7.1 is completely backwards compatible with 0.7.0. Just restart
each node with the new version, one at a time. (The cluster does
not all need to be upgraded simultaneously.)
Features
--------
- added flush_largest_memtables_at and reduce_cache_sizes_at options
to cassandra.yaml as an escape valve for memory pressure
- added option to specify -Dcassandra.join_ring=false on startup
to allow "warm spare" nodes or performing JMX maintenance before
joining the ring
Performance
-----------
- Disk writes and sequential scans avoid polluting page cache
(requires JNA to be enabled)
- Cassandra performs writes efficiently across datacenters by
sending a single copy of the mutation and having the recipient
forward that to other replicas in its datacenter.
- Improved network buffering
- Reduced lock contention on memtable flush
- Optimized supercolumn deserialization
- Zero-copy reads from mmapped sstable files
- Explicitly set higher JVM new generation size
- Reduced i/o contention during saving of caches
0.7.0
=====
Features
--------
- Secondary indexes (indexes on column values) are now supported
- Row size limit increased from 2GB to 2 billion columns. Rows
are no longer read into memory during compaction.
- Keyspace and ColumnFamily definitions may be added and modified live
- Streaming data for repair or node movement no longer requires
anticompaction step first
- NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for
use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC. See comments
in `cassandra.yaml`.
- Optional per-Column time-to-live field allows expiring data without
having to issue explicit remove commands
- `truncate` thrift method allows clearing an entire ColumnFamily at once
- Hadoop OutputFormat and Streaming [non-jvm map/reduce via stdin/out]
support
- Up to 8x faster reads from row cache
- A new ByteOrderedPartitioner supports bytes keys with arbitrary content,
and orders keys by their byte value. This should be used in new
deployments instead of OrderPreservingPartitioner.
- Optional round-robin scheduling between keyspaces for multitenant
clusters
- Dynamic endpoint snitch mitigates the impact of impaired nodes
- New `IntegerType`, faster than LongType and allows integers of
both less and more bits than Long's 64
- A revamped authentication system that decouples authorization and
allows finer-grained control of resources.
Upgrading
---------
The Thrift API has changed in incompatible ways; see below, and refer
to http://wiki.apache.org/cassandra/ClientOptions for a list of
higher-level clients that have been updated to support the 0.7 API.
The Cassandra inter-node protocol is incompatible with 0.6.x
releases (and with 0.7 beta1), meaning you will have to bring your
cluster down prior to upgrading: you cannot mix 0.6 and 0.7 nodes.
The hints schema was changed from 0.6 to 0.7. Cassandra automatically
snapshots and then truncates the hints column family as part of
starting up 0.7 for the first time.
Keyspace and ColumnFamily definitions are stored in the system
keyspace, rather than the configuration file.
The process to upgrade is:
1) run "nodetool drain" on _each_ 0.6 node. When drain finishes (log
message "Node is drained" appears), stop the process.
2) Convert your storage-conf.xml to the new cassandra.yaml using
"bin/config-converter".
3) Rename any of your keyspace or column family names that do not adhere
to the '^\w+' regex convention.
4) Start up your cluster with the 0.7 version.
5) Initialize your Keyspace and ColumnFamily definitions using
"bin/schematool <host> <jmxport> import". _You only need to do
this to one node_.
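Step 3 above can be checked ahead of time with a small script. This is a minimal sketch, assuming that the '^\w+' convention is meant to cover the whole name (so the check uses a full match) and that \w means ASCII letters, digits and underscore; the helper name and the sample names are hypothetical.

```python
import re

# Assumption: a legal name consists only of word characters [A-Za-z0-9_].
_LEGAL_NAME = re.compile(r"\w+", re.ASCII)


def illegal_names(names):
    """Return the keyspace or column family names that would need renaming."""
    return [name for name in names if not _LEGAL_NAME.fullmatch(name)]


# Hypothetical names taken from an old storage-conf.xml.
candidates = ["Keyspace1", "user-data", "Standard_1", "my.cf"]
for name in illegal_names(candidates):
    print(f"rename before upgrading: {name!r}")
```

Running the check before converting the configuration gives a list of names to fix while the 0.6 cluster is still available.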
Thrift API
----------
- The Cassandra server now defaults to framed mode, rather than
unframed. Unframed is obsolete and will be removed in the next
major release.
- The Cassandra Thrift interface file has been updated for Thrift 0.5.
If you are compiling your own client code from the interface, you
will need to upgrade the Thrift compiler to match.
- Row keys are now bytes: keys stored by versions prior to 0.7.0 will be
returned as UTF-8 encoded bytes. OrderPreservingPartitioner and
CollatingOrderPreservingPartitioner continue to expect that keys contain
UTF-8 encoded strings, but RandomPartitioner now works on any key data.
- keyspace parameters have been replaced with the per-connection
set_keyspace method.
- The return type for login() is now AccessLevel.
- The get_string_property() method has been removed.
- The get_string_list_property() method has been removed.
Configuration
-------------
- Configuration file renamed to cassandra.yaml and log4j.properties to
log4j-server.properties
- PropertyFileSnitch configuration file renamed to
cassandra-topology.properties
- The ThriftAddress and ThriftPort directives have been renamed to
RPCAddress and RPCPort respectively.
- EndPointSnitch was renamed to RackInferringSnitch. A new SimpleSnitch
has been added.
- RackUnawareStrategy and RackAwareStrategy have been renamed to
SimpleStrategy and OldNetworkTopologyStrategy, respectively.
- RowWarningThresholdInMB replaced with in_memory_compaction_limit_in_mb
- GCGraceSeconds is now per-ColumnFamily instead of global
- Keyspace and column family names that do not conform to a '^\w+' regex
are considered illegal.
- Keyspace and column family definitions will need to be loaded via
"bin/schematool <host> <jmxport> import". _You only need to do this to
one node_.
- In addition to an authenticator, an authority must be configured as
well. Users of SimpleAuthenticator should use SimpleAuthority for this
value (the default is AllowAllAuthority, which corresponds with
AllowAllAuthenticator).
- The format of access.properties has changed, see the sample configuration
conf/access.properties for documentation on the new format.
JMX
---
- StreamingService moved from o.a.c.streaming to o.a.c.service
- GMFD renamed to GOSSIP_STAGE
- {Min,Mean,Max}RowCompactedSize renamed to {Min,Mean,Max}RowSize
since it no longer has to wait til compaction to be computed
Other
-----
- If extending AbstractType, make sure you follow the singleton pattern
followed by Cassandra core AbstractType classes: provide a public
static final variable called 'instance'.
0.6.6
=====
Upgrading
---------
- As part of the cache-saving feature, a third directory
(along with data and commitlog) has been added to the config
file. You will need to set and create this directory
when restarting your node into 0.6.6.
0.6.1
=====
Upgrading
---------
- We try to keep minor versions 100% compatible (data format,
commitlog format, network format) within the major series, but
we introduced a network-level incompatibility in 0.6.1.
Thus, if you are upgrading from 0.6.0 to any higher version
(0.6.1, 0.6.2, etc.) then you will need to restart your entire
cluster with the new version, instead of being able to do a
rolling restart.
0.6.0
=====
Features
--------
- row caching: configure with the RowsCached attribute in
ColumnFamily definition
- Hadoop map/reduce support: see contrib/word_count for an example
- experimental authentication support, described under
Authenticator in storage.conf
Configuration
-------------
- MemtableSizeInMB has been replaced by MemtableThroughputInMB which
triggers a memtable flush when the specified amount of data has
been written, including overwrites.
- MemtableObjectCountInMillions has been replaced by the
MemtableOperationsInMillions directive which causes a memtable flush
to occur after the specified number of operations.
- Like MemtableSizeInMB, BinaryMemtableSizeInMB has been replaced by
BinaryMemtableThroughputInMB.
- Replication factor is now per-keyspace, rather than global.
- KeysCachedFraction is deprecated in favor of KeysCached
- RowWarningThresholdInMB added, to warn before very large rows
get big enough to threaten node stability
Thrift API
----------
- removed deprecated get_key_range method
- added batch_mutate method
- deprecated multiget and batch_insert methods in favor of
multiget_slice and batch_mutate, respectively
- added ConsistencyLevel.ANY, for when you want write
availability even when it may not be readable immediately.
Unlike CL.ZERO, though, it will throw an exception if
it cannot be written *somewhere*.
JMX metrics
-----------
- read and write statistics are reported as lifetime totals,
instead of averages over the last minute. Averages since the last
request are also available for convenience.
- cache hit rate statistics are now available from JMX under
org.apache.cassandra.db.Caches
- compaction JMX metrics are moved to
org.apache.cassandra.db.CompactionManager. PendingTasks is now
a much better estimate of compactions remaining, and the
progress of the current compaction has been added.
- commitlog JMX metrics are moved to org.apache.cassandra.db.Commitlog
- progress of data streaming during bootstrap, loadbalance, or other
data migration, is available under
org.apache.cassandra.streaming.StreamingService.
See http://wiki.apache.org/cassandra/Streaming for details.
Installation/Upgrade
--------------------
- 0.6 network traffic is not compatible with earlier versions. You
will need to shut down all your nodes at once, upgrade, then restart.
0.5.0
=====
0. The commitlog format has changed (but sstable format has not).
When upgrading from 0.4, empty the commitlog either by running
bin/nodeprobe flush on each machine and waiting for the flush to finish,
or simply remove the commitlog directory if you only have test data.
(If more writes come in after the flush command, starting 0.5 will error
out; if that happens, just go back to 0.4 and flush again.)
The format changed twice: from 0.4 to beta1, and from beta2 to RC1.
.5 The gossip protocol has changed, meaning 0.5 nodes cannot coexist
in a cluster of 0.4 nodes or vice versa; you must upgrade your
whole cluster at the same time.
1. Bootstrap, move, load balancing, and active repair have been added.
See http://wiki.apache.org/cassandra/Operations. When upgrading
from 0.4, leave autobootstrap set to false for the first restart
of your old nodes.
2. Performance improvements across the board, especially on the write
path (over 100% improvement in stress.py throughput).
3. Configuration:
- Added "comment" field to ColumnFamily definition.
- Added MemtableFlushAfterMinutes, a global replacement for the
old per-CF FlushPeriodInMinutes setting
- Key cache settings
4. Thrift:
- Added get_range_slice, deprecating get_key_range
0.4.2
=====
1. Improve default garbage collector options significantly --
throughput will be 30% higher or more.
0.4.1
=====
1. SnapshotBeforeCompaction configuration option allows snapshotting
before each compaction, which allows rolling back to any version
of the data.
0.4.0
=====
1. On-disk data format has changed to allow billions of keys/rows per
node instead of only millions. The new format is incompatible with 0.3;
see 0.3 notes below for how to import data from a 0.3 install.
2. Cassandra now supports multiple keyspaces. Typically you will have
one keyspace per application, allowing applications to be able to
create and modify ColumnFamilies at will without worrying about
collisions with others in the same cluster.
3. Many Thrift API changes and documentation. See
http://wiki.apache.org/cassandra/API
4. Removed the web interface in favor of JMX and bin/nodeprobe, which
has significantly enhanced functionality.
5. Renamed configuration "<Table>" to "<Keyspace>".
6. Added commitlog fsync; see "<CommitLogSync>" in configuration.
0.3.0
=====
1. With enough and large enough keys in a ColumnFamily, Cassandra will
run out of memory trying to perform compactions (data file merges).
The size of what is stored in memory is (S + 16) * (N + M) where S
is the size of the key (usually 2 bytes per character), N is the
number of keys, and M is the map overhead (which can be guesstimated
at around 32 bytes per key).
So, if you have 10-character keys and 1GB of headroom in your heap
space for compaction, you can expect to store about 17M keys
before running into problems.
See https://issues.apache.org/jira/browse/CASSANDRA-208
2. Because fixing #1 requires a data file format change, 0.4 will not
be binary-compatible with 0.3 data files. A client-side upgrade
can be done relatively easily with the following algorithm:
    for key in old_client.get_key_range(everything):
        columns = old_client.get_slice or get_slice_super(key, all columns)
        new_client.batch_insert or batch_insert_super(key, columns)
The inner loop can be trivially parallelized for speed.
3. Commitlog does not fsync before reporting a write successful.
Using blocking writes mitigates this to some degree, since all
nodes that were part of the write quorum would have to fail
before sync for data to be lost.
See https://issues.apache.org/jira/browse/CASSANDRA-182
Additionally, row size (that is, all the data associated with a single
key in a given ColumnFamily) is limited by available memory, because
compaction deserializes each row before merging.
See https://issues.apache.org/jira/browse/CASSANDRA-16
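The client-side upgrade loop in item 2 of the 0.3.0 notes could be sketched in Python as follows. The client class here is a hypothetical in-memory stand-in, and its method signatures are simplified assumptions that only mirror the names in the pseudocode; real Thrift clients take additional arguments. The thread pool illustrates the remark that the per-key work can be parallelized.

```python
from concurrent.futures import ThreadPoolExecutor


def migrate(old_client, new_client, keys, workers=8):
    """Copy every row from old_client to new_client, one key per task."""

    def copy_row(key):
        columns = old_client.get_slice(key)    # read all columns for this key
        new_client.batch_insert(key, columns)  # write them to the new cluster
        return key

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and re-raises any worker exception.
        return list(pool.map(copy_row, keys))


class InMemoryClient:
    """Hypothetical stand-in for a Thrift client; keeps rows in a dict."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})

    def get_key_range(self):
        return sorted(self.rows)

    def get_slice(self, key):
        return dict(self.rows[key])

    def batch_insert(self, key, columns):
        self.rows[key] = dict(columns)


old = InMemoryClient({"k1": {"a": "1"}, "k2": {"b": "2"}})
new = InMemoryClient()
migrated = migrate(old, new, old.get_key_range())
print(len(migrated), "rows copied")
```

Each key is independent, so the only shared state is the pair of clients; a real migration would also need retries and a way to resume after a failure.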
DSE 5.1.1 release notes
Release notes for DataStax Enterprise 5.1.1.
23 May 2017
- 5.1.1 components
- 5.1.1 highlights
- 5.1.1 changes and enhancements
- 5.1.1 resolved issues
- 5.1.1 CHANGES.txt
- 5.1.1 NEWS.txt
5.1.1 components
- Apache Cassandra™ 3.10.0.1695
- Apache Solr™ 6.0.1.0.1705
- Apache Tomcat® 8.0.43
- DataStax Spark Cassandra Connector 2.0.2
- DSEFS 5.1.26
- TinkerPop 3.2.5-20170321-f3032b39
- Certain Hadoop libraries
5.1.1 highlights
DSE Analytics and DSEFS highlights
DSE 5.1.1 improves the reliability of Spark workers reconnecting when the Spark Master moves to a different node, for example when the current Master node goes down. Although this scenario is rare, it may require running a command to restart the Spark workers. The affected versions are DSE 5.0.7 and 5.1.0. (DSP-11306)
DSE Graph highlights
- OLAP queries fail when meta-properties are used in the graph schema. (DSP-13016)
- Script compilation is synchronized so that multiple threads do not compile the same Gremlin script. In multi-threaded scenarios, Gremlin scripts hang. (DSP-12814)
DSE Search highlights
- Using the HTTP interface. (DSP-13318), (DSP-13270)
- Having Thrift column families that back active Solr cores. (DSP-13019)
- Using TTL to expire data. (DSP-12960)
- Using index encryption. (DSP-13155), (DSP-12620)
- Using live indexing. (DSP-12040), (DSP-12941)
5.1.1 changes and enhancements
5.1.1 DataStax Enterprise changes and enhancements
- A security fix was made with commons-collections4 version 4.1 because of CVE-2015-6420. (DSP-13060)
- Mapped memory accesses are now protected by assertions instead of causing a segmentation fault in the JVM. (DSP-13344)
5.1.1 DSE Advanced Replication changes and enhancements
- Improved the robustness of the CDC processor. (DSP-12852)
- Added a compression parameter for the audit log. (DSP-12949)
5.1.1 DSE Analytics changes and enhancements
- The Spark Cassandra Connector now creates sessions that are compatible with DseSession. (DSP-12737)
5.1.1 DSE Graph changes and enhancements
- Added explicit parameters to set the tmp dir for mapdb and netty. (DGL-167)
- Added support for reading directories recursively. (DGL-172)
- Removed the dual cluster clients from ClusterBuilder. Instead, use a single client and configure the CL in SimpleGraphStatement to create graphs. (DGL-183)
- Improved the performance of the VertexInputRDD.getOrCreateVertex method, which reduces Graph OLAP query run time by about 10%. (DSP-12782)
- The DseGraphFrames library is now included in com.datastax.dse:dse-spark-dependencies to support building applications. (DSP-13074)
5.1.1 DSEFS changes and enhancements
- The local node is preferred when placing new data blocks, to reduce the network bandwidth used by DSEFS. (DSP-12746)
5.1.1 DSE Search changes and enhancements
- The Solr demos were updated to create cores using the CQL index management feature. (DSP-11451)
- A runtime node blacklist is now maintained for distributed search queries, and a Blacklisted boolean attribute was added to the EndpointStateTracker MBean. (DSP-12965)
- The dsetool core_indexing_status --progress option now shows reindexing progress. (DSP-12617)
- Indexing of frozen sets and lists of native and user-defined (tuple/UDT) element types is now supported. Indexing of frozen maps is not supported. (DSP-12983)
5.1.1 resolved issues
5.1.1 DataStax Enterprise resolved issues
- dsetool logging scrubs credentials from the logs. (DSP-12985)
- Performance degradation because DseAuthenticator did not handle plain-text authentication correctly. (DSP-13201)
- During an upgrade to 5.1, the installer removes user directories under /etc/dse/conf. (DSP-13296)
- SafeNet/KMIP authentication through LDAP is not possible. (DSP-12739)
- Apache Ant Core 1.7.0 has the CVE-2012-2098 vulnerability. (DSP-12925)
5.1.1 DSE Advanced Replication resolved issues
- Errors occur during configuration updates. (DSP-13148)
- Advanced Replication mutations in transit are not encrypted when commit log encryption is enabled. (DSP-12961)
- MutationFileSource fails when a transfer file is not found. (DSP-11633)
- NPE in AdvRep channel status. (DSP-12522)
- AdvRep CLI metrics list output shows a negative message count. (DSP-12788)
- Multi-node advrep log count serializer undefined error. (DSP-13032)
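5.1.1 DSE Analytics resolved issues
- A Spark worker registers with the Master at startup, but does not register with the new Master when the Master later changes. (DSP-11306)
- New CQL type tinyint. (DSP-11940)
- If the DSE node that hosts the Spark Master is shut down gracefully at the same time an application is submitted or stopped, the Spark Master cannot save its recovery storage information. (DSP-12795)
- Not all data values are graphed on the weather sensor demo website. (DSP-13041)
- Unnecessary messages are displayed when the Spark shell starts. (DSP-13239)
- When the spark-submit --driver-class-path option is used, jars are not placed only on the driver classpath. (DSP-13289)
5.1.1 DSEFS resolved issues
- Memory leak in DSEFS. (DSP-13023)
- Spark cannot write files to the WebHDFS REST interface. (DSP-13154)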
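5.1.1 DSE Graph resolved issues
- Secondary indexes are now supported. (DGL-202)
- Rerunning DGL with custom IDs creates duplicate edges. (DGL-205)
- Properties that contain empty strings are skipped. A new -skip_blank_values option was added to the graph loader. (DGL-215)
- Tab-delimited data cannot be read correctly with File.text. (DGL-222)
- RangeStep fails with negative values. (DSP-11671)
- The DigestTokensManager logging level is lowered from INFO to DEBUG. (DSP-12234)
- The Decimal type does not work correctly for either reads or writes when a graph is read from Spark. (DSP-12299)
- Comparing the ID of a newly created element with a regular element causes a class cast exception. (DSP-12738)
- graph.allow_scan can now be set at the tx level. (DSP-12794)
- Improved handling of the ASM "method code too large" exception when processing large Gremlin scripts. (DSP-12802)
- Many threads stall while compiling the same script. (DSP-12814)
- New IDs specified for schema elements are now checked to ensure that they are not already in use. (DSP-12826)
- solr .within() queries are now correctly optimized. (DSP-12830)
- Invalid RDD data is created when a vertex property's meta-property is not defined in the schema. (DSP-13016)
- Case sensitivity of edges and meta-properties in OLAP. (DSP-13085)
- An exception is thrown when reading the ID of a vertex obtained through a full graph scan. (DSP-13210)
- Graph must set up the DSE system keyspaces before it starts listening for schema updates. (DSP-13251)
- DseGraphFrame fails when a UUID is used as a custom ID. (DSP-13302)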
5.1.1 DSE Search resolved issues
- The <dataDir> option was removed from the solrConfig files of the demo apps. (DSP-9402)
- CQL Search queries time out when a column contains a colon (:). The Solr field name policy applies to DSE Search field names. (DSP-11296)
- The TimeUUIDField epoch is now platform independent. (DSP-11424)
- Term vector (TV) file handles leak when an empty DWPT is discarded in an RT setup. (DSP-12040)
- The DistributedRequestException that is created has no detail message. (DSP-12493)
- BlockCache corruption occurs under high concurrency. (DSP-12620)
- Performance degrades when searching on UDT subfields. (DSP-12812)
- TTL logging was improved. (DSP-12885)
- Inconsistent RT term frequencies. (DSP-12941)
- TTL task schedules are not canceled. (DSP-12960)
- Cores cannot be reloaded after upgrading Thrift tables. (DSP-13019)
- Solr listens only on port 8080 regardless of configuration. (DSP-13187)
- Solr accepts HTTP requests before all cores are loaded. (DSP-13270)
- StatefulEncryptorAdapter is used excessively because the StatefulEncryptorAdapter cache is evicted when index outputs are closed. (DSP-13155)
- Tomcat was upgraded to 8.0.43 to fix CVE-2016-8735 and other security issues. (DSP-13318)
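Hadoop libraries
Built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0 and removed in DSE 5.1. Because Hadoop was removed in DSE 5.1 and later, Hadoop services that were previously included in DSE, such as the MapReduce JobTracker and TaskTracker, can no longer be started in DSE.
However, DSE continues to support built-in Spark (since DSE 4.5) and Bring-Your-Own-Spark (BYOS) (since DSE 5.0). Because Spark uses certain Hadoop libraries on the server and the client, DSE still ships the Hadoop libraries that Spark and BYOS need to operate.
To see the bundled Hadoop libraries, refer to "DataStax Enterprise 5.1.x third-party software".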
Package installations, Installer-Services installations | /etc/dse/dse.yaml
Tarball installations, Installer-No Services installations | installation_location/resources/dse/conf/dse.yaml
DSE 5.1.1 CHANGES.txt
List of the production-certified changes from Apache Cassandra™ 3.10.0 that are included in DataStax Enterprise 5.1.1.
DataStax Enterprise (DSE) 5.1.1 includes all changes from earlier DSE releases, plus the following production-certified changes from Apache Cassandra™ 3.10.0. These changes are listed in CHANGES.txt.
- Fix duplicate rows issue when using paging with SASI (CASSANDRA-13302)
- Allow filtering with CONTAINS on partition keys and their elements (CASSANDRA-13275)
- Even ranges are recalculated in clusters with vnodes when the token distribution is uneven (CASSANDRA-13229)
- Fix duration type validation to prevent overflow (CASSANDRA-13218)
- Forbid creation of unsupported SASI indexes on partition key columns (CASSANDRA-13228)
- Reject multiple values for a key in the CQL grammar (CASSANDRA-13369)
- UDA fails when there are no input rows (CASSANDRA-13399)
- Fix compaction-stress by using daemonInitialization (CASSANDRA-13188)
- V5 protocol flags decoding broken (CASSANDRA-13443)
- Use a write lock, not a read lock, for removing sstables from compaction strategies (CASSANDRA-13422)
- Use corePoolSize equal to maxPoolSize in JMXEnabledThreadPoolExecutors (CASSANDRA-13329)
- Avoid rebuilding SASI indexes that contain no values (CASSANDRA-12962)
- Add charset to the analyzer input stream (CASSANDRA-13151)
- Remove invalid characters from StandardTokenizerImpl.jflex (CASSANDRA-13417)
- Fix cqlsh automatic protocol downgrade regression (CASSANDRA-13307)
- Tracing payload not passed from QueryMessage to the tracing session (CASSANDRA-12835)
- Add storage port options to sstableloader (CASSANDRA-13518)
- Properly handle quoted index names in cqlsh DESCRIBE output (CASSANDRA-12847)
- Avoid reading static rows twice from old-format sstables (CASSANDRA-13236)
- Fix NPE in StorageService.excise() (CASSANDRA-13163)
- Expire OutboundTcpConnection messages by a single thread (CASSANDRA-13265)
- Fail repair if insufficient responses are received (CASSANDRA-13397)
- Fix SSTableLoader failure when the loaded table contains dropped columns (CASSANDRA-13276)
- Avoid name clashes in CassandraIndexTest (CASSANDRA-13427)
- Handle partially written hint files (CASSANDRA-12728)
- Interrupt replaying hints on decommission (CASSANDRA-13308)
- Fix NPE issue in StorageService (CASSANDRA-13060)
- Make reading of range tombstones more reliable (CASSANDRA-12811)
- Fix startup problems due to schema tables not being completely flushed (CASSANDRA-12213)
- Fix view builder bug that can filter out data on restart (CASSANDRA-13405)
- Fix 2i page size calculation when there are no regular columns (CASSANDRA-13400)
- Fix the conversion of 2.X expired rows without regular column data (CASSANDRA-13395)
- Fix hint delivery when using ext+internal IPs with prefer_local enabled (CASSANDRA-13020)
- Nodetool upgradesstables/scrub/compact ignores system tables (CASSANDRA-13410)
- Fix schema version calculation for rolling upgrades (CASSANDRA-13441)
- Avoid starting the gossiper in RemoveTest (CASSANDRA-13407)
- Fix weightedSize() for row-cache reported by JMX and NodeTool (CASSANDRA-13393)
- Fix JVM metric names (CASSANDRA-13103)
- Coalescing strategy sleeps too much (CASSANDRA-13090)
- Fix 2ndary index queries on partition keys for tables with static columns (CASSANDRA-13147)
- Fix ParseError unhashable type list in cqlsh copy from (CASSANDRA-13364)
DSE 5.1.1 NEWS.txt
General upgrading advice for DataStax Enterprise 5.1
GENERAL UPGRADING ADVICE FOR ANY VERSION
========================================
Snapshotting is fast (especially if you have JNA installed) and takes
effectively zero disk space until you start compacting the live data
files again. Thus, best practice is to ALWAYS snapshot before any
upgrade, just in case you need to roll back to the previous version.
(Cassandra version X + 1 will always be able to read data files created
by version X, but the inverse is not necessarily the case.)
When upgrading major versions of Cassandra, you will be unable to
restore snapshots created with the previous major version using the
'sstableloader' tool. You can upgrade the file format of your snapshots
using the provided 'sstableupgrade' tool.
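As a sketch of the snapshot-first practice described above, the following Python helper builds a nodetool snapshot command with a descriptive tag. The host name, the tag format and the helper name are assumptions for illustration; the command is only constructed here, not executed.

```python
from datetime import datetime, timezone


def snapshot_command(host, from_version, now=None):
    """Build (but do not run) a nodetool command that snapshots one node."""
    now = now or datetime.now(timezone.utc)
    # Hypothetical tag convention: record the version being upgraded from.
    tag = f"pre-upgrade-{from_version}-{now:%Y%m%dT%H%M%SZ}"
    return ["nodetool", "-h", host, "snapshot", "-t", tag]


# Hypothetical node names.
for node in ["node1.example.com", "node2.example.com"]:
    print(" ".join(snapshot_command(node, "3.10")))
```

On a machine where nodetool is installed, each list could be passed to subprocess.run(cmd, check=True) so that a failed snapshot stops the upgrade script.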
3.11.0
======
Upgrading
---------
- The NativeAccessMBean isAvailable method will only return true if the
native library has been successfully linked. Previously it was returning
true if JNA could be found but was not taking into account link failures.
- Primary ranges in the system.size_estimates table are now based on the keyspace
replication settings and adjacent ranges are no longer merged (CASSANDRA-9639).
- In 2.1, the default for otc_coalescing_strategy was 'DISABLED'.
In 2.2 and 3.0, it was changed to 'TIMEHORIZON', but that value was shown
to be a performance regression. The default for 3.11.0 and newer has
been reverted to 'DISABLED'. Users upgrading from Cassandra 2.2 or 3.0 should
be aware that the default has changed.
3.10
====
New features
------------
- New `DurationType` (cql duration). See CASSANDRA-11873
- Runtime modification of concurrent_compactors is now available via nodetool
- Support for the assignment operators +=/-= has been added for update queries.
- An Index implementation may now provide a task which runs prior to joining
the ring. See CASSANDRA-12039
- Filtering on partition key columns is now also supported for queries without
secondary indexes.
- A slow query log has been added: slow queries will be logged at DEBUG level.
For more details refer to CASSANDRA-12403 and slow_query_log_timeout_in_ms
in cassandra.yaml.
- Support for GROUP BY queries has been added.
- A new compaction-stress tool has been added to test the throughput of compaction
for any cassandra-stress user schema. See compaction-stress help for how to use.
- Compaction can now take into account overlapping tables that don't take part
in the compaction to look for deleted or overwritten data in the compacted tables.
When such data is found, it can be safely discarded, which in turn should enable
the removal of tombstones over that data.
The behavior can be engaged in two ways:
- as a "nodetool garbagecollect -g CELL/ROW" operation, which applies
single-table compaction on all sstables to discard deleted data in one step.
- as a "provide_overlapping_tombstones:CELL/ROW/NONE" compaction strategy flag,
which uses overlapping tables as a source of deletions/overwrites during all
compactions.
The argument specifies the granularity at which deleted data is to be found:
- If ROW is specified, only whole deleted rows (or sets of rows) will be
discarded.
- If CELL is specified, any columns whose value is overwritten or deleted
will also be discarded.
- NONE (default) specifies the old behavior, overlapping tables are not used to
decide when to discard data.
Which option to use depends on your workload, both ROW and CELL increase the
disk load on compaction (especially with the size-tiered compaction strategy),
with CELL being more resource-intensive. Both should lead to better read
performance if deleting rows (resp. overwriting or deleting cells) is common.
- Prepared statements are now persisted in the table prepared_statements in
the system keyspace. Upon startup, this table is used to preload all
previously prepared statements - i.e. in many cases clients do not need to
re-prepare statements against restarted nodes.
- cqlsh can now connect to older Cassandra versions by downgrading the native
protocol version. Please note that this is currently not part of our release
testing and, as a consequence, it is not guaranteed to work in all cases.
See CASSANDRA-12150 for more details.
- Snapshots that are automatically taken before a table is dropped or truncated
will have a "dropped" or "truncated" prefix on their snapshot tag name.
- Metrics are exposed for successful and failed authentication attempts.
These can be located using the object names org.apache.cassandra.metrics:type=Client,name=AuthSuccess
and org.apache.cassandra.metrics:type=Client,name=AuthFailure respectively.
- Add support to "unset" JSON fields in prepared statements by specifying DEFAULT UNSET.
See CASSANDRA-11424 for details
- Allow TTL with null value on insert and update. It will be treated as equivalent to inserting a 0.
- Removed outboundBindAny configuration property. See CASSANDRA-12673 for details.
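The two ways of engaging the overlapping-tombstone behavior listed in the new features above could be scripted roughly as follows. The helper names, keyspace and table are hypothetical, and the strings are only built, not executed; check the option spelling against the documentation for your version before using them.

```python
def garbagecollect_command(keyspace, table, granularity="ROW"):
    """Build the one-off 'nodetool garbagecollect -g CELL/ROW' invocation."""
    if granularity not in ("ROW", "CELL"):
        raise ValueError("granularity must be ROW or CELL")
    return ["nodetool", "garbagecollect", "-g", granularity, keyspace, table]


def alter_compaction_cql(keyspace, table, strategy_class, granularity):
    """Build CQL that sets provide_overlapping_tombstones for all compactions."""
    if granularity not in ("NONE", "ROW", "CELL"):
        raise ValueError("granularity must be NONE, ROW or CELL")
    return (
        f"ALTER TABLE {keyspace}.{table} WITH compaction = "
        f"{{'class': '{strategy_class}', "
        f"'provide_overlapping_tombstones': '{granularity}'}};"
    )


# Hypothetical keyspace and table.
print(" ".join(garbagecollect_command("shop", "orders", "CELL")))
print(alter_compaction_cql("shop", "orders", "SizeTieredCompactionStrategy", "ROW"))
```

The compaction map in ALTER TABLE has to name the strategy class, which is why the sketch takes it as a parameter rather than setting the flag on its own.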
Upgrading
---------
- Support for altering the types of columns in already-defined tables and of UDT fields has been disabled.
If it is necessary to return a different type, please use casting instead. See
CASSANDRA-12443 for more details.
- Specifying the default_time_to_live option when creating or altering a
materialized view was erroneously accepted (and ignored). It is now
properly rejected.
- Only Java and JavaScript are now supported UDF languages.
The sandbox in 3.0 already prevented the use of script languages except Java
and JavaScript.
- Compaction now correctly drops sstables out of CompactionTask when there
isn't enough disk space to perform the full compaction. This should reduce
pending compaction tasks on systems with little remaining disk space.
- Request timeouts in cassandra.yaml (read_request_timeout_in_ms, etc) now apply to the
"full" request time on the coordinator. Previously, they only covered the time from
when the coordinator sent a message to a replica until the time that the replica
responded. Additionally, the previous behavior was to reset the timeout when performing
a read repair, making a second read to fix a short read, and when subranges were read
as part of a range scan or secondary index query. In 3.10 and higher, the timeout
is no longer reset for these "subqueries". The entire request must complete within
the specified timeout. As a consequence, your timeouts may need to be adjusted
to account for this. See CASSANDRA-12256 for more details.
- Logs written to stdout are now consistent with logs written to files.
Time is now local (it was UTC on the console and local in files). Date, thread, file
and line info were added to stdout (see CASSANDRA-12004).
- The 'clientutil' jar, which has been somewhat broken on the 3.x branch, is no longer provided.
The features provided by that jar are available in any good Java driver, and we advise relying on drivers rather than on
that jar. If you need the jar for backward compatibility in the meantime, use the version provided
on a previous Cassandra branch, such as the 3.0 branch (by design, the functionality provided by that jar is stable
across versions, so using the 3.0 jar for a client connecting to 3.x should work without issues).
- (Tools development) DatabaseDescriptor no longer implicitly starts up components/services like
commit log replay. This may break existing 3rd party tools and clients. In order to start up
a standalone tool or client application, use the DatabaseDescriptor.toolInitialization() or
DatabaseDescriptor.clientInitialization() methods. Tool initialization sets up the partitioner,
snitch and encryption context. Client initialization just applies the configuration but does not
set up anything. Instead of using Config.setClientMode() or Config.isClientMode(), which are
deprecated now, use one of the appropriate new methods in DatabaseDescriptor.
- Application layer keep-alives were added to the streaming protocol to prevent idle incoming connections from
timing out and failing the stream session (CASSANDRA-11839). This effectively deprecates the streaming_socket_timeout_in_ms
property in favor of streaming_keep_alive_period_in_secs. See cassandra.yaml for more details about this property.
- Duration literals support the ISO 8601 format. As a consequence, identifiers matching that format
(e.g. P2Y or P1MT6H) are no longer supported (CASSANDRA-11873).
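The change to request timeouts described above (CASSANDRA-12256) can be illustrated with a small model. The following Python sketch is not Cassandra code: the function names and the list of per-step durations are hypothetical, and it only contrasts a timeout that is reset for each "subquery" with a single budget that must cover the whole request.

```python
def times_out_with_reset(timeout_ms, step_durations_ms):
    """Model of the old behavior: the timeout restarts for every subquery,
    so only an individual step that is too slow causes a timeout."""
    return any(step > timeout_ms for step in step_durations_ms)


def times_out_with_full_budget(timeout_ms, step_durations_ms):
    """Model of the 3.10+ behavior: all subqueries share one budget,
    so the sum of the steps must fit within the timeout."""
    return sum(step_durations_ms) > timeout_ms


# Hypothetical request: an initial read followed by a read repair,
# each taking 3000 ms, with a 5000 ms read timeout.
steps = [3000, 3000]
print(times_out_with_reset(5000, steps))        # False
print(times_out_with_full_budget(5000, steps))  # True
```

Under this model, a request that used to succeed because each step stayed under the limit can now time out, which is why the timeouts may need to be raised.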
3.8
===
New features
------------
- Shared pool threads are now named according to the stage they are executing
tasks for. Thread names mentioned in traced queries change accordingly.
- A new option has been added to cassandra-stress "-rate fixed={number}/s"
that forces a scheduled rate of operations/sec over time. Using this, stress can
accurately account for coordinated omission from the stress process.
- The cassandra-stress "-rate limit=" option has been renamed to "-rate throttle="
- hdr histograms have been added to stress runs; their output can be saved to disk using the
"-log hdrfile=" option. This histogram includes response/service/wait times when used with the
fixed or throttle rate options. The histogram file can be plotted on
http://hdrhistogram.github.io/HdrHistogram/plotFiles.html
- TimeWindowCompactionStrategy has been added. This has proven to be a better approach
to time series compaction and new tables should use this instead of DTCS. See
CASSANDRA-9666 for details.
- Change-Data-Capture is now available. See cassandra.yaml for cdc-specific flags and
a brief explanation of on-disk locations for archived data in CommitLog form. This can
be enabled via ALTER TABLE ... WITH cdc=true.
Upon flush, CommitLogSegments containing data for CDC-enabled tables are moved to
the data/cdc_raw directory until removed by the user and writes to CDC-enabled tables
will be rejected with a WriteTimeoutException once cdc_total_space_in_mb is reached
between unflushed CommitLogSegments and cdc_raw.
NOTE: CDC is disabled by default in the .yaml file. Do not enable CDC on a mixed-version
cluster as it will lead to exceptions which can interrupt traffic. Once all nodes
have been upgraded to 3.8 it is safe to enable this feature and restart the cluster.
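The response/service/wait times mentioned for the cassandra-stress fixed-rate option can be sketched with a small simulation. This is an illustrative model only, assuming a single worker and hypothetical names; it is not how cassandra-stress is implemented. Each operation has an intended start time on a fixed schedule, and measuring from that intended time (rather than from the actual send time) is what accounts for coordinated omission.

```python
def fixed_rate_latencies(interval_ms, service_times_ms):
    """Simulate one worker issuing operations on a fixed schedule.

    Returns one dict per operation with:
      wait     - delay between the intended and the actual start
      service  - time the operation itself took
      response - time from the intended start to completion
    """
    results = []
    worker_free_at = 0
    for i, service in enumerate(service_times_ms):
        intended_start = i * interval_ms
        # The worker cannot start before it finished the previous operation.
        actual_start = max(intended_start, worker_free_at)
        end = actual_start + service
        worker_free_at = end
        results.append({
            "wait": actual_start - intended_start,
            "service": service,
            "response": end - intended_start,
        })
    return results


# 10 ops/s (one every 100 ms); the second operation stalls for 350 ms.
for row in fixed_rate_latencies(100, [50, 350, 50, 50]):
    print(row)
```

In this run the third and fourth operations each take only 50 ms of service time, but their response times are 300 ms and 250 ms because they were delayed behind the stalled operation. A tool that records only service time would miss that delay.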
Upgrading
---------
- The ReversedType behaviour has been corrected for clustering columns of
BYTES type containing empty value. Scrub should be run on the existing
SSTables containing a descending clustering column of BYTES type to correct
their ordering. See CASSANDRA-12127 for more details.
- Ec2MultiRegionSnitch will no longer automatically set broadcast_rpc_address
to the public instance IP if this property is defined on cassandra.yaml.
- The names "json" and "distinct" are no longer valid as user-defined function
names (they are still valid as column names, however). In the unlikely case where
you had defined functions with such names, you will need to recreate
those under a different name, change your code to use the new names and
drop the old versions, all _before_ upgrading (see CASSANDRA-10783 for more
details).
Deprecation
-----------
- DateTieredCompactionStrategy has been deprecated - new tables should use
TimeWindowCompactionStrategy. Note that migrating an existing DTCS-table to TWCS might
cause increased compaction load for a while after the migration so make sure you run
tests before migrating. Read CASSANDRA-9666 for background on this.
3.7
===
Upgrading
---------
- A maximum size for SSTables values has been introduced, to prevent out of memory
exceptions when reading corrupt SSTables. This maximum size can be set via
max_value_size_in_mb in cassandra.yaml. The default is 256MB, which matches the default
value of native_transport_max_frame_size_in_mb. SSTables will be considered corrupt if
they contain values whose size exceeds this limit. See CASSANDRA-9530 for more details.
3.6
=====
New features
------------
- JMX connections can now use the same auth mechanisms as CQL clients. New options
in cassandra-env.(sh|ps1) enable JMX authentication and authorization to be delegated
to the IAuthenticator and IAuthorizer configured in cassandra.yaml. The default settings
still only expose JMX locally, and use the JVM's own security mechanisms when remote
connections are permitted. For more details on how to enable the new options, see the
comments in cassandra-env.sh. A new class of IResource, JMXResource, is provided for
the purposes of GRANT/REVOKE via CQL. See CASSANDRA-10091 for more details.
Also, directly setting JMX remote port via the com.sun.management.jmxremote.port system
property at startup is deprecated. See CASSANDRA-11725 for more details.
- JSON timestamps are now in UTC and contain the timezone information, see CASSANDRA-11137 for more details.
- Collision checks are performed when joining the token ring, regardless of whether
the node should bootstrap. Additionally, replace_address can legitimately be used
without bootstrapping to help with recovery of nodes with partially failed disks.
See CASSANDRA-10134 for more details.
- Key cache will only hold indexed entries up to the size configured by
column_index_cache_size_in_kb in cassandra.yaml in memory. Larger indexed entries
will never go into memory. See CASSANDRA-11206 for more details.
- For tables having a default_time_to_live, specifying a TTL of 0 will remove the TTL
from the inserted or updated values.
- Startup is now aborted if corrupted transaction log files are found. The details
of the affected log files are now logged, allowing the operator to decide how
to resolve the situation.
- Filtering expressions are made more pluggable and can be added programmatically via
a QueryHandler implementation. See CASSANDRA-11295 for more details.
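The interaction between a table's default_time_to_live and an explicit TTL of 0, noted in the list above, can be written down as a small decision function. This is a sketch of the rule as stated in these notes, not Cassandra's implementation; the function name and the NOT_SPECIFIED sentinel are hypothetical.

```python
# Sentinel meaning "the statement had no USING TTL clause" (hypothetical).
NOT_SPECIFIED = object()


def effective_ttl(default_time_to_live, requested_ttl=NOT_SPECIFIED):
    """Return the TTL in seconds applied to a written value,
    or None when the value does not expire."""
    if requested_ttl is NOT_SPECIFIED:
        ttl = default_time_to_live   # fall back to the table default
    else:
        ttl = requested_ttl          # an explicit TTL overrides the default
    # A TTL of 0 means "no TTL", whether it is explicit or the table default.
    return ttl if ttl > 0 else None


print(effective_ttl(3600))       # 3600: table default applies
print(effective_ttl(3600, 60))   # 60: explicit TTL wins
print(effective_ttl(3600, 0))    # None: TTL 0 removes the default TTL
```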
3.4
===
New features
------------
- Internal authentication now supports caching of encrypted credentials.
Reference cassandra.yaml:credentials_validity_in_ms
- Remote configuration of auth caches via JMX can be disabled using
the system property cassandra.disable_auth_caches_remote_configuration
- The sstabledump tool has been added as the 3.0 version of the former sstable2json. The tool only
supports v3.0+ SSTables. See the tool's help for more detail.
Upgrading
---------
- Nothing specific to 3.4 but please see previous versions upgrading section,
especially if you are upgrading from 2.2.
Deprecation
-----------
- The mbean interfaces org.apache.cassandra.auth.PermissionsCacheMBean and
org.apache.cassandra.auth.RolesCacheMBean are deprecated in favor of
org.apache.cassandra.auth.AuthCacheMBean. This generalized interface is
common across all caches in the auth subsystem. The specific mbean interfaces
for each individual cache will be removed in a subsequent major version.
3.2
===
New features
------------
- We now make sure that a token does not exist in several data directories. This
means that we run one compaction strategy per data_file_directory and we use
one thread per directory to flush. Use nodetool relocatesstables to make sure your
tokens are in the correct place, or just wait and compaction will handle it. See
CASSANDRA-6696 for more details.
- The maximum in-flight commit log replay mutation bytes are now bounded to 64 megabytes,
tunable via cassandra.commitlog_max_outstanding_replay_bytes
- Support for type casting has been added to the selection clause.
- Hinted handoff now supports compression. Reference cassandra.yaml:hints_compression.
Note: hints compression is currently disabled by default.
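The idea that a token lives in exactly one data directory can be illustrated by partitioning a token range across directories. The sketch below is a deliberate simplification: it splits the full Murmur3 token range evenly, whereas Cassandra computes the boundaries from the ranges a node actually owns. The function name and directory paths are hypothetical.

```python
# Murmur3Partitioner tokens are signed 64-bit integers.
MIN_TOKEN = -2**63
TOKEN_SPAN = 2**64


def directory_for_token(token, data_file_directories):
    """Map a token to exactly one directory by splitting the token
    range into equally sized, non-overlapping slices."""
    if not MIN_TOKEN <= token < MIN_TOKEN + TOKEN_SPAN:
        raise ValueError("token out of range")
    index = (token - MIN_TOKEN) * len(data_file_directories) // TOKEN_SPAN
    return data_file_directories[index]


dirs = ["/data1", "/data2"]  # hypothetical data_file_directories entries
print(directory_for_token(-1, dirs))  # /data1
print(directory_for_token(0, dirs))   # /data2
```

Because every token maps to a single directory, compaction can run independently per directory, which is the property the first item in the list above relies on.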
Upgrading
---------
- The compression ratio metrics computation has been modified to be more accurate.
- Running Cassandra as root is prevented by default.
- JVM options are moved from cassandra-env.(sh|ps1) to jvm.options file
Deprecation
-----------
- The Thrift API is deprecated and will be removed in Cassandra 4.0.
3.1
=====
Upgrading
---------
- The return value of SelectStatement::getLimit has been changed from DataLimits
to int.
- Custom index implementations should be aware that the method Indexer::indexes()
has been removed, as its contract was misleading and all custom implementations
should almost surely have returned true unconditionally for that method.
- GC logging is now enabled by default (you can disable it in the jvm.options
file if you prefer).
3.0
===
New features
------------
- EACH_QUORUM is now a supported consistency level for read requests.
- Support for IN restrictions on any partition key component or clustering key
as well as support for EQ and IN multicolumn restrictions has been added to
UPDATE and DELETE statement.
- Support for single-column and multi-column slice restrictions (>, >=, <= and <)
has been added to DELETE statements
- nodetool rebuild_index accepts the index argument without
the redundant table name
- Materialized Views, which allow for server-side denormalization, are now
available. Materialized views provide an alternative to secondary indexes
for non-primary key queries, and perform much better for indexing high
cardinality columns.
See http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
- Hinted handoff has been completely rewritten. Hints are now stored in flat
files, with less overhead for storage and more efficient dispatch.
See CASSANDRA-6230 for full details.
- Option to not purge unrepaired tombstones. To avoid users having data resurrected
if repair has not been run within gc_grace_seconds, an option has been added to
only allow tombstones from repaired sstables to be purged. To enable, set the
compaction option 'only_purge_repaired_tombstones':true but keep in mind that if
you do not run repair for a long time, you will keep all tombstones around which
can cause other problems.
- Enabled warning on GC taking longer than 1000ms. See
cassandra.yaml:gc_warn_threshold_in_ms
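The 'only_purge_repaired_tombstones' option described above changes when a tombstone may be dropped during compaction. The following sketch captures just that rule; it is a simplified model with hypothetical names and ignores the other checks compaction performs (for example, overlap with other sstables).

```python
def tombstone_is_purgeable(deletion_time_s, now_s, gc_grace_seconds,
                           sstable_is_repaired,
                           only_purge_repaired_tombstones=False):
    """Decide whether a tombstone may be purged under the simplified rule."""
    past_gc_grace = deletion_time_s + gc_grace_seconds < now_s
    if not past_gc_grace:
        return False  # never purge before gc_grace_seconds has elapsed
    if only_purge_repaired_tombstones:
        # With the option enabled, only repaired sstables may drop tombstones.
        return sstable_is_repaired
    return True


ten_days = 10 * 24 * 3600
now = 100 * 24 * 3600  # arbitrary "current time" in seconds
old = now - 2 * ten_days  # tombstone written 20 days ago

print(tombstone_is_purgeable(old, now, ten_days, sstable_is_repaired=False))
print(tombstone_is_purgeable(old, now, ten_days, sstable_is_repaired=False,
                             only_purge_repaired_tombstones=True))
```

The second call returns False: with the option enabled, an unrepaired sstable keeps its tombstones indefinitely, which is the trade-off the note warns about when repair is not run for a long time.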
Upgrading
---------
- Clients must use the native protocol version 3 when upgrading from 2.2.X as
the native protocol version 4 is not compatible between 2.2.X and 3.Y. See
https://www.mail-archive.com/user@cassandra.apache.org/msg45381.html for details.
- A new argument of type InetAddress has been added to IAuthenticator::newSaslNegotiator,
representing the IP address of the client attempting authentication. It will be a breaking
change for any custom implementations.
- token-generator tool has been removed.
- Upgrade to 3.0 is supported from Cassandra 2.1 versions greater or equal to 2.1.9,
or Cassandra 2.2 versions greater or equal to 2.2.2. Upgrade from Cassandra 2.0 and
older versions is not supported.
- The 'memtable_allocation_type: offheap_objects' option has been removed. It should
be re-introduced in a future release and you can follow CASSANDRA-9472 to know more.
- Configuration parameter memory_allocator in cassandra.yaml has been removed.
- The native protocol versions 1 and 2 are not supported anymore.
- Max mutation size is now configurable via max_mutation_size_in_kb setting in
cassandra.yaml; the default is half of commitlog_segment_size_in_mb * 1024.
- 3.0 requires Java 8u40 or later.
- Garbage collection options were moved from cassandra-env to jvm.options file.
- New transaction log files have been introduced to replace the compactions_in_progress
system table, temporary file markers (tmp and tmplink) and sstable ancestors.
Therefore, compaction metadata no longer contains ancestors. Transaction log files
list sstable descriptors involved in compactions and other operations such as flushing
and streaming. Use the sstableutil tool to list any sstable files currently involved
in operations not yet completed, which previously would have been marked as temporary.
A transaction log file contains one sstable per line, with the prefix "add:" or "remove:".
They also contain a special line "commit", only inserted at the end when the transaction
is committed. On startup we use these files to cleanup any partial transactions that were
in progress when the process exited. If the commit line is found, we keep new sstables
(those with the "add" prefix) and delete the old sstables (those with the "remove" prefix),
vice-versa if the commit line is missing. Should you lose or delete these log files,
both old and new sstable files will be kept as live files, which will result in duplicated
sstables. These files are protected by incremental checksums so you should not manually
edit them. When restoring a full backup or moving sstable files, you should clean-up
any left over transactions and their temporary files first. You can use this command:
===> sstableutil -c ks table
See CASSANDRA-7066 for full details.
- New write stages have been added for batchlog and materialized view mutations;
you can set their size in cassandra.yaml.
- User defined functions are now executed in a sandbox.
To use UDFs and UDAs, you have to enable them in cassandra.yaml.
- New SSTable version 'la' with improved bloom-filter false-positive handling
compared to previous version 'ka' used in 2.2 and 2.1. Running sstableupgrade
is not necessary but recommended.
- Before upgrading to 3.0, make sure that your cluster is in complete agreement
(schema versions outputted by `nodetool describecluster` are all the same).
- Schema metadata is now stored in the new `system_schema` keyspace, and
legacy `system.schema_*` tables are now gone; see CASSANDRA-6717 for details.
- Pig's support has been removed.
- Hadoop BulkOutputFormat and BulkRecordWriter have been removed; use
CqlBulkOutputFormat and CqlBulkRecordWriter instead.
- Hadoop ColumnFamilyInputFormat and ColumnFamilyOutputFormat have been removed;
use CqlInputFormat and CqlOutputFormat instead.
- Hadoop ColumnFamilyRecordReader and ColumnFamilyRecordWriter have been removed;
use CqlRecordReader and CqlRecordWriter instead.
- hinted_handoff_enabled in cassandra.yaml no longer supports a list of data centers.
To specify a list of excluded data centers when hinted_handoff_enabled is set to true,
use hinted_handoff_disabled_datacenters, see CASSANDRA-9035 for details.
- The `sstable_compression` and `chunk_length_kb` compression options have been deprecated.
The new options are `class` and `chunk_length_in_kb`. Disabling compression should now
be done by setting the new option `enabled` to `false`.
- The compression option `crc_check_chance` became a top-level table option, but is currently
enforced only against tables with enabled compression.
- Only map syntax is now allowed for caching options. ALL/NONE/KEYS_ONLY/ROWS_ONLY syntax
has been deprecated since 2.1.0 and is being removed in 3.0.0.
- The 'index_interval' option for 'CREATE TABLE' statements, which has been deprecated
since 2.1 and replaced with the 'min_index_interval' and 'max_index_interval' options,
has now been removed.
- Batchlog entries are now stored in a new table - system.batches.
The old one has been deprecated.
- JMX methods set/getCompactionStrategyClass have been removed, use
set/getCompactionParameters or set/getCompactionParametersJson instead.
- SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
- The secondary index API has been comprehensively reworked. This will be a breaking
change for any custom index implementations, which should now look to implement
the new org.apache.cassandra.index.Index interface. New syntax has been added to create
and query row-based indexes, which are not explicitly linked to a single column in the
base table.
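The startup cleanup rule for transaction log files described above (keep "add" entries and delete "remove" entries when a commit line is present, and the reverse otherwise) can be sketched as follows. The line format used here is a simplified stand-in based only on the description in these notes; real log files also carry checksums and should not be parsed or edited by hand. The function name is hypothetical.

```python
def resolve_transaction_log(lines):
    """Return which sstables to keep and which to delete for one
    transaction log, following the commit / no-commit rule."""
    added, removed, committed = [], [], False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line == "commit":
            committed = True
        elif line.startswith("add:"):
            added.append(line[len("add:"):])
        elif line.startswith("remove:"):
            removed.append(line[len("remove:"):])
        else:
            raise ValueError("unrecognized line: " + line)
    if committed:
        # Transaction finished: new sstables are live, old ones are obsolete.
        return {"keep": added, "delete": removed}
    # Transaction did not finish: roll back to the old sstables.
    return {"keep": removed, "delete": added}


# Hypothetical compaction of two sstables into one, interrupted before commit.
log = ["add:new-3", "remove:old-1", "remove:old-2"]
print(resolve_transaction_log(log))
print(resolve_transaction_log(log + ["commit"]))
```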
2.2.4
=====
Deprecation
-----------
- Pig support has been deprecated, and will be removed in 3.0.
Please see CASSANDRA-10542 for more details.
- Configuration parameter memory_allocator in cassandra.yaml has been deprecated
and will be removed in 3.0.0. As mentioned below for 2.2.0, jemalloc is
automatically preloaded on Unix platforms.
Operations
----------
- Switching data center or racks is no longer an allowed operation on a node
which has data. Instead, the node will need to be decommissioned and
rebootstrapped. If moving from the SimpleSnitch, make sure that the data
center and rack containing all current nodes is named "datacenter1" and
"rack1". To override this behaviour use -Dcassandra.ignore_rack=true and/or
-Dcassandra.ignore_dc=true.
- Reloading the configuration file of GossipingPropertyFileSnitch has been disabled.
Upgrading
---------
- The default for the inter-DC stream throughput setting
(inter_dc_stream_throughput_outbound_megabits_per_sec in cassandra.yaml) is
the same as the intra-DC one (200Mbps) instead of being unlimited.
Having it unlimited was never intended and was a bug.
New features
------------
- Time windows in DTCS are now limited to 1 day by default to be able to
handle bootstrap and repair in a better way. To get the old behaviour,
increase max_window_size_seconds.
- DTCS option max_sstable_age_days is now deprecated and defaults to 1000 days.
- Native protocol server now allows both SSL and non-SSL connections on
the same port.
2.2.3
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.2 if you are upgrading
from a previous version.
2.2.2
=====
Changed Defaults
----------------
- commitlog_total_space_in_mb will use the smaller of 8192, and 1/4
of the total space of the commitlog volume. (Before: always used
8192)
- The following INFO logs were reduced to DEBUG level and will now show
on debug.log instead of system.log:
- Memtable flushing actions
- Commit log replayed files
- Compacted sstables
- SStable opening (SSTableReader)
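The new commitlog_total_space_in_mb default described above is a simple minimum of two values. A minimal sketch, assuming sizes are expressed in whole megabytes; the function names are hypothetical, and the second helper reads the volume size with Python's standard shutil.disk_usage.

```python
import shutil


def default_commitlog_total_space_mb(volume_total_mb):
    """Smaller of 8192 MB and one quarter of the commit log volume."""
    return min(8192, volume_total_mb // 4)


def default_for_path(commitlog_directory):
    """Apply the same rule to the volume that holds the given directory."""
    total_bytes = shutil.disk_usage(commitlog_directory).total
    return default_commitlog_total_space_mb(total_bytes // (1024 * 1024))


print(default_commitlog_total_space_mb(20000))   # 5000: small volume
print(default_commitlog_total_space_mb(100000))  # 8192: capped
```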
New features
------------
- Custom QueryHandlers can retrieve the column specifications for the bound
variables from QueryOptions by using the hasColumnSpecifications()
and getColumnSpecifications() methods.
- A new default asynchronous log appender debug.log was created in addition
to the system.log appender in order to provide more detailed log debugging.
In order to disable debug logging, you must comment-out the ASYNCDEBUGLOG
appender on conf/logback.xml. See CASSANDRA-10241 for more information.
2.2.1
=====
New features
------------
- COUNT(*) and COUNT(1) can be selected with other columns or functions
2.2
===
Upgrading
---------
- The authentication & authorization subsystems have been redesigned to
support role based access control (RBAC), resulting in a change to the
schema of the system_auth keyspace. See below for more detail.
For systems already using the internal auth implementations, the process
for converting existing data during a rolling upgrade is straightforward.
As each node is restarted, it will attempt to convert any data in the
legacy tables into the new schema. Until enough nodes to satisfy the
replication strategy for the system_auth keyspace are upgraded and so have
the new schema, this conversion will fail with the failure being reported
in the system log.
During the upgrade, Cassandra's internal auth classes will continue to use
the legacy tables, so clients experience no disruption. Issuing DCL
statements during an upgrade is not supported.
Once all nodes are upgraded, an operator with superuser privileges should
drop the legacy tables, system_auth.users, system_auth.credentials and
system_auth.permissions. Doing so will prompt Cassandra to switch over to
the new tables without requiring any further intervention.
While the legacy tables are present a restarted node will re-run the data
conversion and report the outcome so that operators can verify that it is
safe to drop them.
New features
------------
- The LIMIT clause now applies only to the number of rows returned to the user,
not to the number of rows queried. As a consequence, queries using aggregates are no longer
impacted by the LIMIT clause.
- Very large batches will now be rejected (defaults to 50kb). This
can be customized by modifying batch_size_fail_threshold_in_kb.
- Selecting columns, scalar functions, UDT fields, writetime or ttl together
with aggregates is now possible. The value returned for the columns,
scalar functions, UDT fields, writetime and ttl will be the ones for
the first row matching the query.
- Windows is now a supported platform. Powershell execution for startup scripts
is highly recommended and can be enabled via an administrator command-prompt
with: 'powershell set-executionpolicy unrestricted'
- It is now possible to do major compactions when using leveled compaction.
Doing that will take all sstables and compact them out in levels. The
levels will be non overlapping so doing this will still not be something
you want to do very often since it might cause more compactions for a while.
It is also possible to split output when doing a major compaction with
STCS - files will be split in sizes 50%, 25%, 12.5% etc of the total size.
This might be a bit better than old major compactions which created one big
file on disk.
- A new tool has been added bin/sstableverify that checks for errors/bitrot
in all sstables. Unlike scrub, this is a non-invasive tool.
- Authentication & Authorization APIs have been updated to introduce
roles. Roles and Permissions granted to them are inherited, supporting
role based access control. The role concept supersedes that of users
and CQL constructs such as CREATE USER are deprecated but retained for
compatibility. The requirement to explicitly create Roles in Cassandra
even when auth is handled by an external system has been removed, so
authentication & authorization can be delegated to such systems in their
entirety.
- In addition to the above, Roles are also first class resources and can be the
subject of permissions. Users (roles) can now be granted permissions on other
roles, including CREATE, ALTER, DROP & AUTHORIZE, which removes the need for
superuser privileges in order to perform user/role management operations.
- Creators of database resources (Keyspaces, Tables, Roles) are now automatically
granted all permissions on them (if the IAuthorizer implementation supports
this).
- SSTable file name is changed. Now you don't have Keyspace/CF name
in file name. Also, secondary index has its own directory under parent's
directory.
- Support for user-defined functions and user-defined aggregates has
been added to CQL.
************************************************************************
IMPORTANT NOTE: user-defined functions can be used to execute
arbitrary and possibly evil code in Cassandra 2.2, and are
therefore disabled by default. To enable UDFs edit
cassandra.yaml and set enable_user_defined_functions to true.
CASSANDRA-9402 will add a security manager for UDFs in Cassandra
3.0. This will inherently be backwards-incompatible with any 2.2
UDF that perform insecure operations such as opening a socket or
writing to the filesystem.
************************************************************************
- Row-cache is now fully off-heap.
- jemalloc is now automatically preloaded and used on Linux and OS-X if
installed.
- Please ensure on Unix platforms that there is no libjnadispath.so
installed which is accessible by Cassandra. Old versions of
libjna packages (< 4.0.0) will cause problems - e.g. Debian Wheezy
contains libjna version 3.2.x.
- The node now stays up when streaming fails during bootstrapping. You can
use the new `nodetool bootstrap resume` command to continue streaming after resolving
an issue.
- Protocol version 4 specifies that bind variables do not require having a
value when executing a statement. Bind variables without a value are
called 'unset'. The 'unset' bind variable is serialized as the int
value '-2' without following bytes.
In an EXECUTE or BATCH request an unset bind value does not modify the value and
does not create a tombstone, an unset bind ttl is treated as 'unlimited',
an unset bind timestamp is treated as 'now', an unset bind counter operation
does not change the counter value.
Unset tuple field, UDT field and map key are not allowed.
In a QUERY request an unset limit is treated as 'unlimited'.
Unset WHERE clauses with unset partition column, clustering column
or index column are not allowed.
- New `ByteType` (cql tinyint). 1-byte signed integer
- New `ShortType` (cql smallint). 2-byte signed integer
- New `SimpleDateType` (cql date). 4-byte unsigned integer
- New `TimeType` (cql time). 8-byte long
- The toDate(timeuuid), toTimestamp(timeuuid) and toUnixTimestamp(timeuuid) functions have been added to
convert a timeuuid into the date type, the timestamp type and a bigint raw value.
The functions unixTimestampOf(timeuuid) and dateOf(timeuuid) have been deprecated.
- The toDate(timestamp) and toUnixTimestamp(timestamp) functions have been added to
convert a timestamp into the date type and a bigint raw value.
- The toTimestamp(date) and toUnixTimestamp(date) functions have been added to
convert a date into the timestamp type and a bigint raw value.
- SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
- The default JVM flag -XX:+PerfDisableSharedMem will cause the following JVM tools
to stop working: jps, jstack, jinfo, jmc, jcmd as well as 3rd party tools like Jolokia.
If you wish to use these tools you can comment this flag out in cassandra-env.{sh,ps1}
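The wire encoding of an 'unset' bind variable mentioned in the list above (the int value -2 with no following bytes) can be shown with Python's struct module. This sketch assumes the usual native protocol layout for a bound value, a 4-byte big-endian signed length followed by that many bytes, with -1 denoting null as defined in the native protocol specification. The UNSET sentinel and function name are hypothetical and do not belong to any driver API.

```python
import struct

# Sentinel standing in for "no value bound" (hypothetical).
UNSET = object()


def encode_bound_value(value):
    """Encode one bound value as [int length][bytes].

    UNSET -> length -2, no body
    None  -> length -1, no body (null)
    bytes -> length n, followed by the n bytes
    """
    if value is UNSET:
        return struct.pack(">i", -2)
    if value is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(value)) + value


print(encode_bound_value(UNSET).hex())  # fffffffe
print(encode_bound_value(None).hex())   # ffffffff
print(encode_bound_value(b"ab").hex())  # 000000026162
```

The distinction matters because a null writes a tombstone, while an unset value leaves the column untouched.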
Upgrading
---------
- Thrift rpc is no longer being started by default.
Set `start_rpc` parameter to `true` to enable it.
- Pig's CqlStorage has been removed, use CqlNativeStorage instead
- Pig's CassandraStorage has been deprecated. CassandraStorage
should only be used against tables created via thrift.
Use CqlNativeStorage for all other tables.
- IAuthenticator has been updated to remove responsibility for user/role
maintenance and is now solely responsible for validating credentials.
This is primarily done via SASL, though an optional method exists for
systems which need support for the Thrift login() method.
- IRoleManager interface has been added which takes over the maintenance
functions from IAuthenticator. IAuthorizer is mainly unchanged. Auth data
in systems using the stock internal implementations PasswordAuthenticator
& CassandraAuthorizer will be automatically converted during upgrade,
with minimal operator intervention required. Custom implementations will
require modification, though these can be used in conjunction with the
stock CassandraRoleManager so providing an IRoleManager implementation
should not usually be necessary.
- Fat client support has been removed since we have push notifications to clients
- cassandra-cli has been removed. Please use cqlsh instead.
- YamlFileNetworkTopologySnitch has been removed; switch to
GossipingPropertyFileSnitch instead.
- CQL2 has been removed entirely in this release (previously deprecated
in 2.0.0). Please switch to CQL3 if you haven't already done so.
- The results of CQL3 queries containing an IN restriction will be ordered
in the normal order and no longer in the order in which the column values were
specified in the IN restriction.
- Some secondary index queries with restrictions on non-indexed clustering
columns were not requiring ALLOW FILTERING as they should. This has been
fixed, and those queries now require ALLOW FILTERING (see CASSANDRA-8418
for details).
- The SSTableSimpleWriter and SSTableSimpleUnsortedWriter classes have been
deprecated and will be removed in the next major Cassandra release. You
should use the CQLSSTableWriter class instead.
- The sstable2json and json2sstable tools have been deprecated and will be
removed in the next major Cassandra release. See CASSANDRA-9618
(https://issues.apache.org/jira/browse/CASSANDRA-9618) for details.
- nodetool enablehandoff will no longer support a list of data centers starting
with the next major release. Two new commands will be added, enablehintsfordc and disablehintsfordc,
to exclude data centers from using hinted handoff when the global status is enabled.
In cassandra.yaml, hinted_handoff_enabled will no longer support a list of data centers starting
with the next major release. A new setting will be added, hinted_handoff_disabled_datacenters,
to exclude data centers when the global status is enabled, see CASSANDRA-9035 for details.
2.1.13
======
New features
------------
- New options for cqlsh COPY FROM and COPY TO, see CASSANDRA-9303 for details.
2.1.10
=====
New features
------------
- The syntax TRUNCATE TABLE X is now accepted as an alias for TRUNCATE X
2.1.9
=====
Upgrading
---------
- cqlsh will now display timestamps with a UTC timezone. Previously,
timestamps were displayed with the local timezone.
- Commit log files are no longer recycled by default, due to negative
performance implications. This can be enabled again with the
commitlog_segment_recycling option in your cassandra.yaml
- JMX methods set/getCompactionStrategyClass have been deprecated, use
set/getCompactionParameters or set/getCompactionParametersJson instead
2.1.8
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.7
=====
2.1.6
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.5
=====
Upgrading
---------
- The option to omit cold sstables with size tiered compaction has been
removed - it is almost always better to use date tiered compaction for
workloads that have cold data.
2.1.4
=====
Upgrading
---------
- The default JMX config now listens to localhost only. You must enable
the other JMX flags in cassandra-env.sh manually.
2.1.3
=====
Upgrading
---------
- Prepending a list to a list collection was erroneously resulting in
the prepended list being reversed upon insertion. If you were depending
on this buggy behavior, note that it has been corrected.
- Incremental replacement of compacted SSTables has been disabled for this
release.
2.1.2
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.1
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
New features
------------
- Netty support for epoll on linux is now enabled. If for some
reason you want to disable it, pass the following system property:
-Dcassandra.native.epoll.enabled=false
2.1
===
New features
------------
- Default data and log locations have changed. If not set in
cassandra.yaml, the data file directory, commitlog directory,
and saved caches directory will default to $CASSANDRA_HOME/data/data,
$CASSANDRA_HOME/data/commitlog, and $CASSANDRA_HOME/data/saved_caches,
respectively. The log directory now defaults to $CASSANDRA_HOME/logs.
If not set, $CASSANDRA_HOME defaults to the top-level directory of
the installation.
Note that this should only affect source checkouts and tarballs.
Deb and RPM packages will continue to use /var/lib/cassandra and
/var/log/cassandra in cassandra.yaml.
- SSTable data directory name is slightly changed. Each directory will
have hex string appended after CF name, e.g.
ks/cf-5be396077b811e3a3ab9dc4b9ac088d/
This hex string part represents unique ColumnFamily ID.
Note that existing directories are used as is, so only newly created
directories after upgrade have new directory name format.
- Saved key cache files also have ColumnFamily ID in their file name.
- It is now possible to do incremental repairs: sstables that have been
repaired are marked with a timestamp and not included in the next
repair session. Use nodetool repair -par -inc to use this feature.
A tool to manually mark/unmark sstables as repaired is available in
tools/bin/sstablerepairedset. This is particularly important when
using LCS, or any data not repaired in your first incremental repair
will be put back in L0.
- Bootstrapping now ensures that range movements are consistent,
meaning the data for the new node is taken from the node that is no
longer responsible for that range of keys.
If you want the old behavior (due to a lost node perhaps)
you can set the following property (-Dcassandra.consistent.rangemovement=false)
- It is now possible to use quoted identifiers in triggers' names.
WARNING: if you previously used triggers with capital letters in their
names, then you must quote them from now on.
- Improved stress tool (http://goo.gl/OTNqiQ)
- New incremental repair option (http://goo.gl/MjohJp, http://goo.gl/f8jSme)
- Incremental replacement of compacted SSTables (http://goo.gl/JfDBGW)
- The row cache can now cache only the head of partitions (http://goo.gl/6TJPH6)
- Off-heap memtables (http://goo.gl/YT7znJ)
- CQL improvements and additions: User-defined types, tuple types, secondary
indexing of collections, ... (http://goo.gl/kQl7GW)
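The new data directory naming described above can be sketched as follows. This is a minimal illustration only: the function name is hypothetical, and it assumes the ColumnFamily ID is the hex string after the last hyphen in the directory name, as in the example given in the notes.

```python
def split_table_dir(dirname):
    """Split '<cf name>-<hex id>' into (cf_name, cf_id).

    Directories created before the upgrade carry no ID suffix,
    so cf_id is None for them.
    """
    name = dirname.rstrip("/")
    cf_name, sep, suffix = name.rpartition("-")
    # Treat the suffix as an ID only if it is a non-empty hex string.
    is_hex = bool(suffix) and all(c in "0123456789abcdef" for c in suffix.lower())
    if sep and cf_name and is_hex:
        return cf_name, suffix
    return name, None


print(split_table_dir("cf-5be396077b811e3a3ab9dc4b9ac088d/"))
print(split_table_dir("cf"))
```

Because existing directories are used as is, a script that walks the data directory after an upgrade should expect to see both forms.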
Upgrading
---------
- commitlog_sync_batch_window_in_ms behavior has changed from the
maximum time to wait between fsync to the minimum time. We are
working on making this more user-friendly (see CASSANDRA-9533) but in the
meantime, this means 2.1 needs a much smaller batch window to keep
writer threads from starving. The suggested default is now 2ms.
- Rolling upgrades from anything pre-2.0.7 are not supported. Furthermore,
pre-2.0 sstables are not supported. This means that before upgrading
a node to 2.1, this node must be started on 2.0 and
'nodetool upgradesstables' must be run (even in the case
of non-rolling upgrades).
- For size-tiered compaction users, Cassandra now defaults to ignoring
the coldest 5% of sstables. This can be customized with the
cold_reads_to_omit compaction option; 0.0 omits nothing (the old
behavior) and 1.0 omits everything.
- Multithreaded compaction has been removed.
- Counters implementation has been changed, replaced by a safer one with
fewer caveats, but different performance characteristics. You might have
to change your data model to accommodate the new implementation.
(See https://issues.apache.org/jira/browse/CASSANDRA-6504 and the
blog post at http://goo.gl/qj8iQl for details).
- (per-table) index_interval parameter has been replaced with
min_index_interval and max_index_interval parameters. index_interval
has been deprecated.
- support for supercolumns has been removed from json2sstable
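As a rough pre-flight check for the rolling-upgrade restriction above, a version comparison could look like the sketch below. The helper names are hypothetical, and the check only encodes the 2.0.7 threshold stated in these notes; it does not verify sstable formats.

```python
def parse_version(text):
    """Turn a dotted version such as '2.0.7' into (2, 0, 7)."""
    return tuple(int(part) for part in text.split("."))


def can_roll_to_2_1(current_version):
    """True if the notes above do not rule out a rolling upgrade to 2.1."""
    return parse_version(current_version) >= (2, 0, 7)


for version in ("1.2.19", "2.0.6", "2.0.7", "2.0.11"):
    print(version, can_roll_to_2_1(version))
```

Comparing tuples of integers avoids the string-comparison trap where "2.0.11" would sort before "2.0.7".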
2.0.11
======
Upgrading
---------
- Nothing specific to this release, but refer to previous entries if you
are upgrading from a previous version.
New features
------------
- DateTieredCompactionStrategy added, optimized for time series data. It groups
data that is written closely in time (see CASSANDRA-6602 for details). Consider
this experimental for now.
2.0.10
======
New features
------------
- CqlPagingRecordReader and CqlPagingInputFormat have both been removed.
Use CqlInputFormat instead.
- If you are using Leveled Compaction, you can now disable doing size-tiered
compaction in L0 by starting Cassandra with -Dcassandra.disable_stcs_in_l0
(see CASSANDRA-6621 for details).
- Shuffle and taketoken have been removed. For clusters that choose to
upgrade to vnodes, creating a new datacenter with vnodes and migrating is
recommended. See http://goo.gl/Sna2S1 for further information.
2.0.9
=====
Upgrading
---------
- Default values for read_repair_chance and local_read_repair_chance have been
swapped. Namely, default read_repair_chance is now set to 0.0, and default
local_read_repair_chance to 0.1.
- Queries selecting only CQL static columns were (mistakenly) not returning one
result per row in the partition. This has been fixed and a SELECT DISTINCT
can be used when only the static column of a partition needs to be fetched
without fetching the whole partition. But if you use static columns, please
make sure this won't affect you (see CASSANDRA-7305 for details).
2.0.8
=====
New features
------------
- New snitches have been added for users of Google Compute Engine and of
Cloudstack.
Upgrading
---------
- Nothing specific to this release, but please see 2.0.7 if you are upgrading
from a previous version.
2.0.7
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.0.6 if you are upgrading
from a previous version.
2.0.6
=====
New features
------------
- CQL now supports static columns, allows batching multiple conditional updates,
and has a new syntax for slicing over multiple clustering columns
(http://goo.gl/B6qz4j).
- Repair can be restricted to a set of nodes using the -hosts option in nodetool.
- A new 'nodetool taketoken' command relocates tokens with vnodes.
- Hinted handoff can be enabled only for some data-centers (see
hinted_handoff_enabled in cassandra.yaml)
Upgrading
---------
- Nothing specific to this release, but please see 2.0.5 if you are upgrading
from a previous version.
2.0.5
=====
New features
------------
- Batchlog replay can now be throttled, and is throttled by default.
See batchlog_replay_throttle_in_kb setting in cassandra.yaml.
- Scrub can now optionally skip corrupt counter partitions. Please note
that this will lead to the loss of all the counter updates in the skipped
partition. See the --skip-corrupted option.
Upgrading
---------
- If your cluster began on a version before 1.2, check that your secondary
index SSTables are on version 'ic' before upgrading. If not, run
'nodetool upgradesstables' if on 1.2.14 or later, or run 'nodetool
upgradesstables ks cf' with the keyspace and secondary index named
explicitly otherwise. If you don't do this and upgrade to 2.0.x and it
refuses to start because of 'hf' version files in the secondary index,
you will need to delete/move them out of the way and recreate the index
when 2.0.x starts.
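The version check described above could be scripted along these lines. This is a sketch under a stated assumption: that data files are named `<keyspace>-<table or index>-<version>-<generation>-Data.db`, so the format version is the third field from the end. Verify that layout against your own data directory before relying on it; the function names are hypothetical.

```python
def sstable_version(filename):
    """Return the format version field (e.g. 'ic' or 'hf') of an sstable file name."""
    parts = filename.split("-")
    if len(parts) < 5:
        raise ValueError("unexpected sstable file name: " + filename)
    return parts[-3]


def files_needing_upgrade(filenames, required="ic"):
    """List the Data.db files that are not on the required version."""
    return sorted(
        name for name in filenames
        if name.endswith("-Data.db") and sstable_version(name) != required
    )


listing = ["ks-cf.idx-ic-1-Data.db", "ks-cf.idx-hf-2-Data.db", "ks-cf.idx-hf-2-Index.db"]
print(files_needing_upgrade(listing))
```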
2.0.3
=====
New features
------------
- It's now possible to configure the maximum allowed size of the native
protocol frames (native_transport_max_frame_size_in_mb in the yaml file).
Upgrading
---------
- NaN and Infinity are new valid floating point constants in CQL3 and are now reserved
keywords. In the unlikely case you were using one of them as an identifier (for a
column, a keyspace or a table), you will now have to double-quote them (see
http://cassandra.apache.org/doc/cql3/CQL.html#identifiers for "quoted identifiers").
- The IEndpointStateChangeSubscriber has a new method, beforeChange, that
any custom implementations using the class will need to implement.
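For applications that build CQL strings, the quoting requirement above can be handled with a small helper such as the one sketched here. The helper name is hypothetical, and the keyword set covers only the two keywords named in this entry, not the full CQL reserved-word list.

```python
import re

# Only the keywords newly reserved by this release, per the note above.
NEWLY_RESERVED = {"nan", "infinity"}
_SAFE_UNQUOTED = re.compile(r"[a-z][a-z0-9_]*\Z")


def cql_identifier(name):
    """Return `name` as a CQL identifier, double-quoting it when required."""
    needs_quotes = name.lower() in NEWLY_RESERVED or not _SAFE_UNQUOTED.match(name)
    if not needs_quotes:
        return name
    # Inside a quoted identifier, a double quote is escaped by doubling it.
    return '"' + name.replace('"', '""') + '"'


print(cql_identifier("NaN"))
print(cql_identifier("user_id"))
```

Names containing uppercase letters are quoted as well, since unquoted identifiers are case-insensitive in CQL.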
2.0.2
=====
New features
------------
- Speculative retry defaults to 99th percentile
(See blog post at http://www.datastax.com/dev/blog/rapid-read-protection-in-cassandra-2-0-2)
- Configurable metrics reporting
(see conf/metrics-reporter-config-sample.yaml)
- Compaction history and stats are now saved to system keyspace
(system.compaction_history table). You can access history via
new 'nodetool compactionhistory' command or CQL.
Upgrading
---------
- Nodetool defaults to Sequential mode for repair operations
2.0.1
=====
Upgrading
---------
- The default memtable allocation has changed from 1/3 of heap to 1/4
of heap. Also, default (single-partition) read and write timeouts
have been reduced from 10s to 5s and 2s, respectively.
2.0.0
=====
Upgrading
---------
- Java 7 is now *required*!
- Upgrading is ONLY supported from Cassandra 1.2.9 or later. This
goes for sstable compatibility as well as network. When
upgrading from an earlier release, upgrade to 1.2.9 first and
run upgradesstables before proceeding to 2.0.
- CAS and new features in CQL such as DROP COLUMN assume that cell
timestamps are microseconds-since-epoch. Do not use these
features if you are using client-specified timestamps with some
other source.
- Replication and strategy options do not accept unknown options anymore.
This was already the case for CQL3 in 1.2 but this is now the case for
thrift too.
- auto_bootstrap of a single-token node with no initial_token will
now pick a random token instead of bisecting an existing token
range. We recommend upgrading to vnodes; failing that, we
recommend specifying initial_token.
- reduce_cache_sizes_at, reduce_cache_capacity_to, and
flush_largest_memtables_at options have been removed from cassandra.yaml.
- CacheServiceMBean.reduceCacheSizes() has been removed.
Use CacheServiceMBean.set{Key,Row}CacheCapacityInMB() instead.
- authority option in cassandra.yaml has been deprecated since 1.2.0,
but it has been completely removed in 2.0. Please use 'authorizer' option.
- ASSUME command has been removed from cqlsh. Use CQL3 blobAsType() and
typeAsBlob() conversion functions instead.
See https://cassandra.apache.org/doc/cql3/CQL.html#blobFun for details.
- Inputting blobs as string constants is now fully deprecated in
favor of blob constants. Make sure to update your applications to use
the new syntax while you are still on 1.2 (which supports both string
and blob constants for blob input) before upgrading to 2.0.
- index_interval is now moved to ColumnFamily property. You can change value
with ALTER TABLE ... WITH statement and SSTables written after that will
have new value. When upgrading, Cassandra will pick up the value defined in
cassandra.yaml as the default for existing ColumnFamilies, until you explicitly
set the value for those.
- The deprecated native_transport_min_threads option has been removed from
cassandra.yaml.
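Clients that supply their own timestamps can match the microseconds-since-epoch assumption mentioned above with something like the following sketch. The function names are illustrative and not part of any driver API.

```python
import time


def micros_since_epoch():
    """Current wall-clock time as integer microseconds since the Unix epoch."""
    return time.time_ns() // 1_000


def millis_to_micros(millis):
    """Convert a millisecond timestamp to the microsecond scale."""
    return millis * 1_000


print(micros_since_epoch())
print(millis_to_micros(1_500))
```

Mixing the two scales in one table is what causes trouble: a millisecond value is roughly a thousand times smaller than a microsecond value for the same instant, so it always loses timestamp-based conflict resolution.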
Operations
----------
- VNodes are enabled by default in cassandra.yaml. initial_token
for non-vnode deployments has been removed from the example
yaml, but is still respected if specified.
- Major compactions, cleanup, scrub, and upgradesstables will interrupt
any in-progress compactions (but not repair validations) when invoked.
- Disabling autocompactions by setting min/max compaction threshold to 0
has been deprecated. Instead, use the nodetool commands 'disableautocompaction'
and 'enableautocompaction' or set the compaction strategy option enabled = false.
- ALTER TABLE DROP has been reenabled for CQL3 tables and has new semantics now.
See https://cassandra.apache.org/doc/cql3/CQL.html#alterTableStmt and
https://issues.apache.org/jira/browse/CASSANDRA-3919 for details.
- CAS uses gc_grace_seconds to determine how long to keep unused paxos
state around for, or a minimum of three hours.
- A new hints created metric is tracked per target, replacing countPendingHints
- After performance testing for CASSANDRA-5727, the default LCS filesize
has been changed from 5MB to 160MB.
- cqlsh DESCRIBE SCHEMA no longer outputs the schema of system_* keyspaces;
use DESCRIBE FULL SCHEMA if you need the schema of system_* keyspaces.
- CQL2 has been deprecated, and will be removed entirely in 2.2. See
CASSANDRA-5918 for details.
- Commit log archiver now assumes the client time stamp to be in microsecond
precision, during restore. Please refer to commitlog_archiving.properties.
Features
--------
- Lightweight transactions
(http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0)
- Alias support has been added to CQL3 SELECT statement. Refer to
CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html) for details.
- JEMalloc support (see memory_allocator in cassandra.yaml)
- Experimental triggers support. See examples/ for how to use. "Experimental"
means "tied closely to internal data structures; we plan to decouple this in
the future, which will probably break triggers written against this initial
API."
- Numerous improvements to CQL3 and a new version of the native protocol. See
http://www.datastax.com/dev/blog/cql-in-cassandra-2-0 for details.
1.2.11
======
Features
--------
- Added a new consistency level, LOCAL_ONE, that forces all CL.ONE operations to
execute only in the local datacenter.
- New replace_address to supplant the (now removed) replace_token and
replace_node workflows to replace a dead node in place. Works like the
old options, but takes the IP address of the node to be replaced.
1.2.9
=====
Features
--------
- A history of executed nodetool commands is now captured.
It can be found in ~/.cassandra/nodetool.history. Other tools output files
(cli and cqlsh history, .cqlshrc) are now centralized in ~/.cassandra, as well.
- A new sstablesplit utility allows splitting large sstables offline.
1.2.8
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.2.7 if you are upgrading
from a previous version.
1.2.7
=====
Upgrading
---------
- If you have decommissioned a node in the past 72 hours, it is imperative
that you not upgrade until such time has passed, or do a full cluster
restart (not rolling) before beginning the upgrade. This only applies to
decommission, not removetoken.
1.2.6
=====
Upgrading
---------
- hinted_handoff_throttle_in_kb is now reduced by a factor
proportional to the number of nodes in the cluster (see
https://issues.apache.org/jira/browse/CASSANDRA-5272).
- CQL3 syntax for CREATE CUSTOM INDEX has been updated. See CQL3
documentation for details.
1.2.5
=====
Features
--------
- Custom secondary index support has been added to CQL3. Refer to
CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for details and examples.
Upgrading
---------
- The native CQL transport is enabled by default on port 9042.
1.2.4
=====
Upgrading
---------
- 'nodetool upgradesstables' now only upgrades/rewrites sstables that are
not on the current version (which is usually what you want). Use the new
-a flag to recover the old behavior of rewriting all sstables.
Features
--------
- superuser setup delay (10 seconds) can now be overridden using
'cassandra.superuser_setup_delay_ms' property.
1.2.3
=====
Upgrading
---------
- CQL3 used to be case-insensitive for property map keys in ALTER and CREATE
statements. In other words:
CREATE KEYSPACE test WITH replication = { 'CLASS' : 'SimpleStrategy',
'REPLICATION_FACTOR' : '1' }
was allowed. However, this was not consistent with the fact that string
literals are case sensitive everywhere else and, more importantly, it
breaks NetworkTopologyStrategy, for which DC names are case sensitive. Those
property map keys are now case sensitive, so the statement above should be
changed to:
CREATE KEYSPACE test WITH replication = { 'class' : 'SimpleStrategy',
'replication_factor' : '1' }
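If statements are generated by code, the fix above amounts to lower-casing the known option keys while leaving data center names alone. A minimal sketch, assuming the options are held in a dict before being rendered to CQL (the function name and the set of known keys are illustrative):

```python
# Option keys that must be lower case; everything else (for example
# data center names under NetworkTopologyStrategy) keeps its case.
KNOWN_KEYS = {"class", "replication_factor"}


def normalize_replication(options):
    """Lower-case known replication option keys, preserving all other keys."""
    fixed = {}
    for key, value in options.items():
        lowered = key.lower()
        fixed[lowered if lowered in KNOWN_KEYS else key] = value
    return fixed


print(normalize_replication({"CLASS": "SimpleStrategy", "REPLICATION_FACTOR": "1"}))
print(normalize_replication({"CLASS": "NetworkTopologyStrategy", "DC1": "3"}))
```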
1.2.2
=====
Upgrading
---------
- CQL3 type validation for constants has been fixed, which may require
fixing queries that were relying on the previous loose validation. Please
refer to the CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
and in particular the changelog section for more details. Please note in
particular that inputting blobs as string constants is now deprecated (in
favor of blob constants) and its support will be removed in a future
version.
Features
--------
- Built-in CQL3-based implementations of IAuthenticator (PasswordAuthenticator)
and IAuthorizer (CassandraAuthorizer) have been added. PasswordAuthenticator
stores usernames and hashed passwords in system_auth.credentials table;
CassandraAuthorizer stores permissions in system_auth.permissions table.
- system_auth keyspace is now alterable via ALTER KEYSPACE queries.
The default is SimpleStrategy with replication_factor of 1, but it's
advised to raise RF to at least 3 or 5, since CL.QUORUM is used for all
auth-related queries. It's also possible to change the strategy to NTS.
- Permissions caching with time-based expiration policy has been added to reduce
performance impact of authorization. Permission validity can be configured
using 'permissions_validity_in_ms' setting in cassandra.yaml. The default
is 2000 (2 seconds).
- SimpleAuthenticator and SimpleAuthorizer examples have been removed. Please
look at CassandraAuthorizer/PasswordAuthenticator instead.
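The time-based expiration policy described above can be illustrated with a small cache that reloads an entry once its validity period has passed. This is a generic sketch of the technique, not Cassandra's implementation; the class name and the loader callback are hypothetical, and only the 2000 ms default comes from the notes.

```python
import time


class ExpiringCache:
    """Cache values per key and reload them after `validity_ms` milliseconds."""

    def __init__(self, validity_ms=2000, clock=time.monotonic):
        self._validity = validity_ms / 1000.0
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (time stored, value)

    def get(self, key, loader):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._validity:
            return hit[1]
        value = loader(key)          # expired or missing: fetch a fresh value
        self._entries[key] = (now, value)
        return value


cache = ExpiringCache()
print(cache.get("alice", lambda user: {"SELECT"}))
```

A longer validity period means fewer lookups but a longer delay before a revoked permission takes effect.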
1.2.1
=====
Upgrading
---------
- In CQL3, date strings are no longer accepted as timeuuid values since a
date string is not a correct representation of a timeuuid. Instead, new
methods (minTimeuuid, maxTimeuuid, now, dateOf, unixTimestampOf) have been
introduced to make working with timeuuids from date strings easy. cqlsh also
no longer displays timeuuids as date strings (since this is a lossy
representation), but the new dateOf method can be used instead. Please
refer to the reference documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for more detail.
- For client implementors: CQL3 clients using the thrift interface should
use the new execute_cql3_query, prepare_cql3_query and execute_prepared_cql3_query
methods, available since 1.2.0. However, Cassandra 1.2.0 did not complain if CQL3
was set through set_cql_version but the methods that are now CQL2-only were
used. It now does.
- Queries that use unrecognized or bad compaction or replication strategy
options are now refused (instead of simply logging a warning).
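To see why a date string is a lossy view of a timeuuid, it can help to extract the embedded time on the client side. The sketch below uses Python's standard uuid module and is only a client-side analogue of what dateOf returns; the function name is illustrative.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def date_of(timeuuid):
    """Return the UTC datetime embedded in a version 1 (time-based) UUID."""
    if timeuuid.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    unix_100ns = timeuuid.time - UUID_EPOCH_OFFSET
    return datetime.fromtimestamp(unix_100ns / 1e7, tz=timezone.utc)


print(date_of(uuid.uuid1()))
```

Many distinct timeuuids map to the same instant, since the clock sequence and node fields are discarded, which is why the conversion cannot be reversed.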
1.2
===
Upgrading
---------
- IAuthenticator interface has been updated to support dynamic
user creation, modification and removal. Users, even when stored
externally, now have to be explicitly created using
CREATE USER query first. AllowAllAuthenticator and SimpleAuthenticator
have been updated for the new interface, but you'll have to update
your old IAuthenticator implementations for 1.2. To ease this process,
a new abstract LegacyAuthenticator class has been added - subclass it
in your old IAuthenticator implementation and everything should just work
(this only affects users who implemented custom authenticators).
- IAuthority interface has been deprecated in favor of IAuthorizer.
AllowAllAuthority and SimpleAuthority have been renamed to
AllowAllAuthorizer and SimpleAuthorizer, respectively. In order to
simplify the upgrade to the new interface, a new abstract
LegacyAuthorizer has been added - you should subclass it in your
old IAuthority implementation and everything should just work
(this only affects users who implemented custom authorities).
'authority' setting in cassandra.yaml has been renamed to 'authorizer',
'authority' is no longer recognized. This affects all upgrading users.
- 1.2 is NOT network-compatible with versions older than 1.0. That
means if you want to do a rolling, zero-downtime upgrade, you'll need
to upgrade first to 1.0.x or 1.1.x, and then to 1.2. 1.2 retains
the ability to read data files from Cassandra versions at least
back to 0.6, so a non-rolling upgrade remains possible with just
one step.
- The default partitioner for new clusters is Murmur3Partitioner,
which is about 10% faster for index-intensive workloads. Partitioners
cannot be changed once data is in the cluster, however, so if you are
switching to the 1.2 cassandra.yaml, you should change this to
RandomPartitioner or whatever your old partitioner was.
- If you are using counters and upgrading from a version prior to
1.1.6, you should drain existing Cassandra nodes prior to the
upgrade to prevent overcount during commitlog replay (see
CASSANDRA-4782). For non-counter uses, drain is not required
but is a good practice to minimize restart time.
- Tables using LeveledCompactionStrategy will default to not
creating a row-level bloom filter. The default in older versions
of Cassandra differs; you should manually set the false positive
rate to 1.0 (to disable) or 0.01 (to enable, if you make many
requests for rows that do not exist).
- The hints schema was changed from 1.1 to 1.2. Cassandra automatically
snapshots and then truncates the hints column family as part of
starting up 1.2 for the first time. Additionally, upgraded nodes
will not store new hints destined for older (pre-1.2) nodes. It is
therefore recommended that you perform a cluster upgrade when all
nodes are up. Because hints will be lost, a cluster-wide repair (with
-pr) is recommended after upgrade of all nodes.
- The `nodetool removetoken` command (and corresponding JMX operation)
has been renamed to `nodetool removenode`. This function is
incompatible with the earlier `nodetool removetoken`, and attempts to
remove nodes in this way with a mixed 1.1 (or lower) / 1.2 cluster
are not supported.
- The somewhat ill-conceived CollatingOrderPreservingPartitioner
has been removed. Use Murmur3Partitioner (recommended) or
ByteOrderedPartitioner instead.
- Global option hinted_handoff_throttle_delay_in_ms has been removed.
hinted_handoff_throttle_in_kb has been added instead.
- The default bloom filter fp chance has been increased to 1%.
This will save about 30% of the memory used by the old default.
Existing columnfamilies will retain their old setting.
- The default partitioner (for new clusters; the partitioner cannot be
changed in existing clusters) was changed from RandomPartitioner to
Murmur3Partitioner which provides faster hashing as well as improved
performance with secondary indexes.
- The default version of CQL (and cqlsh) is now CQL3. CQL2 is still
available but you will have to use the thrift set_cql_version method
(that is already supported in 1.1) to use CQL2. For cqlsh, you will need
to use 'cqlsh -2'.
- CQL3 is now considered final in this release. Compared to the beta
version that is part of 1.1, this final version has a few additions
(collections), but also some (incompatible) changes in the syntax for the
options of the create/alter keyspace/table statements. Typically, the
syntax to create a keyspace is now:
CREATE KEYSPACE ks WITH replication = { 'class' : 'SimpleStrategy',
'replication_factor' : 2 };
Also, the consistency level cannot be set in the language anymore, but is
at the protocol level.
Please refer to the CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for details.
- In CQL3, the DROP behavior from ALTER TABLE has currently been removed
(because it was not correctly implemented). We hope to add it back soon
(Cassandra 1.2.1 or 1.2.2)
Features
--------
- Cassandra can now handle concurrent CREATE TABLE schema changes
as well as other updates
- rpc_timeout has been split up to allow finer-grained control
on timeouts for different operation types
- num_tokens can now be specified in cassandra.yaml. This defines the
number of tokens assigned to the host on the ring (default: 1).
Also specifying initial_token will override any num_tokens setting.
- disk_failure_policy allows blacklisting failed disks in JBOD
configuration instead of erroring out indefinitely
- event tracing can be configured per-connection ("trace_next_query")
or globally/probabilistically ("nodetool settraceprobability")
- Atomic batches are now supported server side, where Cassandra will
guarantee that (at the price of pre-writing the batch to another node
first), all mutations in the batch will be applied, even if the
coordinator fails mid-batch.
- new IAuthorizer interface has replaced the old IAuthority. IAuthorizer
allows dynamic permission management via new CQL3 statements:
GRANT, REVOKE, LIST PERMISSIONS. A native implementation storing
the permissions in Cassandra is being worked on and we expect to
include it in 1.2.1 or 1.2.2.
- IAuthenticator interface has been updated to support dynamic user
creation, modification and removal via new CQL3 statements:
CREATE USER, ALTER USER, DROP USER, LIST USERS. A native implementation
that stores users in Cassandra itself is being worked on and is expected to
become part of 1.2.1 or 1.2.2.
1.1.5
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
1.1.4
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
1.1.3
=====
Upgrading
---------
- Running "nodetool upgradesstables" after upgrading is recommended
if you use Counter columnfamilies.
Features
--------
- the cqlsh COPY command can now export to CSV flat files
- added a new tools/bin/token-generator to facilitate generating evenly distributed tokens
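As an illustration of what "evenly distributed tokens" means, the sketch below spaces tokens equally around a ring. It assumes a token space of 0 to 2**127, as commonly described for RandomPartitioner, and a single data center; it is not the algorithm of the bundled token-generator tool, and the function name is hypothetical.

```python
def evenly_spaced_tokens(node_count, ring_size=2 ** 127):
    """Return `node_count` tokens spaced equally around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * ring_size // node_count for i in range(node_count)]


for token in evenly_spaced_tokens(4):
    print(token)
```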
1.1.2
=====
Upgrading
---------
- If you have column families using the LeveledCompactionStrategy, you should run scrub on those column families.
Features
--------
- cqlsh has a new COPY command to load data from CSV flat files
1.1.1
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
Features
--------
- Continuous commitlog archiving and point-in-time recovery.
See conf/commitlog_archiving.properties
- Incremental repair by token range, exposed over JMX
1.1
===
Upgrading
---------
- Compression is enabled by default on newly created ColumnFamilies
(and unchanged for ColumnFamilies created prior to upgrading).
- If you are running a multi datacenter setup, you should upgrade to
the latest 1.0.x (or 0.8.x) release before upgrading. Versions
0.8.8 and 1.0.3-1.0.5 generate cross-dc forwarding that is incompatible
with 1.1.
- EACH_QUORUM ConsistencyLevel is only supported for writes and will now
throw an InvalidRequestException when used for reads. (Previous
versions would silently perform a LOCAL_QUORUM read instead.)
- ANY ConsistencyLevel is only supported for writes and will now
throw an InvalidRequestException when used for reads. (Previous
versions would silently perform a ONE read for range queries;
single-row and multiget reads already rejected ANY.)
- The largest mutation batch accepted by the commitlog is now 128MB.
(In practice, batches larger than ~10MB always caused poor
performance due to load volatility and GC promotion failures.)
Larger batches will continue to be accepted but will not be
durable. Consider setting durable_writes=false if you really
want to use such large batches.
- Make sure that global settings: key_cache_{size_in_mb, save_period}
and row_cache_{size_in_mb, save_period} in conf/cassandra.yaml are
used instead of per-ColumnFamily options.
- JMX methods no longer return custom Cassandra objects. Any such methods
will now return standard Maps, Lists, etc.
- Hadoop input and output details are now separated. If you were
previously using methods such as getRpcPort you now need to use
getInputRpcPort or getOutputRpcPort depending on the circumstance.
- CQL changes:
+ Prior to 1.1, you could use KEY as the primary key name in some
select statements, even if the PK was actually given a different
name. In 1.1+ you must use the defined PK name.
- The sliced_buffer_size_in_kb option has been removed from the
cassandra.yaml config file (this option was a no-op since 1.0).
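Applications that choose consistency levels dynamically may want to reject the write-only levels before a read is sent, rather than handle the InvalidRequestException afterwards. A minimal client-side guard, assuming levels are passed around as strings (the names here are illustrative, not a driver API):

```python
# Levels that the 1.1 notes above describe as supported for writes only.
WRITE_ONLY_LEVELS = {"ANY", "EACH_QUORUM"}


def check_read_consistency(level):
    """Return the normalized level, or raise if it cannot be used for reads."""
    normalized = level.upper()
    if normalized in WRITE_ONLY_LEVELS:
        raise ValueError(normalized + " is only supported for writes")
    return normalized


print(check_read_consistency("local_quorum"))
```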
Features
--------
- Concurrent schema updates are now supported, with any conflicts
automatically resolved. Please note that simultaneously running
'CREATE COLUMN FAMILY' operations on different nodes will not
be safe until version 1.2 due to the nature of ColumnFamily
identifier generation; for more details see CASSANDRA-3794.
- The CQL language has undergone a major revision, CQL3, the
highlights of which are covered at [1]. CQL3 is not
backwards-compatible with CQL2, so we've introduced a
set_cql_version Thrift method to specify which version you want.
(The default remains CQL2 at least until Cassandra 1.2.) cqlsh
adds a --cql3 flag to enable this.
[1] http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
- Row-level isolation: multi-column updates to a single row have
always been *atomic* (either all will be applied, or none)
thanks to the CommitLog, but until 1.1 they were not *isolated*
-- a reader may see mixed old and new values while the update
happens.
- Finer-grained control over data directories, allowing a ColumnFamily to
be pinned to a specific volume, e.g. one backed by SSD.
- The bulk loader is no longer a fat client; it can be run from an
existing machine in a cluster.
- A new write survey mode has been added, similar to bootstrap (enabled via
-Dcassandra.write_survey=true), but the node will not automatically join
the cluster. This is useful for cases such as testing different
compaction strategies with live traffic without affecting the cluster.
- Key and row caches are now global, similar to the global memtable
threshold. Manual tuning of cache sizes per-columnfamily is no longer
required.
- Off-heap caches no longer require JNA, and will work out of the box
on Windows as well as Unix platforms.
- Streaming is now multithreaded.
- Compactions may now be aborted via JMX or nodetool.
- The stress tool is not new in 1.1, but it is newly included in
binary builds as well as the source tree
- Hadoop: a new BulkOutputFormat is included which will directly write
SSTables locally and then stream them into the cluster.
YOU SHOULD USE BulkOutputFormat BY DEFAULT. ColumnFamilyOutputFormat
is still around in case for some strange reason you want results
trickling out over Thrift, but BulkOutputFormat is significantly
more efficient.
- Hadoop: KeyRange.filter is now supported with ColumnFamilyInputFormat,
allowing index expressions to be evaluated server-side to reduce
the amount of data sent to Hadoop.
- Hadoop: ColumnFamilyRecordReader has a wide-row mode, enabled via
a boolean parameter to setInputColumnFamily, that pages through
data column-at-a-time instead of row-at-a-time.
- Pig: can use the wide-row Hadoop support, by setting PIG_WIDEROW_INPUT
to true. This will produce each row's columns in a bag.
1.0.8
=====
Upgrading
---------
- Nothing specific to 1.0.8
Other
-----
- Allow configuring socket timeout for streaming
1.0.7
=====
Upgrading
---------
- Nothing specific to 1.0.7, please refer to the instructions for 1.0.6
Other
-----
- Adds new setstreamthroughput to nodetool to configure streaming
throttling
- Adds JMX property to get/set rpc_timeout_in_ms at runtime
- Allow configuring (per-CF) bloom_filter_fp_chance
1.0.6
=====
Upgrading
---------
- This release fixes an issue related to the chunk_length_kb option for
compressed sstables. If you use compression on some column families, it
is recommended after the upgrade to check the value for this option on
these column families (the default value is 64). In case the option would
not be set correctly, you should update the column family definition,
setting the right value and then run scrub on the column family.
- Please refer to the instructions for 1.0.5 if coming from an older version.
1.0.5
=====
Upgrading
---------
- 1.0.5 fixes two important regressions in 1.0.4. All information
concerning 1.0.4 is valid for this release, but please avoid upgrading
to 1.0.4.
1.0.4
=====
Upgrading
---------
- Nothing specific to 1.0.4 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- A new upgradesstables command has been added to nodetool. It is very
similar to scrub but without the ability to discard corrupted rows (and
as a consequence it does not snapshot automatically before). This new
command is to be preferred to scrub in all cases where sstables should be
rewritten to the current format for upgrade purposes.
JMX
---
- The paths for the data, commit log and saved cache directories are now exposed
through JMX
- The in-memory bloom filter sizes are now exposed through JMX
1.0.3
=====
Upgrading
---------
- Nothing specific to 1.0.3 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- For uncompressed sstables (compressed sstables already include more
fine-grained checksums), a sha1 for the full sstable is now automatically
created (in a file with suffix -Digest.sha1). It can be used to check the
sstable integrity with sha1sum.
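The same integrity check can be scripted where sha1sum is not available. This is a minimal sketch, not Cassandra tooling: the function names are hypothetical, and it assumes the digest file begins with the hex hash, optionally followed by whitespace and a file name (the layout sha1sum reads).

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sstable_matches_digest(data_path, digest_path):
    """Compare a data file against the hash stored in its digest file.

    Assumption: the first whitespace-separated field of the digest file
    is the hex SHA-1 of the data file.
    """
    fields = Path(digest_path).read_text().split()
    if not fields:
        return False  # empty digest file: nothing to compare against
    return sha1_of_file(data_path) == fields[0].lower()
```

A mismatch only says that the file and its recorded hash disagree; it does not say which of the two is damaged.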
1.0.2
=====
Upgrading
---------
- Nothing specific to 1.0.2 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- Cassandra CLI queries now have timing information
1.0.1
=====
Upgrading
---------
- If upgrading from a version prior to 1.0.0, please see the 1.0 Upgrading
section
- For running on Windows as a Service, procrun is no longer distributed
with Cassandra; see README.txt for more information on how to download
it if necessary.
- The names given to snapshot directories have been improved for human
readability. If you have scripts that rely on them, you may need to update
them.
1.0
===
Upgrading
---------
- Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling
restart, one node at a time. (0.8.0 or 0.8.1 are NOT network-compatible
with 1.0: upgrade to the most recent 0.8 release first.)
You do not need to bring down the whole cluster at once.
- After upgrading, run nodetool scrub against each node before running
repair, moving nodes, or adding new ones.
- CQL inserts/updates now generate microsecond resolution timestamps
by default, instead of millisecond. THIS MEANS A ROLLING UPGRADE COULD
MIX milliseconds and microseconds, with clients talking to servers
generating milliseconds unable to overwrite the larger microsecond
timestamps. If you are using CQL and this is important for your
application, you can either perform a non-rolling upgrade to 1.0, or
update your application first to use explicit timestamps with the "USING
timestamp=X" syntax.
- The BinaryMemtable bulk-load interface has been removed (use the
sstableloader tool instead).
- The compaction_thread_priority setting has been removed from
cassandra.yaml (use compaction_throughput_mb_per_sec to throttle
compaction instead).
- CQL types bytea and date were renamed to blob and timestamp, respectively,
to conform with SQL norms. CQL type int is now a 4-byte int, not 8
(which is still available as bigint).
- Cassandra 1.0 uses arena allocation to reduce old generation
fragmentation. This means there is a minimum overhead of 1MB
per ColumnFamily plus 1MB per index.
- The SimpleAuthenticator and SimpleAuthority classes have been moved to
the example directory (and are thus not available from the binary
distribution). They never provided actual security and in their current
state are only meant as examples.
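The timestamp hazard described above for rolling upgrades can be illustrated with plain integers. This is a minimal sketch with hypothetical helper names; it models conflict resolution as "largest timestamp wins" and ignores tie-breaking.

```python
def micros_from_millis(timestamp_ms):
    """Scale a millisecond timestamp to microsecond resolution."""
    return timestamp_ms * 1000


def last_write_wins(writes):
    """Return the value of the (timestamp, value) pair with the largest timestamp."""
    return max(writes, key=lambda write: write[0])[1]


# A 1.0 server stamps a write at wall-clock second 1000, in microseconds.
first = (1000 * 1_000_000, "written first, microseconds")
# A not-yet-upgraded server stamps a write one hour later, in milliseconds.
second = ((1000 + 3600) * 1000, "written later, milliseconds")

# The later write loses because its timestamp is numerically smaller.
print(last_write_wins([first, second]))  # written first, microseconds

# Supplying an explicit microsecond timestamp restores the expected order.
second_fixed = (micros_from_millis(second[0]), "written later, explicit microseconds")
print(last_write_wins([first, second_fixed]))  # written later, explicit microseconds
```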
Features
--------
- SSTable compression is supported through the 'compression_options'
parameter when creating/updating a column family. For instance, you can
create a column family Cf using compression (through the Snappy library)
in the CLI with:
create column family Cf with compression_options={sstable_compression: SnappyCompressor}
SSTable compression is not activated by default but can be activated or
deactivated at any time.
- Compressed SSTable blocks are checksummed to protect against bitrot
- New LevelDB-inspired compaction algorithm can be enabled by setting the
Columnfamily compaction_strategy=LeveledCompactionStrategy option.
Leveled compaction means you only need to keep a few MB of space free for
compaction instead of (in the worst case) 50%.
- Ability to use multiple threads during a single compaction. See
multithreaded_compaction in cassandra.yaml for more details.
- Windows Service ("cassandra.bat install" to enable)
- A dead node may be replaced in a single step by starting a new node
with -Dcassandra.replace_token=<token>. More details can be found at
http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
- It is now possible to repair only the first range returned by the
partitioner for a node with `nodetool repair -pr`. It makes it
easier/possible to repair a full cluster without any work duplication by
running this command on every node of the cluster.
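Why `nodetool repair -pr` avoids duplicated work can be seen by counting how often each token range gets repaired. This is a sketch under stated assumptions: one token per node, replicas placed on the next nodes clockwise around the ring, and a replication factor no larger than the node count. The function names are hypothetical.

```python
def primary_ranges(tokens):
    """Map each node token to its primary range (previous_token, token].

    The range of the smallest token wraps around the ring.
    """
    ring = sorted(tokens)
    return {token: (ring[i - 1], token) for i, token in enumerate(ring)}


def repair_counts(tokens, replication_factor, primary_only):
    """Count how many times each range is repaired when every node runs repair.

    Assumes each node stores its own primary range plus the primary
    ranges of the replication_factor - 1 nodes that precede it.
    """
    ring = sorted(tokens)
    primaries = primary_ranges(ring)
    counts = {token_range: 0 for token_range in primaries.values()}
    for i, _ in enumerate(ring):
        ranges_repaired = 1 if primary_only else replication_factor
        for back in range(ranges_repaired):
            owner = ring[(i - back) % len(ring)]
            counts[primaries[owner]] += 1
    return counts


tokens = [0, 25, 50, 75]
print(repair_counts(tokens, 3, primary_only=False))  # every range repaired 3 times
print(repair_counts(tokens, 3, primary_only=True))   # every range repaired once
```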
New data types
--------------
- decimal
Other
-----
- Hinted Handoff has two major improvements:
- Hint replay is much more efficient thanks to a change in the data model
- Hints are created for all replicas that do not ack a write. (Formerly,
only replicas known to be down when the write started were hinted.)
This means that running with read repair completely off is much more
viable than before, and the default read_repair_chance is reduced from 1.0
("always repair") to 0.1 ("repair 10% of the time").
- The old per-ColumnFamily memtable thresholds
(memtable_throughput_in_mb, memtable_operations_in_millions,
memtable_flush_after_mins) are ignored, in favor of the global
memtable_total_space_in_mb and commitlog_total_space_in_mb settings.
This does not affect client compatibility -- the old options are
still allowed, but have no effect. These options may be removed
entirely in a future release.
- Backlogged compactions will begin five minutes after startup. The 0.8
behavior of never starting compaction until a flush happens is usually
not what is desired, but a short grace period is useful to allow caches
to warm up first.
- The deletion of compacted data files is not performed during Garbage
Collection anymore. This means compacted files will now be deleted
without delay.
0.8.5
=====
Features
--------
- SSTables copied to a data directory can be loaded by a live node through
nodetool refresh (may be handy to load snapshots).
- The configured compaction throughput is exposed through JMX.
Other
-----
- The sstableloader is now bundled with the debian package.
- Repair detects when a participating node is dead and fails instead of
hanging forever.
0.8.4
=====
Upgrading
---------
- Nothing specific to 0.8.4
Other
-----
- This release fixes a bug in counters that could lead to
(significant) over-counting.
- It also fixes a slight upgrade regression from 0.8.3. It is thus advised
to jump directly to 0.8.4 if upgrading from before 0.8.3.
0.8.3
=====
Upgrading
---------
- Token removal has been revamped. Removing tokens in a mixed cluster with
0.8.3 will not work, so the entire cluster will need to be running 0.8.3
first, except for the dead node.
Features
--------
- It is now possible to use thrift asynchronous and
half-synchronous/half-asynchronous servers (see cassandra.yaml for more
details).
- It is now possible to access counter columns through Hadoop.
Other
-----
- This release fixes a regression in 0.8 that can cause a commit log segment to
be deleted even though not all the data it contains has been flushed.
Upgrading from 0.8.* is strongly encouraged.
0.8.2
=====
Upgrading
---------
- 0.8.0 and 0.8.1 shipped with a bug that set the
replicate_on_write option for counter column families to false (this
option has no effect on non-counter column families). This is an unsafe
default, and 0.8.2 corrects it: the default for replicate_on_write is
now true. It is advised to update your counter column family definitions
if replicate_on_write was incorrectly set to false (before or after
upgrade).
0.8.1
=====
Upgrading
---------
- 0.8.1 is backwards compatible with 0.8, upgrade can be achieved by a
simple rolling restart.
- If upgrading from an earlier version (0.7), please refer to the 0.8 section
for instructions.
Features
--------
- Numerous additions/improvements to CQL (support for counters, TTL, batch
inserts/deletes, index dropping, ...).
- Add two new AbstractTypes (comparator) to support compound keys
(CompositeType and DynamicCompositeType), as well as a ReverseType to
reverse the order of any existing comparator.
- New option to bypass the commit log on some keyspaces (for advanced
users).
Tools
-----
- Add new data bulk loading utility (sstableloader).
0.8
===
Upgrading
---------
- Upgrading from version 0.7.1 or later can be done with a rolling
restart, one node at a time. You do not need to bring down the
whole cluster at once.
- After upgrading, run nodetool scrub against each node before running
repair, moving nodes, or adding new ones.
- Running nodetool drain before shutting down the 0.7 node is
recommended but not required. (Skipping this will result in
replay of entire commitlog, so it will take longer to restart but
is otherwise harmless.)
- 0.8 is fully API-compatible with 0.7. You can continue
to use your 0.7 clients.
- Avro record classes used in map/reduce and Hadoop streaming code have
been removed. Map/reduce can be switched to Thrift by changing
org.apache.cassandra.avro in import statements to
org.apache.cassandra.thrift (no class names change). Streaming support
has been removed for the time being.
- The loadbalance command has been removed from nodetool. For similar
behavior, decommission then rebootstrap with empty initial_token.
- Thrift unframed mode has been removed.
- The addition of key_validation_class means the cli will assume keys
are bytes, instead of strings, in the absence of other information.
See http://wiki.apache.org/cassandra/FAQ#cli_keys for more details.
Features
--------
- added CQL client API and JDBC/DBAPI2-compliant drivers for Java and
Python, respectively (see: drivers/ subdirectory and doc/cql)
- added distributed Counters feature;
see http://wiki.apache.org/cassandra/Counters
- optional intranode encryption; see comments around 'encryption_options'
in cassandra.yaml
- compaction multithreading and rate-limiting; see
'concurrent_compactors' and 'compaction_throughput_mb_per_sec' in
cassandra.yaml
- cassandra will limit total memtable memory usage to 1/3 of the heap
by default. This can be adjusted or disabled with the
memtable_total_space_in_mb option. The old per-ColumnFamily
throughput, operations, and age settings are still respected but
will be removed in a future major release once we are satisfied that
memtable_total_space_in_mb works adequately.
Tools
-----
- stress and py_stress moved from contrib/ to tools/
- clustertool was removed (see
https://issues.apache.org/jira/browse/CASSANDRA-2607 for examples
of how to script nodetool across the cluster instead)
Other
-----
- In the past, sstable2json would write column names and values as
hex strings, and now creates human readable values based on the
comparator/validator. As a result, JSON dumps created with
older versions of sstable2json are no longer compatible with
json2sstable, and imports must be made with a configuration that
is identical to the export.
- manually-forced compactions ("nodetool compact") will do nothing
if only a single SSTable remains for a ColumnFamily. To force it
to compact that anyway (which will free up space if there are
a lot of expired tombstones), use the new forceUserDefinedCompaction
JMX method on CompactionManager.
- most of contrib/ (which was not part of the binary releases)
has been moved either to examples/ or tools/. We plan to move the
rest for 0.8.1.
JMX
---
- By default, JMX now listens on port 7199.
0.7.6
=====
Upgrading
---------
- Nothing specific to 0.7.6, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
0.7.5
=====
Upgrading
---------
- Nothing specific to 0.7.5, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
Changes
-------
- system_update_column_family no longer snapshots before applying
the schema change. (_update_keyspace never did. _drop_keyspace
and _drop_column_family continue to snapshot.)
- added memtable_flush_queue_size option to cassandra.yaml to
avoid blocking writes when multiple column families (or a column
family with indexes) are flushed at the same time.
- allow overriding initial_token, storage_port and rpc_port using
system properties
0.7.4
=====
Upgrading
---------
- Nothing specific to 0.7.4, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
Features
--------
- Output to Pig is now supported as well as input
0.7.3
=====
Upgrading
---------
- 0.7.1 and 0.7.2 shipped with a bug that caused incorrect row-level
bloom filters to be generated when compacting sstables generated
with earlier versions. This would manifest in IOExceptions during
column name-based queries. 0.7.3 provides "nodetool scrub" to
rebuild sstables with correct bloom filters, with no data lost.
(If your cluster was never on 0.7.0 or earlier, you don't have to
worry about this.) Note that nodetool scrub will snapshot your
data files before rebuilding, just in case.
0.7.1
=====
Upgrading
---------
- 0.7.1 is completely backwards compatible with 0.7.0. Just restart
each node with the new version, one at a time. (The cluster does
not all need to be upgraded simultaneously.)
Features
--------
- added flush_largest_memtables_at and reduce_cache_sizes_at options
to cassandra.yaml as an escape valve for memory pressure
- added option to specify -Dcassandra.join_ring=false on startup
to allow "warm spare" nodes or performing JMX maintenance before
joining the ring
Performance
-----------
- Disk writes and sequential scans avoid polluting page cache
(requires JNA to be enabled)
- Cassandra performs writes efficiently across datacenters by
sending a single copy of the mutation and having the recipient
forward that to other replicas in its datacenter.
- Improved network buffering
- Reduced lock contention on memtable flush
- Optimized supercolumn deserialization
- Zero-copy reads from mmapped sstable files
- Explicitly set higher JVM new generation size
- Reduced i/o contention during saving of caches
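The saving from the cross-datacenter write forwarding listed above can be estimated by counting the mutation copies that leave the coordinator's datacenter. This is a minimal sketch with a hypothetical function and input shape; it counts messages only and does not model acknowledgements.

```python
def cross_dc_messages(replicas_by_dc, coordinator_dc, forward_within_dc):
    """Count mutation copies the coordinator sends to other datacenters.

    replicas_by_dc maps a datacenter name to its number of replicas for
    the row being written.
    """
    remote_counts = [
        count
        for dc, count in replicas_by_dc.items()
        if dc != coordinator_dc and count > 0
    ]
    if forward_within_dc:
        # One copy per remote datacenter; the recipient relays it locally.
        return len(remote_counts)
    # One copy per remote replica.
    return sum(remote_counts)


layout = {"dc1": 3, "dc2": 3, "dc3": 3}
print(cross_dc_messages(layout, "dc1", forward_within_dc=False))  # 6
print(cross_dc_messages(layout, "dc1", forward_within_dc=True))   # 2
```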
0.7.0
=====
Features
--------
- Secondary indexes (indexes on column values) are now supported
- Row size limit increased from 2GB to 2 billion columns. Rows
are no longer read into memory during compaction.
- Keyspace and ColumnFamily definitions may be added and modified live
- Streaming data for repair or node movement no longer requires
anticompaction step first
- NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for
use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC. See comments
in `cassandra.yaml`.
- Optional per-Column time-to-live field allows expiring data without
having to issue explicit remove commands
- `truncate` thrift method allows clearing an entire ColumnFamily at once
- Hadoop OutputFormat and Streaming [non-jvm map/reduce via stdin/out]
support
- Up to 8x faster reads from row cache
- A new ByteOrderedPartitioner supports bytes keys with arbitrary content,
and orders keys by their byte value. This should be used in new
deployments instead of OrderPreservingPartitioner.
- Optional round-robin scheduling between keyspaces for multitenant
clusters
- Dynamic endpoint snitch mitigates the impact of impaired nodes
- New `IntegerType`, faster than LongType and allows integers of
both less and more bits than Long's 64
- A revamped authentication system that decouples authorization and
allows finer-grained control of resources.
Upgrading
---------
The Thrift API has changed in incompatible ways; see below, and refer
to http://wiki.apache.org/cassandra/ClientOptions for a list of
higher-level clients that have been updated to support the 0.7 API.
The Cassandra inter-node protocol is incompatible with 0.6.x
releases (and with 0.7 beta1), meaning you will have to bring your
cluster down prior to upgrading: you cannot mix 0.6 and 0.7 nodes.
The hints schema was changed from 0.6 to 0.7. Cassandra automatically
snapshots and then truncates the hints column family as part of
starting up 0.7 for the first time.
Keyspace and ColumnFamily definitions are stored in the system
keyspace, rather than the configuration file.
The process to upgrade is:
1) run "nodetool drain" on _each_ 0.6 node. When drain finishes (log
message "Node is drained" appears), stop the process.
2) Convert your storage-conf.xml to the new cassandra.yaml using
"bin/config-converter".
3) Rename any of your keyspace or column family names that do not adhere
to the '^\w+' regex convention.
4) Start up your cluster with the 0.7 version.
5) Initialize your Keyspace and ColumnFamily definitions using
"bin/schematool <host> <jmxport> import". _You only need to do
this to one node_.
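Step 3 can be checked ahead of time with a short script. This is a minimal sketch, assuming the intent of the '^\w+' convention is that the whole name consists of word characters (letters, digits, underscore). The helper names are hypothetical and not part of Cassandra, and re.ASCII is used on the assumption that the server-side check is ASCII-only.

```python
import re

# Assumption: a legal name is made up entirely of ASCII word characters.
NAME_PATTERN = re.compile(r"\w+", re.ASCII)


def is_legal_name(name):
    """Return True if the whole name consists of word characters."""
    return NAME_PATTERN.fullmatch(name) is not None


def names_to_rename(names):
    """Return the keyspace or column family names that need renaming."""
    return [name for name in names if not is_legal_name(name)]


print(names_to_rename(["Keyspace1", "my-keyspace", "user.data", "cf_2"]))
# ['my-keyspace', 'user.data']
```

Note that a prefix-only match on '^\w+' would accept "my-keyspace", because the pattern is satisfied by the leading "my"; the sketch uses a full match to avoid that.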
Thrift API
----------
- The Cassandra server now defaults to framed mode, rather than
unframed. Unframed is obsolete and will be removed in the next
major release.
- The Cassandra Thrift interface file has been updated for Thrift 0.5.
If you are compiling your own client code from the interface, you
will need to upgrade the Thrift compiler to match.
- Row keys are now bytes: keys stored by versions prior to 0.7.0 will be
returned as UTF-8 encoded bytes. OrderPreservingPartitioner and
CollatingOrderPreservingPartitioner continue to expect that keys contain
UTF-8 encoded strings, but RandomPartitioner now works on any key data.
- keyspace parameters have been replaced with the per-connection
set_keyspace method.
- The return type for login() is now AccessLevel.
- The get_string_property() method has been removed.
- The get_string_list_property() method has been removed.
Configuration
-------------
- Configuration file renamed to cassandra.yaml and log4j.properties to
log4j-server.properties
- PropertyFileSnitch configuration file renamed to
cassandra-topology.properties
- The ThriftAddress and ThriftPort directives have been renamed to
RPCAddress and RPCPort respectively.
- EndPointSnitch was renamed to RackInferringSnitch. A new SimpleSnitch
has been added.
- RackUnawareStrategy and RackAwareStrategy have been renamed to
SimpleStrategy and OldNetworkTopologyStrategy, respectively.
- RowWarningThresholdInMB replaced with in_memory_compaction_limit_in_mb
- GCGraceSeconds is now per-ColumnFamily instead of global
- Keyspace and column family names that do not conform to a '^\w+' regex
are considered illegal.
- Keyspace and column family definitions will need to be loaded via
"bin/schematool <host> <jmxport> import". _You only need to do this to
one node_.
- In addition to an authenticator, an authority must be configured as
well. Users of SimpleAuthenticator should use SimpleAuthority for this
value (the default is AllowAllAuthority, which corresponds with
AllowAllAuthenticator).
- The format of access.properties has changed, see the sample configuration
conf/access.properties for documentation on the new format.
JMX
---
- StreamingService moved from o.a.c.streaming to o.a.c.service
- GMFD renamed to GOSSIP_STAGE
- {Min,Mean,Max}RowCompactedSize renamed to {Min,Mean,Max}RowSize
since it no longer has to wait until compaction to be computed
Other
-----
- If extending AbstractType, make sure you follow the singleton pattern
followed by Cassandra core AbstractType classes: provide a public
static final variable called 'instance'.
0.6.6
=====
Upgrading
---------
- As part of the cache-saving feature, a third directory
(along with data and commitlog) has been added to the config
file. You will need to set and create this directory
when restarting your node into 0.6.6.
0.6.1
=====
Upgrading
---------
- We try to keep minor versions 100% compatible (data format,
commitlog format, network format) within the major series, but
we introduced a network-level incompatibility in 0.6.1.
Thus, if you are upgrading from 0.6.0 to any higher version
(0.6.1, 0.6.2, etc.) then you will need to restart your entire
cluster with the new version, instead of being able to do a
rolling restart.
0.6.0
=====
Features
--------
- row caching: configure with the RowsCached attribute in
ColumnFamily definition
- Hadoop map/reduce support: see contrib/word_count for an example
- experimental authentication support, described under
Authenticator in storage.conf
Configuration
-------------
- MemtableSizeInMB has been replaced by MemtableThroughputInMB which
triggers a memtable flush when the specified amount of data has
been written, including overwrites.
- MemtableObjectCountInMillions has been replaced by the
MemtableOperationsInMillions directive which causes a memtable flush
to occur after the specified number of operations.
- Like MemtableSizeInMB, BinaryMemtableSizeInMB has been replaced by
BinaryMemtableThroughputInMB.
- Replication factor is now per-keyspace, rather than global.
- KeysCachedFraction is deprecated in favor of KeysCached
- RowWarningThresholdInMB added, to warn before very large rows
get big enough to threaten node stability
Thrift API
----------
- removed deprecated get_key_range method
- added batch_mutate method
- deprecated multiget and batch_insert methods in favor of
multiget_slice and batch_mutate, respectively
- added ConsistencyLevel.ANY, for when you want write
availability even when it may not be readable immediately.
Unlike CL.ZERO, though, it will throw an exception if
it cannot be written *somewhere*.
JMX metrics
-----------
- read and write statistics are reported as lifetime totals,
instead of averages over the last minute. Averages since they were
last requested are also available for convenience.
- cache hit rate statistics are now available from JMX under
org.apache.cassandra.db.Caches
- compaction JMX metrics are moved to
org.apache.cassandra.db.CompactionManager. PendingTasks is now
a much better estimate of compactions remaining, and the
progress of the current compaction has been added.
- commitlog JMX metrics are moved to org.apache.cassandra.db.Commitlog
- progress of data streaming during bootstrap, loadbalance, or other
data migration, is available under
org.apache.cassandra.streaming.StreamingService.
See http://wiki.apache.org/cassandra/Streaming for details.
Installation/Upgrade
--------------------
- 0.6 network traffic is not compatible with earlier versions. You
will need to shut down all your nodes at once, upgrade, then restart.
0.5.0
=====
0. The commitlog format has changed (but sstable format has not).
When upgrading from 0.4, empty the commitlog either by running
bin/nodeprobe flush on each machine and waiting for the flush to finish,
or simply remove the commitlog directory if you only have test data.
(If more writes come in after the flush command, starting 0.5 will error
out; if that happens, just go back to 0.4 and flush again.)
The format changed twice: from 0.4 to beta1, and from beta2 to RC1.
.5 The gossip protocol has changed, meaning 0.5 nodes cannot coexist
in a cluster of 0.4 nodes or vice versa; you must upgrade your
whole cluster at the same time.
1. Bootstrap, move, load balancing, and active repair have been added.
See http://wiki.apache.org/cassandra/Operations. When upgrading
from 0.4, leave autobootstrap set to false for the first restart
of your old nodes.
2. Performance improvements across the board, especially on the write
path (over 100% improvement in stress.py throughput).
3. Configuration:
- Added "comment" field to ColumnFamily definition.
- Added MemtableFlushAfterMinutes, a global replacement for the
old per-CF FlushPeriodInMinutes setting
- Key cache settings
4. Thrift:
- Added get_range_slice, deprecating get_key_range
0.4.2
=====
1. Improve default garbage collector options significantly --
throughput will be 30% higher or more.
0.4.1
=====
1. SnapshotBeforeCompaction configuration option allows snapshotting
before each compaction, which allows rolling back to any version
of the data.
0.4.0
=====
1. On-disk data format has changed to allow billions of keys/rows per
node instead of only millions. The new format is incompatible with 0.3;
see 0.3 notes below for how to import data from a 0.3 install.
2. Cassandra now supports multiple keyspaces. Typically you will have
one keyspace per application, allowing applications to be able to
create and modify ColumnFamilies at will without worrying about
collisions with others in the same cluster.
3. Many Thrift API changes and documentation. See
http://wiki.apache.org/cassandra/API
4. Removed the web interface in favor of JMX and bin/nodeprobe, which
has significantly enhanced functionality.
5. Renamed configuration "<Table>" to "<Keyspace>".
6. Added commitlog fsync; see "<CommitLogSync>" in configuration.
0.3.0
=====
1. With enough and large enough keys in a ColumnFamily, Cassandra will
run out of memory trying to perform compactions (data file merges).
The size of what is stored in memory is (S + 16) * (N + M) where S
is the size of the key (usually 2 bytes per character), N is the
number of keys, and M is the map overhead (which can be guesstimated
at around 32 bytes per key).
So, if you have 10-character keys and 1GB of headroom in your heap
space for compaction, you can expect to store about 17M keys
before running into problems.
See https://issues.apache.org/jira/browse/CASSANDRA-208
2. Because fixing #1 requires a data file format change, 0.4 will not
be binary-compatible with 0.3 data files. A client-side upgrade
can be done relatively easily with the following algorithm:
    for key in old_client.get_key_range(everything):
        columns = old_client.get_slice or get_slice_super(key, all columns)
        new_client.batch_insert or batch_insert_super(key, columns)
The inner loop can be trivially parallelized for speed.
3. Commitlog does not fsync before reporting a write successful.
Using blocking writes mitigates this to some degree, since all
nodes that were part of the write quorum would have to fail
before sync for data to be lost.
See https://issues.apache.org/jira/browse/CASSANDRA-182
Additionally, row size (that is, all the data associated with a single
key in a given ColumnFamily) is limited by available memory, because
compaction deserializes each row before merging.
See https://issues.apache.org/jira/browse/CASSANDRA-16
DSE 5.1.0 release notes
April 18, 2017
- 5.1 components
- 5.1 new features
- 5.1 experimental features
- 5.1 changes and enhancements
- 5.1 known issues
- 5.1 resolved issues
- 5.1 CHANGES.txt
- 5.1 NEWS.txt
5.1.0 components
- Apache Cassandra™ 3.10.0.1652
- Apache Solr™ 6.0.1.0.1596
- Apache Spark™ 2.0.2.6
- Apache Tomcat® 8.0.37
- DataStax Spark Cassandra Connector 2.0.1
- DSE Java Driver 1.2.2
- DSEFS 5.1.24
- Netty 4.0.42.Final
- Spark Jobserver 0.6.2.234 (requires a compatible API)
- TinkerPop 3.2.5-20170222-de2f4034
- Certain Hadoop libraries
5.1.0 new features
See "New features in DataStax Enterprise 5.1".
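5.1.0 experimental features
- Partitioned vertex tables (PVT) for handling supernodes in DSE Graph.
When used for vertices with a very large number of edges, a partitioned vertex consists of a portion of the vertex's data, obtained by splitting the vertex into smaller components for graph database storage.
- Graph import using DseGraphFrame.
- Using experimental Apache Lucene® features with dsetool index_checks.
- SASI indexes.
- Using Spark ALPHA features for structured streaming operations to and from DSEFS.
- DSEFS file systems that span multiple datacenters.
- OpsCenter Labs features.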
5.1.0 changes and enhancements
5.1.0 DataStax Enterprise changes and enhancements
- Proxy authentication was added to the DSE authentication model. (DSP-3800), (DSP-8467)
- TimeWindowCompactionStrategy (TWCS) is now set on dse_perf tables. To use TWCS on tables created in earlier releases, alter the tables after upgrading to DSE 5.1. (DSP-5560)
- MemoryOnlyStrategy can now be used with compression. (DSP-6715)
- Dropped mutation metrics were added to the performance objects. (DSP-7936)
- DSE server startup time was improved. (DSP-9545)
- DateTieredCompactionStrategy is deprecated. Use TimeWindowCompactionStrategy instead. (DSP-9740)
- Tab completion for DSE custom compaction strategies was added to cqlsh. (DSP-9864)
- The slow query log now includes the trace ID. (DSP-10055)
- Setting row-level permissions is now supported. Setting row-level permissions with row-level access control (RLAC) is not supported for use with DSE Search or DSE Graph. (DSP-10093)
- For G1GC, the maximum heap size was increased from 8192 MB to 32765 MB. (DSP-10459)
- The compaction strategy used by CassandraAuditWriter was changed. (DSP-11508)
- A dsetool command that prints the most recent slowest queries was implemented. (DSP-11152)
- Performance of the CQL slow query log was improved and its defaults were changed. (DSP-11171)
- Upgrades to DataStax Enterprise 5.1 are supported only from DataStax Enterprise 5.0. Upgrading from earlier versions requires an interim upgrade to DSE 5.0. (DSP-11281)
- In cassandra.yaml, the default authenticator is now DseAuthenticator and the default authorizer is now DseAuthorizer. After upgrading to DSE 5.1, review and adjust your security configuration. (DSP-12211)
- Authenticators other than DseAuthenticator and authorizers other than DseAuthorizer were deprecated in DSE 5.0. Using other authenticators or authorizers in DSE 5.1 can cause some security features to work incorrectly. (DSP-12542)
- Help for CQL and cqlsh commands was improved. (DSP-12845)
In cqlsh, enter help to list all available help topics. Enter help name to show details for the name command. For example, enter help CAPTURE or help ALTER_KEYSPACE.
- For non-partitioned keyspaces, only the check for fewer than RF nodes is performed on decommission. (DSP-13054)
- Serialization of SmallInt and TinyInt was fixed. (DSP-12916)
- Null or empty passwords are now checked before legacyAuthenticate is called from CassandraLoginModule. (DSP-8573)
- User expressions can now be registered on SELECT statements. (DSP-12549)
- cqlsh COPY now applies the request timeout correctly after the upgrade to execution profiles. (DSP-12698)
- The Java driver was updated to DSE driver version 1.2.0-eap5. (DSP-11964)
- An AssertionError in continuous paging requests for select count(*) queries was fixed. (DSP-11964)
- The internal DSE driver was updated to fix formatting of the Duration type. (DSP-11964)
- The open source Python driver was replaced with the DataStax Enterprise driver. (DSP-11964)
- OutOfSpaceTest was fixed. (DSP-12239)
- Index restrictions can now be added to SELECT statements in an immutable way. (DSP-12239)
- Grammar extensions can now be added to cqlsh for tab completion. (DSP-12150)
- Compaction performance was improved. (DSP-11695)
- A client warning was added for SASI indexes. (DSP-11695)
- The cqlsh COPY FROM command now supports UNSET values. (DSP-11695)
- Error messages for incompatible authentication and authorization configurations were improved. (DSP-11695)
- Optimized continuous paging was implemented. (DSP-11695)
- The show-queries, query-log-file, and no-progress log options were added to cassandra-stress. (DSP-9476)
- cassandra-stress user mode can now generate large partitions. (DSP-9476)
- Variable-length integers (VIntCoding) and the DataOutputStreamPlus interface were optimized, using a ByteBuffer to stage writes (BufferedDataOutputStreamPlus). (DSP-9476)
- Metrics were improved to reduce overhead under resource contention. (DSP-9476)
- Performance improvement: SinglePartitionReadCommand::queriesMulticellType() now runs faster. (DSP-9476)
- GRANT/REVOKE statements now accept internal resource names. (DSP-11746)
- StatementRestrictions::getPartitionKeys() now runs faster. (DSP-11724)
- IResource is now responsible for qualifying keyspaces in authorization statements. (DSP-11588)
- The default superuser role is now inserted with a fixed timestamp. (DSP-11600)
- Permissions can now be extended. (DSP-11600)
- IResource can now be extended more easily. (DSP-11600)
- Methods were added to IAuthenticator to allow logging in with both a user and a role. (DSP-11600)
- A private protocol version was added. (DSP-11535)
5.1.0 DSE Advanced Replication changes and enhancements
- DSE Advanced Replication is certified for use with DSE Multi-Instance. (DSP-10738)
- Replication to multiple clusters is supported. (DSP-8352)
- Multi-datacenter edge (source) cluster configurations are supported. (DSP-8744)
- DSE Advanced Replication using Cassandra CDC (Change Data Capture) was implemented. (DSP-9822)
- Setting row-level permissions is supported. (DSP-10727)
Row-level access control (RLAC) security on the destination cluster. (DSP-10893)
- Migration from DSE 5.0 Advanced Replication (V1) to DSE 5.1 Advanced Replication (V2) is supported. (DSP-12280)
- Performance metrics were enhanced, including gauge metric types and transfer group metrics. (DSP-12922)
5.1.0 DSE Analytics changes and enhancements
- A WebHDFS REST interface was implemented for DSEFS. (DSP-2347)
- Spark executors can optionally be run as separate users. (DSP-4252)
- Solr indexes can now be used non-transparently to optimize SparkSQL queries. (DSP-5028)
- DSEFS is supported with BYOS. (DSP-8888)
- SSL is supported in the Spark master and worker UIs. (DSP-9928)
spark_encryption_options in dse.yaml is no longer valid.
- The Hive connector was removed. Spark SQL uses the CassandraHive metastore. The Hive cql/cassandra handlers were removed. (DSP-10333)
- BYOHadoop and DSE Hadoop were removed (deprecated in DSE 5.0). (DSP-10408)
- Locking in DSEFS is faster, and shared locks are supported. (DSP-11145)
- DSE SparkSQL supports geo types, which are displayed in Well Known Text format. (DSP-11173)
- A CQL-based resource manager communication channel was created for Spark. (DSP-11331)
- Analytics jobs run through dse spark-submit can now use continuous paging for improved performance. See "Enabling continuous paging". (DSP-11343)
- DSEGraphFrame tables can be accessed through SparkSQL. (DSP-11898)
- Server-side authentication for the Spark UI is now possible. (DSP-11955)
- The dse client-tool spark subcommands were enhanced. (DSP-12048)
- Programmatically setting the shuffle parameter with conf.set("spark.shuffle.service.port", port) is not supported. Instead, use dse spark-submit, which automatically sets the correct service port based on the authentication state. (DSP-12471)
- Spark Jobserver was upgraded to 0.6.2.234. With this custom version, applications must be recompiled using the compatible DataStax Spark Jobserver API (recommended) or jobserver 0.7.0. (DSP-12478)
5.1.0 DSE Graph changes and enhancements
- The default number of threads used to load vertices (load_vertex_threads) or edges (load_edge_threads) changed from 1 to 0. (DGL-124)
- When a query fails because of a timeout, an error message now indicates that the timeout was exceeded. (DSP-9393)
- ifExists was added for dropping a graph. (DSP-9511)
- Database errors related to graph queries are now sent directly to the driver. (DSP-9567)
- The edge ID format changed. This has no impact on users. (DSP-10566)
- Out-of-bounds geo data is now rejected. (DSP-10748)
- graph#io is disabled. (DSP-10804)
- The DSEGraphFrame framework for batch graph queries improves Graph and Spark integration, with better performance and ease of use. (DSP-11104)
- External Solr schema changes are no longer overwritten by DSE Graph. (DSP-11226)
- Date types are supported in Graph. (DSP-11287)
- Graph-specific MBeans moved from the datastore-latencies category to the request-latencies category. (DSP-11521)
- Solr-based fuzzy search is supported in graphs. (DSP-11273)
- The DSE Graph API for edit distance queries is supported. (DSP-11880)
- The search regular expression "." now matches all whitespace characters. (DSP-11952)
- Kryo version conflict. (DSP-11984)
- The DSEG snapshot configuration mutator was added. (DSP-12072)
- Spark properties can now be set from Gremlin. (DSP-12296)
- The Geo interface for distance and polygon queries changed in the drivers. (DSP-12710)
- Geo predicates changed. (DSP-12467)
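5.1.0 DSEFS changes and enhancements
- DSEFS commands to control file permissions and ownership. (DSP-10582)
- Tab auto-completion is supported. (DSP-10584)
- File compression is supported. (DSP-10655)
- Local file system operations in the DSEFS shell are enhanced. (DSP-10933)
- The DSEFS shell supports comments (#). (DSP-10935)
- DSEFS metrics are exposed through JMX. (DSP-11375)
- The DSEFS user experience is improved with human-readable sizes (-h) and single-column output (-1). (DSP-11675)
- The recursive ls parameter name is corrected from -r to -R. (DSP-12016)
- name_id is now part of the primary key of the names table. With this enhancement to the DSEFS Cassandra schema, all metadata can be effectively recovered from inconsistencies caused by concurrent writes. Upgrading to DataStax Enterprise 5.1 requires steps to obtain the new schema. (DSP-12450)
- DSEFS is enabled by default in DSE 5.1.0, but the dsefs_options.enabled setting is commented out in the new DSE 5.1.0 dse.yaml file. To enable DSEFS, uncomment the dsefs_options.enabled setting after upgrading to DSE 5.1.0. (DSP-13310)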
5.1.0 DSE Search changes and enhancements
- DataImportHandler is no longer supported. The import handler tab is removed from the Solr Admin UI. Before upgrading to DSE 5.1, remove all data import handlers from the solrconfig file. (DSP-6266)
- The legacy netty-based internode communication protocol is removed. See /ja/upgrade/doc/upgrade/datastax_enterprise/upgdDSE51.html#upgdDSE51__prepUpg51Search. Timeouts for non-query search requests, such as core creation and distributed deletes, are set in internode_messaging_options with the client_request_timeout_seconds option. (DSP-6933)
- Both analyzed and non-analyzed versions of textual vertex properties are now indexed automatically. (DSP-7633)
- dsetool can check index integrity using Lucene CheckIndex. (DSP-8875)
- New DSE Search index management commands for managing search indexes across the cluster. (DSP-9204)
- Using the Lucene merge scheduler without parallelism causes periods of 0 throughput. (DSP-9325)
In earlier releases, the default mergeScheduler settings in solrconfig.xml were not set appropriately. The defaults are now set appropriately and automatically unless a custom mergeScheduler configuration is provided.
- Deprecated Solr field types require action before upgrading to DSE 5.1. (DSP-9509)
- HTTP writes are deprecated. Use CQL to insert data into DSE. (DSP-9540)
- The dsetool search commands now use the CQL index management commands. dsetool create_core no longer supports deleteAll. (DSP-9762)
- DateRangeField is supported with the new DateRangeType data type. (DSP-10225)
- Asynchronous indexing performance is improved. (DSP-10617)
- Checks for unnecessary configuration elements are added to CassandraSolrConfig. (DSP-10677)
- LUCENE-7299: segment flushing is optimized with radix sort. (DSP-10685)
- The default behavior of auto-generated schemas changed to enable DocValues. (DSP-10690)
- XML is now indented correctly, which makes auto-generated resources easier to read. (DSP-10795)
- After upgrading to DSE 5.1, when the search schema uses SpatialRecursivePrefixTreeFieldType (RPT), the units setting on the field type is replaced by distanceUnits. (DSP-10802)
- The Solr query parser is optimized to use filter boolean queries. (DSP-10916)
- Stored=true copy fields are not supported and cause schema validation to fail. Before upgrading to 5.1, change the stored attribute of copyField directives in the schema.xml file from true to false and reload the core. (DSP-11087)
- The PER PARTITION clause is not supported in DSE Search solr_query queries. (DSP-11050)
- Limiting query time with the Solr timeAllowed parameter is supported, with some differences in DSE Search. (DSP-11165)
- Client-side mapping of DSE Search exceptions is improved. (DSP-11315)
- The default batch size of the search TTL process changed. (DSP-11493)
If no value is specified for ttl_index_rebuild_options.max_docs_per_batch in dse.yaml, the default changes from 100 to 4096.
- DSE Search does not support the Cassandra duration data type. (DSP-11825)
- Error handling for authentication and authorization of Solr HTTP requests and the Solr Admin UI is improved. (DSP-12550)
A request that fails because of missing permissions now returns a 403 error. Earlier versions returned a 401 error.
- Unfrozen tuples are supported. (DSP-12347)
- The default selection of write path configuration in dse.yaml and solrconfig.xml is improved. (DSP-12491)
5.1.0 known issues
- Even when nodetool repair -full or nodetool repair -pr is used, DSE 5.0.0 to 5.0.9 runs the repair as incremental and marks sstables as repaired, which triggers anticompaction. (DSP-14464)
DSE Search known issues
- DSE Search might miss token filters in mixed-version clusters. To ensure that token filtering works correctly, upgrade all nodes to DSE 5.1.6 or later. (DSP-14998)
- You can skip DSE 5.1.0 and upgrade directly to DSE 5.1.1 if any of the following apply:
  - You use the HTTP interface. (DSP-13318), (DSP-13270)
  - You have Thrift column families that back active Solr cores. (DSP-13019)
  - You use TTL to expire data. (DSP-12960)
  - You use index encryption. (DSP-13155), (DSP-12620)
  - You use live indexing. (DSP-12040), (DSP-12941)
- Regardless of the configuration, Solr listens only on port 8080. (DSP-13187)
- After upgrading to 5.1.0, the auto-generated solrconfig.xml contains an invalid requestHandler for JSON core creation. (DSP-13188)
If you do HTTP writes with JSON documents (deprecated), change the auto-generated solrconfig.xml as follows. Change from:
<requestHandler name="/update/json" class="solr.UpdateUpdateRequestHandler" startup="lazy"/>
to:
<requestHandler name="/update/json" class="solr.UpdateRequestHandler" startup="lazy"/>
- The "auto-login" feature used by both the Shiro 1.2.4 library and the Spark Jobserver is vulnerable to malicious attackers. If you define the "auto-login" feature in application.conf, do not enable it in a custom shiro.ini file.
The "auto-login" feature is disabled by default in DSE. (DSP-11072)
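The solrconfig.xml correction described above for DSP-13188 is a one-string change, so it can be scripted. The following is a minimal sketch, not a DataStax tool: the function name `fix_update_json_handler` is hypothetical, and the sketch assumes the auto-generated file contains the handler class attribute exactly as quoted in the issue.

```python
# Hypothetical helper: corrects the invalid handler class quoted in DSP-13188.
BAD_CLASS = 'class="solr.UpdateUpdateRequestHandler"'
GOOD_CLASS = 'class="solr.UpdateRequestHandler"'


def fix_update_json_handler(solrconfig_text: str) -> str:
    """Return the solrconfig.xml text with the invalid class name replaced."""
    return solrconfig_text.replace(BAD_CLASS, GOOD_CLASS)


# Example input copied from the known issue above.
sample = ('<requestHandler name="/update/json" '
          'class="solr.UpdateUpdateRequestHandler" startup="lazy"/>')
fixed = fix_update_json_handler(sample)
print(fixed)
```

The replacement is idempotent, so running it on an already corrected file leaves the text unchanged. Uploading the corrected resource and reloading the core are outside the scope of this sketch.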
5.1.0 resolved issues
5.1.0 DataStax Enterprise resolved issues
- Current worst queries in the slow query log. (DSP-5088)
New configurable cql_slow_log_options.
- dse lib contains an outdated metrics-core version. (DSP-11389)
- The cqlsh SOURCE command should not assume PlainTextAuthenticator. (DSP-12773)
5.1.0 DSE Advanced Replication resolved issues
- Authentication and encryption settings for SSL remote cluster connections are fixed. (DSP-9470)
5.1.0 DSE Analytics resolved issues
- The dse client-tool sql-schema command is now consistent with double-dash parameters. (DSP-10557)
- CFS repair can repair only the default file system defined in the Hadoop configuration. (DSP-12481)
5.1.0 DSE Graph resolved issues
5.1.0 DSE Search resolved issues
- Solr range facets before, after, and between return inaccurate and inconsistent results on multi-node clusters. (DSP-4485)
- Validate auto-generated resources before writing them. (DSP-7638)
- Non-frozen UDTs are supported. The Solr field name policy applies. (DSP-11412)
- Users need SELECT permission on search indexes to view them. In the Solr Admin UI, every core operation requires a specific permission. (DSP-11910)
- QueryUtils#getStandardVertexIdComponents is not thread safe. (DSP-12254)
- Cores are not unloaded correctly on restarted nodes. (DSP-12434)
- The dsetool native driver connection is not isolated to the specified host. (DSP-12438)
- Heap is exhausted when search reindexes very wide partitions. New IndexPool MBean attributes. (DSP-12547)
- Concurrent sort issue with RT. (DSP-12600)
- Other redundant experimental Solr 6 features are disabled. (DSP-13093)
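Hadoop libraries
The built-in Hadoop and Bring-Your-Own-Hadoop (BYOH) were deprecated in DataStax Enterprise (DSE) 5.0 and removed in DSE 5.1. Because Hadoop is removed in DSE 5.1 and later, Hadoop services that were previously included in DSE, such as the MapReduce JobTracker and TaskTracker, can no longer be started in DSE.
However, DSE continues to support built-in Spark (DSE 4.5 and later) and Bring-Your-Own-Spark (BYOS, DSE 5.0 and later). Spark uses certain Hadoop libraries on the server and the client, so DSE still ships the Hadoop libraries that Spark and BYOS need to operate.
To view the bundled Hadoop libraries, see "DataStax Enterprise 5.1.x third-party software".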
DSE 5.1 CHANGES.txt
List of production-certified changes from Apache Cassandra™ 3.10.0 that are included in DataStax Enterprise 5.1.
DataStax Enterprise (DSE) 5.1 includes all changes from earlier DSE releases plus the following production-certified changes from Apache Cassandra™ 3.10.0. These changes are listed in CHANGES.txt.
- Fix testLimitSSTable flake caused by concurrent flush (CASSANDRA-12820)
- cdc column addition strikes again (CASSANDRA-13382)
- Fix static column indexes (CASSANDRA-13277)
- DataOutputBuffer.asNewBuffer broken (CASSANDRA-13298)
- Unit test CipherFactoryTest fails on MacOS (CASSANDRA-13370)
- Forbid SELECT restrictions and CREATE INDEX on non-frozen UDT columns (CASSANDRA-13247)
- The default logging that ships incorrectly prints "?:?" for the "%F:%L" pattern (CASSANDRA-13317)
- Possible AssertionError in UnfilteredRowIteratorWithLowerBound (CASSANDRA-13366)
- Support unaligned memory access for AArch64 (CASSANDRA-13326)
- Improve SASI range iterator efficiency on intersection with an empty range (CASSANDRA-12915)
- Fix equality comparisons of columns using the duration type (CASSANDRA-13174)
- Obfuscate password in stress-graphs (CASSANDRA-12233)
- Move to FastThreadLocalThread and FastThreadLocal (CASSANDRA-13034)
- nodetool stopdaemon exits with an error (CASSANDRA-13030)
- Tables in system_distributed should not use a gcgs of 0 (CASSANDRA-12954)
- Fix primary index calculation for SASI (CASSANDRA-12910)
- More fixes to the TokenAllocator (CASSANDRA-12990)
- NoReplicationTokenAllocator should work with a replication factor of 0 (CASSANDRA-12983)
- Address message coalescing regression (CASSANDRA-12676)
- Fix possible NPE on upgrade to 3.0/3.X in case of IO errors (CASSANDRA-13389)
- Legacy deserializer can create empty range tombstones (CASSANDRA-13341)
- Legacy caching options can prevent upgrade to 3.0 (CASSANDRA-13384)
- Use the Kernel32 library to retrieve the PID on Windows and fix startup checks (CASSANDRA-13333)
- Fix code to not exchange schema across major versions (CASSANDRA-13274)
- Dropping a column results in a corrupt SSTable (CASSANDRA-13337)
- Bugs handling range tombstones in the SSTable iterators (CASSANDRA-13340)
- Fix CONTAINS filtering for null collections (CASSANDRA-13246)
- Applying: use a unique metric reservoir per test run when using Cassandra-wide metrics residing in MBeans (CASSANDRA-13216)
- Propagate row deletions in 2i tables on upgrade (CASSANDRA-13320)
- Slice.isEmpty() returns false for some empty slices (CASSANDRA-13305)
- Add formatted row output to assertEmpty in CQL Tester (CASSANDRA-13238)
- Prevent data loss on upgrade from 2.1 to 3.0 by adding a component separator to the LogRecord absolute path (CASSANDRA-13294)
- Improve testing on macOS by eliminating sigar logging (CASSANDRA-13233)
- Cqlsh copy-from should exit with an error when the CSV contains invalid data for collections (CASSANDRA-13071)
- Update c.yaml doc for offheap memtables (CASSANDRA-13179)
- Faster StreamingHistogram (CASSANDRA-13038)
- Legacy deserializer can create unexpected boundary range tombstones (CASSANDRA-13237)
- Remove unnecessary assertion from AntiCompactionTest (CASSANDRA-13070)
- Fix cqlsh COPY for dates before 1900 (CASSANDRA-13185)
- Use keyspace replication settings on the system.size_estimates table (CASSANDRA-9639)
- Add vm.max_map_count StartupCheck (CASSANDRA-13008)
- Hint-related logging should include the host ID and the destination IP address (CASSANDRA-13205)
- Reloading logback.xml does not work (CASSANDRA-13173)
- Lightweight transactions temporarily fail after upgrade from 2.1 to 3.0 (CASSANDRA-13109)
- Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9 (CASSANDRA-13125)
- Fix UPDATE queries with empty IN restrictions (CASSANDRA-13152)
- Fix handling of partitions with partition-level deletion plus live rows in sstabledump (CASSANDRA-13177)
- Provide a workaround when system_schema.columns does not contain entries for a table that is in system_schema.tables (CASSANDRA-13180)
- Accept the truststore-password parameter in cassandra-stress (CASSANDRA-12773)
- Discard in-flight shadow round responses (CASSANDRA-12653)
- Do not anticompact repaired data, to avoid inconsistencies (CASSANDRA-13153)
- Wrong logger name in AnticompactionTask (CASSANDRA-13343)
- Commitlog replay may fail if the last mutation is within 4 bytes of the end of the segment (CASSANDRA-13282)
- Fix queries that update the same list multiple times (CASSANDRA-13130)
- Fix GRANT/REVOKE when the keyspace is not specified (CASSANDRA-13053)
- Avoid a race on the receiver by starting the streaming sender thread after sending the init message (CASSANDRA-12886)
- Fix "multiple versions of ant detected..." when running ant test (CASSANDRA-13232)
- Coalescing strategy sleeps too much (CASSANDRA-1309)
- Fix flaky LongLeveledCompactionStrategyTest (CASSANDRA-12202)
- Fix failing COPY TO STDOUT (CASSANDRA-12497)
- Fix ColumnCounter::countAll behaviour for reverse queries (CASSANDRA-13222)
- Exceptions encountered when calling getSeeds() break the OTC thread (CASSANDRA-13018)
- Fix negative mean latency metric (CASSANDRA-12876)
- Use only one file pointer when creating commitlog segments (CASSANDRA-12539)
- Remove unused repositories (CASSANDRA-13278)
- Log the stack trace of uncaught exceptions (CASSANDRA-13108)
- Use portable stderr for Java errors at startup (CASSANDRA-13211)
- Fix thread leak in OutboundTcpConnection (CASSANDRA-13204)
- Coalescing strategy can enter an infinite loop (CASSANDRA-13159)
DSE 5.1 NEWS.txt
General upgrade advice for DataStax Enterprise 5.1
GENERAL UPGRADING ADVICE FOR ANY VERSION
========================================
Snapshotting is fast (especially if you have JNA installed) and takes
effectively zero disk space until you start compacting the live data
files again. Thus, best practice is to ALWAYS snapshot before any
upgrade, just in case you need to roll back to the previous version.
(Cassandra version X + 1 will always be able to read data files created
by version X, but the inverse is not necessarily the case.)
When upgrading major versions of Cassandra, you will be unable to
restore snapshots created with the previous major version using the
'sstableloader' tool. You can upgrade the file format of your snapshots
using the provided 'sstableupgrade' tool.
3.11.0
======
Upgrading
---------
- The NativeAccessMBean isAvailable method will only return true if the
native library has been successfully linked. Previously it was returning
true if JNA could be found but was not taking into account link failures.
- Primary ranges in the system.size_estimates table are now based on the keyspace
replication settings and adjacent ranges are no longer merged (CASSANDRA-9639).
- In 2.1, the default for otc_coalescing_strategy was 'DISABLED'.
In 2.2 and 3.0, it was changed to 'TIMEHORIZON', but that value was shown
to be a performance regression. The default for 3.11.0 and newer has
been reverted to 'DISABLED'. Users upgrading from Cassandra 2.2 or 3.0 should
be aware that the default has changed.
3.10
====
New features
------------
- New `DurationType` (cql duration). See CASSANDRA-11873
- Runtime modification of concurrent_compactors is now available via nodetool
- Support for the assignment operators +=/-= has been added for update queries.
- An Index implementation may now provide a task which runs prior to joining
the ring. See CASSANDRA-12039
- Filtering on partition key columns is now also supported for queries without
secondary indexes.
- A slow query log has been added: slow queries will be logged at DEBUG level.
For more details refer to CASSANDRA-12403 and slow_query_log_timeout_in_ms
in cassandra.yaml.
- Support for GROUP BY queries has been added.
- A new compaction-stress tool has been added to test the throughput of compaction
for any cassandra-stress user schema. see compaction-stress help for how to use.
- Compaction can now take into account overlapping tables that don't take part
in the compaction to look for deleted or overwritten data in the compacted tables.
When such data is found, it can be safely discarded, which in turn should enable
the removal of tombstones over that data.
The behavior can be engaged in two ways:
- as a "nodetool garbagecollect -g CELL/ROW" operation, which applies
single-table compaction on all sstables to discard deleted data in one step.
- as a "provide_overlapping_tombstones:CELL/ROW/NONE" compaction strategy flag,
which uses overlapping tables as a source of deletions/overwrites during all
compactions.
The argument specifies the granularity at which deleted data is to be found:
- If ROW is specified, only whole deleted rows (or sets of rows) will be
discarded.
- If CELL is specified, any columns whose value is overwritten or deleted
will also be discarded.
- NONE (default) specifies the old behavior, overlapping tables are not used to
decide when to discard data.
Which option to use depends on your workload, both ROW and CELL increase the
disk load on compaction (especially with the size-tiered compaction strategy),
with CELL being more resource-intensive. Both should lead to better read
performance if deleting rows (resp. overwriting or deleting cells) is common.
- Prepared statements are now persisted in the table prepared_statements in
the system keyspace. Upon startup, this table is used to preload all
previously prepared statements - i.e. in many cases clients do not need to
re-prepare statements against restarted nodes.
- cqlsh can now connect to older Cassandra versions by downgrading the native
protocol version. Please note that this is currently not part of our release
testing and, as a consequence, it is not guaranteed to work in all cases.
See CASSANDRA-12150 for more details.
- Snapshots that are automatically taken before a table is dropped or truncated
will have a "dropped" or "truncated" prefix on their snapshot tag name.
- Metrics are exposed for successful and failed authentication attempts.
These can be located using the object names org.apache.cassandra.metrics:type=Client,name=AuthSuccess
and org.apache.cassandra.metrics:type=Client,name=AuthFailure respectively.
- Add support to "unset" JSON fields in prepared statements by specifying DEFAULT UNSET.
See CASSANDRA-11424 for details
- Allow TTL with null value on insert and update. It will be treated as equivalent to inserting a 0.
- Removed outboundBindAny configuration property. See CASSANDRA-12673 for details.
Upgrading
---------
- Support for altering the types of columns in already defined tables and of UDT fields has been disabled.
If it is necessary to return a different type, please use casting instead. See
CASSANDRA-12443 for more details.
- Specifying the default_time_to_live option when creating or altering a
materialized view was erroneously accepted (and ignored). It is now
properly rejected.
- Only Java and JavaScript are now supported UDF languages.
The sandbox in 3.0 already prevented the use of script languages except Java
and JavaScript.
- Compaction now correctly drops sstables out of CompactionTask when there
isn't enough disk space to perform the full compaction. This should reduce
pending compaction tasks on systems with little remaining disk space.
- Request timeouts in cassandra.yaml (read_request_timeout_in_ms, etc) now apply to the
"full" request time on the coordinator. Previously, they only covered the time from
when the coordinator sent a message to a replica until the time that the replica
responded. Additionally, the previous behavior was to reset the timeout when performing
a read repair, making a second read to fix a short read, and when subranges were read
as part of a range scan or secondary index query. In 3.10 and higher, the timeout
is no longer reset for these "subqueries". The entire request must complete within
the specified timeout. As a consequence, your timeouts may need to be adjusted
to account for this. See CASSANDRA-12256 for more details.
- Logs written to stdout are now consistent with logs written to files.
Time is now local (it was UTC on the console and local in files). Date, thread, file
and line info were added to stdout. (see CASSANDRA-12004)
- The 'clientutil' jar, which has been somewhat broken on the 3.x branch, is no longer provided.
The features provided by that jar are provided by any good java driver and we advise relying on drivers rather than on
that jar, but if you need that jar for backward compatibility until you do so, you should use the version provided
on a previous Cassandra branch, like the 3.0 branch (by design, the functionality provided by that jar is stable
across versions so using the 3.0 jar for a client connecting to 3.x should work without issues).
- (Tools development) DatabaseDescriptor no longer implicitly starts up components/services like
commit log replay. This may break existing 3rd party tools and clients. In order to start up
a standalone tool or client application, use the DatabaseDescriptor.toolInitialization() or
DatabaseDescriptor.clientInitialization() methods. Tool initialization sets up partitioner,
snitch, encryption context. Client initialization just applies the configuration but does not
set up anything. Instead of using Config.setClientMode() or Config.isClientMode(), which are
deprecated now, use one of the appropriate new methods in DatabaseDescriptor.
- Application layer keep-alives were added to the streaming protocol to prevent idle incoming connections from
timing out and failing the stream session (CASSANDRA-11839). This effectively deprecates the streaming_socket_timeout_in_ms
property in favor of streaming_keep_alive_period_in_secs. See cassandra.yaml for more details about this property.
- Duration literals support the ISO 8601 format. As a consequence, identifiers matching that format
(e.g P2Y or P1MT6H) will not be supported anymore (CASSANDRA-11873).
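The request timeout change described above (CASSANDRA-12256) can be illustrated with a small model. This is a simplified sketch under stated assumptions: the function names are hypothetical, a request is modeled only as a list of sequential subquery durations in milliseconds, and the pre-3.10 rule is reduced to "the timer restarts for every subquery".

```python
def completes_before_3_10(subquery_ms, timeout_ms):
    """Simplified old behavior: the timeout is reset for each subquery."""
    return all(duration <= timeout_ms for duration in subquery_ms)


def completes_from_3_10(subquery_ms, timeout_ms):
    """New behavior: the whole request must finish within the timeout."""
    return sum(subquery_ms) <= timeout_ms


# A read followed by a second read to fix a short read (illustrative numbers).
request = [3000, 3000]
timeout = 5000

old_ok = completes_before_3_10(request, timeout)
new_ok = completes_from_3_10(request, timeout)
print(old_ok, new_ok)
```

In this model the same request succeeds under the old rule and times out under the new one, which is why the notes suggest that timeouts may need to be adjusted.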
3.8
===
New features
------------
- Shared pool threads are now named according to the stage they are executing
tasks for. Thread names mentioned in traced queries change accordingly.
- A new option has been added to cassandra-stress "-rate fixed={number}/s"
that forces a scheduled rate of operations/sec over time. Using this, stress can
accurately account for coordinated omission from the stress process.
- The cassandra-stress "-rate limit=" option has been renamed to "-rate throttle="
- hdr histograms have been added to stress runs; their output can be saved to disk using the
"-log hdrfile=" option. This histogram includes response/service/wait times when used with the
fixed or throttle rate options. The histogram file can be plotted on
http://hdrhistogram.github.io/HdrHistogram/plotFiles.html
- TimeWindowCompactionStrategy has been added. This has proven to be a better approach
to time series compaction and new tables should use this instead of DTCS. See
CASSANDRA-9666 for details.
- Change-Data-Capture is now available. See cassandra.yaml for cdc-specific flags and
a brief explanation of on-disk locations for archived data in CommitLog form. This can
be enabled via ALTER TABLE ... WITH cdc=true.
Upon flush, CommitLogSegments containing data for CDC-enabled tables are moved to
the data/cdc_raw directory until removed by the user and writes to CDC-enabled tables
will be rejected with a WriteTimeoutException once cdc_total_space_in_mb is reached
between unflushed CommitLogSegments and cdc_raw.
NOTE: CDC is disabled by default in the .yaml file. Do not enable CDC on a mixed-version
cluster as it will lead to exceptions which can interrupt traffic. Once all nodes
have been upgraded to 3.8 it is safe to enable this feature and restart the cluster.
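The response, service, and wait times mentioned above for the fixed-rate option can be related with a short model of coordinated omission. This is an illustrative sketch, not cassandra-stress code: the function name is hypothetical, and it assumes a single worker that issues operations one at a time against a fixed schedule, with times in integer milliseconds.

```python
def fixed_rate_timings(interval_ms, service_ms):
    """Compute wait, service, and response time per operation.

    Operation i is *scheduled* at i * interval_ms. If the previous operation
    is still running, the actual start is delayed; measuring from the
    scheduled start keeps that delay visible instead of omitting it.
    """
    timings = []
    previous_finish = 0
    for i, service in enumerate(service_ms):
        intended_start = i * interval_ms
        actual_start = max(intended_start, previous_finish)
        finish = actual_start + service
        timings.append({
            "wait": actual_start - intended_start,
            "service": service,
            "response": finish - intended_start,
        })
        previous_finish = finish
    return timings


# 10 operations per second (one every 100 ms); the second one stalls.
timings = fixed_rate_timings(interval_ms=100, service_ms=[50, 300, 50])
for t in timings:
    print(t)
```

The third operation takes only 50 ms of service time, but it was scheduled 200 ms before it could start, so its response time is 250 ms. A measurement based on service time alone would hide that delay.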
Upgrading
---------
- The ReversedType behaviour has been corrected for clustering columns of
BYTES type containing empty value. Scrub should be run on the existing
SSTables containing a descending clustering column of BYTES type to correct
their ordering. See CASSANDRA-12127 for more details.
- Ec2MultiRegionSnitch will no longer automatically set broadcast_rpc_address
to the public instance IP if this property is defined on cassandra.yaml.
- The names "json" and "distinct" are no longer valid as user-defined function
names (they are still valid as column names, however). In the unlikely case where
you had defined functions with such names, you will need to recreate
those under a different name, change your code to use the new names and
drop the old versions, and this _before_ upgrade (see CASSANDRA-10783 for more
details).
Deprecation
-----------
- DateTieredCompactionStrategy has been deprecated - new tables should use
TimeWindowCompactionStrategy. Note that migrating an existing DTCS-table to TWCS might
cause increased compaction load for a while after the migration so make sure you run
tests before migrating. Read CASSANDRA-9666 for background on this.
3.7
===
Upgrading
---------
- A maximum size for SSTables values has been introduced, to prevent out of memory
exceptions when reading corrupt SSTables. This maximum size can be set via
max_value_size_in_mb in cassandra.yaml. The default is 256MB, which matches the default
value of native_transport_max_frame_size_in_mb. SSTables will be considered corrupt if
they contain values whose size exceeds this limit. See CASSANDRA-9530 for more details.
3.6
=====
New features
------------
- JMX connections can now use the same auth mechanisms as CQL clients. New options
in cassandra-env.(sh|ps1) enable JMX authentication and authorization to be delegated
to the IAuthenticator and IAuthorizer configured in cassandra.yaml. The default settings
still only expose JMX locally, and use the JVM's own security mechanisms when remote
connections are permitted. For more details on how to enable the new options, see the
comments in cassandra-env.sh. A new class of IResource, JMXResource, is provided for
the purposes of GRANT/REVOKE via CQL. See CASSANDRA-10091 for more details.
Also, directly setting JMX remote port via the com.sun.management.jmxremote.port system
property at startup is deprecated. See CASSANDRA-11725 for more details.
- JSON timestamps are now in UTC and contain the timezone information, see CASSANDRA-11137 for more details.
- Collision checks are performed when joining the token ring, regardless of whether
the node should bootstrap. Additionally, replace_address can legitimately be used
without bootstrapping to help with recovery of nodes with partially failed disks.
See CASSANDRA-10134 for more details.
- Key cache will only hold indexed entries up to the size configured by
column_index_cache_size_in_kb in cassandra.yaml in memory. Larger indexed entries
will never go into memory. See CASSANDRA-11206 for more details.
- For tables having a default_time_to_live, specifying a TTL of 0 will remove the TTL
from the inserted or updated values.
- Startup is now aborted if corrupted transaction log files are found. The details
of the affected log files are now logged, allowing the operator to decide how
to resolve the situation.
- Filtering expressions are made more pluggable and can be added programmatically via
a QueryHandler implementation. See CASSANDRA-11295 for more details.
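The default_time_to_live item in the list above can be read as a small decision rule. The sketch below is illustrative only: `effective_ttl` and the `UNSPECIFIED` sentinel are hypothetical names, and the model assumes that a table default of 0 means "no default TTL".

```python
# Sentinel meaning "the statement did not specify a TTL at all".
UNSPECIFIED = object()


def effective_ttl(default_time_to_live, using_ttl=UNSPECIFIED):
    """Return the TTL in seconds applied to a write, or None for no expiry."""
    if using_ttl is UNSPECIFIED:
        # No TTL in the statement: fall back to the table default, if any.
        return default_time_to_live or None
    if using_ttl == 0:
        # An explicit TTL of 0 removes the TTL, even with a table default.
        return None
    return using_ttl


print(effective_ttl(3600))      # table default applies
print(effective_ttl(3600, 0))   # explicit 0 removes the TTL
print(effective_ttl(3600, 60))  # explicit TTL overrides the default
```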
3.4
===
New features
------------
- Internal authentication now supports caching of encrypted credentials.
Reference cassandra.yaml:credentials_validity_in_ms
- Remote configuration of auth caches via JMX can be disabled using the
system property cassandra.disable_auth_caches_remote_configuration
- The sstabledump tool has been added as the 3.0 version of the former sstable2json. The tool only
supports v3.0+ SSTables. See tool's help for more detail.
Upgrading
---------
- Nothing specific to 3.4 but please see previous versions upgrading section,
especially if you are upgrading from 2.2.
Deprecation
-----------
- The mbean interfaces org.apache.cassandra.auth.PermissionsCacheMBean and
org.apache.cassandra.auth.RolesCacheMBean are deprecated in favor of
org.apache.cassandra.auth.AuthCacheMBean. This generalized interface is
common across all caches in the auth subsystem. The specific mbean interfaces
for each individual cache will be removed in a subsequent major version.
3.2
===
New features
------------
- We now make sure that a token does not exist in several data directories. This
means that we run one compaction strategy per data_file_directory and we use
one thread per directory to flush. Use nodetool relocatesstables to make sure your
tokens are in the correct place, or just wait and compaction will handle it. See
CASSANDRA-6696 for more details.
- bound maximum in-flight commit log replay mutation bytes to 64 megabytes
tunable via cassandra.commitlog_max_outstanding_replay_bytes
- Support for type casting has been added to the selection clause.
- Hinted handoff now supports compression. Reference cassandra.yaml:hints_compression.
Note: hints compression is currently disabled by default.
Upgrading
---------
- The compression ratio metrics computation has been modified to be more accurate.
- Running Cassandra as root is prevented by default.
- JVM options are moved from cassandra-env.(sh|ps1) to jvm.options file
Deprecation
-----------
- The Thrift API is deprecated and will be removed in Cassandra 4.0.
3.1
=====
Upgrading
---------
- The return value of SelectStatement::getLimit has been changed from DataLimits
to int.
- Custom index implementations should be aware that the method Indexer::indexes()
has been removed as its contract was misleading and all custom implementations
should almost surely have returned true unconditionally for that method.
- GC logging is now enabled by default (you can disable it in the jvm.options
file if you prefer).
3.0
===
New features
------------
- EACH_QUORUM is now a supported consistency level for read requests.
- Support for IN restrictions on any partition key component or clustering key
as well as support for EQ and IN multicolumn restrictions has been added to
UPDATE and DELETE statement.
- Support for single-column and multi-column slice restrictions (>, >=, <= and <)
has been added to DELETE statements
- nodetool rebuild_index accepts the index argument without
the redundant table name
- Materialized Views, which allow for server-side denormalization, are now
available. Materialized views provide an alternative to secondary indexes
for non-primary key queries, and perform much better for indexing high
cardinality columns.
See http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
- Hinted handoff has been completely rewritten. Hints are now stored in flat
files, with less overhead for storage and more efficient dispatch.
See CASSANDRA-6230 for full details.
- Option to not purge unrepaired tombstones. To avoid users having data resurrected
if repair has not been run within gc_grace_seconds, an option has been added to
only allow tombstones from repaired sstables to be purged. To enable, set the
compaction option 'only_purge_repaired_tombstones':true but keep in mind that if
you do not run repair for a long time, you will keep all tombstones around which
can cause other problems.
- Enabled warning on GC taking longer than 1000ms. See
cassandra.yaml:gc_warn_threshold_in_ms
Upgrading
---------
- Clients must use the native protocol version 3 when upgrading from 2.2.X as
the native protocol version 4 is not compatible between 2.2.X and 3.Y. See
https://www.mail-archive.com/user@cassandra.apache.org/msg45381.html for details.
- A new argument of type InetAddress has been added to IAuthenticator::newSaslNegotiator,
representing the IP address of the client attempting authentication. It will be a breaking
change for any custom implementations.
- token-generator tool has been removed.
- Upgrade to 3.0 is supported from Cassandra 2.1 versions greater or equal to 2.1.9,
or Cassandra 2.2 versions greater or equal to 2.2.2. Upgrade from Cassandra 2.0 and
older versions is not supported.
- The 'memtable_allocation_type: offheap_objects' option has been removed. It should
be re-introduced in a future release and you can follow CASSANDRA-9472 to know more.
- Configuration parameter memory_allocator in cassandra.yaml has been removed.
- The native protocol versions 1 and 2 are not supported anymore.
- Max mutation size is now configurable via max_mutation_size_in_kb setting in
cassandra.yaml; the default is half of commitlog_segment_size_in_mb * 1024.
- 3.0 requires Java 8u40 or later.
- Garbage collection options were moved from cassandra-env to jvm.options file.
- New transaction log files have been introduced to replace the compactions_in_progress
system table, temporary file markers (tmp and tmplink) and sstable ancestors.
Therefore, compaction metadata no longer contains ancestors. Transaction log files
list sstable descriptors involved in compactions and other operations such as flushing
and streaming. Use the sstableutil tool to list any sstable files currently involved
in operations not yet completed, which previously would have been marked as temporary.
A transaction log file contains one sstable per line, with the prefix "add:" or "remove:".
They also contain a special line "commit", only inserted at the end when the transaction
is committed. On startup we use these files to cleanup any partial transactions that were
in progress when the process exited. If the commit line is found, we keep new sstables
(those with the "add" prefix) and delete the old sstables (those with the "remove" prefix),
vice-versa if the commit line is missing. Should you lose or delete these log files,
both old and new sstable files will be kept as live files, which will result in duplicated
sstables. These files are protected by incremental checksums so you should not manually
edit them. When restoring a full backup or moving sstable files, you should clean-up
any left over transactions and their temporary files first. You can use this command:
===> sstableutil -c ks table
See CASSANDRA-7066 for full details.
- New write stages have been added for batchlog and materialized view mutations;
you can set their size in cassandra.yaml
- User defined functions are now executed in a sandbox.
To use UDFs and UDAs, you have to enable them in cassandra.yaml.
- New SSTable version 'la' with improved bloom-filter false-positive handling
compared to previous version 'ka' used in 2.2 and 2.1. Running sstableupgrade
is not necessary but recommended.
- Before upgrading to 3.0, make sure that your cluster is in complete agreement
(schema versions outputted by `nodetool describecluster` are all the same).
- Schema metadata is now stored in the new `system_schema` keyspace, and
legacy `system.schema_*` tables are now gone; see CASSANDRA-6717 for details.
- Pig's support has been removed.
- Hadoop BulkOutputFormat and BulkRecordWriter have been removed; use
CqlBulkOutputFormat and CqlBulkRecordWriter instead.
- Hadoop ColumnFamilyInputFormat and ColumnFamilyOutputFormat have been removed;
use CqlInputFormat and CqlOutputFormat instead.
- Hadoop ColumnFamilyRecordReader and ColumnFamilyRecordWriter have been removed;
use CqlRecordReader and CqlRecordWriter instead.
- hinted_handoff_enabled in cassandra.yaml no longer supports a list of data centers.
To specify a list of excluded data centers when hinted_handoff_enabled is set to true,
use hinted_handoff_disabled_datacenters, see CASSANDRA-9035 for details.
- The `sstable_compression` and `chunk_length_kb` compression options have been deprecated.
The new options are `class` and `chunk_length_in_kb`. Disabling compression should now
be done by setting the new option `enabled` to `false`.
- The compression option `crc_check_chance` became a top-level table option, but is currently
enforced only against tables with enabled compression.
- Only map syntax is now allowed for caching options. ALL/NONE/KEYS_ONLY/ROWS_ONLY syntax
has been deprecated since 2.1.0 and is being removed in 3.0.0.
- The 'index_interval' option for 'CREATE TABLE' statements, which has been deprecated
since 2.1 and replaced with the 'min_index_interval' and 'max_index_interval' options,
has now been removed.
- Batchlog entries are now stored in a new table - system.batches.
The old one has been deprecated.
- JMX methods set/getCompactionStrategyClass have been removed, use
set/getCompactionParameters or set/getCompactionParametersJson instead.
- SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
- The secondary index API has been comprehensively reworked. This will be a breaking
change for any custom index implementations, which should now look to implement
the new org.apache.cassandra.index.Index interface. New syntax has been added to create
and query row-based indexes, which are not explicitly linked to a single column in the
base table.
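The compression option renames listed above can be sketched as a small migration helper. This is an illustrative sketch only, not part of Cassandra: the function name `migrate_compression_options` is hypothetical, and treating an empty `sstable_compression` value as "compression disabled" is an assumption about how pre-3.0 schemas expressed that.

```python
def migrate_compression_options(old):
    """Translate pre-3.0 compression option names to the 3.0 names.

    Hypothetical helper for illustration; it only rewrites a dict of options.
    """
    compressor = old.get("sstable_compression")
    if "sstable_compression" in old and not compressor:
        # Assumption: an empty compressor meant "no compression" before 3.0.
        # In 3.0 this is expressed with the new `enabled` option.
        return {"enabled": "false"}

    new = {}
    if compressor:
        new["class"] = compressor  # sstable_compression -> class
    if "chunk_length_kb" in old:
        new["chunk_length_in_kb"] = old["chunk_length_kb"]

    for key, value in old.items():
        if key in ("sstable_compression", "chunk_length_kb"):
            continue
        if key == "crc_check_chance":
            # crc_check_chance became a top-level table option, so it is
            # deliberately not carried over into the compression map.
            continue
        new[key] = value
    return new
```

For example, `{"sstable_compression": "LZ4Compressor", "chunk_length_kb": "64"}` maps to `{"class": "LZ4Compressor", "chunk_length_in_kb": "64"}`.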
2.2.4
=====
Deprecation
-----------
- Pig support has been deprecated, and will be removed in 3.0.
Please see CASSANDRA-10542 for more details.
- Configuration parameter memory_allocator in cassandra.yaml has been deprecated
and will be removed in 3.0.0. As mentioned below for 2.2.0, jemalloc is
automatically preloaded on Unix platforms.
Operations
----------
- Switching data center or racks is no longer an allowed operation on a node
which has data. Instead, the node will need to be decommissioned and
rebootstrapped. If moving from the SimpleSnitch, make sure that the data
center and rack containing all current nodes is named "datacenter1" and
"rack1". To override this behaviour use -Dcassandra.ignore_rack=true and/or
-Dcassandra.ignore_dc=true.
- Reloading the configuration file of GossipingPropertyFileSnitch has been disabled.
Upgrading
---------
- The default for the inter-DC stream throughput setting
(inter_dc_stream_throughput_outbound_megabits_per_sec in cassandra.yaml) is
the same as the intra-DC one (200Mbps) instead of being unlimited.
Having it unlimited was never intended and was a bug.
New features
------------
- Time windows in DTCS are now limited to 1 day by default to be able to
handle bootstrap and repair in a better way. To get the old behaviour,
increase max_window_size_seconds.
- DTCS option max_sstable_age_days is now deprecated and defaults to 1000 days.
- Native protocol server now allows both SSL and non-SSL connections on
the same port.
2.2.3
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.2 if you are upgrading
from a previous version.
2.2.2
=====
Changed Defaults
----------------
- commitlog_total_space_in_mb will use the smaller of 8192, and 1/4
of the total space of the commitlog volume. (Before: always used
8192)
- The following INFO logs were reduced to DEBUG level and will now show
on debug.log instead of system.log:
- Memtable flushing actions
- Commit log replayed files
- Compacted sstables
- SStable opening (SSTableReader)
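The new commitlog_total_space_in_mb default described above amounts to a simple minimum. The sketch below is illustrative: the function name is hypothetical, and rounding the quarter of the volume down to whole megabytes is an assumption.

```python
def default_commitlog_total_space_mb(commitlog_volume_mb):
    """Smaller of 8192 MB and 1/4 of the commitlog volume (rounded down)."""
    return min(8192, commitlog_volume_mb // 4)
```

On a 20000 MB commitlog volume this gives 5000 MB rather than the previous fixed 8192 MB; on a 100000 MB volume it stays at 8192 MB.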
New features
------------
- Custom QueryHandlers can retrieve the column specifications for the bound
variables from QueryOptions by using the hasColumnSpecifications()
and getColumnSpecifications() methods.
- A new default asynchronous log appender debug.log was created in addition
to the system.log appender in order to provide more detailed log debugging.
In order to disable debug logging, you must comment out the ASYNCDEBUGLOG
appender in conf/logback.xml. See CASSANDRA-10241 for more information.
2.2.1
=====
New features
------------
- COUNT(*) and COUNT(1) can be selected with other columns or functions
2.2
===
Upgrading
---------
- The authentication & authorization subsystems have been redesigned to
support role based access control (RBAC), resulting in a change to the
schema of the system_auth keyspace. See below for more detail.
For systems already using the internal auth implementations, the process
for converting existing data during a rolling upgrade is straightforward.
As each node is restarted, it will attempt to convert any data in the
legacy tables into the new schema. Until enough nodes to satisfy the
replication strategy for the system_auth keyspace are upgraded and so have
the new schema, this conversion will fail with the failure being reported
in the system log.
During the upgrade, Cassandra's internal auth classes will continue to use
the legacy tables, so clients experience no disruption. Issuing DCL
statements during an upgrade is not supported.
Once all nodes are upgraded, an operator with superuser privileges should
drop the legacy tables, system_auth.users, system_auth.credentials and
system_auth.permissions. Doing so will prompt Cassandra to switch over to
the new tables without requiring any further intervention.
While the legacy tables are present a restarted node will re-run the data
conversion and report the outcome so that operators can verify that it is
safe to drop them.
New features
------------
- The LIMIT clause now applies only to the number of rows returned to the user,
not to the number of rows queried. As a consequence, queries using aggregates are
no longer affected by the LIMIT clause.
- Very large batches will now be rejected (defaults to 50kb). This
can be customized by modifying batch_size_fail_threshold_in_kb.
- Selecting columns, scalar functions, UDT fields, writetime or ttl together
with aggregates is now possible. The values returned for the columns,
scalar functions, UDT fields, writetime and ttl will be the ones for
the first row matching the query.
- Windows is now a supported platform. Powershell execution for startup scripts
is highly recommended and can be enabled via an administrator command-prompt
with: 'powershell set-executionpolicy unrestricted'
- It is now possible to do major compactions when using leveled compaction.
Doing that will take all sstables and compact them out in levels. The
levels will be non overlapping so doing this will still not be something
you want to do very often since it might cause more compactions for a while.
It is also possible to split output when doing a major compaction with
STCS - files will be split in sizes 50%, 25%, 12.5% etc of the total size.
This might be a bit better than old major compactions which created one big
file on disk.
- A new tool, bin/sstableverify, has been added that checks for errors/bitrot
in all sstables. Unlike scrub, this is a non-invasive tool.
- Authentication & Authorization APIs have been updated to introduce
roles. Roles and Permissions granted to them are inherited, supporting
role based access control. The role concept supercedes that of users
and CQL constructs such as CREATE USER are deprecated but retained for
compatibility. The requirement to explicitly create Roles in Cassandra
even when auth is handled by an external system has been removed, so
authentication & authorization can be delegated to such systems in their
entirety.
- In addition to the above, Roles are also first class resources and can be the
subject of permissions. Users (roles) can now be granted permissions on other
roles, including CREATE, ALTER, DROP & AUTHORIZE, which removes the need for
superuser privileges in order to perform user/role management operations.
- Creators of database resources (Keyspaces, Tables, Roles) are now automatically
granted all permissions on them (if the IAuthorizer implementation supports
this).
- The SSTable file name format has changed: the file name no longer contains
the Keyspace/CF name. Also, each secondary index has its own directory under
its parent's directory.
- Support for user-defined functions and user-defined aggregates has
been added to CQL.
************************************************************************
IMPORTANT NOTE: user-defined functions can be used to execute
arbitrary and possibly evil code in Cassandra 2.2, and are
therefore disabled by default. To enable UDFs edit
cassandra.yaml and set enable_user_defined_functions to true.
CASSANDRA-9402 will add a security manager for UDFs in Cassandra
3.0. This will inherently be backwards-incompatible with any 2.2
UDF that perform insecure operations such as opening a socket or
writing to the filesystem.
************************************************************************
- Row-cache is now fully off-heap.
- jemalloc is now automatically preloaded and used on Linux and OS-X if
installed.
- Please ensure on Unix platforms that there is no libjnadispatch.so
installed which is accessible by Cassandra. Old versions of
libjna packages (< 4.0.0) will cause problems - e.g. Debian Wheezy
contains libjna version 3.2.x.
- The node now stays up when streaming fails during bootstrapping. You can
use the new `nodetool bootstrap resume` command to continue streaming after
resolving an issue.
- Protocol version 4 specifies that bind variables do not require having a
value when executing a statement. Bind variables without a value are
called 'unset'. The 'unset' bind variable is serialized as the int
value '-2' without following bytes.
In an EXECUTE or BATCH request an unset bind value does not modify the value and
does not create a tombstone, an unset bind ttl is treated as 'unlimited',
an unset bind timestamp is treated as 'now', an unset bind counter operation
does not change the counter value.
Unset tuple field, UDT field and map key are not allowed.
In a QUERY request an unset limit is treated as 'unlimited'.
Unset WHERE clauses with unset partition column, clustering column
or index column are not allowed.
- New `ByteType` (cql tinyint). 1-byte signed integer
- New `ShortType` (cql smallint). 2-byte signed integer
- New `SimpleDateType` (cql date). 4-byte unsigned integer
- New `TimeType` (cql time). 8-byte long
- The toDate(timeuuid), toTimestamp(timeuuid) and toUnixTimestamp(timeuuid) functions have been added
to convert from timeuuid into date type, timestamp type and bigint raw value.
The functions unixTimestampOf(timeuuid) and dateOf(timeuuid) have been deprecated.
- The toDate(timestamp) and toUnixTimestamp(timestamp) functions have been added
to convert from timestamp into date type and bigint raw value.
- The toTimestamp(date) and toUnixTimestamp(date) functions have been added
to convert from date into timestamp type and bigint raw value.
- SizeTieredCompactionStrategy parameter cold_reads_to_omit has been removed.
- The default JVM flag -XX:+PerfDisableSharedMem will cause the following JVM tools
to stop working: jps, jstack, jinfo, jmc, jcmd as well as 3rd party tools like Jolokia.
If you wish to use these tools you can comment this flag out in cassandra-env.{sh,ps1}
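The wire-level details in the list above (the 'unset' marker and the widths of the new fixed-size types) can be illustrated with Python's struct module. This is a sketch for illustration, not driver code: the function and dictionary names are hypothetical, and big-endian (network) byte order is assumed here because the list above does not state it.

```python
import struct

UNSET_LENGTH = -2  # int value that marks a bind variable as 'unset'


def encode_unset():
    """An unset bind variable: the int -2 with no following bytes."""
    return struct.pack(">i", UNSET_LENGTH)


def encode_value(payload):
    """An ordinary bind value: int length followed by that many bytes."""
    return struct.pack(">i", len(payload)) + payload


# struct formats matching the widths listed above (assumed big-endian).
NEW_TYPE_FORMATS = {
    "tinyint": ">b",   # 1-byte signed integer
    "smallint": ">h",  # 2-byte signed integer
    "date": ">I",      # 4-byte unsigned integer
    "time": ">q",      # 8-byte long
}

if __name__ == "__main__":
    print(encode_unset().hex())
    for cql_type, fmt in NEW_TYPE_FORMATS.items():
        print(cql_type, struct.calcsize(fmt), "byte(s)")
```

A driver that leaves a parameter unbound would send the four bytes produced by `encode_unset()` in place of a length-prefixed value.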
Upgrading
---------
- Thrift rpc is no longer being started by default.
Set `start_rpc` parameter to `true` to enable it.
- Pig's CqlStorage has been removed, use CqlNativeStorage instead
- Pig's CassandraStorage has been deprecated. CassandraStorage
should only be used against tables created via thrift.
Use CqlNativeStorage for all other tables.
- IAuthenticator has been updated to remove responsibility for user/role
maintenance and is now solely responsible for validating credentials.
This is primarily done via SASL, though an optional method exists for
systems which need support for the Thrift login() method.
- IRoleManager interface has been added which takes over the maintenance
functions from IAuthenticator. IAuthorizer is mainly unchanged. Auth data
in systems using the stock internal implementations PasswordAuthenticator
& CassandraAuthorizer will be automatically converted during upgrade,
with minimal operator intervention required. Custom implementations will
require modification, though these can be used in conjunction with the
stock CassandraRoleManager so providing an IRoleManager implementation
should not usually be necessary.
- Fat client support has been removed since we have push notifications to clients
- cassandra-cli has been removed. Please use cqlsh instead.
- YamlFileNetworkTopologySnitch has been removed; switch to
GossipingPropertyFileSnitch instead.
- CQL2 has been removed entirely in this release (previously deprecated
in 2.0.0). Please switch to CQL3 if you haven't already done so.
- The results of CQL3 queries containing an IN restriction will be ordered
in the normal order and not anymore in the order in which the column values were
specified in the IN restriction.
- Some secondary index queries with restrictions on non-indexed clustering
columns were not requiring ALLOW FILTERING as they should. This has been
fixed, and those queries now require ALLOW FILTERING (see CASSANDRA-8418
for details).
- The SSTableSimpleWriter and SSTableSimpleUnsortedWriter classes have been
deprecated and will be removed in the next major Cassandra release. You
should use the CQLSSTableWriter class instead.
- The sstable2json and json2sstable tools have been deprecated and will be
removed in the next major Cassandra release. See CASSANDRA-9618
(https://issues.apache.org/jira/browse/CASSANDRA-9618) for details.
- nodetool enablehandoff will no longer support a list of data centers starting
with the next major release. Two new commands will be added, enablehintsfordc and disablehintsfordc,
to exclude data centers from using hinted handoff when the global status is enabled.
In cassandra.yaml, hinted_handoff_enabled will no longer support a list of data centers starting
with the next major release. A new setting will be added, hinted_handoff_disabled_datacenters,
to exclude data centers when the global status is enabled, see CASSANDRA-9035 for details.
2.1.13
======
New features
------------
- New options for cqlsh COPY FROM and COPY TO, see CASSANDRA-9303 for details.
2.1.10
======
New features
------------
- The syntax TRUNCATE TABLE X is now accepted as an alias for TRUNCATE X
2.1.9
=====
Upgrading
---------
- cqlsh will now display timestamps with a UTC timezone. Previously,
timestamps were displayed with the local timezone.
- Commit log files are no longer recycled by default, due to negative
performance implications. This can be enabled again with the
commitlog_segment_recycling option in your cassandra.yaml
- JMX methods set/getCompactionStrategyClass have been deprecated, use
set/getCompactionParameters or set/getCompactionParametersJson instead
2.1.8
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.7
=====
2.1.6
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.5
=====
Upgrading
---------
- The option to omit cold sstables with size tiered compaction has been
removed - it is almost always better to use date tiered compaction for
workloads that have cold data.
2.1.4
=====
Upgrading
---------
- The default JMX config now listens to localhost only. You must enable
the other JMX flags in cassandra-env.sh manually.
2.1.3
=====
Upgrading
---------
- Prepending a list to a list collection was erroneously resulting in
the prepended list being reversed upon insertion. If you were depending
on this buggy behavior, note that it has been corrected.
- Incremental replacement of compacted SSTables has been disabled for this
release.
2.1.2
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
2.1.1
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.1 if you are upgrading
from a previous version.
New features
------------
- Netty support for epoll on Linux is now enabled. If for some
reason you want to disable it, pass the following system property:
-Dcassandra.native.epoll.enabled=false
2.1
===
New features
------------
- Default data and log locations have changed. If not set in
cassandra.yaml, the data file directory, commitlog directory,
and saved caches directory will default to $CASSANDRA_HOME/data/data,
$CASSANDRA_HOME/data/commitlog, and $CASSANDRA_HOME/data/saved_caches,
respectively. The log directory now defaults to $CASSANDRA_HOME/logs.
If not set, $CASSANDRA_HOME defaults to the top-level directory of
the installation.
Note that this should only affect source checkouts and tarballs.
Deb and RPM packages will continue to use /var/lib/cassandra and
/var/log/cassandra in cassandra.yaml.
- SSTable data directory name is slightly changed. Each directory will
have hex string appended after CF name, e.g.
ks/cf-5be396077b811e3a3ab9dc4b9ac088d/
This hex string part represents unique ColumnFamily ID.
Note that existing directories are used as is, so only newly created
directories after upgrade have new directory name format.
- Saved key cache files also have ColumnFamily ID in their file name.
- It is now possible to do incremental repairs: sstables that have been
repaired are marked with a timestamp and not included in the next
repair session. Use nodetool repair -par -inc to use this feature.
A tool to manually mark/unmark sstables as repaired is available in
tools/bin/sstablerepairedset. This is particularly important when
using LCS, or any data not repaired in your first incremental repair
will be put back in L0.
- Bootstrapping now ensures that range movements are consistent,
meaning the data for the new node is taken from the node that is no
longer responsible for that range of keys.
If you want the old behavior (due to a lost node perhaps)
you can set the following property (-Dcassandra.consistent.rangemovement=false)
- It is now possible to use quoted identifiers in triggers' names.
WARNING: if you previously used triggers with capital letters in their
names, then you must quote them from now on.
- Improved stress tool (http://goo.gl/OTNqiQ)
- New incremental repair option (http://goo.gl/MjohJp, http://goo.gl/f8jSme)
- Incremental replacement of compacted SSTables (http://goo.gl/JfDBGW)
- The row cache can now cache only the head of partitions (http://goo.gl/6TJPH6)
- Off-heap memtables (http://goo.gl/YT7znJ)
- CQL improvements and additions: User-defined types, tuple types, 2ndary
indexing of collections, ... (http://goo.gl/kQl7GW)
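The default locations described at the top of this list can be derived from $CASSANDRA_HOME as sketched below. The function name and dictionary keys are hypothetical labels chosen for illustration (they are not cassandra.yaml setting names), and POSIX-style paths are assumed.

```python
import posixpath


def default_directories(cassandra_home):
    """Default 2.1 locations for source checkouts and tarballs.

    Deb and RPM packages are not covered: they keep using
    /var/lib/cassandra and /var/log/cassandra.
    """
    data_root = posixpath.join(cassandra_home, "data")
    return {
        "data": posixpath.join(data_root, "data"),
        "commitlog": posixpath.join(data_root, "commitlog"),
        "saved_caches": posixpath.join(data_root, "saved_caches"),
        "logs": posixpath.join(cassandra_home, "logs"),
    }
```

With `cassandra_home="/opt/cassandra"`, the data directory resolves to `/opt/cassandra/data/data` and the log directory to `/opt/cassandra/logs`.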
Upgrading
---------
- commitlog_sync_batch_window_in_ms behavior has changed from the
maximum time to wait between fsync to the minimum time. We are
working on making this more user-friendly (see CASSANDRA-9533) but in the
meantime, this means 2.1 needs a much smaller batch window to keep
writer threads from starving. The suggested default is now 2ms.
- Rolling upgrades from anything pre-2.0.7 are not supported. Furthermore
pre-2.0 sstables are not supported. This means that before upgrading
a node to 2.1, this node must be started on 2.0 and
'nodetool upgradesstables' must be run (and this even in the case
of not-rolling upgrades).
- For size-tiered compaction users, Cassandra now defaults to ignoring
the coldest 5% of sstables. This can be customized with the
cold_reads_to_omit compaction option; 0.0 omits nothing (the old
behavior) and 1.0 omits everything.
- Multithreaded compaction has been removed.
- The counters implementation has been changed, replaced by a safer one with
fewer caveats, but different performance characteristics. You might have
to change your data model to accommodate the new implementation.
(See https://issues.apache.org/jira/browse/CASSANDRA-6504 and the
blog post at http://goo.gl/qj8iQl for details).
- (per-table) index_interval parameter has been replaced with
min_index_interval and max_index_interval parameters. index_interval
has been deprecated.
- support for supercolumns has been removed from json2sstable
2.0.11
======
Upgrading
---------
- Nothing specific to this release, but refer to previous entries if you
are upgrading from a previous version.
New features
------------
- DateTieredCompactionStrategy added, optimized for time series data and groups
data that is written closely in time (CASSANDRA-6602 for details). Consider
this experimental for now.
2.0.10
======
New features
------------
- CqlPagingRecordReader and CqlPagingInputFormat have both been removed.
Use CqlInputFormat instead.
- If you are using Leveled Compaction, you can now disable doing size-tiered
compaction in L0 by starting Cassandra with -Dcassandra.disable_stcs_in_l0
(see CASSANDRA-6621 for details).
- Shuffle and taketoken have been removed. For clusters that choose to
upgrade to vnodes, creating a new datacenter with vnodes and migrating is
recommended. See http://goo.gl/Sna2S1 for further information.
2.0.9
=====
Upgrading
---------
- Default values for read_repair_chance and local_read_repair_chance have been
swapped. Namely, default read_repair_chance is now set to 0.0, and default
local_read_repair_chance to 0.1.
- Queries selecting only CQL static columns were (mistakenly) not returning one
result per row in the partition. This has been fixed and a SELECT DISTINCT
can be used when only the static column of a partition needs to be fetched
without fetching the whole partition. But if you use static columns, please
make sure this won't affect you (see CASSANDRA-7305 for details).
2.0.8
=====
New features
------------
- New snitches have been added for users of Google Compute Engine and of
Cloudstack.
Upgrading
---------
- Nothing specific to this release, but please see 2.0.7 if you are upgrading
from a previous version.
2.0.7
=====
Upgrading
---------
- Nothing specific to this release, but please see 2.0.6 if you are upgrading
from a previous version.
2.0.6
=====
New features
------------
- CQL now supports static columns, allows batching multiple conditional updates,
and has a new syntax for slicing over multiple clustering columns
(http://goo.gl/B6qz4j).
- Repair can be restricted to a set of nodes using the -hosts option in nodetool.
- A new 'nodetool taketoken' command relocates tokens with vnodes.
- Hinted handoff can be enabled only for some data-centers (see
hinted_handoff_enabled in cassandra.yaml)
Upgrading
---------
- Nothing specific to this release, but please see 2.0.5 if you are upgrading
from a previous version.
2.0.5
=====
New features
------------
- Batchlog replay can now be throttled, and is throttled by default.
See batchlog_replay_throttle_in_kb setting in cassandra.yaml.
- Scrub can now optionally skip corrupt counter partitions. Please note
that this will lead to the loss of all the counter updates in the skipped
partition. See the --skip-corrupted option.
Upgrading
---------
- If your cluster began on a version before 1.2, check that your secondary
index SSTables are on version 'ic' before upgrading. If not, run
'nodetool upgradesstables' if on 1.2.14 or later, or run 'nodetool
upgradesstables ks cf' with the keyspace and secondary index named
explicitly otherwise. If you don't do this and upgrade to 2.0.x and it
refuses to start because of 'hf' version files in the secondary index,
you will need to delete/move them out of the way and recreate the index
when 2.0.x starts.
2.0.3
=====
New features
------------
- It's now possible to configure the maximum allowed size of the native
protocol frames (native_transport_max_frame_size_in_mb in the yaml file).
Upgrading
---------
- NaN and Infinity are new valid floating point constants in CQL3 and are now reserved
keywords. In the unlikely case you were using one of them as an identifier (for a
column, a keyspace or a table), you will now have to double-quote them (see
http://cassandra.apache.org/doc/cql3/CQL.html#identifiers for "quoted identifiers").
- The IEndpointStateChangeSubscriber has a new method, beforeChange, that
any custom implementations using the class will need to implement.
2.0.2
=====
New features
------------
- Speculative retry defaults to 99th percentile
(See blog post at http://www.datastax.com/dev/blog/rapid-read-protection-in-cassandra-2-0-2)
- Configurable metrics reporting
(see conf/metrics-reporter-config-sample.yaml)
- Compaction history and stats are now saved to system keyspace
(system.compaction_history table). You can access the history via the
new 'nodetool compactionhistory' command or CQL.
Upgrading
---------
- Nodetool defaults to Sequential mode for repair operations
2.0.1
=====
Upgrading
---------
- The default memtable allocation has changed from 1/3 of heap to 1/4
of heap. Also, default (single-partition) read and write timeouts
have been reduced from 10s to 5s and 2s, respectively.
2.0.0
=====
Upgrading
---------
- Java 7 is now *required*!
- Upgrading is ONLY supported from Cassandra 1.2.9 or later. This
goes for sstable compatibility as well as network. When
upgrading from an earlier release, upgrade to 1.2.9 first and
run upgradesstables before proceeding to 2.0.
- CAS and new features in CQL such as DROP COLUMN assume that cell
timestamps are microseconds-since-epoch. Do not use these
features if you are using client-specified timestamps with some
other source.
- Replication and strategy options do not accept unknown options anymore.
This was already the case for CQL3 in 1.2 but this is now the case for
thrift too.
- auto_bootstrap of a single-token node with no initial_token will
now pick a random token instead of bisecting an existing token
range. We recommend upgrading to vnodes; failing that, we
recommend specifying initial_token.
- reduce_cache_sizes_at, reduce_cache_capacity_to, and
flush_largest_memtables_at options have been removed from cassandra.yaml.
- CacheServiceMBean.reduceCacheSizes() has been removed.
Use CacheServiceMBean.set{Key,Row}CacheCapacityInMB() instead.
- authority option in cassandra.yaml has been deprecated since 1.2.0,
but it has been completely removed in 2.0. Please use 'authorizer' option.
- ASSUME command has been removed from cqlsh. Use CQL3 blobAsType() and
typeAsBlob() conversion functions instead.
See https://cassandra.apache.org/doc/cql3/CQL.html#blobFun for details.
- Inputting blobs as string constants is now fully deprecated in
favor of blob constants. Make sure to update your applications to use
the new syntax while you are still on 1.2 (which supports both string
and blob constants for blob input) before upgrading to 2.0.
- index_interval is now a ColumnFamily property. You can change the value
with an ALTER TABLE ... WITH statement and SSTables written after that will
have the new value. When upgrading, Cassandra will pick up the value defined in
cassandra.yaml as the default for existing ColumnFamilies, until you explicitly
set the value for those.
- The deprecated native_transport_min_threads option has been removed from
cassandra.yaml.
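The requirement above that cell timestamps be microseconds-since-epoch matters mostly for clients that supply their own timestamps. A minimal sketch of producing such values follows; the function names are hypothetical and the code is independent of any particular driver.

```python
import time


def now_micros():
    """Current time as microseconds since the Unix epoch."""
    return time.time_ns() // 1_000


def millis_to_micros(timestamp_ms):
    """Convert a millisecond timestamp to the microsecond scale."""
    return timestamp_ms * 1_000
```

A client that has been writing millisecond timestamps would need a conversion such as `millis_to_micros` before relying on CAS or DROP COLUMN.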
Operations
----------
- VNodes are enabled by default in cassandra.yaml. initial_token
for non-vnode deployments has been removed from the example
yaml, but is still respected if specified.
- Major compactions, cleanup, scrub, and upgradesstables will interrupt
any in-progress compactions (but not repair validations) when invoked.
- Disabling autocompactions by setting min/max compaction threshold to 0
has been deprecated, instead, use the nodetool commands 'disableautocompaction'
and 'enableautocompaction' or set the compaction strategy option enabled = false
- ALTER TABLE DROP has been reenabled for CQL3 tables and has new semantics now.
See https://cassandra.apache.org/doc/cql3/CQL.html#alterTableStmt and
https://issues.apache.org/jira/browse/CASSANDRA-3919 for details.
- CAS uses gc_grace_seconds to determine how long to keep unused paxos
state around for, or a minimum of three hours.
- A new hints created metric is tracked per target, replacing countPendingHints
- After performance testing for CASSANDRA-5727, the default LCS filesize
has been changed from 5MB to 160MB.
- cqlsh DESCRIBE SCHEMA no longer outputs the schema of system_* keyspaces;
use DESCRIBE FULL SCHEMA if you need the schema of system_* keyspaces.
- CQL2 has been deprecated, and will be removed entirely in 2.2. See
CASSANDRA-5918 for details.
- Commit log archiver now assumes the client time stamp to be in microsecond
precision, during restore. Please refer to commitlog_archiving.properties.
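The paxos state retention rule in the list above can be read as "gc_grace_seconds, but never less than three hours". The sketch below encodes that reading; the function name is hypothetical and the interpretation as a simple maximum is an assumption based on the wording of the note.

```python
THREE_HOURS_SECONDS = 3 * 60 * 60


def paxos_state_retention_seconds(gc_grace_seconds):
    """How long unused paxos state is kept, under the reading above."""
    return max(gc_grace_seconds, THREE_HOURS_SECONDS)
```

Under this reading, a table with gc_grace_seconds set to 60 would still keep unused paxos state for 10800 seconds.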
Features
--------
- Lightweight transactions
(http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0)
- Alias support has been added to CQL3 SELECT statement. Refer to
CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html) for details.
- JEMalloc support (see memory_allocator in cassandra.yaml)
- Experimental triggers support. See examples/ for how to use. "Experimental"
means "tied closely to internal data structures; we plan to decouple this in
the future, which will probably break triggers written against this initial
API."
- Numerous improvements to CQL3 and a new version of the native protocol. See
http://www.datastax.com/dev/blog/cql-in-cassandra-2-0 for details.
1.2.11
======
Features
--------
- Added a new consistency level, LOCAL_ONE, that forces all CL.ONE operations to
execute only in the local datacenter.
- New replace_address to supplant the (now removed) replace_token and
replace_node workflows to replace a dead node in place. Works like the
old options, but takes the IP address of the node to be replaced.
1.2.9
=====
Features
--------
- A history of executed nodetool commands is now captured.
It can be found in ~/.cassandra/nodetool.history. Other tools output files
(cli and cqlsh history, .cqlshrc) are now centralized in ~/.cassandra, as well.
- A new sstablesplit utility allows splitting large sstables offline.
1.2.8
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.2.7 if you are upgrading
from a previous version.
1.2.7
=====
Upgrading
---------
- If you have decommissioned a node in the past 72 hours, it is imperative
that you not upgrade until such time has passed, or do a full cluster
restart (not rolling) before beginning the upgrade. This only applies to
decommission, not removetoken.
1.2.6
=====
Upgrading
---------
- hinted_handoff_throttle_in_kb is now reduced by a factor
proportional to the number of nodes in the cluster (see
https://issues.apache.org/jira/browse/CASSANDRA-5272).
- CQL3 syntax for CREATE CUSTOM INDEX has been updated. See CQL3
documentation for details.
1.2.5
=====
Features
--------
- Custom secondary index support has been added to CQL3. Refer to
CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for details and examples.
Upgrading
---------
- The native CQL transport is enabled by default on port 9042.
1.2.4
=====
Upgrading
---------
- 'nodetool upgradesstables' now only upgrades/rewrites sstables that are
not on the current version (which is usually what you want). Use the new
-a flag to recover the old behavior of rewriting all sstables.
Features
--------
- superuser setup delay (10 seconds) can now be overridden using
'cassandra.superuser_setup_delay_ms' property.
1.2.3
=====
Upgrading
---------
- CQL3 used to be case-insensitive for property map keys in ALTER and CREATE
statements. In other words:
CREATE KEYSPACE test WITH replication = { 'CLASS' : 'SimpleStrategy',
'REPLICATION_FACTOR' : '1' }
was allowed. However, this was not consistent with the fact that string
literals are case sensitive everywhere else and, more importantly, this
breaks NetworkTopologyStrategy, for which DC names are case sensitive. Those
property map keys are now case sensitive. So the statement above should be
changed to:
CREATE KEYSPACE test WITH replication = { 'class' : 'SimpleStrategy',
'replication_factor' : '1' }
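The key change above can be sketched as a helper that lowercases only the well-known option names and leaves every other key (such as a data center name) untouched, since those are case sensitive. The function name and the set of known keys are assumptions made for this illustration.

```python
# Option names that must now be written in lowercase (illustrative set).
KNOWN_OPTION_KEYS = {"class", "replication_factor"}


def normalize_replication_keys(options):
    """Lowercase known option keys; keep DC names exactly as written."""
    fixed = {}
    for key, value in options.items():
        lowered = key.lower()
        fixed[lowered if lowered in KNOWN_OPTION_KEYS else key] = value
    return fixed
```

Applied to `{'CLASS': 'SimpleStrategy', 'REPLICATION_FACTOR': '1'}`, the helper yields the corrected map shown in the second statement above, while a key such as `'DC1'` is preserved as is.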
1.2.2
=====
Upgrading
---------
- CQL3 type validation for constants has been fixed, which may require
fixing queries that were relying on the previous loose validation. Please
refer to the CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
and in particular the changelog section for more details. Please note in
particular that inputting blobs as string constants is now deprecated (in
favor of blob constants) and its support will be removed in a future
version.
Features
--------
- Built-in CQL3-based implementations of IAuthenticator (PasswordAuthenticator)
and IAuthorizer (CassandraAuthorizer) have been added. PasswordAuthenticator
stores usernames and hashed passwords in system_auth.credentials table;
CassandraAuthorizer stores permissions in system_auth.permissions table.
- system_auth keyspace is now alterable via ALTER KEYSPACE queries.
The default is SimpleStrategy with replication_factor of 1, but it's
advised to raise RF to at least 3 or 5, since CL.QUORUM is used for all
auth-related queries. It's also possible to change the strategy to NTS.
- Permissions caching with time-based expiration policy has been added to reduce
performance impact of authorization. Permission validity can be configured
using 'permissions_validity_in_ms' setting in cassandra.yaml. The default
is 2000 (2 seconds).
- SimpleAuthenticator and SimpleAuthorizer examples have been removed. Please
look at CassandraAuthorizer/PasswordAuthenticator instead.
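The permissions caching described above is a time-based expiration policy. The following Python sketch shows the general idea only and is not Cassandra's implementation; the class name, the `loader` callback, and the injectable `clock` are all assumptions made for illustration. The 2000 ms default mirrors the 'permissions_validity_in_ms' default quoted above.

```python
import time


class ExpiringPermissionCache:
    """Toy cache: an entry stays valid for validity_ms after it is loaded."""

    def __init__(self, loader, validity_ms=2000, clock=time.monotonic):
        self._loader = loader                  # hypothetical: fetches permissions
        self._validity_s = validity_ms / 1000.0
        self._clock = clock                    # injectable so tests can fake time
        self._entries = {}                     # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]                    # still valid: no reload
        value = self._loader(key)              # expired or missing: reload
        self._entries[key] = (now + self._validity_s, value)
        return value
```

A longer validity period means fewer lookups but a longer window in which a revoked permission may still be honored.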
1.2.1
=====
Upgrading
---------
- In CQL3, date strings are no longer accepted as timeuuid values since a
date string is not a correct representation of a timeuuid. Instead, new
methods (minTimeuuid, maxTimeuuid, now, dateOf, unixTimestampOf) have been
introduced to make working with timeuuids from date strings easy. cqlsh also
no longer displays timeuuids as date strings (since this is a lossy
representation), but the new dateOf method can be used instead. Please
refer to the reference documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for more detail.
- For client implementors: CQL3 client using the thrift interface should
use the new execute_cql3_query, prepare_cql3_query and execute_prepared_cql3_query
since 1.2.0. However, Cassandra 1.2.0 did not complain if CQL3 was set
through set_cql_version but the methods that are now CQL2-only were used.
Cassandra now complains in that case.
- Queries that use unrecognized or bad compaction or replication strategy
options are now refused (instead of simply logging a warning).
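The timeuuid note above calls a date string a lossy representation. A rough Python sketch of why, using only the standard `uuid` module: two distinct version-1 UUIDs can carry the same millisecond timestamp. The helper names `make_v1`, `unix_timestamp_of` and `date_of` are illustrative and are not the CQL functions themselves.

```python
import uuid
from datetime import datetime, timezone

# 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch).
GREGORIAN_OFFSET = 0x01B21DD213814000


def make_v1(ts_100ns, clock_seq, node):
    """Build a version-1 UUID from raw fields (illustrative helper)."""
    return uuid.UUID(fields=(
        ts_100ns & 0xFFFFFFFF,                  # time_low
        (ts_100ns >> 32) & 0xFFFF,              # time_mid
        ((ts_100ns >> 48) & 0x0FFF) | 0x1000,   # time_hi plus version 1
        ((clock_seq >> 8) & 0x3F) | 0x80,       # clock_seq_hi plus RFC 4122 variant
        clock_seq & 0xFF,                       # clock_seq_low
        node,                                   # 48-bit node id
    ))


def unix_timestamp_of(u):
    """Milliseconds since the Unix epoch stored in a version-1 UUID."""
    if u.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    return (u.time - GREGORIAN_OFFSET) // 10_000


def date_of(u):
    """UTC datetime for the timestamp embedded in a version-1 UUID."""
    return datetime.fromtimestamp(unix_timestamp_of(u) / 1000, tz=timezone.utc)


ms = 1_357_000_000_000
base = ms * 10_000 + GREGORIAN_OFFSET
a = make_v1(base, clock_seq=1, node=1)
b = make_v1(base + 5, clock_seq=2, node=2)
print(a != b, date_of(a) == date_of(b))  # distinct ids, same date
```

Because many timeuuids map to one date, converting a date back to a single timeuuid is ambiguous, which is why range helpers such as minTimeuuid and maxTimeuuid are needed.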
1.2
===
Upgrading
---------
- IAuthenticator interface has been updated to support dynamic
user creation, modification and removal. Users, even when stored
externally, now have to be explicitly created using
CREATE USER query first. AllowAllAuthenticator and SimpleAuthenticator
have been updated for the new interface, but you'll have to update
your old IAuthenticator implementations for 1.2. To ease this process,
a new abstract LegacyAuthenticator class has been added - subclass it
in your old IAuthenticator implementation and everything should just work
(this only affects users who implemented custom authenticators).
- IAuthority interface has been deprecated in favor of IAuthorizer.
AllowAllAuthority and SimpleAuthority have been renamed to
AllowAllAuthorizer and SimpleAuthorizer, respectively. In order to
simplify the upgrade to the new interface, a new abstract
LegacyAuthorizer has been added - you should subclass it in your
old IAuthority implementation and everything should just work
(this only affects users who implemented custom authorities).
'authority' setting in cassandra.yaml has been renamed to 'authorizer',
'authority' is no longer recognized. This affects all upgrading users.
- 1.2 is NOT network-compatible with versions older than 1.0. That
means if you want to do a rolling, zero-downtime upgrade, you'll need
to upgrade first to 1.0.x or 1.1.x, and then to 1.2. 1.2 retains
the ability to read data files from Cassandra versions at least
back to 0.6, so a non-rolling upgrade remains possible with just
one step.
- The default partitioner for new clusters is Murmur3Partitioner,
which is about 10% faster for index-intensive workloads. Partitioners
cannot be changed once data is in the cluster, however, so if you are
switching to the 1.2 cassandra.yaml, you should change this to
RandomPartitioner or whatever your old partitioner was.
- If you are using counters and upgrading from a version prior to
1.1.6, you should drain existing Cassandra nodes prior to the
upgrade to prevent overcount during commitlog replay (see
CASSANDRA-4782). For non-counter uses, drain is not required
but is a good practice to minimize restart time.
- Tables using LeveledCompactionStrategy will default to not
creating a row-level bloom filter. The default in older versions
of Cassandra differs; you should manually set the false positive
rate to 1.0 (to disable) or 0.01 (to enable, if you make many
requests for rows that do not exist).
- The hints schema was changed from 1.1 to 1.2. Cassandra automatically
snapshots and then truncates the hints column family as part of
starting up 1.2 for the first time. Additionally, upgraded nodes
will not store new hints destined for older (pre-1.2) nodes. It is
therefore recommended that you perform a cluster upgrade when all
nodes are up. Because hints will be lost, a cluster-wide repair (with
-pr) is recommended after upgrade of all nodes.
- The `nodetool removetoken` command (and corresponding JMX operation)
has been renamed to `nodetool removenode`. This function is
incompatible with the earlier `nodetool removetoken`, and attempts to
remove nodes in this way with a mixed 1.1 (or lower) / 1.2 cluster
are not supported.
- The somewhat ill-conceived CollatingOrderPreservingPartitioner
has been removed. Use Murmur3Partitioner (recommended) or
ByteOrderedPartitioner instead.
- Global option hinted_handoff_throttle_delay_in_ms has been removed.
hinted_handoff_throttle_in_kb has been added instead.
- The default bloom filter fp chance has been increased to 1%.
This will save about 30% of the memory used by the old default.
Existing columnfamilies will retain their old setting.
- The default partitioner (for new clusters; the partitioner cannot be
changed in existing clusters) was changed from RandomPartitioner to
Murmur3Partitioner which provides faster hashing as well as improved
performance with secondary indexes.
- The default version of CQL (and cqlsh) is now CQL3. CQL2 is still
available but you will have to use the thrift set_cql_version method
(that is already supported in 1.1) to use CQL2. For cqlsh, you will need
to use 'cqlsh -2'.
- CQL3 is now considered final in this release. Compared to the beta
version that is part of 1.1, this final version has a few additions
(collections), but also some (incompatible) changes in the syntax for the
options of the create/alter keyspace/table statements. Typically, the
syntax to create a keyspace is now:
CREATE KEYSPACE ks WITH replication = { 'class' : 'SimpleStrategy',
'replication_factor' : 2 };
Also, the consistency level cannot be set in the language anymore, but is
at the protocol level.
Please refer to the CQL3 documentation (http://cassandra.apache.org/doc/cql3/CQL.html)
for details.
- In CQL3, the DROP behavior from ALTER TABLE has currently been removed
(because it was not correctly implemented). We hope to add it back soon
(Cassandra 1.2.1 or 1.2.2)
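Several of the upgrade notes above concern the bloom filter false-positive chance. As a rough guide to how that setting trades memory for accuracy, the Python sketch below evaluates the textbook formula for an optimally sized Bloom filter. This is an assumption for illustration: Cassandra's actual sizing logic may differ, so the numbers indicate the trend rather than exact memory use.

```python
import math


def bloom_bits_per_element(fp_chance):
    """Bits per element for an optimally sized Bloom filter (textbook formula)."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


for p in (0.1, 0.01, 0.001):
    print(f"fp_chance={p}: {bloom_bits_per_element(p):.2f} bits per element")
```

Each tenfold reduction in the false-positive chance costs roughly the same additional number of bits per element, so relaxing the rate is an inexpensive way to reclaim memory.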
Features
--------
- Cassandra can now handle concurrent CREATE TABLE schema changes
as well as other updates
- rpc_timeout has been split up to allow finer-grained control
on timeouts for different operation types
- num_tokens can now be specified in cassandra.yaml. This defines the
number of tokens assigned to the host on the ring (default: 1).
Also specifying initial_token will override any num_tokens setting.
- disk_failure_policy allows blacklisting failed disks in JBOD
configuration instead of erroring out indefinitely
- event tracing can be configured per-connection ("trace_next_query")
or globally/probabilistically ("nodetool settraceprobability")
- Atomic batches are now supported server side, where Cassandra will
guarantee that (at the price of pre-writing the batch to another node
first), all mutations in the batch will be applied, even if the
coordinator fails mid-batch.
- new IAuthorizer interface has replaced the old IAuthority. IAuthorizer
allows dynamic permission management via new CQL3 statements:
GRANT, REVOKE, LIST PERMISSIONS. A native implementation storing
the permissions in Cassandra is being worked on and we expect to
include it in 1.2.1 or 1.2.2.
- IAuthenticator interface has been updated to support dynamic user
creation, modification and removal via new CQL3 statements:
CREATE USER, ALTER USER, DROP USER, LIST USERS. A native implementation
that stores users in Cassandra itself is being worked on and is expected to
become part of 1.2.1 or 1.2.2.
1.1.5
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
1.1.4
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
1.1.3
=====
Upgrading
---------
- Running "nodetool upgradesstables" after upgrading is recommended
if you use Counter columnfamilies.
Features
--------
- the cqlsh COPY command can now export to CSV flat files
- added a new tools/bin/token-generator to facilitate generating evenly distributed tokens
1.1.2
=====
Upgrading
---------
- If you have column families using the LeveledCompactionStrategy, you should run scrub on those column families.
Features
--------
- cqlsh has a new COPY command to load data from CSV flat files
1.1.1
=====
Upgrading
---------
- Nothing specific to this release, but please see 1.1 if you are upgrading
from a previous version.
Features
--------
- Continuous commitlog archiving and point-in-time recovery.
See conf/commitlog_archiving.properties
- Incremental repair by token range, exposed over JMX
1.1
===
Upgrading
---------
- Compression is enabled by default on newly created ColumnFamilies
(and unchanged for ColumnFamilies created prior to upgrading).
- If you are running a multi datacenter setup, you should upgrade to
the latest 1.0.x (or 0.8.x) release before upgrading. Versions
0.8.8 and 1.0.3-1.0.5 generate cross-dc forwarding that is incompatible
with 1.1.
- EACH_QUORUM ConsistencyLevel is only supported for writes and will now
throw an InvalidRequestException when used for reads. (Previous
versions would silently perform a LOCAL_QUORUM read instead.)
- ANY ConsistencyLevel is only supported for writes and will now
throw an InvalidRequestException when used for reads. (Previous
versions would silently perform a ONE read for range queries;
single-row and multiget reads already rejected ANY.)
- The largest mutation batch accepted by the commitlog is now 128MB.
(In practice, batches larger than ~10MB always caused poor
performance due to load volatility and GC promotion failures.)
Larger batches will continue to be accepted but will not be
durable. Consider setting durable_writes=false if you really
want to use such large batches.
- Make sure that global settings: key_cache_{size_in_mb, save_period}
and row_cache_{size_in_mb, save_period} in conf/cassandra.yaml are
used instead of per-ColumnFamily options.
- JMX methods no longer return custom Cassandra objects. Any such methods
will now return standard Maps, Lists, etc.
- Hadoop input and output details are now separated. If you were
previously using methods such as getRpcPort you now need to use
getInputRpcPort or getOutputRpcPort depending on the circumstance.
- CQL changes:
+ Prior to 1.1, you could use KEY as the primary key name in some
select statements, even if the PK was actually given a different
name. In 1.1+ you must use the defined PK name.
- The sliced_buffer_size_in_kb option has been removed from the
cassandra.yaml config file (this option was a no-op since 1.0).
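Client code that previously relied on the silent fallbacks for EACH_QUORUM and ANY reads may want to fail fast before sending a request. A minimal Python sketch, assuming consistency levels are passed around as strings; the function and exception names are hypothetical and do not belong to any driver.

```python
# Levels that, per the notes above, are only supported for writes.
WRITE_ONLY_LEVELS = frozenset({"ANY", "EACH_QUORUM"})


class InvalidRequestError(Exception):
    """Stand-in for the server-side InvalidRequestException."""


def check_read_consistency(level):
    """Return the normalized level, or raise if it is not valid for reads."""
    normalized = level.upper()
    if normalized in WRITE_ONLY_LEVELS:
        raise InvalidRequestError(f"{normalized} is only supported for writes")
    return normalized


print(check_read_consistency("local_quorum"))
```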
Features
--------
- Concurrent schema updates are now supported, with any conflicts
automatically resolved. Please note that simultaneously running
'CREATE COLUMN FAMILY' operations on different nodes will not
be safe until version 1.2 due to the nature of ColumnFamily
identifier generation; for more details see CASSANDRA-3794.
- The CQL language has undergone a major revision, CQL3, the
highlights of which are covered at [1]. CQL3 is not
backwards-compatible with CQL2, so we've introduced a
set_cql_version Thrift method to specify which version you want.
(The default remains CQL2 at least until Cassandra 1.2.) cqlsh
adds a --cql3 flag to enable this.
[1] http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
- Row-level isolation: multi-column updates to a single row have
always been *atomic* (either all will be applied, or none)
thanks to the CommitLog, but until 1.1 they were not *isolated*
-- a reader may see mixed old and new values while the update
happens.
- Finer-grained control over data directories, allowing a ColumnFamily to
be pinned to a specific volume, e.g. one backed by SSD.
- The bulk loader is no longer a fat client; it can be run from an
existing machine in a cluster.
- A new write survey mode has been added, similar to bootstrap (enabled via
-Dcassandra.write_survey=true), but the node will not automatically join
the cluster. This is useful for cases such as testing different
compaction strategies with live traffic without affecting the cluster.
- Key and row caches are now global, similar to the global memtable
threshold. Manual tuning of cache sizes per-columnfamily is no longer
required.
- Off-heap caches no longer require JNA, and will work out of the box
on Windows as well as Unix platforms.
- Streaming is now multithreaded.
- Compactions may now be aborted via JMX or nodetool.
- The stress tool is not new in 1.1, but it is newly included in
binary builds as well as the source tree
- Hadoop: a new BulkOutputFormat is included which will directly write
SSTables locally and then stream them into the cluster.
YOU SHOULD USE BulkOutputFormat BY DEFAULT. ColumnFamilyOutputFormat
is still around in case for some strange reason you want results
trickling out over Thrift, but BulkOutputFormat is significantly
more efficient.
- Hadoop: KeyRange.filter is now supported with ColumnFamilyInputFormat,
allowing index expressions to be evaluated server-side to reduce
the amount of data sent to Hadoop.
- Hadoop: ColumnFamilyRecordReader has a wide-row mode, enabled via
a boolean parameter to setInputColumnFamily, that pages through
data column-at-a-time instead of row-at-a-time.
- Pig: can use the wide-row Hadoop support, by setting PIG_WIDEROW_INPUT
to true. This will produce each row's columns in a bag.
1.0.8
=====
Upgrading
---------
- Nothing specific to 1.0.8
Other
-----
- Allow configuring socket timeout for streaming
1.0.7
=====
Upgrading
---------
- Nothing specific to 1.0.7; please refer to the instructions for 1.0.6
Other
-----
- Adds new setstreamthroughput to nodetool to configure streaming
throttling
- Adds JMX property to get/set rpc_timeout_in_ms at runtime
- Allow configuring (per-CF) bloom_filter_fp_chance
1.0.6
=====
Upgrading
---------
- This release fixes an issue related to the chunk_length_kb option for
compressed sstables. If you use compression on some column families, it
is recommended after the upgrade to check the value for this option on
these column families (the default value is 64). If the option is
not set correctly, you should update the column family definition,
setting the right value and then run scrub on the column family.
- Please refer to the instructions for 1.0.5 if coming from an older version.
1.0.5
=====
Upgrading
---------
- 1.0.5 fixes two important regressions in 1.0.4. All information
concerning 1.0.4 is valid for this release, but please avoid upgrading
to 1.0.4.
1.0.4
=====
Upgrading
---------
- Nothing specific to 1.0.4 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- A new upgradesstables command has been added to nodetool. It is very
similar to scrub but without the ability to discard corrupted rows (and
as a consequence it does not snapshot automatically beforehand). This new
command is to be preferred to scrub in all cases where sstables should be
rewritten to the current format for upgrade purposes.
JMX
---
- The paths for the data, commit log and saved cache directories are now
exposed through JMX
- The in-memory bloom filter sizes are now exposed through JMX
1.0.3
=====
Upgrading
---------
- Nothing specific to 1.0.3 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- For non compressed sstables (compressed sstables already include more
fine grained checksums), a sha1 for the full sstable is now automatically
created (in a file with suffix -Digest.sha1). It can be used to check the
sstable integrity with sha1sum.
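Where sha1sum is not available, the same check can be scripted. The Python sketch below recomputes the SHA-1 of a data file and compares it with the first token of the digest file. Treating the first whitespace-separated token as the hex digest is an assumption about the file layout, and both function names are illustrative.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large sstables do not fill memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest_file(data_path, digest_path):
    """Compare the recomputed SHA-1 with the one recorded in the digest file."""
    with open(digest_path, "r", encoding="ascii") as handle:
        tokens = handle.read().split()
    if not tokens:
        return False  # empty digest file: nothing to compare against
    # Assumption: the first token is the hex digest.
    return sha1_of_file(data_path) == tokens[0].lower()
```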
1.0.2
=====
Upgrading
---------
- Nothing specific to 1.0.2 but please see the 1.0 upgrading section if
upgrading from a version prior to 1.0.0
Features
--------
- Cassandra CLI queries now have timing information
1.0.1
=====
Upgrading
---------
- If upgrading from a version prior to 1.0.0, please see the 1.0 Upgrading
section
- For running on Windows as a Service, procrun is no longer distributed
with Cassandra, see README.txt for more information on how to download
it if necessary.
- The names given to snapshot directories have been improved for human
readability. If you had scripts relying on it, you may need to update
them.
1.0
===
Upgrading
---------
- Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling
restart, one node at a time. (0.8.0 or 0.8.1 are NOT network-compatible
with 1.0: upgrade to the most recent 0.8 release first.)
You do not need to bring down the whole cluster at once.
- After upgrading, run nodetool scrub against each node before running
repair, moving nodes, or adding new ones.
- CQL inserts/updates now generate microsecond resolution timestamps
by default, instead of millisecond. THIS MEANS A ROLLING UPGRADE COULD
MIX milliseconds and microseconds, with clients talking to servers
generating milliseconds unable to overwrite the larger microsecond
timestamps. If you are using CQL and this is important for your
application, you can either perform a non-rolling upgrade to 1.0, or
update your application first to use explicit timestamps with the "USING
timestamp=X" syntax.
- The BinaryMemtable bulk-load interface has been removed (use the
sstableloader tool instead).
- The compaction_thread_priority setting has been removed from
cassandra.yaml (use compaction_throughput_mb_per_sec to throttle
compaction instead).
- CQL types bytea and date were renamed to blob and timestamp, respectively,
to conform with SQL norms. CQL type int is now a 4-byte int, not 8
(which is still available as bigint).
- Cassandra 1.0 uses arena allocation to reduce old generation
fragmentation. This means there is a minimum overhead of 1MB
per ColumnFamily plus 1MB per index.
- The SimpleAuthenticator and SimpleAuthority classes have been moved to
the example directory (and are thus not available from the binary
distribution). They never provided actual security and in their current
state are only meant as examples.
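The timestamp warning above is easier to see with numbers. The Python sketch below models last-write-wins resolution in the simplest possible way (larger timestamp wins, ties keep the existing cell, which is a simplification) and shows a later millisecond-resolution write losing to an earlier microsecond-resolution one. The `resolve` function and the sample values are illustrative only.

```python
def resolve(existing, incoming):
    """Last-write-wins: keep the (timestamp, value) pair with the larger timestamp."""
    return incoming if incoming[0] > existing[0] else existing


wall_clock_seconds = 1_320_000_000  # arbitrary instant used by both writers

# Upgraded node: microsecond resolution.
micros_write = (wall_clock_seconds * 1_000_000, "from upgraded node")
# Old node: millisecond resolution, written a full minute later.
millis_write = ((wall_clock_seconds + 60) * 1_000, "from old node")

print(resolve(micros_write, millis_write)[1])  # prints "from upgraded node"
```

Supplying explicit timestamps in a single unit from the application, as suggested above, removes the mismatch.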
Features
--------
- SSTable compression is supported through the 'compression_options'
parameter when creating/updating a column family. For instance, you can
create a column family Cf using compression (through the Snappy library)
in the CLI with:
create column family Cf with compression_options={sstable_compression: SnappyCompressor}
SSTable compression is not activated by default but can be activated or
deactivated at any time.
- Compressed SSTable blocks are checksummed to protect against bitrot
- New LevelDB-inspired compaction algorithm can be enabled by setting the
Columnfamily compaction_strategy=LeveledCompactionStrategy option.
Leveled compaction means you only need to keep a few MB of space free for
compaction instead of (in the worst case) 50%.
- Ability to use multiple threads during a single compaction. See
multithreaded_compaction in cassandra.yaml for more details.
- Windows Service ("cassandra.bat install" to enable)
- A dead node may be replaced in a single step by starting a new node
with -Dcassandra.replace_token=<token>. More details can be found at
http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
- It is now possible to repair only the first range returned by the
partitioner for a node with `nodetool repair -pr`. It makes it
easier/possible to repair a full cluster without any work duplication by
running this command on every node of the cluster.
New data types
--------------
- decimal
Other
-----
- Hinted Handoff has two major improvements:
- Hint replay is much more efficient thanks to a change in the data model
- Hints are created for all replicas that do not ack a write. (Formerly,
only replicas known to be down when the write started were hinted.)
This means that running with read repair completely off is much more
viable than before, and the default read_repair_chance is reduced from 1.0
("always repair") to 0.1 ("repair 10% of the time").
- The old per-ColumnFamily memtable thresholds
(memtable_throughput_in_mb, memtable_operations_in_millions,
memtable_flush_after_mins) are ignored, in favor of the global
memtable_total_space_in_mb and commitlog_total_space_in_mb settings.
This does not affect client compatibility -- the old options are
still allowed, but have no effect. These options may be removed
entirely in a future release.
- Backlogged compactions will begin five minutes after startup. The 0.8
behavior of never starting compaction until a flush happens is usually
not what is desired, but a short grace period is useful to allow caches
to warm up first.
- The deletion of compacted data files is not performed during Garbage
Collection anymore. This means compacted files will now be deleted
without delay.
0.8.5
=====
Features
--------
- SSTables copied to a data directory can be loaded by a live node through
nodetool refresh (may be handy to load snapshots).
- The configured compaction throughput is exposed through JMX.
Other
-----
- The sstableloader is now bundled with the debian package.
- Repair detects when a participating node is dead and fails instead of
hanging forever.
0.8.4
=====
Upgrading
---------
- Nothing specific to 0.8.4
Other
-----
- This release fixes a bug in counters that could lead to
(significant) over-counting.
- It also fixes a slight upgrade regression from 0.8.3. It is thus advised
to jump directly to 0.8.4 if upgrading from before 0.8.3.
0.8.3
=====
Upgrading
---------
- Token removal has been revamped. Removing tokens in a mixed cluster with
0.8.3 will not work, so the entire cluster will need to be running 0.8.3
first, except for the dead node.
Features
--------
- It is now possible to use thrift asynchronous and
half-synchronous/half-asynchronous servers (see cassandra.yaml for more
details).
- It is now possible to access counter columns through Hadoop.
Other
-----
- This release fixes a regression in 0.8 that can cause commit log segments to
be deleted even though not all the data they contain has been flushed.
Upgrading from 0.8.* is very much encouraged.
0.8.2
=====
Upgrading
---------
- 0.8.0 and 0.8.1 shipped with a bug that was setting the
replicate_on_write option for counter column families to false (this
option has no effect on non-counter column families). This is an unsafe
default and 0.8.2 corrects this; the default for replicate_on_write is
now true. It is advised to update your counter column family definitions
if replicate_on_write was incorrectly set to false (before or after
upgrade).
0.8.1
=====
Upgrading
---------
- 0.8.1 is backwards compatible with 0.8, upgrade can be achieved by a
simple rolling restart.
- If upgrading from an earlier version (0.7), please refer to the 0.8 section
for instructions.
Features
--------
- Numerous additions/improvements to CQL (support for counters, TTL, batch
inserts/deletes, index dropping, ...).
- Add two new AbstractTypes (comparator) to support compound keys
(CompositeType and DynamicCompositeType), as well as a ReverseType to
reverse the order of any existing comparator.
- New option to bypass the commit log on some keyspaces (for advanced
users).
Tools
-----
- Add new data bulk loading utility (sstableloader).
0.8
===
Upgrading
---------
- Upgrading from version 0.7.1 or later can be done with a rolling
restart, one node at a time. You do not need to bring down the
whole cluster at once.
- After upgrading, run nodetool scrub against each node before running
repair, moving nodes, or adding new ones.
- Running nodetool drain before shutting down the 0.7 node is
recommended but not required. (Skipping this will result in
replay of entire commitlog, so it will take longer to restart but
is otherwise harmless.)
- 0.8 is fully API-compatible with 0.7. You can continue
to use your 0.7 clients.
- Avro record classes used in map/reduce and Hadoop streaming code have
been removed. Map/reduce can be switched to Thrift by changing
org.apache.cassandra.avro in import statements to
org.apache.cassandra.thrift (no class names change). Streaming support
has been removed for the time being.
- The loadbalance command has been removed from nodetool. For similar
behavior, decommission then rebootstrap with empty initial_token.
- Thrift unframed mode has been removed.
- The addition of key_validation_class means the cli will assume keys
are bytes, instead of strings, in the absence of other information.
See http://wiki.apache.org/cassandra/FAQ#cli_keys for more details.
Features
--------
- added CQL client API and JDBC/DBAPI2-compliant drivers for Java and
Python, respectively (see: drivers/ subdirectory and doc/cql)
- added distributed Counters feature;
see http://wiki.apache.org/cassandra/Counters
- optional intranode encryption; see comments around 'encryption_options'
in cassandra.yaml
- compaction multithreading and rate-limiting; see
'concurrent_compactors' and 'compaction_throughput_mb_per_sec' in
cassandra.yaml
- cassandra will limit total memtable memory usage to 1/3 of the heap
by default. This can be adjusted or disabled with the
memtable_total_space_in_mb option. The old per-ColumnFamily
throughput, operations, and age settings are still respected but
will be removed in a future major release once we are satisfied that
memtable_total_space_in_mb works adequately.
Tools
-----
- stress and py_stress moved from contrib/ to tools/
- clustertool was removed (see
https://issues.apache.org/jira/browse/CASSANDRA-2607 for examples
of how to script nodetool across the cluster instead)
Other
-----
- In the past, sstable2json would write column names and values as
hex strings, and now creates human readable values based on the
comparator/validator. As a result, JSON dumps created with
older versions of sstable2json are no longer compatible with
json2sstable, and imports must be made with a configuration that
is identical to the export.
- manually-forced compactions ("nodetool compact") will do nothing
if only a single SSTable remains for a ColumnFamily. To force it
to compact that anyway (which will free up space if there are
a lot of expired tombstones), use the new forceUserDefinedCompaction
JMX method on CompactionManager.
- most of contrib/ (which was not part of the binary releases)
has been moved either to examples/ or tools/. We plan to move the
rest for 0.8.1.
JMX
---
- By default, JMX now listens on port 7199.
0.7.6
=====
Upgrading
---------
- Nothing specific to 0.7.6, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
0.7.5
=====
Upgrading
---------
- Nothing specific to 0.7.5, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
Changes
-------
- system_update_column_family no longer snapshots before applying
the schema change. (_update_keyspace never did. _drop_keyspace
and _drop_column_family continue to snapshot.)
- added memtable_flush_queue_size option to cassandra.yaml to
avoid blocking writes when multiple column families (or a column
family with indexes) are flushed at the same time.
- allow overriding initial_token, storage_port and rpc_port using
system properties
0.7.4
=====
Upgrading
---------
- Nothing specific to 0.7.4, but see 0.7.3 Upgrading if upgrading
from earlier than 0.7.1.
Features
--------
- Output to Pig is now supported as well as input
0.7.3
=====
Upgrading
---------
- 0.7.1 and 0.7.2 shipped with a bug that caused incorrect row-level
bloom filters to be generated when compacting sstables generated
with earlier versions. This would manifest in IOExceptions during
column name-based queries. 0.7.3 provides "nodetool scrub" to
rebuild sstables with correct bloom filters, with no data lost.
(If your cluster was never on 0.7.0 or earlier, you don't have to
worry about this.) Note that nodetool scrub will snapshot your
data files before rebuilding, just in case.
0.7.1
=====
Upgrading
---------
- 0.7.1 is completely backwards compatible with 0.7.0. Just restart
each node with the new version, one at a time. (The cluster does
not all need to be upgraded simultaneously.)
Features
--------
- added flush_largest_memtables_at and reduce_cache_sizes_at options
to cassandra.yaml as an escape valve for memory pressure
- added option to specify -Dcassandra.join_ring=false on startup
to allow "warm spare" nodes or performing JMX maintenance before
joining the ring
Performance
-----------
- Disk writes and sequential scans avoid polluting page cache
(requires JNA to be enabled)
- Cassandra performs writes efficiently across datacenters by
sending a single copy of the mutation and having the recipient
forward that to other replicas in its datacenter.
- Improved network buffering
- Reduced lock contention on memtable flush
- Optimized supercolumn deserialization
- Zero-copy reads from mmapped sstable files
- Explicitly set higher JVM new generation size
- Reduced i/o contention during saving of caches
0.7.0
=====
Features
--------
- Secondary indexes (indexes on column values) are now supported
- Row size limit increased from 2GB to 2 billion columns. Rows
are no longer read into memory during compaction.
- Keyspace and ColumnFamily definitions may be added and modified live
- Streaming data for repair or node movement no longer requires
anticompaction step first
- NetworkTopologyStrategy (formerly DatacenterShardStrategy) is ready for
use, enabling ConsistencyLevel.DCQUORUM and DCQUORUMSYNC. See comments
in `cassandra.yaml`.
- Optional per-Column time-to-live field allows expiring data without
having to issue explicit remove commands
- `truncate` thrift method allows clearing an entire ColumnFamily at once
- Hadoop OutputFormat and Streaming [non-jvm map/reduce via stdin/out]
support
- Up to 8x faster reads from row cache
- A new ByteOrderedPartitioner supports bytes keys with arbitrary content,
and orders keys by their byte value. This should be used in new
deployments instead of OrderPreservingPartitioner.
- Optional round-robin scheduling between keyspaces for multitenant
clusters
- Dynamic endpoint snitch mitigates the impact of impaired nodes
- New `IntegerType`, faster than LongType and allows integers of
both less and more bits than Long's 64
- A revamped authentication system that decouples authorization and
allows finer-grained control of resources.
Upgrading
---------
The Thrift API has changed in incompatible ways; see below, and refer
to http://wiki.apache.org/cassandra/ClientOptions for a list of
higher-level clients that have been updated to support the 0.7 API.
The Cassandra inter-node protocol is incompatible with 0.6.x
releases (and with 0.7 beta1), meaning you will have to bring your
cluster down prior to upgrading: you cannot mix 0.6 and 0.7 nodes.
The hints schema was changed from 0.6 to 0.7. Cassandra automatically
snapshots and then truncates the hints column family as part of
starting up 0.7 for the first time.
Keyspace and ColumnFamily definitions are stored in the system
keyspace, rather than the configuration file.
The process to upgrade is:
1) run "nodetool drain" on _each_ 0.6 node. When drain finishes (log
message "Node is drained" appears), stop the process.
2) Convert your storage-conf.xml to the new cassandra.yaml using
"bin/config-converter".
3) Rename any of your keyspace or column family names that do not adhere
to the '^\w+' regex convention.
4) Start up your cluster with the 0.7 version.
5) Initialize your Keyspace and ColumnFamily definitions using
"bin/schematool <host> <jmxport> import". _You only need to do
this to one node_.
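For step 3, a small script can help find the names that need renaming.
The Python sketch below is only an approximation: it assumes the intent
of the '^\w+' convention is that the whole name consists of word
characters (letters, digits, underscore), and it uses re.ASCII on the
assumption that only ASCII word characters are accepted. The helper
names and sample names are hypothetical.

```python
import re

# Assumption: the entire name must consist of ASCII word characters.
NAME_PATTERN = re.compile(r"\w+", re.ASCII)


def is_legal_name(name):
    """Return True if the whole name is made of word characters."""
    return NAME_PATTERN.fullmatch(name) is not None


def names_to_rename(names):
    """Return the names that do not follow the convention, in input order."""
    return [name for name in names if not is_legal_name(name)]


# Hypothetical keyspace and column family names.
candidates = ["Keyspace1", "user-data", "Standard_1", "my.cf", ""]
print(names_to_rename(candidates))
```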
Thrift API
----------
- The Cassandra server now defaults to framed mode, rather than
unframed. Unframed is obsolete and will be removed in the next
major release.
- The Cassandra Thrift interface file has been updated for Thrift 0.5.
If you are compiling your own client code from the interface, you
will need to upgrade the Thrift compiler to match.
- Row keys are now bytes: keys stored by versions prior to 0.7.0 will be
returned as UTF-8 encoded bytes. OrderPreservingPartitioner and
CollatingOrderPreservingPartitioner continue to expect that keys contain
UTF-8 encoded strings, but RandomPartitioner now works on any key data.
- keyspace parameters have been replaced with the per-connection
set_keyspace method.
- The return type for login() is now AccessLevel.
- The get_string_property() method has been removed.
- The get_string_list_property() method has been removed.
Configuration
-------------
- Configuration file renamed to cassandra.yaml and log4j.properties to
log4j-server.properties
- PropertyFileSnitch configuration file renamed to
cassandra-topology.properties
- The ThriftAddress and ThriftPort directives have been renamed to
RPCAddress and RPCPort respectively.
- EndPointSnitch was renamed to RackInferringSnitch. A new SimpleSnitch
has been added.
- RackUnawareStrategy and RackAwareStrategy have been renamed to
SimpleStrategy and OldNetworkTopologyStrategy, respectively.
- RowWarningThresholdInMB replaced with in_memory_compaction_limit_in_mb
- GCGraceSeconds is now per-ColumnFamily instead of global
- Keyspace and column family names that do not conform to the '^\w+'
regex are considered illegal.
- Keyspace and column family definitions will need to be loaded via
"bin/schematool <host> <jmxport> import". _You only need to do this to
one node_.
- In addition to an authenticator, an authority must also be configured.
Users of SimpleAuthenticator should use SimpleAuthority for this
value (the default is AllowAllAuthority, which corresponds to
AllowAllAuthenticator).
- The format of access.properties has changed; see the sample configuration
conf/access.properties for documentation on the new format.
JMX
---
- StreamingService moved from o.a.c.streaming to o.a.c.service
- GMFD renamed to GOSSIP_STAGE
- {Min,Mean,Max}RowCompactedSize renamed to {Min,Mean,Max}RowSize
because it no longer has to wait until compaction to be computed
Other
-----
- If extending AbstractType, make sure you follow the singleton pattern
followed by Cassandra core AbstractType classes: provide a public
static final variable called 'instance'.
0.6.6
=====
Upgrading
---------
- As part of the cache-saving feature, a third directory
(along with data and commitlog) has been added to the config
file. You will need to set and create this directory
when restarting your node into 0.6.6.
0.6.1
=====
Upgrading
---------
- We try to keep minor versions 100% compatible (data format,
commitlog format, network format) within the major series, but
we introduced a network-level incompatibility in 0.6.1.
Thus, if you are upgrading from 0.6.0 to any higher version
(0.6.1, 0.6.2, etc.) then you will need to restart your entire
cluster with the new version, instead of being able to do a
rolling restart.
0.6.0
=====
Features
--------
- row caching: configure with the RowsCached attribute in
ColumnFamily definition
- Hadoop map/reduce support: see contrib/word_count for an example
- experimental authentication support, described under
Authenticator in storage.conf
Configuration
-------------
- MemtableSizeInMB has been replaced by MemtableThroughputInMB which
triggers a memtable flush when the specified amount of data has
been written, including overwrites.
- MemtableObjectCountInMillions has been replaced by the
MemtableOperationsInMillions directive which causes a memtable flush
to occur after the specified number of operations.
- Like MemtableSizeInMB, BinaryMemtableSizeInMB has been replaced by
BinaryMemtableThroughputInMB.
- Replication factor is now per-keyspace, rather than global.
- KeysCachedFraction is deprecated in favor of KeysCached
- RowWarningThresholdInMB added, to warn before very large rows
get big enough to threaten node stability
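The two memtable thresholds above can be pictured with a toy model. The
Python sketch below is not Cassandra's implementation: the class, its
byte accounting (value length only), and the ">=" comparisons are all
assumptions made for illustration. It shows that repeatedly overwriting
one key still counts toward the throughput threshold.

```python
class MemtableModel:
    """Toy model of the two flush thresholds; not Cassandra's code."""

    def __init__(self, throughput_mb, operations_millions):
        self.max_bytes = int(throughput_mb * 1024 * 1024)
        self.max_ops = int(operations_millions * 1_000_000)
        self.bytes_written = 0
        self.operations = 0
        self.flushes = 0

    def write(self, key, value):
        # Every write counts, even if it overwrites an existing key.
        self.bytes_written += len(value)
        self.operations += 1
        if self.bytes_written >= self.max_bytes or self.operations >= self.max_ops:
            self.flush()

    def flush(self):
        self.flushes += 1
        self.bytes_written = 0
        self.operations = 0


# Four 256 KiB overwrites of one key add up to 1 MiB of written data.
memtable = MemtableModel(throughput_mb=1, operations_millions=1)
value = b"x" * (256 * 1024)
flushes_after_each_write = []
for _ in range(4):
    memtable.write("same-key", value)
    flushes_after_each_write.append(memtable.flushes)
print(flushes_after_each_write)
```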
Thrift API
----------
- removed deprecated get_key_range method
- added batch_mutate method
- deprecated multiget and batch_insert methods in favor of
multiget_slice and batch_mutate, respectively
- added ConsistencyLevel.ANY, for when you want write
availability even if the data may not be readable immediately.
Unlike CL.ZERO, though, it will throw an exception if
the write cannot be stored *somewhere*.
JMX metrics
-----------
- read and write statistics are reported as lifetime totals,
instead of averages over the last minute. Averages since the last
request are also available for convenience.
- cache hit rate statistics are now available from JMX under
org.apache.cassandra.db.Caches
- compaction JMX metrics are moved to
org.apache.cassandra.db.CompactionManager. PendingTasks is now
a much better estimate of compactions remaining, and the
progress of the current compaction has been added.
- commitlog JMX metrics are moved to org.apache.cassandra.db.Commitlog
- progress of data streaming during bootstrap, loadbalance, or other
data migration, is available under
org.apache.cassandra.streaming.StreamingService.
See http://wiki.apache.org/cassandra/Streaming for details.
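Because read and write statistics are now lifetime totals, a monitoring
script that wants a recent average has to difference two samples itself.
The Python sketch below shows the arithmetic only. The sample layout
(operation count, total latency in microseconds) and the numbers are
hypothetical and do not correspond to named JMX attributes.

```python
def average_between(previous, current):
    """Average latency per operation between two lifetime-total samples.

    Each sample is a (operation_count, total_latency_micros) tuple; this
    layout is an assumption made for the example.
    """
    operations = current[0] - previous[0]
    if operations == 0:
        return 0.0
    return (current[1] - previous[1]) / operations


# Two hypothetical samples taken some time apart.
earlier = (1000, 50000)
later = (1600, 86000)

recent_average = average_between(earlier, later)
lifetime_average = later[1] / later[0]
print(recent_average, lifetime_average)
```

Here the recent average (60.0) differs from the lifetime average (53.75),
which is why the difference between samples matters for alerting.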
Installation/Upgrade
--------------------
- 0.6 network traffic is not compatible with earlier versions. You
will need to shut down all your nodes at once, upgrade, then restart.
0.5.0
=====
0. The commitlog format has changed (but sstable format has not).
When upgrading from 0.4, empty the commitlog either by running
bin/nodeprobe flush on each machine and waiting for the flush to finish,
or by simply removing the commitlog directory if you only have test data.
(If more writes come in after the flush command, starting 0.5 will error
out; if that happens, just go back to 0.4 and flush again.)
The format changed twice: from 0.4 to beta1, and from beta2 to RC1.
.5 The gossip protocol has changed, meaning 0.5 nodes cannot coexist
in a cluster of 0.4 nodes or vice versa; you must upgrade your
whole cluster at the same time.
1. Bootstrap, move, load balancing, and active repair have been added.
See http://wiki.apache.org/cassandra/Operations. When upgrading
from 0.4, leave autobootstrap set to false for the first restart
of your old nodes.
2. Performance improvements across the board, especially on the write
path (over 100% improvement in stress.py throughput).
3. Configuration:
- Added "comment" field to ColumnFamily definition.
- Added MemtableFlushAfterMinutes, a global replacement for the
old per-CF FlushPeriodInMinutes setting
- Key cache settings
4. Thrift:
- Added get_range_slice, deprecating get_key_range
0.4.2
=====
1. Improve default garbage collector options significantly --
throughput will be 30% higher or more.
0.4.1
=====
1. SnapshotBeforeCompaction configuration option allows snapshotting
before each compaction, which allows rolling back to any version
of the data.
0.4.0
=====
1. On-disk data format has changed to allow billions of keys/rows per
node instead of only millions. The new format is incompatible with 0.3;
see 0.3 notes below for how to import data from a 0.3 install.
2. Cassandra now supports multiple keyspaces. Typically you will have
one keyspace per application, allowing applications to be able to
create and modify ColumnFamilies at will without worrying about
collisions with others in the same cluster.
3. Many Thrift API changes and documentation. See
http://wiki.apache.org/cassandra/API
4. Removed the web interface in favor of JMX and bin/nodeprobe, which
has significantly enhanced functionality.
5. Renamed configuration "<Table>" to "<Keyspace>".
6. Added commitlog fsync; see "<CommitLogSync>" in configuration.
0.3.0
=====
1. With sufficiently many and sufficiently large keys in a ColumnFamily,
Cassandra will run out of memory trying to perform compactions (data
file merges). The size of what is stored in memory is
(S + 16) * (N + M) where S is the size of the key (usually 2 bytes per
character), N is the number of keys, and M is the map overhead (which
can be estimated at around 32 bytes per key).
So, if you have 10-character keys and 1GB of headroom in your heap
space for compaction, you can expect to store about 17M keys
before running into problems.
See https://issues.apache.org/jira/browse/CASSANDRA-208
2. Because fixing #1 requires a data file format change, 0.4 will not
be binary-compatible with 0.3 data files. A client-side upgrade
can be done relatively easily with the following algorithm:
for key in old_client.get_key_range(everything):
columns = old_client.get_slice or get_slice_super(key, all columns)
new_client.batch_insert or batch_insert_super(key, columns)
The inner loop can be trivially parallelized for speed.
3. Commitlog does not fsync before reporting a write successful.
Using blocking writes mitigates this to some degree, since all
nodes that were part of the write quorum would have to fail
before sync for data to be lost.
See https://issues.apache.org/jira/browse/CASSANDRA-182
Additionally, row size (that is, all the data associated with a single
key in a given ColumnFamily) is limited by available memory, because
compaction deserializes each row before merging.
See https://issues.apache.org/jira/browse/CASSANDRA-16
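The client-side upgrade algorithm in item 2 above, including the remark
that its inner loop can be parallelized, might be sketched in Python as
follows. InMemoryClient is a hypothetical stand-in written for this
example; it only borrows the method names from the pseudocode and is not
a real Cassandra or Thrift client. With real clients, a connection may
not be safe to share across threads, so each worker would likely need
its own.

```python
from concurrent.futures import ThreadPoolExecutor


class InMemoryClient:
    """Hypothetical stand-in for an old or new client; not a real API."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})

    def get_key_range(self):
        return sorted(self.rows)

    def get_slice(self, key):
        return dict(self.rows[key])

    def batch_insert(self, key, columns):
        self.rows.setdefault(key, {}).update(columns)


def migrate(old_client, new_client, workers=4):
    """Copy every row from old_client to new_client, one key per task."""

    def copy_row(key):
        new_client.batch_insert(key, old_client.get_slice(key))
        return key

    # Each key is independent, so the per-key copies can run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(copy_row, old_client.get_key_range()))


# Made-up sample data.
old = InMemoryClient({"alice": {"age": "30"}, "bob": {"age": "41", "city": "Oslo"}})
new = InMemoryClient()
migrated = migrate(old, new)
print(migrated)
```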