Getting started with Analytics and Hadoop in DataStax Enterprise
The Hadoop component in DataStax Enterprise enables analytics jobs to run across
DataStax Enterprise's distributed, shared-nothing architecture. Instead of using the Hadoop
Distributed File System (HDFS), DataStax Enterprise uses the Cassandra File System (CFS),
backed by Cassandra keyspaces, as the underlying storage layer.
Using the job tracker node
For each MapReduce job submitted to the job tracker, DataStax Enterprise schedules a
series of tasks on the analytics nodes.
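For example, after starting a node in analytics mode, a standard Hadoop example job can be submitted through the `dse` wrapper, which routes it to the DSE job tracker. The jar path and input/output directories below are illustrative and depend on your installation:

```shell
# Start a node in Hadoop analytics mode (tarball installs; packaged
# installs instead set HADOOP_ENABLED=1 in /etc/default/dse):
dse cassandra -t

# Submit the stock Hadoop wordcount example to the job tracker.
# Adjust the jar path and directories to match your installation.
dse hadoop jar /usr/share/dse/hadoop/lib/hadoop-examples.jar wordcount \
    /user/hadoop/input /user/hadoop/output
```

The `dse hadoop` wrapper ensures the job uses the DSE-bundled Hadoop configuration, so tasks are distributed to the cluster's analytics nodes rather than a standalone Hadoop cluster.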
About the Cassandra File System
A Hive or Pig analytics job requires a Hadoop file system to function. For use with
DSE Hadoop, DataStax Enterprise provides a replacement for the Hadoop Distributed File
System (HDFS) called the Cassandra File System (CFS).
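Because CFS implements the Hadoop FileSystem API, the familiar `hadoop fs` shell commands work against it through the `dse` wrapper. A brief sketch, with hypothetical file and directory names:

```shell
# Create a CFS directory and copy in a local file; paths are examples.
dse hadoop fs -mkdir /user/hadoop/wordcount
dse hadoop fs -copyFromLocal local.txt /user/hadoop/wordcount/

# CFS paths can also be addressed explicitly with a cfs:// URI:
dse hadoop fs -ls cfs://localhost/user/hadoop/wordcount
```

Hive and Pig jobs that expect a Hadoop file system read and write these same CFS paths transparently.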