DSE Analytics with Hadoop

DSE Hadoop topics.

Getting started with Analytics and Hadoop in DataStax Enterprise
The Hadoop component in DataStax Enterprise enables analytics to run across DataStax Enterprise's distributed, shared-nothing architecture. Instead of using the Hadoop Distributed File System (HDFS), DataStax Enterprise uses Cassandra File System (CFS) keyspaces as the underlying storage layer.

Using the job tracker node
For each MapReduce job submitted to the job tracker, DataStax Enterprise schedules a series of tasks on the analytics nodes.

About the Cassandra File System
Hive and Pig analytics jobs require a Hadoop file system to function. For use with DSE Hadoop, DataStax Enterprise provides the Cassandra File System (CFS) as a replacement for the Hadoop Distributed File System (HDFS).

Using the cfs-archive to store huge files
The Cassandra File System (CFS) consists of two layers: cfs and cfs-archive. The cfs-archive layer is recommended for long-term storage of huge files.

Using Hive
DataStax Enterprise includes a Cassandra-enabled Hive MapReduce client.

DataStax ODBC driver for Hive on Windows
The DataStax ODBC Driver for Hive gives Windows users access to the information stored in DSE Hadoop.

Using Mahout
DataStax Enterprise integrates Apache Mahout, a Hadoop component that provides machine learning libraries.

Using Pig
DataStax Enterprise includes a Cassandra File System (CFS) enabled Apache Pig client that provides a high-level programming environment for MapReduce coding.

Sqoop
Topics on migrating data using Sqoop.