This document is no longer maintained.
DataStax Enterprise 4.5 (End of Service Life)
Analyzing data using DSE Hadoop
Topics for running analytics on Cassandra data with the Hadoop component integrated into DataStax Enterprise.
DSE Hadoop introduction
The Hadoop component in DataStax Enterprise enables analytics to be run across DataStax Enterprise's distributed, shared-nothing architecture. Instead of using the Hadoop Distributed File System (HDFS), DataStax Enterprise uses Cassandra File System (CFS) keyspaces for the underlying storage layer.
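Because CFS replaces HDFS as the storage layer, the familiar Hadoop shell commands work unchanged on an analytics node, with `cfs://` URIs in place of `hdfs://`. A minimal sketch, assuming a running DSE 4.5 analytics node and a local file `sales.csv` (the paths and file name are illustrative):

```shell
# Create a directory in the cfs keyspace and load a file into it.
dse hadoop fs -mkdir /user/jdoe
dse hadoop fs -put sales.csv /user/jdoe/

# The same path addressed with an explicit cfs:// URI.
dse hadoop fs -ls cfs:///user/jdoe
```

Any MapReduce, Hive, or Pig job submitted to the cluster can then read and write these paths as if they were ordinary Hadoop file system locations.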
Using the job tracker node
DataStax Enterprise schedules a series of tasks on the analytics nodes for each MapReduce job that is submitted to the job tracker.
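As a hedged sketch, the `dsetool` utility shipped with DSE 4.5 can report which analytics node currently holds the job tracker role, and move it if needed (assuming `dsetool` is on the PATH of a cluster node; the IP address is a placeholder):

```shell
# Print the address of the active job tracker for this data center.
dsetool jobtracker

# Move the job tracker role to another analytics node.
dsetool movejt 10.10.1.12
```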
About the Cassandra File System
A Hive or Pig analytics job requires a Hadoop file system to function. For use with DSE Hadoop, DataStax Enterprise provides a replacement for the Hadoop Distributed File System (HDFS) called the Cassandra File System (CFS).
Using the cfs-archive to store huge files
The Cassandra File System (CFS) consists of two layers: cfs and cfs-archive. Using cfs-archive is recommended for long-term storage of huge files.
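The archive layer is addressed through its own URI scheme, so moving a large file into long-term storage is an ordinary Hadoop shell operation. A sketch, with a hypothetical file and path:

```shell
# Store a large file in the cfs-archive layer rather than the regular cfs layer.
dse hadoop fs -put big-log.tar cfs-archive:///archive/

# List the archived contents.
dse hadoop fs -ls cfs-archive:///archive/
```

Data written under `cfs-archive:///` is kept out of the regular `cfs` layer, which is tuned for files that MapReduce jobs read and rewrite frequently.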
Using Hive
DataStax Enterprise includes a Cassandra-enabled Hive MapReduce client.
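The Hive client maps Cassandra keyspaces to Hive databases automatically, so existing tables can be queried with HiveQL straight away. A sketch, assuming a keyspace `retail` containing a table `sales` with `item` and `qty` columns (all names are illustrative):

```shell
# Start an interactive Hive session on an analytics node.
dse hive

# Or run a single HiveQL statement non-interactively.
dse hive -e "USE retail; SELECT item, SUM(qty) FROM sales GROUP BY item;"
```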
DataStax ODBC driver for Hive on Windows
The DataStax ODBC Driver for Hive provides Windows users access to the information that is stored in DSE Hadoop.
Using Mahout
DataStax Enterprise integrates Apache Mahout, a Hadoop component that offers machine learning libraries.
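As a rough sketch, the `dse mahout` wrapper passes standard Mahout driver commands through to the integrated libraries, reading input from CFS paths (the job name and paths below are illustrative, not a prescribed workflow):

```shell
# Convert a directory of text documents in CFS into Mahout sequence files.
dse mahout seqdirectory --input cfs:///docs --output cfs:///docs-seq
```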
Using Pig
DataStax Enterprise (DSE) includes a Cassandra File System (CFS) enabled Apache Pig Client to provide a high-level programming environment for MapReduce coding.
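In DSE 4.5, the Pig client reaches Cassandra tables through the `cql://` scheme and the `CqlStorage` load function. A minimal grunt-session sketch, assuming a keyspace `ks` with a table `t` already populated:

```shell
# Start the Pig grunt shell on an analytics node.
dse pig
```

```
grunt> rows = LOAD 'cql://ks/t' USING CqlStorage();
grunt> DUMP rows;
```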