Using DSE Search is memory-intensive. Use a discovery process to develop a capacity plan that ensures sufficient memory resources.

Solr rereads the entire row when updating indexes, which can impose a significant performance hit on spinning disks. Use solid-state drives (SSDs) for applications that have very aggressive insert and update requirements.
This capacity planning discovery process helps you ensure sufficient memory resources to meet your operational requirements. For general advice on capacity planning, see Planning and testing a cluster deployment.
First, estimate how large your search index will grow by indexing a number of documents on a single node, executing typical user queries, and then examining the field cache memory usage for heap allocation. Repeat this process using a greater number of documents until you get a solid estimate of the size of the index for the maximum number of documents that a single node can handle. You can then determine how many servers to deploy for a cluster and the optimal heap size. Store the index on SSDs or in the system IO cache.
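The extrapolation described above is simple arithmetic. The following sketch illustrates it with hypothetical measurements and budgets; none of the numbers are DataStax recommendations, and you should substitute the values you observe on your own test node:

```python
# Hypothetical measurements from indexing runs on a single test node:
# (document count, observed index size in GB, observed field cache heap in GB)
samples = [(1_000_000, 2.0, 0.5), (2_000_000, 4.1, 1.0), (4_000_000, 8.3, 2.1)]

# Estimate per-document cost from the largest run (sizes grow roughly linearly).
docs, index_gb, heap_gb = samples[-1]
index_per_doc = index_gb / docs
heap_per_doc = heap_gb / docs

# Assumed per-node budgets: system IO cache available for the index,
# and the heap you plan to allocate.
io_cache_budget_gb = 32.0
heap_budget_gb = 14.0

# The node is limited by whichever resource runs out first.
max_docs_by_index = io_cache_budget_gb / index_per_doc
max_docs_by_heap = heap_budget_gb / heap_per_doc
max_docs_per_node = int(min(max_docs_by_index, max_docs_by_heap))
print(f"Estimated max documents per node: {max_docs_per_node:,}")
```

Repeating the measurement at several document counts, as the process describes, confirms that the per-document costs really are roughly linear before you rely on the extrapolation.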
- The optimal heap size per node.
- An estimate of the number of nodes required for your application.
- Whether to increase the replication factor to support more queries per second.
- The amount of RAM, as determined during capacity planning.
- An SSD, or a spinning disk dedicated to DSE Search. A dedicated SSD is recommended, but is not required.
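The node-count estimate in the list above follows from the per-node document limit, the total corpus size, and the replication factor. Here is a minimal sketch of that calculation; all input values are assumed placeholders, and it ignores headroom for growth and compaction:

```python
import math

# Assumed planning inputs (replace with your own figures):
total_docs = 200_000_000        # maximum documents the system must support
max_docs_per_node = 15_000_000  # from single-node capacity testing
replication_factor = 3          # a higher RF supports more queries per second

# Each replica is a full extra copy of the data, so the cluster must index
# total_docs * replication_factor document copies in aggregate.
total_nodes = math.ceil(total_docs * replication_factor / max_docs_per_node)
print(f"Estimated nodes required: {total_nodes}")
```

Raising the replication factor increases query throughput, but, as the arithmetic shows, it raises the node count proportionally.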
- N documents indexed on a single test node
- A complete set of sample queries to be executed
- The maximum number of documents the system will support
- Create the schema.xml and solrconfig.xml files.
- Start a node.
- Add N docs.
- Run a range of queries that simulate a production environment.
- View the status of the field cache memory to discover the memory usage.
- View the size of the index (on disk) included in the status information about the Solr core.
- Based on the server's available system IO cache, set a maximum index size per server.
- Based on the memory usage, set a maximum heap size required per server.
- For JVM memory to provide the required performance and memory capacity, DataStax recommends a heap size of 14 GB or larger.
- For faster live indexing, configure live indexing (RT) postings to be allocated off-heap.
  Note: Enable live indexing on only one Solr core per cluster.
Calculate the maximum number of documents per node based on the memory usage and index size observed in the previous steps.
When the system approaches the maximum number of documents per node, add more nodes.
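The final step amounts to a threshold check against the per-node maximum you calculated. The sketch below assumes an 80% headroom policy, which is an illustrative choice rather than a documented recommendation:

```python
import math

def nodes_to_add(current_docs_per_node, max_docs_per_node, current_nodes,
                 headroom=0.8):
    """Return how many nodes to add so the average per-node document count
    stays below a headroom fraction of the tested maximum (assumed policy)."""
    target = headroom * max_docs_per_node
    if current_docs_per_node <= target:
        return 0
    total_docs = current_docs_per_node * current_nodes
    required_nodes = math.ceil(total_docs / target)
    return required_nodes - current_nodes

# Example: 6 nodes averaging 13M documents each against a 15M tested maximum.
print(nodes_to_add(13_000_000, 15_000_000, 6))
```

Running this check periodically against live document counts gives early warning before nodes reach the limit established during capacity testing.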