Sitecore on Solr Cloud: Part 4 - Tuning Solr for Production

This post is part of a series on setting up your Sitecore application to run with Solr Cloud. The series covers the procedure for setting up a Sitecore environment using the Solr search provider, and the creation of a 3-node Solr Cloud cluster. It is broken into four parts.

For the fourth part of this series, we will discuss updating your Solr settings, using Zookeeper to push those changes to the nodes, and some tuning and optimizations we can make to Solr and Sitecore for production.

Managing configurations with Zookeeper

Many of the changes you’ll need to make to Solr have to be done in the configuration files, such as solrconfig.xml and schema.xml. When you run Solr in cloud mode, each core’s /conf folder is stored in Zookeeper, so updating these files in Zookeeper updates every collection that uses that config set. The running collections pick up the change once they are reloaded.

Solr ships with a tool that allows you to execute some common commands against Zookeeper. In the /scripts/cloud-scripts folder, we’ll use zkcli.bat to pull configs out of Zookeeper and push them back in. The following command will push the contents of C:\SolrConfig\sitecoreConf up to Zookeeper as the sitecoreConf configuration set:

zkcli.bat -zkhost localhost:2181 -cmd upconfig -confdir C:/SolrConfig/sitecoreConf -confname sitecoreConf

To pull configuration out of Zookeeper, you can use a similar command:

zkcli.bat -zkhost localhost:2181 -cmd downconfig -confdir C:/SolrConfig/sitecoreConf -confname sitecoreConf

Note that the zkcli tool provided with Solr is not the same as the zkCli tool bundled with Zookeeper.

Memory Settings

Solr works best when it is able to fully utilize its caches. Ideally, you want to keep your entire index in memory. This includes both the heap space allocated to the JVM (java.exe in the process monitor) and the disk cache that Windows maintains. By default, Solr will allocate 25% of available memory to the JVM upon startup. If your indexes grow larger than this, you can safely increase this setting up to 50% of the server’s available memory. You can configure this with startup script parameters: -Xmx for Solr 4.x and -m for Solr 5.

Warming Searchers

If you’ve done any heavy indexing with Sitecore, you may have noticed failures with errors like this:

Error opening new searcher. Exceeded limit of maxWarmingSearchers=2, try again later.

Sitecore’s indexing operations execute a lot of commits, which can sometimes overwhelm the available searchers. Increasing the maxWarmingSearchers setting in solrconfig.xml from 2 to 4 will usually clear this up.
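As a rough sketch of that change, the fragment below shows only the relevant element. It assumes the stock solrconfig.xml layout, where maxWarmingSearchers sits inside the query section; your file will contain many other settings there, which are omitted here.

```xml
<!-- solrconfig.xml (fragment): other children of <query> are omitted -->
<query>
  <!-- Raised from 2 to 4 so that bursts of commits from Sitecore
       indexing are less likely to exceed the warming searcher limit -->
  <maxWarmingSearchers>4</maxWarmingSearchers>
</query>
```

As with any solrconfig.xml edit in cloud mode, the updated file would need to be pushed to Zookeeper with the upconfig command shown earlier.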

Dedicated Indexing Server

Since Solr is a centralized index provider, there’s no need to run indexing operations on every Sitecore instance in your environment. Typically, you can set the indexing strategy to “manual” on your CD environments, and allow your CM server to handle the indexing.

On larger implementations, for example a distributed CM environment, you can delegate indexing responsibilities to a dedicated job server. In this case, you would configure your update strategies only in the index configurations for that server, and set them to manual on all other Sitecore instances.
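For illustration, a manual strategy on a non-indexing server might look something like the fragment below. Treat it as a sketch under assumptions: the strategy ref path shown is the one typically used in Sitecore 8.x configuration and differs in other versions, and the fragment belongs inside the index definition (for example sitecore_web_index) in your own ContentSearch config.

```xml
<!-- Sitecore ContentSearch config (fragment), placed inside the <index>
     element of each index on servers that should NOT run indexing.
     Assumption: the ref path below matches your Sitecore version;
     check the indexUpdateStrategies section of your own config. -->
<strategies hint="list:AddStrategy">
  <strategy ref="contentSearch/indexConfigurations/indexUpdateStrategies/manual" />
</strategies>
```

The server that owns indexing would keep its regular update strategies in the same place.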

Near Real Time Searching

There can be some lag between when a change is made in Sitecore and when that change is reflected in Solr. This can be problematic if, for example, you change an item, and a page that references the item is requested and cached before Solr has updated to reflect that change.

Solr can be configured to execute soft commits (commits to memory) much more frequently than hard commits (commits to disk). By default, Solr is set up to hard commit every 15 seconds, and soft commits are disabled. In solrconfig.xml, you can change these settings to enable soft commits. Look for the autoSoftCommit setting, and set the maxTime to 2000 (2s) or less.

You can increase the autoCommit (hard commit) setting if you wish, but be aware that if the Solr instance goes down any changes not written to disk will be lost, requiring re-indexing.
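A minimal sketch of these commit settings is shown below. It assumes the stock solrconfig.xml layout, where both elements live inside the updateHandler section. The 15000 and 2000 values are the ones discussed above. The openSearcher=false line and the system property placeholders come from the example configs that ship with Solr and may not be present in your file.

```xml
<!-- solrconfig.xml (fragment): other children of <updateHandler> are omitted -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush to disk every 15 seconds without opening a new searcher -->
  <autoCommit>
    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: make changes visible to searches every 2 seconds
       (a value of -1 disables soft commits) -->
  <autoSoftCommit>
    <maxTime>${solr.autoSoftCommit.maxTime:2000}</maxTime>
  </autoSoftCommit>
</updateHandler>
```

Shorter soft commit intervals make changes visible sooner but cause searchers to be reopened more often, so it is worth watching cache hit rates after changing this value.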

Health Checks

If you’re using Solr Cloud with Sitecore, you likely have a load balancer distributing your requests across the nodes. You need to set up a health check that takes a node out of the load balancer when it becomes unavailable, so that requests stop being routed to it.

Solr provides a ping request handler that can be used for health checks. In solrconfig.xml, you can add this request handler:

<!-- ping/healthcheck -->
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="q">solrpingquery</str>
  </lst>
  <lst name="defaults">
    <str name="echoParams">all</str>
  </lst>
  <!-- An optional feature of the PingRequestHandler is to configure the
       handler with a "healthcheckFile" which can be used to enable/disable
       the PingRequestHandler.
       relative paths are resolved against the data dir
  -->
  <!-- <str name="healthcheckFile">server-enabled.txt</str> -->
</requestHandler>

This sets up a ping handler at the core level. (Note I said core, not collection.) You’d use a URL like this to issue your health check:

http://your.solrserver:8983/solr/itembuckets_shard1_replica1/admin/ping

You can issue these requests to a single core to check the health of the entire node. If you have multiple collections in your solution, you can even configure health checks at the core level to take a single node out of rotation for requests to that collection.

And more…

There’s a lot that goes into rolling out a production instance of Solr Cloud, and no single blog post will be able to cover everything. Like any system in your solution, Solr needs to be continuously monitored and tweaked to ensure optimal performance. This list includes some of the things you can do in advance when taking Sitecore and Solr to production.