Benchmarking Sitecore Publishing

Publishing has been a sore spot lately for some of our clients due to the large amount of content they have in their Sitecore environments. When you start to get into hundreds of thousands of pieces of content, a full site publish is prohibitive. Any time a change is made that requires a large publish, your deployment window goes from an hour to potentially an all-day affair. If a user accidentally starts a large publish, subsequent content publishes are queued behind it until that large publish completes, or until someone logs into the server and restarts the application.

There are options available to speed up the publishing process. Starting in Sitecore 7.2, parallel publishing was introduced, along with some experimental optimization settings. In Sitecore 8.2, we have a new option, the Sitecore Publishing Service.

What benefits can we see from these options? I decided to run some tests of large content publishes using these techniques. Each publishing option has its own caveats, of course, but this post focuses mainly on the publishing performance of each of the available options.

Methodology

I wanted to run these tests in as pure an environment as possible. I set up 3 Sitecore 8.2 environments using Sitecore Instance Manager on my local machine. Using the FillDB tool, I generated 100,000 content items nested in a folder under the site root. Each of these items is of the Sample Item template that ships with a clean Sitecore installation. A Full Publish of the entire site was used in each test, and in each case the content was being published for the first time.

For benchmarking purposes, my local machine has the following specs:

  • Intel i7, 8 Core, 2.3 GHz CPU
  • 16 GB RAM
  • Seagate SSHD (not an SSD, but it claims to perform like an SSD!)
  • Windows 7 x64, SP1
  • SQL Server Express 2015
  • .NET 4.6 and .NET Core installed

Default Publishing

The first test was a full site publish after generating 100,000 content items, using the out-of-the-box publishing configuration. This is probably how most Sitecore sites are configured unless you took steps to optimize the publishing process. The results are, as expected, not great.

21620 12:19:30 INFO  Job started: Publish
21620 13:51:18 INFO  Job ended: Publish (units processed: 106669)

That’s over 90 minutes to publish these items, and the content items themselves only had 2 fields with any data.
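For anyone who wants to pull these numbers out of their own logs, here is a minimal sketch of how the elapsed time can be computed from the two log lines above. It assumes the log layout shown in this post (thread ID, then an HH:MM:SS timestamp), and the function name `publish_duration` is just a hypothetical helper for illustration, not part of any Sitecore tooling.

```python
from datetime import datetime, timedelta


def publish_duration(start_line: str, end_line: str) -> timedelta:
    """Return the elapsed time between two Sitecore log lines.

    Assumes each line looks like '<thread id> <HH:MM:SS> <level> <message>',
    as in the log excerpts in this post.
    """
    fmt = "%H:%M:%S"
    # The timestamp is the second whitespace-separated token on the line.
    start = datetime.strptime(start_line.split()[1], fmt)
    end = datetime.strptime(end_line.split()[1], fmt)
    if end < start:
        # The log carries no date, so assume the publish crossed midnight.
        end += timedelta(days=1)
    return end - start


started = "21620 12:19:30 INFO  Job started: Publish"
ended = "21620 13:51:18 INFO  Job ended: Publish (units processed: 106669)"

elapsed = publish_duration(started, ended)
print(elapsed)  # 1:31:48
```

The same helper works for the other log excerpts in this post that use this format, since only the timestamp token is read.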

Parallel Publishing

Next I tested parallel publishing, introduced in Sitecore 7.2. To use this, you need to enable Sitecore.Publishing.Parallel.config. Since I have an 8 core CPU, I set the Publishing.MaxDegreeOfParallelism setting to 8.

There is also Sitecore.Publishing.Optimizations.config, which contains, as the name implies, some optimization settings for publishing. The file comments state that the settings are experimental, and that you should evaluate them before using them in production. For purposes of this test, I ignored this file.

With parallel publishing enabled, we see a much shorter publish time of around 25 minutes.

12164 14:27:10 INFO  Job started: Publish to 'web'
12164 14:52:58 INFO  Job ended: Publish to 'web' (units processed: 106669)

Publishing Optimizations

I reran the previous test with the Sitecore.Publishing.Optimizations.config enabled, along with the parallel publishing. This shortened the publish to around 15 minutes.

9836 15:52:34 INFO  Job started: Publish to 'web'
9836 16:07:20 INFO  Job ended: Publish to 'web' (units processed: 106669)

Sitecore Publishing Service

New in Sitecore 8.2 is the Publishing Service, a separate web application written in .NET Core that replaces the existing publishing mechanism in your Sitecore site. The documentation on setting up this service is thorough, so kudos to Sitecore for that, but it can be a bit dense. I found this blog post quite helpful in clearing up my confusion. Using it in conjunction with the official documentation, I was able to set up the service in less than an hour.

I ran into a problem using this method, however. The Publishing Service uses some new logic to gather the items it needs to publish, and one of the things it keys off of is the Revision field. The FillDB tool doesn’t explicitly write to the Revision field, so the service didn’t publish any of my generated items. I wound up running a script with Sitecore PowerShell to make a simple edit to each of these items, forcing the Revision field to be written. After that, my items published as expected.

The results were amazing. The new Publishing Service was able to publish the entire site, over 100,000 items, in just over 4 minutes. That’s over 20x faster than the default publish settings.

2016-10-19 16:34:17.027 -04:00 [Information] New Job queued : 980bee8e-a132-4041-82d8-155b8496b19f - Targets: "Internet"
2016-10-19 16:39:07.304 -04:00 [Information] Job Result: 95b88a85-64f4-465e-b33d-a7a901331488 - "Complete" - "OK". Duration: 00:04:05.2786436

Summary

Publishing Mode                           Publishing Time
Default Publishing                        90 minutes
Parallel Publishing                       25 minutes
Parallel Publishing with Optimizations    15 minutes
Sitecore Publishing Service 1.1           4 minutes
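
To put the summary in relative terms, the sketch below computes each mode's speedup over the default configuration. It uses the rounded times from the table rather than the exact log durations, so the factors are approximate, and the `speedups` helper is a hypothetical name used only for this illustration.

```python
# Rounded publish times in minutes, taken from the summary table.
timings_minutes = {
    "Default Publishing": 90,
    "Parallel Publishing": 25,
    "Parallel Publishing with Optimizations": 15,
    "Sitecore Publishing Service 1.1": 4,
}


def speedups(timings, baseline="Default Publishing"):
    """Return how many times faster each mode is than the baseline mode."""
    base = timings[baseline]
    return {mode: base / minutes for mode, minutes in timings.items()}


for mode, factor in speedups(timings_minutes).items():
    print(f"{mode}: {factor:.1f}x")
```

With these rounded figures, parallel publishing works out to roughly 3.6x, adding the optimizations to 6x, and the Publishing Service to 22.5x the speed of the default publish.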

Each of these optimizations comes with caveats. Parallel Publishing can introduce concurrency issues if you’re firing events during publish. The optimization config settings need to be vetted before rolling out, as they disable or alter many features you may be using, even if you don’t realize you’re using them.

If you’re on Sitecore 8.2 I strongly recommend giving the Publishing Service a look. Like any change to your system, you’ll want to test the effects it has on your publishing events and other hooks before rolling it out.

Sitecore on Solr Cloud: Part 4 – Tuning Solr for Production

This post is part of a series of posts on setting up your Sitecore application to run with Solr Cloud. We’ll be covering the procedure for setting up a Sitecore environment using the Solr search provider, and the creation of a 3-node Solr cloud cluster. This series is broken into four parts.

For the fourth part of this series, we will discuss updating your Solr settings, using Zookeeper to push those changes to the nodes, and some tuning and optimizations we can make to Solr and Sitecore for production.

Sitecore Solr Support for Chinese Language

If you’re running Sitecore with Solr, you may have noticed crawling errors when you add versions in certain languages. A common requirement for multilingual sites is support for Chinese, which the generated Solr schema Sitecore provides does not support by default. Fortunately, it’s relatively simple to correct this and add support for Chinese, as well as other languages that aren’t available in the default schema.

Sitecore on Solr Cloud: Part 3 – Creating Your Sitecore Collection

This post is part of a series of posts on setting up your Sitecore application to run with Solr Cloud. We’ll be covering the procedure for setting up a Sitecore environment using the Solr search provider, and the creation of a 3-node Solr cloud cluster. This series is broken into four parts.

For the third part of this series, we will create our Sitecore collection, add replicas, and connect Sitecore to the collection. We’ll also go over load balancing the requests to distribute them among the Solr cloud nodes.

Sitecore on Solr Cloud: Part 2 – Setting up Zookeeper and Solr

This post is part of a series of posts on setting up your Sitecore application to run with Solr Cloud. We’ll be covering the procedure for setting up a Sitecore environment using the Solr search provider, and the creation of a 3-node Solr cloud cluster. This series is broken into four parts.

For the second part of this series, we will go through the steps to set up a Zookeeper Ensemble, individual Solr nodes, and linking them together in a Solr Cloud configuration. We’ll then create Windows services to start Zookeeper and Solr automatically on each server.

Sitecore on Solr Cloud: Part 1 – Architecture

This post is part of a series of posts on setting up your Sitecore application to run with Solr Cloud. We’ll be covering the procedure for setting up a Sitecore environment using the Solr search provider, and the creation of a 3-node Solr cloud cluster. This series is broken into four parts.

For the first part of this series, we’ll go over some assumptions, prerequisites, definitions, as well as the architecture of Solr Cloud and how it interacts with a Sitecore application.

Seamless Sitecore Content Migration

If you’re rebuilding your site in Sitecore, chances are you are looking at a content migration at some point in the near future. CMS content migration can turn into a quagmire, but it doesn’t have to go that way. The sooner you start planning for it, the better.

You’ll face a lot of questions and challenges during your content migration. This post aims to provide some guidance on common issues, including:

  • Questions to ask and items to define before you begin a migration
  • High-level process for conducting a content audit and deciding whether to migrate or archive content
  • Resources that you’ll need to have involved during the migration effort
  • Identifying if you’ll need a manual migration, an automated one or perhaps, a hybrid approach
  • Important tasks during a migration that are critical for success but often overlooked

While written with Sitecore in mind, and from experiences migrating content into and out of Sitecore, the principles can apply to any CMS content migration.

Check out the full post over on EContentMag.

Protect Your Sitecore Renderings From Bad Datasources

We’ve all been there before: you get the urgent email off hours, or the call while you’re sleeping. “The site is down! Our homepage is not coming up!” You go to check and sure enough, the error handler page is displaying when you hit the homepage. Checking the logs, you see your old friend, “Object reference not set to an instance of an object.”

There are many reasons why a rendering can get published with an invalid or missing datasource.  The datasource could still be stuck in workflow, but the page it was added to was approved and published. A content editor may have set a publishing restriction on the datasource, setting up content in advance to be published at a certain date dictated by the business.  Perhaps the datasource was created in English, but the site is multilingual and no version in Spanish exists.

There are strategies to handle each of these scenarios, but content editors are always finding ways to do unexpected things. A Sitecore developer needs to write code that can handle these situations gracefully, without bringing down the page or emitting malformed HTML. However, checking in every rendering for null or invalid datasources can become tedious. This is ideally handled in a base class that your renderings inherit from.

Sitecore Microsites Made Easy

A couple months ago I wrote a blog post detailing a microsite implementation we built for a client. The implementation uses a custom site resolver and content-specific configuration to allow new sites to be created and deployed without configuration updates or deployments.  That post was picked up by Sitecore’s technical blog, and can be read here: https://www.sitecore.net/learn/blogs/technical-blogs/chris-sulham/posts/2015/01/quick-guide.aspx

My friend Pete, around the same time, released a module that accomplishes this in a more robust way, essentially upgrading the way you manage sites in a Sitecore solution.  Pete’s module is worth considering if you need a more holistic solution to site management within Sitecore.  Check out Pete’s module on the marketplace, and the source on GitHub.

Adding Flickr Images to Sitecore Content

This past weekend I participated in the Sitecore Hackathon along with a few other people here at Velir. My teammates (@distaula, @soyburgers) and I decided to build a tool that would allow content editors to embed media from external sources into their rich text areas. We chose Flickr to start with due to its popularity with our clients and its accessible API, via the Flickr.NET library.