Jabberwocky Updated for Sitecore 9

Velir’s Jabberwocky framework has been updated for Sitecore 9.0, initial release. This update doesn’t add any new features beyond support for Sitecore 9.

For now, the package is marked prerelease, due in part to the dependency on Glass.Mapper, which is still in prerelease for Sitecore 9 support. We’ll be assessing the framework during our upcoming Sitecore 9 upgrades and projects, and we’ll correct any issues we uncover. A final release will be available in the coming months.

As always, your feedback is welcomed!

Connecting Sitecore PaaS to Azure Cosmos DB

Sitecore’s new PaaS offering in Azure is now available. When you create an instance of Sitecore Experience Platform, you’re required to provide a MongoDB connection string for xDB. There are a few options in Azure for a Mongo service, but I decided to try setting it up with Microsoft’s Cosmos DB (formerly DocumentDB). Unfortunately, it didn’t work immediately, so I had to dig in a little to get my new Sitecore PaaS instance up and running. This post will walk through setting up Cosmos DB in Azure, attaching it to a new Sitecore PaaS instance, and deploying some custom code to our Sitecore instance to resolve the error connecting to Cosmos DB.

Set Up Azure Cosmos DB

The first thing you’ll want to do is set up Cosmos DB in Azure. Log into your portal and select New Resource on the left. Select Database, then “Database as a Service for MongoDB”. You’ll need to provide a resource ID, select a Resource Group, and choose a Location. Fill out the fields and click Create. After a few moments your new Cosmos DB instance will be available.

If you didn’t click Pin to Dashboard before creating, you can find it in the Resources list. Click on the new database and open up the resource viewer. You’ll see some general information in the Overview tab. On the left, find and click Connection String under Settings. Here you’ll see the connection strings, port number, username, and password you’ll need to connect Sitecore to Cosmos DB.

Notice the disclaimer at the bottom of this page: “Azure Cosmos DB has strict security requirements and standards. Azure Cosmos DB accounts require authentication and secure communication via SSL.” This is a problem for Sitecore out of the box, and it’s where we’ll need to do some customization to support secure connections to Mongo.

Set Up Sitecore PaaS

Next we’ll set up Sitecore PaaS. This is quite easy with the latest release of Sitecore 8.2, Update 3. If you click the New Resource button and search for Sitecore, you’ll see two offerings: Sitecore Experience Platform 8.2 and Sitecore Web Experience Manager 8.2. Since we’re setting up Mongo, we need Sitecore XP, so choose that. You’ll need to configure a few things: for SQL, a username and password; for Sitecore, the admin password and your license file. Make a note of these.

Under Sitecore XP Settings, you’ll need to provide connection strings for MongoDB. These are available in the resource view for the Cosmos DB instance we set up, if you didn’t make a note of them previously. You’ll need to edit each connection string to add the xDB database name that Sitecore expects. For example, for the analytics connection string:

mongodb://your-resource-name:12345yourResourceToken12345==@your-resource-name.documents.azure.com:12345/sitecore_analytics?ssl=true&replicaSet=globaldb
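
Repeat this for the tracking.live, tracking.history, and tracking.contact connection strings, swapping in the database name each one expects (for example, sitecore_tracking_live). The rest of the connection string stays the same; only the database name changes.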

It will take some time for your new Sitecore environment to be provisioned. Once it’s ready, open up the resource viewer. In the Essentials view, you’ll see the URL of your new instance. Go ahead and open that up, and you’ll see the familiar Sitecore welcome page. You can even log into Sitecore. In Azure, open up Application Insights and view the Log Stream (you may need to turn on Application Logs in the Diagnostic Logging tab first). You’re probably seeing errors related to MongoDB, in particular an error about the transport stream:
“Unable to connect to a member of the replica set matching the read preference Primary: Authentication failed because the remote party has closed the transport stream.”

This is because Cosmos DB requires an SSL connection, and out of the box Sitecore does not support that. So, we’ll need to deploy a fix. Fortunately, Sitecore provides a pipeline we can hook into to override the MongoDB connection behavior and enable secure connections.

Deploying a change to Sitecore PaaS

We’ll need to create a class to insert into the updateMongoDriverSettings pipeline. Our processor is going to explicitly set the connection mode to be secure and tell it to use TLS 1.2 in order to connect to Cosmos DB. Here’s the code:


using System.Security.Authentication;
using MongoDB.Driver;
using Sitecore.Analytics.Pipelines.UpdateMongoDriverSettings;

namespace Sitecore.SharedSource.CustomMongo
{
  public class CustomMongoDbClientProcessor : UpdateMongoDriverSettingsProcessor
  {
    public override void UpdateSettings(UpdateMongoDriverSettingsArgs args)
    {
      // Cosmos DB only accepts secure connections, so force SSL with TLS 1.2.
      args.MongoSettings.SslSettings = new SslSettings();
      args.MongoSettings.SslSettings.EnabledSslProtocols = SslProtocols.Tls12;
    }
  }
}

And here’s the config file we need to insert the processor:


<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <sitecore>
    <pipelines>
      <updateMongoDriverSettings>
        <processor type="Sitecore.SharedSource.CustomMongo.CustomMongoDbClientProcessor, Sitecore.SharedSource.CustomMongo" />
      </updateMongoDriverSettings>
    </pipelines>
  </sitecore>
</configuration>

Finally, we need to deploy this to our Azure app. Azure offers a lot of options for deployment, but for this example we’ll settle for FTP. You’ll need to set up credentials for the FTP connection; you can do that under Deployment credentials.

Once you’ve done that, take a look at the overview page, and you’ll see your FTP information.

Connect with your FTP client of choice, then upload the new DLL containing our processor to the /bin folder and the new config to /App_Config/Include/zzz.
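
If you’d rather script the upload, here’s a minimal sketch using curl. The host below is a placeholder (use the FTP hostname from your app’s overview page), and /site/wwwroot is the web root of an Azure app:

curl -T Sitecore.SharedSource.CustomMongo.dll "ftp://waws-prod-xxx.ftp.azurewebsites.windows.net/site/wwwroot/bin/" --user 'yourapp\deployuser'
curl -T Sitecore.SharedSource.CustomMongo.config "ftp://waws-prod-xxx.ftp.azurewebsites.windows.net/site/wwwroot/App_Config/Include/zzz/" --user 'yourapp\deployuser'

When the password is omitted from --user, curl will prompt for it.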

With this processor in place, Sitecore should now be connected to Cosmos DB.

Why you should use Solr 6 with Sitecore

I recently set up Sitecore with Solr 6.2. Anyone who has used Sitecore with Solr has probably been aggravated by one annoying bug/oversight in the Solr admin, something that’s finally been fixed.

How beautiful is that? We can finally see the full name of the core in the selector!

So far I haven’t found any compatibility issues with Sitecore 8.2 and Solr 6.2. Give it a try!

Publish Sitecore Media Items on Referenced Datasources

One of the great additions in Sitecore 8 is the ability to publish related items when executing a publish. With this feature, you can be sure to publish any items needed to render the page correctly, such as data sources, referenced taxonomy items, or images.

However, you may still have some gaps when using this feature. Consider a common scenario: you have a new page, and you add a component to the page that uses a separate item as a data source. On that data source is a field for an image. When publishing the page, the newly created data source item goes out, but the media item linked on that data source does not.

This is because of the way Sitecore processes referenced items. In essence, it only goes one level deep in the reference tree. So, items referenced by the item being published will be added to the queue, but items referenced by those referenced items will not.

Normally this is fine. If the publisher crawled references recursively, you’d probably wind up in an infinite publishing loop, or at least unintentionally performing a very large publish. But it is common for data source items to reference new content, like media, so we need to include those in the publish too.

There’s a pipeline in Sitecore 8 we can use specifically for this purpose, the <getItemReferences> pipeline. Out of the box, it includes a step to AddItemLinkReferences. This step is the one responsible for adding our referenced data source item, so we can override this step to add logic to include media referenced by that data source.

Like all great Sitecore developers, we customize Sitecore by reflecting on its code and replacing it with our own logic. I opened up Sitecore.Publishing.Pipelines.GetItemReferences.AddItemLinkReferences and added the following:

...
  foreach (Item obj in itemLinkArray.Select(link => link.GetTargetItem()).Where(relatedItem => relatedItem != null))
  {
    list.AddRange(PublishQueue.GetParents(obj));
    list.Add(obj);
    // This will look at the item's links looking for media items.
    list.AddRange(GetLinkedMediaItems(obj));
  }
  return list.Distinct(new ItemIdComparer());
}

Then we’ll add the GetLinkedMediaItems method:

protected virtual List<Item> GetLinkedMediaItems(Item item)
{
  List<Item> mediaList = new List<Item>();
  ItemLink[] itemLinkArray = item.Links.GetValidLinks()
    .Where(link => item.Database.Name.Equals(link.TargetDatabaseName, StringComparison.OrdinalIgnoreCase))
    .ToArray();
  foreach (ItemLink link in itemLinkArray)
  {
    try
    {
      Item target = link.GetTargetItem();       
      if (target == null || !target.Paths.IsMediaItem) 
        continue;
      // add parent media items or folders
      Item parent = target.Parent;
      while(parent != null && parent.ID != ItemIDs.MediaLibraryRoot)
      {
        mediaList.Insert(0, parent);
        parent = parent.Parent;
      }
      mediaList.Add(target);
    }
    catch (Exception ex)
    {
      Log.Error("Error publishing reference link related media items", ex, typeof(AddItemAndMediaLinkReferences));
    }
  }
  return mediaList;
}

We can wire in this new processor by patching it in place of the original one we reflected on:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
 <sitecore>
  <pipelines>
   <getItemReferences>
    <processor type="Sitecore.SharedSource.Pipelines.Publish.AddItemAndMediaLinkReferences, Sitecore.SharedSource"
               patch:instead="processor[@type='Sitecore.Publishing.Pipelines.GetItemReferences.AddItemLinkReferences, Sitecore.Kernel']"/>
   </getItemReferences>
  </pipelines>
 </sitecore>
</configuration>

With this in place, media items referenced on any linked item will be published. You can refine the logic further to consider only data sources, perhaps by checking the item’s path or template, to cut down on unintentional publishes.
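
As a minimal sketch of that refinement, you could add a guard like the following and call it at the top of GetLinkedMediaItems, returning an empty list for anything that isn’t a data source. The path fragment is an assumption; adjust it to wherever your solution keeps its data sources.

protected virtual bool IsDataSourceItem(Item item)
{
  // Hypothetical convention: data source items live under a "Data Sources" folder.
  return item.Paths.FullPath.IndexOf("/data sources/", StringComparison.OrdinalIgnoreCase) >= 0;
}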

Keep Sitecore Online When Solr Fails

There have been a few experimental patches made available by Sitecore to improve its support for Solr. One particularly thorny issue is that Sitecore will throw exceptions, bringing down your site, if Solr is misconfigured or unavailable. Recently, Sitecore support released the source for a patch on GitHub that addresses this. It even supports switch on rebuild.

https://github.com/andrew-at-sitecore/Sitecore.Support.391039

With this patch, Sitecore will poll for a Solr connection at a configured interval. If Solr is available, it will initialize the indexes or create your IQueryable objects. Otherwise, it will log an error and, where applicable, return an empty result set.

The magic is in how the patch initializes the index and IQueryable objects. Using the StatusMonitor class included with the patch, it checks whether Solr is available before attempting to use the connection.

void ISearchIndex.Initialize()
{
  SolrStatusMonitor.CheckCoreStatus(this);
  if (this.PreviousConnectionStatus == ConnectionStatus.Succeded)
  {
    base.Initialize();
  }
}

To use this patch, you’ll need to build it against your version of Sitecore. After that, drop in the patch config and follow the example in the provided configs to swap the types on your Solr indexes over to the fail-tolerant ones.
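
The swap itself is an ordinary config patch. The index type and assembly names below are illustrative only; take the real ones from the config files included with the patch:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
 <sitecore>
  <contentSearch>
   <configuration>
    <indexes>
     <index id="sitecore_web_index">
      <patch:attribute name="type">Sitecore.Support.ContentSearch.SolrProvider.SolrSearchIndex, Sitecore.Support.391039</patch:attribute>
     </index>
    </indexes>
   </configuration>
  </contentSearch>
 </sitecore>
</configuration>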

One disclaimer: this patch will keep your CD servers online if Solr fails, but the Sitecore admin will not function. Your authors will not be able to use the back end until the Solr problem is corrected.

Benchmarking Sitecore Publishing

Publishing has been a sore spot lately for some of our clients due to the large amount of content in their Sitecore environments. When you get into hundreds of thousands of content items, a full site publish becomes prohibitively slow. Any time a change requires a large publish, your deployment window goes from an hour to potentially an all-day affair. And if a user accidentally starts a large publish, subsequent content publishes get queued and backed up until that large publish completes, or until someone logs into the server and restarts the application.

There are options available to speed up the publishing process. Starting in Sitecore 7.2, parallel publishing was introduced, along with some experimental optimization settings. In Sitecore 8.2, we have a new option, the Sitecore Publishing Service.

What benefits can we see from these options? I decided to run some tests of large content publishes using these techniques. Each publishing option has its own caveats, of course, but this post concerns itself mainly with the publishing performance of each available option.

Skip to the results!

Methodology

I wanted to run these tests in as pure an environment as possible. I set up three Sitecore 8.2 environments using Sitecore Instance Manager on my local machine. Using the FillDB tool, I generated 100,000 content items nested in a folder under the site root. Each item uses the Sample Item template that ships with a clean Sitecore installation. A Full Publish of the entire site was used in each case, and each time the content was being published for the first time.
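
(The FillDB tool ships with Sitecore as an admin page; on a default install you’ll find it at /sitecore/admin/FillDB.aspx.)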

For benchmarking purposes, my local machine has the following specs:

  • Intel i7, 8-core, 2.3 GHz CPU
  • 16 GB RAM
  • Seagate SSHD (not an SSD, but it claims to perform like an SSD!)
  • Windows 7 x64, SP1
  • SQL Server Express 2015
  • .NET 4.6 and .NET Core installed

Default Publishing

The first test was a full site publish of the 100,000 generated content items using the out-of-the-box publishing configuration. This is probably how most Sitecore sites are configured, unless you’ve taken steps to optimize the publishing process. The results are, as expected, not great.

21620 12:19:30 INFO  Job started: Publish
21620 13:51:18 INFO  Job ended: Publish (units processed: 106669)

That’s over 90 minutes to publish these items, and the content items themselves only had two fields with any data.

Parallel Publishing

Next I tested parallel publishing, introduced in Sitecore 7.2. To use it, you need to enable Sitecore.Publishing.Parallel.config. Since I have an 8-core CPU, I set the Publishing.MaxDegreeOfParallelism setting to 8.
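
If you’d rather not edit the shipped file, the same change can be made in a patch file of your own. This is just a sketch; the setting name comes straight from Sitecore.Publishing.Parallel.config:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
 <sitecore>
  <settings>
   <!-- Match this to the number of cores you can spare for publishing -->
   <setting name="Publishing.MaxDegreeOfParallelism" value="8" />
  </settings>
 </sitecore>
</configuration>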

There is also Sitecore.Publishing.Optimizations.config, which contains, as the name implies, optimization settings for publishing. The file’s comments state that the settings are experimental and that you should evaluate them before using them in production. For the purposes of this test, I ignored this file.

With parallel publishing enabled, we see a much shorter publish time of around 25 minutes.

12164 14:27:10 INFO  Job started: Publish to 'web'
12164 14:52:58 INFO  Job ended: Publish to 'web' (units processed: 106669)

Publishing Optimizations

I reran the previous test with Sitecore.Publishing.Optimizations.config enabled, along with parallel publishing. This shortened the publish to around 15 minutes.

9836 15:52:34 INFO  Job started: Publish to 'web'
9836 16:07:20 INFO  Job ended: Publish to 'web' (units processed: 106669)

Sitecore Publishing Service

New in Sitecore 8.2 is the Publishing Service, a separate web application written in .NET Core that replaces the existing publishing mechanism in your Sitecore site. The documentation on setting up the service is thorough (kudos to Sitecore for that), though it can be a bit dense. I found this blog post quite helpful in clearing up my confusion. Using it in conjunction with the official documentation, I was able to set up the service in less than an hour.

I ran into a problem with this method, however. The Publishing Service uses new logic to gather the items it needs to publish, and one of the things it keys off of is the Revision field. The FillDB tool doesn’t explicitly write to the Revision field, so the service didn’t publish any of my generated items. I wound up running a Sitecore PowerShell script to make a simple edit to these items, forcing the Revision field to be written. After that, my items published as expected.
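
My script was along these lines. This is a rough Sitecore PowerShell sketch; the path is a placeholder for the folder FillDB generated into, and the field write simply forces a save so a fresh Revision is stamped (it clobbers the field, which is fine for dummy content):

Get-ChildItem -Path "master:\content\home\generated" -Recurse | ForEach-Object {
    $_.Editing.BeginEdit()
    $_["Text"] = [DateTime]::UtcNow.ToString()  # any edit will do
    $_.Editing.EndEdit() | Out-Null
}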

The results were amazing. The new Publishing Service published the entire site, over 100,000 items, in just over 4 minutes. That’s over 20x faster than the default publish settings.

2016-10-19 16:34:17.027 -04:00 [Information] New Job queued : 980bee8e-a132-4041-82d8-155b8496b19f - Targets: "Internet"
2016-10-19 16:39:07.304 -04:00 [Information] Job Result: 95b88a85-64f4-465e-b33d-a7a901331488 - "Complete" - "OK". Duration: 00:04:05.2786436

Summary

Each of these optimizations comes with caveats. Parallel publishing can introduce concurrency issues if you’re firing events during publish. The optimization config settings need to be vetted before rolling out, as they disable or alter many features you may be using, even if you don’t realize you’re using them.

If you’re on Sitecore 8.2, I strongly recommend giving the Publishing Service a look. Like any change to your system, you’ll want to test its effects on your publishing events and other hooks before rolling it out.

Remote Debugging Your Sitecore Application

In this post I’ll walk you through attaching a debugger to a Sitecore application running on a remote server. If you’re in a pinch dealing with a production issue and looking for a TL;DR version of the MSDN documentation, hopefully this will help. I’m assuming you’re using Visual Studio 2015, but the steps are largely the same for earlier versions.

Installing and Starting the Remote Debugger service

You’ll need to download the appropriate remote debugger for the version of Visual Studio you’re using. To start the service, you’ll need administrator permissions on the remote machine. If you don’t have them, find someone who does to set up and start the service for you.

To connect with Windows authentication, you’ll need an account on your local machine with the same domain and username as your account on the remote machine. It’s easier to select “No Authentication” and “Allow Any User to Debug”, but be warned there are security concerns. I don’t recommend this option unless there’s an additional layer of security in your system, such as IP filtering. You should also change the default port.

Attaching the Debugger

Once the service is listening, open your project in Visual Studio and select Attach to Process from the Debug menu. In the dialog box, under Transport, select Default if you’re using Windows authentication, or No Authentication if you chose that option on the remote server. In the Qualifier box, enter server:port and hit Enter. If the remote debugger is listening, you’ll see a list of processes running on the remote machine. Select w3wp.exe from the list and attach. That’s it! You’re now attached and ready to debug your remote application.

Why can’t I hit my breakpoints?

Chances are, unless you built the app and deployed the DLLs from your local machine, you won’t be able to hit any breakpoints you set. For the remote debugger to function, it needs the exact same symbol (.pdb) files as the DLLs on the remote machine. You’ll need to copy these down to the machine you’re debugging from and tell Visual Studio where to find them.

Once you’ve copied the symbols to a local folder, open the Tools -> Options dialog in Visual Studio. Select Debugging -> Symbols, and add a new path by clicking the folder icon. Paste in the path where you copied the .pdb files and make sure the box next to the new entry is checked. Close the dialog and reattach the remote debugger.

Why can’t I see my variable values?

Try this: in the Tools -> Options dialog, under Debugging, check “Use Legacy C# and VB Expression evaluators”.

Happy debugging, and remember to turn off your remote debugger when you’re finished!

Test Regular Expressions in Real-Time with NCrunch

Today I had to write some utility classes to parse query parameters from one search provider and transform them to work with another. Naturally, this meant a lot of work with regular expressions.

Many developers dread working with regular expressions. The syntax is arcane, the patterns can be absurdly long, and debugging them is a chore. A single-character change to a regex pattern, made to address one edge case, can cause failures on many other inputs. Unit tests covering an array of inputs are essential for having confidence in your regex code, and faith that your changes after the fact don’t break existing use cases.

But how can we accelerate the initial implementation? There are an array of tools out there to help with developing regexes, but you don’t need any of that if you have NCrunch installed in Visual Studio.

Chances are, if you’re a .NET developer who practices TDD, you’ve heard of NCrunch. If you’re not familiar with it, NCrunch is a tool that (among other things) runs your unit tests for you in the background, as you type. This gives you real-time feedback on your tests, which is a significant accelerator: it saves you the cycle of writing a test, writing some code to make the test pass, rebuilding, running the test, and repeating until green. You know right away if your code is working as intended. If it isn’t, you know immediately what you broke when you see the red lights next to your test methods. Overall it’s a great tool and I highly recommend it.

Consider this example, testing an email validation pattern:

using System.Text.RegularExpressions;

public class Email_Regex
{
	private const string EmailPattern = @".+";
	public bool IsValidEmail(string email)
	{
		return Regex.IsMatch(email, EmailPattern);
	}
}


using NUnit.Framework;

[TestFixture]
public class Email_Regex_Tests
{
	private Email_Regex _regex;

	[SetUp]
	public void SetUp()
	{
		_regex = new Email_Regex();
	}

	[Test]
	public void IsValidEmail_SimpleEmail_IsValid()
	{
		string testEmail = "chris.sulham@gmail.com";
		Assert.IsTrue(_regex.IsValidEmail(testEmail));
	}
}

This test will pass, but clearly the pattern isn’t going to cut it in the real world. We can add another test with an invalid email that the pattern should not match.

[Test]
public void IsValidEmail_NonEmailInput_IsNotValid()
{
	string testEmail = "I'm not telling you my email address!";
	Assert.IsFalse(_regex.IsValidEmail(testEmail));
}

From here we can set up an array of tests against representative inputs to run through our regex. Using NCrunch, we can make edits to the pattern and see in real time whether or not our inputs are matching. As you add more tests and tweak the pattern, you’ll once again know immediately which test inputs work with your changes. Seeing the lights go red the moment you add or change a character in the pattern removes many, many painful cycles of change, build, run tests, wonder what broke everything, change again…
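
In the same fixture, that array of inputs can live in a single parameterized test. The extra addresses here are just illustrative:

[TestCase("chris.sulham@gmail.com", true)]
[TestCase("first.last@sub.example.co.uk", true)]
[TestCase("I'm not telling you my email address!", false)]
[TestCase("no-at-sign.example.com", false)]
public void IsValidEmail_VariedInputs_ReturnExpectedResult(string email, bool expected)
{
	Assert.AreEqual(expected, _regex.IsValidEmail(email));
}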

Developing regular expressions this way almost makes the whole process fun.  Almost.

Download NCrunch here

Sitecore User Csv Import Module

I’ve created a small module to assist with importing users into Sitecore from a CSV file. Its purpose is to bulk-import users into Sitecore from an external FTP source, but it can also be used to push users into the system in a one-off manner, for example if you had to move users from another system into Sitecore as part of a site migration. It also comes with an automated agent that can be configured to regularly download and import user files from an external FTP source.

Overview

The module operates off of items created in Sitecore to represent the import CSV sheets. These items contain fields that let you configure how each user will be created based on the data in the sheet, as well as define a role and domain to assign the user to. The module can download these CSV sheets from an external FTP site and update the users whenever a sheet is newer than the last time it was processed. The agent (disabled by default) iterates over the items in the module’s folder each time it runs, downloading sheets and updating users as needed. Imports can also be initiated manually, using a custom ribbon button on the sheet import items from within Sitecore.

Setting Up

After downloading and installing the package to Sitecore, open /App_Config/Include/Sitecore.SharedSource.UserCsvImport.config to edit the module’s settings. You’ll need to create the folder that will store the CSV files the module reads; this should be in the site’s /data folder. If your CSVs are hosted on an external FTP site, you can define the hostname, username, and password here as well.
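
For orientation, the settings in that file have roughly this shape. The setting names below are illustrative, so treat the installed config as the authority:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
 <sitecore>
  <settings>
   <!-- Illustrative names only; check the installed config for the real ones -->
   <setting name="UserCsvImport.CsvFolder" value="$(dataFolder)/UserCsvSheets" />
   <setting name="UserCsvImport.FtpHost" value="ftp.example.com" />
   <setting name="UserCsvImport.FtpUser" value="ftpuser" />
   <setting name="UserCsvImport.FtpPassword" value="secret" />
  </settings>
 </sitecore>
</configuration>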

Using the Module

Open the Sitecore content editor, and in the master database navigate to /sitecore/system/Modules/User Csv Importer/User Csv Sheets. In this folder, you can create the User Csv Sheet items.

On a User Csv Sheet item you’ll find the following fields:

  • File Name: The name of the sheet of user data to import. If using the FTP download feature, the folder path should match the folders on the FTP server, e.g. /folder/subfolder/usersheet.csv.
  • Last Updated: The last time the sheet was processed. Clear this field to force the sheet to import again.
  • Role: The membership role to apply to the user. If it does not exist, it will be created.
  • Identity Field Name: The column in the CSV to use for the user’s username.
  • Email Field Name: The column in the CSV to use for the user’s email.
  • Custom Profile: The profile to use for the users being created. The columns in the CSV should map to the fields on this profile item, meaning the field names should match the names of the CSV columns. Fields that do not exist will be skipped. See this post for how to set up custom user profiles in Sitecore.

The UserCsvImport module has been tested on Sitecore 7.2 Update 3, as well as Sitecore 8.1 Initial Release. The module depends on a few external libraries: the Custom Item Generator, CsvHelper for reading and parsing the CSV files, and SSH.NET for secure FTP file transfers.

Download the module from the Sitecore Marketplace, or the source from GitHub.

Sitecore on Solr Cloud: Part 4 – Tuning Solr for Production

This post is part of a series on setting up your Sitecore application to run with Solr Cloud. We’ll be covering the procedure for setting up a Sitecore environment using the Solr search provider, and the creation of a 3-node Solr Cloud cluster. This series is broken into four parts.

For the fourth part of this series, we will discuss updating your Solr settings, using ZooKeeper to push those changes to the nodes, and some tuning and optimizations we can make to Solr and Sitecore for production.