Real-time Search Updates with Experience Edge Webhooks: Part 1

In a previous post, we went over how to use GraphQL and a custom Next.js web service to crawl and index our Sitecore XM Cloud content into a search provider. That crawler runs on a schedule, so what happens when your authors update their content? They’ll need to wait for the next run of the crawler to see their content in the search index. This is a step back in capabilities from legacy Sitecore XP, which updated indexes at the end of every publish.

It’s possible to recreate this functionality using Experience Edge webhooks. Experience Edge offers quite a few webhook options (see the list here). To enable near real-time updates of our search index, we’ll use the ContentUpdated webhook, which fires after a publish to Edge from XM Cloud finishes. Let’s take a look at an example payload from that webhook:

{
  "invocation_id": "56a95d51-b2af-496d-9bf1-a4f7eea5a7cf",
  "updates": [
    {
      "identifier": "FA27B51EE4394CBB89F8F451B13FF9DC",
      "entity_definition": "Item",
      "operation": "Update",
      "entity_culture": "en"
    },
    {
      "identifier": "80471F77728345D4859E0FD004F42FEB",
      "entity_definition": "Item",
      "operation": "Update",
      "entity_culture": "en"
    },
    {
      "identifier": "80471F77728345D4859E0FD004F42FEB-layout",
      "entity_definition": "LayoutData",
      "operation": "Update",
      "entity_culture": "en"
    },
    {
      "identifier": "7C77981F5CE04246A98BF4A95279CBFB",
      "entity_definition": "Item",
      "operation": "Update",
      "entity_culture": "en"
    },
    {
      "identifier": "FFF8F4010B2646AF8804BA39EBEE8E83-layout",
      "entity_definition": "LayoutData",
      "operation": "Update",
      "entity_culture": "en"
    }
  ],
  "continues": false
}

As you can see, we have item data here and layout data. The layout data is what we’re interested in, as this represents our actual web pages, and that is what we want to index.
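To make that concrete, here is a minimal sketch of pulling the pages out of a payload. The type names and the getUpdatedPages helper are hypothetical, and the field shapes are inferred only from the sample above. It also assumes the item ID is whatever precedes the "-layout" suffix, which matches the sample but is not something I've confirmed beyond it.

```typescript
// Payload shape inferred from the sample ContentUpdated payload (an assumption).
interface EdgeUpdate {
  identifier: string;
  entity_definition: string;
  operation: string;
  entity_culture: string;
}

interface EdgeWebhookPayload {
  invocation_id: string;
  updates: EdgeUpdate[];
  continues: boolean;
}

// Hypothetical result type: one entry per page and language to re-index.
interface PageRef {
  itemId: string;
  language: string;
}

// Keep only LayoutData entries, strip the "-layout" suffix, and drop duplicates.
function getUpdatedPages(payload: EdgeWebhookPayload): PageRef[] {
  const suffix = "-layout";
  const seen = new Set<string>();
  const pages: PageRef[] = [];

  for (const update of payload.updates) {
    if (update.entity_definition !== "LayoutData") continue;

    const itemId = update.identifier.endsWith(suffix)
      ? update.identifier.slice(0, -suffix.length)
      : update.identifier;

    const key = `${itemId}:${update.entity_culture}`;
    if (seen.has(key)) continue;

    seen.add(key);
    pages.push({ itemId, language: update.entity_culture });
  }

  return pages;
}
```

Run against the sample payload, this would return two entries, one for each LayoutData update, each carrying the "en" language.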

The general process is as follows:

  1. Set up a receiver for this webhook. We’ll do this with a Next.js function.
  2. Loop over the webhook payload and, for each piece of LayoutData, make a GraphQL query to get the field data from Experience Edge.
  3. Finally, roll up the field data into a JSON object and push it to our search index.

Let’s start by setting up our webhook. You’ll need to create an Edge administration credential in the XM Cloud Deploy app. Make note of the Client ID and Client Secret. The secret will only be displayed once, so if you lose it you will need to create new credentials.

The next step is to create an auth token. You’ll need this to perform any Experience Edge administration actions. I used the ThunderClient plugin for Visual Studio Code to interact with the Sitecore APIs. To create an auth token, make a POST request to https://auth.sitecorecloud.io/oauth/token, passing the Client ID and Client Secret you just created in XM Cloud as form data.
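As a rough sketch of that token request, the helper below builds the URL, headers, and URL-encoded form body without sending anything. The buildTokenRequest name is hypothetical. The form fields are an assumption on my part: grant_type, client_id, and client_secret are the standard OAuth client-credentials fields, and the audience value is what I understand Sitecore's documentation to specify, so check it against the current docs before relying on it.

```typescript
interface TokenRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describes the token request so any HTTP client can send it.
function buildTokenRequest(clientId: string, clientSecret: string): TokenRequest {
  // Assumed form fields (standard OAuth client-credentials grant).
  const fields: [string, string][] = [
    ["grant_type", "client_credentials"],
    ["client_id", clientId],
    ["client_secret", clientSecret],
    ["audience", "https://api.sitecorecloud.io"], // assumption, verify in the docs
  ];

  // URL-encode each key/value pair and join them as a form body.
  const body = fields
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");

  return {
    url: "https://auth.sitecorecloud.io/oauth/token",
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body,
  };
}
```

You could pass the result to fetch (or recreate it in ThunderClient) and read the access token from the JSON response.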

You’ll get back a JSON object containing an access token. This token must be sent with every API request to Experience Edge, passed as a Bearer token in the Authorization header. We can test it with a simple GET request that will list all the webhooks in this Edge tenant.
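A minimal sketch of that test request, assuming you already have the access token in hand. The buildListWebhooksRequest helper and the ListWebhooksRequest type are hypothetical names; the endpoint is the webhooks URL used throughout this post.

```typescript
interface ListWebhooksRequest {
  url: string;
  method: "GET";
  headers: Record<string, string>;
}

// Hypothetical helper: describes the GET request that lists the tenant's webhooks.
function buildListWebhooksRequest(accessToken: string): ListWebhooksRequest {
  return {
    url: "https://edge.sitecorecloud.io/api/admin/v1/webhooks",
    method: "GET",
    // The token goes in the Authorization header as a Bearer token.
    headers: { Authorization: `Bearer ${accessToken}` },
  };
}
```

From here, something like fetch(request.url, { method: request.method, headers: request.headers }) would send it.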

You should get back a JSON object containing a list of all the webhooks currently set up in your tenant (which is likely empty to begin with). The auth tokens expire after a day or so. If you get a message like edge.cdn.4:JWT Verification failed in your response, you have a problem with your token and should generate a new one.

Next, let’s create our ContentUpdated webhook. You’ll need something to receive the webhook. Since we haven’t created our function in Next.js yet, we can use a testing service like Webhook.site. Send a POST request to https://edge.sitecorecloud.io/api/admin/v1/webhooks with a JSON body that defines the webhook.

The important parameters here are uri and executionMode. The uri is where the webhook will be sent, in this case our testing endpoint at webhook.site. The execution mode OnUpdate indicates this will fire when content is updated. (Note: There are separate webhooks for create and delete, which you will probably need to set up later following this same pattern.)
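Here is a hedged sketch of what that request body might look like. The field names mirror the ones the API echoes back in the response sample in this post; that the create request accepts the same fields is my assumption. The WebhookDefinition type and buildContentUpdatedWebhook helper are hypothetical names.

```typescript
// Field names taken from the response sample in this post (assumed to be
// accepted on the create request as well).
interface WebhookDefinition {
  label: string;
  uri: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
  createdBy: string;
  executionMode: "OnUpdate";
}

// Hypothetical helper: builds the definition for a ContentUpdated webhook.
function buildContentUpdatedWebhook(receiverUri: string, label: string): WebhookDefinition {
  return {
    label,
    uri: receiverUri, // where Experience Edge will send the payload
    method: "POST",
    headers: { "x-acme": "ContentUpdated" }, // custom header from the sample
    body: "",
    createdBy: "CMS",
    executionMode: "OnUpdate", // fire when content is updated
  };
}
```

Serialize the result with JSON.stringify and send it as the POST body, along with your Bearer token.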

Send this request and you’ll get a response that looks like this:

{
  "id": "3cc79139-294a-449e-9366-46bc629ffddc",
  "tenantId": "myTenantName2157-xmcloudvani7a73-dev-2bda",
  "label": "OnUpdate Webhook Sandbox",
  "uri": "https://webhook.site/#!/view/d4ebda52-f7d8-4ae6-9ea2-968c40bc7f2f",
  "method": "POST",
  "headers": {
    "x-acme": "ContentUpdated"
  },
  "body": "",
  "createdBy": "CMS",
  "created": "2024-04-03T15:42:43.079003+00:00",
  "bodyInclude": null,
  "executionMode": "OnUpdate"
}

Try your GET request again on https://edge.sitecorecloud.io/api/admin/v1/webhooks, and you should see your webhook returned in the JSON response.

Try making some content updates and publishing from XM Cloud. Wait a few minutes, then check webhook.site to confirm the JSON payload arrived. If it did, you’ve set this up correctly.

To delete this webhook, you can send a DELETE request to https://edge.sitecorecloud.io/api/admin/v1/webhooks/<your-webhook-id>. Make sure you include your auth bearer token!
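The delete call can be sketched the same way. The buildDeleteWebhookRequest helper is a hypothetical name; the URL pattern is the one given above, with the webhook's id appended.

```typescript
interface DeleteWebhookRequest {
  url: string;
  method: "DELETE";
  headers: Record<string, string>;
}

// Hypothetical helper: describes the DELETE request for a single webhook.
function buildDeleteWebhookRequest(webhookId: string, accessToken: string): DeleteWebhookRequest {
  // Guard against accidentally targeting the collection URL with an empty id.
  if (webhookId.trim() === "") {
    throw new Error("webhookId is required");
  }

  return {
    url: `https://edge.sitecorecloud.io/api/admin/v1/webhooks/${encodeURIComponent(webhookId)}`,
    method: "DELETE",
    headers: { Authorization: `Bearer ${accessToken}` },
  };
}
```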

In the next post, we’ll go over handling this webhook to push content updates into our search index.

Self-signed Certificates with Solr Cloud and Sitecore 9.1

If you’ve been using Sitecore 9 or 9.1, you know that all the services the platform depends upon must communicate using trusted, secure connections. This includes Solr. Sitecore’s instructions and the scripts provided by SIF helpfully walk you through setting up a secure Solr installation as part of standing up your 9.1 environment. Jeremy Davis has also created a wonderful PowerShell script to install Solr with a self-signed certificate that I’ve used quite a bit.

But what if you need to set up Solr Cloud? Sitecore has instructions for that too. These instructions largely send you off to the Solr documentation. My colleague Adam Lamarre has a post walking through the process of setting up Solr Cloud on 9.1 as well, albeit on a single server.

If you follow the steps outlined in these posts, you’ll have Solr Cloud up and running on separate machines. But, when it comes time to create a collection you’re going to run into a problem. You may see something like this in the response:

{"responseHeader":
{"status":0,"QTime":33294},
"failure":{"solr3:8983_solr":"org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://solr3:8983/solr","solr2:8983_solr":"org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://solr2:8983/solr"},
"success":
{"solr:8983_solr":
{"responseHeader":{"status":0,"QTime":2323},"core":"sample_collection_shard1_replica2"}}}

We created our certificates, the nodes are up and running, Zookeeper is aware of them all, but the Solr nodes can’t seem to communicate with each other. So what gives? If we dig into the logs on any of the Solr servers, we get a little more insight into the problem.

2019-03-05 19:04:49.869 ERROR (OverseerThreadFactory-8-thread-1-processing-n:solr2:8983_solr) [   ] o.a.s.c.OverseerCollectionMessageHandler Error from shard: https://solr3:8983/solr
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://solr3:8983/solr
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:626)
at
...
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
...

What we’re seeing here is the Solr servers don’t trust each other. We need to fix that.

There are a couple of things we need to do here. First, we have to get the self-signed certificates we created for each Solr node and install them on the other servers. On each Solr server, do the following:

  1. Open certlm.msc
  2. Expand Trusted Root Certification Authorities -> Certificates and find the Solr certificate you created.
  3. Open the certificate and make a note of the thumbprint. We’ll need this later.
  4. Export the certificate. Make sure you check Include Extended Properties and Mark this Certificate as Exportable in the dialog.
  5. When prompted for a password, use the same one you configured when installing Solr (the default is “secret”).

Once you have the certificates, you’ll need to install them on the other nodes. On each Solr server:

  1. Open certlm.msc
  2. Expand Trusted Root Certification Authorities -> Certificates
  3. Import the certificates from the other two Solr nodes.

Try to hit the other Solr nodes from the browser on each server. For example, try accessing https://solr2:8983/solr/ from the Solr1 server. (You may need hosts file entries.) If your certificates are installed properly, the browser will not warn you about an untrusted site.

There is one more thing we need to do. The Windows servers might trust our Solr nodes now, but the Solr applications themselves do not. If you take a look at the Solr installation steps, you’ll notice we’re creating a keystore file that holds the certificate for that Solr node. These keystore files need to be updated to include the certificates from ALL of the Solr nodes, not just the one for the instance on that server.

We can easily do this with PowerShell. We can do it with Java’s keytool.exe too, but we’re Sitecore people and probably more comfortable in PowerShell! Remember those thumbprints we noted earlier? We’ll need them now.

Here’s the script, assuming your password is “secret”. Run this on any of the Solr nodes.

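# Password that protects the exported keystore (use your Solr keystore password)
$password = ConvertTo-SecureString -String "secret" -Force -AsPlainText
# Collect the certificates for all three nodes and export them into one PFX file
Get-ChildItem -Path `
    cert:\LocalMachine\Root\<THUMBPRINT_FOR_SOLR1>,`
    cert:\LocalMachine\Root\<THUMBPRINT_FOR_SOLR2>,`
    cert:\LocalMachine\Root\<THUMBPRINT_FOR_SOLR3> `
    | Export-PfxCertificate -FilePath D:\solr-ssl.keystore.pfx -Password $password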

Take this generated solr-ssl.keystore.pfx file and copy it over the keystore file on each of the Solr nodes, then stop and restart each node.

If we did everything correctly, our collections should now be created without errors, and we’ll be up and running with Solr Cloud and Sitecore 9.1.

For more information on the architecture of a Solr Cloud cluster and how to set one up for Sitecore, you can refer to my old blog series on the topic. It was written for 7.2, but the architecture principles haven’t changed (including the need for a load balancer!).

Modify Sitecore Install Framework Packages for Azure SQL

Sitecore 9 is here, it’s in our lives, and we’re at the point where the projects we started at the beginning of the year are getting ready to roll out. That means we need to get our production environments ready. If you’re coming from the Sitecore 8.x and earlier world, this can be a challenge. There are new databases, the xConnect service, security and certificate requirements, and of course our friend Solr is mandatory now. We have a new tool to help us get through all this, the Sitecore Install Framework (or SIF). It’s supposed to help us by automating our install steps, if you know how to use it.

Fortunately, Sitecore has really stepped up their documentation, especially with version 9. There’s a detailed guide on installing Sitecore 9, which covers a single instance (probably a local developer environment) and a scaled out production instance. However, when they say scaled out, they mean scaled out. There’s a script for every possible server role. In the real world, our environments don’t exactly match what’s in the documentation. For example, we often combine roles, or share hardware. We need to make some adjustments, and that’s when we start to go off the map.


Jabberwocky Updated for Sitecore 9

Velir’s Jabberwocky framework has been updated for the Sitecore 9.0 initial release. This update doesn’t add any new features beyond support for Sitecore 9.

For now, the package is marked prerelease, due in part to the dependency on Glass.Mapper, which is still in prerelease for Sitecore 9 support. We’ll be assessing the framework during our upcoming Sitecore 9 upgrades and projects, and we will correct any uncaught issues with the framework. A final release will be available in the coming months.

As always, your feedback is welcomed!

Sitecore User Csv Import Module

I’ve created a small module to assist with importing users into Sitecore from a CSV file. The purpose of the module is to bulk-import users into Sitecore from an external FTP source, but it can also be used to push users into the system in a one-off manner, for example if you had to move users from another system into Sitecore as part of a site migration. It also comes with an automated agent that can be configured to run regular downloads and imports of user files from an external FTP source.

Overview

The module works from items created in Sitecore to represent the import CSV sheets. These items contain fields that let you configure how the user will be created based on the data in the sheet, as well as define a role and domain to assign the user to. The module is capable of downloading these CSV sheets from an external FTP site and updating the users if the sheet is newer than the last time it was processed. Each time it runs, the agent (disabled by default) iterates over the items in the module’s folder, downloads each sheet, and updates the users if the sheet is newer. Imports can also be initiated manually using a custom ribbon button on the sheet import items from within Sitecore.

Setting Up

After downloading and installing the package to Sitecore, open /App_Config/Include/Sitecore.SharedSource.UserCsvImport.config to edit the module’s settings. You’ll need to create the folder that will store the CSV files the module reads; this should be in the site’s /data folder. If your CSV files are hosted on an external FTP site, you can define the hostname, username and password here as well.

Using the Module

Open the Sitecore content editor, and in the master database navigate to /sitecore/system/Modules/User Csv Importer/User Csv Sheets. In this folder, you can create the User Csv Sheet items.

On the User Csv Sheet item you’ll find the following fields:

  • File Name: The name of the sheet of user data to import. If using the FTP download feature, the folder path should match the folders on the FTP server. Ex. /folder/subfolder/usersheet.csv.
  • Last Updated: The last time the sheet was processed. Clear this field to force the sheet to import again.
  • Role: The membership role to apply to this user.  If it does not exist it will be created.
  • Identity Field Name: The column in the csv to use for the user’s username.
  • Email Field Name: The column in the csv to use for the user’s email.
  • Custom Profile: The profile to use for the users being created.  The columns in the csv should map to the fields on this profile item, meaning the field names should match the names of the csv columns.  Fields that do not exist will be skipped.  See this post for how to set up custom user profiles in Sitecore.

The UserCsvImport module has been tested on Sitecore 7.2 update 3, as well as Sitecore 8.1 initial release. The module depends on a few external libraries: the Custom Item Generator, CSVHelper for reading and parsing the CSV files, and SSH.NET for the support of secure FTP file transfers.

Download the module from the Sitecore Marketplace, or the source from GitHub.

Sitecore on Solr Cloud: Part 4 – Tuning Solr for Production

This post is part of a series of posts on setting up your Sitecore application to run with Solr Cloud. We’ll be covering the procedure for setting up a Sitecore environment using the Solr search provider, and the creation of a 3-node Solr cloud cluster. This series is broken into four parts.

For the fourth part of this series, we will discuss updating your Solr settings, using Zookeeper to push those changes to the nodes, and some tuning and optimizations we can make to Solr and Sitecore for production.

Sitecore Solr Support for Chinese Language

If you’re running Sitecore with Solr, you may have noticed crawling errors when you add versions in certain languages. A common requirement for multilingual sites is support for Chinese, which the generated Solr schema Sitecore provides does not support by default. Fortunately, it’s relatively simple to correct this and add support for Chinese, as well as other languages that aren’t available in the default schema.

Sitecore on Solr Cloud: Part 3 – Creating Your Sitecore Collection

This post is part of a series of posts on setting up your Sitecore application to run with Solr Cloud. We’ll be covering the procedure for setting up a Sitecore environment using the Solr search provider, and the creation of a 3-node Solr cloud cluster. This series is broken into four parts.

For the third part of this series, we will create our Sitecore collection, add replicas, and connect Sitecore to the collection. We’ll also go over load balancing the requests to distribute them among the Solr cloud nodes.


Sitecore on Solr Cloud: Part 2 – Setting up Zookeeper and Solr

This post is part of a series of posts on setting up your Sitecore application to run with Solr Cloud. We’ll be covering the procedure for setting up a Sitecore environment using the Solr search provider, and the creation of a 3-node Solr cloud cluster. This series is broken into four parts.

For the second part of this series, we will go through the steps to set up a Zookeeper Ensemble, individual Solr nodes, and linking them together in a Solr Cloud configuration. We’ll then create Windows services to start Zookeeper and Solr automatically on each server.
