Sample code available for download from Github.
This post outlines a minimal set of requirements for a developer-centric artefact provisioning engine. In broad terms, such engine is a piece of software which creates and applies changes to state over time. Examples include doing database migrations or upgrading configuration files with new software releases. The example I use throughout this post is that of a SharePoint Online provisioning engine which I've recently developed and put into production. Its task is to create and query site collections, webs, document libraries, views, pages, and the like in a reliable manner.
Inspired by the Command and Command Query Responsibility Segregation patterns, the main organizing principle behind the SharePoint Online provisioning engine is the grouping of functionality into reusable commands and queries. Commands, such as create site collection, create list, and add web part to page, are allowed to read and write to SharePoint whereas queries, such as a recursive web visitor or get search engine result, are read-only.
For commands to be truly reliable and reusable, they must adhere to the properties below. Queries being read-only are simpler and can get away with only being idempotent and composable.
Executing the same command or query against an artefact more than once must not change the result beyond the initial application. For instance, attempting to create a list more than once should only result in a single instance being created.
Executing a command must not delete or overwrite user created structure or content, unless of course it's an explicit delete command. For this to work, each provisioned artefact must be uniquely identifiable. For instance, lists, views, and web parts have a title, and content types have a name to identify those by.
Applying non-additive updates to SharePoint artefacts is particularly challenging as users are outright encouraged to change both structure and content. Contrast this with updating an SQL database whose structure at least remains constant in between upgrades. Then add the SharePoint factor.
Non-additive updates should be small and localized to avoid undesired side effects, and should be delegated to separate run-once scripts. As a corollary, once an artefact has been updated (or failed an update), I never roll back. Instead I roll forward by creating new run-once scripts.
More complex commands or queries are composed from simpler ones ad infinitum. In order for one to not interfere with the other, or have an imposed ordering, commands and queries must be as self-contained as possible. Performance considerations, however, may require a small amount of shared state, such as the client context or logging object, to get threaded through the tree.
As an example, suppose I want to provision an instance of the Acme Corp project site collection template. Then I'd create a top-level CreateAcmeCorpProjectSiteCollection command and compose it from smaller commands as illustrated below. These smaller commands come in two flavors: general ones which are independent of Acme Corp and specific ones which encapsulate Acme Corp provisioning logic.
CreateAcmeCorpProjectSiteCollection (specific) CreateSiteCollection (generic) CreateContentType (generic) CreateAnalysisWeb (specific) CreateWeb (generic) CreateAcmeDocumentLibrary (specific) CreateList (generic) AddColumn (generic) AddView (generic) CreateDesignWeb (specific) ...
In object oriented terms, each command and query is modelled as a class with an Execute method, accepting parameters specific to its functioning. The CreateSiteCollection command, for instance, is passed the title, URL, template, size, and language and in turn passes it along to the CSOM API (I could've used the REST API or PnP Core Component as well). As for the body of the Execute method, in this case and in general, it's a transaction script calling out to commands and queries.
To ensure artefacts get provisioned in the correct dependency order, the provisioning engine does a depth-first traversal of the above tree. This ordering maps nicely to the call stack metaphor.
As of current, a number of provisioning engines exist with the PnP engine probably getting the most attention.
The PnP engine is heavily dependent on XML and a mapping of properties from CSOM to XML and back. As the two evolve, maintaining feature parity and backward compatibility seems challenging. The PnP engine has to parse XML into an abstract syntax tree and, while traversing the tree, call the CSOM API (directly on indirectly through PnP Core component) and possible custom extensions. This indirection seems to introduce a considerable amount of accidental complexity.
While I like the idea of a shared provisioning engine, I opt for being closer to the metal and not rely on XML as an external domain specific language. I'm developer-centric and prefer expressing the template in code and take advantage of Visual Studio's refactoring, compiler, and debugging support. Taking the long view on maintainability, I'm not convinced that users defining templates through point and click and an engine exporting those verbatim as XML is such a good idea.
Another key difference is that my developer-centric provisioning engine focuses on state transitions leading to some end state, whereas the PnP engine focuses on declaratively defining the end state. I rely on loops (and table-driven code), helper methods, and other typical programming constructs to configure artefacts. In principle nothing precludes having the same constructs available in the XML. That's partly the philosophy of MSBuild and NAnt.
In this post I outlined what I consider the minimal requirements for a developer-centric SharePoint Online provisioning engine. The ideas stem from an engine that I've had in production for a few months, creating site collections from fairly involved templates. As I was wrapping up my engine, the PnP engine had started getting headlines. I chose to continue down my path, however, because it takes a more developer-centric view on provisioning.