In this post I'll show how to expose product information from a legacy SharePoint 2007 installation using the WSDL, XML, and Regex F# type providers. With F# type providers, little actual code needs to be written and most of it can be verified at compile time. Contrast this with using SharePoint from C# where, because SharePoint is heavy on (dynamically generated) XML, code generation and runtime exceptions are a common occurrence.
To follow along with the example, little SharePoint knowledge is required. The key point is that products are stored as items within a SharePoint Pages document library. In SharePoint terms, a document library with items is like a table with rows in a relational database. These items may then be queried through a SOAP web service using an XML query language.
The XML query sent to SharePoint and the result coming back are both of type XmlElement. I much prefer working with the newer XElement type and thus need a couple of routines to convert back and forth. Also, the username and password to authenticate with are read from the configuration file using a convenience function. I tried out the AppSettings type provider, but kept getting NullReferenceExceptions.
// requires the following NuGet packages:
//   FSharp.Data
//   RegexProvider

[<AutoOpen>]
module Utils =
    open System.Configuration
    open System.Xml.Linq
    open System.Xml

    let toXmlElement (e: XElement) =
        let d = XmlDocument()
        d.LoadXml(e.ToString())
        d.DocumentElement

    let toXElement (e: XmlElement) = XElement.Parse(e.OuterXml)

    // with "new XElement("foo", ...)", the constructor being called actually
    // has type XElement(XName, Object[]). The C# compiler will insert a call
    // to the implicit operator XName defined on the XName class to implicitly
    // convert from String to XName. The F# compiler doesn't trigger this
    // implicit conversion. Instead, we need to convert ourselves.
    let xn s = XName.Get s

    let getAppSetting (key: string) = ConfigurationManager.AppSettings.[key]
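A quick way to convince yourself that the two conversion helpers are lossless is to round-trip a small element. The following is a minimal sketch that repeats the helpers so it runs on its own as a script; the query fragment is made up purely for illustration.

```fsharp
open System.Xml
open System.Xml.Linq

// Same conversions as in the Utils module, repeated so the sketch stands alone.
let toXmlElement (e: XElement) =
    let d = XmlDocument()
    d.LoadXml(e.ToString())
    d.DocumentElement

let toXElement (e: XmlElement) = XElement.Parse(e.OuterXml)

let xn s = XName.Get s

// Hypothetical query fragment, used only to exercise the helpers.
let original =
    XElement(xn "Query",
        XElement(xn "Where",
            XElement(xn "IsNotNull",
                XElement(xn "FieldRef", XAttribute(xn "Name", "Title")))))

let roundTripped = original |> toXmlElement |> toXElement

// DeepEquals compares structure and content rather than object identity.
printfn "%b" (XNode.DeepEquals(original, roundTripped))
```

Both helpers go through a string representation, which is simple but not free. For the small query documents used here that cost should hardly matter.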
For this example, I'll keep the domain model simple. It consists of a Product type with optional references to the Document type, representing a web resource such as a PDF file:
module Domain =
    open System

    type Document = { Title: string; Url: Uri }

    type Product =
        { Title: string
          Url: Uri
          Manual: Document option
          Datasheet: Document option
          Summary: string option
          Image: Uri option
          Description: string }
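To make the shape of the model concrete, here is a small sketch that builds one product by hand and pattern matches on an optional field. The product, its URLs, and the describe function are invented for illustration. Because both records have Title and Url fields, the sketch qualifies the first field as Document.Title to tell the compiler which record is meant, the same trick used later in the post.

```fsharp
open System

type Document = { Title: string; Url: Uri }

type Product =
    { Title: string
      Url: Uri
      Manual: Document option
      Datasheet: Document option
      Summary: string option
      Image: Uri option
      Description: string }

// Made-up product, not taken from the real SharePoint site.
let anvil =
    { Title = "Anvil"
      Url = Uri "http://www.acme.com/Products/Pages/anvil.aspx"
      Manual =
        Some { Document.Title = "Anvil manual"
               Url = Uri "http://www.acme.com/Documents/Manuals/anvil.pdf" }
      Datasheet = None
      Summary = None
      Image = None
      Description = "A heavy block of iron." }

// Hypothetical helper: option fields force callers to handle the missing case.
let describe (p: Product) =
    match p.Manual with
    | Some doc -> sprintf "%s (manual: %O)" p.Title doc.Url
    | None -> sprintf "%s (no manual)" p.Title

printfn "%s" (describe anvil)
```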
Before issuing requests against the SharePoint 2007 APIs, you must first call the Authentication service with a username and password. The authentication service then returns an authentication cookie to pass with subsequent calls. Here's where the WSDL type provider comes in handy. Otherwise, if you wanted to call the service using strong typing, you'd have to add a separate C# project to the solution and from there add a web reference to the Authentication service before referencing the C# project from F#.
module ProductRepository =
    open System
    open System.Text.RegularExpressions
    open System.ServiceModel
    open System.ServiceModel.Channels
    open System.Xml.Linq
    open FSharp.Data
    open Microsoft.FSharp.Data.TypeProviders
    open FSharp.RegexProvider
    open Domain

    type AuthProxy = WsdlService<"http://www.acme.com/_vti_bin/Authentication.asmx">
    type LoginErrorCode = AuthProxy.ServiceTypes.schemas.microsoft.com.sharepoint.soap.LoginErrorCode

    let getAuthCookie username password =
        // Use a single client for both the scope and the Login call.
        let client = AuthProxy.GetAuthenticationSoap()

        // Open an OperationContextScope to gain access to the underlying
        // transport-specific response message property, i.e., the cookie.
        // Otherwise, we receive a runtime exception when accessing the HTTP
        // response. The scope is disposed when the function returns.
        use scope = new OperationContextScope(client.DataContext.InnerChannel)

        let loginResult = client.Login(username, password)
        if loginResult.ErrorCode <> LoginErrorCode.NoError then
            failwith "Unable to authenticate"

        let response =
            OperationContext
                .Current
                .IncomingMessageProperties
                .[HttpResponseMessageProperty.Name] :?> HttpResponseMessageProperty
        response.Headers.["Set-Cookie"]
A call to getAuthCookie with a valid username and password results in the following cookie being returned. The .ASPXAUTH value is unique for every authentication request:
.ASPXAUTH=2FD298088C07B99B5710F2F85A2599682321A0238156B36E4A5A70A986FD510A0D1915756B55AFC95B53FBA6B6D3B3B68E5655DA63D8FCBF0775607C13E1E7E2D76B5F80DED397604EFB955B40FF83721C679146C7F4AF9E7F97B62C504170B27F666B58; path=/; HttpOnly
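The returned string is a complete Set-Cookie header, including attributes such as path and HttpOnly that are meant for the client rather than the server. The code in this post sends the header back as is. If you'd rather send only the name=value pair, a minimal sketch could look like the following. The cookiePair helper is hypothetical and the ticket value is shortened and made up.

```fsharp
open System

// Hypothetical helper: keep only the "name=value" pair of a Set-Cookie header
// and drop attributes such as "path=/" and "HttpOnly".
let cookiePair (setCookie: string) =
    setCookie.Split(';').[0].Trim()

// Shortened, made-up value standing in for a real .ASPXAUTH ticket.
let header = ".ASPXAUTH=2FD298088C07B99B; path=/; HttpOnly"

printfn "%s" (cookiePair header)
// prints ".ASPXAUTH=2FD298088C07B99B"
```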
To query SharePoint 2007, I call the Lists SOAP service using the general execute method below. The three arguments to this method are all XML snippets, instructing SharePoint which predicate to filter products on and which fields of each product to return (actual XML constructed later). Again I call upon the WSDL type provider to generate a proxy for the Lists service. Notice how I add the authCookie to the header before calling GetListItems on the service.
    type ListProxy = WsdlService<"http://www.acme.com/Products/_vti_bin/Lists.asmx">

    let execute query viewFields queryOptions =
        let username = getAppSetting "SharePointUsername"
        let password = getAppSetting "SharePointPassword"
        let authCookie = getAuthCookie username password

        let client = ListProxy.GetListsSoap()
        let ctx = client.DataContext
        // Raise the message size limit to 10 MB to fit the full result.
        let binding = BasicHttpBinding(MaxReceivedMessageSize = 1024L * 1024L * 10L)
        ctx.Endpoint.Binding <- binding
        ctx.Endpoint.Address <- EndpointAddress("http://www.acme.com/Products/_vti_bin/Lists.asmx")

        // Attach the authentication cookie to the outgoing HTTP request.
        // The scope is disposed when the function returns.
        use scope = new OperationContextScope(ctx.InnerChannel)
        let request = HttpRequestMessageProperty()
        request.Headers.["Cookie"] <- authCookie
        OperationContext
            .Current
            .OutgoingMessageProperties
            .[HttpRequestMessageProperty.Name] <- request :> obj

        // see getProducts for query construction
        client.GetListItems(
            listName = "Pages",
            viewName = null,
            query = (query |> toXmlElement),
            viewFields = (viewFields |> toXmlElement),
            rowLimit = Int32.MaxValue.ToString(),
            queryOptions = (queryOptions |> toXmlElement),
            webID = null)
        |> toXElement
When I execute the query against SharePoint's Pages document library, I get back a complex XML document whose specific shape isn't that important. What is important is that I use the XML type provider to get strongly-typed access to its elements. This is possible by running the query beforehand and storing a subset of the result in ProductsTraining.xml. Given this sample, the XML type provider generates a type with the element access logic built in.
For instance, each product contains an OwsPageX0020ContentX00203 element, inferred to be an optional string, which I access in a strongly-typed way as p.OwsPageX0020ContentX00203 below.
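For contrast, here is roughly what untyped access to the same data looks like with plain LINQ to XML. This is a sketch under assumptions: the sample is a heavily trimmed guess at the response shape, in which each row carries its fields as ows_-prefixed attributes. That's my reading of the generated property names, not something the real response is shown to contain here. The attr helper and the product are invented.

```fsharp
open System.Xml.Linq

// Assumed, heavily trimmed response shape; a real response would carry more
// namespaces and many more attributes per row.
let sample = """<listitems xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema">
  <rs:data ItemCount="1">
    <z:row ows_Title="Anvil" ows_Page_x0020_Content_x0020_3="&lt;a href=&quot;/Documents/Manuals/anvil.pdf&quot;&gt;Manual&lt;/a&gt;" />
  </rs:data>
</listitems>"""

let z = XNamespace.Get "#RowsetSchema"

let rows = XElement.Parse(sample).Descendants(z + "row") |> Seq.toList

// Hypothetical helper for untyped access: a missing or misspelled attribute
// name only shows up at runtime, as None.
let attr (name: string) (row: XElement) =
    row.Attribute(XName.Get name) |> Option.ofObj |> Option.map (fun a -> a.Value)

for row in rows do
    printfn "%A" (attr "ows_Title" row)
    printfn "%A" (attr "ows_Page_x0020_Content_x0020_3" row)
    printfn "%A" (attr "ows_Titel" row) // the typo goes unnoticed by the compiler
```

With the type provider, the equivalent of the misspelled last lookup wouldn't compile, which is the whole point of feeding it a sample.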
Now that I have all the products, the final step is extracting the pieces of information required to construct the Product instances. For legacy reasons, this information is embedded within the (not well-formed) HTML generated by the SharePoint editor control. For instance, the product's manual and datasheet are anchor links (in that order) within the OwsPageX0020ContentX00203 element.
Thus, I use regular expressions to extract relevant pieces of information from the result. To ensure I don't misspell the group names, I make use of the simple Regex type provider. It inspects the match result of a Regex and provides strongly-typed access to its named groups.
    type Products = XmlProvider<"ProductsTraining.xml">

    type DocumentsRegex = Regex< @"<a(.*?)href=""(?<href>(.*?))""(.*?)>(?<title>(.*?))</a>">

    let parseDocuments (p: Products.Row) =
        let documents = p.OwsPageX0020ContentX00203
        match documents with
        | None -> (None, None)
        | Some ds ->
            let matches = DocumentsRegex().Matches(ds)
            let result =
                matches
                |> Seq.map(fun m ->
                    { Document.Title = m.title.Value
                      Url = Uri("http://www.acme.com" + m.href.Value) })
            (result |> Seq.tryFind (fun m -> m.Url.ToString().Contains("/Documents/Manuals/")),
             result |> Seq.tryFind (fun m -> m.Url.ToString().Contains("/Documents/datasheets/")))

    type ImageRegex = Regex< @"<img(.*?)src=""(?<src>(.*?))""">

    let parseImage (p: Products.Row) =
        let image = p.OwsPublishingPageImage
        match image with
        | None -> None
        | Some i ->
            let m = ImageRegex().Match(i)
            match m.Success with
            | true -> Some (Uri("http://www.acme.com" + m.src.Value))
            | false -> None

    let parseProduct (p: Products.Row) =
        let manual, datasheet = parseDocuments p
        { Title = p.OwsTitle
          Url = Uri(p.OwsEncodedAbsUrl)
          Manual = manual
          Datasheet = datasheet
          Summary = p.OwsPublishingPageContent
          Image = parseImage p
          Description = p.OwsPageX0020ContentX00202 }
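To see what the anchor pattern captures without setting up the type provider, the same pattern can be run through the plain .NET Regex class. This is a sketch only: the HTML fragment is made up in the style of the editor output, and the group names are ordinary strings here, which is exactly the weakness the Regex type provider removes.

```fsharp
open System
open System.Text.RegularExpressions

// Same pattern as DocumentsRegex, compiled with the plain .NET Regex class.
let documentsRegex =
    Regex(@"<a(.*?)href=""(?<href>(.*?))""(.*?)>(?<title>(.*?))</a>")

// Made-up fragment; note the unclosed <p> and <br>, as in not well-formed HTML.
let html =
    @"<p><a class=""x"" href=""/Documents/Manuals/anvil.pdf"">Anvil manual</a>" +
    @"<br><a href=""/Documents/datasheets/anvil.pdf"">Anvil datasheet</a>"

let links =
    documentsRegex.Matches(html)
    |> Seq.cast<Match>
    |> Seq.map (fun m ->
        // Group names are plain strings; a typo yields an empty, unsuccessful group.
        m.Groups.["title"].Value, Uri("http://www.acme.com" + m.Groups.["href"].Value))
    |> Seq.toList

for (title, url) in links do
    printfn "%s -> %O" title url
```

Since the pattern relies on lazy quantifiers and `.` doesn't match newlines by default, an anchor tag that spans several lines wouldn't be picked up by this sketch.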
Finally, I need to construct the specific query to get the products and call parseProduct on each item in the result:
    let getProducts() =
        let query =
            XElement(xn "Query",
                XElement(xn "Where",
                    XElement(xn "Contains",
                        XElement(xn "FieldRef", XAttribute(xn "Name", "PublishingPageLayout")),
                        XElement(xn "Value", XAttribute(xn "Type", "URL"), "Acme_Product.aspx"))))

        let fields =
            ["Title"; "PublishingPageLayout"; "PublishingPageImage"; "PublishingPageContent";
             "Page_x0020_Content_x0020_2"; "Page_x0020_Content_x0020_3"; "EncodedAbsUrl"]
            |> List.fold (fun acc fld ->
                acc + XElement(xn "FieldRef", XAttribute(xn "Name", xn fld)).ToString()) ""

        let viewFields = XElement.Parse("<ViewFields>" + fields + "</ViewFields>")
        let queryOptions = XElement(xn "QueryOptions", "")
        let result = execute query viewFields queryOptions
        let products = Products.Parse(result.ToString())
        products.Data.Rows |> Seq.map parseProduct |> Seq.toList
The three parts to the query generated above and passed to execute end up looking like so:
<Query>
  <Where>
    <Contains>
      <FieldRef Name="PublishingPageLayout" />
      <Value Type="URL">Acme_Product.aspx</Value>
    </Contains>
  </Where>
</Query>

<ViewFields>
  <FieldRef Name="Title" />
  <FieldRef Name="PublishingPageLayout" />
  <FieldRef Name="PublishingPageImage" />
  <FieldRef Name="PublishingPageContent" />
  <FieldRef Name="Page_x0020_Content_x0020_2" />
  <FieldRef Name="Page_x0020_Content_x0020_3" />
  <FieldRef Name="EncodedAbsUrl" />
</ViewFields>

<QueryOptions></QueryOptions>
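The ViewFields element above is assembled by concatenating strings and parsing the result. An alternative, sketched here rather than taken from the production code, is to pass the FieldRef elements to the XElement constructor as a list, since XElement accepts any sequence as content and adds each item as a child.

```fsharp
open System.Xml.Linq

let xn s = XName.Get s

let fieldNames =
    [ "Title"
      "PublishingPageLayout"
      "PublishingPageImage"
      "PublishingPageContent"
      "Page_x0020_Content_x0020_2"
      "Page_x0020_Content_x0020_3"
      "EncodedAbsUrl" ]

// One FieldRef element per field name, added as children without going
// through ToString and XElement.Parse.
let viewFields =
    XElement(xn "ViewFields",
        fieldNames |> List.map (fun f -> XElement(xn "FieldRef", XAttribute(xn "Name", f))))

printfn "%O" viewFields
```

This keeps the whole query in the XElement world and avoids escaping surprises should a field name ever contain a character that is special in XML.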
To kick off everything from a console and display the first product, here's the code required. In the actual production code, I call getProducts from a C# WCF service:
[<EntryPoint>]
let main _ =
    ProductRepository.getProducts()
    |> Seq.truncate 1
    |> printfn "%A"
    0