Lessons learned setting up an IIS web application for double-hop Kerberos authentication with delegation

18 May 2016

Part 1: Lessons learned setting up an IIS web application for double-hop Kerberos authentication with delegation
Part 2: Diagnosing network issues by building TCP/UDP ping into an application
Part 3: Making outbound requests from an ASP.NET MVC/WebAPI application using Kerberos delegated credentials

This post summarizes the lessons learned setting up an Internet Information Services (IIS) web application using Kerberos authentication (see introduction here and here) with delegation. Once the prerequisites for Kerberos and IIS have been verified, a test web service (a web application) is installed. Its task is to verify end-to-end authentication by accepting requests from clients and making requests to a Kerberos-enabled SharePoint instance on behalf of those clients.

In terms of flow, communication from the client, such as a browser, to SharePoint and back is illustrated below. The diagram includes optional load balancers sitting between the client and the web service, and between the web service and the Kerberos-enabled, on-premises SharePoint 2013 used for testing and illustration purposes:

Client <-> (Load balancer) <-> Web service <-> (Load balancer) <-> SharePoint

In addition to a client calling into SharePoint directly using the client's own credentials, the flow enables a client to make indirect calls through the service and have the service switch between executing using the client's credentials and those of a more privileged account. The service can then carry out privileged operations, such as creating a site collection, while otherwise executing code using the client's credentials and remaining subject to SharePoint's built-in authorization checks. Only while executing using privileged credentials are custom authorization checks needed.

Verifying prerequisites

Before getting to the IIS part, let's first verify that Active Directory and Kerberos with delegation have been properly configured by a domain administrator.

Service Principal Names

With Windows Kerberos, the Key Distribution Center (KDC), Authentication Server (AS), and Ticket Granting Server (TGS) are integral parts of the domain controller. We therefore register a one-to-many binding with the domain controller, relating the web service's app pool account to endpoints. In Kerberos terms, we register a Service Principal Name (SPN) for an account. Think of a principal as the general term for a user, computer, or service. An SPN is then a unique identifier for a service running on a server, and is used by a client to locate a service.

Looking up, or forming, a service principal by name (for an endpoint) is what enables a client to request credentials from the TGS that are valid for the particular service only. Part of the credentials received is a ticket, encrypted by the TGS, for the client to pass to the service. Only the service is capable of decrypting this ticket, thereby validating the identity of the client from what the ticket contains. With the decrypted ticket in hand, and using an encryption key from inside the ticket, the service then sends back an encrypted message that only the client can decrypt, thereby validating the server's identity.

Because both client and server trust a common third party (KDC, AS, TGS), this Kerberos handshake establishes mutual trust between two principals over an unsecured network. From then on, the parties may continue to exchange encrypted messages using Kerberos encryption keys or, more likely, the connection is already protected by Transport Layer Security.
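The exchange described above can be sketched as a toy simulation. This is not real Kerberos: the seal/unseal helpers are a deliberately simple stand-in cipher built from the Python standard library, and all function names are hypothetical. The sketch only illustrates why a ticket sealed with the service's key, carrying a session key, lets both sides prove their identity.

```python
import hashlib
import hmac
import json
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustration only, not for production)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, payload: dict) -> bytes:
    """Encrypt and authenticate a payload so only a holder of `key` can read it."""
    plaintext = json.dumps(payload).encode()
    nonce = os.urandom(16)
    cipher = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    return nonce + tag + cipher


def unseal(key: bytes, blob: bytes) -> dict:
    """Reverse of seal(); raises ValueError when the key is wrong or the blob was altered."""
    nonce, tag, cipher = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + cipher, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered message")
    plaintext = bytes(a ^ b for a, b in zip(cipher, _keystream(key, nonce, len(cipher))))
    return json.loads(plaintext)


def tgs_issue_ticket(service_key: bytes, client_name: str):
    """TGS role: create a session key and a ticket only the service can open.

    Simplification: the real TGS also encrypts the session key for the client.
    """
    session_key = os.urandom(32)
    ticket = seal(service_key, {"client": client_name, "session_key": session_key.hex()})
    return session_key, ticket


def service_accept(service_key: bytes, ticket: bytes, authenticator: bytes):
    """Service role: open the ticket, read the client's identity, answer with the session key."""
    contents = unseal(service_key, ticket)
    session_key = bytes.fromhex(contents["session_key"])
    stamp = unseal(session_key, authenticator)["timestamp"]
    return contents["client"], seal(session_key, {"timestamp": stamp})


def client_verify(session_key: bytes, sent_stamp: int, reply: bytes) -> bool:
    """Client role: only the genuine service could have produced this reply."""
    return unseal(session_key, reply)["timestamp"] == sent_stamp
```

A client would call tgs_issue_ticket, seal a timestamp with the session key as its authenticator, pass ticket and authenticator to service_accept, and finally check the reply with client_verify. A service holding the wrong key fails at the first unseal.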

The existence of an SPN for an app pool account is easily verified with the setspn command which, despite its name, is used for both setting and querying principals. While a domain administrator must have made a prior call to setspn to create the principal, any computer within the domain can query principals:

%> setspn -L <domain>\<app-pool-account>
Registered ServicePrincipalNames for CN=<app-pool-account>,OU=AD,OU=ServiceAccounts,OU=AcmeCorp,DC=<domain>,DC=local:
    HTTP/<my-server-1>.<domain>.local
    HTTP/<my-server-1>
    HTTP/<my-server-2>.<domain>.local
    HTTP/<my-server-2>

The output reveals that four SPNs are associated with the app pool account. It's common for an endpoint to have two registered principals matching DNS host name and NetBIOS name, respectively. In other words, our service could be deployed to either or both servers listed above.
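When several servers are involved, checking the listing by eye gets tedious. The following is a minimal sketch that parses output in the format shown above and reports which of the two expected SPNs (DNS host name and NetBIOS name) are missing for a host. The function names, the sample account, and the case-insensitive comparison are assumptions made for illustration.

```python
def parse_spn_listing(text: str) -> list[str]:
    """Extract SPNs from a listing; SPN lines are indented, the header line is not."""
    spns = []
    for line in text.splitlines():
        if line.startswith((" ", "\t")) and line.strip():
            spns.append(line.strip())
    return spns


def missing_spns(spns: list[str], host: str, dns_suffix: str, service: str = "HTTP") -> list[str]:
    """Return the expected SPNs (FQDN form and short form) that are not registered."""
    expected = [f"{service}/{host}.{dns_suffix}", f"{service}/{host}"]
    # Assumption: SPNs are compared without regard to case.
    registered = {spn.lower() for spn in spns}
    return [spn for spn in expected if spn.lower() not in registered]


# Hypothetical sample in the same shape as the output above.
SAMPLE_LISTING = """Registered ServicePrincipalNames for CN=svc-web,OU=ServiceAccounts,DC=acme,DC=local:
    HTTP/web1.acme.local
    HTTP/web1
"""

if __name__ == "__main__":
    registered = parse_spn_listing(SAMPLE_LISTING)
    print(missing_spns(registered, "web1", "acme.local"))
    print(missing_spns(registered, "web2", "acme.local"))
```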

Delegation

Let's move on and verify the app pool account's ability to not only receive and process, but also delegate, Kerberos tickets, i.e., call another service on the client's behalf.

With an intermediary service acting on behalf of a client, the client and SharePoint are now two hops apart (and by induction, any number of hops apart). Passing along the ticket, and having it remain valid for any intermediary to use for impersonation against the next service, is a distinguishing characteristic of Kerberos over NTLM. It requires only that any intermediary has an SPN registered and that its account is enabled for delegation.

For obvious security reasons, by default an account isn't allowed to transparently delegate client credentials. To verify that delegation is indeed enabled for our service's app pool account, locate the account (user) within Active Directory Users and Computers. Under the user's Properties, on the Delegation tab, ensure that either "Trust this user for delegation to any service (Kerberos only)" or "Trust this user for delegation to specified services only" is selected. The latter option, also named constrained delegation, is considered more secure in that any endpoint the account wishes to delegate to must be explicitly listed.
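The choices on the Delegation tab are stored on the account as bits in the userAccountControl attribute and, for constrained delegation, as entries in msDS-AllowedToDelegateTo. As a rough sketch, assuming the two attribute values have already been read from the directory by some other means, the setting could be classified like this. The function name and the returned labels are made up for illustration.

```python
# userAccountControl bits relevant to delegation.
TRUSTED_FOR_DELEGATION = 0x80000            # delegation to any service
TRUSTED_TO_AUTH_FOR_DELEGATION = 0x1000000  # constrained, any authentication protocol


def delegation_mode(user_account_control: int, allowed_to_delegate_to=()) -> str:
    """Classify an account's delegation setting from its directory attributes.

    `allowed_to_delegate_to` holds the SPNs from msDS-AllowedToDelegateTo, if any.
    """
    if user_account_control & TRUSTED_FOR_DELEGATION:
        return "unconstrained"
    if allowed_to_delegate_to:
        if user_account_control & TRUSTED_TO_AUTH_FOR_DELEGATION:
            return "constrained (any authentication protocol)"
        return "constrained (Kerberos only)"
    return "none"
```

For example, an account with delegation_mode(0x200, ["HTTP/sharepoint.acme.local"]) would be reported as using constrained delegation, where 0x200 is the flag for a normal user account and the SPN is a placeholder.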

Domain names

The syntactic resemblance between an SPN and a DNS host name is no coincidence. Assuming that machine name and host name are different, which in the real world they typically are, Windows Kerberos requires an SPN to be matched by a DNS A record (and not a CNAME). Using DNS as a locator mechanism makes sense given that the DNS service is typically an integral part of a domain controller, and given that the domain controller dynamically adds and removes DNS records for Kerberos to operate smoothly.

The A record requirement is by design of the Windows Kerberos client library. When only a CNAME is defined, the client library fails to resolve the host name while forming the SPN. When both a CNAME and an A record exist (possibly pointing to different IP addresses), the client uses the A record. A client that is unable to form an SPN can't query the TGS, so authentication fails.

We can verify the existence of an A record using the nslookup command.

%> nslookup -type=a <my-server-1>.<domain>.local

Server:   <dns-host>.<domain>.net
Address:  <some-ip>

Non-authoritative answer:
Name:     <my-server-1>.<domain>.local
Address:  <some-other-ip>

Here nslookup returned a single host, and assuming the IP address does indeed point to our host, our Kerberos prerequisites are now satisfied. Before continuing with the IIS and test service setup, however, a few additional limitations of the Windows Kerberos client library and Chrome are worth mentioning.
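The same check can be approximated from a script. The sketch below relies on socket.gethostbyname_ex, which returns the primary host name for the name that was queried. If that differs from the fully qualified name passed in, the name is most likely an alias. The helper names are hypothetical, and because resolver behaviour (hosts files, search suffixes) varies between systems, nslookup remains the more direct check.

```python
import socket


def classify_resolution(queried: str, canonical: str) -> str:
    """Compare the queried name with the primary name reported by the resolver."""
    if canonical.rstrip(".").lower() == queried.rstrip(".").lower():
        return "A record"
    return f"alias (CNAME) for {canonical}"


def check_host(fqdn: str):
    """Resolve a fully qualified name and report whether it looks like an alias."""
    canonical, _aliases, addresses = socket.gethostbyname_ex(fqdn)
    return classify_resolution(fqdn, canonical), addresses
```

Calling check_host with the service's fully qualified host name returns the classification together with the resolved IP addresses, which can then be compared against the expected server address.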

Limitations of the Windows Kerberos client and Chrome

Even though SPNs may optionally contain a port number, the Kerberos client library used by .NET and Internet Explorer forms SPNs from host names only. The client library is limited to requesting two tickets per host name: one for the default HTTP port and one for the default HTTPS port. Running services on any other port, and with a different app pool account, isn't supported.

Chrome, on the other hand, comes with its own cross-platform Kerberos client library with support for SPNs registered with any port. Chrome also supports both CNAME and A record DNS entries matching SPNs.
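To make the difference concrete, here is a small sketch of how an SPN might be derived from a request URL. It is a simplified model rather than the actual logic of either client library: the host-only form corresponds to the .NET and Internet Explorer behaviour described above, and the include_port flag (a hypothetical parameter) models a client that appends a port.

```python
from urllib.parse import urlsplit


def spn_for_url(url: str, include_port: bool = False) -> str:
    """Form an HTTP SPN for the host in `url`.

    The service class is HTTP for both http:// and https:// URLs.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host name in URL: {url!r}")
    spn = f"HTTP/{parts.hostname}"
    # Only an explicit port in the URL is appended, and only on request.
    if include_port and parts.port is not None:
        spn += f":{parts.port}"
    return spn
```

With a hypothetical URL such as https://web1.acme.local:8443/api, the host-only form yields HTTP/web1.acme.local regardless of the port, which is why a second service on another port under a different account can't be told apart by such a client.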

A limitation of Chrome is that by default it doesn't allow calling a service which may delegate credentials. The service's URL must be explicitly whitelisted by modifying the registry (or via an update to Group Policy Objects). Otherwise, calling the service results in an HTTP 401 Unauthorized. Behind the scenes, the whitelist affects how Chrome sets up its Windows security context. This context is responsible for actually passing the Kerberos ticket, along with other security-related parameters, to the service. Even if the Kerberos ticket has its delegation flag set, the context has its own delegation flag which takes precedence. It's the setting of this flag that's controlled by the whitelist. Similarly, a service creates a security context to read the Kerberos ticket and related security parameters. It's a mismatch between these parameters that causes the HTTP 401.
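As an illustration of the registry route, the sketch below generates the text of a .reg file that sets Chrome's delegation whitelist policy. It assumes the policy value is named AuthNegotiateDelegateAllowlist (older Chrome versions used AuthNegotiateDelegateWhitelist) and lives under the machine-wide Chrome policy key. Verify both against the policy documentation for the Chrome version in use. The function name and host pattern are placeholders.

```python
# Assumed location of machine-wide Chrome policies in the registry.
CHROME_POLICY_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Google\Chrome"


def delegate_allowlist_reg(hosts, value_name: str = "AuthNegotiateDelegateAllowlist") -> str:
    """Build .reg file text listing the servers Chrome may delegate credentials to."""
    value = ",".join(hosts)  # the policy takes a comma-separated list of host patterns
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        f"[{CHROME_POLICY_KEY}]",
        f'"{value_name}"="{value}"',
        "",
    ])


if __name__ == "__main__":
    # Placeholder pattern; substitute the real service host names.
    print(delegate_allowlist_reg(["*.acme.local"]))
```

Generating the file rather than writing to the registry directly keeps the change reviewable before a domain administrator imports it or translates it into a Group Policy setting.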

Setting up Internet Information Services

When setting up a new web application under a web site in IIS, a few common issues may arise. First, because a new application inherits settings from its parent web site, some of the settings below may already have their designated values. Second, because IIS reads and writes some settings from and to a web.config file, beware that xcopy deploying an application to IIS, or modifying settings through the IIS Manager, may reset settings in the other. Thus, if Kerberos isn't working after following the steps below, then check IIS and web.config settings:

Create a new web application and choose an existing app pool running under the identity of the aforementioned account. As an alternative, create a new app pool and adjust its identity.
For the new web application, under Authentication, ensure that Anonymous Authentication is disabled and that Windows Authentication is enabled. As per the second IIS issue mentioned above, beware that deploying a vanilla MVC/WebAPI app (with its minimal web.config) causes Anonymous Authentication to become re-enabled and future authentication to fail.

With Anonymous Authentication disabled, the client receives a 401 Unauthorized response from the service, indicating that authentication is required. The response includes WWW-Authenticate headers to inform the client of the supported authentication protocols. It's up to the client to resend the request and adhere to one of the authentication protocols.

If the request is generated using the .NET Framework WebClient/WebRequest classes, this exchange is hidden from the application. The framework responds to the 401 Unauthorized on its own by choosing a compatible authentication protocol and resubmitting the original request with the extra Authorization header. The request does incur an extra round trip, but that too can be avoided on subsequent requests by setting the PreAuthenticate property of the WebRequest class to true.

Select Windows Authentication, Advanced Settings..., and ensure that Extended Protection is Off.

Somewhat simplified, when enabling Extended Protection for authentication, i.e., setting the option to Accept or Required, a kind of two-factor authentication against man-in-the-middle attacks is enabled, but only when communication happens over a TLS channel, such as over HTTPS.

As part of the TLS handshake between two endpoints, a TLS session key is generated. This key is included as part of the encrypted Kerberos ticket if client and server support it, regardless of whether Extended Protection is enabled. By enabling Extended Protection, IIS ensures that the actual TLS session key is equal to the TLS session key in the Kerberos ticket. If the TLS session keys don't match, an intermediate host must exist between client and server.

Introducing a load balancer between client and service effectively makes the load balancer a man-in-the-middle (albeit a friendly one). One TLS connection exists between client and load balancer and another between load balancer and service. Each connection has a different TLS session key, causing authentication to fail if Extended Protection is enabled. In other words, Extended Protection is for use only when no load balancer is involved.

Select Windows Authentication, Advanced Settings..., and ensure that Kernel Mode Authentication is disabled.

Kernel Mode Authentication makes the http.sys kernel mode driver responsible for decrypting and caching Kerberos tickets. With Kernel Mode Authentication, by default decryption can only succeed when the host name and DNS A record for the service are the same or if no DNS entry is present.

In that case, decryption works because a Windows server joining an Active Directory domain will automatically register an SPN with a special machine account and machine name. Given that http.sys runs under machine credentials, it can decrypt the ticket for that machine name using the machine credentials. In comparison, with regular accounts, decryption typically happens using a hash of the app pool account's password.

By either disabling Kernel Mode Authentication, or configuring Kernel Mode Authentication to use app pool credentials, authentication works when a load balancer is involved. For low-traffic web apps, disabling Kernel Mode Authentication is probably the simplest option and will have negligible performance impact.

Under Providers, ensure the order is Negotiate followed by NTLM. That way authentication selects between Kerberos and NTLM, in that order, depending on what's available with the client. Strictly speaking, NTLM is now included twice, but as part of different providers. NTLM, by the way, is still both valid and usable in single-hop scenarios.
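Because a deployment can silently reset these settings, a scripted check of the configuration is useful. The sketch below inspects the authentication section of a web.config for the settings listed in the steps above. It assumes the standard IIS element and attribute names (anonymousAuthentication, windowsAuthentication, useKernelMode, useAppPoolCredentials, extendedProtection tokenChecking, providers) and that the section is present in the file being checked. On many servers these settings live in applicationHost.config instead, in which case that file would need to be inspected. The function name and messages are made up for illustration.

```python
import xml.etree.ElementTree as ET


def check_auth_settings(config_xml: str) -> list[str]:
    """Return a list of deviations from the settings recommended in the steps above."""
    root = ET.fromstring(config_xml)
    auth = root.find(".//system.webServer/security/authentication")
    if auth is None:
        return ["no system.webServer authentication section; settings are inherited"]

    problems = []
    anon = auth.find("anonymousAuthentication")
    if anon is None or anon.get("enabled", "").lower() != "false":
        problems.append("Anonymous Authentication is not explicitly disabled")

    win = auth.find("windowsAuthentication")
    if win is None or win.get("enabled", "").lower() != "true":
        problems.append("Windows Authentication is not explicitly enabled")
        return problems

    # Kernel mode is on unless switched off; app pool credentials are off unless switched on.
    kernel_mode = win.get("useKernelMode", "true").lower() == "true"
    app_pool_creds = win.get("useAppPoolCredentials", "false").lower() == "true"
    if kernel_mode and not app_pool_creds:
        problems.append("Kernel Mode Authentication is on without app pool credentials")

    protection = win.find("extendedProtection")
    if protection is not None and protection.get("tokenChecking", "None") != "None":
        problems.append("Extended Protection is not Off")

    providers = [item.get("value") for item in win.findall("providers/add")]
    if providers and providers != ["Negotiate", "NTLM"]:
        problems.append(f"provider order is {providers}, expected Negotiate then NTLM")
    return problems


# Hypothetical configuration matching the steps above.
SAMPLE_CONFIG = """<configuration>
  <system.webServer>
    <security>
      <authentication>
        <anonymousAuthentication enabled="false" />
        <windowsAuthentication enabled="true" useKernelMode="false">
          <extendedProtection tokenChecking="None" />
          <providers>
            <clear />
            <add value="Negotiate" />
            <add value="NTLM" />
          </providers>
        </windowsAuthentication>
      </authentication>
    </security>
  </system.webServer>
</configuration>"""
```

Running check_auth_settings on the deployed web.config after each release gives an early warning when, for instance, a vanilla web.config has re-enabled Anonymous Authentication.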

Installing the DelegConfig test service into our newly created web application now makes end-to-end validation of Kerberos authentication with delegation possible. DelegConfig offers a way to issue HTTP GET requests to a URL of our choosing while passing along the client's Kerberos ticket and displaying the response. On the Test.aspx page, we can issue a request to SharePoint, asking for details on the currently logged-in user (via https://server/path-to-web/_api/Web/CurrentUser). If Kerberos with delegation is correctly configured, the page should display information about the user who is accessing SharePoint through DelegConfig.

Conclusion

In this post, we detailed the typical issues that one encounters when getting Kerberos authentication with delegation up and running. If Kerberos still fails to work, besides testing with DelegConfig, make sure to always clear the Kerberos ticket cache using the klist command (klist purge). In addition, Fiddler and psping come in handy: Fiddler for viewing request/response headers and HTTP status codes, and psping for ensuring unrestricted access to any host and port required by Kerberos and the service. For lower-level debugging of Kerberos, the server's Event Viewer (under Windows Logs, Security) is invaluable, possibly in combination with registry tweaks to increase the Kerberos log level and with Wireshark to capture traffic on the wire. Another handy resource for debugging is Configuring Kerberos authentication for SharePoint 2010 products. Many of the considerations listed in that document are equally applicable outside SharePoint.