Mapping application components to Azure Active Directory

21 Oct 2018

Deploying a real-world application with components secured by Azure Active Directory (AAD), Azure's AAD app registration page can be a confusing place. It provides minimal guidance on application types and how many applications to register. To add to the confusion, many blog posts focus on simple scenarios and unless one has a good understanding of OAuth2 and OpenID Connect, arriving at a secure solution may be tricky.

This post provides guidance on the type and number of registrations needed. It then details how a real-world use case affects the choice of client authentication library and the parts of OAuth2 and OpenID Connect involved. The focus is on adding perspective and providing links to hands-on material with diagrams and code, not on providing a setup guide.

Real-world use case

Consider the following application implemented as a distributed set of components:

Angular app making secured calls to a custom REST service.
Native iOS app making secured calls to the same REST service.
ASP.NET Web API implementing the REST service.
ASP.NET Web API making secured calls to MS Graph.

When these components are secured by AAD, end-users sign-in or authenticate to (1) and (2) with their AAD credentials. AAD issues signed tokens which (1) and (2) passes along with calls to (3) which similarly requests a signed token and passes it along with calls to (4).

Reading along, keep in mind that authorization is orthogonal to authentication and that authorization requires authentication. OAuth2 is an authorization standard. It's OpenID Connect, the layer on top of OAuth2 that implements authentication. In other words, (1) and (2) passing OAuth2 access tokens to (3) authorizes that (1) and (2) may access (3) on the end-user's behalf, with the permissions consented to by the end-user and possibly an admin user. The OpenID Connect ID token (see primer on OpenID Connect, parts 1, 2, 3), is what authenticates an end-user to (1) and (2). It holds username, full name, and the like, but is never passed to (3).

Types of app registrations

Creating a new App registration inside the Azure portal, for the application type we have a choice between "Web app/API" and "Native". These map to the three OAuth2 client types of web application, user-agent-based application (a fancy term for a browser hosting a JavaScript application), and native application.

With AAD, the Angular app maps to "Web app/API" with oauth2AllowImplicitFlow turned on in the registration manifest (this doesn't disable other grant types), iOS app maps to "Native" and Web API maps to "Web app/API". But does that mean that Angular and Web API should run under the same registration?

Number of app registrations

Perhaps due to simple use cases, many blog posts share the app registration between web service and single page app. For our real-world use case, reusing the app registration is at best a shortcut to avoid an additional app registration and at worst a potential security risk:

Almost always should the Angular app have its own app registration. Only a small, self-contained Angular app + Web API, designed to work only with each other, without integrating with other APIs, should share a registration.
Authorization on a per-application basis doesn't apply when Angular app and Web API share a single app registration. With both having the same client ID and hence the same permissions, to a downstream API such as MS Graph, Angular app and Web API share access rights and are indistinguishable.
Frontend and backend are distinct components of an application, created with distinct technologies, and with distinct authorization needs. The two are likely deployed to different Azure app service instances and live in different source code repositories and have separate life-cycles. Having the two share an app registration is perhaps reminiscent of older postback-driven apps with occasional JavaScript.
An Angular app may be consuming many endpoints: one or more of our Web APIs and MS Graph, SharePoint, and so on. In this case, we shouldn't tie Angular to any one API by sharing its app registration. Especially since an app registration carry no monetary cost.
App registration metadata differ between an Angular app and a Web API. Maybe not in a significant way at first, but over time they're likely to drift apart. At that point, we must either split the registration, requiring end-user consent once again or admin consent on end-users behalf, or make the single registration contain settings of both, if possible.
By design, an Angular app uses the implicit grant type (of authorization flow or method of acquiring an access token) whereas Web API doesn't. An Angular app also requires advance registration of reply URLs to prevent a rogue app from requesting a token from AAD and have the token sent to its reply URL. Over time an Angular app will likely access a different set of APIs than any one Web API. In our use case for instance, only Web API calls MS Graph. Thus, Angular app and Web API have registration settings specific to each, meaning they should have separate app registrations.

Angular: ADAL.js, MSAL.js, and the implicit grant type

To support our use case, we're forced into using ADAL.js over MSAL.js as client authorization library (through a wrapper as in this example). To quote Microsoft's comparison of v1.0 and v2.0 endpoint capabilities:

You can use the v2.0 endpoint [with MSAL.js, not ADAL.js] to build a Web API that is secured with OAuth 2.0. However, that Web API can receive tokens only from an application that has the same Application ID [AAD term for client ID]. You cannot access a Web API from a client that has a different Application ID. The client won't be able to request or obtain permissions to your Web API.

It's an inherent limitation with MS' OAuth2 implementation and not the OAuth2 standard itself. In our case, we have separate app registrations for iOS and Angular, causing their application IDs to differ.

Despite known security concerns, ADAL.js implements authorization using the implicit grant type. The authorization endpoint returns ID token and access token on the less secure front channel and ADAL.js saves those to browser storage, available to any JavaScript on the page. Instead of a refresh token, ADAL.js periodically requests new ID tokens and on-demand requests new access tokens. A refresh token is by design longer lived than both ID token and access token and would pose a security risk if returned with the implicit grant type.

Without prompting the user for login, ADAL.js accomplishes periodic refresh by adding an invisible iframe element to the DOM, setting its source to the authorization server token request URL. The request includes any cookies for the domain, including the authorization server session cookie (actually a collection of cookies) received during the previous login. The authorization server then directs the browser, i.e., only the page inside the iframe, back to the redirect URL where ADAL.js checks the response and saves the new ID token to browser storage.

Access token refresh follows the same procedure, but on-demand, as storing an always valid access token in the browser would pose a security risk. For as long as the authentication server session cookie is valid, and kept alive, ADAL.js can request new tokens without user interaction.

We can observe the refresh procedure by keeping the Angular app idle and Chrome developer tools Network tab open. ADAL.js has refresh hardcoded to occur every 3,600 seconds, considering the ID token lifetime of 3,900 seconds.

iOS: ADAL for Objective C and the authorization code grant type

Compare the implicit grant type to the authorization code grant type used by the iOS app (implemented by ADAL for Objective C). On the front channel, a browser control transmits a single use, short-lived authorization code. This works by ADAL registering a scheme handler specific to the iOS app and matching the scheme registered with AAD. When the authorization server redirects the browser control to myapp://..., the app receives the code as part of the URL. The app then connects to the authorization server token endpoint over the more secure back channel and exchanges the authorization code (and later refresh token), for ID token, access token, and refresh token.

Like with implicit grant, because the authorization code grant originates from a public client (see OAuth2 client types), not even on the back channel does the client authenticate with the authorization server. Any client in possession of the code can exchange it for tokens. Client authentication would involve submitting with the code and client ID a client secret, shared between the client and authorization server. But since anyone would be able to discover the secret of a public client, and the secret isn't easily changed after deployment, public client authentication is pointless. Only a confidential client, code running on a server, authenticates using a secret.

To convince ourselves that iOS is using the authorization code grant type, we can turn on diagnostics logging. Log entries show scheme, authorization code, and its exchange for tokens. Otherwise, we'd only know that implicit grant type wasn't used because the iOS app registration has oauth2AllowImplicitFlow turned off.

Web API: MS Graph and the client credentials grant type

Besides Web API being an AAD secured endpoint, and hence the receiver of access tokens, for reasons listed below, Web API itself makes REST calls to MS Graph, another AAD secured endpoint:

To lookup security group memberships of authenticated users making Web API calls. When Angular requests an ID token and access token using the implicit grant type and the user is a member of more than (currently) five security groups, AAD by design substitutes the groups claim for a hasgroups claim with a value of true. It's up to Web API to inspect the hasgroups claim of the received token and if true, retrieve the user's groups from MS Graph. (Requires delegated or application permissions.)
To lookup an ObjectID given a username provided by Angular when one user assigns another user access to the app. Within AAD, ObjectID is the equivalent of AD's security identifier, i.e., both a GUID by which Web API identifies a user and a key by which it stores user information. Inside Web API, as users leave an organization and are disabled in AAD, we want to minimize dangling user references (being unable to resolve accounts and display names). That's why, besides ObjectID, Web API also stores the user's account and display names. (Requires delegated or application permissions.)
To periodically execute non-interactive batch queries against AAD to identify disabled users. With hundreds of app users, we require a notification mechanism to re-assign disabled users' active tasks. Time will tell if MS Graph queries triggered by each Web API request becomes a bottleneck, in which case batch querying or caching users and groups is an option. (Requires application permissions.)

Following each need for making MS Graph calls is the permissions required (actually permission type or grouping as it's the container for permissions). Application refers to Web API making service-to-service MS Graph calls with its own identity and delegated that Web API makes service-to-service MS Graph calls on behalf of the signed-in user calling Web API.

The hasgroups claim substitution mentioned above occurs due to a limitation in the length of the URL by which the implicit grant type returns its token to the client. The token comes back Base64 encoded in the URL fragment and a larger claim translate to a larger payload. Tokens returned through the authorization code grant type aren't subject to this URL length limitation as they're returned in the response body.

Application or delegated permissions meet two of the needs and application only one. Whether to go with fine-grained permissions and adhering to the principle of least privilege by mixing application and delegated permissions, or using only application permissions is a matter of preference: Web API is a confidential client, both application and delegated permissions require admin consent in this case, and Web API would require application permission to meet every need. Thus we go with application permissions and issue MS Graph calls as Web API, authorized using the client credentials grant type (regardless if it's an application or delegated call, it's still service-to-service).

For Web API to request an access token for MS Graph, it submits to the token endpoint its client ID, a pre-established shared client secret, and the resource ID of MS Graph. Had we used delegated permissions, we'd have to use the on-behalf-of grant type. Before calling MS Graph, Web API would have to call the token endpoint to exchange the user provided access token for one supporting delegation as shown here.

Within the Web API app registration setup, under required permissions, we add MS Graph and enable Read directory data under application permissions.

Web API: token validation

With our use case, Web API is a receiver of access tokens from Angular and iOS apps. Each such token consists of header, payload, and signature parts, and Web API handles both validating claims and signature. From the payload, the issuer (iss) claim must be our AAD tenant as defined in the OpenID Connect federation metadata document and the audience (aud) claim, the intended recipient, must be our Web API. As for the token's lifetime, the not before (nbf) and expiration time (exp) claims must be valid.

In addition, signature validation must adhere to AAD's key rotation policy, stating that

At any time, Azure AD may sign an id_token [the OpenID Connect token or an access token] using any one of a certain set of public-private key pairs. Azure AD rotates the possible set of keys on a periodic basis, so your app should be written to handle those key changes automatically. A reasonable frequency to check for updates to the public keys used by Azure AD is every 24 hours.

Within the metadata document, jwks_uri points to the location of the JSON Web Key Set, the public keys needed to verify the signature. Defined in the header's Key ID (kid) claim is the key used. For the X509 key details, such as expiration date, we can paste the value of each x5c into an online decoder.

To get going with ASP.NET Core token validation, this tutorial and this Github issue provide good starting points. Key to robust token validation is to acknowledge that the authorization server will rotate keys and that Web API must check for updated keys. Configuring the JwtBearer middleware with the metadata discovery URL, it auto-configures and supports key rotation as well.

To understand how JWT middleware supports key rotation, two pieces are essential: (1) the ConfigurationManager class and its GetConfigurationAsync method. This tutorial shows calling the method in Startup.cs to HTTP GET the metadata discovery URL and through it the jwks_uri URL. Web API then parses their JSON payloads and caches those inside ConfigurationManager for one day as per the DefaultAutomaticRefreshInterval value, (2) the JwtBearerHandler class implementing the JwtBearer middleware calls GetConfigurationAsync on each token validation. When validation fails due to a missing key, the middleware asks ConfigurationManager to refresh its cache, subject to a default 30 seconds cache duration as per the DefaultRefreshInterval value.

In other words, logic around DefaultAutomaticRefreshInterval covers regular key rotations with at least one day notice before use and never causes token validation to fail. Logic around DefaultRefreshInterval covers emergency key rotation and does cause token validation to fail for up to 30 seconds. An acceptable period of downtime given that keys may have been compromised, potentially leaving the application open to a third-party.

Additional resources

An IT pros guide to Open ID Connect, OAuth 2.0 with the V1 and V2 Azure Active Directory endpoints. An overview talk focused on AAD and how it relates to OAuth2 and Open ID Connect. At around 28m50s it compares OAuth2 and AAD terminology: the authorization server is AAD, client (Angular, iOS, Web API) is an app in AAD, resource server (Web API, MS Graph) is an app in AAD, and resource owner is a user in AAD.
Troubleshooting OpenID Connect and OAuth2.0 protocols on Azure Active Directory. Talk focuses on troubleshooting the OpenID Connect hybrid grant type, a combination of the authorization code and implicit code grant types, useful when frontend should have immediate access to short-lived ID token while backend exchanges authorization code for longer lived refresh token. The talk introduces Postman and Fiddler of which the latter is useful only until AAD rolls out certificate pinning. With Fiddler, we may use the OpenID Connect Inspector (dev branch) plugin to peek inside tokens. The talk examines in detail requests, responses, and consents. At 40m03s, a Fiddler breakpoint is set to change the response from AAD before it's passed to the app to verify that the app implements proper token validation.
OAuth 2.0 and OpenID Connect (in plain English). Covers the history and use cases leading up to the OAuth2.0 and OpenID Connect standards and explains in-depth the URL components that go into the common grant types. With OAuth 2.0 allowing for implementation dependent choices, a non-Microsoft view helps better understand how the standards and AAD concepts work together.

Summary

To support the diverse needs of our components, now and in the future, we ended up with three app registrations. Besides registration details, to understand what's going on as part of each component's authentication and authorization cycle, the post outlined relevant parts of the OAuth2 and OpenID Connect standards. Without such knowledge, debugging often amounts to time consuming guess work.