ClrMD: Recreating .NET objects from an Azure App Service memory dump

09 Aug 2017

Part 1: WinDbg: Recreating .NET objects from an Azure App Service memory dump
Part 2: ClrMD: Recreating .NET objects from an Azure App Service memory dump

In the previous post, we concluded that while WinDbg is well-suited for exploring and querying an object graph, scripting involves parsing text output from commands. As a way around text parsing, in this post we introduce the Microsoft Diagnostics Runtime, or ClrMD, a framework for loading and inspecting .NET memory dumps from within a .NET application.

WinDbg vs ClrMD

ClrMD supports loading and querying a dump file in much the same way that ADO.NET supports loading and querying an opaque MS SQL database file. Another way to think of WinDbg versus ClrMD is as the difference between cmd and PowerShell: PowerShell returns objects rather than strings, which comes in handy when processing the result. Instead of text, ClrMD returns meta-objects that represent a .NET application frozen in time and that can be queried using standard programming constructs.

Locating instances of the Bugfree.Spo.Analytics.Cli.Domain+Visit type on the heap is simple enough with ClrMD, but extracting field values from an object isn't. For starters, querying requires knowing whether a field holds a value type or a reference type and whether it's a primitive or a complex type. As with WinDbg, ClrMD understands only primitive types. With Guid, DateTime, IPAddress, and FSharpOption, we must query internal field values and call the appropriate constructor. For the Guid type, this amounts to querying the values of internal fields _a through _k and passing those to the Guid constructor.
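To illustrate that last step in isolation, here is a minimal sketch, not taken from the original application, of turning raw field values into Guid and DateTime instances. It assumes the values have already been read from the heap by whatever means. The Guid part relies on the public eleven-argument Guid constructor. The DateTime part assumes the .NET Framework layout, in which a single 64-bit field holds the ticks in its lower 62 bits and the kind in its upper two bits; that layout is a runtime implementation detail and an assumption here. HeapValueSketch and its method names are hypothetical.

```csharp
using System;

// Hypothetical helper: rebuilds values from raw fields read out of a dump.
static class HeapValueSketch {
    // Maps the internal Guid fields _a.._k onto the public constructor.
    public static Guid GuidFromFields(
        int a, short b, short c,
        byte d, byte e, byte f, byte g, byte h, byte i, byte j, byte k) {
        return new Guid(a, b, c, d, e, f, g, h, i, j, k);
    }

    // Assumption: lower 62 bits are ticks, upper 2 bits encode the kind
    // (0 = unspecified, 1 = UTC, 2 and 3 = local).
    public static DateTime DateTimeFromDateData(ulong dateData) {
        const ulong TicksMask = 0x3FFFFFFFFFFFFFFF;
        long ticks = (long)(dateData & TicksMask);
        DateTimeKind kind;
        switch ((int)(dateData >> 62)) {
            case 1: kind = DateTimeKind.Utc; break;
            case 2:
            case 3: kind = DateTimeKind.Local; break;
            default: kind = DateTimeKind.Unspecified; break;
        }
        return new DateTime(ticks, kind);
    }
}
```

Keeping the reconstruction separate from the heap access makes it possible to check the bit-twiddling against known values before pointing it at a real dump.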

Sometimes ClrMD's API returns null values where they're unexpected, and it's all too easy to read the wrong memory location or misinterpret the result, e.g., to treat random memory content as an object of some type and end up with an object initialized from garbage.

ClrMD.Extensions

ClrMD.Extensions builds on top of ClrMD and makes querying fields intuitive. The extensions support directly constructing objects of type Guid, IPAddress, and DateTime from heap objects. As another benefit, we write much less code than with plain ClrMD.

The only immediate downside to ClrMD.Extensions is that it doesn't come as a NuGet package. One must clone the GitHub repository, build the code, and reference the ClrMD.Extensions and Microsoft.Diagnostics.Runtime DLLs from its bin/debug folder.

Extracting and replaying visits

The following code is part of a console application (available on GitHub alongside the analytics web application). It's all it takes to query the dump, extract visit values, create Visit objects from these values, and replay the visits by posting them to the message queue:

using System;
using System.Net;
using System.Linq;
using System.Threading;
using System.Collections.Generic;
using Microsoft.FSharp.Core;
using ClrMD.Extensions;
using static Bugfree.Spo.Analytics.Cli.Domain;
using static Bugfree.Spo.Analytics.Cli.Agents;

namespace Bugfree.Spo.Analytics.MemoryDumpProcessor {
    class Program {
        static void Main() {
            var visits = new List<Visit>();
            using (ClrMDSession session = ClrMDSession.LoadCrashDump(@"C:\AzureDump\Bugfree.Spo.Analytics.Cli-d3c510-07-25-13-08-00.dmp")) {
                foreach (ClrObject o in session.EnumerateClrObjects("Bugfree.Spo.Analytics.Cli.Domain+Visit")) {
                    // Optional fields: a missing value comes back as null.
                    var pageLoadTime = (int?)o["PageLoadTime@"]["value"].SimpleValue;
                    var userAgent = (string)o["UserAgent@"]["value"].SimpleValue;
                    var v = new Visit(
                        (Guid)o["CorrelationId@"],
                        (DateTime)o["Timestamp@"],
                        (string)o["LoginName@"],
                        (string)o["SiteCollectionUrl@"],
                        (string)o["VisitUrl@"], 
                        pageLoadTime == null ? FSharpOption<int>.None : new FSharpOption<int>(pageLoadTime.Value),
                        (IPAddress)o["IP@"],
                        userAgent == null ? FSharpOption<string>.None : new FSharpOption<string>(userAgent));
                    visits.Add(v);
                }

                // Enumerating the heap doesn't preserve allocation order. Hence we impose an
                // order using the visit's timestamp for easier inspection.
                foreach (var v in visits.OrderBy(visit => visit.Timestamp)) {
                    visitor.Post(VisitorMessage.NewVisit(v));
                }

                // Visitor mailbox processor processes messages/visits on a separate thread. 
                // We must wait for the thread to finish processing before terminating the 
                // program.
                while (true) {
                    var l = visitor.CurrentQueueLength;
                    Console.WriteLine($"Queue length: {l}");
                    Thread.Sleep(5000);

                    if (l == 0) {
                        break;
                    }
                }

                Console.ReadKey();
            }
        }
    }
}

This code fulfills our purpose but only scratches the surface of what's possible with ClrMD and its extensions. Because we have access to mostly the same data structures as SOS, besides walking the heap, we could count instances of each type to look for memory leaks, inspect the finalization queue, determine how objects root one another, and so on. In fact, Microsoft tools such as DebugDiag and PerfView make use of ClrMD under the hood.
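As a rough illustration of the instance-counting idea, the sketch below groups objects by type name and totals their sizes, largest first. It is deliberately independent of ClrMD: it assumes the (type name, size in bytes) pairs have already been collected while walking the heap, and TypeStat and HeapHistogramSketch are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical per-type summary row.
sealed class TypeStat {
    public string TypeName;
    public int Count;
    public ulong TotalBytes;
}

static class HeapHistogramSketch {
    // Input: one (type name, object size) pair per heap object.
    // Output: one row per type, ordered by total size, then by name.
    public static List<TypeStat> Summarize(
        IEnumerable<KeyValuePair<string, ulong>> objects) {
        return objects
            .GroupBy(o => o.Key)
            .Select(g => new TypeStat {
                TypeName = g.Key,
                Count = g.Count(),
                TotalBytes = g.Aggregate(0UL, (sum, o) => sum + o.Value)
            })
            .OrderByDescending(s => s.TotalBytes)
            .ThenBy(s => s.TypeName, StringComparer.Ordinal)
            .ToList();
    }
}
```

Comparing such summaries from two dumps taken some time apart is one way to spot types whose instance count keeps growing.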

Conclusion

With the WinDbg output and object graph in mind, the code ought to be fairly easy to follow. By having the console application share configuration settings with the running web application, it's able to behave as the web application with respect to replaying visits. With the console application, we were able to replay the 423k visits, having each visit go through the same pipeline as current visits.

In a way, we've used the memory dump as a message queue persistence mechanism.

Have a comment or question? Please drop me an email or tweet to @ronnieholm.