When we work on typical day to day applications, it's easy to forget some of the core features of the .NET framework. For me personally it's been a long time since I've learned about some of the underlying CLR system level services even though I rely on them on a daily basis. I often think only about high level application constructs and/or high level framework functionality, but the low level stuff is often just taken for granted.

Over the last week at DevConnections I had all sorts of low level discussions with other developers about the inner workings of this or that technology (especially in light of my Low Level ASP.NET Architecture talk and the Razor Hosting talk). One topic that came up a couple of times and ended up a point of confusion even amongst some seasoned developers (including some folks from Microsoft <snicker>) is when assemblies actually load into a .NET process.

There are a number of different ways that assemblies are loaded in .NET. When you create a typical project, assemblies usually come from:

  • The Assembly reference list of the top level 'executable' project
  • The Assembly references of referenced projects
  • Dynamically loaded assemblies, using runtime loading via AppDomain or Reflection loading

In addition .NET automatically loads mscorlib (most of the System namespace) as part of the .NET runtime hosting process that hoists up the .NET runtime in EXE apps, or some other kind of runtime hosting environment (runtime hosting in servers like IIS, SQL Server or COM Interop). In hosting environments the runtime host may also pre-load a bunch of assemblies on its own (for example the ASP.NET host requires all sorts of assemblies just to run itself, before ever routing into your user specific code).

Assembly Loading

The most obvious source of loaded assemblies is the top level application's assembly reference list. You can add assembly references to a top level application and those assembly references are then available to the application.

In a nutshell, referenced assemblies are not immediately loaded - they are loaded on the fly as needed. Regardless of whether an assembly is referenced by the top level project or by a dependent assembly, it typically loads on an as-needed basis, unless it's explicitly loaded by user code.

To check this out I ran a simple test: I have a utility assembly Westwind.Utilities which is a general purpose library that can work in any type of project. Due to a couple of small requirements for encoding and a logging piece that allows logging Web content (dependency on HttpContext.Current) this utility library has a dependency on System.Web. Now System.Web is a pretty large assembly and generally you'd want to avoid adding it to a non-Web project if it can be helped.

So I created a Console Application that loads my utility library:

[Screenshot: assembly reference lists of the Console project and the Westwind.Utilities project]

You can see that the top level Console app has a reference to Westwind.Utilities and System.Data (beyond the core .NET libs). The Westwind.Utilities project on the other hand has quite a few dependencies including System.Web.

I then add a main program that accesses only a simple utility method in the Westwind.Utilities library that doesn't require any of the classes that access System.Web:

        static void Main(string[] args)
        {
            Console.WriteLine(StringUtils.NewStringId());            
            Console.ReadLine();
        }

StringUtils.NewStringId() calls into Westwind.Utilities, but it doesn't rely on System.Web.

Any guesses what the assembly list looks like when I stop the code on the ReadLine() command?

I'll wait here while you think about it…

So, when I stop on ReadLine() and then fire up Process Explorer and check the assembly list I get:

[Screenshot: loaded assembly list in Process Explorer]

We can see here that .NET has not actually loaded any of the dependencies of the Westwind.Utilities assembly. Also not loaded is the top level System.Data reference even though it's in the dependent assembly list of the top level project. Since this particular function I called only uses core System functionality (contained in mscorlib) there's in fact nothing else loaded beyond the main application and my Westwind.Utilities assembly that contains the method accessed. None of the dependencies of Westwind.Utilities loaded.

If you were to open the assembly in a disassembler like Reflector or ILSpy, you would however see all the compiled in dependencies. The referenced assemblies are in the dependency list and they are loadable, but they are not immediately loaded by the application.

In other words the C# compiler and the .NET runtime between them are smart enough to figure out the dependencies based on the code that is actually used by your application, including dependencies that cascade down from your top level application into the referenced assemblies. The compiler records only the assemblies your code actually references in the compiled assembly, and the runtime then loads each of those assemblies only when code that needs it is about to run. In the example above the usage requirement is pretty obvious since I'm only calling a single static method and then exiting the app. In more complex applications these dependency relationships become very complicated, but it's all taken care of for you: only those assemblies that are in fact used by your code, or required by any of your dependencies, end up loaded.

The good news here is that if you are referencing an assembly that has a dependency on something like System.Web in a few places that are not actually accessed by any of your code or any dependent assembly code that you are calling, that dependency is never loaded into memory!
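To see the difference between "referenced" and "loaded" without a disassembler, here's a minimal sketch that prints an assembly's compiled-in references next to their current load state. The ReferenceReport class and its method names are made up for illustration; the only framework calls it relies on are Assembly.GetReferencedAssemblies() and AppDomain.CurrentDomain.GetAssemblies().

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper, not part of Westwind.Utilities.
public static class ReferenceReport
{
    // Names recorded in the assembly's manifest (what a disassembler shows).
    public static string[] Referenced(Assembly assembly)
    {
        return assembly.GetReferencedAssemblies()
                       .Select(n => n.Name)
                       .ToArray();
    }

    // The subset of those references that is loaded in this AppDomain right now.
    public static string[] ReferencedAndLoaded(Assembly assembly)
    {
        var loaded = AppDomain.CurrentDomain.GetAssemblies()
                              .Select(a => a.GetName().Name);
        return Referenced(assembly).Intersect(loaded).ToArray();
    }

    // Prints one line per reference with its current load state.
    public static void Print(Assembly assembly)
    {
        var loaded = ReferencedAndLoaded(assembly);
        foreach (var name in Referenced(assembly))
        {
            Console.WriteLine("{0,-40} {1}", name,
                loaded.Contains(name) ? "loaded" : "not loaded");
        }
    }
}
```

Calling ReferenceReport.Print(typeof(StringUtils).Assembly) from the Console app should list System.Web among the references of Westwind.Utilities while reporting it as not loaded, assuming nothing else in the process has touched it yet.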

Some Hosting Environments pre-load Assemblies

The load behavior can vary however. In Console and desktop applications we have full control over assembly loading so we see the core CLR behavior. Other environments, like ASP.NET for example, pre-load referenced assemblies explicitly as part of the startup process - primarily to minimize load conflicts. Specifically ASP.NET pre-loads all assemblies referenced in the GAC assembly list and the /bin folder. So in Web applications it definitely pays to remove top level assembly references that are not used.

Understanding when Assemblies Load

To clarify what I described in the first example and see it actually happen, let's look at a couple of other scenarios. To see assemblies loading at runtime in real time let's create a utility function to print out loaded assemblies to the console:

    public static void PrintAssemblies()
    {
        var assemblies = AppDomain.CurrentDomain.GetAssemblies();
        foreach (var assembly in assemblies)
        {
            Console.WriteLine(assembly.GetName());
        }
    }

Now let's look at the first scenario, where I have a class method that internally uses System.Web. Let's add a method to my main program like this:

        static void Main(string[] args)
        {
            Console.WriteLine(StringUtils.NewStringId());
            Console.ReadLine();
            PrintAssemblies();
        }

        public static void WebLogEntry()
        {
            var entry = new WebLogEntry();
            entry.UpdateFromRequest();
            Console.WriteLine(entry.QueryString);
        }

UpdateFromRequest() internally accesses HttpContext.Current to read some information from the ASP.NET Request object so it clearly needs a reference to System.Web to work. In this first example, the method that holds the calling code is never called, but exists as a static method that can potentially be called externally at some point.

What do you think will happen here with the assembly loading? Will System.Web load in this example?

[Screenshot: console output listing the loaded assemblies]

No - it doesn't. Because the WebLogEntry() method is never called by the mainline application (or anywhere else) System.Web is not loaded. .NET dynamically loads assemblies as code that needs them is called. No code references the WebLogEntry() method and so System.Web is never loaded.

Next, let's add the call to this method, which should trigger System.Web to be loaded because a dependency exists. Let's change the code to:

    static void Main(string[] args)
    {
        Console.WriteLine(StringUtils.NewStringId());

        Console.WriteLine("--- Before:");
        PrintAssemblies();

        WebLogEntry();

        Console.WriteLine("--- After:");
        PrintAssemblies();
        Console.ReadLine();
    }

    public static void WebLogEntry()
    {
        var entry = new WebLogEntry();
        entry.UpdateFromRequest();
        Console.WriteLine(entry.QueryString);
    }

Looking at the code now, when do you think System.Web will be loaded? Will the before list include it?

[Screenshot: console output with the Before and After assembly lists]

Yup, System.Web gets loaded, but only after it's actually referenced. In fact, right up until the call to UpdateFromRequest() System.Web is not loaded - it only loads when the method is actually called and requires the reference in the executing code.
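If you'd rather watch loads as they happen instead of comparing before and after lists, the AppDomain.AssemblyLoad event fires each time an assembly is loaded into the domain. The sketch below is only one way to wire that up; LoadWatcher and its Seen list are hypothetical names I'm using for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that logs every assembly load in the current AppDomain.
public static class LoadWatcher
{
    // Simple names of assemblies loaded after Start() was called.
    public static readonly List<string> Seen = new List<string>();

    public static void Start()
    {
        AppDomain.CurrentDomain.AssemblyLoad += (sender, args) =>
        {
            var name = args.LoadedAssembly.GetName().Name;
            lock (Seen)
            {
                Seen.Add(name);
            }
            Console.WriteLine(">> loaded: " + name);
        };
    }
}
```

With LoadWatcher.Start() as the first line of Main(), I'd expect the System.Web line to show up between the Before and After lists, at the point where the code that needs it first runs. Assemblies that were already loaded before Start() was called are not reported.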

Assembly Unloading

As a side note, when an assembly is loaded in .NET, it loads into an AppDomain, and can never be unloaded from that AppDomain. That's as in never ever. Meaning once you take the memory hit from the assembly loading, that memory can never be released. The only way .NET assemblies can be unloaded is by unloading the AppDomain. If you have applications that need to dynamically load assemblies (like a hosting or scripting engine or plug-ins for example), it's a good idea to load assemblies into a separate AppDomain that can be unloaded when you're done, or that optionally allows occasional unloading to minimize memory usage.

The fact that you can't unload assemblies is one of the reasons why discussions about assembly loading and trying to avoid loading unnecessary stuff usually come up in the first place :-)
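As a rough sketch of the separate AppDomain approach: the code below creates a scratch domain, loads an assembly inside it and then unloads the whole domain. It assumes the full .NET Framework (AppDomain.CreateDomain is not supported on .NET Core and later), and IsolatedLoader, the domain name and the "assemblyToLoad" key are all names I made up for illustration. The load happens inside a DoCallBack() callback so that no Assembly reference travels back to the calling domain, which would pull the assembly into that domain as well.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper; requires the full .NET Framework.
public static class IsolatedLoader
{
    // Runs inside the scratch domain: the assembly is loaded there,
    // not in the domain that created it.
    public static void LoadInThisDomain()
    {
        var name = (string)AppDomain.CurrentDomain.GetData("assemblyToLoad");
        Assembly.Load(name);
        Console.WriteLine("Loaded in '{0}': {1}",
            AppDomain.CurrentDomain.FriendlyName, name);
    }

    // Returns true if the assembly is present in the calling domain afterwards.
    public static bool LoadAndUnload(string assemblyFullName)
    {
        AppDomain domain = AppDomain.CreateDomain("ScratchDomain");
        try
        {
            domain.SetData("assemblyToLoad", assemblyFullName);
            domain.DoCallBack(LoadInThisDomain);
        }
        finally
        {
            // Unloading the domain is the only way to drop what was loaded in it.
            AppDomain.Unload(domain);
        }

        var simpleName = new AssemblyName(assemblyFullName).Name;
        return AppDomain.CurrentDomain.GetAssemblies()
                        .Any(a => a.GetName().Name == simpleName);
    }
}
```

For example, IsolatedLoader.LoadAndUnload("System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a") should return false in a Console app that hasn't otherwise touched System.Web. Real plug-in hosts usually need more than this (MarshalByRefObject proxies, custom AppDomainSetup), so treat it as a starting point only.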

Moral of the Story

So what have we learned - or maybe remembered again?

  • Dependent Assembly References are not pre-loaded when an application starts (by default)
  • Dependent Assemblies that are not referenced by executing code are never loaded
  • Dependent Assemblies are just in time loaded when first referenced in code
  • Once Assemblies are loaded they can never be unloaded, unless the AppDomain that hosts them is unloaded.

All of this is nothing new - .NET has always worked like this. But it's good to have a refresher now and then and go through the exercise of seeing it work in action. It's not one of those things we think about every day, and as I found out last week, I couldn't remember exactly how it worked since it's been so long since I've learned about this. And apparently I'm not the only one as several other people I had discussions with in relation to loaded assemblies also didn't recall exactly what should happen or assumed incorrectly that just having a reference automatically loads the assembly.

The moral of the story for me is: Trying at all costs to eliminate an assembly reference from a component is not quite as important as it's often made out to be.

For example, the Westwind.Utilities module described above has a logging component, including a Web specific logging entry that supports pulling information from the active HTTP Context. Adding that feature requires a reference to System.Web. Should I worry about this in the scope of this library? Probably not, because if I don't use that one class of nearly a hundred, System.Web never gets pulled into the parent process. IOW, System.Web only loads when I use that specific feature and if I am, well I clearly have to be running in a Web environment anyway to use it realistically. The alternative would be considerably uglier: Pulling out the WebLogEntry class and sticking it into another assembly and breaking up the logging code. In this case - definitely not worth it.

So, .NET definitely goes through some pretty nifty optimizations to ensure that it loads only what it needs and in most cases you can just rely on .NET to do the right thing. Sometimes though assembly loading can go wrong (especially when signed and versioned local assemblies are involved), but that's a subject for a whole other post…