Emitting and Debugging MSIL
By Paul Kimmel


Some people feel that emitting code is akin to having superpowers. Find out why Paul’s powers include not just emitting code, but debugging it too!

Introduction

The .NET Framework is an excellent example of well-crafted software. I have said this many times before. It clearly borrows from the best of other computer languages (which I have also said). The challenge is that it is so big that, even though it is well crafted and organized, it is hard to discover everything useful. Having started using the .NET Framework in 1999, I am no longer surprised that testing, tweaking, bending, and flexing still turn up treasures.

While exploring DynamicMethod recently, I began re-exploring Reflection.Emit and writing MSIL emitters. Writing code that generates IL is a great way to really understand the plumbing underneath, and it can be a lot of fun, although challenging. During this exploration I stumbled upon System.Diagnostics.SymbolStore. This article shares a neat finding and demonstrates how to write a basic emitter.

Oh, the neat discovery… The neat discovery is that you can associate source code with emitted code after the emitted code has been written and even write emitters that permit you to map the source code to the emitted code, adding breakpoints through Reflection.Emit.

Writing a .NET Emitter with Reflection.Emit

CIL, or Common Intermediate Language, is the instruction set defined by the ECMA (European Computer Manufacturers Association) standard for .NET, ECMA-335. MSIL (Microsoft Intermediate Language) is what Visual Studio languages like C# and VB.NET produce. The two are essentially the same: CIL is the international standard and MSIL is Microsoft’s implementation of that standard.

The reason IL exists is so that the compilers for .NET languages can convert all source code into one common intermediate form, and a single just-in-time (JIT) compiler can then convert the IL to runnable code. The net result is that Microsoft and other compiler writers for .NET don’t have to write a separate native code generator and linker for every language based on .NET. There are other benefits too, such as code access security checks on potentially dangerous code before it is converted to executable form and allowed to run.

Learning IL is similar to learning assembly language and writing emitters can teach you a lot about how .NET really works. As a production way of developing code it is much more time consuming and labor intensive, much as developing software in assembler would be, but it’s still worth understanding at a minimum. The upside is that writing emitters will help you become an expert in understanding the framework.

Listing 1 contains a Hello World! emitter. The code generates a simple console application that prints text sent to it from the command line. What it also does is allow you to explore the framework a little more and implicitly shows you the relationships between .PDB debug and symbol files and source code in reverse.

Listing 1: A HelloWorld.exe emitter with a twist for debugging.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Reflection.Emit;
using System.Reflection;
using System.Diagnostics;
using System.Diagnostics.SymbolStore;

namespace DebuggingEmittedCode
{
  class Program
  {
    private delegate void HelloDelegate(string str);

    static void Main(string[] args)
    {
      DebugEmittedVersion();
      Console.ReadLine();
    }

    // devsource
    private static void DebugEmittedVersion()
    {
      AssemblyName name = new AssemblyName();
      name.Name = "HelloWorld";
      AppDomain domain = System.Threading.Thread.GetDomain();
      AssemblyBuilder builder = domain.DefineDynamicAssembly(
        name, AssemblyBuilderAccess.RunAndSave);
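      // Use the (name, fileName, emitSymbolInfo) overload so that Save
      // persists the module; the (name, emitSymbolInfo) overload defines
      // a transient module that is not written to disk.
      ModuleBuilder module = builder.DefineDynamicModule
        ("HelloWorld", "HelloWorld.exe", true);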
      
      
      ISymbolDocumentWriter doc =
        module.DefineDocument(@"..\..\Source.txt", Guid.Empty, 
          Guid.Empty, Guid.Empty);
      
      TypeBuilder typeBuilder = module.DefineType("MyType", 
        TypeAttributes.Public | TypeAttributes.Class);
      MethodBuilder methodBuilder = typeBuilder.DefineMethod("Main",
        MethodAttributes.HideBySig | MethodAttributes.Static 
        | MethodAttributes.Public,
        typeof(void), new Type[]{typeof(string[])});

      
      builder.SetEntryPoint(methodBuilder,
        PEFileKinds.ConsoleApplication);

      // add the parameter type and name it; this is a little clunky
      ParameterBuilder paramBuilder = methodBuilder.DefineParameter(
        1, ParameterAttributes.None, "args");
            
      ILGenerator ilGenerator = methodBuilder.GetILGenerator();
      
      ilGenerator.Emit(OpCodes.Nop);
      ilGenerator.Emit(OpCodes.Ldarg_0);
      ilGenerator.Emit(OpCodes.Ldc_I4_0);
      ilGenerator.Emit(OpCodes.Ldelem_Ref);
      ilGenerator.MarkSequencePoint(doc, 6, 1, 6, 100);

      //ilGenerator.Emit(OpCodes.Break);

      MethodInfo writeLine = typeof(System.Console).GetMethod(
        "WriteLine", new Type[] { typeof(string) });
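      // The third argument lists optional parameter types for varargs
      // methods only; WriteLine(string) is not varargs, so pass null.
      ilGenerator.EmitCall(OpCodes.Call, writeLine, null);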
      ilGenerator.Emit(OpCodes.Nop);
      Label eom = ilGenerator.DefineLabel();
      ilGenerator.Emit(OpCodes.Br_S, eom);
      ilGenerator.MarkLabel(eom);
      ilGenerator.Emit(OpCodes.Ret);
      Type test = typeBuilder.CreateType();

      builder.Save("HelloWorld.exe");

      test.GetMethod("Main").Invoke(null, 
        new object[]{new string[]{
        "Welcome To Valhalla Tower Material Defender!"}});
    }
  }
}

When you write an emitter you have to literally write the code that writes IL that equates to all of the elements you would add in source code. Let’s explore the code by starting at the DebugEmittedVersion function.

The code creates an AssemblyName and obtains an AppDomain, an AssemblyBuilder, and a ModuleBuilder. The module is where your code elements will live, and the AppDomain is the context in which your dynamic assembly is emitted and run. You can emit code dynamically and run it immediately, or save the emitted code to a DLL or EXE. Our example emits a HelloWorld.exe executable. The call to SetEntryPoint indicates that the function Main represented by the MethodBuilder is the entry point, and that the emitted code is a console application. (Without SetEntryPoint the code will still run like a reflected DLL from within the emitter, but it will not run as a stand-alone .EXE.)
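
The "emit and run immediately" path can be sketched in a few lines. This is a minimal illustration rather than part of Listing 1: it assumes the static AssemblyBuilder.DefineDynamicAssembly factory (available on newer framework versions) instead of the AppDomain call used in the listing, and the names RunOnlySketch, Greeter, and Greet are made up for the example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

static class RunOnlySketch
{
    // Builds an in-memory type with one static method, Greet(),
    // that returns a constant string. All names are hypothetical.
    public static Type BuildGreeterType()
    {
        AssemblyName name = new AssemblyName("RunOnlySketch");

        // Run-only access: the assembly lives in memory and is never saved.
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            name, AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("MainModule");

        TypeBuilder type = module.DefineType("Greeter",
            TypeAttributes.Public | TypeAttributes.Class);
        MethodBuilder method = type.DefineMethod("Greet",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(string), Type.EmptyTypes);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldstr, "Hello from emitted code"); // push the literal
        il.Emit(OpCodes.Ret);                              // return it

        // CreateType finishes the type so it can be used through reflection.
        return type.CreateType();
    }

    static void Main()
    {
        Type greeter = BuildGreeterType();
        object result = greeter.GetMethod("Greet").Invoke(null, null);
        Console.WriteLine(result); // prints "Hello from emitted code"
    }
}
```

Because nothing is saved, no entry point is needed here; the emitter calls the method through reflection, just as it would call into a DLL.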

We’ll come back to the ISymbolDocumentWriter in a moment.

Next you will need a class and some code to run. A class (or struct) can be created with a TypeBuilder. The MethodBuilder lets you define methods for your type. For a console application you will need a Main method. This is the only method our emitter generates for now. And, of course, you will need to emit the lines of MSIL.

To define the return type and parameter types, pass them to TypeBuilder.DefineMethod as shown, or use MethodBuilder.SetParameters. To name a parameter and adorn it with attributes such as in or out, use a ParameterBuilder as shown in the listing. Up to the ParameterBuilder statement the MSIL is equivalent to:

public class MyType
{

    public static void Main(string[] args)
    {
    }
}

From the MethodBuilder you request the ILGenerator. The generator writes the instructions you emit into the method body as MSIL.

Tip: To learn how to write IL, write regular code in C# or VB.NET and use ILDASM to view the MSIL that Visual Studio generates.

ILGenerator.Emit is used to write the lines of MSIL. OpCodes.Nop is filler. Compilers typically add it in debug builds so that a breakpoint can be placed on a line such as an opening brace, but it doesn’t actually do anything. OpCodes.Ldarg_0 loads the method’s first argument. OpCodes.Ldc_I4_0 loads the value 0 as an int32 onto the evaluation stack. Ldelem_Ref loads the object reference at the specified array index onto the evaluation stack. These three instructions (after OpCodes.Nop) index the array of strings. For example, Ldc_I4_1 would get the second argument string.
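
The same three-instruction pattern can be tried in isolation. The sketch below is not from Listing 1: it assumes a DynamicMethod as a lightweight host instead of a full assembly, and the names ArrayIndexSketch and SecondElement are invented for the example. It uses Ldc_I4_1, so it returns the second element.

```csharp
using System;
using System.Reflection.Emit;

static class ArrayIndexSketch
{
    // Emits the equivalent of: static string SecondElement(string[] a) { return a[1]; }
    public static Func<string[], string> BuildElementGetter()
    {
        DynamicMethod method = new DynamicMethod(
            "SecondElement", typeof(string), new Type[] { typeof(string[]) });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);    // push the string[] argument
        il.Emit(OpCodes.Ldc_I4_1);   // push the index 1 as an int32
        il.Emit(OpCodes.Ldelem_Ref); // pop array and index, push array[1]
        il.Emit(OpCodes.Ret);        // return the value left on the stack

        return (Func<string[], string>)method.CreateDelegate(
            typeof(Func<string[], string>));
    }

    static void Main()
    {
        Func<string[], string> second = BuildElementGetter();
        Console.WriteLine(second(new[] { "zero", "one", "two" })); // prints "one"
    }
}
```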

ILGenerator.MarkSequencePoint marks a debug location in the source document. More on this in a minute.

The GetMethod call returns a MethodInfo object for Console.WriteLine(string). ILGenerator.EmitCall emits a call to the method specified by its second argument, the MethodInfo. These two statements, combined with the previous three Emit calls, equate to the line

Console.WriteLine(args[0]);
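
To see the load-and-call sequence run on its own, here is a small sketch. It assumes a DynamicMethod rather than the article's full assembly, and the names WriteLineSketch and PrintFirst are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

static class WriteLineSketch
{
    // Emits the equivalent of: static void PrintFirst(string[] args) { Console.WriteLine(args[0]); }
    public static Action<string[]> BuildPrinter()
    {
        DynamicMethod method = new DynamicMethod(
            "PrintFirst", typeof(void), new Type[] { typeof(string[]) });

        // Look up the WriteLine overload that takes a single string.
        MethodInfo writeLine = typeof(Console).GetMethod(
            "WriteLine", new Type[] { typeof(string) });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);    // push args
        il.Emit(OpCodes.Ldc_I4_0);   // push index 0
        il.Emit(OpCodes.Ldelem_Ref); // push args[0]
        // WriteLine is not a varargs method, so no optional types are passed.
        il.EmitCall(OpCodes.Call, writeLine, null);
        il.Emit(OpCodes.Ret);

        return (Action<string[]>)method.CreateDelegate(typeof(Action<string[]>));
    }

    static void Main()
    {
        Action<string[]> print = BuildPrinter();
        print(new[] { "Hello, emitter" }); // prints "Hello, emitter"
    }
}
```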

The next line is filler. The three lines after it show you how to define and mark a label, the target of a GOTO. They don’t serve any real purpose here, because the branch simply jumps to the very next instruction. ILGenerator.Emit(OpCodes.Ret) is the end of the method, illustrating that everything is really a function, even void functions.
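
A label earns its keep once a branch actually skips something. The following sketch, again hosted in a DynamicMethod as an assumption and using the made-up names LabelSketch and DescribeSign, branches over the "negative" path when the argument is zero or greater.

```csharp
using System;
using System.Reflection.Emit;

static class LabelSketch
{
    // Emits the equivalent of:
    // static string DescribeSign(int n) { if (n >= 0) goto nonNegative; return "negative"; nonNegative: return "non-negative"; }
    public static Func<int, string> BuildSignDescriber()
    {
        DynamicMethod method = new DynamicMethod(
            "DescribeSign", typeof(string), new Type[] { typeof(int) });

        ILGenerator il = method.GetILGenerator();
        Label nonNegative = il.DefineLabel(); // declare the label first

        il.Emit(OpCodes.Ldarg_0);             // push n
        il.Emit(OpCodes.Ldc_I4_0);            // push 0
        il.Emit(OpCodes.Bge_S, nonNegative);  // if n >= 0, jump to the label

        il.Emit(OpCodes.Ldstr, "negative");   // fall-through path
        il.Emit(OpCodes.Ret);

        il.MarkLabel(nonNegative);            // the label's position in the IL stream
        il.Emit(OpCodes.Ldstr, "non-negative");
        il.Emit(OpCodes.Ret);

        return (Func<int, string>)method.CreateDelegate(typeof(Func<int, string>));
    }

    static void Main()
    {
        Func<int, string> describe = BuildSignDescriber();
        Console.WriteLine(describe(-5)); // prints "negative"
        Console.WriteLine(describe(7));  // prints "non-negative"
    }
}
```

DefineLabel only reserves the label; MarkLabel is what fixes its position, which is why a forward branch can be emitted before the target exists.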

The statement containing typeBuilder.CreateType completes the type and returns the Type object for our emitted code, and GetMethod("Main").Invoke calls our method through reflection, passing the argument to Main.

If you want to save the emitted code to an external file, use AssemblyBuilder.Save, passing a filename for the executable (or DLL).



 
 