Writing Code Generators in .NET - Code Gen Elements (Page 3 of 3)
Understanding the Elements Needed for Code Generation
When writing code generators, you must generate every element that would otherwise be coded by hand. Thus, a code generator must generate a module, namespace, classes, methods, properties, fields, events, and possibly delegate definitions, enumerated types, structs, and interfaces. The easiest way to get the generator correct is to first write an exemplar of the type of class that you want to generate, much as automobiles are prototyped before a production line is built.
To build the entity generator, I manually wrote an entity based on the Customers table of the Northwind database. That exemplar is almost identical to the code shown in Listing 4. The generated entity class is about 130 lines of code, and the generator, which can generate any number of entity classes, is 179 lines of code. From that perspective, writing the entity generator is time well spent.
Dissecting the Entity Generator
To understand how to build the entity generator, let's look at the major pieces of code. There is no need to dissect every bit, because some of it is self-explanatory; for example, you need the using statements in Listing 3. The CodeDom namespaces contain the classes that build the code graph, and System.IO is used for writing the code to a stream.
Think of a code graph as a tree of elements that represent .NET code. For example, a class node represents code that will be generated as a class. When a specific language is selected, the CodeDom provider for that language renders each node as the actual code element.
I typically think of code generators as code factories, so I use the Factory pattern and a static method as the starting point for the code generator. In Listing 3, this is the static Run method. The generic list of FieldDefinition objects (see Listing 1) simply holds the name, type, and target field and property name of each item to be generated for the entity.
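As a rough illustration of how a list of definitions can drive member generation, here is a minimal sketch. It is not the code from Listing 3: FieldSpec is a hypothetical stand-in for FieldDefinition, and EntityMemberSketch and BuildEntity are names invented for this example.

```csharp
using System;
using System.CodeDom;
using System.Collections.Generic;

// Hypothetical stand-in for the article's FieldDefinition class.
public class FieldSpec
{
    public string FieldName;     // backing field name, e.g. "_companyName"
    public string PropertyName;  // public property name, e.g. "CompanyName"
    public Type FieldType;       // CLR type of the field, e.g. typeof(string)
}

public static class EntityMemberSketch
{
    // Builds a class node with one private field and one public
    // get/set property for each entry in the list.
    public static CodeTypeDeclaration BuildEntity(string className, List<FieldSpec> specs)
    {
        var entity = new CodeTypeDeclaration(className) { IsClass = true };

        foreach (FieldSpec spec in specs)
        {
            // private <type> <fieldName>;
            var field = new CodeMemberField(spec.FieldType, spec.FieldName);
            field.Attributes = MemberAttributes.Private;
            entity.Members.Add(field);

            // Expression node for this.<fieldName>
            var fieldRef = new CodeFieldReferenceExpression(
                new CodeThisReferenceExpression(), spec.FieldName);

            var property = new CodeMemberProperty
            {
                Name = spec.PropertyName,
                Type = new CodeTypeReference(spec.FieldType),
                Attributes = MemberAttributes.Public | MemberAttributes.Final
            };
            // get { return this.<fieldName>; }
            property.GetStatements.Add(new CodeMethodReturnStatement(fieldRef));
            // set { this.<fieldName> = value; }
            property.SetStatements.Add(new CodeAssignStatement(
                fieldRef, new CodePropertySetValueReferenceExpression()));
            entity.Members.Add(property);
        }

        return entity;
    }
}
```

Calling EntityMemberSketch.BuildEntity("Customer", specs) with one FieldSpec per column yields a class node that can be added to a namespace in the code graph.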
At the beginning of the Factory method Run we need a CodeDomProvider. If you pick a VB provider, VB code is generated. Pick the CSharpCodeProvider, as shown, and C# code is generated.
A good strategy might be to pass in an argument indicating the target language. That strategy is not employed here. In fact, Strategy is also the name of a design pattern, which is worth looking up.
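For readers who want to try the target-language argument, the following is one possible sketch and not part of the article's generator. ProviderSketch and ForLanguage are names invented here; CodeDomProvider.IsDefinedLanguage and CodeDomProvider.CreateProvider are the framework calls that look up a provider by language name such as "CSharp" or "VisualBasic". On newer .NET versions I am assuming the System.CodeDom package is referenced.

```csharp
using System;
using System.CodeDom.Compiler;

public static class ProviderSketch
{
    // Returns a provider for the requested language name,
    // or throws if no provider is registered under that name.
    public static CodeDomProvider ForLanguage(string language)
    {
        if (!CodeDomProvider.IsDefinedLanguage(language))
        {
            throw new ArgumentException(
                "No CodeDom provider is registered for: " + language);
        }
        return CodeDomProvider.CreateProvider(language);
    }
}
```

A Run method could then accept the language name as a parameter and call ProviderSketch.ForLanguage(language) instead of constructing a CSharpCodeProvider directly.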
Next, we need a CodeCompileUnit. This represents a module of code in the code graph. Following our template, we need a CodeNamespace and using statements, as shown in our exemplar. Just as you would add these elements manually, the CodeNamespace is added to the CodeCompileUnit, and the Imports/using statements are added to the namespace. Figure 1 gives you a rough idea of what the code graph should look like.
Figure 1: A typical code graph containing one namespace, class, and typical members (without individual lines of code elements).
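The graph described above can be sketched in a few lines. This is a simplified illustration rather than the code from Listing 3; the namespace name "Entities", the class name "Customer", the field name, and the CodeGraphSketch class are all assumptions made for the example.

```csharp
using System.CodeDom;

public static class CodeGraphSketch
{
    // Builds: compile unit -> namespace (+ imports) -> class -> members.
    public static CodeCompileUnit Build()
    {
        // The compile unit is the root of the graph (the "module").
        var unit = new CodeCompileUnit();

        // The namespace is added to the compile unit...
        var ns = new CodeNamespace("Entities");
        unit.Namespaces.Add(ns);

        // ...and the using/Imports statements are added to the namespace.
        ns.Imports.Add(new CodeNamespaceImport("System"));
        ns.Imports.Add(new CodeNamespaceImport("System.Collections.Generic"));

        // The class is added to the namespace.
        var customer = new CodeTypeDeclaration("Customer") { IsClass = true };
        ns.Types.Add(customer);

        // Members are added to the class.
        var field = new CodeMemberField(typeof(string), "_customerID");
        field.Attributes = MemberAttributes.Private;
        customer.Members.Add(field);

        var constructor = new CodeConstructor { Attributes = MemberAttributes.Public };
        customer.Members.Add(constructor);

        return unit;
    }
}
```

Nothing in this graph is tied to C# or VB yet; the language is chosen only when a provider turns the graph into source code.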
The lines of code in Listing 3 show you how to write each of these elements. By exploring the CodeDom namespace, you can find the classes that generate individual statements. Some take a little work to find. For example, there is no dedicated while-loop element, so you can write a construct generator that emits for( ; test; ) in place of the while loop.
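One way to write such a construct generator is sketched below. WhileLoopSketch and its While method are names invented for this example; the technique of passing empty CodeSnippetStatement objects as the initializer and increment of a CodeIterationStatement is the one I am assuming here, and with the C# provider it should render as roughly for (; test; ).

```csharp
using System.CodeDom;

public static class WhileLoopSketch
{
    // Emulates while (test) { body } with a for loop that has
    // an empty initializer and an empty increment.
    public static CodeIterationStatement While(
        CodeExpression test, params CodeStatement[] body)
    {
        return new CodeIterationStatement(
            new CodeSnippetStatement(string.Empty), // no initializer
            test,                                   // loop condition
            new CodeSnippetStatement(string.Empty), // no increment
            body);                                  // loop body statements
    }
}
```

For example, WhileLoopSketch.While(new CodeVariableReferenceExpression("keepGoing"), someStatement) produces a loop node that can be added to any method's Statements collection.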
After you finish writing the code that builds every element in the graph, write the code that generates the source, compiles it, or both. You can see this code at the end of the Run method (shown as an excerpt in Listing 5).
Listing 5: The CodeGeneratorOptions and provider are used in the example to generate code from the CodeCompileUnit.
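The generation step generally takes the following shape. This is a sketch under my own assumptions, not a reproduction of Listing 5: GenerateSketch and ToSource are invented names, the option values are illustrative, and the output goes to a string rather than a file to keep the example short.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;

public static class GenerateSketch
{
    // Renders a code graph as source text in the given language
    // ("CSharp", "VisualBasic", and so on).
    public static string ToSource(CodeCompileUnit unit, string language)
    {
        using (CodeDomProvider provider = CodeDomProvider.CreateProvider(language))
        using (var writer = new StringWriter())
        {
            var options = new CodeGeneratorOptions
            {
                BracingStyle = "C",              // opening braces on their own line
                BlankLinesBetweenMembers = true  // blank line between members
            };

            // Walks the graph and writes the equivalent source code.
            provider.GenerateCodeFromCompileUnit(unit, writer, options);
            return writer.ToString();
        }
    }
}
```

To write a file instead, replace the StringWriter with a StreamWriter opened on a path that ends in provider.FileExtension.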
The CodeDom namespace can actually compile the code. You can use Reflection to dynamically load and run a generated assembly in the same application in which it was generated.
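A minimal sketch of compiling and then instantiating a generated type follows. It assumes the .NET Framework; as far as I know, the CodeDom compile methods are not supported on .NET Core and later, where they throw PlatformNotSupportedException. CompileSketch and CompileAndCreate are names invented for this example.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.Reflection;

public static class CompileSketch
{
    // Compiles the graph to an in-memory assembly and returns a new
    // instance of the named type (null if the type is not found).
    // Assumption: running on the .NET Framework.
    public static object CompileAndCreate(CodeCompileUnit unit, string fullTypeName)
    {
        using (CodeDomProvider provider = CodeDomProvider.CreateProvider("CSharp"))
        {
            var parameters = new CompilerParameters
            {
                GenerateInMemory = true,    // do not keep a DLL on disk
                GenerateExecutable = false  // build a library, not an EXE
            };
            parameters.ReferencedAssemblies.Add("System.dll");

            CompilerResults results = provider.CompileAssemblyFromDom(parameters, unit);
            if (results.Errors.HasErrors)
            {
                throw new InvalidOperationException(
                    "Compilation failed: " + results.Errors[0].ErrorText);
            }

            // Reflection: load the type from the freshly compiled assembly.
            Assembly assembly = results.CompiledAssembly;
            return assembly.CreateInstance(fullTypeName);
        }
    }
}
```

With a graph that declares a Customer class in an Entities namespace, CompileSketch.CompileAndCreate(unit, "Entities.Customer") returns an object whose properties can be read and set through Reflection.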
The CodeDom is a powerful namespace that can generate source code in multiple languages. In this article, an exemplar was written first, and then CodeDom code was written to generate each piece of it. An effective strategy is to factor out common pieces rather than to generate redundant code.
As you can discern from the examples, the generated code is approximately the same number of lines of code as its generator, but the generator can generate any number of entities. Hence, writing the generator instead of the entities multiplies your productivity by roughly the number of entities you would otherwise write by hand. Granted, the CodeDom code is a little harder to write than the entity code, and you may have to convince your boss that this is time well spent, but the time and money savings will ultimately speak for themselves.
Editor permitting, I will follow this article with factored code and a data access layer generator. If you are doing a big database project, this could speed up time to delivery vastly. If you use this code, feel free to let me know about your results at pkimmel@softconcepts.com. [Tell the editor, too, so she knows whether to permit the series to continue! —Ed.]
Paul Kimmel has written several books on object-oriented programming and .NET. Check out his new book, UML DeMystified, from McGraw-Hill/Osborne. Paul is an architect for Tri-State Hospital Supply Corporation. You may contact him for technology questions at pkimmel@softconcepts.com.