Retrieve Resources From PE Files (Part I)
By Jeff Friesen


( Page 2 of 4 )

Discover the resource directory structure

The resource section is organized into a directory structure that's analogous to a filesystem's directory structure. Figure 2 shows that this structure consists of a root directory pointing to one or more resource type directories, which point to one or more resource instance directories, which in turn point to resource data areas.

Figure 2: An example of the hierarchical resource directory structure.

 

The root directory begins with a header that describes the structure. The header is followed by an array of pointers to the resource type directories. Collectively, this header and pointer array are represented by the SDK's IMAGE_RESOURCE_DIRECTORY structure, shown below:

typedef struct _IMAGE_RESOURCE_DIRECTORY
{
   DWORD Characteristics;      // Not important to resource retrieval.
   DWORD TimeDateStamp;        // Not important to resource retrieval.
   WORD  MajorVersion;         // Not important to resource retrieval.
   WORD  MinorVersion;         // Not important to resource retrieval.
   WORD  NumberOfNamedEntries;
   WORD  NumberOfIdEntries;
   // IMAGE_RESOURCE_DIRECTORY_ENTRY DirectoryEntries[];
}
IMAGE_RESOURCE_DIRECTORY, *PIMAGE_RESOURCE_DIRECTORY;

Only the NumberOfNamedEntries, NumberOfIdEntries, and DirectoryEntries fields are important to resource retrieval. The sum of the first two fields' values identifies the number of entries in the latter array field. Entries with name identifiers appear before entries with integer IDs.

The DirectoryEntries field is commented out because C compilers that predate C99 (and C++ compilers) won't let you declare an array field with empty brackets. In the file, the root directory's IMAGE_RESOURCE_DIRECTORY structure is followed immediately by an array of IMAGE_RESOURCE_DIRECTORY_ENTRY structures. This structure is described below:

typedef struct _IMAGE_RESOURCE_DIRECTORY_ENTRY
{
   DWORD Name;
   DWORD OffsetToData;
}
IMAGE_RESOURCE_DIRECTORY_ENTRY, *PIMAGE_RESOURCE_DIRECTORY_ENTRY;
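The relationship between the header and the entry array that follows it can be sketched in a few lines of C. This is a minimal sketch, not SDK code: RES_DIRECTORY, RES_DIRECTORY_ENTRY, entry_count, and first_entry are hypothetical names, the two structures are portable stand-ins for the SDK definitions, and the directory is assumed to sit in a suitably aligned buffer on a little-endian machine.

```c
#include <stdint.h>

/* Portable stand-ins for the SDK structures (assumption: on Windows you
   would include <windows.h> and use the SDK definitions instead). */
typedef struct {
    uint32_t Characteristics;
    uint32_t TimeDateStamp;
    uint16_t MajorVersion;
    uint16_t MinorVersion;
    uint16_t NumberOfNamedEntries;
    uint16_t NumberOfIdEntries;
} RES_DIRECTORY;              /* 16 bytes, mirrors IMAGE_RESOURCE_DIRECTORY */

typedef struct {
    uint32_t Name;
    uint32_t OffsetToData;
} RES_DIRECTORY_ENTRY;        /* 8 bytes, mirrors IMAGE_RESOURCE_DIRECTORY_ENTRY */

/* Total number of entries that follow the header: named entries first,
   then entries with integer IDs. */
unsigned entry_count(const RES_DIRECTORY *dir)
{
    return (unsigned)dir->NumberOfNamedEntries + dir->NumberOfIdEntries;
}

/* The entry array begins at the first byte after the header, which is
   where the commented-out DirectoryEntries field would have been. */
const RES_DIRECTORY_ENTRY *first_entry(const RES_DIRECTORY *dir)
{
    return (const RES_DIRECTORY_ENTRY *)(dir + 1);
}
```

With these helpers, first_entry(dir)[i] for i from 0 to entry_count(dir) - 1 visits every entry of a directory.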

The Name field stores an integer ID if its high bit is 0, or an offset (in the lower 31 bits) to an IMAGE_RESOURCE_DIR_STRING_U structure if its high bit is 1. The offset is relative to the start of the resource section, and this structure identifies a Unicode string that names a resource instance:

typedef struct _IMAGE_RESOURCE_DIR_STRING_U
{
   WORD  Length;        // Number of Unicode characters in string
   WCHAR NameString[1]; // Length Unicode characters
}
IMAGE_RESOURCE_DIR_STRING_U, *PIMAGE_RESOURCE_DIR_STRING_U;
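Decoding the Name field can be sketched as follows. The function and macro names here (name_is_string, name_string_offset, read_resource_name, RES_NAME_IS_STRING) are hypothetical, and the sketch assumes the whole resource section has already been read into a byte buffer. It reads the length-prefixed string byte by byte, so it makes no alignment or host byte-order assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define RES_NAME_IS_STRING 0x80000000u   /* high bit of the Name field */

/* Nonzero when Name holds a string offset rather than an integer ID. */
int name_is_string(uint32_t name)
{
    return (name & RES_NAME_IS_STRING) != 0;
}

/* Lower 31 bits: offset of the string structure from the section start. */
uint32_t name_string_offset(uint32_t name)
{
    return name & 0x7FFFFFFFu;
}

/* Read a little-endian 16-bit value one byte at a time. */
static uint16_t read_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Copy the name stored at `offset` into `out`, which has room for `cap`
   16-bit units including a terminating zero. Returns the number of
   characters copied, or -1 if the string does not fit inside the section
   or inside `out`. */
int read_resource_name(const uint8_t *section, size_t section_size,
                       uint32_t offset, uint16_t *out, size_t cap)
{
    if ((size_t)offset + 2 > section_size)
        return -1;
    uint16_t len = read_u16le(section + offset);          /* Length field */
    if ((size_t)offset + 2 + 2 * (size_t)len > section_size)
        return -1;
    if ((size_t)len + 1 > cap)
        return -1;
    for (size_t i = 0; i < len; i++)                      /* NameString   */
        out[i] = read_u16le(section + offset + 2 + 2 * i);
    out[len] = 0;   /* the stored string itself is not zero-terminated */
    return (int)len;
}
```

Copying into a caller-supplied buffer and adding a terminator is a design choice of this sketch; the string in the file carries only its length prefix.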

The OffsetToData field stores an offset to an IMAGE_RESOURCE_DATA_ENTRY structure (which I'll discuss later) if its high bit is 0, or an offset to another IMAGE_RESOURCE_DIRECTORY structure if its high bit is 1. Both of these offsets are relative to the beginning of the resource section.
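The same high-bit test applies to OffsetToData. The helper names below (entry_is_directory, entry_offset, RES_DATA_IS_DIRECTORY) are hypothetical; as far as I know, the SDK headers define equivalent masks, but this sketch uses its own so that it compiles anywhere.

```c
#include <stdint.h>

#define RES_DATA_IS_DIRECTORY 0x80000000u   /* high bit of OffsetToData */

/* Nonzero when the entry points at another directory rather than at an
   IMAGE_RESOURCE_DATA_ENTRY structure. */
int entry_is_directory(uint32_t offset_to_data)
{
    return (offset_to_data & RES_DATA_IS_DIRECTORY) != 0;
}

/* Offset of the target structure from the start of the resource section,
   with the flag bit stripped. */
uint32_t entry_offset(uint32_t offset_to_data)
{
    return offset_to_data & 0x7FFFFFFFu;
}
```

For example, an OffsetToData value of 0x80000018 refers to a directory 0x18 bytes into the resource section, while 0x00000048 refers to a data entry 0x48 bytes into it.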

For each of the root directory's resource directory entries, Name contains the integer ID of a resource type directory -- I've never seen this directory given a string name. Also, OffsetToData contains an offset to the resource type directory's IMAGE_RESOURCE_DIRECTORY structure. Table 1 lists possible integer IDs for Name.

Integer ID   Constant
1            RT_CURSOR
2            RT_BITMAP
3            RT_ICON
4            RT_MENU
5            RT_DIALOG
6            RT_STRING
7            RT_FONTDIR
8            RT_FONT
9            RT_ACCELERATOR
10           RT_RCDATA
11           RT_MESSAGETABLE
12           RT_GROUP_CURSOR
14           RT_GROUP_ICON
16           RT_VERSION
17           RT_DLGINCLUDE
19           RT_PLUGPLAY
20           RT_VXD
21           RT_ANICURSOR
22           RT_ANIICON
23           RT_HTML
24           RT_MANIFEST

Table 1: Resource type integer IDs and their C constants
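When dumping a resource section, it can be handy to translate these IDs back into their constant names. The lookup below is a small sketch built only from Table 1; resource_type_name is a hypothetical helper, not an SDK function.

```c
#include <stddef.h>

/* Return the constant name for a resource type ID from Table 1, or NULL
   when the ID is not listed there (for example, 13, 15, or 18). */
const char *resource_type_name(unsigned id)
{
    static const struct { unsigned id; const char *name; } types[] = {
        {  1, "RT_CURSOR" },       {  2, "RT_BITMAP" },
        {  3, "RT_ICON" },         {  4, "RT_MENU" },
        {  5, "RT_DIALOG" },       {  6, "RT_STRING" },
        {  7, "RT_FONTDIR" },      {  8, "RT_FONT" },
        {  9, "RT_ACCELERATOR" },  { 10, "RT_RCDATA" },
        { 11, "RT_MESSAGETABLE" }, { 12, "RT_GROUP_CURSOR" },
        { 14, "RT_GROUP_ICON" },   { 16, "RT_VERSION" },
        { 17, "RT_DLGINCLUDE" },   { 19, "RT_PLUGPLAY" },
        { 20, "RT_VXD" },          { 21, "RT_ANICURSOR" },
        { 22, "RT_ANIICON" },      { 23, "RT_HTML" },
        { 24, "RT_MANIFEST" }
    };
    for (size_t i = 0; i < sizeof types / sizeof types[0]; i++)
        if (types[i].id == id)
            return types[i].name;
    return NULL;
}
```

A linear search is used because the IDs are not contiguous and the table is tiny.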

Resource type directories are like classes in that they describe categories of resources (such as bitmaps, dialogs, and menus) that appear in the resource section. The resources themselves are described by resource instance directories, which are pointed to from the resource type directories.

For each of a resource type directory's resource directory entries, Name contains the integer ID or offset to the string name of a resource instance directory -- see Figure 2 for examples. Furthermore, OffsetToData contains an offset to the resource instance directory's IMAGE_RESOURCE_DIRECTORY structure.
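Descending from the root directory through a resource type directory to a resource instance directory amounts to two lookups of the same kind. The sketch below assumes the resource section has been read into a byte buffer and handles integer IDs only; find_entry_by_id and find_instance_dir are hypothetical names. The byte offsets 12, 14, and 16 come from the IMAGE_RESOURCE_DIRECTORY layout (the two counts, then the first entry).

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian readers that make no alignment assumptions. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Search the directory at dir_off for an entry whose Name is the integer
   `id`. On success, store the entry's raw OffsetToData and return 1. */
int find_entry_by_id(const uint8_t *sec, size_t size, uint32_t dir_off,
                     uint32_t id, uint32_t *offset_to_data)
{
    if ((size_t)dir_off + 16 > size)
        return 0;
    uint32_t named = rd16(sec + dir_off + 12);   /* NumberOfNamedEntries */
    uint32_t ids   = rd16(sec + dir_off + 14);   /* NumberOfIdEntries    */
    size_t first   = (size_t)dir_off + 16;       /* entries follow header */
    if (first + 8 * (size_t)(named + ids) > size)
        return 0;
    /* Entries with integer IDs come after the named entries. */
    for (uint32_t i = named; i < named + ids; i++) {
        const uint8_t *e = sec + first + 8 * (size_t)i;
        if (rd32(e) == id) {
            *offset_to_data = rd32(e + 4);
            return 1;
        }
    }
    return 0;
}

/* Root -> resource type directory -> resource instance directory. */
int find_instance_dir(const uint8_t *sec, size_t size, uint32_t type_id,
                      uint32_t instance_id, uint32_t *instance_dir_off)
{
    uint32_t v;
    if (!find_entry_by_id(sec, size, 0, type_id, &v) || !(v & 0x80000000u))
        return 0;
    if (!find_entry_by_id(sec, size, v & 0x7FFFFFFFu, instance_id, &v) ||
        !(v & 0x80000000u))
        return 0;
    *instance_dir_off = v & 0x7FFFFFFFu;
    return 1;
}
```

Looking up a resource instance by string name would additionally require comparing against the length-prefixed strings referenced by the named entries, which this sketch leaves out.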

Unlike a resource type directory, which can have multiple resource directory entries, a resource instance directory usually has only one entry (a resource that is supplied in several languages has one entry per language). Furthermore, this entry's OffsetToData field will always have its most-significant bit clear, which causes it to reference an IMAGE_RESOURCE_DATA_ENTRY structure:

typedef struct _IMAGE_RESOURCE_DATA_ENTRY
{
   DWORD OffsetToData;
   DWORD Size;
   DWORD CodePage; // Possibly important to resource retrieval.
   DWORD Reserved; // Not important to resource retrieval.
}
IMAGE_RESOURCE_DATA_ENTRY, *PIMAGE_RESOURCE_DATA_ENTRY;

The OffsetToData and Size fields specify the location (as a relative virtual address within the resource section) and size (in bytes) of the resource data. Although an RVA is not the same as a file offset, the equivalent file offset can be calculated by subtracting the resource section's RVA from OffsetToData's RVA value, and adding the difference to the offset of the root directory.
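The conversion just described is a single subtraction and addition. In the sketch below, rva_to_file_offset and its parameter names are hypothetical; section_rva and root_file_offset would typically come from the resource section's header (its VirtualAddress and PointerToRawData fields), which is an assumption about how the caller obtained them.

```c
#include <stdint.h>

/* Convert the RVA stored in a data entry's OffsetToData field into a file
   offset.
     data_rva         - OffsetToData from IMAGE_RESOURCE_DATA_ENTRY
     section_rva      - RVA of the resource section
     root_file_offset - file offset of the root directory
   Returns 1 and stores the result on success, or 0 if the RVA lies before
   the start of the resource section. */
int rva_to_file_offset(uint32_t data_rva, uint32_t section_rva,
                       uint32_t root_file_offset, uint32_t *file_offset)
{
    if (data_rva < section_rva)
        return 0;
    *file_offset = root_file_offset + (data_rva - section_rva);
    return 1;
}
```

For example, with a resource section RVA of 0x70000, a root directory at file offset 0x5E00, and a data RVA of 0x70120, the resource data starts at file offset 0x5E00 + 0x120 = 0x5F20.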

Note
A relative virtual address is an offset to some item where the offset is relative to the PE file's load address. For example, if the Windows loader maps a PE file into memory beginning at 0x400000 in the virtual address space, and if the resource section's virtual address is 0x470000, then the resource section's RVA is the difference between these values: 0x470000-0x400000, or 0x70000.

The CodePage field identifies the code page (a coded character set) used to decode code points (code page values) within the resource data. Although any valid code page number can appear in this field (such as 437, which describes the original IBM PC's character set, or 65001, which describes Unicode UTF-8), this field often contains 0 (standard Roman alphabet, numerals, punctuation, accented characters).
