Sunday, December 18, 2005

Fat vs. Tiny method header

More than a month ago Sameer asked me to write about the Fat and Tiny method header formats. I'm sorry for answering only now, but DILE really kept me busy. So, I'll try to continue the series...

Theory


First of all, don't search for the documentation about metadata in the Visual Studio folder anymore. In the C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Tool Developers Guide (by default) folder, a readme.doc will only inform you that all the documentation can now be found online in the following location: http://msdn.microsoft.com/net/ecma/default.asp.
The description of the headers is in the Partition II Metadata document.

When you get the physical starting address of a method (after calculating it from the RVA), you should begin with checking the first byte. The two least significant bits tell the type of the method header. If the value is 2 then it's a tiny header, if 3 then it's a fat header.
There are some conditions that must be met for the tiny header to be used. These are the following:
  • No local variables are allowed
  • No exceptions
  • No extra data sections
  • The operand stack shall be no bigger than 8 entries
If any of these requirements is not met, a fat header will be used.

The tiny header is very simple as it's only 1 byte. The two least significant bits have already been used to tell the format of the header, thus the remaining 6 bits tell the length (in bytes) of the method's IL body. This also means that the body of a method with a tiny header can be at most 63 bytes long.
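
The check described above can be sketched in a few lines. This is only a minimal illustration, not part of the sample program: the names MethodHeaderKind, MethodHeaderSketch, GetKind and GetTinyCodeSize are made up for this example.

```csharp
using System;

// Hypothetical names, introduced only for this sketch.
public enum MethodHeaderKind { Tiny, Fat }

public static class MethodHeaderSketch
{
    // Classifies a method header by the two least significant bits
    // of its first byte: 2 means tiny, 3 means fat.
    public static MethodHeaderKind GetKind(byte firstByte)
    {
        switch (firstByte & 0x3)
        {
            case 0x2: return MethodHeaderKind.Tiny;
            case 0x3: return MethodHeaderKind.Fat;
            default:
                throw new FormatException("Not a valid method header byte.");
        }
    }

    // In a tiny header the upper six bits hold the size of the IL body
    // in bytes, so the result is always between 0 and 63.
    public static int GetTinyCodeSize(byte firstByte)
    {
        if (GetKind(firstByte) != MethodHeaderKind.Tiny)
        {
            throw new ArgumentException("Not a tiny header.", "firstByte");
        }

        return firstByte >> 2;
    }
}
```

For example, a tiny header byte of 0x36 has 2 in its lowest two bits and 13 in the upper six, which is the encoding you would expect for a method like the 13-byte TinyFormatMethod shown in the demonstration.
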
The fat header is a little bit more complicated:
  • Bits 0-11: Flags, the Fat format must be set and there are 2 other important values:
    • CorILMethod_MoreSects (0x8): this indicates that extra sections can be found after the method's IL body
    • CorILMethod_InitLocals (0x10): this means that the default constructor should be called on all local variables (in practice the locals are zero-initialized)
  • Bits 12-15: size of the header, this number should be multiplied by 4 (currently it must always be 3 and thus the final size must be 12 bytes)
  • Bytes 2-3: MaxStack, the maximum size of the operand stack
  • Bytes 4-7: size of the method's IL body in bytes
  • Bytes 8-11: the LocalVarSig's token that belongs to the method
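
The layout of the fat header can be decoded roughly as follows. This is a sketch under assumptions rather than the code of the sample program: the FatMethodHeader class and its member names are hypothetical, and the byte array is assumed to start at (or contain) the 12 header bytes.

```csharp
using System;

// Hypothetical container for the decoded fields of a fat method header.
public sealed class FatMethodHeader
{
    public int Flags;              // bits 0-11 of the first two bytes
    public int HeaderSize;         // bits 12-15, already multiplied by 4
    public int MaxStack;           // bytes 2-3
    public uint CodeSize;          // bytes 4-7
    public uint LocalVarSigToken;  // bytes 8-11

    public bool HasMoreSections { get { return (Flags & 0x8) != 0; } }
    public bool InitLocals { get { return (Flags & 0x10) != 0; } }

    public static FatMethodHeader Parse(byte[] data, int offset)
    {
        if (data == null) throw new ArgumentNullException("data");
        if (offset < 0 || data.Length - offset < 12)
        {
            throw new ArgumentException("A fat header needs 12 bytes.");
        }

        int flagsAndSize = ReadUInt16(data, offset);
        if ((flagsAndSize & 0x3) != 0x3)
        {
            throw new FormatException("The fat format bits are not set.");
        }

        FatMethodHeader header = new FatMethodHeader();
        header.Flags = flagsAndSize & 0x0FFF;
        header.HeaderSize = (flagsAndSize >> 12) * 4;
        header.MaxStack = ReadUInt16(data, offset + 2);
        header.CodeSize = ReadUInt32(data, offset + 4);
        header.LocalVarSigToken = ReadUInt32(data, offset + 8);
        return header;
    }

    // Multi-byte values are stored in little-endian order.
    private static int ReadUInt16(byte[] data, int offset)
    {
        return data[offset] | (data[offset + 1] << 8);
    }

    private static uint ReadUInt32(byte[] data, int offset)
    {
        return unchecked((uint)(data[offset] | (data[offset + 1] << 8)
            | (data[offset + 2] << 16) | (data[offset + 3] << 24)));
    }
}
```

As a usage example, the bytes 1B 30 02 00 31 00 00 00 01 00 00 11 decode to a header size of 12, a maximum stack size of 2, an IL body of 49 bytes and the LocalVarSig token 0x11000001, with both MoreSects and InitLocals set. These bytes were constructed by hand to match the values printed for FatFormatMethod in the demonstration; that InitLocals is set there is an assumption.
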


In both the tiny and fat header's case, the IL code must come right after the header. In the fat header's case we might still have some information to process. If the CorILMethod_MoreSects flag has been set then the extra sections must come after the IL code. These extra sections always start on a 4 byte boundary and are currently only used to store information about exception handling.
An extra section's first byte tells the format, similarly to the method header. Data sections are also stored in two different formats, either in small or in fat format. The following flags can be set in this first byte:
  • CorILMethod_Sect_EHTable (0x1): the section holds information about exception handling (currently this is the only possible value)
  • CorILMethod_Sect_OptILTable (0x2): the documentation doesn't tell much about this: "Reserved, shall be 0."...
  • CorILMethod_Sect_FatFormat (0x40): the section uses fat format
  • CorILMethod_Sect_MoreSects (0x80): another data section follows this one
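
The two steps described above (moving to the next 4 byte boundary, then interpreting the first byte of the section) could look something like this. It is a minimal sketch: DataSectionSketch and its members are hypothetical names, and the 0x3F kind mask is taken from the CorILMethodSect enum in the listing below rather than from the flag list.

```csharp
// Hypothetical helper, introduced only for this sketch.
public static class DataSectionSketch
{
    public const byte EHTable = 0x01;
    public const byte KindMask = 0x3F;
    public const byte FatFormat = 0x40;
    public const byte MoreSects = 0x80;

    // Rounds a position up to the next multiple of 4
    // (positions that are already aligned stay unchanged).
    public static long AlignUp4(long position)
    {
        return (position + 3) & ~3L;
    }

    // True if the lower six bits identify an exception handling table.
    public static bool IsExceptionTable(byte kind)
    {
        return (kind & KindMask) == EHTable;
    }

    // True if the section uses the fat format.
    public static bool IsFat(byte kind)
    {
        return (kind & FatFormat) != 0;
    }

    // True if another data section follows this one.
    public static bool HasMoreSections(byte kind)
    {
        return (kind & MoreSects) != 0;
    }
}
```

With these helpers a first byte of 0x41 would be read as a fat exception handling table that is not followed by another section, while 0x01 would be a small one.
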

In case of the small format the section continues like this:
  • Byte 1: DataSize, size of the data in the block, including the header. It equals n * 12 + 4 where n is the number of clauses (and that 4 is the size of the header), so the number of clauses can be calculated from it
  • Bytes 2-3: padding, it's always 0
  • Byte 4 onwards: Clauses

In case of the fat format the section header looks like this:
  • Bytes 1-3: DataSize, size of the data in the block, including the header. It equals n * 24 + 4 where n is the number of clauses (and that 4 is the size of the header), so the number of clauses can be calculated from it
  • Byte 4 onwards: Clauses
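
Turning DataSize back into the number of clauses is simple arithmetic. The sketch below is only an illustration with hypothetical names (ClauseCountSketch, GetClauseCount); the strict validation is a design choice of this example, and a more tolerant reader could simply use integer division.

```csharp
using System;

// Hypothetical helper, introduced only for this sketch.
public static class ClauseCountSketch
{
    public const int SectionHeaderSize = 4;
    public const int SmallClauseSize = 12;
    public const int FatClauseSize = 24;

    // Inverts DataSize = n * clauseSize + 4 to get n, the number of clauses.
    public static int GetClauseCount(int dataSize, bool fatFormat)
    {
        int clauseSize = fatFormat ? FatClauseSize : SmallClauseSize;
        int clauseBytes = dataSize - SectionHeaderSize;

        if (clauseBytes < 0 || clauseBytes % clauseSize != 0)
        {
            throw new FormatException(
                "DataSize does not match a whole number of clauses.");
        }

        return clauseBytes / clauseSize;
    }
}
```

For example, a small section with a DataSize of 28 holds (28 - 4) / 12 = 2 clauses, and a fat section with a DataSize of 52 holds (52 - 4) / 24 = 2 clauses.
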

After the section header come the clauses. Whether the small or the fat format is used, the order of the fields is the same; only their positions and sizes differ.
Field          Position (small format)  Position (fat format)  Description
Flags          bytes 0-1                bytes 0-3              Described below.
TryOffset      bytes 2-3                bytes 4-7              Offset in bytes of try block from start of method body.
TryLength      byte 4                   bytes 8-11             Length of the try block in bytes
HandlerOffset  bytes 5-6                bytes 12-15            Offset in bytes of the handler block from start of method body
HandlerLength  byte 7                   bytes 16-19            Size of the handler code in bytes
ClassToken     bytes 8-11               bytes 20-23            The token of the exception class if this is a type-based exception handler
FilterOffset   bytes 8-11               bytes 20-23            Offset in method body for filter-based exception handler
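
The two clause layouts can be read into one common structure. The following is a minimal sketch with hypothetical names (ExceptionClause, ClauseReaderSketch); it assumes that the reader is already positioned at the first byte of a clause.

```csharp
using System;
using System.IO;

// Hypothetical structure holding a clause independently of its format.
public struct ExceptionClause
{
    public uint Flags;
    public uint TryOffset;
    public uint TryLength;
    public uint HandlerOffset;
    public uint HandlerLength;
    // ClassToken or FilterOffset, depending on Flags.
    public uint ClassTokenOrFilterOffset;
}

public static class ClauseReaderSketch
{
    // Reads one clause: 24 bytes in fat format, 12 bytes in small format.
    // BinaryReader always reads multi-byte values as little-endian.
    public static ExceptionClause Read(BinaryReader reader, bool fatFormat)
    {
        ExceptionClause clause = new ExceptionClause();

        if (fatFormat)
        {
            clause.Flags = reader.ReadUInt32();
            clause.TryOffset = reader.ReadUInt32();
            clause.TryLength = reader.ReadUInt32();
            clause.HandlerOffset = reader.ReadUInt32();
            clause.HandlerLength = reader.ReadUInt32();
        }
        else
        {
            clause.Flags = reader.ReadUInt16();
            clause.TryOffset = reader.ReadUInt16();
            clause.TryLength = reader.ReadByte();
            clause.HandlerOffset = reader.ReadUInt16();
            clause.HandlerLength = reader.ReadByte();
        }

        // The last field is 4 bytes long in both formats.
        clause.ClassTokenOrFilterOffset = reader.ReadUInt32();
        return clause;
    }
}
```

As a usage example, the 12 bytes 00 00 05 00 08 0D 00 0C 13 00 00 01 read as a small clause give flags 0, try offset 5, try length 8, handler offset 13, handler length 12 and the class token 0x01000013. These bytes were put together by hand from the values of the first clause in the demonstration output.
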

The following flags can be set:
  • COR_ILEXCEPTION_CLAUSE_EXCEPTION (0x0000): indicates that the clause is a type-based exception handler
  • COR_ILEXCEPTION_CLAUSE_FILTER (0x0001): an exception filter and handler clause
  • COR_ILEXCEPTION_CLAUSE_FINALLY (0x0002): Finally clause
  • COR_ILEXCEPTION_CLAUSE_FAULT (0x0004): Fault clause (finally that is called on exception only)
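
The flag value decides how the last 4 bytes of a clause should be interpreted. A small sketch of that decision, using a hypothetical helper (ClauseFlagsSketch.Describe) that is not part of the sample program:

```csharp
// Hypothetical helper, introduced only for this sketch.
public static class ClauseFlagsSketch
{
    // Describes a clause based on its flags; lastField is the value of the
    // last 4 bytes of the clause (ClassToken or FilterOffset).
    public static string Describe(uint flags, uint lastField)
    {
        switch (flags)
        {
            case 0x0000:
                // Type-based handler: the last field is the class token.
                return "catch, class token 0x" + lastField.ToString("x8");
            case 0x0001:
                // Filter: the last field is the offset of the filter code.
                return "filter at offset " + lastField.ToString();
            case 0x0002:
                return "finally";
            case 0x0004:
                return "fault";
            default:
                return "unknown flags 0x" + flags.ToString("x4");
        }
    }
}
```

For finally and fault clauses the last field carries no information, but it still has to be read (or skipped) to reach the next clause.
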

That's all. Quite a lot of information. But luckily, it's still quite simple (compared to signatures for example ;-)).

Demonstration


I've extended the sample program that I posted here. Now it's also able to read methods with a fat header.


using System;
using System.Collections.Generic;
using System.Text;

using System.IO;
using System.Runtime.InteropServices;

namespace Blog4
{
    [Flags()]
    public enum ILMethodHeader : byte
    {
        CorILMethod_FatFormat = 0x3,
        CorILMethod_TinyFormat = 0x2,
        CorILMethod_MoreSects = 0x8,
        CorILMethod_InitLocals = 0x10
    }

    
    // codes that identify attributes
    public enum CorILMethodSect : uint
    {
        CorILMethod_Sect_Reserved = 0,
        CorILMethod_Sect_EHTable = 1,
        CorILMethod_Sect_OptILTable = 2,

        // The mask for decoding the type code
        CorILMethod_Sect_KindMask = 0x3F,
        // fat format
        CorILMethod_Sect_FatFormat = 0x40,
        // there is another attribute after this one
        CorILMethod_Sect_MoreSects = 0x80
    }

    
    // definitions for the Flags field below (for both big and small)
    public enum CorExceptionFlag : uint
    {
        // This is a typed handler
        COR_ILEXCEPTION_CLAUSE_NONE,
        // Deprecated
        COR_ILEXCEPTION_CLAUSE_OFFSETLEN = 0x0000,
        // Deprecated
        COR_ILEXCEPTION_CLAUSE_DEPRECATED = 0x0000,
        // If this bit is on, then this EH entry is for a filter
        COR_ILEXCEPTION_CLAUSE_FILTER = 0x0001,
        // This clause is a finally clause
        COR_ILEXCEPTION_CLAUSE_FINALLY = 0x0002,
        // Fault clause (finally that is called on exception only)
        COR_ILEXCEPTION_CLAUSE_FAULT = 0x0004,
        // duplicated clause..  this clause was duplicated down to a funclet
        //which was pulled out of line
        COR_ILEXCEPTION_CLAUSE_DUPLICATED = 0x0008
    }

    
    public class Program
    {
        public readonly static Guid IID_IMetaDataImport =
            new Guid("7DAC8207-D3AE-4c75-9B67-92801A497D44");

        private static uint virtualAddress = 0;
        private static uint VirtualAddress
        {
            get
            {
                return virtualAddress;
            }

            set
            {
                virtualAddress = value;
            }
        }

        private static uint pointerToRawData = 0;
        private static uint PointerToRawData
        {
            get
            {
                return pointerToRawData;
            }

            set
            {
                pointerToRawData = value;
            }
        }

        
        static void Main(string[] args)
        {
            Console.Write("Please enter the full path of the assembly: ");
            //Read the path of the assembly from the console.
            string assemblyPath = Console.ReadLine();

            Console.Write("Fully qualified name of the class : ");
            //Read the name of the class from the console.
            string className = Console.ReadLine();

            Console.Write("Name of the method: ");
            //Read the name of the method from the console.
            string methodName = Console.ReadLine();

            //Open the assembly with Unmanaged Metadata API.
            //(The interop declarations of IMetaDataDispenserEx,
            //MetaDataDispenserEx and IMetaDataImport are not part of this listing.)
            IMetaDataDispenserEx dispenser = new MetaDataDispenserEx();
            IMetaDataImport import = null;
            object rawScope = null;
            Guid metaDataImportGuid = IID_IMetaDataImport;

            dispenser.OpenScope(assemblyPath, 0, ref metaDataImportGuid,
                out rawScope);
            import = (IMetaDataImport)rawScope;

            //Search for the desired class.
            uint typeDefToken = 0;
            import.FindTypeDefByName(className, 0, out typeDefToken);

            //Search for the desired method.
            uint methodDefToken = 0;
            import.FindMethod(typeDefToken, methodName, null, 0,
                out methodDefToken);

            char[] methodDefName = new char[1024];
            uint methodDefCount = 0;
            uint attributes = 0;
            IntPtr signature;
            uint signatureCount = 0;
            uint rva = 0;
            uint implementationFlags = 0;

            //Get the properties of the method (including its RVA).
            import.GetMethodProps(methodDefToken, out typeDefToken,
                    methodDefName, Convert.ToUInt32(methodDefName.Length),
                    out methodDefCount, out attributes, out signature,
                    out signatureCount, out rva, out implementationFlags);

            FileStream fileStream = new FileStream(assemblyPath, FileMode.Open,
                FileAccess.Read, FileShare.Read);
            BinaryReader binaryReader = new BinaryReader(fileStream);

            //Read the header of the file.
            ReadHeader(binaryReader);
            //Read the method's IL code.
            ReadILCode(binaryReader, rva);

            binaryReader.Close();

            Console.ReadLine();
        }

        
        private static void ReadHeader(BinaryReader binaryReader)
        {
            //Move to the beginning of the assembly.
            binaryReader.BaseStream.Position = 0;
            //Read the MS-DOS header.
            byte[] dosHeader = binaryReader.ReadBytes(128);

            //Read the lfanew value.
            uint lfanew = BitConverter.ToUInt32(dosHeader, 0x3c);

            //Move to the section headers which starts at the following address:
            //lfanew + PE Signature length (24 bytes) + PE Optional Header
            //(224 bytes).
            binaryReader.BaseStream.Seek(lfanew + 24 + 224, SeekOrigin.Begin);
            bool textSectionFound = false;

            //Check all the section headers until we find the .text.
            do
            {
                //Read the first 8 bytes from the section header which is the name.
                byte[] sectionNameBytes = binaryReader.ReadBytes(8);
                string sectionName = UTF8Encoding.UTF8.GetString(sectionNameBytes);
                textSectionFound = (sectionName == ".text\0\0\0");

                //When we have found the .text section then store the 
                //Pointer to Raw Data and the Virtual Address values.
                //Note: the value read first here is the section's RVA and the
                //second one is its file offset, so the two property names are
                //swapped compared to the PE field names. ReadILCode uses them
                //in the same swapped roles, so the result is still correct.
                if (textSectionFound)
                {
                    binaryReader.ReadBytes(4);
                    PointerToRawData = binaryReader.ReadUInt32();
                    binaryReader.ReadBytes(4);
                    VirtualAddress = binaryReader.ReadUInt32();
                }
                else
                {
                    //Otherwise skip the rest of the section header and move to the
                    //next one.
                    binaryReader.ReadBytes(32);
                }
            }
            while (!textSectionFound);
        }

        
        private static void ReadMethodDataSections(BinaryReader binaryReader)
        {
            bool moreSections = true;
            //Store the enum values in byte variables 
            //(just to make handling them easier :-)).
            byte moreSectionsValue =
                (byte)CorILMethodSect.CorILMethod_Sect_MoreSects;
            byte fatFormatValue = (byte)CorILMethodSect.CorILMethod_Sect_FatFormat;
            byte exceptionHandlingTableValue =
                (byte)CorILMethodSect.CorILMethod_Sect_EHTable;

            //Let's iterate over all the sections...
            while (moreSections)
            {
                //Each section should start on a 4 byte boundary
                //so let's read from the stream until we find the next boundary.
                int bytesToRead =
                    Convert.ToInt32(binaryReader.BaseStream.Position % 4);

                if (bytesToRead > 0)
                {
                    binaryReader.ReadBytes(4 - bytesToRead);
                }

                //The first byte of the data section is a flag.
                //This tells the type of the data section.
                byte kind = binaryReader.ReadByte();

                //I have never seen anything else than an exception handling section...
                //According to the documentation "Currently, the method data sections
                //are only used for exception tables."
                if ((kind & exceptionHandlingTableValue) != exceptionHandlingTableValue)
                {
                    throw new NotImplementedException(
                        "The method data section is not an exception handling table.");
                }

                //Check whether more sections follow after this one.
                moreSections = ((kind & moreSectionsValue) == moreSectionsValue);
                int dataSize = 0;
                int clauseNumber = 0;
                //Check whether the section has a Fat format.
                bool fatFormat = ((kind & fatFormatValue) == fatFormatValue);

                if (fatFormat)
                {
                    //In case of a Fat format, after the kind comes the data size
                    //which is stored on 3 bytes.
                    dataSize = binaryReader.ReadByte() + binaryReader.ReadByte() * 0x100
                        + binaryReader.ReadByte() * 0x10000;
                    //The data size contains the size of all the clauses
                    //(each one 24 bytes) + 4 (the header's size).
                    //It's enough to divide the number by 24 (integer division
                    //drops the extra 4) and we'll get the number of clauses
                    //that come next.
                    clauseNumber = dataSize / 24;
                }
                else
                {
                    //In case of a Small format, after the kind comes the data size
                    //which is stored on 1 byte.
                    dataSize = binaryReader.ReadByte();
                    //Read padding.
                    binaryReader.ReadBytes(2);
                    //The data size contains the size of all the clauses
                    //(each one 12 bytes) + 4 (the header's size).
                    //It's enough to divide the number by 12 (integer division
                    //drops the extra 4) and we'll get the number of clauses
                    //that come next.
                    clauseNumber = dataSize / 12;
                }

                if (fatFormat)
                {
                    Console.WriteLine(
                        "The exception handling section has fat format.");
                }
                else
                {
                    Console.WriteLine(
                        "The exception handling section has small format.");
                }

                //Let's read the clauses...
                for (int clauseIndex = 0; clauseIndex < clauseNumber; clauseIndex++)
                {
                    CorExceptionFlag flags;
                    uint tryOffset;
                    uint tryLength;
                    uint handlerOffset;
                    uint handlerLength;

                    //The structure of the clauses is the same in both Fat and
                    //Small format, only the sizes are different.
                    if (fatFormat)
                    {
                        flags = (CorExceptionFlag)binaryReader.ReadUInt32();
                        tryOffset = binaryReader.ReadUInt32();
                        tryLength = binaryReader.ReadUInt32();
                        handlerOffset = binaryReader.ReadUInt32();
                        handlerLength = binaryReader.ReadUInt32();
                    }
                    else
                    {
                        flags = (CorExceptionFlag)binaryReader.ReadUInt16();
                        tryOffset = binaryReader.ReadUInt16();
                        tryLength = binaryReader.ReadByte();
                        handlerOffset = binaryReader.ReadUInt16();
                        handlerLength = binaryReader.ReadByte();
                    }

                    Console.WriteLine("{0}. section:", clauseIndex + 1);
                    Console.WriteLine("Flags: " + Convert.ToString(flags));
                    Console.WriteLine("Try offset: " + Convert.ToString(tryOffset));
                    Console.WriteLine("Try length: " + Convert.ToString(tryLength));
                    Console.WriteLine("Handler offset: " + Convert.ToString(handlerOffset));
                    Console.WriteLine("Handler length: " + Convert.ToString(handlerLength));

                    if (flags == CorExceptionFlag.COR_ILEXCEPTION_CLAUSE_NONE)
                    {
                        //If the clause is a typed exception clause then read the 
                        //token of the exception that will be handled by it.
                        uint classToken = binaryReader.ReadUInt32();
                        Console.WriteLine("Exception class token: 0x{0}",
                            classToken.ToString("x").PadLeft(8, '0'));
                    }
                    else if (flags == CorExceptionFlag.COR_ILEXCEPTION_CLAUSE_FILTER)
                    {
                        //If the clause is a filter clause then read the offset 
                        //of the filter. (e.g.: VB.NET can generate such filter for the
                        //"Catch exc As Exception When value = True" code).
                        uint filterOffset = binaryReader.ReadUInt32();
                        Console.WriteLine("Filter offset: {0}", filterOffset);
                    }
                    else
                    {
                        //The last 4 bytes are not relevant but we must read them to 
                        //position to the next clause (or the next section's header).
                        binaryReader.ReadUInt32();
                    }

                    Console.WriteLine("\n");
                }
            }
        }

        
        private static void ReadILCode(BinaryReader binaryReader, uint rva)
        {
            //Move to the beginning of the IL code.
            //(ReadHeader stores the section's RVA in PointerToRawData and its
            //file offset in VirtualAddress, so this expression means:
            //rva - section RVA + section file offset.)
            binaryReader.BaseStream.Position = rva - PointerToRawData
                + VirtualAddress;
            //Read the method header.
            byte methodHeader = binaryReader.ReadByte();
            int methodLength = 0;
            //Decide whether this is a Fat header.
            bool isFatHeader = ((methodHeader &
                (byte)ILMethodHeader.CorILMethod_FatFormat) ==
                (byte)ILMethodHeader.CorILMethod_FatFormat);
            bool hasMoreSects = false;

            //If the header is using Fat format then read the extra information.
            if (isFatHeader)
            {
                //The first two bytes together contain the flags (0-11 bits) and
                //the size of the header (12-15 bits).
                byte methodHeaderByte2 = binaryReader.ReadByte();
                //Check whether more sections can be found after the IL code.
                hasMoreSects = ((methodHeader &
                    (byte)ILMethodHeader.CorILMethod_MoreSects) ==
                    (byte)ILMethodHeader.CorILMethod_MoreSects);

                //Calculate the size of the header.
                byte headerSize = Convert.ToByte((methodHeaderByte2 >> 4) * 4);
                //After the flags and size comes the maximum stack size (2 bytes).
                ushort maxStack = binaryReader.ReadUInt16();
                //Then comes the IL method body's length (4 bytes).
                methodLength = binaryReader.ReadInt32();
                //The last 4 bytes of the header is the LocalVarSig's token.
                uint localVarSigToken = binaryReader.ReadUInt32();

                Console.WriteLine("Fat method header:");
                Console.WriteLine("Has more sections: " +
                    Convert.ToString(hasMoreSects));
                Console.WriteLine("Header size: " + Convert.ToString(headerSize));
                Console.WriteLine("Maximum stack size: " +
                    Convert.ToString(maxStack));
                Console.WriteLine("LocalVarSig token: 0x" +
                    localVarSigToken.ToString("x").PadLeft(8, '0'));
            }
            else
            {
                Console.WriteLine("Tiny method header:");
                //Calculate the IL method body's length. There's no other 
                //information in a tiny header.
                methodLength = methodHeader >> 2;
            }

            Console.WriteLine("Method length: {0}", methodLength);
            Console.Write("Method's IL code: ");

            byte[] methodCode = new byte[methodLength];
            int methodCodeIndex = 0;

            //Read the method's IL code until the end and write it to the console.
            while (methodCodeIndex < methodLength)
            {
                methodCodeIndex++;
                Console.Write(string.Format("{0} ",
                    binaryReader.ReadByte().ToString("X").PadLeft(2, '0')));
            }

            //If the method's header is Fat and there are more data sections after
            //the IL code then read those also.
            if (hasMoreSects)
            {
                Console.WriteLine("\n\nData sections:\n");
                ReadMethodDataSections(binaryReader);
            }
        }
    }
}


I get the following result if I run the program on a method with tiny header:
C:\Projects\Blog4\bin\Debug>Blog4.exe
Please enter the full path of the assembly: c:\projects\blogtest\bin\debug\BlogTest.dll
Fully qualified name of the class : BlogTest.Class
Name of the method: TinyFormatMethod
Tiny method header:
Method length: 13
Method's IL code: 00 72 01 00 00 70 28 10 00 00 0A 00 2A


And this for a fat header:
C:\Projects\Blog4\bin\Debug>Blog4.exe
Please enter the full path of the assembly: c:\projects\blogtest\bin\debug\BlogTest.dll
Fully qualified name of the class : BlogTest.Class
Name of the method: FatFormatMethod
Fat method header:
Has more sections: True
Header size: 12
Maximum stack size: 2
LocalVarSig token: 0x11000001
Method length: 49
Method's IL code: 00 17 0A 19 0B 00 06 07 58 0A 00 DE 0C 0C 00
08 28 11 00 00 0A 00 00 DE 00 00 DE 0E 00 72 01 00 00 70 28 10
00 00 0A 00 00 DC 00 06 0D 2B 00 09 2A

Data sections:

The exception handling section has small format.
1. section:
Flags: COR_ILEXCEPTION_CLAUSE_OFFSETLEN
Try offset: 5
Try length: 8
Handler offset: 13
Handler length: 12
Exception class token: 0x01000013

2. section:
Flags: COR_ILEXCEPTION_CLAUSE_FINALLY
Try offset: 5
Try length: 23
Handler offset: 28
Handler length: 14


Verification


Open ildasm (or rather DILE ;-)) and check the method's code. It should be the same as what we got.

Sunday, December 11, 2005

Final DILE v0.2

So, here is the final v0.2 release of DILE. I have fixed several debugging related bugs and I have improved the UI also. I hope all these changes will make it easier and more comfortable to use DILE.

The zip file: dile_v0_2.zip
readme.txt: readme.txt
license.txt: license.txt
change_log.txt: change_log.txt

List of the most important changes:
  • compiled using .NET Framework RTM (v2.0.50727)
  • modules panel
  • threads panel
  • MDA (Managed Debug Assistant) support
  • debuggee thrown exceptions can be skipped
  • debuggee can be automatically paused on chosen events (e.g.: LoadClass, LoadModule, CreateThread)
  • decimal or hexadecimal number display
  • plenty of new settings (configurable default directories, shortcuts etc.)
  • new project settings (list of exceptions to skip)
  • recent projects and recent assemblies list
  • projects and assemblies can be loaded by drag&drop
  • searching for project items in the Quick Search Panel can be aborted by pressing Escape
  • windows list
  • VS-like window-selecting by pressing Ctrl + Tab


To find bugs, I have tried to debug DILE, Reflector, ILMerge, MDbg, and a few v1.1 console and web applications. I really hope that I was able to make the debugging more stable.
As always, if you find any bug, please let me know.
Oh, and this just reminds me that now MDbg is also able to debug on IL level, so check it out also.

And the screenshots of the new features:

Debugging related settings

Project settings (list of exceptions that should be skipped)

Debug events when the debuggee can be automatically paused


An MDA (Managed Debug Assistant) notification

The new modules and threads panels

VS-like window-selecting (appears when Ctrl + Tab pressed)