Saturday, February 05, 2005

RVA, part 1

When I started using the Unmanaged Metadata API, I spent a long time searching for a method that would give me the IL code of a method, the way MethodBody.GetILAsByteArray() (new in the .NET Framework 2.0) does. It turned out to be a bit more difficult than I expected, and I had to do a lot of research to find what I needed...
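For comparison, this is roughly what the managed route mentioned above looks like. It is only a sketch: the `ManagedIlDump` class and its `Hello` and `GetIl` members are names made up for this illustration, while `GetMethodBody()` and `GetILAsByteArray()` are the actual reflection calls.

```csharp
using System;
using System.Reflection;

public static class ManagedIlDump
{
    // A trivial method whose IL we want to look at (hypothetical example).
    public static void Hello()
    {
        Console.WriteLine("Hello");
    }

    // Returns the raw IL bytes of a method, or an empty array
    // if the method has no body (abstract, extern, ...).
    public static byte[] GetIl(MethodInfo method)
    {
        MethodBody body = method.GetMethodBody();
        return body == null ? new byte[0] : body.GetILAsByteArray();
    }

    public static void Main()
    {
        MethodInfo method = typeof(ManagedIlDump).GetMethod("Hello");

        // Write each IL byte as a two-digit hexadecimal number.
        foreach (byte b in GetIl(method))
        {
            Console.Write("{0:X2} ", b);
        }
        Console.WriteLine();
    }
}
```

The array holds only the IL code, not the method header, so it is the same data the unmanaged approach in this post has to dig out by hand.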

Theory


Among other things, IMetaDataImport.GetMethodProps() returns an unsigned integer: the method's RVA. RVA stands for Relative Virtual Address. It tells where the method's body is located in memory once the assembly is loaded. The value is relative, which means that it has to be added to the assembly's base address to get the real address of the method's body.
Once we have this address, we can start reading the method's body, which always begins with a header (Fat or Tiny) followed by the IL code.
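The address arithmetic can be sketched in a few lines. This is an illustration only: the `RvaMath` class and its `Resolve` method are hypothetical names, and the base address 0x400000 is an assumed value; 0x2157 is the RVA that ildasm reports for the Test method later in this post.

```csharp
using System;

public static class RvaMath
{
    // Adds a relative virtual address to the base address of a loaded module.
    // Hypothetical helper, not part of any API.
    public static IntPtr Resolve(IntPtr baseAddress, uint rva)
    {
        // Go through Int64 so the addition also works with 64-bit addresses.
        return new IntPtr(baseAddress.ToInt64() + rva);
    }

    public static void Main()
    {
        IntPtr baseAddress = new IntPtr(0x400000); // assumed base address
        IntPtr body = Resolve(baseAddress, 0x2157);

        Console.WriteLine("0x{0:X}", body.ToInt64()); // prints 0x402157
    }
}
```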

Demonstration


The following steps are necessary to get the IL code of a method:
1. Load the assembly into memory (the Unmanaged Metadata API _will not_ load it!).
2. Get the base address of the loaded assembly.
3. Open the assembly with the Unmanaged Metadata API (through the IMetaDataImport interface).
4. Get the token of the TypeDef.
5. Get the token of the MethodDef.
6. Call IMetaDataImport.GetMethodProps() to get the RVA of the method.
7. Read the first byte found at base address + RVA.
8. If the method has a tiny header, this byte also contains the length of the method's IL code. If it has a fat header, a few more bytes have to be read (I'll discuss this in another post later).
9. Read the method's IL code.
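Steps 7 and 8 come down to a little bit twiddling on the first header byte. The sketch below shows the idea; the `MethodHeader` class and its members are hypothetical names, and the value 0x36 is what the header byte of a 13-byte tiny method would be, computed by hand rather than read from an assembly.

```csharp
using System;

public static class MethodHeader
{
    // The two lowest bits of the first header byte select the format:
    // binary 10 means tiny, binary 11 means fat.
    public static bool IsTiny(byte first)
    {
        return (first & 0x3) == 0x2;
    }

    public static bool IsFat(byte first)
    {
        return (first & 0x3) == 0x3;
    }

    // In a tiny header the upper six bits hold the IL code size in bytes.
    public static int TinyCodeSize(byte first)
    {
        if (!IsTiny(first))
        {
            throw new ArgumentException("Not a tiny header.", "first");
        }
        return first >> 2;
    }

    public static void Main()
    {
        byte header = 0x36; // (13 << 2) | 0x2, computed by hand

        Console.WriteLine(IsTiny(header));       // prints True
        Console.WriteLine(TinyCodeSize(header)); // prints 13
    }
}
```

Since only six bits are available for the size, a tiny header can describe at most 63 bytes of IL.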

I'll give a little sample to demonstrate how this works. :-)
Let's create a DLL that contains one class with a simple method. The method should have a tiny header. Here are the conditions to achieve this:
- No local variables are allowed
- No exceptions (no exception handling, to be exact)
- No extra data sections
- The operand stack must be no bigger than 8 entries
- The IL code must be shorter than 64 bytes
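These conditions can be checked from managed code as well. The following is a rough sketch, not a definitive test: `TinyHeaderCheck` and `CanUseTinyHeader` are hypothetical names, the sketch treats "no extra data sections" as "no exception handling clauses" (an assumption), and it adds the 64-byte code size limit that follows from the six size bits of a tiny header. Whether a compiler actually emits a tiny header is still the compiler's decision.

```csharp
using System;
using System.Reflection;

public static class TinyHeaderCheck
{
    // Returns true if a method with these properties qualifies for a tiny header.
    public static bool CanUseTinyHeader(int codeSize, int maxStack,
        int localCount, int exceptionClauseCount)
    {
        return codeSize < 64
            && maxStack <= 8
            && localCount == 0
            && exceptionClauseCount == 0;
    }

    // The same check, fed from the reflection API of .NET 2.0.
    public static bool CanUseTinyHeader(MethodBody body)
    {
        return CanUseTinyHeader(
            body.GetILAsByteArray().Length,
            body.MaxStackSize,
            body.LocalVariables.Count,
            body.ExceptionHandlingClauses.Count);
    }

    public static void Main()
    {
        // 13 bytes of IL, maxstack 8, no locals, no exception handling.
        Console.WriteLine(CanUseTinyHeader(13, 8, 0, 0)); // prints True

        // A single local variable is enough to force a fat header.
        Console.WriteLine(CanUseTinyHeader(13, 8, 1, 0)); // prints False
    }
}
```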

My sample looks like this:

using System;
using System.Collections.Generic;
using System.Text;

namespace TestAssembly
{
    public class Class1
    {
        public Class1()
        {
        }

        public void Test()
        {
            Console.WriteLine("This is the test assembly.");
        }
    }
}



Now create a program that reads an assembly's path, a class name and a method name from the console. It then loads the given assembly into memory, reads the IL code of the given method and writes it to the console as hexadecimal numbers.


using System;
using System.Collections.Generic;
using System.Text;

using System.Diagnostics;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;

namespace Blog2
{
    public class Program
    {
        public readonly static Guid IID_IMetaDataImport = new Guid("7DAC8207-D3AE-4c75-9B67-92801A497D44");

        static void Main(string[] args)
        {
            Console.Write("Please enter the full path of the assembly: ");
            //Read the path of the assembly from the console.
            string assemblyPath = Console.ReadLine();

            Console.Write("Fully qualified name of the class : ");
            //Read the name of the class from the console.
            string className = Console.ReadLine();

            Console.Write("Name of the method: ");
            //Read the name of the method from the console.
            string methodName = Console.ReadLine();

            //Load the assembly to the memory.
            Assembly assembly = Assembly.LoadFrom(assemblyPath);

            //This will point to the beginning of the assembly in the memory.
            IntPtr baseAddress = new IntPtr();
            bool found = false;
            string fileName = Path.GetFileNameWithoutExtension(assemblyPath);

            int index = 0;
            //Search the loaded process modules for the loaded assembly.
            ProcessModuleCollection modules = Process.GetCurrentProcess().Modules;

            while (!found && index < modules.Count)
            {
                ProcessModule module = modules[index++];

                //Windows paths are not case sensitive, so ignore the case.
                if (string.Equals(module.FileName, assemblyPath, StringComparison.OrdinalIgnoreCase))
                {
                    //If the loaded assembly has been found, store its base address.
                    baseAddress = module.BaseAddress;
                    found = true;
                }
            }

            //Without a base address there is nothing to read from.
            if (!found)
            {
                Console.WriteLine("The assembly was not found among the process modules.");
                return;
            }

            //Open the assembly with Unmanaged Metadata API.
            IMetaDataDispenserEx dispenser = new MetaDataDispenserEx();
            IMetaDataImport import = null;
            object rawScope = null;

            Guid metaDataImportGuid = IID_IMetaDataImport;

            dispenser.OpenScope(assemblyPath, 0, ref metaDataImportGuid, out rawScope);
            import = (IMetaDataImport)rawScope;

            //Search for the desired class.
            uint typeDefToken = 0;
            import.FindTypeDefByName(className, 0, out typeDefToken);

            //Search for the desired method.
            uint methodDefToken = 0;
            import.FindMethod(typeDefToken, methodName, null, 0, out methodDefToken);

            char[] methodDefName = new char[1024];
            uint methodDefCount = 0;
            uint attributes = 0;
            IntPtr signature;
            uint signatureCount = 0;
            uint rva = 0;
            uint implementationFlags = 0;

            //Get the properties of the method (including its RVA).
            import.GetMethodProps(methodDefToken, out typeDefToken,
                methodDefName, Convert.ToUInt32(methodDefName.Length),
                out methodDefCount, out attributes, out signature,
                out signatureCount, out rva, out implementationFlags);

            int methodIndex = Convert.ToInt32(rva);
            //Read the first byte of the method. This will be the header.
            byte methodHeader = Marshal.ReadByte(baseAddress, methodIndex);

            //If the 2 right-most bits are 10 then this is a tiny header.
            if ((methodHeader & 0x3) == 0x2)
            {
                //The method's length is stored in the 6 left-most bits.
                int methodEnd = (methodHeader >> 2) + methodIndex + 1;
                methodIndex++;

                //Read the method's IL code until the end and write it to the console.
                while (methodIndex < methodEnd)
                {
                    Console.Write(string.Format("{0} ", Marshal.ReadByte(baseAddress, methodIndex++).ToString("X").PadLeft(2, '0')));
                }
            }

            Console.ReadLine();
        }
    }
}



The output for me looks like this:
C:\Projects\Blog2\bin\Debug>Blog2.exe
Please enter the full path of the assembly: c:\Projects\Blog2\TestAssembly\bin\Debug\TestAssembly.dll
Fully qualified name of the class : TestAssembly.Class1
Name of the method: Test
00 72 15 00 00 70 28 15 00 00 0A 00 2A

Verification


Well, all this looks very nice but how do we know that it's really correct?
Use ildasm to verify the output.
Start ildasm, open the TestAssembly.dll, turn on the Show bytes and the Show token values options (both can be found in the View menu) and open the Test method. I get the following:
.method /*06000009*/ public hidebysig instance void
        Test() cil managed
// SIG: 20 00 01
{
  // Method begins at RVA 0x2157
  // Code size 13 (0xd)
  .maxstack 8
  IL_0000: /* 00 | */ nop
  IL_0001: /* 72 | (70)000015 */ ldstr "This is the test assembly." /* 70000015 */
  IL_0006: /* 28 | (0A)000015 */ call void [mscorlib/*23000001*/]System.Console/*01000018*/::WriteLine(string) /* 0A000015 */
  IL_000b: /* 00 | */ nop
  IL_000c: /* 2A | */ ret
} // end of method Class1::Test

So:
00
72 70 00 00 15
28 0A 00 00 15
00
2A

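These are the same bytes the sample printed. The only difference is the byte order of the tokens: ildasm displays them as (70)000015 and (0A)000015, while in the IL stream they are stored in little-endian order, so the sample prints them as 15 00 00 70 and 15 00 00 0A.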
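To make the comparison mechanical, the tokens can be reassembled from the raw bytes. The sketch below does this for the 13 bytes printed by the sample program; the `TokenReader` class and its `ReadToken` method are hypothetical names used only for this illustration.

```csharp
using System;

public static class TokenReader
{
    // Reads a 4-byte metadata token stored in little-endian order
    // (least significant byte first).
    public static uint ReadToken(byte[] il, int offset)
    {
        return (uint)il[offset]
            | ((uint)il[offset + 1] << 8)
            | ((uint)il[offset + 2] << 16)
            | ((uint)il[offset + 3] << 24);
    }

    public static void Main()
    {
        // The bytes the sample program printed for Class1.Test.
        byte[] il = { 0x00, 0x72, 0x15, 0x00, 0x00, 0x70,
                      0x28, 0x15, 0x00, 0x00, 0x0A, 0x00, 0x2A };

        // The ldstr operand starts at offset 2, the call operand at offset 7.
        Console.WriteLine("{0:X8}", ReadToken(il, 2)); // prints 70000015
        Console.WriteLine("{0:X8}", ReadToken(il, 7)); // prints 0A000015
    }
}
```

Both values match the tokens ildasm shows in the trailing comments of the ldstr and call lines.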

P.S.: An observation: if Assembly.ReflectionOnlyLoadFrom() is used instead of Assembly.LoadFrom(), the assembly cannot be found among the process modules.
P.S. 2: Thanks to Carlos Aguilar Mares for his Code Colorizer Tool. It's really useful...

Update
I have removed unnecessary line breaks from the code and fixed a typo...
