RVA, part 1
When I started using the Unmanaged Metadata API, I spent a long time looking for a method that returns a method's IL code, like MethodBody.GetILAsByteArray(), which is new in .NET Framework 2.0. It turned out to be more difficult than I expected, and I had to do a lot of research to find what I needed...
IMetaDataImport.GetMethodProps() returns, among other things, an unsigned integer: the method's RVA. RVA stands for Relative Virtual Address. It tells where the method body is placed in memory once the assembly is loaded. The value is relative, so it has to be added to the base address of the loaded assembly to get the real address of the method body.
With that address we can start reading the method body, which always starts with a header (Fat or Tiny) followed by the IL code.
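For comparison, the managed route mentioned above is short. This is only a minimal sketch, assuming the type is already loaded for reflection; ManagedIlReader, GetIl and ToHex are hypothetical names that are not part of the sample in this post.

```csharp
using System;
using System.Reflection;

public static class ManagedIlReader
{
    // Hypothetical helper: returns the IL bytes of a public method via reflection.
    // Type.GetMethod(string) throws AmbiguousMatchException for overloaded names.
    public static byte[] GetIl(Type type, string methodName)
    {
        MethodInfo method = type.GetMethod(methodName);
        if (method == null)
            throw new ArgumentException("Method not found: " + methodName);

        // GetMethodBody() returns null when the method has no IL body (for example abstract methods).
        MethodBody body = method.GetMethodBody();
        if (body == null)
            throw new InvalidOperationException("The method has no IL body.");

        return body.GetILAsByteArray();
    }

    // Formats the bytes as two-digit hexadecimal numbers separated by spaces.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace('-', ' ');
    }
}
```

Calling ManagedIlReader.ToHex(ManagedIlReader.GetIl(someType, "Test")) gives a byte dump that can be compared with what the unmanaged approach produces.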
The following steps are necessary to get the IL code of a method:
1. Load the assembly into memory (the Unmanaged Metadata API _will not_ load it!).
2. Get the base address of the loaded assembly.
3. Open the assembly using Unmanaged Metadata API (using the IMetaDataImport interface).
4. Get the token of the TypeDef.
5. Get the token of the MethodDef.
6. Call the IMetaDataImport.GetMethodProps() method to get the RVA of the method.
7. Read the first byte of the method body, which is at base address + RVA.
8. If the method has a tiny header, this byte also contains the length of the method's IL code. If it has a fat header, a few more bytes have to be read (I'll discuss this in another post later).
9. Read the method's IL code.
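Steps 7 and 8 can be sketched on a plain byte array, without touching the loaded module. The fat header is left for a later post, so the fat branch below is only my reading of the layout in ECMA-335, Partition II (flags and header size in the first two bytes, then MaxStack, CodeSize and LocalVarSigTok). MethodHeaderInfo and MethodHeaderReader are hypothetical names used for illustration.

```csharp
using System;

// Hypothetical container for the decoded header fields.
public struct MethodHeaderInfo
{
    public bool IsFat;
    public int HeaderSize;          // bytes occupied by the header itself
    public uint CodeSize;           // bytes of IL that follow the header
    public int MaxStack;
    public uint LocalVarSigToken;   // 0 when the method has no locals
}

public static class MethodHeaderReader
{
    public static MethodHeaderInfo Read(byte[] image, int offset)
    {
        MethodHeaderInfo info = new MethodHeaderInfo();
        byte first = image[offset];

        if ((first & 0x3) == 0x2)
        {
            // Tiny header: one byte, code size in the upper 6 bits, max stack is implicitly 8.
            info.IsFat = false;
            info.HeaderSize = 1;
            info.CodeSize = (uint)(first >> 2);
            info.MaxStack = 8;
            return info;
        }

        if ((first & 0x3) == 0x3)
        {
            // Fat header (assumed layout): low 12 bits are flags, high 4 bits are the size in 4-byte units.
            int flagsAndSize = image[offset] | (image[offset + 1] << 8);
            info.IsFat = true;
            info.HeaderSize = (flagsAndSize >> 12) * 4;
            info.MaxStack = image[offset + 2] | (image[offset + 3] << 8);
            info.CodeSize = ReadUInt32(image, offset + 4);
            info.LocalVarSigToken = ReadUInt32(image, offset + 8);
            return info;
        }

        throw new FormatException("Unknown method header format.");
    }

    // Reads a 32-bit little-endian value.
    private static uint ReadUInt32(byte[] data, int offset)
    {
        return (uint)data[offset]
             | ((uint)data[offset + 1] << 8)
             | ((uint)data[offset + 2] << 16)
             | ((uint)data[offset + 3] << 24);
    }
}
```

The IL code then starts at offset + HeaderSize and is CodeSize bytes long.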
I'll give a little sample to demonstrate how this works. :-)
Let's create a DLL which contains one class with a method. The method should have a tiny header. These are the conditions for a tiny header:
- No local variables are allowed
- No exceptions (no exception handling to be exact)
- No extra data sections
- The operand stack must be no bigger than 8 entries
- The IL code must be shorter than 64 bytes (its length has to fit into 6 bits)
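These conditions can be written down as a small check, together with the way the single header byte is built. This is a sketch only: TinyHeader, Fits and Encode are hypothetical names, and the 64-byte limit on the code size is taken from the 6-bit length field of the tiny header described in ECMA-335.

```csharp
using System;

public static class TinyHeader
{
    // Returns true when a method body may use the one-byte tiny header.
    public static bool Fits(bool hasLocals, bool hasExceptionHandlers,
                            bool hasExtraSections, int maxStack, int codeSize)
    {
        return !hasLocals
            && !hasExceptionHandlers
            && !hasExtraSections
            && maxStack <= 8
            && codeSize < 64;   // the length must fit into 6 bits
    }

    // Builds the header byte: code size in the upper 6 bits, format bits 10 in the lower 2 bits.
    public static byte Encode(int codeSize)
    {
        if (codeSize < 0 || codeSize >= 64)
            throw new ArgumentOutOfRangeException("codeSize");
        return (byte)((codeSize << 2) | 0x2);
    }
}
```

For a method with 13 bytes of IL, Encode(13) gives 0x36, so the byte in front of such a method body should be 36.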
My sample looks like this:
using System;
using System.Collections.Generic;
using System.Text;

namespace TestAssembly
{
    public class Class1
    {
        public Class1()
        {
        }

        public void Test()
        {
            Console.WriteLine("This is the test assembly.");
        }
    }
}
Now create a program that reads an assembly's path, a class name and a method name from the console. It then loads the given assembly into memory, reads the IL code of the given method and writes it to the console as hexadecimal numbers.
using System;
using System.Diagnostics;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;

namespace Blog2
{
    //IMetaDataDispenserEx, MetaDataDispenserEx and IMetaDataImport are COM interop
    //declarations of the Unmanaged Metadata API; they have to be declared separately.
    public class Program
    {
        public readonly static Guid IID_IMetaDataImport = new Guid("7DAC8207-D3AE-4c75-9B67-92801A497D44");

        static void Main(string[] args)
        {
            Console.Write("Please enter the full path of the assembly: ");
            //Read the path of the assembly from the console and normalize it.
            string assemblyPath = Path.GetFullPath(Console.ReadLine());
            Console.Write("Fully qualified name of the class : ");
            //Read the name of the class from the console.
            string className = Console.ReadLine();
            Console.Write("Name of the method: ");
            //Read the name of the method from the console.
            string methodName = Console.ReadLine();

            //Load the assembly into memory.
            Assembly.LoadFrom(assemblyPath);

            //This will point to the beginning of the assembly in memory.
            IntPtr baseAddress = IntPtr.Zero;
            //Search the loaded process modules for the loaded assembly.
            //Paths are compared case-insensitively (c:\ and C:\ are the same file).
            foreach (ProcessModule module in Process.GetCurrentProcess().Modules)
            {
                if (string.Equals(module.FileName, assemblyPath, StringComparison.OrdinalIgnoreCase))
                {
                    //If the loaded assembly has been found, store its base address.
                    baseAddress = module.BaseAddress;
                    break;
                }
            }
            if (baseAddress == IntPtr.Zero)
            {
                //Without a base address there is nothing to read.
                Console.WriteLine("The assembly was not found among the process modules.");
                return;
            }

            //Open the assembly with Unmanaged Metadata API.
            IMetaDataDispenserEx dispenser = new MetaDataDispenserEx();
            object rawScope = null;
            Guid metaDataImportGuid = IID_IMetaDataImport;
            dispenser.OpenScope(assemblyPath, 0, ref metaDataImportGuid, out rawScope);
            IMetaDataImport import = (IMetaDataImport)rawScope;

            //Search for the desired class.
            uint typeDefToken = 0;
            import.FindTypeDefByName(className, 0, out typeDefToken);

            //Search for the desired method.
            uint methodDefToken = 0;
            import.FindMethod(typeDefToken, methodName, null, 0, out methodDefToken);

            char[] methodDefName = new char[1024];
            uint methodDefCount = 0;
            uint attributes = 0;
            IntPtr signature;
            uint signatureCount = 0;
            uint rva = 0;
            uint implementationFlags = 0;
            //Get the properties of the method (including its RVA).
            import.GetMethodProps(methodDefToken, out typeDefToken,
                methodDefName, Convert.ToUInt32(methodDefName.Length),
                out methodDefCount, out attributes, out signature,
                out signatureCount, out rva, out implementationFlags);

            int methodIndex = Convert.ToInt32(rva);
            //Read the first byte of the method. This will be the header.
            byte methodHeader = Marshal.ReadByte(baseAddress, methodIndex);

            //If the 2 right-most bits are 10 then this is a tiny header.
            if ((methodHeader & 0x3) == 0x2)
            {
                //The method's length is stored in the 6 left-most bits.
                int methodEnd = (methodHeader >> 2) + methodIndex + 1;
                methodIndex++;
                //Read the method's IL code until the end and write it to the console.
                while (methodIndex < methodEnd)
                {
                    Console.Write("{0:X2} ", Marshal.ReadByte(baseAddress, methodIndex++));
                }
            }
            else
            {
                //Fat headers are not handled by this sample.
                Console.WriteLine("The method does not have a tiny header.");
            }
            Console.ReadLine();
        }
    }
}
The output for me looks like this:
C:\Projects\Blog2\bin\Debug>Blog2.exe
Please enter the full path of the assembly: c:\Projects\Blog2\TestAssembly\bin\Debug\TestAssembly.dll
Fully qualified name of the class : TestAssembly.Class1
Name of the method: Test
00 72 15 00 00 70 28 15 00 00 0A 00 2A
Well, all this looks very nice but how do we know that it's really correct?
Use ildasm to verify the output.
Start ildasm, open the TestAssembly.dll, turn on the Show bytes and the Show token values options (both can be found in the View menu) and open the Test method. I get the following:
.method /*06000009*/ public hidebysig instance void
        Test() cil managed
// SIG: 20 00 01
{
  // Method begins at RVA 0x2157
  // Code size 13 (0xd)
  .maxstack 8
  IL_0000: /* 00 | */ nop
  IL_0001: /* 72 | (70)000015 */ ldstr "This is the test assembly." /* 70000015 */
  IL_0006: /* 28 | (0A)000015 */ call void [mscorlib/*23000001*/]System.Console/*01000018*/::WriteLine(string) /* 0A000015 */
  IL_000b: /* 00 | */ nop
  IL_000c: /* 2A | */ ret
} // end of method Class1::Test
So the bytes shown by ildasm are:
00
72 70 00 00 15
28 0A 00 00 15
00
2A
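This matches the output of the sample. Only the tokens look different: ildasm prints them with the most significant byte first (70 00 00 15), while in the method body they are stored in little-endian order (15 00 00 70).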
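The difference in the tokens is only a matter of byte order, which a few lines of code can confirm. A minimal sketch, where TokenDecoder and ReadToken are hypothetical names and the byte array is the output of the sample above:

```csharp
using System;

public static class TokenDecoder
{
    // Reads a 4-byte metadata token stored in little-endian order at the given offset.
    public static uint ReadToken(byte[] il, int offset)
    {
        return (uint)il[offset]
             | ((uint)il[offset + 1] << 8)
             | ((uint)il[offset + 2] << 16)
             | ((uint)il[offset + 3] << 24);
    }

    public static void PrintSampleTokens()
    {
        // The IL bytes printed by the sample program.
        byte[] il = { 0x00, 0x72, 0x15, 0x00, 0x00, 0x70, 0x28, 0x15, 0x00, 0x00, 0x0A, 0x00, 0x2A };

        // The operand of ldstr (0x72) starts at offset 2, the operand of call (0x28) at offset 7.
        Console.WriteLine(ReadToken(il, 2).ToString("X8")); // 70000015
        Console.WriteLine(ReadToken(il, 7).ToString("X8")); // 0A000015
    }
}
```

Both values are the tokens that ildasm shows for ldstr and call.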
P.S.: An observation: if Assembly.ReflectionOnlyLoadFrom() is used instead of Assembly.LoadFrom(), the assembly cannot be found among the process modules.
P.S. 2.: Thanks to Carlos Aguilar Mares for his Code Colorizer Tool. It's really useful...
Update
I have removed unnecessary line breaks from the code and fixed a typo...