Sunday, April 03, 2005

IL Instructions

This blog is becoming an Unmanaged Metadata API tutorial (which is not bad at all), so let's continue...

This time I'll write about IL instructions, since the previous post already explained how to get a method's IL code. The byte array we read from the assembly contains IL instructions and their parameters.
An instruction can be 1 or 2 bytes long. If it's 2 bytes long then the first byte is always 0xFE.
An instruction might be followed by a parameter, which is usually a number (e.g.: ldc.i4, ldarg, brfalse) or a metadata token (e.g.: ldstr, call). Of course, not every instruction has a parameter: some operate on values which are already on the stack (e.g.: add, div, clt), and some, like nop, do nothing at all. The switch instruction is a special case, as it can have more than one parameter: a 32-bit count followed by that many 32-bit branch offsets.
A prefix (tail, unaligned or volatile) might also appear before an instruction.
The exact description of each instruction can be found in Partition III CIL.doc.
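
As a minimal sketch of the 1-or-2-byte rule described above, the following helper reads the opcode found at a given offset of the IL byte array. The class and method names (IlReader, ReadOpCodeValue) are made up for this illustration and are not taken from DILE.

```csharp
using System;

public static class IlReader
{
    // Hypothetical helper: returns the numeric value of the opcode that
    // starts at 'offset'. 'size' receives the number of bytes the opcode
    // itself occupies (1 or 2); the parameter, if any, follows after it.
    public static ushort ReadOpCodeValue(byte[] il, int offset, out int size)
    {
        byte first = il[offset];

        if (first != 0xFE)
        {
            // One-byte instruction: the byte itself is the opcode.
            size = 1;
            return first;
        }

        if (offset + 1 >= il.Length)
        {
            throw new ArgumentException("Truncated two-byte opcode.", "il");
        }

        // Two-byte instruction: 0xFE followed by the second opcode byte.
        size = 2;
        return (ushort)((first << 8) | il[offset + 1]);
    }
}
```

For example, given the bytes 0x00, 0xFE, 0x01, 0x2A (nop, ceq, ret), the helper reports a 1-byte opcode at offset 0, a 2-byte opcode with value 0xFE01 at offset 1 and a 1-byte opcode at offset 3.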

The easiest way to process the IL code is to group the instructions. Partition V Annexes.doc mentions the following groups:
  • Instructions with no operand
  • Instructions that Refer to Parameters or Local Variables
  • Instructions that Take a Single 32-bit Integer Argument
  • Instructions that Take a Single 64-bit Integer Argument
  • Instructions that Take a Single Floating Point Argument
  • Branch instructions
  • Instructions that Take a Method as an Argument
  • Instructions that Take a Field of a Class as an Argument
  • Instructions that Take a Type as an Argument
  • Instructions that Take a String as an Argument
  • Instructions that Take a Signature as an Argument
  • Instructions that Take a Metadata Token as an Argument
  • Switch instruction
To make reading instructions even easier in DILE, I went further and also created groups based on the size/nature of the parameter:
  • Parameterless
  • Token parameter:
    • Field
    • Method
    • String
    • Type

  • Location parameter:
    • Sbyte
    • Int

  • Number parameter:
    • Byte
    • Ushort
    • Sbyte
    • Int
    • Long
    • Float
    • Double

  • Argument parameter:
    • Byte
    • Ushort

  • Variable parameter:
    • Byte
    • Ushort

  • Special parameters:
    • Switch
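
Grouping by parameter size can also be sketched with the OperandType value that every OpCode structure carries. This is not DILE's own grouping, only an illustration built on the Framework's enum. The names OperandSizes, GetFixedSize and GetSwitchOperandSize are hypothetical, and returning -1 for switch is a convention chosen just for this example.

```csharp
using System;
using System.Reflection.Emit;

public static class OperandSizes
{
    // Returns the size of the parameter in bytes, or -1 for switch,
    // whose size depends on the number of branch targets.
    public static int GetFixedSize(OperandType type)
    {
        switch (type)
        {
            case OperandType.InlineNone:
                return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar:
                return 1;
            case OperandType.InlineVar:
                return 2;
            case OperandType.InlineBrTarget:
            case OperandType.InlineField:
            case OperandType.InlineI:
            case OperandType.InlineMethod:
            case OperandType.InlineSig:
            case OperandType.InlineString:
            case OperandType.InlineTok:
            case OperandType.InlineType:
            case OperandType.ShortInlineR:
                return 4;
            case OperandType.InlineI8:
            case OperandType.InlineR:
                return 8;
            case OperandType.InlineSwitch:
                return -1;
            default:
                throw new NotSupportedException("Unexpected operand type: " + type);
        }
    }

    // The switch parameter is a little-endian 32-bit count followed by
    // 'count' 32-bit branch offsets. 'operandOffset' points at the count.
    public static int GetSwitchOperandSize(byte[] il, int operandOffset)
    {
        uint count = unchecked((uint)(il[operandOffset]
            | (il[operandOffset + 1] << 8)
            | (il[operandOffset + 2] << 16)
            | (il[operandOffset + 3] << 24)));

        return checked((int)(4 + 4 * (long)count));
    }
}
```

With this mapping, OperandSizes.GetFixedSize(OpCodes.Ldstr.OperandType) gives 4 (a metadata token), while OpCodes.Br_S yields 1 and OpCodes.Ldc_I8 yields 8.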
Fortunately, it's not necessary to make our own mapping between the byte values of the instructions and their names, since the Framework already contains all the instructions in the OpCodes class of the System.Reflection.Emit namespace. Every instruction is available as a public static field of this class. Each field is an OpCode structure which contains both the name and the byte/short value (depending on the size) of the instruction.
It's easy to build a Dictionary (e.g.: Dictionary<short, OpCode>) which contains all the OpCodes as values and their byte/short values as keys. Such a Dictionary can be populated using Reflection (not the fastest but the easiest way ;-)). Just iterate over all the public static fields of the OpCodes class and add each one to the Dictionary, which can later be used for looking up the proper instruction.
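
A rough sketch of such a lookup table could look like the code below. The class and method names (OpCodeTable, Build, Lookup) are invented for this example; only OpCodes, OpCode and the Reflection calls come from the Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class OpCodeTable
{
    // Collects every public static OpCode field of the OpCodes class,
    // keyed by OpCode.Value.
    public static Dictionary<short, OpCode> Build()
    {
        Dictionary<short, OpCode> table = new Dictionary<short, OpCode>();
        FieldInfo[] fields = typeof(OpCodes).GetFields(
            BindingFlags.Public | BindingFlags.Static);

        foreach (FieldInfo field in fields)
        {
            if (field.FieldType != typeof(OpCode))
            {
                continue;
            }

            OpCode opCode = (OpCode)field.GetValue(null);
            table[opCode.Value] = opCode;
        }

        return table;
    }

    // Looks up the instruction that starts at 'offset' in the IL bytes.
    public static OpCode Lookup(Dictionary<short, OpCode> table, byte[] il, int offset)
    {
        byte first = il[offset];

        // OpCode.Value is a short, so two-byte opcodes (0xFE, xx) are
        // stored as negative numbers; the unchecked cast builds the same key.
        short key = first == 0xFE
            ? unchecked((short)((first << 8) | il[offset + 1]))
            : first;

        return table[key];
    }
}
```

Building the table once and reusing it keeps the cost of Reflection out of the decoding loop. After the lookup, the Name and OperandType properties of the returned OpCode tell which instruction was found and what kind of parameter follows it.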

That's it for now. :-)

P.S.: A little DILE status update: at last, I have finished decompiling the methods' instructions. As a test assembly I'm using ISymWrapper.dll, which can be found in the Framework's directory and is quite complicated, I think. So far I haven't found any difference between the IL code shown by ildasm and DILE. :-)
I hope I can soon finish decompiling the method/class/field etc. declarations as well, and then I'll make a release. :-)


