IL Instructions
This blog is becoming an Unmanaged Metadata API tutorial (which is not bad at all), so let's continue...
This time I'll write about IL instructions, since the previous post already explained how to get a method's IL code. The byte array we read from the assembly contains the IL instructions and their parameters.
An instruction's opcode can be 1 or 2 bytes long. If it's 2 bytes long, then the first byte is always 0xFE.
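A minimal sketch of that rule in C#, assuming the method body is already available as a byte array. The names OpCodeReader and ReadOpCode are made up for this illustration. The 2-byte case packs both bytes into a short, so the result lines up with the Value property of System.Reflection.Emit.OpCode:

```csharp
public static class OpCodeReader
{
    // Reads the opcode starting at 'offset' and moves 'offset' past it.
    // 1-byte opcodes come back as 0x00XX, 2-byte opcodes as 0xFEXX.
    public static short ReadOpCode(byte[] il, ref int offset)
    {
        byte first = il[offset++];
        if (first != 0xFE)
        {
            // Every other first byte is a complete 1-byte opcode.
            return first;
        }

        // 0xFE announces a 2-byte opcode; the second byte selects the instruction.
        byte second = il[offset++];
        return unchecked((short)((first << 8) | second));
    }
}
```

For example, the bytes 0x00, 0xFE 0x01, 0x2A decode to nop, ceq and ret.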
An instruction might be followed by a parameter, which is usually a number (e.g.: ldc.i4, ldarg, brfalse) or a token (e.g.: ldstr, call). Of course, not every instruction has a parameter, because some operate on the values which are already on the stack (e.g.: add, div, clt), or just do nothing, like nop. The switch instruction is a special case, as it can have more than one parameter.
A prefix (tail., unaligned. or volatile.) might also appear before an instruction.
The exact description of the instructions can be found in the Partition III CIL.doc.
The easiest way to process the IL code is to group the instructions. The Partition V Annexes.doc mentions the following groups:
- Instructions with no operand
- Instructions that Refer to Parameters or Local Variables
- Instructions that Take a Single 32-bit Integer Argument
- Instructions that Take a Single 64-bit Integer Argument
- Instructions that Take a Single Floating Point Argument
- Branch instructions
- Instructions that Take a Method as an Argument
- Instructions that Take a Field of a Class as an Argument
- Instructions that Take a Type as an Argument
- Instructions that Take a String as an Argument
- Instructions that Take a Signature as an Argument
- Instructions that Take a Metadata Token as an Argument
- Switch instruction
Grouped by the kind and size of the parameter instead, the instructions fall into these categories:
- Parameterless
- Token parameter:
  - Field
  - Method
  - String
  - Type
- Location parameter:
  - Sbyte
  - Int
- Number parameter:
  - Byte
  - Ushort
  - Sbyte
  - Int
  - Long
  - Float
  - Double
- Argument parameter:
  - Byte
  - Ushort
- Variable parameter:
  - Byte
  - Ushort
- Special parameters:
  - Switch
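The groups above mostly tell us how many bytes to read after the opcode. Here is a rough sketch of that idea, assuming we lean on the OperandType property that System.Reflection.Emit.OpCode already exposes instead of maintaining the groups by hand. The OperandSizes class and the GetOperandSize method are hypothetical names, not something from DILE:

```csharp
using System;
using System.Reflection.Emit;

public static class OperandSizes
{
    // Returns how many parameter bytes follow the opcode.
    // 'operandOffset' is the index of the first byte after the opcode.
    public static int GetOperandSize(OpCode opCode, byte[] il, int operandOffset)
    {
        switch (opCode.OperandType)
        {
            case OperandType.InlineNone:           // parameterless (add, nop, ...)
                return 0;
            case OperandType.ShortInlineBrTarget:  // sbyte branch offset
            case OperandType.ShortInlineI:         // 1-byte number
            case OperandType.ShortInlineVar:       // byte argument/variable index
                return 1;
            case OperandType.InlineVar:            // ushort argument/variable index
                return 2;
            case OperandType.InlineBrTarget:       // int branch offset
            case OperandType.InlineI:              // int number
            case OperandType.ShortInlineR:         // float number
            case OperandType.InlineField:          // the rest are 4-byte tokens
            case OperandType.InlineMethod:
            case OperandType.InlineSig:
            case OperandType.InlineString:
            case OperandType.InlineTok:
            case OperandType.InlineType:
                return 4;
            case OperandType.InlineI8:             // long number
            case OperandType.InlineR:              // double number
                return 8;
            case OperandType.InlineSwitch:
                // A 4-byte little-endian count, then 'count' 4-byte branch offsets.
                int count = il[operandOffset]
                          | (il[operandOffset + 1] << 8)
                          | (il[operandOffset + 2] << 16)
                          | (il[operandOffset + 3] << 24);
                return 4 + 4 * count;
            default:
                throw new NotSupportedException(opCode.OperandType.ToString());
        }
    }
}
```

Adding the returned size to the offset right after the opcode gives the position of the next instruction.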
It's easy to build a Dictionary (e.g.: Dictionary<short, OpCode>) which uses the byte/short value of each opcode as the key and the OpCode itself as the value. Such a Dictionary can be populated using Reflection (not the fastest but the easiest way ;-)). Just iterate over all the public static fields of the OpCodes class and add each of them to the Dictionary, which can later be used for looking up the proper instruction.
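Something along these lines should do, as a sketch rather than the exact code used in DILE. The OpCodeTable class and its Build method are names invented for this example; OpCodes, OpCode.Value and the Reflection calls are the regular framework ones:

```csharp
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class OpCodeTable
{
    // Builds the lookup table: opcode value (1 or 2 bytes, stored as a short) -> OpCode.
    public static Dictionary<short, OpCode> Build()
    {
        var table = new Dictionary<short, OpCode>();

        // Every instruction is exposed as a public static field of the OpCodes class.
        FieldInfo[] fields = typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static);
        foreach (FieldInfo field in fields)
        {
            if (field.FieldType != typeof(OpCode))
            {
                continue; // skip anything that is not an OpCode
            }

            // Static field, so there is no instance to pass to GetValue.
            var opCode = (OpCode)field.GetValue(null);
            table[opCode.Value] = opCode;
        }

        return table;
    }
}
```

Note that the Value of a 2-byte opcode is negative when stored in a short (0xFE01 becomes -511), so the key has to be built the same way when reading the IL bytes.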
That's it for now. :-)
P.S.: A little DILE status update: at last, I have finished decompiling the methods' instructions. As a test assembly I'm using ISymWrapper.dll, which can be found in the Framework's directory and is quite complicated, I think. So far I haven't found any difference between the IL code shown by ildasm and DILE. :-)
I hope I can soon finish decompiling the method/class/field etc. declarations as well, and then I'll make a release. :-)