Tuesday, January 04, 2005

Reading types from assembly

In medias res.

To kick things off, here's a little sample code. Let's start with something simple: open a file using the Unmanaged Metadata API (from C#) and list all the types in it.
Unfortunately, I don't know yet how to upload files to Blogger, so I'll copy-paste the code here. It won't be short, I know, but I'll look for a way to upload files later. I also don't know yet how to colorize the code, but I'll try to solve that later as well.

Most of the interfaces declared below are documented in the Tool Developers Guide, which is installed in your VS.NET folder (by default, c:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Tool Developers Guide\docs\), so I won't describe them in detail here.
As the name suggests, the Unmanaged Metadata API is not managed code; it is implemented as COM objects. The declarations of the necessary interfaces can be found as C++ header files in the following folder: c:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\include\ (check cor.h, CorHdr.h and corhlpr.h). I based my code on these files.

And a warning. The following code was written using .NET Framework 2.0 (v2.0.40607), but I'm sure it can easily be changed to compile under .NET Framework 1.x. Also, as sample code, it doesn't contain any exception handling, object cleanup code (don't forget that you're dealing with COM objects), etc.

So, create a console application in Visual Studio .NET, add a file called IMetaDataDispenserEx.cs and put the following interface declaration in it (I think it's good practice to put every class/interface in its own .cs file):

#region Using directives
using System;
using System.Collections.Generic;
using System.Text;


using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes;
#endregion


namespace ReadingAssembly
{
   [ComImport, GuidAttribute("31BCFCE2-DAFB-11D2-9F81-00C04F79A0A3"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
   public interface IMetaDataDispenserEx
   {
     uint DefineScope(ref Guid rclsid, uint dwCreateFlags, ref Guid riid, [MarshalAs(UnmanagedType.Interface)]out object ppIUnk);

     uint OpenScope([MarshalAs(UnmanagedType.LPWStr)]string szScope, uint dwOpenFlags, ref Guid riid, [MarshalAs(UnmanagedType.Interface)] out object ppIUnk);

     uint OpenScopeOnMemory(IntPtr pData, uint cbData, uint dwOpenFlags, ref Guid riid, [MarshalAs(UnmanagedType.Interface)]out object ppIUnk);

     uint SetOption(ref Guid optionid, [MarshalAs(UnmanagedType.Struct)]object value);

     uint GetOption(ref Guid optionid, [MarshalAs(UnmanagedType.Struct)]out object pvalue);

     uint OpenScopeOnITypeInfo([MarshalAs(UnmanagedType.Interface)]ITypeInfo pITI, uint dwOpenFlags, ref Guid riid, [MarshalAs(UnmanagedType.Interface)]out object ppIUnk);

     uint GetCORSystemDirectory([MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 0)]char[] szBuffer, uint cchBuffer, out uint pchBuffer);

     uint FindAssembly([MarshalAs(UnmanagedType.LPWStr)]string szAppBase, [MarshalAs(UnmanagedType.LPWStr)]string szPrivateBin, [MarshalAs(UnmanagedType.LPWStr)]string szGlobalBin, [MarshalAs(UnmanagedType.LPWStr)]string szAssemblyName, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 4)]char[] szName, uint cchName, out uint pcName);

     uint FindAssemblyModule([MarshalAs(UnmanagedType.LPWStr)]string szAppBase, [MarshalAs(UnmanagedType.LPWStr)]string szPrivateBin, [MarshalAs(UnmanagedType.LPWStr)]string szGlobalBin, [MarshalAs(UnmanagedType.LPWStr)]string szAssemblyName, [MarshalAs(UnmanagedType.LPWStr)]string szModuleName, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 5)]char[] szName, uint cchName, out uint pcName);
   }
}


We should also define a class called CorMetaDataDispenserExClass. The definition is very short:

#region Using directives
using System;
using System.Collections.Generic;
using System.Text;

using System.Runtime.InteropServices;
#endregion

namespace ReadingAssembly
{
   [ComImport, GuidAttribute("E5CB7A31-7512-11D2-89CE-0080C792E5D8")]
   public class CorMetaDataDispenserExClass
   {
   }
}


We will need two more interfaces. The first is the MetaDataDispenserEx interface:

#region Using directives
using System;
using System.Collections.Generic;
using System.Text;


using System.Runtime.InteropServices;
#endregion


namespace ReadingAssembly

{
   [ComImport, GuidAttribute("31BCFCE2-DAFB-11D2-9F81-00C04F79A0A3"), CoClass(typeof(CorMetaDataDispenserExClass))]
   public interface MetaDataDispenserEx : IMetaDataDispenserEx
   {
   }
}


The last interface is the longest; it's called IMetaDataImport:

#region Using directives
using System;
using System.Collections.Generic;
using System.Text;

using System.Runtime.InteropServices;
#endregion


namespace ReadingAssembly
{
   [ComImport, GuidAttribute("7DAC8207-D3AE-4c75-9B67-92801A497D44"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
   public interface IMetaDataImport
   {
     void CloseEnum(uint hEnum);


     uint CountEnum(uint hEnum, out uint count);

     uint ResetEnum(uint hEnum, uint ulPos);

     uint EnumTypeDefs(ref uint phEnum, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]uint[] rTypeDefs, uint cMax, out uint pcTypeDefs);

     uint EnumInterfaceImpls(ref uint phEnum, uint td, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]uint[] rImpls, uint cMax, out uint pcImpls);

     uint EnumTypeRefs(ref uint phEnum, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]uint[] rTypeDefs, uint cMax, out uint pcTypeRefs);

     uint FindTypeDefByName([MarshalAs(UnmanagedType.LPWStr)]string szTypeDef, uint tkEnclosingClass, out uint ptd);

     uint GetScopeProps([MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 0)]char[] szName, uint cchName, out uint pchName, ref Guid pmvid);

     uint GetModuleFromScope(out uint pmd);

     uint GetTypeDefProps(uint td, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]char[] szTypeDef, uint cchTypeDef, out uint pchTypeDef, out uint pdwTypeDefFlags, out uint ptkExtends);

     uint GetInterfaceImplProps(uint iiImpl, out uint pClass, out uint ptkIface);

     uint GetTypeRefProps(uint tr, out uint ptkResolutionScope, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]char[] szName, uint cchName, out uint pchName);

     uint ResolveTypeRef(uint tr, ref Guid riid, [MarshalAs(UnmanagedType.Interface)]out object ppIScope, out uint ptd);

     uint EnumMembers(ref uint phEnum, uint cl, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]uint[] rMembers, uint cMax, out uint pcTokens);

     uint EnumMembersWithName(ref uint phEnum, uint cl, [MarshalAs(UnmanagedType.LPWStr)]string szName, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 4)]uint[] rMembers, uint cMax, out uint pcTokens);

     uint EnumMethods(ref uint phEnum, uint cl, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]uint[] rMethods, uint cMax, out uint pcTokens);

     uint EnumMethodsWithName(ref uint phEnum, uint cl, [MarshalAs(UnmanagedType.LPWStr)]string szName, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 4)]uint[] rMethods, uint cMax, out uint pcTokens);

     uint EnumFields(ref uint phEnum, uint cl, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]uint[] rFields, uint cMax, out uint pcTokens);

     uint EnumFieldsWithName(ref uint phEnum, uint cl, [MarshalAs(UnmanagedType.LPWStr)]string szName, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 4)]uint[] rFields, uint cMax, out uint pcTokens);

     uint EnumParams(ref uint phEnum, uint mb, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]uint[] rParams, uint cMax, out uint pcTokens);

     uint EnumMemberRefs(ref uint phEnum, uint tkParent, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]uint[] rMemberRefs, uint cMax, out uint pcTokens);

     uint EnumMethodImpls(ref uint phEnum, uint td, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]uint[] rMethodBody, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]uint[] rMethodDecl, uint cMax, out uint pcTokens);

     uint EnumPermissionSets(ref uint phEnum, uint tk, uint dwActions, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]uint[] rPermission, uint cMax, out uint pcTokens);

     uint FindMember(uint td, [MarshalAs(UnmanagedType.LPWStr)]string szName, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]byte[] pvSigBlob, uint cbSigBlob, out uint pmb);

     uint FindMethod(uint td, [MarshalAs(UnmanagedType.LPWStr)]string szName, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]byte[] pvSigBlob, uint cbSigBlob, out uint pmb);

     uint FindField(uint td, [MarshalAs(UnmanagedType.LPWStr)]string szName, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]byte[] pvSigBlob, uint cbSigBlob, out uint pmb);

     uint FindMemberRef(uint td, [MarshalAs(UnmanagedType.LPWStr)]string szName, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]byte[] pvSigBlob, int cbSigBlob, out uint pmr);

     uint GetMethodProps(uint mb, out uint pClass, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]char[] szMethod, uint cchMethod, out uint pchMethod, out uint pdwAttr, out IntPtr ppvSigBlob, out uint pcbSigBlob, out uint pulCodeRVA, out uint pdwImplFlags);

     uint GetMemberRefProps(uint mr, out uint ptk, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]char[] szMember, uint cchMember, out uint pchMember, out IntPtr ppvSigBlob, out uint pbSigBlob);

     uint EnumProperties(ref uint phEnum, uint td, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]uint[] rProperties, uint cMax, out uint pcProperties);

     uint EnumEvents(ref uint phEnum, uint td, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]uint[] rEvents, uint cMax, out uint pcEvents);

     uint GetEventProps(uint ev, out uint pClass, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]char[] szEvent, uint cchEvent, out uint pchEvent, out uint pdwEventFlags, out uint ptkEventType, out uint pmdAddOn, out uint pmdRemoveOn, out uint pmdFire, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 10)]uint[] rmdOtherMethod, uint cMax, out uint pcOtherMethod);

     uint EnumMethodSemantics(ref uint phEnum, uint mb, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]uint[] rEventProp, uint cMax, out uint pcEventProp);

     uint GetMethodSemantics(uint mb, uint tkEventProp, out uint pdwSemanticsFlags);

     uint GetClassLayout(uint td, out uint pdwPackSize, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]long[] rFieldOffset, uint cMax, out uint pcFieldOffset, out uint pulClassSize);

     uint GetFieldMarshal(uint tk, out IntPtr ppvNativeType, out uint pcbNativeType);

     uint GetRVA(uint tk, out uint pulCodeRVA, out uint pdwImplFlags);

     uint GetPermissionSetProps(uint pm, out uint pdwAction, out IntPtr ppvPermission, out uint pcbPermission);

     uint GetSigFromToken(uint mdSig, out IntPtr ppvSig, out uint pcbSig);

     uint GetModuleRefProps(uint mur, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)]char[] szName, uint cchName, out uint pchName);

     uint EnumModuleRefs(ref uint phEnum, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)]uint[] rModuleRefs, uint cmax, out uint pcModuleRefs);

     uint GetTypeSpecFromToken(uint typespec, out IntPtr ppvSig, out uint pcbSig);

     uint GetNameFromToken(uint tk, out IntPtr pszUtf8NamePtr);

     uint EnumUnresolvedMethods(ref uint phEnum, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)]uint[] rMethods, uint cMax, out uint pcTokens);

     uint GetUserString(uint stk, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] char[] szString, uint cchString, out uint pchString);

     uint GetPinvokeMap(uint tk, out uint pdwMappingFlags, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]char[] szImportName, uint cchImportName, out uint pchImportName, out uint pmrImportDLL);

     uint EnumSignatures(ref uint phEnum, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)]uint[] rSignatures, uint cmax, out uint pcSignatures);

     uint EnumTypeSpecs(ref uint phEnum, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)]uint[] rTypeSpecs, uint cmax, out uint pcTypeSpecs);

     uint EnumUserStrings(ref uint phEnum, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)]uint[] rStrings, uint cmax, out uint pcStrings);

     uint GetParamForMethodIndex(uint md, uint ulParamSeq, out uint ppd);

     uint EnumCustomAttributes(ref uint phEnum, uint tk, uint tkType, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]uint[] rCustomAttributes, uint cMax, out uint pcCustomAttributes);

     uint GetCustomAttributeProps(uint cv, out uint ptkObj, out uint ptkType, out IntPtr ppBlob, out uint pcbSize);

     uint FindTypeRef(uint tkResolutionScope, [MarshalAs(UnmanagedType.LPWStr)]string szName, out uint ptr);

     uint GetMemberProps(uint mb, out uint pClass, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]char[] szMember, uint cchMember, out uint pchMember, out uint pdwAttr, out IntPtr ppvSigBlob, out uint pcbSigBlob, out uint pulCodeRVA, out uint pdwImplFlags, out uint pdwCPlusTypeFlag, out IntPtr ppValue, out uint pcchValue);

     uint GetFieldProps(uint mb, out uint pClass, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]char[] szField, uint cchField, out uint pchField, out uint pdwAttr, out IntPtr ppvSigBlob, out uint pcbSigBlob, out uint pdwCPlusTypeFlag, out IntPtr ppValue, out uint pcchValue);

     uint GetPropertyProps(uint prop, out uint pClass, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)]char[] szProperty, uint cchProperty, out uint pchProperty, out uint pdwPropFlags, out IntPtr ppvSig, out uint pbSig, out uint pdwCPlusTypeFlag, out IntPtr ppDefaultValue, out uint pcchDefaultValue, out uint pmdSetter, out uint pmdGetter, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 13)]uint[] rmdOtherMethod, uint cMax, out uint pcOtherMethod);

     uint GetParamProps(uint tk, out uint pmd, out uint pulSequence, [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)]char[] szName, uint cchName, out uint pchName, out uint pdwAttr, out uint pdwCPlusTypeFlag, out IntPtr ppValue, out uint pcchValue);

     uint GetCustomAttributeByName(uint tkObj, [MarshalAs(UnmanagedType.LPWStr)]string szName, out IntPtr ppData, out uint pcbData);

     bool IsValidToken(uint tk);

     uint GetNestedClassProps(uint tdNestedClass, out uint ptdEnclosingClass);

     uint GetNativeCallConvFromSig(IntPtr pvSig, uint cbSig, out uint pCallConv);

     uint IsGlobal(uint pd, out uint pbGlobal);
   }
}


These declarations are very important and I'll refer to them later as well. They are necessary to open a file using the Unmanaged Metadata API. As you can see, IMetaDataDispenserEx is used for opening/defining an assembly, while IMetaDataImport is used for exploring an opened assembly (similarly to the Reflection API).
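For comparison, this is roughly what the same task looks like with the managed Reflection API. It is only a sketch for orientation and not part of the original sample; ReflectionComparison and ListTypeNames are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ReflectionComparison
{
    // Collects the full names of all types defined in an already loaded assembly.
    public static List<string> ListTypeNames(Assembly assembly)
    {
        List<string> names = new List<string>();
        foreach (Type type in assembly.GetTypes())
        {
            names.Add(type.FullName);
        }
        return names;
    }
}
```

It could be called as ReflectionComparison.ListTypeNames(Assembly.LoadFrom(assemblyPath)). Unlike the code in this post, this requires the runtime to load the assembly first.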

Now let's take a look at our main program.
To open a given assembly, we should first create a MetaDataDispenserEx object. This object will be used to open the assembly. Once we call its OpenScope() method, we get a reference to an IMetaDataImport object, which can be used to explore the file.

static void Main(string[] args)
{

   Console.Write("Please enter the full path of the assembly: ");
   //Read the path of the assembly from the console.
   string assemblyPath = Console.ReadLine();

   IMetaDataDispenserEx dispenser = new MetaDataDispenserEx();
   IMetaDataImport import = null;

   object rawScope = null;

   //GUID of the IMetaDataImport interface.
   Guid metaDataImportGuid = new Guid("7DAC8207-D3AE-4c75-9B67-92801A497D44");


   //Open the assembly.
   dispenser.OpenScope(assemblyPath, 0, ref metaDataImportGuid, out rawScope);

   //The rawScope contains an IMetaDataImport interface.
   import = (IMetaDataImport)rawScope;

   //Write to the console the TypeDefs by calling a method.
   EnumerateTypeDefinitions(import);

   Console.ReadLine();
}
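
As warned earlier, Main leaves the COM objects to the garbage collector. A minimal sketch of explicit cleanup follows. ComCleanup and Release are hypothetical names that are not part of the original sample; Marshal.IsComObject and Marshal.ReleaseComObject are standard System.Runtime.InteropServices calls.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComCleanup
{
    // Releases the runtime callable wrapper if the instance really is a COM object.
    // Returns true when a release was performed, false otherwise.
    public static bool Release(object instance)
    {
        if (instance != null && Marshal.IsComObject(instance))
        {
            Marshal.ReleaseComObject(instance);
            return true;
        }
        return false;
    }
}
```

In Main, the code after creating the dispenser could be wrapped in try/finally, calling ComCleanup.Release(rawScope) and then ComCleanup.Release(dispenser) in the finally block. Since import and rawScope refer to the same object, it only needs to be released once.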


To get the names of all the TypeDefs in the assembly, we have to enumerate the TypeDefs.
As the documentation also mentions, we should define a uint variable which will be the handle of the enumeration. Once we have finished enumerating, the handle should be closed with the IMetaDataImport.CloseEnum() method. This rule is important and applies to every enumeration.
For enumerating, we create a uint array which will hold the TypeDef tokens, one batch at a time. Once we have a token, we can get information about the given TypeDef by calling the IMetaDataImport.GetTypeDefProps() method. This method also returns the name of the TypeDef.
My EnumerateTypeDefinitions() method looks like this:

private static void EnumerateTypeDefinitions(IMetaDataImport import)
{

   //Handle of the enumeration.
   uint enumHandle = 0;

   //We will read maximum 10 TypeDefs at once which will be stored in this array.
   uint[] typeDefs = new uint[10];

   //Number of read TypeDefs.
   uint count = 0;


   import.EnumTypeDefs(ref enumHandle, typeDefs, Convert.ToUInt32(typeDefs.Length), out count);

   //Continue reading TypeDefs while the typeDefs array contains any new TypeDef.
   while (count > 0)
   {
     for (uint typeDefsIndex = 0; typeDefsIndex < count; typeDefsIndex++)
     {

       //Store the TypeDef's token.
       uint token = typeDefs[typeDefsIndex];

       //The TypeDef's name will be stored in this array. 1024 is a "magic number"; it seems a type's name can be at most this long. corhlpr.h also defines a suspiciously similar constant: #define MAX_CLASSNAME_LENGTH 1024
       char[] typeName = new char[1024];

       //Number of how many characters were filled in the typeName array.
       uint nameLength;
       //TypeDef's flags.
       uint typeDefFlags;
       //If the TypeDef is a derived type then the base type's token.
       uint baseTypeToken;

       //Get the TypeDef's properties.
       import.GetTypeDefProps(token, typeName, Convert.ToUInt32(typeName.Length), out nameLength, out typeDefFlags, out baseTypeToken);

       //Get the TypeDef's name. The reported length may include the terminating null character, so trim it.
       string fullTypeName = new string(typeName, 0, Convert.ToInt32(nameLength)).TrimEnd('\0');

       //Write the TypeDef's name to the console.
       Console.WriteLine(fullTypeName);
     }

     import.EnumTypeDefs(ref enumHandle, typeDefs, Convert.ToUInt32(typeDefs.Length), out count);
   }

   import.CloseEnum(enumHandle);
}
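
Since the handle rule applies to every enumeration, the batching loop can be factored into one helper that closes the handle in a finally block, even if something throws halfway. The following is a sketch under assumptions, not DILE's code: TokenEnumeration, EnumBatch, ReadAll and FakeTokenSource are hypothetical names, and the fake source exists only so that the helper can be tried without a COM object.

```csharp
using System;
using System.Collections.Generic;

public static class TokenEnumeration
{
    // Same shape as EnumTypeDefs, EnumTypeRefs and the other declared methods
    // that take only a handle, a buffer, the buffer size and a count.
    public delegate uint EnumBatch(ref uint enumHandle, uint[] buffer, uint max, out uint fetched);

    // Hypothetical helper: reads all tokens in batches and always closes the handle.
    public static List<uint> ReadAll(EnumBatch next, Action<uint> close, int batchSize)
    {
        uint enumHandle = 0;
        uint[] buffer = new uint[batchSize];
        List<uint> tokens = new List<uint>();
        try
        {
            uint fetched;
            next(ref enumHandle, buffer, (uint)buffer.Length, out fetched);
            while (fetched > 0)
            {
                for (uint i = 0; i < fetched; i++)
                {
                    tokens.Add(buffer[i]);
                }
                next(ref enumHandle, buffer, (uint)buffer.Length, out fetched);
            }
        }
        finally
        {
            // Only close a handle that was actually opened.
            if (enumHandle != 0)
            {
                close(enumHandle);
            }
        }
        return tokens;
    }
}

// Hypothetical in-memory stand-in for a metadata scope, used only to exercise the helper.
public class FakeTokenSource
{
    private readonly uint[] tokens;
    private int position;
    public bool Closed;

    public FakeTokenSource(uint[] tokens) { this.tokens = tokens; }

    public uint Next(ref uint enumHandle, uint[] buffer, uint max, out uint fetched)
    {
        if (enumHandle == 0) { enumHandle = 1; position = 0; }
        fetched = 0;
        while (fetched < max && position < tokens.Length)
        {
            buffer[fetched] = tokens[position];
            fetched++;
            position++;
        }
        return 0;
    }

    public void Close(uint enumHandle) { Closed = true; }
}
```

With the declarations from this post, the call would be TokenEnumeration.ReadAll(import.EnumTypeDefs, import.CloseEnum, 10); the method groups convert to the delegate types because their signatures match.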

That's it. Compile and try it. :-)

I'm not sure yet what my next post will be about. Maybe the PE header or the tokens. There are lots of interesting subjects... :-)

11 Comments:

Anonymous Anonymous said...

Great Article. Did you manually create the COM interface definitions or did you use some tool?

Sunday, May 6, 2007 at 2:24:00 AM GMT+2  
Blogger Zsozso said...

Hello!

Good question. I don't want to lie, I have forgotten it... Either I created them or I found them online as part of an open-source project. I'm not sure. I didn't use any tool though, I'm certain about that. I started to work on DILE in 2004 and by now I really can't remember it. :-(

Regards,
Zsolt Petreny

Sunday, May 6, 2007 at 5:17:00 PM GMT+2  
Anonymous Anonymous said...

The last example doesn't take into account names of nested class names.

Wednesday, January 7, 2009 at 8:18:00 PM GMT+1  
Blogger Zsozso said...

You're right, reading nested types is missing from the code. However, if you want to see how that's done then you can check DILE's code [1].

Btw, thanks for your comment. It's really nice to see that an entry that I wrote 4 years ago is still useful for somebody. :-)

[1] FindEnclosingType() method:
http://dile.svn.sourceforge.net/viewvc/dile/trunk/Dile/DILE/Disassemble/TypeDefinition.cs?view=markup

Monday, January 26, 2009 at 11:56:00 AM GMT+1  
Anonymous Anonymous said...

Wow! Thank you! I always wanted to write in my site something like that. Can I take part of your post to my blog?

Sunday, December 27, 2009 at 1:44:00 AM GMT+1  
Anonymous Anonymous said...

Easily I to but I dream the post should secure more info then it has.

Sunday, December 27, 2009 at 6:46:00 PM GMT+1  
Blogger Zsozso said...

Feel free to use my blog as a source. Perhaps just add a link to my blog.

Although I haven't been a very diligent blogger lately...

Zsolt Petreny

Tuesday, December 29, 2009 at 5:56:00 AM GMT+1  
Anonymous Anonymous said...

Your blog keeps getting better and better! Your older articles are not as good as newer ones you have a lot more creativity and originality now keep it up!

Wednesday, January 6, 2010 at 11:24:00 PM GMT+1  
Anonymous Anonymous said...

This was exactly what I was looking for! BTW, some of these declares break when running under 64-bit. I change all of the hEnum and phEnum parms in the declares from uint's to IntPtr's so they automatically go to the correct pointer size and it worked great under both 32 and 64 bit.

Thanks again.

Thursday, March 11, 2010 at 5:20:00 AM GMT+1  
Blogger Zsozso said...

No doubt, that's really a mistake in the code. Thanks a lot, I'll fix it in DILE.

And please let me know if you find any other bug. :-)

Thursday, March 11, 2010 at 11:42:00 PM GMT+1  
Anonymous Anonymous said...

Amiable post and this fill someone in on helped me alot in my college assignement. Gratefulness you as your information.

Saturday, March 13, 2010 at 10:09:00 PM GMT+1  
