Reading the file version from Windows EXE and DLL files.



One of my goals for this week was reading the version of Windows executable files. The code had to be written in Java, without resorting to any native calls.

So, the solution that seemed simple and straightforward was to read the PE header of these files directly and pull the version number out of it.

Like many things in life, it's easier said than done.

The troubles started with the code required to handle binary files. I'm using Java and had none of my trusted code from previous projects for reading x86-sized values such as DWORD and WORD, or Unicode strings with a limited size.
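To give an idea of what is involved, here is a minimal sketch that uses only the standard library. It is not the class I ended up using; `LittleEndianReader` and its method names are made up for illustration, and it assumes the file bytes are already loaded into memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration, not the class used in this post.
final class LittleEndianReader {
    private final ByteBuffer buf;

    LittleEndianReader(byte[] data) {
        // x86 stores multi-byte values least significant byte first
        this.buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
    }

    // WORD: unsigned 16 bits, widened to int because Java's short is signed
    int readWord() {
        return buf.getShort() & 0xFFFF;
    }

    // DWORD: unsigned 32 bits, widened to long because Java's int is signed
    long readDword() {
        return buf.getInt() & 0xFFFFFFFFL;
    }

    // Jump to an absolute offset in the data
    void seek(int offset) {
        buf.position(offset);
    }
}
```

The widening matters because Java has no unsigned 16-bit or 32-bit integer types, so a DWORD with the top bit set would otherwise come out negative.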

To my rescue came the nifty binary file class from Jeff Heaton: http://www.heatonresearch.com/articles/22/page2.html

It's simple and perfect. Far better than the code I used in my previous projects. To handle Unicode strings, I read each byte of the string into a byte buffer and then let Java's String constructor decode it:

Grabbing the byte sequence of the string:

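rgbData = new byte[Data];
for (int i = 0; i < rgbData.length; i++) {
    rgbData[i] = (byte) bin.readByte();
    // Stop at the two-byte zero terminator. Checking only odd positions
    // keeps rgbData[i-1] inside the array and keeps the test aligned
    // to whole two-byte characters.
    if ((i % 2 == 1) && (rgbData[i] == 0) && (rgbData[i-1] == 0))
        break;
}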


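Converting to plain string:
// Windows stores these strings as UTF-16 little-endian; trim() drops the trailing zero characters
String output = new String(rgbData, "UTF-16LE").trim();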
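Strings in Windows resources are stored as two-byte UTF-16 little-endian characters, so the terminator is a pair of zero bytes on an even offset. As a rough sketch of the decoding step on its own (the `WideStrings` class and `decodeWideString` method are hypothetical names, not part of the binary file class above):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration.
final class WideStrings {
    // Decodes UTF-16LE bytes up to, but not including, the first two-byte zero terminator.
    static String decodeWideString(byte[] data) {
        int end = data.length - (data.length % 2); // ignore a dangling odd byte
        for (int i = 0; i + 1 < data.length; i += 2) {
            if (data[i] == 0 && data[i + 1] == 0) {
                end = i;
                break;
            }
        }
        return new String(data, 0, end, StandardCharsets.UTF_16LE);
    }
}
```

Stepping two bytes at a time avoids mistaking the high byte of one character plus the low byte of the next for a terminator.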

--------------------------

But my biggest trouble was the fact that the file version is not kept inside the PE header itself. The file version for DLL, EXE, OCX, DRV, SCR and similar files is kept in a resource inside the file. (Thanks to TheK on boot-land for helping me sort out this detail: http://www.boot-land.net/forums/index.php?showtopic=11890)

So, besides implementing the PE header reading part, it was also necessary to implement all the logic to correctly interpret resources inside executables.
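For the PE header part, the first step is finding where the header starts. A PE file begins with a DOS header whose first two bytes are "MZ", and the DWORD at offset 0x3C (the `e_lfanew` field) gives the offset of the "PE\0\0" signature. A minimal sketch of that lookup, assuming the whole file is already in a byte array (`PeLocator` and `findPeHeader` are names invented for this example, not my prototype's code):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration.
final class PeLocator {
    // Returns the offset of the "PE\0\0" signature, or -1 if the data does not look like a PE file.
    static int findPeHeader(byte[] file) {
        // The DOS header is 0x40 bytes long and starts with "MZ"
        if (file.length < 0x40 || file[0] != 'M' || file[1] != 'Z') {
            return -1;
        }
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        int offset = buf.getInt(0x3C); // e_lfanew
        if (offset < 0 || offset > file.length - 4) {
            return -1;
        }
        // The bytes 'P', 'E', 0, 0 read as a little-endian int give 0x00004550
        return buf.getInt(offset) == 0x00004550 ? offset : -1;
    }
}
```

The section table that follows the PE header is what leads to the resource data, and that part is where most of the work went.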

Luckily for me, this format is extensively documented around the Internet, and even Microsoft itself has released official documentation that explains (to some extent) how the structures should be read.
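One of those documented structures is VS_FIXEDFILEINFO, which starts with the signature 0xFEEF04BD and holds the file version in two DWORDs, dwFileVersionMS and dwFileVersionLS. As a sketch of the very last step only, turning those two values into the familiar dotted form (the `FileVersion` class is a made-up name and this is not the prototype's actual code):

```java
// Hypothetical helper for illustration.
final class FileVersion {
    // Signature found at the start of VS_FIXEDFILEINFO
    static final long FIXED_FILE_INFO_SIGNATURE = 0xFEEF04BDL;

    // Each DWORD packs two 16-bit numbers: the high word comes first in the version string.
    static String format(long fileVersionMS, long fileVersionLS) {
        long major = (fileVersionMS >>> 16) & 0xFFFF;
        long minor = fileVersionMS & 0xFFFF;
        long build = (fileVersionLS >>> 16) & 0xFFFF;
        long revision = fileVersionLS & 0xFFFF;
        return major + "." + minor + "." + build + "." + revision;
    }
}
```

For instance, the pair 0x00060001 and 0x1DB01234 formats as 6.1.7600.4660.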

Nevertheless, it took me far longer than initially expected. I had planned for a full day of work and ended up working three days to achieve this goal. The code itself is not optimized for speed, but for the moment it is good enough for the needs of the prototype.

I've tested it both with DLLs from the Windows kernel and with custom executables built by other compilers, including some that had been packed with UPX, the exe compressor.

It was a good adventure. I've learned far more than I originally knew about the binary format of Windows executable files, and this added knowledge might well open the "window" for other adventures in the future.

:)