Wednesday, August 16, 2006

Porting a Win32 Delphi App to .net - Part 2

When we last left our hero, he was struggling to figure out why the .net compiled engine was so damn slow. When it comes to diagnosing performance issues with an application, the smart programmer reaches for a profiler.

From the wikipedia article:
A profiler is a performance analysis tool that measures the behavior of a program as it runs, particularly the frequency and duration of function calls.

Knowing what I needed to do, I sought out the tools which would allow me to profile the library. I downloaded Microsoft's CLR Profiler (there are separate versions for .net 1.1 and 2.0). However, after installing it, I could not get it to work.

Next, I searched Google for .net profilers that support Delphi. One of the first ones I spotted was called ANTS, by Red Gate Software. It only supports .net languages, so it wouldn't be able to help me with the Win32 side of our product line, but I gave it a go anyhow - and it worked beautifully and intuitively.

Finally, I investigated a profiler which would work with both our Win32 library and our future .net library. The only one which met that criterion was AQTime, by AutomatedQA. Unfortunately, to use AQTime, you must have Administrator privileges. I only use my administrator account for system maintenance, so this was a deal breaker for me. It will also be the subject of an upcoming post on dumb-ass software development for the Windows platform.

So ANTS it was! Using the profiler results from the 15-day free demo, I was quickly able to see where a large chunk of time was being spent - and it wasn't in any code I wrote! One of the data structures in the library is a very large array, which is exposed as a property of one of our objects. It seemed as if a call to initialize this large array was occurring way too many times, since only one instance of it should have been created.

Stumped, I turned to ildasm - Microsoft's Intermediate Language DisASseMbler. It allows you to look at the bytecode generated by the .net compiler and see what the heck is going on at a much lower level. Sure enough, in one routine I saw that it was requesting seven local copies of the large array I mentioned above - each of which needed to be initialized and copied!
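
If you haven't used it, ildasm runs as a GUI browser by default, but it can also dump the whole assembly to a text file you can search. Something like the following does the trick (MyEngine.dll here is just a placeholder for your actual assembly name):

    ildasm MyEngine.dll /out=MyEngine.il

Searching the resulting .il file for the array type's name is what made the seven local copies jump out.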

Investigating further, I found the cause of the problem. Instead of using the private field name for the large array (e.g. FArray), a property (e.g. Array) with get and set methods was being called - and since Delphi static arrays have value semantics, every read of the property returned a brand new copy of the entire array. By replacing Array with FArray, I saw the calculation time plummet from the 2+ seconds observed before down to around 1/3 of a second.
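
To illustrate the pattern (all the names below are hypothetical stand-ins for our real engine types), a Delphi static array is a value type, so a getter that returns one hands back a complete copy on every read:

    type
      TBigArray = array[0..9999] of Double;  // static array: value semantics

      TEngine = class
      private
        FArray: TBigArray;
        function GetData: TBigArray;
      public
        property Data: TBigArray read GetData;
      end;

    function TEngine.GetData: TBigArray;
    begin
      Result := FArray;  // copies all 10,000 elements on every read
    end;

    { Inside a method of TEngine, each read of the property pays for
      a full copy of the array: }
    //   Sum := Sum + Data[I];    // one copy per access!
    //   Sum := Sum + FArray[I];  // direct field access - no copy

Outside the class, the property is the right interface; inside the class, there's no reason to pay that toll, which is why swapping the property reference for the field made such a difference.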

I cut the calculation time by another 10% by making targeted fixes and then measuring the results. Since then, I've further profiled the library, tuning the code most likely to benefit and not worrying about code which has little impact on the overall time. Just this morning, I trimmed the time a bit more after profiling the calculations again. Now, I believe I am at the point of diminishing returns, and will stop profiling and optimizing until a later time.
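
For the measure-after-every-change step, nothing fancy is needed. Here's a sketch of the kind of timing harness I mean, using DateUtils so it compiles under both Win32 and .net Delphi (RunCalculation is a dummy standing in for the real engine call):

    program TimeIt;

    uses
      SysUtils, DateUtils;

    procedure RunCalculation;
    begin
      Sleep(300);  // stand-in for the real engine's entry point
    end;

    var
      StartTime: TDateTime;
    begin
      StartTime := Now;
      RunCalculation;
      WriteLn(Format('Elapsed: %d ms',
        [MilliSecondsBetween(Now, StartTime)]));
    end.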

Next step: see if I can use the library with Mono on Linux!
