Published Monday, October 27, 2008 12:17 PM by martin

Some Thoughts on Software Performance

In recent years I've done quite a bit of performance and scalability testing of apps on the Microsoft platform.  I know plenty of people who are more knowledgeable about software performance than I am, but at the same time a little knowledge about how to measure and affect performance in software is a surprisingly rare commodity.  I guess it's a skill-set that not many people have the time and/or inclination to obtain.  Certainly, I only learned what I know because employers have put me in positions where I've had to learn that stuff.

I'm glad they did, because it's a fascinating area.  One of the things I like about this area is that so much of it appears counter-intuitive at first glance.  I'll give you some examples...

1.  Low CPU usage - good or bad?

Let's say we have some code running on an application server.  We test it by submitting a certain workload, and it takes a few minutes to work through it.  We observe the total CPU usage to be around 50%.  I've been in this situation countless times, and more often than not, people look at the CPU usage and say "great, our code is efficient, we have CPU to spare".  But let's look at that again: CPU at 50%, and it takes a few minutes to work through the workload.  The right observation would be "why isn't my CPU running at (or near) 100%, and getting through the workload quicker?"

Perhaps because people are used to things slowing down when their desktop CPU creeps towards 100%, they assume that high CPU is bad.  On a dedicated application server, low CPU is bad, because it means that something is preventing that server CPU from working as hard as we'd like.  That is, we want the CPU to be working harder doing useful work.  Obviously there's no point increasing CPU work if our workload isn't processed any quicker.

So in our scenario, we can say that either (a) we're not generating the workload quickly enough to keep the CPU busy, or (b) the app is constrained by access to other resources, such as I/O on network or disk, or maybe contention around locking constructs.  Either way, we need to do some work to relieve that constraint and get the CPU working harder.
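One rough way to tell "busy" from "waiting" is to compare the CPU time a process consumed with the wall-clock time that elapsed.  The sketch below is only an illustration in Python, not the tooling from the scenario above.  The names cpu_to_wall_ratio, waiting_workload and busy_workload are made up for the example, and time.sleep stands in for waiting on I/O or a lock.

```python
import time


def cpu_to_wall_ratio(workload):
    """Run workload once and return (wall_seconds, cpu_seconds, cpu/wall)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time used by this process so far
    workload()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return wall_used, cpu_used, cpu_used / wall_used


def waiting_workload():
    # Assumption: sleeping is a stand-in for blocking on disk, network or a lock.
    time.sleep(0.2)


def busy_workload():
    # Spin for roughly 0.2 seconds of wall-clock time doing trivial work.
    deadline = time.perf_counter() + 0.2
    counter = 0
    while time.perf_counter() < deadline:
        counter += 1


if __name__ == "__main__":
    for name, workload in (("waiting", waiting_workload), ("busy", busy_workload)):
        wall, cpu, ratio = cpu_to_wall_ratio(workload)
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s ratio={ratio:.2f}")
```

A ratio well below 1 for single-threaded code suggests the time is going on waiting rather than computing, which is case (b) above.  It doesn't say what the code is waiting for; that still has to be tracked down.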

2. Asymmetric CPU usage

I'm sure I don't need to say why parallelism is important.  We in the software industry need to get much better at concurrent programming, I think.  Most of us now have multiple CPU cores in our development machines.  That means, if we don't look in the right places, we might get a misleading view of the CPU usage in our applications.

It's surprisingly easy to build a .NET application that does most of its work on one specific thread.  This is particularly true of a Windows Forms desktop application.  I've seen apps that, although written in C#, were essentially VB6 apps because only the UI thread was used for doing any work.  On a dual-core machine, you might see something like this in Task Manager:

[Image: asymmetric CPU usage across two cores in Task Manager]

If you only ever look at an aggregated CPU usage value, you might be proud of the fact that your CPU usage never goes above 50% (although as I said in point 1 this is not necessarily something to be proud of).  In fact, this is showing a real problem in the design of your app, because no matter how you upgrade your hardware you're unlikely to make this app go any faster.
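To make the point concrete, here is a small Python sketch of how an aggregate figure hides a saturated core.  The per-core readings are invented numbers, and the function names and thresholds are assumptions chosen for illustration, not values from any real tool.

```python
def summarise_cores(per_core_percent):
    """Return (aggregate_percent, busiest_core_percent) for per-core readings."""
    aggregate = sum(per_core_percent) / len(per_core_percent)
    busiest = max(per_core_percent)
    return aggregate, busiest


def looks_single_threaded(per_core_percent, saturated=90.0, idle=20.0):
    """Heuristic: one core is saturated while all the others are nearly idle.

    The thresholds are arbitrary illustration values, not recommendations.
    """
    ordered = sorted(per_core_percent, reverse=True)
    return (
        len(ordered) > 1
        and ordered[0] >= saturated
        and all(p <= idle for p in ordered[1:])
    )


if __name__ == "__main__":
    readings = [100.0, 0.0]  # hypothetical dual-core sample
    aggregate, busiest = summarise_cores(readings)
    print(f"aggregate={aggregate:.0f}% busiest core={busiest:.0f}%")
    print("single-threaded pattern:", looks_single_threaded(readings))
```

In practice the picture is often less clean than this, because the operating system may move a single busy thread between cores, but the aggregate value stays pinned near 100% divided by the number of cores.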

I guess I should admit that, right now, the app I'm working on looks a little bit like this.  Physician, heal thyself, etc.

If you're using perfmon to analyse your app's CPU usage, you'll want to use the Process object so that you can view CPU usage just for your app's process(es).  It's worth pointing out that the "% Processor Time" counter in the Process object can go above 100%.  If you have two CPU cores in the machine, the scale goes up to 200%, or 400% for four cores, etc.
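The arithmetic for converting a per-process reading on that scale into a share of the whole machine is simple enough, but easy to forget when comparing it with Task Manager's aggregate figure.  A minimal Python sketch, with a made-up function name:

```python
def process_counter_to_machine_percent(process_percent, core_count):
    """Convert a per-process CPU reading (0 to 100 * cores) to a 0-100 machine share."""
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    return process_percent / core_count


if __name__ == "__main__":
    # A process saturating one core of a dual-core machine reads 100% on the
    # per-process scale, which is half of the machine's total CPU capacity.
    print(process_counter_to_machine_percent(100.0, 2))
```

So a reading of 100% on a dual-core box is the same single-thread pattern described in point 2, not a fully loaded machine.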

3. Extrapolation

One of the most interesting human factors around software performance is the common belief that software performance behaves linearly as variables change.  You most often see this as people trying to extrapolate performance figures to predict how software will perform on bigger hardware, with greater data volumes, or with more concurrent users, etc.  In general, it's safest to assume that you simply cannot extrapolate.

When you're working to improve the performance of an application, you find the first problem; let's say your app slows down to the point of being unusable when you're loading a large quantity of data into it.  Someone might say "OK, we'll work on getting that data loading problem fixed.  In the meantime, you find the next slowest part of the app and work on that".  The problem with this is that "the next slowest part of the app" might well be different when you manage to get that larger quantity of data loaded.  The slowest routine to operate on 100 items may not be the slowest routine to operate on 10,000 items.
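Here is a toy illustration of that last point in Python.  Both routines and their cost formulas are entirely hypothetical: one does a fixed amount of work per item, the other compares every pair of items.  The units are abstract "steps", not measured times.

```python
def parse_cost(n):
    """Hypothetical linear routine: 200 steps of work per item."""
    return 200 * n


def dedupe_cost(n):
    """Hypothetical quadratic routine: compares every pair of items once."""
    return n * (n - 1) // 2


def slowest(n):
    """Name of the more expensive routine for n items."""
    costs = {"parse": parse_cost(n), "dedupe": dedupe_cost(n)}
    return max(costs, key=costs.get)


def linear_extrapolation(cost_fn, measured_n, target_n):
    """Naively scale a measurement taken at measured_n up to target_n."""
    return cost_fn(measured_n) * target_n // measured_n


if __name__ == "__main__":
    for n in (100, 10_000):
        print(f"{n} items: slowest routine is {slowest(n)}")

    predicted = linear_extrapolation(dedupe_cost, 100, 10_000)
    actual = dedupe_cost(10_000)
    print(f"predicted {predicted} steps, actual {actual} steps")
```

With these made-up formulas, the linear routine dominates at 100 items and the quadratic one dominates at 10,000, and scaling the 100-item figure linearly underestimates the quadratic routine by roughly a factor of a hundred.  Real applications are messier, which is all the more reason to measure at the target volume rather than extrapolate.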

In many ways, improving the performance of an application is like mending a leaky garden hose.  You find the first problem and fix it, then you see the next problem and fix that, then the next, etc.