


Games are real-time applications. Real-time applications are a special class of computer software where performance is unusually important. Sure, MS Excel needs to be able to recalculate a spreadsheet quickly when you change a number, but it only needs to do this occasionally, each time you finish editing a number. The bit of code that is responsible for updating the spreadsheet after each edit is only a small fraction of Excel's code base, as most of Excel is not very performance critical.
Games are different.
Games need to generate 60 complex 3D images of their fantasy worlds a second. Games also need to update and move all the characters, bullets and effects using complex Artificial Intelligence algorithms before it is possible to draw the next frame. Games need to be able to determine when objects collide and calculate appropriate responses using the laws of physics. Frequently games will also be mixing and spatialising sounds, streaming music and level data from CDs, transmitting and receiving data over the Internet and keeping tabs on which buttons on the controller are being pressed. Until recently, they also had to do all this on a computer that didn't have a whole lot more processing power than a pocket calculator. The point is, almost every part of a game's code is used each frame, 60 times a second, and it has to get the timing exactly right. It is no good if the game can do some frames in 1/200th of a second but others take 2 seconds, because the player will see glitches and jumps. So not only do games need to do everything as fast as possible, they also need to balance their workload so that they take a consistent amount of time to perform all the tasks required of them. Optimisation is important, and understanding how to optimise code effectively is an important skill when developing games and the technology that powers them.
During the early stages of one project I was working on, the game was running very slowly. It was getting so slow that it was becoming difficult to test and the publisher was not looking too pleased. We knew that the graphics engine was far from done (it had only really just started working at all). It was not batching up geometry efficiently, it was not sending triangle strips to the hardware, and it was not sorting anything by state. We knew it was going to be slow and foolishly assumed that this was where our performance problems were coming from. We spent the best part of a week improving the graphics engine so that it would behave in a more efficient manner, but after all that effort we had not really made any impact on the performance problem. The pressure was on to fix it and in desperation we reached for a profiler (a performance measurement tool). 15 minutes later we had a profile that showed that 60% of CPU time was being spent in a function that tested the validity of the game's scene graph. This function was very thorough in ensuring that the scene graph had not been corrupted in some way and was called before and after practically every operation that used the scene graph. It did not take much longer to bring the overzealous validation code under control and suddenly the game was running just fine. If only we had profiled at the start of the week!
You could use your intuition and knowledge of your game to decide which parts of the code you are going to concentrate your optimisation efforts on, but like most human instincts, your intuition will often prove to be unreliable and you will waste a lot of your time. Your first step should always be to measure the performance of your code base using a profiler. A profiler gives you an accurate idea of where your program is spending its time, as well as a baseline measurement so you can tell whether the changes you are making are actually making anything run faster.
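To make the idea of measuring first concrete, here is a minimal sketch using Python's standard `cProfile` and `pstats` modules rather than any of the commercial tools mentioned in this article. The functions `validate_scene`, `move_node` and `run_frame` are hypothetical stand-ins for a game whose scene operations validate the scene before and after every change; they are not taken from any real engine.

```python
import cProfile
import pstats


def validate_scene(scene):
    """Hypothetical, deliberately thorough consistency check."""
    return all(isinstance(node, int) for node in scene)


def move_node(scene, index):
    """Hypothetical scene operation that validates before and after."""
    validate_scene(scene)
    scene[index] += 1
    validate_scene(scene)


def run_frame(scene):
    """Hypothetical frame: touch every node in the scene once."""
    for index in range(len(scene)):
        move_node(scene, index)


def profile_frame(scene):
    """Profile one frame and return the collected statistics."""
    profiler = cProfile.Profile()
    profiler.enable()
    run_frame(scene)
    profiler.disable()
    return pstats.Stats(profiler)


def call_count(stats, function_name):
    """Total number of calls recorded for the given function name."""
    # stats.stats maps (file, line, name) to
    # (primitive calls, total calls, own time, cumulative time, callers).
    return sum(entry[1] for key, entry in stats.stats.items()
               if key[2] == function_name)


if __name__ == "__main__":
    stats = profile_frame(list(range(200)))
    # Show the five most expensive entries by cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    print("validate_scene calls:", call_count(stats, "validate_scene"))
```

The report lists call counts next to the timings, so a cheap function that is called far too often stands out just as clearly as a slow one.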
If you don't have a profiler, there are plenty of them out there to choose from, such as VTune, TrueTime and AQTime on the PC. Metrowerks do a good cross platform profiler, but they keep taking it off the market and changing its name. Sony even does a hardware based profiler for the PS2 that measures what all the chips are up to at all times. Find a tool that works on your target platform and learn how to use it. It is well worth the effort.
Now you have the tools to measure the performance of your game, you really need something consistent to measure. It is important to understand how the profiler will affect the performance of your game and how that will affect its results. Most profilers have a significant impact on performance (slowing the game down by as much as a factor of 10), so you should take this into account. For example, one of our current games updates the game logic (all the AI, physics etc.) at 20 Hz and updates the display at 60+ Hz. This means that a typical snapshot from the game might look like this...
Update, Render, Render, Render, Update, Render, Render, Render, Update, Render, etc
However, when this is run through the profiler at only 5-6 Hz, the game is always trying to catch up and a snapshot might look like this...
Update, Update, Update, Update, Render, Update, Update, Update, Update, Render, etc
If you profile like this and don't take into account the different make-ups of the normal and profiled execution, you may be misled into thinking that updating the game logic was taking far too long.
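The two snapshots above can be reproduced with a small simulation. This is only a sketch of a generic fixed-timestep loop, not Cipher's actual main loop: the function `simulate` and its parameters are hypothetical, and time is counted in whole ticks of 1/60 s so that a 20 Hz logic step is exactly 3 ticks.

```python
def simulate(frame_ticks, logic_ticks=3, frames=6):
    """Return the Update/Render sequence of a fixed-timestep loop.

    frame_ticks is how long each displayed frame takes and logic_ticks is
    the fixed logic step, both in ticks of 1/60 s (assumed units).
    """
    events = []
    owed = logic_ticks  # one logic update is due before the first frame
    for _ in range(frames):
        # Run as many fixed logic steps as the elapsed time demands.
        while owed >= logic_ticks:
            events.append("Update")
            owed -= logic_ticks
        events.append("Render")
        owed += frame_ticks  # drawing the frame used up this much time
    return events


if __name__ == "__main__":
    # Normal run: 60 Hz display, so each frame takes 1 tick.
    print(", ".join(simulate(frame_ticks=1, frames=6)))
    # Profiled run: 5 Hz display, so each frame takes 12 ticks.
    print(", ".join(simulate(frame_ticks=12, frames=3)))
```

At 60 Hz the loop settles into one update for every three renders. At 5 Hz every frame owes four logic steps, so the same code spends most of its calls in the update even though the update itself has not become any slower.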
Cipher (the game engine I develop) has a very useful feature to allow the player to record entire sessions, and I would recommend anyone to add such a feature to their game. All the inputs, including the passing of time, are recorded to a log file which can be played back at a later date. If I want to profile the game, I will first record a relevant sequence of the game using this feature and then play it back in the profiler. This has two main advantages. First off, I am profiling what actually happens in the game, as my recording reproduces the original game in a frame for frame identical fashion. Secondly, I have a consistent set of data to profile. As I optimise the code, I can replay the exact same sequence of events and easily compare different optimisation methods. This recording and playback feature has one more advantage. All our testers always play the game with recording enabled, so if they encounter a problem they can attach the recording to the bug report, which makes it easy for a programmer to reproduce the bug and fix it.
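The idea behind such a recording feature can be sketched in a few lines. This is not Cipher's implementation or file format: `step`, `play` and `replay` are hypothetical names, the log is simply a list of JSON lines, and the sketch assumes the game update is deterministic, meaning it depends only on the previous state, the elapsed time and the buttons held.

```python
import json


def step(state, dt, buttons):
    """Hypothetical deterministic update driven only by its inputs."""
    x, score = state
    if "right" in buttons:
        x += 100.0 * dt
    if "left" in buttons:
        x -= 100.0 * dt
    if "fire" in buttons:
        score += 1
    return (x, score)


def play(frames, log=None):
    """Run the game over (dt, buttons) frames, optionally recording them."""
    state = (0.0, 0)
    for dt, buttons in frames:
        if log is not None:
            # Record the passing of time as well as the controller input.
            log.append(json.dumps({"dt": dt, "buttons": sorted(buttons)}))
        state = step(state, dt, buttons)
    return state


def replay(log):
    """Feed a recorded log back through exactly the same update code."""
    frames = []
    for line in log:
        entry = json.loads(line)
        frames.append((entry["dt"], set(entry["buttons"])))
    return play(frames)


if __name__ == "__main__":
    session = [(0.016, {"right"}), (0.021, {"right", "fire"}), (0.017, {"left"})]
    log = []
    live_result = play(session, log)
    print(live_result == replay(log))
```

Because the frame times are stored in the log rather than read from the clock during playback, the replay follows the same path through the code even when a profiler makes every frame take far longer than it did originally.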
Always remember that you cannot measure something without affecting it. Different profilers affect performance in different ways, but as long as you are aware of how that will affect what you are testing, you should be able to take it into account.
One of Cipher's tools, imaginatively called ciphertool, is used to batch convert models and animations into Cipher's model file format. It performs a lot of optimisations to the models as they are processed, batching up geometry, cleaning up animations, tweaking models to support volume shadows. This tool was working fine and everyone was happy with the results, until one of our artists tried to convert a 60,000 polygon model with it. It just sat there, looking like it had crashed. A quick check in the debugger revealed that it hadn't crashed, but was just taking rather a long time to perform a particular calculation. I spent a bit of time with the profiler before I discovered that there was an order N cubed algorithm used to generate the shadow volumes, where N was the number of polygons. The solution was to replace the algorithm with a better one that worked out at order N and reduced the execution time for this particular model from 15 hours to 3 seconds. I could have optimised the N cubed version until I was blue in the face, but it would have achieved little to dent the 15 hour execution time.
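The article does not show either version of the shadow volume code, so the following is only an illustration of the same kind of change on a related, smaller problem: finding the edges shared by more than one triangle of a mesh. The pairwise version compares every triangle with every other one, while the hashed version visits each triangle once and counts edges in a dictionary. All function names here are hypothetical, and triangles are assumed to be tuples of three distinct vertex indices.

```python
from collections import defaultdict


def edges_of(triangle):
    """The three edges of a triangle as order-independent vertex pairs."""
    a, b, c = triangle
    return [frozenset((a, b)), frozenset((b, c)), frozenset((c, a))]


def shared_edges_pairwise(triangles):
    """Compare every triangle with every other one: order N squared."""
    shared = set()
    for i, first in enumerate(triangles):
        for second in triangles[i + 1:]:
            shared.update(set(edges_of(first)) & set(edges_of(second)))
    return shared


def shared_edges_hashed(triangles):
    """Visit each triangle once and count its edges: order N on average."""
    counts = defaultdict(int)
    for triangle in triangles:
        for edge in edges_of(triangle):
            counts[edge] += 1
    return {edge for edge, count in counts.items() if count > 1}


if __name__ == "__main__":
    # A tetrahedron: four triangles, every edge shared by two of them.
    tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
    print(shared_edges_pairwise(tetrahedron) == shared_edges_hashed(tetrahedron))
```

Both functions return the same answer, but only the hashed one scales to tens of thousands of polygons. No amount of tuning the inner loop of the pairwise version changes how its cost grows with N.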
Learning to understand the information collected from the profiler is important. I have seen many people run the profiler, see which function comes out at the top, and then set to work trying to make that function perform its tasks more quickly. The real trick is to try and look at the bigger picture and figure out why that particular function is proving to be a problem. Does it make any sense? How many times is it being called? Does that make any sense? Who is calling it? Are they using the function correctly? Are you using a bad algorithm? The following is a checklist of things to think about before you try and optimise something...
You won't always be expected to optimise for speed. Games often have a lot of other limitations imposed on them by the hardware that have to be taken into account, such as memory, disk space and network bandwidth. You may be forced to give up some of your hard-won performance in exchange for an improvement in one of these other areas.
"Premature optimization is the root of all evil", Donald Knuth.
Never a truer word has been spoken. Optimisation almost always makes your code more complex, harder to understand and hopefully faster. Clever caching schemes and more sophisticated algorithms make your code less flexible and less able to adapt to new situations. Avoid optimising code indiscriminately and only optimise when you have to. Absolutely avoid optimising code before it has even been finished.