Optimize Your Code with Register Variables

by Robert Zale, PowerBASIC Inc.

Register variables... a powerful tool for optimization, but often overlooked or misunderstood. Many compilers use them in one form or another. Better compilers, including all 32-bit PowerBASIC products, offer automatic allocation as well as specific control by the programmer. But more about that later.

So just what is a register? It's simply a small area of memory, located directly on the CPU. The Intel x86 chips offer eight 32-bit general-purpose registers, while the x87 numeric coprocessor sports another eight 80-bit floating point registers. Typically, registers are used by the compiler for temporary storage and calculation. Because they're "on-chip", access is very fast... much faster than conventional memory. Even better, the code needed to access them is smaller, too!

So why don't we examine the code in a Sub or Function to find the few local variables which are used the most? Then, instead of storing these popular variables in memory, we could just reserve a couple of the CPU registers and store them there. We'd get a big boost in speed, and smaller code size, too. There you have it: REGISTER VARIABLES.

So just how much of a difference do register variables make? Let's look at a very simple example written in 32-bit PB/CC:

Function PBMain()
    Register X&, Counter&
    For Counter& = 1 To 300000000
        X& = Counter& + Counter& + Counter&
    Next
End Function

With Register Variables disabled on a P2/300, this runs in 7 seconds. Now, turn them on, and the identical code runs in under 4 seconds, a major improvement! That's close to double the execution speed!
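
If you'd like to reproduce a comparison like this on your own machine, one approach is to time the loop with the TIMER function. What follows is only a sketch: the names StartTime# and Elapsed# are ours, and it assumes the run doesn't cross midnight, when TIMER wraps around. For the baseline run, we assume you would change Register to Local and place $REGISTER NONE ahead of the function, so the compiler doesn't allocate registers on its own.

Function PBMain()
    Register X&, Counter&
    Local StartTime#, Elapsed#    ' hypothetical names, for illustration only
    StartTime# = Timer            ' seconds since midnight
    For Counter& = 1 To 300000000
        X& = Counter& + Counter& + Counter&
    Next
    Elapsed# = Timer - StartTime#
    Print "Elapsed seconds:"; Elapsed#
End Function

Your timings will differ with the processor and compiler version, so treat the ratio between the two runs as the interesting number, not the absolute figures.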

Will it work for floating point code, too? You bet it will! Any local, extended precision (80-bit) float variable can be declared as a Register Variable. It even helps to use them for storage of overworked numeric constants -- anything that limits memory access will help your bottom line performance. Take this simple example:

Function PBMain()
    Register Counter&
    Register x##, y##
    x## = 1
    y## = 0.00001
    For Counter& = 1 To 100000000
        x## = x## * y##
    Next
End Function

Watch this closely... It just gets better and better! Without the benefit of Register Variables, this code runs in about 6.7 seconds. Fairly respectable by most standards. But turn them on, and running time is slashed to just 1.6 seconds. More than four times faster! Just by telling the compiler to make full use of Register Variables.

Register variables are always local to the Sub or Function where they are declared. In the current version of PowerBASIC, there may be up to two integer class register variables (word/dword/integer/long) and up to four extended precision (80-bit) floats within each Sub or Function. It's possible that future versions of the compiler will change these limits, so we place no restrictions on how many you may declare. Any "extra" register variables are simply reclassified by the compiler as locals.
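
As a small illustration of that last point, the sketch below asks for three integer class Register Variables where the current compilers can supply only two. The variable names are ours, and we make no assumption about which of the three ends up as an ordinary local; the point is simply that the code compiles and behaves the same either way.

Function PBMain()
    ' Three integer class Register Variables are requested. Only two
    ' CPU registers are available, so one of these is quietly treated
    ' as an ordinary local variable.
    Register Row&, Col&, Total&
    Total& = 0
    For Row& = 1 To 100
        For Col& = 1 To 100
            Total& = Total& + Row& * Col&
        Next
    Next
    Print Total&
End Function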

PowerBASIC stores the two integer class Register Variables in the CPU registers ESI and EDI, though you really don't need to worry about it unless you use inline assembler. Since these two registers aren't addressable at the byte level, PowerBASIC must disallow bytes as Register Variables. Supported integer class variables would be any of integer, long, word, and dword. As with all 32-bit programs, 32-bit data size is preferred. It's faster and the generated code is smaller, too. In the 32-bit world, avoid the use of bytes, integers, and words whenever it's reasonable. Your code will benefit!

The Intel x87 numeric coprocessor offers eight 80-bit floating point registers. PowerBASIC takes four of them to store Register Variables. Since these registers are 80 bits in size, only extended precision floating point variables (such as x##) are eligible. If singles or doubles were allowed, round-off discrepancies would be introduced. Simply put, that would mean slight changes in calculation depending upon whether Register Variables were enabled: an unacceptable option.

The REGISTER statement, supported in PB/CC and PB/DLL, allows you to choose which variables will be classified as register variables. They are local, so each Sub and Function may have its own unique set. If you do not make the choice in a particular Sub/Function, the compiler will attempt to choose for you. By default, the compiler will always assign any integer class local variables available. Extended precision float variables will be automatically allocated only in functions which contain no external function calls.

The $REGISTER metastatement, also supported in PB/CC and PB/DLL, allows you to specify the method of auto-allocation of Register Variables. $REGISTER ALL requests automatic allocation of all possible register variables, both integer class and floating point. $REGISTER DEFAULT requests automatic allocation of integer class variables, but allocates floating point variables only in subs and functions which contain no external function calls. $REGISTER NONE disables automatic allocation of register variables. In the current version of the compilers, $REGISTER applies to the entire program. It must therefore precede any Sub or Function.
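
Here's how that might look in practice. This is a sketch only: it assumes, as the description of $REGISTER NONE above suggests, that an explicit REGISTER statement is still honored when automatic allocation is turned off. The variable names are ours.

$REGISTER NONE    ' applies to the whole program, so it comes first

Function PBMain()
    Register Counter&    ' chosen by the programmer
    Local Total&         ' stays in memory: nothing is allocated automatically
    Total& = 0
    For Counter& = 1 To 1000
        Total& = Total& + Counter&
    Next
    Print Total&
End Function

Swapping the first line for $REGISTER ALL or $REGISTER DEFAULT changes only what the compiler does with the variables you haven't declared with REGISTER yourself.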

Integer class register variables are almost always desirable and beneficial. It's generally best to select those which are referenced most frequently, such as For/Next Loop Counter Variables, and those used repeatedly as array indexes. Float register variables should generally be chosen with a bit more caution, since the compiler must generate code to save and restore them to conventional memory around each call to a Sub or Function. In some rather rare cases, it is possible that float register variables could actually reduce execution speed. However, they are extremely valuable with intensive floating point calculations in functions which have few references to other Subs or Functions.
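
The typical integer case looks something like the sketch below, where the loop counter doubles as the array index and is referenced on every pass. The array and variable names are ours, chosen only for illustration.

Function PBMain()
    Register Index&, Total&    ' the two most heavily used locals
    Dim Values&(1 To 1000)
    ' Fill the array, then add it up. Index& is touched on every
    ' iteration of both loops, which makes it a natural register choice.
    For Index& = 1 To 1000
        Values&(Index&) = Index&
    Next
    Total& = 0
    For Index& = 1 To 1000
        Total& = Total& + Values&(Index&)
    Next
    Print Total&
End Function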

Due to the structure of the numeric coprocessor, and the instruction set available, the first float register variable declared in your program has far more optimization possibilities than the others. To gain the greatest advantage in execution speed, take care to declare first the variable which is used most within floating point expressions (that is, on the right side of the '=' assignment operator). Also, remember it is typically valuable to assign floating point constants to register variables when they are used in repetitive or intensive calculations.
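
To make that concrete, here is a sketch in which x## appears three times on the right side of the assignment, so it is declared first. Scale## holds a constant that would otherwise be fetched from memory on every pass. All of the names and values are ours, and how much you gain will depend on the code the compiler generates for your particular expression.

Function PBMain()
    Register Counter&
    ' x## is used most often on the right side of '=', so it goes first.
    Register x##, Total##, Scale##
    x## = 0.5
    Scale## = 2        ' an overworked constant, kept in a register
    Total## = 0
    For Counter& = 1 To 1000000
        Total## = Total## + (x## * x## * Scale## + x##)
    Next
    Print Total##
End Function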

So what about Register Variables and inline assembler? Generally speaking, it's pretty straightforward. In most cases, you'll only be accessing integer class variables, and you can do that by just using the variable name. For example:

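Register xyz&
' Load eax from the Register Variable, referring to it by name
asm  mov eax, xyz&
' Load eax from the same place, referring to the hardware register
asm  mov eax, esi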

In the above example, both "mov" instructions are interpreted in exactly the same way, as PowerBASIC is smart enough to understand that the Register Variable xyz& is stored in the CPU register ESI. It's much to your advantage to use intuitive variable names rather than hardware registers, for obvious reasons, so be sure you do that whenever possible. That rule always applies when you are dealing with integer class Register Variables: integer, long, word, and dword.
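
One practical reason to prefer the name: the code keeps working no matter where the compiler decides to put the variable. The sketch below is our own illustration, not taken from the compiler documentation. It assumes the inline assembler accepts a Register Variable name as the destination operand in the same way it accepts one as a source.

Function PBMain()
    Register Count&
    Count& = 10
    ' Both instructions name the variable, not the hardware register,
    ' so they don't depend on Count& living in ESI.
    asm  add Count&, 5
    asm  inc Count&
    Print Count&
End Function

Had the code been written against esi directly, it would quietly operate on the wrong value if Count& were ever allocated to a different register or reclassified as a local.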

You probably won't have nearly as much need to access floating point Register Variables from inline assembler, and that's good! If you try it, the rewards can be great, but there are hazards. You must use a good deal more care with assembler floating point code in functions with Register Variables. Floating point register variables may occupy up to four of the coprocessor registers, so you must limit your use of x87 registers to the remaining four. Further, float register variables should not be referenced by name in assembler code, as the compiler can't always track the register locations with absolute certainty.

Here's why... Registers on the x87 are oriented as a stack. The first value loaded is saved in register st(0). The second value loaded goes to st(0) as well, but pushes the first to st(1). And so on, for up to eight float registers. When you declare float Register Variables, the first is stored in st(0), the second in st(1), then st(2) and st(3). Each time more values are loaded or stored, the Register Variables can shift up to four register positions in either direction! This isn't a problem with compiled PowerBASIC code, but it can be a logical nightmare with inline assembler.

So the PowerBASIC rule is simple: Never, ever reference a float register variable by name from inline assembler. Just don't do it! (smile) Reference floats only by the register: st(0) through st(7). That's the safe thing to do.

One final restriction: Since Register Variables have no memory location, they cannot be used with the VARPTR() function.
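
If you do need an address, one workaround is to copy the value into an ordinary local first. The sketch below uses names of our own choosing. Keep in mind that the pointer refers to the copy, so anything written through it changes Snapshot& and not the Register Variable.

Function PBMain()
    Register Counter&
    Local Snapshot&           ' an ordinary local, so it has a memory address
    Local Address As Dword
    Counter& = 42
    ' VarPtr(Counter&) is not allowed here: Counter& has no memory location.
    Snapshot& = Counter&
    Address = VarPtr(Snapshot&)
    Print Address
End Function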

Register Variables are supported in both 32-bit PowerBASIC compilers. PB/CC, the PowerBASIC Console Compiler, creates text mode applications for Win95/98/NT. It is ideally suited for those situations where a graphical user-interface is neither needed nor desirable, such as Internet Web Server and CGI applications, or for a straightforward port of DOS Basic code to 32-bit Windows. PB/DLL, the PowerBASIC DLL Compiler, creates DLLs and executables for Win95/98/NT. Its industry-standard DLLs may be accessed from any Windows language to enhance capabilities and total performance. Both compilers offer multi-threaded capabilities, inline assembler, pointers, unsigned integers, conditional compilation, and much more.