Notes on 32 bit Assembler

by Steve Hutchesson

One of the big advantages in learning to write 32 bit assembler is the capacity to write it in a familiar environment. In most instances, learning in a pure assembler environment (MASM, TASM, etc.) is a very difficult way to begin, as the startup code can be very complicated and the area is not all that well documented.

Programmers writing in either of the two PowerBASIC 32 bit compilers start with a considerable advantage: the difficult and messy part of manipulating stack parameters at both the beginning and end of a function/procedure call is done by the compiler.

The other advantage is the capacity to mix assembler line by line with high level functions, so that many of the complicated things can be done in the high level part of the language.

Of the remaining languages that can still write inline ASM, neither C nor Pascal has the same ease or convenience of writing mixed high level/ASM code.

Depending on your sense of humour, one of the best places to start is with unconditional jumps. It is supposed to be politically incorrect to use GOTO in a modern structured program, so the simple answer is not to use it at all; use the real thing,

    ! jmp label  ; the real thing

A common task in programming is to increment or decrement a counter. The use of,

    ! inc var
    ! dec var

is not only very clear coding, it's less typing as well.

Simple code replacements can be done that make the start to inline ASM a lot easier.

' Standard Basic
' ~~~~~~~~~~~~~~

var = 0

Do
    ' code here
    ' ~~~~~~~~~
    var = var + 1
Loop Until var = 1000

can be replaced with,

' Inline ASM
' ~~~~~~~~~~
var = 0

LoopStart:
' code here
' ~~~~~~~~~
! inc var           ; increment var by one
! cmp var, 1000     ; compare it with the constant
! je LoopEnd        ; if it is equal, jump to label "LoopEnd"
! jmp LoopStart     ; if not, jump back to "LoopStart"
LoopEnd:

Simple examples like this are the incremental approach: the more you use inline ASM, the easier it gets. There is little speed improvement in a simple example of this type, but loops like this are the building blocks for the very high speed loop optimization that comes with practice.

Bob Zale's implementation of LOCAL for placing variables on the stack is very well done.

  In MASM,
        LOCAL wc   :WNDCLASSEX
  In PowerBASIC
        LOCAL wc as WNDCLASSEX

Using the LOCAL capacity is a very tidy way of allocating what are normally called automatic stack variables. The Intel documentation recommends placing LOCAL stack variables in descending order of size for performance reasons.

An empty function for writing inline ASM is as follows,


FUNCTION FunctionName(ByVal var1 as LONG, ByVal var2 as LONG) as LONG

LOCAL lvar1 as LONG
LOCAL lvar2 as LONG

' parameters passed to the function
' as well as LOCAL variables can
' be referenced directly in assembler

FUNCTION = lvar2

END FUNCTION
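
As a rough illustration of how the skeleton might be filled in, the sketch below adds the two parameters in assembler and returns the sum. The name AddTwo is made up for this example, and it assumes the usual convention that eax can be used as a scratch register without being preserved.

    FUNCTION AddTwo(ByVal var1 as LONG, ByVal var2 as LONG) as LONG

        LOCAL lvar1 as LONG

        ! mov eax, var1     ; load the first parameter
        ! add eax, var2     ; add the second parameter
        ! mov lvar1, eax    ; store the sum in a LOCAL

        FUNCTION = lvar1    ; return it through the high level code

    END FUNCTION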



A point that most sources stress is the need to use proper 32 bit size variables to take advantage of the processor's performance. Always use 32 bit registers for counters, string manipulation and the like; it is simply faster to run a 32 bit processor on native 32 bit data than on 8 or 16 bit data. The other thing is of course that you have a theoretical counter range of about 4 billion (2^32), not the 64k (65,536) you get by using WORD size registers.
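
A minimal sketch of a 32 bit register used as a loop counter follows. The function name SumTo and the label names are hypothetical; it adds the numbers from n down to 1, using ecx as the counter and eax as the running total, and assumes eax and ecx may be used freely.

    FUNCTION SumTo(ByVal n as LONG) as LONG

        LOCAL total as LONG

        ! xor eax, eax      ; running total = 0
        ! mov ecx, n        ; 32 bit counter
        ! cmp ecx, 0
        ! jle SumDone       ; nothing to add if n is zero or negative
      SumNext:
        ! add eax, ecx      ; add the current count to the total
        ! dec ecx           ; count down by one
        ! jnz SumNext       ; loop until the counter reaches zero
      SumDone:
        ! mov total, eax

        FUNCTION = total

    END FUNCTION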

Two things that can be difficult for a programmer migrating from high level languages to assembler are data size and addressing. In assembler you have,

    DWORD = 00 00 00 00 32 bit register size data
    WORD  = 00 00       16 bit register size data
    BYTE  = 00           8 bit register size data

A quad word (64 bits) is usually constructed in the edx:eax register pair, with edx holding the high 32 bits and eax the low 32 bits.
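
For instance, the mul instruction leaves its 64 bit product split across the two registers, with the high 32 bits in edx and the low 32 bits in eax. The sketch below assumes DWORD variables named valA, valB, loPart and hiPart, which exist only for this illustration.

    LOCAL valA   as DWORD
    LOCAL valB   as DWORD
    LOCAL loPart as DWORD
    LOCAL hiPart as DWORD
    valA = 100000
    valB = 100000
    ! mov eax, valA     ; first operand must be in eax
    ! mov ecx, valB
    ! mul ecx           ; unsigned multiply, 64 bit result in the pair
    ! mov loPart, eax   ; low 32 bits  (1410065408 here)
    ! mov hiPart, edx   ; high 32 bits (2 here)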

An ADDRESS is the memory location of data, labels, functions etc...

In the ordinary sense,

    LOCAL var as LONG
    LOCAL adr as LONG
    var = 100
    adr = VarPtr(var)
var is the CONTENT; adr is the ADDRESS of the CONTENT.

If you convert both var and adr to string data for display,

    disp$ = str$(var)+" "+str$(adr)

you will end up with a display of 100 plus a 32 bit ADDRESS of where it is in memory.

PowerBASIC gives you a number of ADDRESS retrieval functions,

    VarPtr()   ' 32 bit ADDRESS of variable
    StrPtr()   ' 32 bit ADDRESS of BYTE data in dynamic string
    CodePtr()  ' 32 bit ADDRESS of location in code. EG CodePtr(WndProc)

You can also use the ASM mnemonic,

    ! lea eax, var  ; load effective address of var into eax
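
A small sketch of lea in use, assuming a LONG variable var as in the earlier example (adr is declared as a DWORD here, and readBack is a name invented for this illustration). lea fetches the ADDRESS of var, and reading through that ADDRESS gets back the CONTENT.

    LOCAL var      as LONG
    LOCAL adr      as DWORD
    LOCAL readBack as LONG
    var = 100
    ! lea eax, var      ; eax = ADDRESS of var
    ! mov adr, eax      ; adr now holds the same value as VarPtr(var)
    ! mov ecx, [eax]    ; read the CONTENT stored at that ADDRESS
    ! mov readBack, ecx ; readBack is now 100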

Here is one use of inline ASM in doing a direct function call without using a declaration in the header file.

As a normal function it would be called with,

fRV = FunctionName(hWnd,Edit1&,hInstance)

Instead, you get its ADDRESS with the GetProcAddress() function, manually push the parameters onto the stack in reverse order (last parameter first) and call the function directly.

    LOCAL libName  as ASCIIZ * 128    '< DLL name
    LOCAL szFnName as ASCIIZ * 24     '< Proc name
    LOCAL lpfnAdd  as DWORD           '< function address
    LOCAL hDLL     as LONG            '< DLL handle
    LOCAL fRV      as LONG            '< function return value
    libName  = "YOURDLL.DLL"
    szFnName = "FunctionName"
    hDLL = LoadLibrary(libName)
    lpfnAdd = GetProcAddress(hDLL, szFnName) ' get ADDRESS of function
    ! push hInstance     ; place 3rd parameter on stack
    ! push Edit1&        ; place 2nd parameter on stack
    ! push hWnd          ; place 1st parameter on stack
    ! call lpfnAdd       ; call the function
    ! mov fRV, eax       ; place return value from eax into variable
    FreeLibrary hDLL


Unlike the documentation for the Windows APIs, the reference material from Intel is excellent, to the point of being overkill. You can download the Pentium manuals directly from Intel and, if you don't mind using the equivalent of a medium size rainforest in paper, you can even print them out.