This is nasm.info, produced by Makeinfo version 3.12f from nasmdoc.texi.

INFO-DIR-SECTION Programming
START-INFO-DIR-ENTRY
* NASM: (nasm).                The Netwide Assembler for x86.
END-INFO-DIR-ENTRY


   This file documents NASM, the Netwide Assembler: an assembler
targeting the Intel x86 series of processors, with portable source.

   Copyright 1997 Simon Tatham

   All rights reserved. This document is redistributable under the
licence given in the file "Licence" distributed in the NASM archive.


File: nasm.info,  Node: Section 7.5.2,  Next: Section 7.5.3,  Prev: Section 7.5.1,  Up: Section 7.5

7.5.2. Borland Pascal Segment Name Restrictions
***********************************************

   Since Borland Pascal's internal unit file format is completely
different from `OBJ', it only makes a very sketchy job of actually
reading and understanding the various information contained in a real
`OBJ' file when it links that in. Therefore an object file intended to
be linked to a Pascal program must obey a number of restrictions:

   * Procedures and functions must be in a segment whose name is either
     `CODE', `CSEG', or something ending in `_TEXT'.

   * Initialised data must be in a segment whose name is either `CONST'
     or something ending in `_DATA'.

   * Uninitialised data must be in a segment whose name is either
     `DATA', `DSEG', or something ending in `_BSS'.

   * Any other segments in the object file are completely ignored.
     `GROUP' directives and segment attributes are also ignored.
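
   As a rough sketch of a module that follows these rules, assuming
NASM's `obj' output format (the names `myproc', `table' and `buffer'
are hypothetical, and the far return assumes a routine that takes no
arguments and is declared far on the Pascal side):

               segment CODE           ; or CSEG, or xxx_TEXT
               global myproc
     myproc:   retf
               segment CONST          ; or xxx_DATA
     table     dw 1,2,3               ; initialised data
               segment DATA           ; or DSEG, or xxx_BSS
     buffer    resw 16                ; uninitialised data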


File: nasm.info,  Node: Section 7.5.3,  Next: Chapter 8,  Prev: Section 7.5.2,  Up: Section 7.5

7.5.3. Using `c16.mac' With Pascal Programs
*******************************************

   The `c16.mac' macro package, described in *Note Section 7.4.5::, can
also be used to simplify writing functions to be called from Pascal
programs, if you code `%define PASCAL'. This definition ensures that
functions are far (it implies `FARCODE'), and also causes procedure
return instructions to be generated with an operand.

   Defining `PASCAL' does not change the code which calculates the
argument offsets; you must declare your function's arguments in reverse
order. For example:

     %define PASCAL
               proc _pascalproc
     %$j       arg 4
     %$i       arg
               mov ax,[bp + %$i]
               mov bx,[bp + %$j]
               mov es,[bp + %$j + 2]
               add ax,[es:bx]
               endproc

   This defines the same routine, conceptually, as the example in *Note
Section 7.4.5::: it defines a function taking two arguments, an integer
and a pointer to an integer, which returns the sum of the integer and
the contents of the pointer. The only difference between this code and
the large-model C version is that `PASCAL' is defined instead of
`FARCODE', and that the arguments are declared in reverse order.
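
   For comparison, a hand-written routine with the same behaviour might
look like the following sketch. It is not the literal expansion of the
`c16.mac' macros, and the name `_pascalproc2' is hypothetical. With a
far call, the return address occupies four bytes at `[bp+2]', so the
argument pushed last (here `j') sits lowest, at `[bp+6]':

     _pascalproc2:
               push bp
               mov bp,sp
               mov ax,[bp+10]         ; i, pushed first
               mov bx,[bp+6]          ; offset half of j
               mov es,[bp+8]          ; segment half of j
               add ax,[es:bx]         ; far pointer, so use ES
               pop bp
               retf 6                 ; callee removes 2+4 bytes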


File: nasm.info,  Node: Chapter 8,  Next: Section 8.1,  Prev: Section 7.5.3,  Up: Top

Chapter 8: Writing 32-bit Code (Unix, Win32, DJGPP)
***************************************************

   This chapter attempts to cover some of the common issues involved
when writing 32-bit code, to run under Win32 or Unix, or to be linked
with C code generated by a Unix-style C compiler such as DJGPP. It
covers how to write assembly code to interface with 32-bit C routines,
and how to write position-independent code for shared libraries.

   Almost all 32-bit code, and in particular all code running under
Win32, DJGPP or any of the PC Unix variants, runs in _flat_ memory
model.  This means that the segment registers and paging have already
been set up to give you the same 32-bit 4GB address space no matter
what segment you work relative to, and that you should ignore all
segment registers completely. When writing flat-model application code,
you never need to use a segment override or modify any segment
register, and the code-section addresses you pass to `CALL' and `JMP'
live in the same address space as the data-section addresses you access
your variables by and the stack-section addresses you access local
variables and procedure parameters by. Every address is 32 bits long
and contains only an offset part.
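
   As a small illustration (the labels `myfunc' and `myvar' are
hypothetical), code and data addresses can be handled in exactly the
same way in flat-model code, with no segment overrides:

               mov eax,myfunc         ; address in the code section
               call eax               ; plain near call via a register
               mov edx,myvar          ; address in the data section
               mov ecx,[edx]          ; same 32-bit address space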

* Menu:

* Section 8.1:: Interfacing to 32-bit C Programs
* Section 8.2:: Writing NetBSD/FreeBSD/OpenBSD and Linux/ELF Shared Libraries


File: nasm.info,  Node: Section 8.1,  Next: Section 8.1.1,  Prev: Chapter 8,  Up: Chapter 8

8.1. Interfacing to 32-bit C Programs
*************************************

   A lot of the discussion in *Note Section 7.4::, about interfacing to
16-bit C programs, still applies when working in 32 bits. The absence
of memory models or segmentation worries simplifies things a lot.

* Menu:

* Section 8.1.1:: External Symbol Names
* Section 8.1.2:: Function Definitions and Function Calls
* Section 8.1.3:: Accessing Data Items
* Section 8.1.4:: `c32.mac': Helper Macros for the 32-bit C Interface


File: nasm.info,  Node: Section 8.1.1,  Next: Section 8.1.2,  Prev: Section 8.1,  Up: Section 8.1

8.1.1. External Symbol Names
****************************

   Most 32-bit C compilers share the convention used by 16-bit
compilers, that the names of all global symbols (functions or data)
they define are formed by prefixing an underscore to the name as it
appears in the C program.  However, not all of them do: the ELF
specification states that C symbols do _not_ have a leading underscore
on their assembly-language names.

   The older Linux `a.out' C compiler, all Win32 compilers, DJGPP, and
NetBSD and FreeBSD, all use the leading underscore; for these compilers,
the macros `cextern' and `cglobal', as given in *Note Section 7.4.1::,
will still work. For ELF, though, the leading underscore should not be
used.
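
   One way to keep a source file usable with both conventions is to
hide the difference behind a macro. In this sketch, `ELF_BUILD' and
`c_printf' are hypothetical names; `ELF_BUILD' is assumed to be defined
by you when assembling for ELF:

     %ifdef ELF_BUILD
               extern printf          ; ELF: no leading underscore
     %define c_printf printf
     %else
               extern _printf         ; a.out, Win32, DJGPP
     %define c_printf _printf
     %endif
               call c_printf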


File: nasm.info,  Node: Section 8.1.2,  Next: Section 8.1.3,  Prev: Section 8.1.1,  Up: Section 8.1

8.1.2. Function Definitions and Function Calls
**********************************************

   The C calling convention in 32-bit programs is as follows. In the
following description, the words _caller_ and _callee_ are used to
denote the function doing the calling and the function which gets
called.

   * The caller pushes the function's parameters on the stack, one after
     another, in reverse order (right to left, so that the first
     argument specified to the function is pushed last).

   * The caller then executes a near `CALL' instruction to pass control
     to the callee.

   * The callee receives control, and typically (although this is not
     actually necessary, in functions which do not need to access their
     parameters) starts by saving the value of `ESP' in `EBP' so as to
     be able to use `EBP' as a base pointer to find its parameters on
     the stack.  However, the caller was probably doing this too, so
     part of the calling convention states that `EBP' must be preserved
     by any C function.  Hence the callee, if it is going to set up
     `EBP' as a frame pointer, must push the previous value first.

   * The callee may then access its parameters relative to `EBP'. The
     doubleword at `[EBP]' holds the previous value of `EBP' as it was
     pushed; the next doubleword, at `[EBP+4]', holds the return
     address, pushed implicitly by `CALL'. The parameters start after
     that, at `[EBP+8]'. The leftmost parameter of the function, since
     it was pushed last, is accessible at this offset from `EBP'; the
     others follow, at successively greater offsets. Thus, in a
     function such as `printf' which takes a variable number of
     parameters, the pushing of the parameters in reverse order means
     that the function knows where to find its first parameter, which
     tells it the number and type of the remaining ones.

   * The callee may also wish to decrease `ESP' further, so as to
     allocate space on the stack for local variables, which will then
     be accessible at negative offsets from `EBP'.

   * The callee, if it wishes to return a value to the caller, should
     leave the value in `AL', `AX' or `EAX' depending on the size of the
     value. Floating-point results are typically returned in `ST0'.

   * Once the callee has finished processing, it restores `ESP' from
     `EBP' if it had allocated local stack space, then pops the previous
     value of `EBP', and returns via `RET' (equivalently, `RETN').

   * When the caller regains control from the callee, the function
     parameters are still on the stack, so it typically adds an
     immediate constant to `ESP' to remove them (instead of executing a
     number of slow `POP' instructions). Thus, if a function is
     accidentally called with the wrong number of parameters due to a
     prototype mismatch, the stack will still be returned to a sensible
     state since the caller, which _knows_ how many parameters it
     pushed, does the removing.

   There is an alternative calling convention used by Win32 programs for
Windows API calls, and also for functions called _by_ the Windows API
such as window procedures: they follow what Microsoft calls the
`__stdcall' convention. This is slightly closer to the Pascal
convention, in that the callee clears the stack by passing a parameter
to the `RET' instruction. However, the parameters are still pushed in
right-to-left order.

   Thus, you would define a function in C style in the following way:

               global _myfunc
     _myfunc:  push ebp
               mov ebp,esp
               sub esp,0x40           ; 64 bytes of local stack space
               mov ebx,[ebp+8]        ; first parameter to function
               ; some more code
               leave                  ; mov esp,ebp / pop ebp
               ret

   At the other end of the process, to call a C function from your
assembly code, you would do something like this:

               extern _printf
               ; and then, further down...
               push dword [myint]     ; one of my integer variables
               push dword mystring    ; pointer into my data segment
               call _printf
               add esp,byte 8         ; `byte' saves space
               ; then those data items...
               segment _DATA
     myint     dd 1234
     mystring  db 'This number -> %d <- should be 1234',10,0

   This piece of code is the assembly equivalent of the C code

         int myint = 1234;
         printf("This number -> %d <- should be 1234\n", myint);
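
   For comparison, a function following the `__stdcall' convention
described above might look like this sketch. The name `_addtwo' is
hypothetical, and any compiler-specific name decoration is ignored here:

     _addtwo:  push ebp
               mov ebp,esp
               mov eax,[ebp+8]        ; first parameter
               add eax,[ebp+12]       ; second parameter
               pop ebp
               ret 8                  ; callee removes both parameters

   The caller still pushes right to left, but does not adjust `ESP'
afterwards:

               push dword 2           ; second parameter
               push dword 1           ; first parameter
               call _addtwo           ; result (3) is left in EAX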


File: nasm.info,  Node: Section 8.1.3,  Next: Section 8.1.4,  Prev: Section 8.1.2,  Up: Section 8.1

8.1.3. Accessing Data Items
***************************

   To get at the contents of C variables, or to declare variables which
C can access, you need only declare the names as `GLOBAL' or `EXTERN'.
(Again, except under ELF, the names require leading underscores, as
stated in *Note Section 8.1.1::.) Thus, a C variable declared as `int i'
can be accessed from assembler as

               extern _i
               mov eax,[_i]

   And to declare your own integer variable which C programs can access
as `extern int j', you do this (making sure you are assembling in the
`_DATA' segment, if necessary):

               global _j
     _j        dd 0

   To access a C array, you need to know the size of the components of
the array. For example, `int' variables are four bytes long, so if a C
program declares an array as `int a[10]', you can access `a[3]' by
coding `mov eax,[_a+12]'. (The byte offset 12 is obtained by multiplying
the desired array index, 3, by the size of the array element, 4.) The
sizes of the C base types in 32-bit compilers are: 1 for `char', 2 for
`short', 4 for `int', `long' and `float', and 8 for `double'. Pointers,
being 32-bit addresses, are also 4 bytes long.
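
   The same calculation can be left to the processor by using a scaled
index register. A minimal sketch, assuming the `int a[10]' array above
is declared in the C program:

               extern _a
               mov eax,[_a+12]        ; a[3], offset worked out by hand
               mov ecx,3
               mov eax,[_a+ecx*4]     ; a[ecx], scaled by element size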

   To access a C data structure, you need to know the offset from the
base of the structure to the field you are interested in. You can
either do this by converting the C structure definition into a NASM
structure definition (using `STRUC'), or by calculating the one offset
and using just that.

   To do either of these, you should read your C compiler's manual to
find out how it organises data structures. NASM gives no special
alignment to structure members in its own `STRUC' macro, so you have to
specify alignment yourself if the C compiler generates it. Typically,
you might find that a structure like

     struct {
         char c;
         int i;
     } foo;

   might be eight bytes long rather than five, since the `int' field
would be aligned to a four-byte boundary. However, this sort of feature
is sometimes a configurable option in the C compiler, either using
command-line options or `#pragma' lines, so you have to find out how
your own compiler does it.
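
   Assuming the compiler does pad the structure above to eight bytes,
a matching NASM definition might look like this sketch (the names
`foo_t', `foo_c' and `foo_i' are hypothetical, and `_foo' assumes a
compiler that uses leading underscores):

               struc foo_t
     foo_c     resb 1                 ; char c, at offset 0
               resb 3                 ; explicit padding
     foo_i     resd 1                 ; int i, at offset 4
               endstruc

               extern _foo
               mov eax,[_foo+foo_i]   ; load foo.i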


File: nasm.info,  Node: Section 8.1.4,  Next: Section 8.2,  Prev: Section 8.1.3,  Up: Section 8.1

8.1.4. `c32.mac': Helper Macros for the 32-bit C Interface
**********************************************************

   Included in the NASM archives, in the `misc' directory, is a file
`c32.mac' of macros. It defines three macros: `proc', `arg' and
`endproc'. These are intended to be used for C-style procedure
definitions, and they automate a lot of the work involved in keeping
track of the calling convention.

   An example of an assembly function using the macro set is given here:

               proc _proc32
     %$i       arg
     %$j       arg
               mov eax,[ebp + %$i]
               mov ebx,[ebp + %$j]
               add eax,[ebx]
               endproc

   This defines `_proc32' to be a procedure taking two arguments, the
first (`i') an integer and the second (`j') a pointer to an integer. It
returns `i + *j'.

   Note that the `arg' macro has an `EQU' as the first line of its
expansion, and since the label before the macro call gets prepended to
the first line of the expanded macro, the `EQU' works, defining `%$i'
to be an offset from `EBP'. A context-local variable is used, local to
the context pushed by the `proc' macro and popped by the `endproc'
macro, so that the same argument name can be used in later procedures.
Of course, you don't _have_ to do that.

   `arg' can take an optional parameter, giving the size of the
argument.  If no size is given, 4 is assumed, since it is likely that
many function parameters will be of type `int' or pointers.
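
   As a sketch of the size parameter in use (the procedure `_scale' is
hypothetical), the following function takes a `double' followed by an
`int' and returns their product in `ST0':

               proc _scale
     %$d       arg 8                  ; double: 8 bytes
     %$n       arg                    ; int: default size of 4
               fld qword [ebp + %$d]
               fimul dword [ebp + %$n]
               endproc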


File: nasm.info,  Node: Section 8.2,  Next: Section 8.2.1,  Prev: Section 8.1.4,  Up: Chapter 8

8.2. Writing NetBSD/FreeBSD/OpenBSD and Linux/ELF Shared Libraries
******************************************************************

   ELF replaced the older `a.out' object file format under Linux because
it contains support for position-independent code (PIC), which makes
writing shared libraries much easier. NASM supports the ELF
position-independent code features, so you can write Linux ELF shared
libraries in NASM.

   NetBSD, and its close cousins FreeBSD and OpenBSD, take a different
approach by hacking PIC support into the `a.out' format. NASM supports
this as the `aoutb' output format, so you can write BSD shared
libraries in NASM too.

   The operating system loads a PIC shared library by memory-mapping the
library file at an arbitrarily chosen point in the address space of the
running process. The contents of the library's code section must
therefore not depend on where it is loaded in memory.

   As a result, you cannot get at your variables by writing code like
this:

               mov eax,[myvar]        ; WRONG

   Instead, the linker provides an area of memory called the _global
offset table_, or GOT; the GOT is situated at a constant distance from
your library's code, so if you can find out where your library is
loaded (which is typically done using a `CALL' and `POP' combination),
you can obtain the address of the GOT, and you can then load the
addresses of your variables out of linker-generated entries in the GOT.

   The _data_ section of a PIC shared library does not have these
restrictions: since the data section is writable, it has to be copied
into memory anyway rather than just paged in from the library file, so
as long as it's being copied it can be relocated too. So you can put
ordinary types of relocation in the data section without too much worry
(but see *Note Section 8.2.4:: for a caveat).
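
   For instance, a pointer to one of your own non-exported variables
can be stored in the data section in the ordinary way (the names here
are hypothetical):

               section .data
     myvar     dd 0
     varptr    dd myvar               ; ordinary relocation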

* Menu:

* Section 8.2.1:: Obtaining the Address of the GOT
* Section 8.2.2:: Finding Your Local Data Items
* Section 8.2.3:: Finding External and Common Data Items
* Section 8.2.4:: Exporting Symbols to the Library User
* Section 8.2.5:: Calling Procedures Outside the Library
* Section 8.2.6:: Generating the Library File


File: nasm.info,  Node: Section 8.2.1,  Next: Section 8.2.2,  Prev: Section 8.2,  Up: Section 8.2

8.2.1. Obtaining the Address of the GOT
***************************************

   Each code module in your shared library should define the GOT as an
external symbol:

               extern _GLOBAL_OFFSET_TABLE_   ; in ELF
               extern __GLOBAL_OFFSET_TABLE_  ; in BSD a.out

   At the beginning of any function in your shared library which plans
to access your data or BSS sections, you must first calculate the
address of the GOT. This is typically done by writing the function in
this form:

     func:     push ebp
               mov ebp,esp
               push ebx
               call .get_GOT
     .get_GOT: pop ebx
               add ebx,_GLOBAL_OFFSET_TABLE_+$$-.get_GOT wrt ..gotpc
               ; the function body comes here
               mov ebx,[ebp-4]
               mov esp,ebp
               pop ebp
               ret

   (For BSD, again, the symbol `_GLOBAL_OFFSET_TABLE_' requires a second
leading underscore.)

   The first two lines of this function are simply the standard C
prologue to set up a stack frame, and the last three lines are standard
C function epilogue. The third line, and the fourth to last line, save
and restore the `EBX' register, because PIC shared libraries use this
register to store the address of the GOT.

   The interesting bit is the `CALL' instruction and the following two
lines. The `CALL' and `POP' combination obtains the address of the
label `.get_GOT', without having to know in advance where the program
was loaded (since the `CALL' instruction is encoded relative to the
current position). The `ADD' instruction makes use of one of the
special PIC relocation types: GOTPC relocation. With the `WRT ..gotpc'
qualifier specified, the symbol referenced (here
`_GLOBAL_OFFSET_TABLE_', the special symbol assigned to the GOT) is
given as an offset from the beginning of the section. (Actually, ELF
encodes it as the offset from the operand field of the `ADD'
instruction, but NASM simplifies this deliberately, so you do things the
same way for both ELF and BSD.) So the instruction then _adds_ the
beginning of the section, to get the real address of the GOT, and
subtracts the value of `.get_GOT' which it knows is in `EBX'.
Therefore, by the time that instruction has finished, `EBX' contains
the address of the GOT.

   If you didn't follow that, don't worry: it's never necessary to
obtain the address of the GOT by any other means, so you can put those
three instructions into a macro and safely ignore them:

     %macro get_GOT 0
               call %%getgot
     %%getgot: pop ebx
               add ebx,_GLOBAL_OFFSET_TABLE_+$$-%%getgot wrt ..gotpc
     %endmacro
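
   With that macro defined, the function shown earlier in this section
could be written as the following sketch (assuming ELF, with
`_GLOBAL_OFFSET_TABLE_' already declared `extern'):

     func:     push ebp
               mov ebp,esp
               push ebx
               get_GOT
               ; the function body comes here
               mov ebx,[ebp-4]
               mov esp,ebp
               pop ebp
               ret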


File: nasm.info,  Node: Section 8.2.2,  Next: Section 8.2.3,  Prev: Section 8.2.1,  Up: Section 8.2

8.2.2. Finding Your Local Data Items
************************************

   Having got the GOT, you can then use it to obtain the addresses of
your data items. Most variables will reside in the sections you have
declared; they can be accessed using the `..gotoff' special `WRT' type.
It works like this:

               lea eax,[ebx+myvar wrt ..gotoff]

   The expression `myvar wrt ..gotoff' is calculated, when the shared
library is linked, to be the offset to the local variable `myvar' from
the beginning of the GOT. Therefore, adding it to `EBX' as above will
place the real address of `myvar' in `EAX'.

   If you declare variables as `GLOBAL' without specifying a size for
them, they are shared between code modules in the library, but do not
get exported from the library to the program that loaded it. They will
still be in your ordinary data and BSS sections, so you can access them
in the same way as local variables, using the above `..gotoff'
mechanism.

   Note that due to a peculiarity of the way BSD `a.out' format handles
this relocation type, there must be at least one non-local symbol in the
same section as the address you're trying to access.
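
   The offset can also be used directly in a memory operand, without
computing the address first. A minimal sketch, assuming `EBX' already
holds the address of the GOT and `myvar' is a doubleword in your data
section:

               mov eax,[ebx+myvar wrt ..gotoff]  ; read myvar
               inc eax
               mov [ebx+myvar wrt ..gotoff],eax  ; write it back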


File: nasm.info,  Node: Section 8.2.3,  Next: Section 8.2.4,  Prev: Section 8.2.2,  Up: Section 8.2

8.2.3. Finding External and Common Data Items
*********************************************

   If your library needs to get at an external variable (external to the
_library_, not just to one of the modules within it), you must use the
`..got' type to get at it. The `..got' type, instead of giving you the
offset from the GOT base to the variable, gives you the offset from the
GOT base to a GOT _entry_ containing the address of the variable.  The
linker will set up this GOT entry when it builds the library, and the
dynamic linker will place the correct address in it at load time. So to
obtain the address of an external variable `extvar' in `EAX', you would
code

               mov eax,[ebx+extvar wrt ..got]

   This loads the address of `extvar' out of an entry in the GOT. The
linker, when it builds the shared library, collects together every
relocation of type `..got', and builds the GOT so as to ensure it has
every necessary entry present.

   Common variables must also be accessed in this way.
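
   Note that this yields only the address; reading the variable itself
takes a second load. A minimal sketch, assuming `EBX' holds the address
of the GOT and `extvar' is a doubleword:

               extern extvar
               mov eax,[ebx+extvar wrt ..got]  ; EAX = address of extvar
               mov eax,[eax]                   ; EAX = value of extvar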


File: nasm.info,  Node: Section 8.2.4,  Next: Section 8.2.5,  Prev: Section 8.2.3,  Up: Section 8.2

8.2.4. Exporting Symbols to the Library User
********************************************

   If you want to export symbols to the user of the library, you have to
declare whether they are functions or data, and if they are data, you
have to give the size of the data item. This is because the dynamic
linker has to build procedure linkage table entries for any exported
functions, and also moves exported data items away from the library's
data section in which they were declared.

   So to export a function to users of the library, you must use

               global func:function   ; declare it as a function
     func:     push ebp
               ; etc.

   And to export a data item such as an array, you would have to code

               global array:data array.end-array ; give the size too
     array:    resd 128
     .end:

   Be careful: If you export a variable to the library user, by
declaring it as `GLOBAL' and supplying a size, the variable will end up
living in the data section of the main program, rather than in your
library's data section, where you declared it. So you will have to
access your own global variable with the `..got' mechanism rather than
`..gotoff', as if it were external (which, effectively, it has become).
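
   For example, a library that exports a counter and also updates it
itself might do so as in this sketch (the name `counter' is
hypothetical, and `EBX' is assumed to hold the address of the GOT):

               global counter:data 4  ; exported, so give the size
     counter:  dd 0                   ; declared in the data section
               ; and then, in one of the library's functions...
               mov eax,[ebx+counter wrt ..got]  ; its real address
               inc dword [eax]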

   Equally, if you need to store the address of an exported global in
one of your data sections, you can't do it by means of the standard
sort of code:

     dataptr:  dd global_data_item    ; WRONG

   NASM will interpret this code as an ordinary relocation, in which
`global_data_item' is merely an offset from the beginning of the
`.data' section (or whatever); so this reference will end up pointing
at your data section instead of at the exported global which resides
elsewhere.

   Instead of the above code, then, you must write

     dataptr:  dd global_data_item wrt ..sym

   which makes use of the special `WRT' type `..sym' to instruct NASM
to search the symbol table for a particular symbol at that address,
rather than just relocating by section base.

   Either method will work for functions: referring to one of your
functions by means of

     funcptr:  dd my_function

   will give the user the address of the code you wrote, whereas

     funcptr:  dd my_function wrt ..sym

   will give the address of the procedure linkage table for the
function, which is where the calling program will _believe_ the
function lives.  Either address is a valid way to call the function.


File: nasm.info,  Node: Section 8.2.5,  Next: Section 8.2.6,  Prev: Section 8.2.4,  Up: Section 8.2

8.2.5. Calling Procedures Outside the Library
*********************************************

   Calling procedures outside your shared library has to be done by
means of a _procedure linkage table_, or PLT. The PLT is placed at a
known offset from where the library is loaded, so the library code can
make calls to the PLT in a position-independent way. Within the PLT
there is code to jump to offsets contained in the GOT, so function
calls to other shared libraries or to routines in the main program can
be transparently passed off to their real destinations.

   To call an external routine, you must use another special PIC
relocation type, `WRT ..plt'. This is much easier than the GOT-based
ones: you simply replace calls such as `CALL printf' with the
PLT-relative version `CALL printf WRT ..plt'.
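
   A fuller sketch, assuming ELF and that `EBX' already holds the
address of the GOT (the label `msg' is hypothetical):

               extern printf
               section .data
     msg       db 'hello from the library',10,0
               section .text
               ; and then, inside a function...
               lea eax,[ebx+msg wrt ..gotoff]
               push eax               ; pointer to the format string
               call printf wrt ..plt
               add esp,byte 4         ; caller removes the parameter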


File: nasm.info,  Node: Section 8.2.6,  Next: Chapter 9,  Prev: Section 8.2.5,  Up: Section 8.2

8.2.6. Generating the Library File
**********************************

   Having written some code modules and assembled them to `.o' files,
you then generate your shared library with a command such as

     ld -shared -o library.so module1.o module2.o       # for ELF
     ld -Bshareable -o library.so module1.o module2.o   # for BSD

   For ELF, if your shared library is going to reside in system
directories such as `/usr/lib' or `/lib', it is usually worth using the
`-soname' flag to the linker, to store the final library file name,
with a version number, into the library:

     ld -shared -soname library.so.1 -o library.so.1.2 *.o

   You would then copy `library.so.1.2' into the library directory, and
create `library.so.1' as a symbolic link to it.


File: nasm.info,  Node: Chapter 9,  Next: Section 9.1,  Prev: Section 8.2.6,  Up: Top

Chapter 9: Mixing 16 and 32 Bit Code
************************************

   This chapter tries to cover some of the issues, largely related to
unusual forms of addressing and jump instructions, encountered when
writing operating system code such as protected-mode initialisation
routines, which require code that operates in mixed segment sizes, such
as code in a 16-bit segment trying to modify data in a 32-bit one, or
jumps between different-size segments.

* Menu:

* Section 9.1:: Mixed-Size Jumps
* Section 9.2:: Addressing Between Different-Size Segments
* Section 9.3:: Other Mixed-Size Instructions


File: nasm.info,  Node: Section 9.1,  Next: Section 9.2,  Prev: Chapter 9,  Up: Chapter 9

9.1. Mixed-Size Jumps
*********************

   The most common form of mixed-size instruction is the one used when
writing a 32-bit OS: having done your setup in 16-bit mode, such as
loading the kernel, you then have to boot it by switching into
protected mode and jumping to the 32-bit kernel start address. In a
fully 32-bit OS, this tends to be the _only_ mixed-size instruction you
need, since everything before it can be done in pure 16-bit code, and
everything after it can be pure 32-bit.

   This jump must specify a 48-bit far address, since the target
segment is a 32-bit one. However, it must be assembled in a 16-bit
segment, so just coding, for example,

               jmp 0x1234:0x56789ABC  ; wrong!

   will not work, since the offset part of the address will be
truncated to `0x9ABC' and the jump will be an ordinary 16-bit far one.

   The Linux kernel setup code gets round the inability of `as86' to
generate the required instruction by coding it manually, using `DB'
instructions. NASM can go one better than that, by actually generating
the right instruction itself. Here's how to do it right:

               jmp dword 0x1234:0x56789ABC  ; right

   The `DWORD' prefix (strictly speaking, it should come _after_ the
colon, since it is declaring the _offset_ field to be a doubleword; but
NASM will accept either form, since both are unambiguous) forces the
offset part to be treated as a 32-bit value, on the assumption that you
are deliberately writing a jump from a 16-bit segment to a 32-bit one.
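
   For comparison, the hand-coded equivalent of that instruction, in
the style forced on users of `as86', would be something like this
sketch:

               db 0x66                ; operand-size prefix
               db 0xEA                ; far JMP opcode
               dd 0x56789ABC          ; 32-bit offset
               dw 0x1234              ; 16-bit segment selector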

   You can do the reverse operation, jumping from a 32-bit segment to a
16-bit one, by means of the `WORD' prefix:

               jmp word 0x8765:0x4321 ; 32 to 16 bit

   If the `WORD' prefix is specified in 16-bit mode, or the `DWORD'
prefix in 32-bit mode, they will be ignored, since each is explicitly
forcing NASM into a mode it was in anyway.


File: nasm.info,  Node: Section 9.2,  Next: Section 9.3,  Prev: Section 9.1,  Up: Chapter 9

9.2. Addressing Between Different-Size Segments
***********************************************

   If your OS is mixed 16 and 32-bit, or if you are writing a DOS
extender, you are likely to have to deal with some 16-bit segments and
some 32-bit ones. At some point, you will probably end up writing code
in a 16-bit segment which has to access data in a 32-bit segment, or
vice versa.

   If the data you are trying to access in a 32-bit segment lies within
the first 64K of the segment, you may be able to get away with using an
ordinary 16-bit addressing operation for the purpose; but sooner or
later, you will want to do 32-bit addressing from 16-bit mode.

   The easiest way to do this is to make sure you use a register for the
address, since any effective address containing a 32-bit register is
forced to be a 32-bit address. So you can do

               mov eax,offset_into_32_bit_segment_specified_by_fs
               mov dword [fs:eax],0x11223344

   This is fine, but slightly cumbersome (since it wastes an
instruction and a register) if you already know the precise offset you
are aiming at. The x86 architecture does allow 32-bit effective
addresses to specify nothing but a 4-byte offset, so why shouldn't NASM
be able to generate the best instruction for the purpose?

   It can. As in *Note Section 9.1::, you need only prefix the address
with the `DWORD' keyword, and it will be forced to be a 32-bit address:

               mov dword [fs:dword my_offset],0x11223344

   Also as in *Note Section 9.1::, NASM is not fussy about whether the
`DWORD' prefix comes before or after the segment override, so arguably
a nicer-looking way to code the above instruction is

               mov dword [dword fs:my_offset],0x11223344

   Don't confuse the `DWORD' prefix _outside_ the square brackets,
which controls the size of the data stored at the address, with the one
_inside_ the square brackets which controls the length of the address
itself. The two can quite easily be different:

               mov word [dword 0x12345678],0x9ABC

   This moves 16 bits of data to an address specified by a 32-bit
offset.

   You can also specify `WORD' or `DWORD' prefixes along with the `FAR'
prefix to indirect far jumps or calls. For example:

               call dword far [fs:word 0x4321]

   This instruction contains an address specified by a 16-bit offset;
it loads a 48-bit far pointer from that (16-bit segment and 32-bit
offset), and calls that address.
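
   The far pointer itself is laid out in memory with the offset first.
A sketch of a matching data definition (the label `farptr' is
hypothetical, and is assumed to lie in the segment addressed by `FS'):

     farptr    dd 0x56789ABC          ; 32-bit offset
               dw 0x1234              ; 16-bit segment selector
               ; and then, in a 16-bit code segment...
               call dword far [fs:word farptr]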


File: nasm.info,  Node: Section 9.3,  Next: Chapter 10,  Prev: Section 9.2,  Up: Chapter 9

9.3. Other Mixed-Size Instructions
**********************************

   You might also want to access data using the string instructions
(`LODSx', `STOSx' and so on) or the `XLATB' instruction. Since these
instructions take no parameters, there might seem to be no easy way to
make them perform 32-bit addressing when assembled in a 16-bit segment.

   This is the purpose of NASM's `a16' and `a32' prefixes. If you are
coding `LODSB' in a 16-bit segment but it is supposed to be accessing a
string in a 32-bit segment, you should load the desired address into
`ESI' and then code

               a32 lodsb

   The prefix forces the addressing size to 32 bits, meaning that
`LODSB' loads from `[DS:ESI]' instead of `[DS:SI]'. To access a string
in a 16-bit segment when coding in a 32-bit one, the corresponding `a16'
prefix can be used.
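
   Put together, the two cases might look something like the sketch
below. The name `string_offset' and the constants are made up for
illustration only:

               BITS 16
       string_offset equ 0x00012345    ; hypothetical offset beyond 64K
               mov esi,string_offset   ; 32-bit offset into the DS segment
               cld                     ; clear DF so that ESI counts upwards
               a32 lodsb               ; AL = [DS:ESI], then ESI = ESI + 1

               BITS 32
               mov si,0x1234           ; hypothetical 16-bit offset
               a16 lodsb               ; AL = [DS:SI], then SI = SI + 1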

   The `a16' and `a32' prefixes can be applied to any instruction in
NASM's instruction table, but most instructions can generate all their
useful forms without them. They are necessary only for instructions
with implicit addressing: `CMPSx' (*Note Section A.19::), `SCASx'
(*Note Section A.149::), `LODSx' (*Note Section A.98::), `STOSx' (*Note
Section A.157::), `MOVSx' (*Note Section A.105::), `INSx' (*Note
Section A.80::), `OUTSx' (*Note Section A.112::), and `XLATB' (*Note
Section A.169::). Also, the various push and pop instructions (`PUSHA'
and `POPF' as well as the more usual `PUSH' and `POP') can accept `a16'
or `a32' prefixes to force a particular one of `SP' or `ESP' to be used
as a stack pointer, in case the stack segment in use is a different
size from the code segment.

   `PUSH' and `POP', when applied to segment registers in 32-bit mode,
also have the slightly odd behaviour that they push and pop 4 bytes at
a time, of which the top two are ignored and the bottom two give the
value of the segment register being manipulated. To force the 16-bit
behaviour of segment-register push and pop instructions, you can use the
operand-size prefix `o16':

               o16 push ss
               o16 push ds

   This code saves a doubleword of stack space by fitting two segment
registers into the space which would normally be consumed by pushing
one.

   (You can also use the `o32' prefix to force the 32-bit behaviour when
in 16-bit mode, but this seems less useful.)
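
   A save-and-restore pair using `o16' in 32-bit code might be
sketched as follows. The choice of `DS' and `ES' here is arbitrary; the
pops must use the same prefix as the pushes and come in reverse order:

               BITS 32
               o16 push ds             ; 2 bytes of stack
               o16 push es             ; 2 more: one doubleword in total
               ; code which changes DS and ES goes here
               o16 pop es              ; restore in reverse order
               o16 pop ds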


File: nasm.info,  Node: Chapter 10,  Next: Section 10.1,  Prev: Section 9.3,  Up: Top

Chapter 10: Troubleshooting
***************************

   This chapter describes some of the common problems that users have
been known to encounter with NASM, and answers them. It also gives
instructions for reporting bugs in NASM if you find a difficulty that
isn't listed here.

* Menu:

* Section 10.1:: Common Problems
* Section 10.2:: Bugs


File: nasm.info,  Node: Section 10.1,  Next: Section 10.1.1,  Prev: Chapter 10,  Up: Chapter 10

10.1. Common Problems
*********************

* Menu:

* Section 10.1.1:: NASM Generates Inefficient Code
* Section 10.1.2:: My Jumps are Out of Range
* Section 10.1.3:: `ORG' Doesn't Work
* Section 10.1.4:: `TIMES' Doesn't Work


File: nasm.info,  Node: Section 10.1.1,  Next: Section 10.1.2,  Prev: Section 10.1,  Up: Section 10.1

10.1.1. NASM Generates Inefficient Code
***************************************

   I get a lot of `bug' reports about NASM generating inefficient, or
even `wrong', code on instructions such as `ADD ESP,8'. This is a
deliberate design feature, connected to predictability of output: NASM,
on seeing `ADD ESP,8', will generate the form of the instruction which
leaves room for a 32-bit immediate operand. You need to code `ADD
ESP,BYTE 8' if you want the space-efficient form of the instruction.
This isn't a bug: at worst it's a misfeature, and that's a matter of
opinion only.
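
   For comparison, here are the two forms side by side in 32-bit
code. The byte values in the comments are the standard x86 encodings of
each form, under the behaviour this section describes:

               BITS 32
               add esp,8               ; long form:  81 C4 08 00 00 00 (6 bytes)
               add esp,byte 8          ; short form: 83 C4 08 (3 bytes)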


File: nasm.info,  Node: Section 10.1.2,  Next: Section 10.1.3,  Prev: Section 10.1.1,  Up: Section 10.1

10.1.2. My Jumps are Out of Range
*********************************

   Similarly, people complain that when they issue conditional jumps
(which are `SHORT' by default) that try to jump too far, NASM reports
`short jump out of range' instead of making the jumps longer.

   This, again, is partly a predictability issue, but in fact has a more
practical reason as well. NASM has no means of being told what type of
processor the code it is generating will be run on; so it cannot decide
for itself that it should generate `Jcc NEAR' type instructions, because
it doesn't know that it's working for a 386 or above. Alternatively, it
could replace the out-of-range short `JNE' instruction with a very
short `JE' instruction that jumps over a `JMP NEAR'; this is a sensible
solution for processors below a 386, but hardly efficient on processors
which have good branch prediction _and_ could have used `JNE NEAR'
instead. So, once again, it's up to the user, not the assembler, to
decide what instructions should be generated.
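
   The two alternatives can be sketched like this. The labels `skip'
and `target' are hypothetical, and the `TIMES' line merely stands in
for enough code to put `target' out of short-jump range:

               BITS 16
               cmp ax,bx
               jne near target         ; 386 or above: one long conditional jump

               cmp ax,bx
               je skip                 ; any x86: inverted short jump which
               jmp near target         ;   hops over an unconditional near jump
       skip:   times 200 nop           ; stand-in for a long stretch of code
       target: ret

   You would choose one form or the other according to the processors
you intend the code to run on.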


File: nasm.info,  Node: Section 10.1.3,  Next: Section 10.1.4,  Prev: Section 10.1.2,  Up: Section 10.1

10.1.3. `ORG' Doesn't Work
**************************

   People writing boot sector programs in the `bin' format often
complain that `ORG' doesn't work the way they'd like: in order to place
the `0xAA55' signature word at the end of a 512-byte boot sector, people
who are used to MASM tend to code

               ORG 0
               ; some boot sector code
               ORG 510
               DW 0xAA55

   This is not the intended use of the `ORG' directive in NASM, and will
not work. The correct way to solve this problem in NASM is to use the
`TIMES' directive, like this:

               ORG 0
               ; some boot sector code
               TIMES 510-($-$$) DB 0
               DW 0xAA55

   The `TIMES' directive will insert exactly enough zero bytes into the
output to move the assembly point up to 510. This method also has the
advantage that if you accidentally fill your boot sector too full, the
repeat count given to `TIMES' becomes negative, so NASM will catch the
problem at assembly time and report it, and you won't end up with a
boot sector that you have to disassemble to find out what's wrong with
it.
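
   As a minimal complete sketch, the following source should assemble
in the `bin' format to exactly 512 bytes. The three instructions are
only placeholder code, and the labels `start' and `hang' are
hypothetical:

               BITS 16
               ORG 0
       start:  cli                     ; placeholder code: disable interrupts
       hang:   hlt                     ;   and halt forever
               jmp short hang
               TIMES 510-($-$$) DB 0   ; 4 bytes so far, so 506 zero bytes
               DW 0xAA55               ; bytes 510 and 511: the signature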


File: nasm.info,  Node: Section 10.1.4,  Next: Section 10.2,  Prev: Section 10.1.3,  Up: Section 10.1

10.1.4. `TIMES' Doesn't Work
****************************

   Another common mistake with the above code is to write the `TIMES'
line as

               TIMES 510-$ DB 0

   by reasoning that `$' should be a pure number, just like 510, so the
difference between them is also a pure number and can happily be fed to
`TIMES'.

   NASM is a _modular_ assembler: the various component parts are
designed to be easily separable for re-use, so they don't exchange
information unnecessarily. In consequence, the `bin' output format,
even though it has been told by the `ORG' directive that the `.text'
section should start at 0, does not pass that information back to the
expression evaluator. So from the evaluator's point of view, `$' isn't
a pure number: it's an offset from a section base. Therefore the
difference between `$' and 510 is also not a pure number, but involves
a section base. Values involving section bases cannot be passed as
arguments to `TIMES'.

   The solution, as in the previous section, is to code the `TIMES' line
in the form

               TIMES 510-($-$$) DB 0

   in which `$' and `$$' are offsets from the same section base, and so
their difference is a pure number. This will solve the problem and
generate sensible code.
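
   Because `$-$$' only measures the distance from the start of the
section, the same line keeps working whatever value is given to `ORG'.
A sketch, assuming a hypothetical origin of `0x7C00':

               ORG 0x7C00              ; hypothetical non-zero origin
               ; some boot sector code
               TIMES 510-($-$$) DB 0   ; $-$$ is still the byte count so far
               DW 0xAA55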


File: nasm.info,  Node: Section 10.2,  Next: Appendix A,  Prev: Section 10.1.4,  Up: Chapter 10

10.2. Bugs
**********

   We have never yet released a version of NASM with any _known_ bugs.
That doesn't usually stop there being plenty we didn't know about,
though.  Any that you find should be reported to `hpa@zytor.com'.

   Please read *Note Section 2.2:: first, and don't report the bug if
it's listed in there as a deliberate feature. (If you think the feature
is badly thought out, feel free to send us reasons why you think it
should be changed, but don't just send us mail saying `This is a bug'
if the documentation says we did it on purpose.) Then read *Note
Section 10.1::, and don't bother reporting the bug if it's listed there.

   If you do report a bug, _please_ give us all of the following
information:

   * What operating system you're running NASM under. DOS, Linux,
     NetBSD, Win16, Win32, VMS (I'd be impressed), whatever.

   * If you're running NASM under DOS or Win32, tell us whether you've
     compiled your own executable from the DOS source archive, or
     whether you were using the standard distribution binaries out of
     the archive. If you were using a locally built executable, try to
     reproduce the problem using one of the standard binaries, as this
     will make it easier for us to reproduce your problem prior to
     fixing it.

   * Which version of NASM you're using, and exactly how you invoked
     it. Give us the precise command line, and the contents of the
     `NASM' environment variable if any.

   * Which versions of any supplementary programs you're using, and how
     you invoked them. If the problem only becomes visible at link
     time, tell us what linker you're using, what version of it you've
     got, and the exact linker command line. If the problem involves
     linking against object files generated by a compiler, tell us what
     compiler, what version, and what command line or options you used.
     (If you're compiling in an IDE, please try to reproduce the
     problem with the command-line version of the compiler.)

   * If at all possible, send us a NASM source file which exhibits the
     problem.  If this causes copyright problems (e.g. you can only
     reproduce the bug in restricted-distribution code) then bear in
     mind the following two points: firstly, we guarantee that any
     source code sent to us for the purposes of debugging NASM will be
     used _only_ for the purposes of debugging NASM, and that we will
     delete all our copies of it as soon as we have found and fixed the
     bug or bugs in question; and secondly, we would prefer _not_ to be
     mailed large chunks of code anyway. The smaller the file, the
     better.  A three-line sample file that does nothing useful
     _except_ demonstrate the problem is much easier to work with than
     a fully fledged ten-thousand-line program. (Of course, some
     errors _do_ only crop up in large files, so this may not be
     possible.)

   * A description of what the problem actually _is_. `It doesn't work'
     is _not_ a helpful description! Please describe exactly what is
     happening that shouldn't be, or what isn't happening that should.
     Examples might be: `NASM generates an error message saying Line 3
     for an error that's actually on Line 5'; `NASM generates an error
     message that I believe it shouldn't be generating at all'; `NASM
     fails to generate an error message that I believe it _should_ be
     generating'; `the object file produced from this source code
     crashes my linker'; `the ninth byte of the output file is 66 and I
     think it should be 77 instead'.

   * If you believe the output file from NASM to be faulty, send it to
     us. That allows us to determine whether our own copy of NASM
     generates the same file, or whether the problem is related to
     portability issues between our development platforms and yours. We
     can handle binary files mailed to us as MIME attachments,
     uuencoded, and even BinHex. Alternatively, we may be able to
     provide an FTP site you can upload the suspect files to; but
     mailing them is easier for us.

   * Any other information or data files that might be helpful. If, for
     example, the problem involves NASM failing to generate an object
     file while TASM can generate an equivalent file without trouble,
     then send us _both_ object files, so we can see what TASM is doing
     differently from us.