This is nasm.info, produced by Makeinfo version 3.12f from nasmdoc.texi.

INFO-DIR-SECTION Programming
START-INFO-DIR-ENTRY
* NASM: (nasm).                The Netwide Assembler for x86.
END-INFO-DIR-ENTRY


   This file documents NASM, the Netwide Assembler: an assembler
targeting the Intel x86 series of processors, with portable source.

   Copyright 1997 Simon Tatham

   All rights reserved. This document is redistributable under the
licence given in the file "Licence" distributed in the NASM archive.


File: nasm.info,  Node: Top,  Next: Chapter 1,  Prev: (dir),  Up: (dir)



   This file documents NASM, the Netwide Assembler: an assembler
targeting the Intel x86 series of processors, with portable source.

* Menu:

* Chapter 1:: Introduction
* Chapter 2:: Running NASM
* Chapter 3:: The NASM Language
* Chapter 4:: The NASM Preprocessor
* Chapter 5:: Assembler Directives
* Chapter 6:: Output Formats
* Chapter 7:: Writing 16-bit Code (DOS, Windows 3/3.1)
* Chapter 8:: Writing 32-bit Code (Unix, Win32, DJGPP)
* Chapter 9:: Mixing 16 and 32 Bit Code
* Chapter 10:: Troubleshooting
* Appendix A:: Intel x86 Instruction Reference
* Index::


File: nasm.info,  Node: Chapter 1,  Next: Section 1.1,  Prev: Top,  Up: Top

Chapter 1: Introduction
***********************

* Menu:

* Section 1.1:: What Is NASM?
* Section 1.2:: Contact Information
* Section 1.3:: Installation


File: nasm.info,  Node: Section 1.1,  Next: Section 1.1.1,  Prev: Chapter 1,  Up: Chapter 1

1.1. What Is NASM?
******************

   The Netwide Assembler, NASM, is an 80x86 assembler designed for
portability and modularity. It supports a range of object file formats,
including Linux `a.out' and ELF, NetBSD/FreeBSD, COFF, Microsoft 16-bit
OBJ and Win32.  It will also output plain binary files. Its syntax is
designed to be simple and easy to understand, similar to Intel's but
less complex. It supports Pentium, P6 and MMX opcodes, and has macro
capability.

* Menu:

* Section 1.1.1:: Why Yet Another Assembler?
* Section 1.1.2:: Licence Conditions


File: nasm.info,  Node: Section 1.1.1,  Next: Section 1.1.2,  Prev: Section 1.1,  Up: Section 1.1

1.1.1. Why Yet Another Assembler?
*********************************

   The Netwide Assembler grew out of an idea on `comp.lang.asm.x86' (or
possibly `alt.lang.asm' - I forget which), which was essentially that
there didn't seem to be a good free x86-series assembler around, and
that maybe someone ought to write one.

   * `a86' is good, but not free, and in particular you don't get any
     32-bit capability until you pay. It's DOS-only, too.

   * `gas' is free, and has been ported to DOS and Unix, but it's not
     very good, since it's designed to be a back end to `gcc', which
     always feeds it correct code. So its error checking is minimal.
     Also, its syntax is horrible, from the point of view of anyone
     trying to actually _write_ anything in it. Plus you can't write
     16-bit code in it (properly).

   * `as86' is Linux-specific, and (my version at least) doesn't seem to
     have much (or any) documentation.

   * MASM isn't very good, and it's expensive, and it runs only under
     DOS.

   * TASM is better, but still strives for MASM compatibility, which
     means millions of directives and tons of red tape. And its syntax
     is essentially MASM's, with the contradictions and quirks that
     entails (although it sorts out some of those by means of Ideal
     mode). It's expensive too. And it's DOS-only.

   So here, for your coding pleasure, is NASM. At present it's still in
prototype stage - we don't promise that it can outperform any of these
assemblers. But please, _please_ send us bug reports, fixes, helpful
information, and anything else you can get your hands on (and thanks to
the many people who've done this already! You all know who you are),
and we'll improve it out of all recognition. Again.


File: nasm.info,  Node: Section 1.1.2,  Next: Section 1.2,  Prev: Section 1.1.1,  Up: Section 1.1

1.1.2. Licence Conditions
*************************

   Please see the file `Licence', supplied as part of any NASM
distribution archive, for the licence conditions under which you may use
NASM.


File: nasm.info,  Node: Section 1.2,  Next: Section 1.3,  Prev: Section 1.1.2,  Up: Chapter 1

1.2. Contact Information
************************

   The current versions of NASM (since 0.98) are maintained by H. Peter
Anvin, `hpa@zytor.com'. If you want to report a bug, please read *Note
Section 10.2:: first.

   NASM has a WWW page at `http://www.cryogen.com/Nasm'.

   The original authors are e-mailable as `jules@earthcorp.com' and
`anakin@pobox.com'.

   New releases of NASM are uploaded to `ftp.kernel.org',
`sunsite.unc.edu', `ftp.simtel.net' and `ftp.coast.net'.  Announcements
are posted to `comp.lang.asm.x86', `alt.lang.asm',
`comp.os.linux.announce' and `comp.archives.msdos.announce' (the last
one is done automagically by uploading to `ftp.simtel.net').

   If you don't have Usenet access, or would rather be informed by
e-mail when new releases come out, you can subscribe to the
`nasm-announce' email list by sending an email containing the line
`subscribe nasm-announce' to `majordomo@linux.kernel.org'.

   If you want information about NASM beta releases, please subscribe
to the `nasm-beta' email list by sending an email containing the line
`subscribe nasm-beta' to `majordomo@linux.kernel.org'.


File: nasm.info,  Node: Section 1.3,  Next: Section 1.3.1,  Prev: Section 1.2,  Up: Chapter 1

1.3. Installation
*****************

* Menu:

* Section 1.3.1:: Installing NASM under MS-DOS or Windows
* Section 1.3.2:: Installing NASM under Unix


File: nasm.info,  Node: Section 1.3.1,  Next: Section 1.3.2,  Prev: Section 1.3,  Up: Section 1.3

1.3.1. Installing NASM under MS-DOS or Windows
**********************************************

   Once you've obtained the DOS archive for NASM, `nasmXXX.zip' (where
`XXX' denotes the version number of NASM contained in the archive),
unpack it into its own directory (for example `c:\nasm').

   The archive will contain four executable files: the NASM executable
files `nasm.exe' and `nasmw.exe', and the NDISASM executable files
`ndisasm.exe' and `ndisasmw.exe'. In each case, the file whose name
ends in `w' is a Win32 executable, designed to run under Windows 95 or
Windows NT Intel, and the other one is a 16-bit DOS executable.

   The only file NASM needs to run is its own executable, so copy (at
least) one of `nasm.exe' and `nasmw.exe' to a directory on your
`PATH', or alternatively edit `autoexec.bat' to add the `nasm'
directory to your `PATH'. (If you're only installing the Win32 version,
you may wish to rename it to `nasm.exe'.)

   That's it - NASM is installed. You don't need the `nasm' directory to
be present to run NASM (unless you've added it to your `PATH'), so you
can delete it if you need to save space; however, you may want to keep
the documentation or test programs.

   If you've downloaded the DOS source archive, `nasmXXXs.zip', the
`nasm' directory will also contain the full NASM source code, and a
selection of Makefiles you can (hopefully) use to rebuild your copy of
NASM from scratch. The file `Readme' lists the various Makefiles and
which compilers they work with.

   Note that the source files `insnsa.c', `insnsd.c', `insnsi.h' and
`insnsn.c' are automatically generated from the master instruction
table `insns.dat' by a Perl script; the file `macros.c' is generated
from `standard.mac' by another Perl script. Although the NASM 0.98
distribution includes these generated files, you will need to rebuild
them (and hence, will need a Perl interpreter) if you change
`insns.dat', `standard.mac' or the documentation. It is possible future
source distributions may not include these files at all.  Ports of Perl
for a variety of platforms, including DOS and Windows, are available
from www.cpan.org.


File: nasm.info,  Node: Section 1.3.2,  Next: Chapter 2,  Prev: Section 1.3.1,  Up: Section 1.3

1.3.2. Installing NASM under Unix
*********************************

   Once you've obtained the Unix source archive for NASM,
`nasm-X.XX.tar.gz' (where `X.XX' denotes the version number of NASM
contained in the archive), unpack it into a directory such as
`/usr/local/src'. The archive, when unpacked, will create its own
subdirectory `nasm-X.XX'.

   NASM is an auto-configuring package: once you've unpacked it, `cd' to
the directory it's been unpacked into and type `./configure'. This
shell script will find the best C compiler to use for building NASM and
set up Makefiles accordingly.

   Once NASM has auto-configured, you can type `make' to build the
`nasm' and `ndisasm' binaries, and then `make install' to install them
in `/usr/local/bin' and install the man pages `nasm.1' and `ndisasm.1'
in `/usr/local/man/man1'.  Alternatively, you can give options such as
`--prefix' to the `configure' script (see the file `INSTALL' for more
details), or install the programs yourself.
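
   Put together, and assuming the archive was unpacked under
`/usr/local/src', a typical build might look like the following sketch
(`X.XX' stands for the version number, as above):

     cd /usr/local/src/nasm-X.XX
     ./configure
     make
     make install

   The last step usually needs write permission to `/usr/local'.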

   NASM also comes with a set of utilities for handling the RDOFF custom
object-file format, which are in the `rdoff' subdirectory of the NASM
archive. You can build these with `make rdf' and install them with
`make rdf_install', if you want them.

   If NASM fails to auto-configure, you may still be able to make it
compile by using the fall-back Unix makefile `Makefile.unx'. Copy or
rename that file to `Makefile' and try typing `make'. There is also a
`Makefile.unx' file in the `rdoff' subdirectory.


File: nasm.info,  Node: Chapter 2,  Next: Section 2.1,  Prev: Section 1.3.2,  Up: Top

Chapter 2: Running NASM
***********************

* Menu:

* Section 2.1:: NASM Command-Line Syntax
* Section 2.2:: Quick Start for MASM Users


File: nasm.info,  Node: Section 2.1,  Next: Section 2.1.1,  Prev: Chapter 2,  Up: Chapter 2

2.1. NASM Command-Line Syntax
*****************************

   To assemble a file, you issue a command of the form

     nasm -f <format> <filename> [-o <output>]

   For example,

     nasm -f elf myfile.asm

   will assemble `myfile.asm' into an ELF object file `myfile.o'.  And

     nasm -f bin myfile.asm -o myfile.com

   will assemble `myfile.asm' into a raw binary file `myfile.com'.

   To produce a listing file, with the hex codes output from NASM
displayed on the left of the original sources, use the `-l' option to
give a listing file name, for example:

     nasm -f coff myfile.asm -l myfile.lst

   To get further usage instructions from NASM, try typing

     nasm -h

   This will also list the available output file formats, and what they
are.

   If you use Linux but aren't sure whether your system is `a.out' or
ELF, type

     file nasm

   (in the directory in which you put the NASM binary when you
installed it).  If it says something like

     nasm: ELF 32-bit LSB executable i386 (386 and up) Version 1

   then your system is ELF, and you should use the option `-f elf' when
you want NASM to produce Linux object files. If it says

     nasm: Linux/i386 demand-paged executable (QMAGIC)

   or something similar, your system is `a.out', and you should use `-f
aout' instead. (Linux `a.out' systems are considered obsolete, and are
rare these days.)

   Like Unix compilers and assemblers, NASM is silent unless something
goes wrong: you won't see any output at all, unless it gives error
messages.

* Menu:

* Section 2.1.1:: The `-o' Option: Specifying the Output File Name
* Section 2.1.2:: The `-f' Option: Specifying the Output File Format
* Section 2.1.3:: The `-l' Option: Generating a Listing File
* Section 2.1.4:: The `-E' Option: Send Errors to a File
* Section 2.1.5:: The `-s' Option: Send Errors to `stdout'
* Section 2.1.6:: The `-i' Option: Include File Search Directories
* Section 2.1.7:: The `-p' Option: Pre-Include a File
* Section 2.1.8:: The `-d' Option:  Pre-Define a Macro
* Section 2.1.9:: The `-u' Option:  Undefine a Macro
* Section 2.1.10:: The `-e' Option: Preprocess Only
* Section 2.1.11:: The `-a' Option: Don't Preprocess At All
* Section 2.1.12:: The `-w' Option: Enable or Disable Assembly Warnings
* Section 2.1.13:: The `NASM' Environment Variable


File: nasm.info,  Node: Section 2.1.1,  Next: Section 2.1.2,  Prev: Section 2.1,  Up: Section 2.1

2.1.1. The `-o' Option: Specifying the Output File Name
*******************************************************

   NASM will normally choose the name of your output file for you;
precisely how it does this is dependent on the object file format. For
Microsoft object file formats (`obj' and `win32'), it will remove the
`.asm' extension (or whatever extension you like to use - NASM doesn't
care) from your source file name and substitute `.obj'. For Unix object
file formats (`aout', `coff', `elf' and `as86') it will substitute
`.o'. For `rdf', it will use `.rdf', and for the `bin' format it will
simply remove the extension, so that `myfile.asm' produces the output
file `myfile'.

   If the output file already exists, NASM will overwrite it, unless it
has the same name as the input file, in which case it will give a
warning and use `nasm.out' as the output file name instead.

   For situations in which this behaviour is unacceptable, NASM
provides the `-o' command-line option, which allows you to specify your
desired output file name. You invoke `-o' by following it with the name
you wish for the output file, either with or without an intervening
space. For example:

     nasm -f bin program.asm -o program.com
     nasm -f bin driver.asm -odriver.sys


File: nasm.info,  Node: Section 2.1.2,  Next: Section 2.1.3,  Prev: Section 2.1.1,  Up: Section 2.1

2.1.2. The `-f' Option: Specifying the Output File Format
*********************************************************

   If you do not supply the `-f' option to NASM, it will choose an
output file format for you itself. In the distribution versions of
NASM, the default is always `bin'; if you've compiled your own copy of
NASM, you can redefine `OF_DEFAULT' at compile time and choose what you
want the default to be.

   Like `-o', the intervening space between `-f' and the output file
format is optional; so `-f elf' and `-felf' are both valid.

   A complete list of the available output file formats can be given by
issuing the command `nasm -h'.


File: nasm.info,  Node: Section 2.1.3,  Next: Section 2.1.4,  Prev: Section 2.1.2,  Up: Section 2.1

2.1.3. The `-l' Option: Generating a Listing File
*************************************************

   If you supply the `-l' option to NASM, followed (with the usual
optional space) by a file name, NASM will generate a source-listing file
for you, in which addresses and generated code are listed on the left,
and the actual source code, with expansions of multi-line macros
(except those which specifically request no expansion in source
listings: see *Note Section 4.2.9::) on the right. For example:

     nasm -f elf myfile.asm -l myfile.lst


File: nasm.info,  Node: Section 2.1.4,  Next: Section 2.1.5,  Prev: Section 2.1.3,  Up: Section 2.1

2.1.4. The `-E' Option: Send Errors to a File
*********************************************

   Under MS-DOS it can be difficult (though there are ways) to redirect
the standard-error output of a program to a file. Since NASM usually
produces its warning and error messages on `stderr', this can make it
hard to capture the errors if (for example) you want to load them into
an editor.

   NASM therefore provides the `-E' option, taking a filename argument
which causes errors to be sent to the specified file rather than
standard error. Therefore you can redirect the errors into a file by
typing

     nasm -E myfile.err -f obj myfile.asm


File: nasm.info,  Node: Section 2.1.5,  Next: Section 2.1.6,  Prev: Section 2.1.4,  Up: Section 2.1

2.1.5. The `-s' Option: Send Errors to `stdout'
***********************************************

   The `-s' option redirects error messages to `stdout' rather than
`stderr', so that they can be redirected under MS-DOS. To assemble the
file `myfile.asm' and pipe its output to the `more' program, you can
type:

     nasm -s -f obj myfile.asm | more

   See also the `-E' option, *Note Section 2.1.4::.


File: nasm.info,  Node: Section 2.1.6,  Next: Section 2.1.7,  Prev: Section 2.1.5,  Up: Section 2.1

2.1.6. The `-i' Option: Include File Search Directories
*******************************************************

   When NASM sees the `%include' directive in a source file (see *Note
Section 4.5::), it will search for the given file not only in the
current directory, but also in any directories specified on the command
line by the use of the `-i' option. Therefore you can include files
from a macro library, for example, by typing

     nasm -ic:\macrolib\ -f obj myfile.asm

   (As usual, a space between `-i' and the path name is allowed, and
optional).

   NASM, in the interests of complete source-code portability, does not
understand the file naming conventions of the OS it is running on; the
string you provide as an argument to the `-i' option will be prepended
exactly as written to the name of the include file. Therefore the
trailing backslash in the above example is necessary. Under Unix, a
trailing forward slash is similarly necessary.

   (You can use this to your advantage, if you're really perverse, by
noting that the option `-ifoo' will cause `%include "bar.i"' to search
for the file `foobar.i'...)

   If you want to define a _standard_ include search path, similar to
`/usr/include' on Unix systems, you should place one or more `-i'
directives in the `NASM' environment variable (see *Note Section
2.1.13::).

   For Makefile compatibility with many C compilers, this option can
also be specified as `-I'.


File: nasm.info,  Node: Section 2.1.7,  Next: Section 2.1.8,  Prev: Section 2.1.6,  Up: Section 2.1

2.1.7. The `-p' Option: Pre-Include a File
******************************************

   NASM allows you to specify files to be _pre-included_ into your
source file, by the use of the `-p' option. So running

     nasm myfile.asm -p myinc.inc

   is equivalent to running `nasm myfile.asm' and placing the directive
`%include "myinc.inc"' at the start of the file.

   For consistency with the `-I', `-D' and `-U' options, this option
can also be specified as `-P'.


File: nasm.info,  Node: Section 2.1.8,  Next: Section 2.1.9,  Prev: Section 2.1.7,  Up: Section 2.1

2.1.8. The `-d' Option:  Pre-Define a Macro
*******************************************

   Just as the `-p' option gives an alternative to placing `%include'
directives at the start of a source file, the `-d' option gives an
alternative to placing a `%define' directive. You could code

     nasm myfile.asm -dFOO=100

   as an alternative to placing the directive

     %define FOO 100

   at the start of the file. You can also omit the macro value:
the option `-dFOO' is equivalent to coding `%define FOO'. This form of
the directive may be useful for selecting assembly-time options which
are then tested using `%ifdef', for example `-dDEBUG'.

   For Makefile compatibility with many C compilers, this option can
also be specified as `-D'.
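
   As a sketch of how such an option might be tested, the fragment
below assembles an extra instruction only when `DEBUG' is defined (the
label `start' and the choice of instructions are illustrative only):

     start:
     %ifdef DEBUG
               int 3                   ; breakpoint, only with -dDEBUG
     %endif
               ret

   Assembling this with `-dDEBUG' on the command line includes the
`int 3'; without it, only the `ret' is assembled.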


File: nasm.info,  Node: Section 2.1.9,  Next: Section 2.1.10,  Prev: Section 2.1.8,  Up: Section 2.1

2.1.9. The `-u' Option:  Undefine a Macro
*****************************************

   The `-u' option undefines a macro that would otherwise have been
pre-defined, either automatically or by a `-p' or `-d' option specified
earlier on the command line.

   For example, the following command line:

     nasm myfile.asm -dFOO=100 -uFOO

   would result in `FOO' _not_ being a predefined macro in the program.
This is useful to override options specified at a different point in a
Makefile.

   For Makefile compatibility with many C compilers, this option can
also be specified as `-U'.


File: nasm.info,  Node: Section 2.1.10,  Next: Section 2.1.11,  Prev: Section 2.1.9,  Up: Section 2.1

2.1.10. The `-e' Option: Preprocess Only
****************************************

   NASM allows the preprocessor to be run on its own, up to a point.
Using the `-e' option (which requires no arguments) will cause NASM to
preprocess its input file, expand all the macro references, remove all
the comments and preprocessor directives, and print the resulting file
on standard output (or save it to a file, if the `-o' option is also
used).

   This option cannot be applied to programs which require the
preprocessor to evaluate expressions which depend on the values of
symbols: so code such as

     %assign tablesize ($-tablestart)

   will cause an error in preprocess-only mode.


File: nasm.info,  Node: Section 2.1.11,  Next: Section 2.1.12,  Prev: Section 2.1.10,  Up: Section 2.1

2.1.11. The `-a' Option: Don't Preprocess At All
************************************************

   If NASM is being used as the back end to a compiler, it might be
desirable to suppress preprocessing completely and assume the compiler
has already done it, to save time and increase compilation speeds. The
`-a' option, requiring no argument, instructs NASM to replace its
powerful preprocessor with a stub preprocessor which does nothing.


File: nasm.info,  Node: Section 2.1.12,  Next: Section 2.1.13,  Prev: Section 2.1.11,  Up: Section 2.1

2.1.12. The `-w' Option: Enable or Disable Assembly Warnings
************************************************************

   NASM can observe many conditions during the course of assembly which
are worth mentioning to the user, but which are not sufficiently severe
errors to justify NASM refusing to generate an output file. These
conditions are reported like errors, but with the word `warning' before
the message. Warnings do not prevent NASM from generating an output
file and returning a success status to the operating system.

   Some conditions are even less severe than that: they are only
sometimes worth mentioning to the user. Therefore NASM supports the `-w'
command-line option, which enables or disables certain classes of
assembly warning. Such warning classes are described by a name, for
example `orphan-labels'; you can enable warnings of this class by the
command-line option `-w+orphan-labels' and disable them by
`-w-orphan-labels'.

   The suppressible warning classes are:

   * `macro-params' covers warnings about multi-line macros being
     invoked with the wrong number of parameters. This warning class is
     enabled by default; see *Note Section 4.2.1:: for an example of
     why you might want to disable it.

   * `orphan-labels' covers warnings about source lines which contain no
     instruction but define a label without a trailing colon. NASM does
     not warn about this somewhat obscure condition by default; see
     *Note Section 3.1:: for an example of why you might want it to.

   * `number-overflow' covers warnings about numeric constants which
     don't fit in 32 bits (for example, it's easy to type one too many
     Fs and produce `0x7ffffffff' by mistake). This warning class is
     enabled by default.
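
   For example, to turn on the `orphan-labels' warnings and turn off
the `macro-params' warnings in a single run, you could type:

     nasm -f elf myfile.asm -w+orphan-labels -w-macro-params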


File: nasm.info,  Node: Section 2.1.13,  Next: Section 2.2,  Prev: Section 2.1.12,  Up: Section 2.1

2.1.13. The `NASM' Environment Variable
***************************************

   If you define an environment variable called `NASM', the program will
interpret it as a list of extra command-line options, which are
processed before the real command line. You can use this to define
standard search directories for include files, by putting `-i' options
in the `NASM' variable.

   The value of the variable is split up at white space, so that the
value `-s -ic:\nasmlib' will be treated as two separate options.
However, that means that the value `-dNAME="my name"' won't do what you
might want, because it will be split at the space and the NASM
command-line processing will get confused by the two nonsensical words
`-dNAME="my' and `name"'.

   To get round this, NASM provides a feature whereby, if you begin the
`NASM' environment variable with some character that isn't a minus
sign, then NASM will treat this character as the separator character for
options. So setting the `NASM' variable to the value `!-s!-ic:\nasmlib'
is equivalent to setting it to `-s -ic:\nasmlib', but `!-dNAME="my
name"' will work.
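
   Under MS-DOS, for instance, the variable might be set from
`autoexec.bat' or at the command prompt as follows. This is a sketch
only: the directory `c:\nasmlib\' and the macro `NAME' are
illustrative, and the trailing backslash is there for the reason given
in *Note Section 2.1.6::.

     set NASM=!-s!-ic:\nasmlib\!-dNAME="my name"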


File: nasm.info,  Node: Section 2.2,  Next: Section 2.2.1,  Prev: Section 2.1.13,  Up: Chapter 2

2.2. Quick Start for MASM Users
*******************************

   If you're used to writing programs with MASM, or with TASM in
MASM-compatible (non-Ideal) mode, or with `a86', this section attempts to
outline the major differences between MASM's syntax and NASM's. If
you're not already used to MASM, it's probably worth skipping this
section.

* Menu:

* Section 2.2.1:: NASM Is Case-Sensitive
* Section 2.2.2:: NASM Requires Square Brackets For Memory References
* Section 2.2.3:: NASM Doesn't Store Variable Types
* Section 2.2.4:: NASM Doesn't `ASSUME'
* Section 2.2.5:: NASM Doesn't Support Memory Models
* Section 2.2.6:: Floating-Point Differences
* Section 2.2.7:: Other Differences


File: nasm.info,  Node: Section 2.2.1,  Next: Section 2.2.2,  Prev: Section 2.2,  Up: Section 2.2

2.2.1. NASM Is Case-Sensitive
*****************************

   One simple difference is that NASM is case-sensitive. It makes a
difference whether you call your label `foo', `Foo' or `FOO'. If you're
assembling to DOS or OS/2 `.OBJ' files, you can invoke the `UPPERCASE'
directive (documented in *Note Section 6.2::) to ensure that all
symbols exported to other code modules are forced to be upper case; but
even then, _within_ a single module, NASM will distinguish between
labels differing only in case.


File: nasm.info,  Node: Section 2.2.2,  Next: Section 2.2.3,  Prev: Section 2.2.1,  Up: Section 2.2

2.2.2. NASM Requires Square Brackets For Memory References
**********************************************************

   NASM was designed with simplicity of syntax in mind. One of the
design goals of NASM is that it should be possible, as far as is
practical, for the user to look at a single line of NASM code and tell
what opcode is generated by it. You can't do this in MASM: if you
declare, for example,

     foo       equ 1
     bar       dw 2

   then the two lines of code

               mov ax,foo
               mov ax,bar

   generate completely different opcodes, despite having
identical-looking syntaxes.

   NASM avoids this undesirable situation by having a much simpler
syntax for memory references. The rule is simply that any access to the
_contents_ of a memory location requires square brackets around the
address, and any access to the _address_ of a variable doesn't. So an
instruction of the form `mov ax,foo' will _always_ refer to a
compile-time constant, whether it's an `EQU' or the address of a
variable; and to access the _contents_ of the variable `bar', you must
code `mov ax,[bar]'.

   This also means that NASM has no need for MASM's `OFFSET' keyword,
since the MASM code `mov ax,offset bar' means exactly the same thing as
NASM's `mov ax,bar'. If you're trying to get large amounts of MASM code
to assemble sensibly under NASM, you can always code `%idefine offset'
to make the preprocessor treat the `OFFSET' keyword as a no-op.

   This issue is even more confusing in `a86', where declaring a label
with a trailing colon defines it to be a `label' as opposed to a
`variable' and causes `a86' to adopt NASM-style semantics; so in `a86',
`mov ax,var' has different behaviour depending on whether `var' was
declared as `var: dw 0' (a label) or `var dw 0' (a word-size variable).
NASM is very simple by comparison: _everything_ is a label.

   NASM, in the interests of simplicity, also does not support the
hybrid syntaxes supported by MASM and its clones, such as `mov
ax,table[bx]', where a memory reference is denoted by one portion
outside square brackets and another portion inside. The correct syntax
for the above is `mov ax,[table+bx]'. Likewise, `mov ax,es:[di]' is
wrong and `mov ax,[es:di]' is right.
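
   The fragment below gathers these rules into one place. The names
`foo', `bar' and `table' are illustrative, and the comments describe
what NASM does with each line:

     foo       equ 1
     bar       dw 2
     table     dw 10,20,30

               mov ax,foo              ; constant: loads 1
               mov ax,bar              ; constant: loads the address of bar
               mov ax,[bar]            ; memory: loads the word stored at bar
               mov ax,[table+bx]       ; not `mov ax,table[bx]'
               mov ax,[es:di]          ; not `mov ax,es:[di]'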


File: nasm.info,  Node: Section 2.2.3,  Next: Section 2.2.4,  Prev: Section 2.2.2,  Up: Section 2.2

2.2.3. NASM Doesn't Store Variable Types
****************************************

   NASM, by design, chooses not to remember the types of variables you
declare. Whereas MASM will remember, on seeing `var dw 0', that you
declared `var' as a word-size variable, and will then be able to fill
in the ambiguity in the size of the instruction `mov var,2', NASM will
deliberately remember nothing about the symbol `var' except where it
begins, and so you must explicitly code `mov word [var],2'.

   For this reason, NASM doesn't support the `LODS', `MOVS', `STOS',
`SCAS', `CMPS', `INS', or `OUTS' instructions, but only supports the
forms such as `LODSB', `MOVSW', and `SCASD', which explicitly specify
the size of the components of the strings being manipulated.
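
   A short sketch of the consequence (the name `var' is illustrative):

     var       dw 0

               mov word [var],2        ; size stated explicitly
               mov [var],ax            ; size implied by the register
               lodsb                   ; explicit byte form of `LODS'

   By contrast, `mov [var],2' is rejected, because neither operand
tells NASM how many bytes to store.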


File: nasm.info,  Node: Section 2.2.4,  Next: Section 2.2.5,  Prev: Section 2.2.3,  Up: Section 2.2

2.2.4. NASM Doesn't `ASSUME'
****************************

   As part of NASM's drive for simplicity, it also does not support the
`ASSUME' directive. NASM will not keep track of what values you choose
to put in your segment registers, and will never _automatically_
generate a segment override prefix.


File: nasm.info,  Node: Section 2.2.5,  Next: Section 2.2.6,  Prev: Section 2.2.4,  Up: Section 2.2

2.2.5. NASM Doesn't Support Memory Models
*****************************************

   NASM also does not have any directives to support different 16-bit
memory models. The programmer has to keep track of which functions are
supposed to be called with a far call and which with a near call, and
is responsible for putting the correct form of `RET' instruction
(`RETN' or `RETF'; NASM accepts `RET' itself as an alternate form for
`RETN'); in addition, the programmer is responsible for coding `CALL FAR'
instructions where necessary when calling _external_ functions, and
must also keep track of which external variable definitions are far and
which are near.


File: nasm.info,  Node: Section 2.2.6,  Next: Section 2.2.7,  Prev: Section 2.2.5,  Up: Section 2.2

2.2.6. Floating-Point Differences
*********************************

   NASM uses different names to refer to floating-point registers from
MASM: where MASM would call them `ST(0)', `ST(1)' and so on, and `a86'
would call them simply `0', `1' and so on, NASM chooses to call them
`st0', `st1' etc.

   As of version 0.96, NASM now treats the instructions with `nowait'
forms in the same way as MASM-compatible assemblers. The idiosyncratic
treatment employed by 0.95 and earlier was based on a misunderstanding
by the authors.


File: nasm.info,  Node: Section 2.2.7,  Next: Chapter 3,  Prev: Section 2.2.6,  Up: Section 2.2

2.2.7. Other Differences
************************

   For historical reasons, NASM uses the keyword `TWORD' where MASM and
compatible assemblers use `TBYTE'.

   NASM does not declare uninitialised storage in the same way as MASM:
where a MASM programmer might use `stack db 64 dup (?)', NASM requires
`stack resb 64', intended to be read as `reserve 64 bytes'. For a
limited amount of compatibility, since NASM treats `?' as a valid
character in symbol names, you can code `? equ 0' and then writing `dw
?' will at least do something vaguely useful. `DUP' is still not a
supported syntax, however.
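
   A short sketch of the two approaches (the label names are
illustrative only). Note that the `?' trick produces initialised
zeroes rather than genuinely uninitialised storage:

     stack:    resb 64                ; MASM: stack db 64 dup (?)
     
     ?         equ 0                  ; make `?' a symbol worth 0
     wordvar:  dw ?                   ; now assembles as dw 0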

   In addition to all of this, macros and directives work completely
differently to MASM. See *Note Chapter 4:: and *Note Chapter 5:: for
further details.


File: nasm.info,  Node: Chapter 3,  Next: Section 3.1,  Prev: Section 2.2.7,  Up: Top

Chapter 3: The NASM Language
****************************

* Menu:

* Section 3.1:: Layout of a NASM Source Line
* Section 3.2:: Pseudo-Instructions
* Section 3.3:: Effective Addresses
* Section 3.4:: Constants
* Section 3.5:: Expressions
* Section 3.6:: `SEG' and `WRT'
* Section 3.7:: Critical Expressions
* Section 3.8:: Local Labels


File: nasm.info,  Node: Section 3.1,  Next: Section 3.2,  Prev: Chapter 3,  Up: Chapter 3

3.1. Layout of a NASM Source Line
*********************************

   As in most assemblers, each NASM source line contains (unless it is a
macro, a preprocessor directive or an assembler directive: see *Note
Chapter 4:: and *Note Chapter 5::) some combination of the four fields

     label:    instruction operands        ; comment

   As usual, most of these fields are optional; the presence or absence
of any combination of a label, an instruction and a comment is allowed.
Of course, the operand field is either required or forbidden by the
presence and nature of the instruction field.

   NASM places no restrictions on white space within a line: labels may
have white space before them, or instructions may have no space before
them, or anything. The colon after a label is also optional. (Note that
this means that if you intend to code `lodsb' alone on a line, and type
`lodab' by accident, then that's still a valid source line which does
nothing but define a label. Running NASM with the command-line option
`-w+orphan-labels' will cause it to warn you if you define a label
alone on a line without a trailing colon.)
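
   The following fragment sketches these rules. All the labels are
illustrative, and the last line reproduces the typing error described
above:

     start:    mov cx,10              ; label with a colon
     again     lodsb                  ; colon omitted: still a label
         loop again                   ; odd indentation is harmless
     lodab                            ; defines a label, emits no code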

   Valid characters in labels are letters, numbers, `_', `$', `#', `@',
`~', `.', and `?'. The only characters which may be used as the _first_
character of an identifier are letters, `.' (with special meaning: see
*Note Section 3.8::), `_' and `?'. An identifier may also be prefixed
with a `$' to indicate that it is intended to be read as an identifier
and not a reserved word; thus, if some other module you are linking
with defines a symbol called `eax', you can refer to `$eax' in NASM
code to distinguish the symbol from the register.
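
   For instance, assuming another module exports a variable named
`eax' (a hypothetical situation), and assuming the `$' prefix is also
accepted at the point where the symbol is declared:

               extern $eax            ; the symbol, not the register
     
               mov ebx,[$eax]         ; load from the variable `eax'
               mov ecx,eax            ; copy the register EAX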

   The instruction field may contain any machine instruction: Pentium
and P6 instructions, FPU instructions, MMX instructions and even
undocumented instructions are all supported. The instruction may be
prefixed by `LOCK', `REP', `REPE'/`REPZ' or `REPNE'/`REPNZ', in the
usual way. Explicit address-size and operand-size prefixes `A16',
`A32', `O16' and `O32' are provided - one example of their use is given
in *Note Chapter 9::. You can also use the name of a segment register
as an instruction prefix: coding `es mov [bx],ax' is equivalent to
coding `mov [es:bx],ax'. We recommend the latter syntax, since it is
consistent with other syntactic features of the language, but for
instructions such as `LODSB', which has no operands and yet can require
a segment override, there is no clean syntactic way to proceed apart
from `es lodsb'.

   A prefix does not have to be attached to an instruction: prefixes
such as `CS', `A32', `LOCK' or `REPE' can appear on a line by
themselves, and NASM will just generate the prefix bytes.
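
   A few of these prefix forms, sketched in 16-bit code:

               rep movsb              ; repeat prefix
               lock inc word [bx]     ; bus lock prefix
               mov [es:bx],ax         ; recommended override syntax
               es lodsb               ; override with no operand
     
               repe                   ; a prefix alone on a line
               cmpsb                  ; the instruction it applies to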

   In addition to actual machine instructions, NASM also supports a
number of pseudo-instructions, described in *Note Section 3.2::.

   Instruction operands may take a number of forms: they can be
registers, described simply by the register name (e.g. `ax', `bp',
`ebx', `cr0': NASM does not use the `gas'-style syntax in which
register names must be prefixed by a `%' sign), or they can be
effective addresses (see *Note Section 3.3::), constants (*Note Section
3.4::) or expressions (*Note Section 3.5::).

   For floating-point instructions, NASM accepts a wide range of
syntaxes: you can use two-operand forms like MASM supports, or you can
use NASM's native single-operand forms in most cases. Details of all
forms of each supported instruction are given in *Note Appendix A::.
For example, you can code:

               fadd st1               ; this sets st0 := st0 + st1
               fadd st0,st1           ; so does this
     
               fadd st1,st0           ; this sets st1 := st1 + st0
               fadd to st1            ; so does this

   Almost any floating-point instruction that references memory must
use one of the prefixes `DWORD', `QWORD' or `TWORD' to indicate what
size of memory operand it refers to.
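
   For example, with three hypothetical variables of different sizes:

               fld dword [val32]      ; load a 4-byte real
               fadd qword [val64]     ; add an 8-byte real
               fstp tword [val80]     ; store 10 bytes and pop
     
     val32:    dd 1.5
     val64:    dq 2.5
     val80:    dt 0.0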


File: nasm.info,  Node: Section 3.2,  Next: Section 3.2.1,  Prev: Section 3.1,  Up: Chapter 3

3.2. Pseudo-Instructions
************************

   Pseudo-instructions are things which, though not real x86 machine
instructions, are used in the instruction field anyway because that's
the most convenient place to put them. The current pseudo-instructions
are `DB', `DW', `DD', `DQ' and `DT', their uninitialised counterparts
`RESB', `RESW', `RESD', `RESQ' and `REST', the `INCBIN' command, the
`EQU' command, and the `TIMES' prefix.

* Menu:

* Section 3.2.1:: `DB' and friends: Declaring Initialised Data
* Section 3.2.2:: `RESB' and friends: Declaring Uninitialised Data
* Section 3.2.3:: `INCBIN': Including External Binary Files
* Section 3.2.4:: `EQU': Defining Constants
* Section 3.2.5:: `TIMES': Repeating Instructions or Data


File: nasm.info,  Node: Section 3.2.1,  Next: Section 3.2.2,  Prev: Section 3.2,  Up: Section 3.2

3.2.1. `DB' and friends: Declaring Initialised Data
***************************************************

   `DB', `DW', `DD', `DQ' and `DT' are used, much as in MASM, to
declare initialised data in the output file. They can be invoked in a
wide range of ways:

               db 0x55                ; just the byte 0x55
               db 0x55,0x56,0x57      ; three bytes in succession
               db 'a',0x55            ; character constants are OK
               db 'hello',13,10,'$'   ; so are string constants
               dw 0x1234              ; 0x34 0x12
               dw 'a'                 ; 0x61 0x00 (it's just a number)
               dw 'ab'                ; 0x61 0x62 (character constant)
               dw 'abc'               ; 0x61 0x62 0x63 0x00 (string)
               dd 0x12345678          ; 0x78 0x56 0x34 0x12
               dd 1.234567e20         ; floating-point constant
               dq 1.234567e20         ; double-precision float
               dt 1.234567e20         ; extended-precision float

   `DQ' and `DT' do not accept numeric constants or string constants as
operands.


File: nasm.info,  Node: Section 3.2.2,  Next: Section 3.2.3,  Prev: Section 3.2.1,  Up: Section 3.2

3.2.2. `RESB' and friends: Declaring Uninitialised Data
*******************************************************

   `RESB', `RESW', `RESD', `RESQ' and `REST' are designed to be used in
the BSS section of a module: they declare _uninitialised_ storage
space. Each takes a single operand, which is the number of bytes,
words, doublewords or whatever to reserve. As stated in *Note Section
2.2.7::, NASM does not support the MASM/TASM syntax of reserving
uninitialised space by writing `DW ?' or similar things: this is what
it does instead. The operand to a `RESB'-type pseudo-instruction is a
_critical expression_: see *Note Section 3.7::.

   For example:

     buffer:   resb 64                ; reserve 64 bytes
     wordvar:  resw 1                 ; reserve a word
     realarray resq 10                ; array of ten reals
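
   In context, such reservations would normally follow a switch to
the BSS section. A sketch, assuming an output format that provides a
section called `.bss' (the label `count' is illustrative):

               section .bss
     buffer:   resb 64                ; 64 bytes, no file contents
     count:    resw 1
     
               section .text
               mov word [count],0     ; must be initialised at run time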


File: nasm.info,  Node: Section 3.2.3,  Next: Section 3.2.4,  Prev: Section 3.2.2,  Up: Section 3.2

3.2.3. `INCBIN': Including External Binary Files
************************************************

   `INCBIN' is borrowed from the old Amiga assembler DevPac: it includes
a binary file verbatim into the output file. This can be handy, for
example, for including graphics and sound data directly into a game
executable file. It can be called in one of these three ways:

               incbin "file.dat"      ; include the whole file
               incbin "file.dat",1024 ; skip the first 1024 bytes
               incbin "file.dat",1024,512 ; skip the first 1024, and
                                      ; actually include at most 512
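
   Since `INCBIN' sits in the instruction field, it can take a label
like any other data declaration. A sketch, with a hypothetical file
name and labels:

     sprite:   incbin "sprite.dat"    ; whole file, verbatim
     spritelen equ $-sprite           ; its length in bytes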


File: nasm.info,  Node: Section 3.2.4,  Next: Section 3.2.5,  Prev: Section 3.2.3,  Up: Section 3.2

3.2.4. `EQU': Defining Constants
********************************

   `EQU' defines a symbol to a given constant value: when `EQU' is
used, the source line must contain a label. The action of `EQU' is to
define the given label name to the value of its (only) operand. This
definition is absolute, and cannot change later. So, for example,

     message   db 'hello, world'
     msglen    equ $-message

   defines `msglen' to be the constant 12. `msglen' may not then be
redefined later. This is not a preprocessor definition either: the
value of `msglen' is evaluated _once_, using the value of `$' (see
*Note Section 3.5:: for an explanation of `$') at the point of
definition, rather than being evaluated wherever it is referenced and
using the value of `$' at the point of reference. Note that the operand
to an `EQU' is also a critical expression (*Note Section 3.7::).
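
   The contrast with a preprocessor definition can be sketched as
follows (`here' is a hypothetical macro name; `%define' is described
in *Note Chapter 4::). The comments assume that `message' sits at
offset 0:

     message   db 'hello, world'
     msglen    equ $-message          ; fixed at 12, here and now
     %define   here $-message         ; expanded afresh at each use
     
               dw msglen              ; 12
               dw here                ; 14: `$' has moved on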


File: nasm.info,  Node: Section 3.2.5,  Next: Section 3.3,  Prev: Section 3.2.4,  Up: Section 3.2

3.2.5. `TIMES': Repeating Instructions or Data
**********************************************

   The `TIMES' prefix causes the instruction to be assembled multiple
times. This is partly present as NASM's equivalent of the `DUP' syntax
supported by MASM-compatible assemblers, in that you can code

     zerobuf:  times 64 db 0

   or similar things; but `TIMES' is more versatile than that. The
argument to `TIMES' is not just a numeric constant, but a numeric
_expression_, so you can do things like

     buffer:   db 'hello, world'
               times 64-$+buffer db ' '

   which will store exactly enough spaces to make the total length of
`buffer' up to 64. Finally, `TIMES' can be applied to ordinary
instructions, so you can code trivial unrolled loops in it:

               times 100 movsb

   Note that there is no effective difference between `times 100 resb 1'
and `resb 100', except that the latter will be assembled about 100
times faster due to the internal structure of the assembler.

   The operand to `TIMES', like that of `EQU' and those of `RESB' and
friends, is a critical expression (*Note Section 3.7::).

   Note also that `TIMES' can't be applied to macros: the reason for
this is that `TIMES' is processed after the macro phase, which allows
the argument to `TIMES' to contain expressions such as `64-$+buffer' as
above. To repeat more than one line of code, or a complex macro, use the
preprocessor `%rep' directive.
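
   Two sketches of these points. The first pads a block out to 512
bytes using `$$', the start of the current section (see *Note Section
3.5::); the second uses `%rep' (see *Note Chapter 4::) where `TIMES'
cannot reach. The sizes and instructions are illustrative only:

               times 510-($-$$) db 0  ; pad up to offset 510
               dw 0xAA55              ; final two bytes of the 512
     
     %rep 4                           ; repeats both lines 4 times
               movsb
               inc bx
     %endrep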


File: nasm.info,  Node: Section 3.3,  Next: Section 3.4,  Prev: Section 3.2.5,  Up: Chapter 3

3.3. Effective Addresses
************************

   An effective address is any operand to an instruction which
references memory. Effective addresses, in NASM, have a very simple
syntax: they consist of an expression evaluating to the desired
address, enclosed in square brackets. For example:

     wordvar   dw 123
               mov ax,[wordvar]
               mov ax,[wordvar+1]
               mov ax,[es:wordvar+bx]

   Anything not conforming to this simple system is not a valid memory
reference in NASM, for example `es:wordvar[bx]'.

   More complicated effective addresses, such as those involving more
than one register, work in exactly the same way:

               mov eax,[ebx*2+ecx+offset]
               mov ax,[bp+di+8]

   NASM is capable of doing algebra on these effective addresses, so
that things which don't necessarily _look_ legal are perfectly all
right:

               mov eax,[ebx*5]        ; assembles as [ebx*4+ebx]
               mov eax,[label1*2-label2] ; i.e. [label1+(label1-label2)]

   Some forms of effective address have more than one assembled form;
in most such cases NASM will generate the smallest form it can. For
example, there are distinct assembled forms for the 32-bit effective
addresses `[eax*2+0]' and `[eax+eax]', and NASM will generally generate
the latter on the grounds that the former requires four bytes to store
a zero offset.

   NASM has a hinting mechanism which will cause `[eax+ebx]' and
`[ebx+eax]' to generate different opcodes; this is occasionally useful
because `[esi+ebp]' and `[ebp+esi]' have different default segment
registers.
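
   A sketch of the distinction, assuming the hint takes the register
written first as the base register:

               mov eax,[esi+ebp]      ; base ESI: defaults to DS
               mov eax,[ebp+esi]      ; base EBP: defaults to SS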

   However, you can force NASM to generate an effective address in a
particular form by the use of the keywords `BYTE', `WORD', `DWORD' and
`NOSPLIT'. If you need `[eax+3]' to be assembled using a double-word
offset field instead of the one byte NASM will normally generate, you
can code `[dword eax+3]'. Similarly, you can force NASM to use a byte
offset for a small value which it hasn't seen on the first pass (see
*Note Section 3.7:: for an example of such a code fragment) by using
`[byte eax+offset]'. As special cases, `[byte eax]' will code `[eax+0]'
with a byte offset of zero, and `[dword eax]' will code it with a
double-word offset of zero. The normal form, `[eax]', will be coded
with no offset field.

   Similarly, NASM will split `[eax*2]' into `[eax+eax]' because that
allows the offset field to be absent and space to be saved; in fact, it
will also split `[eax*2+offset]' into `[eax+eax+offset]'. You can
combat this behaviour by the use of the `NOSPLIT' keyword: `[nosplit
eax*2]' will force `[eax*2+0]' to be generated literally.
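
   Side by side, the forms discussed above look like this:

               mov eax,[eax+3]        ; one-byte offset (default)
               mov eax,[dword eax+3]  ; four-byte offset
               mov eax,[eax]          ; no offset field
               mov eax,[byte eax]     ; one-byte offset of zero
               mov eax,[dword eax]    ; four-byte offset of zero
               mov eax,[eax*2]        ; split into [eax+eax]
               mov eax,[nosplit eax*2] ; literally [eax*2+0]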


File: nasm.info,  Node: Section 3.4,  Next: Section 3.4.1,  Prev: Section 3.3,  Up: Chapter 3

3.4. Constants
**************

   NASM understands four different types of constant: numeric,
character, string and floating-point.

* Menu:

* Section 3.4.1:: Numeric Constants
* Section 3.4.2:: Character Constants
* Section 3.4.3:: String Constants
* Section 3.4.4:: Floating-Point Constants


File: nasm.info,  Node: Section 3.4.1,  Next: Section 3.4.2,  Prev: Section 3.4,  Up: Section 3.4

3.4.1. Numeric Constants
************************

   A numeric constant is simply a number. NASM allows you to specify
numbers in a variety of number bases, in a variety of ways: you can
suffix `H', `Q' and `B' for hex, octal and binary, or you can prefix
`0x' for hex in the style of C, or you can prefix `$' for hex in the
style of Borland Pascal. Note, though, that the `$' prefix does double
duty as a prefix on identifiers (see *Note Section 3.1::), so a hex
number prefixed with a `$' sign must have a digit after the `$' rather
than a letter.

   Some examples:

               mov ax,100             ; decimal
               mov ax,0a2h            ; hex
               mov ax,$0a2            ; hex again: the 0 is required
               mov ax,0xa2            ; hex yet again
               mov ax,777q            ; octal
               mov ax,10010011b       ; binary


File: nasm.info,  Node: Section 3.4.2,  Next: Section 3.4.3,  Prev: Section 3.4.1,  Up: Section 3.4

3.4.2. Character Constants
**************************

   A character constant consists of up to four characters enclosed in
either single or double quotes. The type of quote makes no difference
to NASM, except of course that surrounding the constant with single
quotes allows double quotes to appear within it and vice versa.

   A character constant with more than one character will be arranged
with little-endian order in mind: if you code

               mov eax,'abcd'

   then the constant generated is not `0x61626364', but `0x64636261',
so that if you were then to store the value into memory, it would read
`abcd' rather than `dcba'. This is also the sense of character
constants understood by the Pentium's `CPUID' instruction (see *Note
Section A.22::).
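
   As a sketch of both points (the labels `buf' and `not_intel' are
hypothetical and assumed to be defined elsewhere), the first two lines
leave the bytes `abcd' in memory in reading order, and the rest
compares the vendor string returned by `CPUID' with EAX=0 against
`GenuineIntel', four characters at a time:

               mov eax,'abcd'
               mov [buf],eax          ; buf now reads 'a','b','c','d'
     
               xor eax,eax            ; CPUID function 0
               cpuid                  ; vendor string in EBX, EDX, ECX
               cmp ebx,'Genu'
               jne not_intel
               cmp edx,'ineI'
               jne not_intel
               cmp ecx,'ntel'
               jne not_intel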


File: nasm.info,  Node: Section 3.4.3,  Next: Section 3.4.4,  Prev: Section 3.4.2,  Up: Section 3.4

3.4.3. String Constants
***********************

   String constants are only acceptable to some pseudo-instructions,
namely the `DB' family and `INCBIN'.

   A string constant looks like a character constant, only longer. It is
treated as a concatenation of character constants, each of the maximum
size that the pseudo-instruction in question accepts. So the following
are equivalent:

               db 'hello'             ; string constant
               db 'h','e','l','l','o' ; equivalent character constants

   And the following are also equivalent:

               dd 'ninechars'         ; doubleword string constant
               dd 'nine','char','s'   ; becomes three doublewords
               db 'ninechars',0,0,0   ; and really looks like this

   Note that when used as an operand to `db', a constant like `'ab'' is
treated as a string constant despite being short enough to be a
character constant, because otherwise `db 'ab'' would have the same
effect as `db 'a'', which would be silly. Similarly, three-character or
four-character constants are treated as strings when they are operands
to `dw'.
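
   The same rule for `dw', sketched:

               dw 'ab'                ; character constant: one word
               dw 'abc'               ; string: same as the next line
               dw 'ab','c'            ; two words
               db 'abc',0             ; the same four bytes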


File: nasm.info,  Node: Section 3.4.4,  Next: Section 3.5,  Prev: Section 3.4.3,  Up: Section 3.4

3.4.4. Floating-Point Constants
*******************************

   Floating-point constants are acceptable only as arguments to `DD',
`DQ' and `DT'. They are expressed in the traditional form: digits, then
a period, then optionally more digits, then optionally an `E' followed
by an exponent. The period is mandatory, so that NASM can distinguish
between `dd 1', which declares an integer constant, and `dd 1.0' which
declares a floating-point constant.

   Some examples:

               dd 1.2                 ; an easy one
               dq 1.e10               ; 10,000,000,000
               dq 1.e+10              ; synonymous with 1.e10
               dq 1.e-10              ; 0.000 000 000 1
               dt 3.141592653589793238462 ; pi

   NASM cannot do compile-time arithmetic on floating-point constants.
This is because NASM is designed to be portable - although it always
generates code to run on x86 processors, the assembler itself can run
on any system with an ANSI C compiler. Therefore, the assembler cannot
guarantee the presence of a floating-point unit capable of handling the
Intel number formats, and so for NASM to be able to do floating
arithmetic it would have to include its own complete set of
floating-point routines, which would significantly increase the size of
the assembler for very little benefit.
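
   In practice this means that a floating-point value must be written
out in full rather than computed. A sketch (the label is illustrative):

     half:     dd 0.5                 ; accepted: a plain constant
     ;         dd 1.0/2.0             ; rejected: would need arithmetic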