MASM Assembly
Introduction
The Microsoft Macro Assembler (MASM) provides several advantages over inline assembly. MASM contains macro features that include loops, arithmetic, and text string processing. MASM gives you greater control over your hardware such as CPU and memory.
MASM is generally used for programming firmware, developing operating systems, and programming at system level.
Syntax
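There are several assembly syntax styles. The two most important are:
Intel Syntax - Widely used on Windows operating systems. Operands are written destination first (mov eax, 1), which many programmers find friendlier to read. MASM uses this syntax.

AT&T Syntax - Generally used on Unix systems and the default for the GDB debugger. Operands are written source first, with a % prefix on registers and a $ prefix on immediates (movl $1, %eax).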

Introduction to Registers
CPU registers are small, high-speed storage locations within the CPU used to store data and addresses during the execution of instructions. Most mathematical operations (addition, multiplication, subtraction) are carried out on values held in registers. Registers often hold pointers that refer to memory.
Types of Registers
CPU registers can mainly be classified into 4 different categories.
General Purpose Registers
Segment Registers
Special purpose application-accessible registers
Special Purpose Kernel-Mode Registers
On this page we will only go over general purpose registers, since they are the ones most commonly used by programmers.
General Purpose Registers
Used to store temporary data. Their contents can be read and written directly by assembly code. In generic notation they are numbered R0, R1, R2, ..., Rn-1.
Windows x86 Architecture
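In Windows x86 the general purpose registers look like this:
| 32-bit | 16-bit | 8-bit (high / low) |
| --- | --- | --- |
| EAX | AX | AH / AL |
| EBX | BX | BH / BL |
| ECX | CX | CH / CL |
| EDX | DX | DH / DL |
| ESI | SI | - |
| EDI | DI | - |
| EBP | BP | - |
| ESP | SP | - |
NOTE: The 8-bit names SIL, DIL, BPL, and SPL, and the registers R8 through R15 (with their 32-bit R8D-R15D, 16-bit R8W-R15W, and 8-bit views), can only be used in 64-bit mode. They are listed in the x64 table further down.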
x86 Register Structure
The following diagram shows the first two registers, EAX and EBX.

Windows x64 Architecture
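In Windows x64 the general purpose registers look like this:
| 64-bit | 32-bit | 16-bit | 8-bit (high / low) |
| --- | --- | --- | --- |
| RAX | EAX | AX | AH / AL |
| RBX | EBX | BX | BH / BL |
| RCX | ECX | CX | CH / CL |
| RDX | EDX | DX | DH / DL |
| RSI | ESI | SI | SIL |
| RDI | EDI | DI | DIL |
| RBP | EBP | BP | BPL |
| RSP | ESP | SP | SPL |
| R8 | R8D | R8W | R8B |
| R9 | R9D | R9W | R9B |
| R10 | R10D | R10W | R10B |
| R11 | R11D | R11W | R11B |
| R12 | R12D | R12W | R12B |
| R13 | R13D | R13W | R13B |
| R14 | R14D | R14W | R14B |
| R15 | R15D | R15W | R15B |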
x64 Register Structure
The following diagram shows the first two registers, RAX and RBX.

IMPORTANT: These registers are not independent of one another. They are arranged in a hierarchy in which larger registers overlay smaller ones. RAX (64-bit) contains EAX (32-bit) as its low half, EAX contains AX (16-bit), and AX is split into the two 8-bit registers AH and AL.
**Therefore, modifying RAX can change the values of EAX, AX, AH, and AL.** The hierarchical relationship works in both directions: changing a smaller register also changes every larger register that contains it.
For example, modifying the value of BL changes the value of BX, which changes EBX, which in turn changes RBX. One special case applies in 64-bit mode: writing to a 32-bit register such as EAX also zeroes the upper 32 bits of the corresponding 64-bit register, whereas 8-bit and 16-bit writes leave the remaining bits untouched.
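The overlay can be modelled in C with a union. This is only a sketch: the type `Reg64` and the two helper functions are made-up names for illustration, and the layout assumes a little-endian host (as x86 is). A union cannot model the rule that a 32-bit write zeroes the upper half of the 64-bit register.

```c
#include <stdint.h>

/* Hypothetical model of one 64-bit register and its sub-registers.
   Assumes a little-endian host, so the lowest byte is stored first. */
typedef union {
    uint64_t r;            /* RAX: all 64 bits   */
    uint32_t e;            /* EAX: low 32 bits   */
    uint16_t x;            /* AX:  low 16 bits   */
    struct {
        uint8_t l;         /* AL: bits 0-7       */
        uint8_t h;         /* AH: bits 8-15      */
    } b;
} Reg64;

/* Write the low byte (like "mov al, value") and return the full register. */
uint64_t write_low_byte(uint64_t full, uint8_t value) {
    Reg64 reg;
    reg.r = full;
    reg.b.l = value;       /* only bits 0-7 change */
    return reg.r;
}

/* Read the 16-bit view of a register (like reading AX). */
uint16_t read_word(uint64_t full) {
    Reg64 reg;
    reg.r = full;
    return reg.x;
}
```

Writing through the smallest member changes what every larger member reads, which is the same behaviour as BL feeding into BX, EBX, and RBX.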
Volatile vs Non-Volatile Registers
During a function or procedure call (assembly functions are called procedures), the called procedure is free to overwrite some registers but must preserve others. Registers whose values must be the same after the call as they were before it are called non-volatile (callee-saved) registers.
Common non-volatile registers:
x64 Systems - RSP, RSI, RDI, RBP, RBX, R12-R15.
x86 Systems - EBX, EBP, ESI, EDI.
On the other hand, volatile registers do not need to be preserved by the called procedure, so a caller that still needs their values after the call must save them itself:
x64 Systems - RCX, RAX, RDX, R8-R11.
x86 Systems - ECX, EAX, EDX.
IMPORTANT: Whenever a value of a non-volatile register is changed by the routine (procedure), the old value has to be saved on the stack prior to changing the register and that value has to be restored before returning.
RSP & RIP Registers
The RIP/EIP register is a special-purpose register that holds the memory address of the next instruction to be executed. The processor automatically advances RIP/EIP past each instruction as it executes, and branch instructions such as jmp, call, and ret load it with a new address.
The RSP/ESP register is called the stack pointer register. It holds the memory address of the top of the stack. (The stack is a memory region that's used to store temporary data and function call information. RSP/ESP keeps track of the stack's current location.)
RDI, RSI, RDX Argument Registers
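Which registers carry function arguments depends on the calling convention. Under the System V AMD64 convention used on Linux and macOS, the general purpose registers rdi, rsi, rdx, rcx, r8, and r9 carry the first six integer or pointer arguments. Windows x64 instead uses rcx, rdx, r8, and r9 for the first four (see Passing Parameters). These registers are known as argument registers; they hold the values that are passed to a function.
int result = add(3,6);
In the example above, the values 3 and 6 would be passed to the add function in registers. Under System V, rdi holds 3 and rsi holds 6; under Windows x64, rcx holds 3 and rdx holds 6.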
RFLAGS Register
The RFLAGS (Register Flags) is a special-purpose register that contains several status and control flags that are used by the processor to control program execution.
On 64-bit machines RFLAGS is 64 bits in size; on 32-bit machines the equivalent register, EFLAGS, is 32 bits. The register comprises several single-bit values, where each bit corresponds to a single flag. A flag is set to 1 when activated and 0 when deactivated.
IMPORTANT: The majority of RFLAGS flags are reserved for kernel-mode functions, so user-mode code has only limited access to them.

The relevant flags are explained below:
Carry Flag (CF) - This flag is set when an arithmetic operation generates a carry out of, or a borrow into, the most significant bit (unsigned overflow). Shift and rotate instructions also use it to hold the last bit shifted out.
Parity Flag (PF) - This flag is set when the least significant byte of the result of an arithmetic operation has an even number of set bits.
Zero Flag (ZF) - This flag is set when the result of an arithmetic operation is zero.
Sign Flag (SF) - This flag is a copy of the most significant bit of the result, so it is set when the result is negative if interpreted as a signed number.
Overflow Flag (OF) - This flag is set when an arithmetic operation generates a signed overflow, meaning that the result is too large or too small to be represented as a signed value in the available number of bits.
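The flag definitions above can be checked with a small C model of an 8-bit addition. This is a sketch for illustration only: the `Flags` struct and the `add8_flags` function are hypothetical names, and the model covers only the five flags listed here.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical container for the five status flags discussed above. */
typedef struct {
    bool cf, pf, zf, sf, of;
} Flags;

/* Compute the flags that an 8-bit "add a, b" would produce. */
Flags add8_flags(uint8_t a, uint8_t b) {
    Flags f;
    unsigned wide = (unsigned)a + (unsigned)b;   /* keeps the carry in bit 8 */
    uint8_t result = (uint8_t)wide;              /* what the register would hold */

    f.cf = wide > 0xFF;                          /* unsigned carry out */
    f.zf = result == 0;
    f.sf = (result & 0x80) != 0;                 /* most significant bit */
    /* Signed overflow: both inputs share a sign that the result lacks. */
    f.of = ((a ^ result) & (b ^ result) & 0x80) != 0;

    /* Parity: set when the result byte has an even number of 1 bits. */
    int ones = 0;
    for (int i = 0; i < 8; i++) {
        ones += (result >> i) & 1;
    }
    f.pf = (ones % 2) == 0;
    return f;
}
```

For instance, `add8_flags(0xFF, 0x01)` wraps to zero, so it sets CF and ZF but not OF, while `add8_flags(0x7F, 0x01)` crosses from the largest positive signed byte into negative territory, so it sets OF and SF but not CF.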
MASM Assembly Program Structure
Here is a typical MASM program. The semicolon ; denotes a comment.
; Data section: contains variable and memory values, adding this section is optional
; Variables can be declared below the ".data" directive
.data
; Code section: contains the assembly code/functions
; Assembly functions can be declared below the ".code" directive
.code
; MASM function declaration
main PROC ; Start of function "main"
; Assembly code of "main"
ret ; Return from "main"
main ENDP ; End of function "main"
; The "end" directive marks the end of the source file
end
Declaring Variables
Variables must be declared within the .data section of the program.
VarName directive VarValue
VarName is the variable name you want.
Here is a list of possible directives:
word - Unsigned 16-bit value (word).
sword - Signed 16-bit integer value.
dword - Unsigned 32-bit value (double word).
sdword - Signed 32-bit integer value.
qword - Unsigned 64-bit value (quad word).
sqword - Signed 64-bit integer value.
oword - 128-bit value (octa word).
tbyte - Unsigned 80-bit value.
real4 - 32-bit floating point value.
real8 - 64-bit floating point value.
real10 - 80-bit floating point value.
byte - Unsigned 8-bit value.
sbyte - Signed 8-bit integer value.
Declaring Value:
VarValue is our value:
WordVariable word 2
sWordVariable sword -2
FloatVariable real8 3.1
Declaring value as Hexadecimal:
We can initialize a value with hexadecimal using the h suffix.
DwordVariable dword 10h ; this is 10 in hex, which is 16 in decimal
Declaring Strings:
Strings are declared using the byte directive:
StringVar byte 'This is a string', 0 ; we add "0" to null-terminate the string
MASM stores the string above as an array of bytes, one per character. Extra byte values can be appended after the text; for example, the new line (line feed) character is 10 in decimal:
StringVar byte 'This is a string with a new line', 10, 0 ; "10" is the new line character (0Ah in hexadecimal)
MASM reads numbers without a suffix as decimal by default, so the 10 above is decimal ten. It must not be given the h suffix: 10h would be 16, which is not the new line character.
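For comparison, here is a rough C counterpart of a byte-array string declaration. It is a sketch only: the shortened string 'Hi' and the helper names are made up for illustration, and the point is simply that the value 10 is stored as one byte equal to the line feed character.

```c
#include <stddef.h>

/* C counterpart of the MASM line:  StringVar byte 'Hi', 10, 0  */
static const unsigned char StringVar[] = { 'H', 'i', 10, 0 };

/* Number of bytes the declaration occupies, terminator included. */
size_t string_var_size(void) {
    return sizeof StringVar;
}

/* The third byte is decimal 10, which is the line feed '\n' (0x0A). */
int third_byte_is_newline(void) {
    return StringVar[2] == '\n';
}
```

Each element of the initializer becomes exactly one byte, just as each comma-separated item after the byte directive does.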
Assembly Instructions
The following section goes over common Assembly instructions. A full list can be found:
mov
The mov instruction is the most frequently used instruction in assembly. As the name suggests, it moves (copies) data between registers and memory locations.
mov destination, source
Either operand can be a general purpose register or a memory variable, and the source can also be an immediate constant. The mov instruction has two restrictions:
Only one of the source and destination operands can be a memory variable.
Both the source and destination operands must be of the same size. Mixing different operand sizes within a single mov instruction will result in an assembly error.
Here are some examples of legal mov instructions:

mov rax, 1234 ; move the value 1234 into the RAX register
mov rax, rbx ; move the value in the RBX register into the RAX register
mov al, 5h ; move the value 0x05 into the AL register
mov [ebx], al ; move the value in AL to the memory location pointed to by the EBX register
In assembly language, square brackets [] indicate indirect memory access. The value inside the brackets is treated as an address, and the instruction reads or writes the memory at that address, similar to dereferencing a pointer in C.
add & sub
The add and sub instructions add and subtract two operands. They share the same syntax.
add destination, source ; destination = destination + source
sub destination, source ; destination = destination - source
add rax, rbx ; add the value in RBX to the value in RAX and store the result in RAX
add rax, [rcx] ; add the value in the memory location at RCX to the value in RAX and store the result in RAX
add qword ptr [rax], 10 ; add the value 10 to the 64-bit value in memory at RAX and store the result in that memory location
mov al, 12h ; move the value 0x12 into the AL register
mov bl, 5h ; move the value 0x05 into the BL register
sub al, bl ; subtract the value in BL from the value in AL and store the result in AL. AL's value is now 0Dh (13 in decimal)
call & ret
Procedure calls are made with the call instruction. The ret instruction is then used to return execution back to the caller, similar to a return statement in C/C++.
call ProcedureName ; ProcedureName is the function name we call.
The ret instruction does not require any parameters / operands. It does not return a value by itself; its purpose is to end the current procedure. The address that ret returns to is the value at the top of the stack, which the matching call instruction pushed there.
ret
Example code
Here is an example of the ret and call instructions.
.code
DummpProc PROC
mov rcx, 3 ; dummy code
add rdx, 2 ; RDX and R8 are volatile, so no save/restore is needed
sub r8d, 1
ret ; return execution back to "main"
DummpProc ENDP
main PROC
call DummpProc ; calling "DummpProc"
ret ; function "main" is terminated
main ENDP
end
lea
The Load Effective Address (lea) instruction computes the memory address of its source operand and loads it into a register, without actually accessing the memory itself. It's essentially the & address-of operator in C/C++.
lea reg64, source ; reg64 represents a 64-bit general-purpose register
Where reg64 (the destination operand) represents any 64-bit general-purpose register that will hold the address of the source memory location.
StringVar byte 'String Variable', 0 ; A dummy string variable
lea rcx, StringVar ; Load the address of the StringVar variable into RCX. RCX is now equal to &StringVar[0]
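As a rough C analogy (a sketch, not MASM output; the function name `address_of_string` is made up for illustration), taking the address of the first character yields the pointer that lea would place in RCX, and nothing is read from the string while doing so:

```c
/* C counterpart of the StringVar declaration above. */
static const char StringVar[] = "String Variable";

/* Equivalent in spirit to "lea rcx, StringVar": produce the address
   of the first byte without reading the byte itself. */
const char *address_of_string(void) {
    return &StringVar[0];
}
```

Dereferencing the returned pointer is a separate step, just as a separate mov with square brackets would be needed to read the byte at the address held in RCX.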
and, or, xor, not
The logical instructions and, or, xor, and not are all used to perform logical operations on bits.
and
The and instruction performs a bitwise AND operation between two operands and stores the result in the destination operand.
and destination, source
or
The or instruction performs a bitwise OR operation between two operands and stores the result in the destination operand.
or destination, source
xor
The xor instruction performs a bitwise exclusive OR operation between two operands and stores the result in the destination operand. One common use of the xor instruction is to clear a register, which is achieved by XORing the register with itself. The syntax of the xor instruction is as follows:
xor destination, source
not
The not instruction inverts every bit of its operand and stores the result back in that operand. The syntax of the not instruction is as follows:
not destination
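C has operators with the same bitwise meaning, so the four instructions can be sketched as one-line functions. The function names below are hypothetical; each returns the value the destination operand would hold afterwards for 32-bit operands.

```c
#include <stdint.h>

/* and destination, source */
uint32_t and32(uint32_t dst, uint32_t src) { return dst & src; }

/* or destination, source */
uint32_t or32(uint32_t dst, uint32_t src)  { return dst | src; }

/* xor destination, source */
uint32_t xor32(uint32_t dst, uint32_t src) { return dst ^ src; }

/* not destination */
uint32_t not32(uint32_t dst)               { return ~dst; }
```

Calling `xor32(x, x)` returns 0 for any x, which is why `xor eax, eax` is a common way to clear a register.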
jmp
The jmp instruction jumps to the destination operand, which can be a memory address, a register, or a label. It's used for unconditional branching.
jmp destination ; Where 'destination' is where to jump
NOTE: In assembly language, a label is a name given to a specific location in the program's code. It is usually defined by placing a colon (:) at the end of a name or identifier.
Example jmp
.code
main PROC
add eax, 2 ; dummy code
xor ax, 5
mov bx, ax
jmp LabelName ; Jump to execute 'LabelName'
mov eax, 100 ; These instructions won't get executed
mov ebx, 100
LabelName:
xor eax, eax ; LabelName's code
sub ebx, 2
ret
main ENDP
end
jz & jnz
jz and jnz are conditional jump instructions, which allow for conditional execution of code. These instructions work by checking a specific flag in the RFLAGS register.
jz, which stands for "jump if zero", jumps if the Zero Flag is set (1), while jnz ("jump if not zero") jumps if the Zero Flag is clear (0). There are many other conditional jump instructions:
jc - Jump if Carry - Executes the branch if the Carry Flag is set (1).
jnc - Jump if Not Carry - Executes the branch if the Carry Flag is not set (0).
jo - Jump if Overflow - Executes the branch if the Overflow Flag is set (1).
jno - Jump if Not Overflow - Executes the branch if the Overflow Flag is not set (0).
js - Jump if Sign - Executes the branch if the Sign Flag is set (1).
jns - Jump if Not Sign - Executes the branch if the Sign Flag is not set (0).
je - Jump if Equal - Executes the branch if the Zero Flag is set (1).
jne - Jump if Not Equal - Executes the branch if the Zero Flag is not set (0).
ja - Jump if Above - After a cmp, executes the branch if the left operand is greater than the right operand, compared as unsigned values (CF = 0 and ZF = 0).
jae - Jump if Above or Equal - After a cmp, executes the branch if the left operand is greater than or equal to the right operand, compared as unsigned values (CF = 0).
jb - Jump if Below - After a cmp, executes the branch if the left operand is less than the right operand, compared as unsigned values (CF = 1).
jbe - Jump if Below or Equal - After a cmp, executes the branch if the left operand is less than or equal to the right operand, compared as unsigned values (CF = 1 or ZF = 1).
cmp
The cmp instruction, or "compare", is the most useful instruction to execute prior to a conditional jump instruction.
cmp First, Second
The cmp instruction subtracts the second operand from the first operand and sets the condition code flags based on the result of the subtraction. NOTE: It does not store the difference back into the first (destination) operand.
The following cases show how cmp sets the flags based on the values of its operands, treating them as unsigned numbers:
If the first operand is greater than the second operand, the Carry and Zero flags are both cleared.
If the second operand is greater than the first operand, the Carry flag is set (the subtraction needed a borrow) and the Zero flag is cleared.
If the two operands are equal, the Zero flag is set and the Carry and Sign flags are cleared.
The Sign flag always mirrors the most significant bit of the difference, so it is used together with the Overflow flag when the operands are compared as signed numbers.
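The way cmp feeds a conditional jump can be sketched in C. The names `CmpFlags`, `cmp32`, and the `would_*` helpers are hypothetical; the model assumes 32-bit operands and covers only the flags that cmp sets from the subtraction.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical container for the flags cmp produces. */
typedef struct {
    bool cf, zf, sf, of;
} CmpFlags;

/* Model of "cmp first, second" on 32-bit operands. */
CmpFlags cmp32(uint32_t first, uint32_t second) {
    CmpFlags f;
    uint32_t diff = first - second;          /* computed, then discarded by cmp */
    f.cf = first < second;                   /* unsigned borrow */
    f.zf = diff == 0;
    f.sf = (diff >> 31) != 0;                /* most significant bit of the result */
    /* Signed overflow: operands differ in sign and the result's sign
       differs from the first operand's sign. */
    f.of = (((first ^ second) & (first ^ diff)) >> 31) != 0;
    return f;
}

/* Conditions tested by the matching conditional jumps. */
bool would_ja(CmpFlags f) { return !f.cf && !f.zf; }  /* unsigned greater */
bool would_jb(CmpFlags f) { return f.cf; }            /* unsigned less    */
bool would_je(CmpFlags f) { return f.zf; }            /* equal            */
```

For example, `cmp32(3, 5)` borrows, so CF is set and a jb would be taken; the difference is negative as a signed number, so SF is set as well.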
The cmp instruction is usually used in conjunction with a conditional jump. Here's an example of C code whose disassembly shows this:
#include <stdio.h>
#include <stdlib.h> // for rand()
int main() {
int i = rand();
// if "i" is not equal to 10
if (i != 10) {
printf("i != 10 \n");
}
return 0;
}

The following assembly code shows a je instruction directly below a cmp instruction.
push & pop
The push and pop instructions are used to manipulate the stack.
push decrements the stack pointer and then stores a value from a register, memory location, or immediate at the new top of the stack.
push Source
pop loads the value at the top of the stack into the destination register or memory location and then increments the stack pointer.
pop Destination
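The behaviour of push and pop can be sketched with a small C model. Everything here is hypothetical (the `Stack` type, the 16-slot size, the function names), and there is no bounds checking; the aim is only to show that the stack grows downward and that values come back in reverse order.

```c
#include <stdint.h>

#define STACK_SLOTS 16

/* Hypothetical model: sp indexes the current top of the stack.
   The stack grows toward lower indexes, as it does toward lower
   addresses on x86. sp == STACK_SLOTS means the stack is empty. */
typedef struct {
    uint64_t slots[STACK_SLOTS];
    int sp;
} Stack;

void stack_init(Stack *s) {
    s->sp = STACK_SLOTS;
}

/* push: move the stack pointer down one slot, then store the value. */
void stack_push(Stack *s, uint64_t value) {
    s->sp -= 1;
    s->slots[s->sp] = value;
}

/* pop: load the value at the top, then move the stack pointer back up. */
uint64_t stack_pop(Stack *s) {
    uint64_t value = s->slots[s->sp];
    s->sp += 1;
    return value;
}
```

Pushing 1 and then 2 and popping twice returns 2 first and 1 second, which is why registers saved with several push instructions are restored with pop instructions in the opposite order.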
leave
The leave instruction is used to clean up the stack frame before exiting a subroutine or function.
When executed, it first copies the value of the base pointer register (RBP) into the stack pointer register (RSP). It then pops the saved value of the base pointer register from the stack, restoring it to its previous value.
Essentially, the leave instruction performs the same task as the following instructions:
mov rsp, rbp
pop rbp
Memory Access Specifiers
In assembly, memory access specifiers are used to determine the size and the type of data being accessed in memory. These specifiers act like type-casting in a programming language.
The most commonly used Memory Access Specifiers are:
Quadword Pointer - qword ptr
A quadword pointer is used to access a 64-bit data value stored in memory. It is specified using the qword ptr specifier. For instance, to access a 64-bit integer value stored at a particular memory location, use the qword ptr specifier with the mov instruction. Here are two examples:
mov rax, qword ptr [rbx] ; Example 1
mov rax, qword ptr [rsp + 32h] ; Example 2
In the first example, mov loads the 64-bit integer value stored at the memory location pointed to by the rbx register. In the second example, it loads the 64-bit integer value stored at an offset of 32h bytes from the address in the rsp register.
Doubleword Pointer - dword ptr
A doubleword pointer specifies that the memory operand is 32 bits in size. It is used when manipulating data stored in memory, particularly 32-bit integer values. To access a 32-bit integer value stored at a specific memory location, use the dword ptr specifier in the instruction, as shown in the following examples:
mov dword ptr [ebx], 12345678 ; Example 1: stores a 32-bit integer value in memory
mov eax, dword ptr [edx + 4] ; Example 2: loads a 32-bit integer value from memory into the eax register
Byte Pointer - byte ptr
A byte pointer is used to indicate the size of 8-bit data in memory. To access a single byte of data stored at a specific memory location, the byte ptr specifier is used.
mov al, byte ptr [edx + 2] ; Example 1
mov byte ptr [ebx + 8], 55h ; Example 2
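The effect of the size specifier can be sketched in C by reading the same address with three different widths. The `load_*` function names are made up for illustration, and the values mentioned after the code assume a little-endian host such as x86.

```c
#include <stdint.h>
#include <string.h>

/* Like "mov rax, qword ptr [p]": read 8 bytes starting at p. */
uint64_t load_qword(const void *p) {
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Like "mov eax, dword ptr [p]": read 4 bytes starting at p. */
uint32_t load_dword(const void *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Like "mov al, byte ptr [p]": read 1 byte at p. */
uint8_t load_byte(const void *p) {
    uint8_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Given the bytes 88 77 66 55 44 33 22 11 (hex) in memory, the three loads from the first byte see 0x1122334455667788, 0x55667788, and 0x88 respectively: the address is the same, and only the number of bytes read differs.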
Calling Functions
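Calling functions in assembly can happen in a couple of ways:
Calling assembly function via call
The assembly function is invoked with the call instruction, and ret is used to return to the caller.
call power ; push the return address and jump to "power"
power:
push ebp ; save the old base pointer
mov ebp, esp ; make the current stack pointer the new base pointer
; ... body of "power" ...
pop ebp ; restore the old base pointer
ret ; return to the instruction after the call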
Calling the assembly function from C
We can import an assembly function into a C file. The function prototype is declared with the extern keyword, which informs the compiler that the function is defined in another file, such as an .asm file.
Example of calling assembly from C:
/*
main.c file
*/
#include <stdio.h>
extern void SimpleAsmFunc(); // SimpleAsmFunc's prototype. Parameters and function return data type is covered in a later section
int main (){
printf("[i] Calling 'SimpleAsmFunc' ... ");
SimpleAsmFunc();
printf("[+] Done");
return 0;
}
; The asm file that includes the definition of 'SimpleAsmFunc'
.code
SimpleAsmFunc PROC
xor rcx, rcx ; SimpleAsmFunc's code
add rcx, 2
ret
SimpleAsmFunc ENDP
end
Calling a C function from within an assembly file.
To do this, the assembly code must first declare the C function using the externdef directive. This tells the MASM assembler that the function is defined in another file.
externdef symbol_name:type
Here symbol_name is the name of the function and type specifies the kind of symbol.
externdef foo:proc ; This will tell MASM that "foo" is a procedure
Example calling C code from assembly:
/*
main.c file
*/
#include <stdio.h>
// Dummy C function
void SimpleCFunc() {
int i = 100;
i = i * (i + 7) >> 3;
i += i/2;
if (i > 100)
i -= 20;
else
i += 20;
}
int main() {
// You can port "AsmFunc" here and call it
return 0;
}
; The asm file that calls 'SimpleCFunc'
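externdef SimpleCFunc:proc ; Using externdef to declare "SimpleCFunc" as a procedure defined in another file
.code
AsmFunc PROC
sub rsp, 28h ; reserve 32 bytes of shadow space and keep the stack 16-byte aligned
call SimpleCFunc ; Calling SimpleCFunc
add rsp, 28h ; release the reserved stack space
ret
AsmFunc ENDP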
end
Passing Parameters
The first four parameters (if they exist) are passed through the registers RCX, RDX, R8, and R9.
NOTE: If a procedure requires more than four parameters, the remaining ones are passed on the stack. These parameters are known as stack parameters, and the stack must be 16-byte aligned at the point of the call.
IMPORTANT: The first stack parameter (5th procedure parameter) is located at a specific offset from the rsp register, depending on the function's calling convention. On entry to a 64-bit Windows MASM function, the fifth parameter is at [rsp + 40]: 8 bytes for the return address plus 32 bytes of shadow space reserved for the four register parameters. Each further parameter follows 8 bytes later.
Example passing parameters:
AsmFunc11Parms PROC
; RCX => Parm1
; RDX => Parm2
; R8 => Parm3
; R9 => Parm4
mov rax, qword ptr [rsp + 40] ; Parm5
mov rax, qword ptr [rsp + 48] ; Parm6
mov rax, qword ptr [rsp + 56] ; Parm7
mov rax, qword ptr [rsp + 64] ; Parm8
mov rax, qword ptr [rsp + 72] ; Parm9
mov rax, qword ptr [rsp + 80] ; Parm10
mov rax, qword ptr [rsp + 88] ; Parm11
ret
AsmFunc11Parms ENDP
Calling AsmFunc11Parms from C is done below:
#include <Windows.h>
extern int AsmFunc11Parms(PVOID Parm1, PVOID Parm2, PVOID Parm3, PVOID Parm4, PVOID Parm5, PVOID Parm6, PVOID Parm7, PVOID Parm8, PVOID Parm9, PVOID Parm10, PVOID Parm11);
int main() {
AsmFunc11Parms(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);
return 0;
}
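The offsets used in AsmFunc11Parms follow a simple pattern, sketched below in C. The function name `stack_param_offset` is hypothetical, and the arithmetic assumes the Windows x64 convention with RSP read on entry to the procedure, before anything else has been pushed.

```c
#include <stddef.h>

/* Offset from RSP, on entry to the callee, of parameter n (n >= 5)
   under the Windows x64 calling convention. */
size_t stack_param_offset(unsigned n) {
    const size_t return_address = 8;   /* pushed by the call instruction */
    const size_t shadow_space   = 32;  /* home space for RCX, RDX, R8, R9 */
    return return_address + shadow_space + (size_t)(n - 5) * 8;
}
```

With these assumptions, parameter 5 lands at offset 40 and parameter 11 at offset 88, matching the [rsp + 40] and [rsp + 88] operands in the procedure above.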
Returning Value
When an assembly procedure returns a value, it is stored in the RAX register. Before executing the ret instruction, the procedure places the value in RAX, where the caller reads it after the call returns.
The following AddtwoNumbers procedure takes two parameters and returns their sum.
AddtwoNumbers PROC
mov rax, rcx ; Moving the 1st parameter to RAX
add rax, rdx ; Add the 2nd parameter to the value in RAX
ret ; return (RAX here is RCX + RDX)
AddtwoNumbers ENDP