1. Moving data

Before we see how to move data in assembly, there is a very important point in the Intel manual that we need to notice.

In 64-bit mode, the operand size determines the number of valid bits in the destination general-purpose register:

  • 64-bit operands generate a 64-bit result in the destination general-purpose register.
  • 32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register.
  • 8-bit and 16-bit operands generate an 8-bit or 16-bit result. The upper 56 or 48 bits (respectively) of the destination general-purpose register are not modified by the operation. If the result of an 8-bit or 16-bit operation is intended for a 64-bit address computation, explicitly extend the register to the full 64 bits.
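
The three rules above can be mimicked with plain integer arithmetic. The following C sketch is only a model of the behavior described by the manual, not real register access; the function names (write64, write32, write16, write8_low, write8_high) are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of one 64-bit general-purpose register written with different
   operand sizes. All function names here are hypothetical. */

/* 64-bit write (e.g. mov rax, imm64): the whole register is replaced. */
uint64_t write64(uint64_t reg, uint64_t value) {
    (void)reg;              /* the old content is completely discarded */
    return value;
}

/* 32-bit write (e.g. mov eax, imm32): the result is zero-extended,
   so the upper 32 bits of the old content are lost. */
uint64_t write32(uint64_t reg, uint32_t value) {
    (void)reg;
    return (uint64_t)value;
}

/* 16-bit write (e.g. mov ax, imm16): the upper 48 bits are preserved. */
uint64_t write16(uint64_t reg, uint16_t value) {
    return (reg & ~UINT64_C(0xFFFF)) | value;
}

/* 8-bit write to the low byte (e.g. mov al, imm8): the upper 56 bits
   are preserved. */
uint64_t write8_low(uint64_t reg, uint8_t value) {
    return (reg & ~UINT64_C(0xFF)) | value;
}

/* 8-bit write to bits 8..15 (e.g. mov ah, imm8): every other bit is
   preserved. */
uint64_t write8_high(uint64_t reg, uint8_t value) {
    return (reg & ~UINT64_C(0xFF00)) | ((uint64_t)value << 8);
}
```

With this model, write32 applied to 0xaabbccddeeff1122 with the value 0x33445566 gives 0x0000000033445566, whereas write16 applied to that result with 0xaabb gives 0x000000003344aabb: only the 32-bit write clears the bits it does not cover.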

Because the upper 32 bits of the 64-bit general-purpose registers are undefined in 32-bit modes, the upper 32 bits of any general-purpose register are not preserved when switching from 64-bit mode to a 32-bit mode (protected mode or compatibility mode). Software must not depend on these bits to keep a value after a switch from 64-bit to 32-bit mode.

All this may seem quite abstract, but it will become clearer with practice, a little further on in this tutorial.

1.1 What are the instructions for moving data?

On x86 and x86-64, the most commonly used instruction for this is:

  • MOV

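This instruction moves data in several ways (the destination is always the first operand):

  • mov register, register
  • mov register, immediateValue
  • mov register, memory
  • mov memory, immediateValue
  • mov memory, register

An immediate value can never be a destination, and mov cannot copy directly from memory to memory.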

It is also possible to move data with the instruction:

  • XCHG

This instruction, as its name suggests, swaps the contents of its two operands:

  • xchg register, memory
  • xchg register, register

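Finally, there remains a very important instruction:

  • LEA (Load Effective Address)

This instruction computes the address of its memory operand and stores that address in the destination register, without reading the memory itself. It is typically used to load a pointer to the data referenced by a label:

  • lea rax, [label]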
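
To make the difference between lea and mov concrete, here is a small C model. It is a sketch under simplifying assumptions: memory is a plain byte array, addresses are offsets into it, and the names lea and mov_load64 are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of a memory operand [base + index*scale + disp].
   lea stores this number in the destination and never touches memory.
   On real hardware the scale can only be 1, 2, 4 or 8. */
uint64_t lea(uint64_t base, uint64_t index, uint64_t scale, int64_t disp) {
    return base + index * scale + (uint64_t)disp;
}

/* Model of "mov reg64, [address]": reads the 8 bytes stored at the
   address, in little-endian order (lowest address = lowest byte).
   The caller must make sure address + 7 is inside the array. */
uint64_t mov_load64(const uint8_t *memory, uint64_t address) {
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | memory[address + (uint64_t)i];
    }
    return value;
}
```

For the same operand, lea returns where the data lives while mov_load64 returns what is stored there. This is why lea rax, [label] gives a pointer and mov rax, [label] gives a value.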

1.2 Practice

To better understand what is explained above, let's test the following code:

global _start

section .text
_start:

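    ; mov register, immediate value
    mov rax, 0xaabbccddeeff1122 ; rax = 0xaabbccddeeff1122
    mov eax, 0x33445566         ; rax = 0x0000000033445566 (upper 32 bits zeroed)
    mov ax, 0xaabb              ; rax = 0x000000003344aabb (upper 48 bits kept)
    mov al, 0x88                ; rax = 0x000000003344aa88 (upper 56 bits kept)
    mov ah, 0x99                ; rax = 0x0000000033449988 (only bits 8..15 change)

    mov ecx, 0xaabbccdd         ; rcx = 0x00000000aabbccdd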

    ; mov register, register
    mov rbx, rax
    mov al, dl
    mov dx, bx

    ; mov register, memory
    mov rax, [var1]
    mov ebx, [var2]
    mov cx,[var3]

    mov rbx, 0xbbccddeeff112233
    ; mov memory, register
    mov [var1], rbx
    mov word [var2], cx
    mov dword [var1], ecx

    ; mov memory, immediateValue
    mov byte [var1], 0xbb

    ; lea
    lea rdx, [var1]
    lea eax, [var2]
    lea bx, [var3]

    ; xchg register, register
    mov rax, 0xaabbccddeeff1122
    mov rbx, 0x1122334455667788
    xchg rax, rbx

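    ; xchg register, memory
    xchg rax, [var1]

    ; exit(0), so that the program ends cleanly when run outside a debugger
    mov rax, 60
    xor rdi, rdi
    syscall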

section .data
    var1: db 0x11, 0x22, 0x33, 0x44, 0x55, 0xAA
    var2: dq 0xaabbccddeeff88
    var3: dw 'hello'

2. How does the stack work?

We now turn to a very important aspect of how a program works: the stack.
The stack is a memory area of variable size that stores a lot of useful information. Its main advantage is speed: reserving or releasing space only requires changing RSP, which makes it the area of choice for temporary data such as local variables and function parameters (on 32-bit x86).
The stack works in LIFO (Last In, First Out) mode: the last element pushed onto the stack is the first one to come off it. Note also that the stack grows from high addresses toward low addresses.

2.1 How to manipulate the stack?

To manipulate the stack, there are a few assembly instructions:

  • push
  • pop
  • any operation on the RSP register (mov, add, sub, etc.)

The top of the stack is pointed to by the RSP register. PUSH and POP update RSP automatically, so it always points to the last element placed on the stack.
Here is a small diagram to understand the operation:
[Diagram: the stack before the PUSH, with RSP pointing to the top of the stack]
When data is added to the stack with the PUSH instruction, RSP is first decreased by 8 bytes and the value is then written at the new top of the stack:
[Diagram: the stack after the PUSH, with RSP pointing to the new element]
When data is removed with the POP instruction, the value at the top is read and RSP is increased by 8 bytes, so it returns to the position shown in the first diagram.
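
The same mechanism can be sketched in C. This is only a model, assuming a tiny 64-byte stack and 8-byte elements; the type Stack and the functions stack_init, push64 and pop64 are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

typedef struct {
    uint8_t  memory[STACK_BYTES]; /* simulated stack memory */
    uint64_t rsp;                 /* offset of the top of the stack */
} Stack;

/* An empty stack has rsp just past the highest address. */
void stack_init(Stack *s) {
    memset(s->memory, 0, sizeof s->memory);
    s->rsp = STACK_BYTES;
}

/* push: move rsp down by 8 bytes, then store the value there. */
void push64(Stack *s, uint64_t value) {
    assert(s->rsp >= 8);                  /* model only: refuse to overflow */
    s->rsp -= 8;
    memcpy(&s->memory[s->rsp], &value, 8);
}

/* pop: read the value at rsp, then move rsp up by 8 bytes. */
uint64_t pop64(Stack *s) {
    uint64_t value;
    assert(s->rsp + 8 <= STACK_BYTES);    /* model only: refuse to underflow */
    memcpy(&value, &s->memory[s->rsp], 8);
    s->rsp += 8;
    return value;
}
```

Pushing two values and popping twice returns them in reverse order (LIFO), and rsp ends up exactly where it started. The popped bytes are not erased: they simply sit below rsp until something overwrites them.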

2.2 Practice

As before, it is important to practice. To do so, here is a small program:

section .text
global _start

_start:
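    mov rax, 0x1122334455667788
    push rax            ; rsp -= 8, [rsp] = 0x1122334455667788
    push var1           ; push the address of var1
    push qword [var1]   ; push the 8 bytes stored at var1
    pop rcx             ; rcx = the 8 bytes read from var1
    pop rbx             ; rbx = address of var1

    ; exit(0); the first value pushed is still on the stack, which is harmless here
    mov rax, 60
    xor rdi, rdi
    syscall

section .data
    var1: db 0xaa, 0xbb, 0xcc, 0xdd, 0x11, 0x22, 0x33, 0x44 ; 8 bytes, so the qword push reads defined data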

3. Procedure + stack frame

Now that you're comfortable with how the stack works, let's look at two closely related topics: procedures and the stack frame.

3.1 What is a procedure?

A procedure is a sequence of instructions that can be compared to a function in a higher-level language. Like any function, a procedure can be called from anywhere in the code.
To call a procedure, you simply write:

  • call procedureName

When using procedures, we must not forget the ret instruction, which returns to the caller so that execution resumes right after the call.

The ret instruction behaves like a pop rip. Before going further, we must understand how this works.
When the CPU executes a call, it automatically pushes the return address (the address of the instruction following the call) onto the stack and then jumps to the procedure. The ret instruction later pops that address to resume normal execution of the program.
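
The pairing of call and ret can be modeled in a few lines of C. This is a simplified sketch, not how a CPU is implemented: the Cpu structure, do_call, do_ret and the call_length parameter are all assumptions made for the illustration, and the stack is counted in 8-byte slots rather than bytes.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 16

typedef struct {
    uint64_t rip;                /* address of the instruction being executed */
    uint64_t rsp;                /* index of the top slot, STACK_SLOTS when empty */
    uint64_t stack[STACK_SLOTS]; /* one slot models 8 bytes of stack */
} Cpu;

/* call target: push the address of the instruction that follows the
   call, then jump to the target. call_length is the size in bytes of
   the call instruction itself. */
void do_call(Cpu *cpu, uint64_t target, uint64_t call_length) {
    assert(cpu->rsp > 0);                 /* model only: stack not full */
    uint64_t return_address = cpu->rip + call_length;
    cpu->rsp -= 1;
    cpu->stack[cpu->rsp] = return_address;
    cpu->rip = target;
}

/* ret: pop the return address from the stack into rip. */
void do_ret(Cpu *cpu) {
    assert(cpu->rsp < STACK_SLOTS);       /* model only: stack not empty */
    cpu->rip = cpu->stack[cpu->rsp];
    cpu->rsp += 1;
}
```

For instance, a 5-byte call located at 0x4004d0 that targets 0x400487 leaves 0x4004d5 on top of the model stack, and do_ret then puts that value back into rip. If the procedure changes the top of the stack without restoring it, ret jumps to whatever value it finds there.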

3.1.1. Example

int add(int a, int b)
{
    return a+b;
}

int main(int ac, char **av)
{
    int a = 1;
    int b = 2;
    int c = 0;

    c = add(a, b);
    return 0;
}

We will analyze how this works with gdb:

Dump of assembler code for function main:
   0x00000000004004a9 <+0>: push   rbp
   0x00000000004004aa <+1>: mov    rbp,rsp
   0x00000000004004ad <+4>: sub    rsp,0x20
   0x00000000004004b1 <+8>: mov    DWORD PTR [rbp-0x14],edi
   0x00000000004004b4 <+11>:    mov    QWORD PTR [rbp-0x20],rsi
   0x00000000004004b8 <+15>:    mov    DWORD PTR [rbp-0xc],0x1
   0x00000000004004bf <+22>:    mov    DWORD PTR [rbp-0x8],0x2
   0x00000000004004c6 <+29>:    mov    edx,DWORD PTR [rbp-0x8]
   0x00000000004004c9 <+32>:    mov    eax,DWORD PTR [rbp-0xc]
   0x00000000004004cc <+35>:    mov    esi,edx
   0x00000000004004ce <+37>:    mov    edi,eax
   0x00000000004004d0 <+39>:    call   0x400487 <add> <--------- PUT YOUR BREAKPOINT HERE
   0x00000000004004d5 <+44>:    mov    DWORD PTR [rbp-0x4],eax
   0x00000000004004d8 <+47>:    mov    eax,0x0
   0x00000000004004dd <+52>:    leave
   0x00000000004004de <+53>:    ret
End of assembler dump.

Now let's look at the state of the stack at the breakpoint, just before the call is executed:

gdb-peda$ x/30wx $rsp
0x7fffffffdc80: 0xffffdd88  0x00007fff  0x004003b0  0x00000001
0x7fffffffdc90: 0xffffdd80  0x00000001  0x00000002  0x00000000
0x7fffffffdca0: 0x004004e0  0x00000000  0xf7a44021  0x00007fff
0x7fffffffdcb0: 0x00040000  0x00000000  0xffffdd88  0x00007fff
0x7fffffffdcc0: 0xf7b9b088  0x00000001  0x004004a9  0x00000000
0x7fffffffdcd0: 0x00000000  0x00000000  0x452aded6  0x688910f0
0x7fffffffdce0: 0x004003b0  0x00000000  0xffffdd80  0x00007fff
0x7fffffffdcf0: 0x00000000  0x00000000

Now we step into the add function and look at the stack again:

gdb-peda$ x/30wx $rsp
0x7fffffffdc78: 0x004004d5  0x00000000  0xffffdd88  0x00007fff
0x7fffffffdc88: 0x004003b0  0x00000001  0xffffdd80  0x00000001
0x7fffffffdc98: 0x00000002  0x00000000  0x004004e0  0x00000000
0x7fffffffdca8: 0xf7a44021  0x00007fff  0x00040000  0x00000000
0x7fffffffdcb8: 0xffffdd88  0x00007fff  0xf7b9b088  0x00000001
0x7fffffffdcc8: 0x004004a9  0x00000000  0x00000000  0x00000000
0x7fffffffdcd8: 0x452aded6  0x688910f0  0x004003b0  0x00000000
0x7fffffffdce8: 0xffffdd80  0x00007fff

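Note that RSP has moved down by 8 bytes (from 0x7fffffffdc80 to 0x7fffffffdc78) and that the top of the stack now holds 0x00000000004004d5, the address of the instruction following the call.
This is a very important notion that will be used extensively when developing shellcode.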

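3.2 How to pass arguments to a procedure?

Passing arguments to a procedure is relatively simple. They can be passed in several ways:

  • In registers
  • On the stack (commonly used on 32-bit x86)
  • As data structures in memory, referenced by a register or by a value on the stack

On 64-bit Linux, the first integer arguments are passed in registers, starting with rdi and rsi, which is why main above copies a and b into edi and esi before the call.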

3.3 What is a stack frame?

To cope with the many unknowns of the runtime environment, functions are often set up with a "stack frame" that gives access to both the function parameters and the local variables. The idea behind the stack frame is that each subroutine can act independently of its location on the stack, and each subroutine can act as if it were at the top of the stack.

When a function is called, a new stack frame is created at the current RSP location. A stack frame acts as a partition on the stack. All elements belonging to the previous functions are higher on the stack and should not be changed. The current function has access to the rest of the stack, from its stack frame to the end of the stack. The current function always has access to the "top" of the stack, so functions do not need to take into account the memory used by other functions or programs.

The stack frame is managed by two short instruction sequences:

  • Prologue => push rbp; mov rbp, rsp
  • Epilogue => mov rsp, rbp; pop rbp (the leave instruction does the same thing)
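
The effect of the prologue and the epilogue on RSP and RBP can be followed with a small C model. It is a sketch only: the Cpu structure and the functions push_slot, pop_slot, prologue and epilogue are hypothetical, the stack is counted in 8-byte slots, and the local_slots parameter stands for an optional sub rsp that reserves room for local variables.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 32

typedef struct {
    uint64_t rsp;                /* index of the top slot, STACK_SLOTS when empty */
    uint64_t rbp;                /* frame pointer of the current function */
    uint64_t stack[STACK_SLOTS]; /* one slot models 8 bytes of stack */
} Cpu;

void push_slot(Cpu *cpu, uint64_t value) {
    assert(cpu->rsp > 0);
    cpu->rsp -= 1;
    cpu->stack[cpu->rsp] = value;
}

uint64_t pop_slot(Cpu *cpu) {
    assert(cpu->rsp < STACK_SLOTS);
    uint64_t value = cpu->stack[cpu->rsp];
    cpu->rsp += 1;
    return value;
}

/* Prologue: push rbp; mov rbp, rsp; then optionally sub rsp, N. */
void prologue(Cpu *cpu, uint64_t local_slots) {
    push_slot(cpu, cpu->rbp);    /* save the caller's frame pointer */
    cpu->rbp = cpu->rsp;         /* the new frame starts here */
    assert(cpu->rsp >= local_slots);
    cpu->rsp -= local_slots;     /* reserve room for local variables */
}

/* Epilogue: mov rsp, rbp; pop rbp (the job of the leave instruction). */
void epilogue(Cpu *cpu) {
    cpu->rsp = cpu->rbp;         /* discard the local variables */
    cpu->rbp = pop_slot(cpu);    /* restore the caller's frame pointer */
}
```

After epilogue, rsp and rbp have the values they had on entry, whatever the function pushed or reserved in between. The return address is therefore back on top of the stack, ready for ret.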

4. Example

Let's compile the following C code and look at how it works:

int add(int a, int b)
{
    return a+b;
}

int main(int ac, char **av)
{
    int a = 1;
    int b = 2;
    int c = 0;

    c = add(a, b);
    return 0;
}

If we look at the function add with gdb, we see:

gdb-peda$ disass add
Dump of assembler code for function add:
   0x0000000000400487 <+0>: push   rbp          ; prologue
   0x0000000000400488 <+1>: mov    rbp,rsp      ; prologue
=> 0x000000000040048b <+4>: mov    DWORD PTR [rbp-0x4],edi
   0x000000000040048e <+7>: mov    DWORD PTR [rbp-0x8],esi
   0x0000000000400491 <+10>:    mov    edx,DWORD PTR [rbp-0x4]
   0x0000000000400494 <+13>:    mov    eax,DWORD PTR [rbp-0x8]
   0x0000000000400497 <+16>:    add    eax,edx
   0x0000000000400499 <+18>:    pop    rbp          ; epilogue
   0x000000000040049a <+19>:    ret    
End of assembler dump.

Note that add never moves RSP (there is no sub rsp and no call to another function), so the compiler does not need mov rsp, rbp: the epilogue is reduced to pop rbp.

5. Practice

Now we are going to write our own assembly procedure to better understand what's going on:

section .text
global _start

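_start:
    mov rax, 0x1122334455667788
    mov rsi, 0x6f6c6c6568   ; "hello" in little-endian byte order
    push rsi                ; store the string on the stack
    mov rsi, rsp            ; rsi = address of the string (2nd argument of write)
    call myproc

    ; exit(0)
    mov rax, 60             ; syscall number of exit
    xor rdi, rdi            ; exit status 0 (rdi still holds 1 after myproc)
    syscall

myproc:
    push rbp                ; prologue
    mov rbp, rsp

    ; write(1, rsi, 5)
    mov rax, 0x1            ; syscall number of write
    mov rdi, 0x1            ; file descriptor 1 (stdout)
    mov rdx, 0x5            ; number of bytes to write
    syscall

    ; leave is equivalent to:
    ;   mov rsp, rbp
    ;   pop rbp
    leave                   ; epilogue
    ret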