1. How does an operating system work?
Before going into the details of assembly language, it is useful to briefly review how a computer system is organized.
1.1 Control Unit
The control unit (CU) is a component of a computer's central processing unit (CPU) that directs the operation of the processor. It tells the computer's memory, arithmetic/logic unit and input and output devices how to respond to a program's instructions.
It directs the operation of the other units by providing timing and control signals. Most computer resources are managed by the CU. It directs the flow of data between the CPU and the other devices. In modern computer designs, the control unit is typically an internal part of the CPU, with its overall role and operation unchanged since its introduction.
1.2 Arithmetic and Logic Unit (ALU)
An arithmetic logic unit (ALU) is a combinational digital electronic circuit that performs arithmetic and bitwise operations on integer binary numbers. This is in contrast to a floating-point unit (FPU), which operates on floating point numbers. An ALU is a fundamental building block of many types of computing circuits, including the central processing unit (CPU) of computers, FPUs, and graphics processing units (GPUs). A single CPU, FPU or GPU may contain multiple ALUs.
The inputs to an ALU are the data to be operated on, called operands, and a code indicating the operation to be performed; the ALU's output is the result of the performed operation. In many designs, the ALU also has status inputs or outputs, or both, which convey information about a previous operation or the current operation, respectively, between the ALU and external status registers.
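The description above can be illustrated with a toy model. This is only a sketch: the function name `alu`, the operation names and the two flags are assumptions chosen for illustration, and a real ALU is a hardware circuit rather than a function.

```python
def alu(op, a, b, width=8):
    """Toy ALU: apply `op` to two unsigned integers of `width` bits.

    Returns (result, flags), where flags holds the status outputs.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    if op == "ADD":
        raw = a + b
    elif op == "SUB":
        raw = a - b
    elif op == "AND":
        raw = a & b
    elif op == "OR":
        raw = a | b
    elif op == "XOR":
        raw = a ^ b
    else:
        raise ValueError(f"unknown operation: {op}")
    result = raw & mask  # keep only the bits that fit in the register
    flags = {
        "zero": result == 0,
        "carry": raw != result,  # the true result did not fit (carry or borrow)
    }
    return result, flags


print(alu("ADD", 200, 100))        # 300 does not fit in 8 bits
print(alu("XOR", 0b1010, 0b1010))  # identical operands give zero
```

The operands and the operation code are the inputs, the returned value is the output, and the `flags` dictionary plays the role of the status outputs that would be copied into a status register.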
1.3 Registers
Registers are very fast storage locations inside the processor itself.
There are many registers, including:
- Memory Address Register (MAR): holds the address of the memory location being accessed
- Memory Data Register (MDR): holds data just read from or about to be written to memory
- Program Counter (PC): holds the address of the next instruction to be fetched
- Instruction Register (IR): holds the current instruction being executed
- General Purpose Registers: can be used freely by the programmer
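To see how these registers cooperate, here is a minimal sketch of a fetch/execute loop. The machine is hypothetical: the instruction names (`LOAD`, `ADD`, `STORE`, `HALT`), the single accumulator `ACC` standing in for a general purpose register, and the memory layout are all assumptions made for the example.

```python
def run(memory):
    """Run a toy program stored in `memory` and return the final registers."""
    regs = {"PC": 0, "MAR": 0, "MDR": 0, "IR": None, "ACC": 0}
    while True:
        # Fetch: the PC supplies the address, the MDR receives the word read.
        regs["MAR"] = regs["PC"]
        regs["MDR"] = memory[regs["MAR"]]
        regs["IR"] = regs["MDR"]
        regs["PC"] += 1  # the PC now points at the next instruction

        # Decode and execute the instruction held in the IR.
        opcode, *operand = regs["IR"]
        if opcode == "HALT":
            return regs
        regs["MAR"] = operand[0]
        if opcode == "LOAD":
            regs["MDR"] = memory[regs["MAR"]]
            regs["ACC"] = regs["MDR"]
        elif opcode == "ADD":
            regs["MDR"] = memory[regs["MAR"]]
            regs["ACC"] += regs["MDR"]
        elif opcode == "STORE":
            regs["MDR"] = regs["ACC"]
            memory[regs["MAR"]] = regs["MDR"]
        else:
            raise ValueError(f"unknown opcode: {opcode}")


program = [
    ("LOAD", 4),   # ACC = memory[4]
    ("ADD", 5),    # ACC += memory[5]
    ("STORE", 6),  # memory[6] = ACC
    ("HALT",),
    2, 3, 0,       # data stored at addresses 4, 5 and 6
]
final = run(program)
print(program[6], final["PC"])
```

Every memory access goes through the MAR and MDR, while the PC always holds the address of the instruction that will be fetched next.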
2. Organization of a process
Now that we have a better understanding of how a CPU works, let us look at how a Linux system organizes a process.
What is a process?
A process is an instance of a computer program that is being executed. It contains the program code and its current activity. Depending on the operating system (OS), a process may be made up of multiple threads of execution that execute instructions concurrently.
A computer program is a passive collection of instructions, while a process is the actual execution of those instructions. Several processes may be associated with the same program; for example, opening up several instances of the same program often means more than one process is being executed.
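The difference between a program and a process can be observed directly. The sketch below starts the same program several times and collects the identifier of each resulting process. It assumes the current Python interpreter as "the program"; the helper name `spawn` is made up for the example.

```python
import subprocess
import sys


def spawn(count):
    """Start `count` processes from the same program and return their PIDs."""
    # Assumption: the running Python interpreter stands in for "the program".
    children = [
        subprocess.Popen(
            [sys.executable, "-c", "import os; print(os.getpid())"],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(count)
    ]
    pids = []
    for child in children:
        out, _ = child.communicate()  # wait for the child and read its output
        pids.append(int(out))
    return pids


pids = spawn(3)
print(pids)
```

One program on disk yields three separate processes, each reporting its own identifier.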
On a Linux system, there are several ways to view information about a process:
- with GDB
When a program is executed, the kernel assigns it a unique identifier called the PID (process identifier). All information about this process is available in the "/proc/$pid/" folder. Let's take a closer look at the contents of this directory.
$ ping localhost&
31089
$ ls /proc/31089/
attr cgroup comm cwd fd io map_files mountinfo net oom_adj pagemap root setgroups stack status timers wchan
autogroup clear_refs coredump_filter environ fdinfo limits maps mounts ns oom_score personality schedstat smaps stat syscall timerslack_ns
auxv cmdline cpuset exe gid_map loginuid mem mountstats numa_maps oom_score_adj projid_map sessionid smaps_rollup statm task uid_map
As we can see, the folder contains a lot of files, but we will focus on the memory mapping of our process.
$ cat /proc/31089/maps
555555554000-555555562000 r-xp 00000000 fe:02 24647249 /usr/bin/iputils-ping
555555761000-555555762000 r--p 0000d000 fe:02 24647249 /usr/bin/iputils-ping
555555762000-555555763000 rw-p 0000e000 fe:02 24647249 /usr/bin/iputils-ping
555555763000-5555557a7000 rw-p 00000000 00:00 0 [heap]
7ffff7024000-7ffff702f000 r-xp 00000000 fe:02 24650579 /usr/lib/libnss_files-2.26.so
7ffff702f000-7ffff722e000 ---p 0000b000 fe:02 24650579 /usr/lib/libnss_files-2.26.so
7ffff722e000-7ffff722f000 r--p 0000a000 fe:02 24650579 /usr/lib/libnss_files-2.26.so
7ffff722f000-7ffff7230000 rw-p 0000b000 fe:02 24650579 /usr/lib/libnss_files-2.26.so
7ffff7230000-7ffff7236000 rw-p 00000000 00:00 0
7ffff7236000-7ffff73e1000 r-xp 00000000 fe:02 24650591 /usr/lib/libc-2.26.so
7ffff73e1000-7ffff75e1000 ---p 001ab000 fe:02 24650591 /usr/lib/libc-2.26.so
7ffff75e1000-7ffff75e5000 r--p 001ab000 fe:02 24650591 /usr/lib/libc-2.26.so
7ffff75e5000-7ffff75e7000 rw-p 001af000 fe:02 24650591 /usr/lib/libc-2.26.so
7ffff75e7000-7ffff75eb000 rw-p 00000000 00:00 0
7ffff75eb000-7ffff75fe000 r-xp 00000000 fe:02 24650563 /usr/lib/libresolv-2.26.so
7ffff75fe000-7ffff77fe000 ---p 00013000 fe:02 24650563 /usr/lib/libresolv-2.26.so
7ffff77fe000-7ffff77ff000 r--p 00013000 fe:02 24650563 /usr/lib/libresolv-2.26.so
7ffff77ff000-7ffff7800000 rw-p 00014000 fe:02 24650563 /usr/lib/libresolv-2.26.so
7ffff7800000-7ffff7802000 rw-p 00000000 00:00 0
7ffff7802000-7ffff79ac000 r-xp 00000000 fe:02 24661577 /usr/lib/libcrypto.so.42.0.0
7ffff79ac000-7ffff7bab000 ---p 001aa000 fe:02 24661577 /usr/lib/libcrypto.so.42.0.0
7ffff7bab000-7ffff7bc8000 r--p 001a9000 fe:02 24661577 /usr/lib/libcrypto.so.42.0.0
7ffff7bc8000-7ffff7bce000 rw-p 001c6000 fe:02 24661577 /usr/lib/libcrypto.so.42.0.0
7ffff7bce000-7ffff7bd2000 rw-p 00000000 00:00 0
7ffff7bd2000-7ffff7bd6000 r-xp 00000000 fe:02 24647647 /usr/lib/libcap.so.2.25
7ffff7bd6000-7ffff7dd6000 ---p 00004000 fe:02 24647647 /usr/lib/libcap.so.2.25
7ffff7dd6000-7ffff7dd7000 r--p 00004000 fe:02 24647647 /usr/lib/libcap.so.2.25
7ffff7dd7000-7ffff7dd8000 rw-p 00005000 fe:02 24647647 /usr/lib/libcap.so.2.25
7ffff7dd8000-7ffff7dfd000 r-xp 00000000 fe:02 24650573 /usr/lib/ld-2.26.so
7ffff7fd8000-7ffff7fdc000 rw-p 00000000 00:00 0
7ffff7ff7000-7ffff7ffa000 r--p 00000000 00:00 0 [vvar]
7ffff7ffa000-7ffff7ffc000 r-xp 00000000 00:00 0 [vdso]
7ffff7ffc000-7ffff7ffd000 r--p 00024000 fe:02 24650573 /usr/lib/ld-2.26.so
7ffff7ffd000-7ffff7ffe000 rw-p 00025000 fe:02 24650573 /usr/lib/ld-2.26.so
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
|Address||Start and end address of the section|
|Permission||Permissions on the section: r (read), w (write), x (execute), then p (private) or s (shared)|
|Offset||Offset in the file for memory-mapped files, 0 otherwise|
|Device||Major:minor number of the device the file was loaded from|
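As an illustration of these fields, the following sketch splits one line of a maps file into its columns. The names `Mapping` and `parse_maps_line` are hypothetical helpers written for this example. The last two columns of each line, which it also extracts, are the inode number and the path of the mapped file (empty for anonymous mappings).

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    device: str
    inode: int
    path: str


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into its fields."""
    # The path is optional, so split at most 5 times and default it to "".
    parts = line.split(None, 5)
    addresses, perms, offset, device, inode = parts[:5]
    path = parts[5].strip() if len(parts) > 5 else ""
    # Addresses and the offset are written in hexadecimal.
    start, end = (int(x, 16) for x in addresses.split("-"))
    return Mapping(start, end, perms, int(offset, 16), device, int(inode), path)


line = "555555763000-5555557a7000 rw-p 00000000 00:00 0 [heap]"
m = parse_maps_line(line)
print(hex(m.end - m.start), m.perms, m.path)
```

Subtracting the start address from the end address gives the size of the section, here the heap of the ping process shown earlier.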
Now that we have a little more knowledge about CPU operation and process management, it's time to start our adventure in the world of assembly. We will first look at the registers available on the x86_64 architecture and what they are used for.
|64 bits||Lower 32 bits||Lower 16 bits||Higher 8 bits of 16 bits||Lower 8 bits of 16 bits||Usage|
|RAX||EAX||AX||AH||AL||General purpose / Function return value / Syscall number / Arithmetic operation result|
|RCX||ECX||CX||CH||CL||General purpose / Loop counter / Fourth parameter of function|
|RDX||EDX||DX||DH||DL||General purpose / Remainder of divisions and upper half of multiplication results / Third parameter of function|
|RSI||ESI||SI||N/A||SIL||String source / Second parameter of function|
|RDI||EDI||DI||N/A||DIL||String destination / First parameter of function|
|RBP||EBP||BP||N/A||BPL||Stack base pointer|
|RIP||N/A||N/A||N/A||N/A||Next instruction to be executed|
|R8||R8D||R8W||N/A||R8B||General purpose / Fifth parameter of function|
|R9||R9D||R9W||N/A||R9B||General purpose / Sixth parameter of function|
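The narrower names in the table are not separate registers: they are views onto part of the same 64-bit register. The sketch below models this with bit masks and shifts, taking RAX as the example. The function name `sub_registers` is made up for illustration.

```python
def sub_registers(rax):
    """Return the narrower views of a 64-bit RAX value."""
    return {
        "RAX": rax & 0xFFFFFFFFFFFFFFFF,
        "EAX": rax & 0xFFFFFFFF,   # lower 32 bits
        "AX": rax & 0xFFFF,        # lower 16 bits
        "AH": (rax >> 8) & 0xFF,   # higher 8 bits of AX
        "AL": rax & 0xFF,          # lower 8 bits of AX
    }


for name, value in sub_registers(0x1122334455667788).items():
    print(f"{name:>3} = {value:#x}")
```

With the value 0x1122334455667788 in RAX, EAX reads 0x55667788, AX reads 0x7788, AH reads 0x77 and AL reads 0x88.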
EFLAGS is a register used as a collection of bits representing Boolean values that store the results of operations and the state of the processor.
For example, the zero flag (ZF) is set to 1 when a comparison finds that its two operands are equal. We will see the use of the flags in more detail during this tutorial.
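A comparison is a subtraction whose result is thrown away and of which only the flags are kept. The sketch below models this for three flags. The function name `cmp_flags` is an assumption, and the model leaves out the other flags a real processor updates.

```python
def cmp_flags(a, b, width=64):
    """Model `cmp a, b`: compute a - b, keep the flags, discard the result."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": int(result == 0),       # zero flag: the operands were equal
        "SF": result >> (width - 1),  # sign flag: top bit of the result
        "CF": int(a < b),             # carry flag: unsigned borrow occurred
    }


print(cmp_flags(42, 42))
print(cmp_flags(1, 2))
```

Comparing 42 with 42 sets ZF to 1, while comparing 1 with 2 leaves ZF at 0 and sets both SF and CF.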
When using gdb with the PEDA extension, here is what you can see:
Thanks to extensions, it becomes easier to find interesting information.
5. Sections / Segments
The primary purpose of segment registers is to keep the location of specific segments in virtual memory. Each 16-bit register may contain the location of a segment such as the code segment, held by the CS register.
This register can then be used by the processor to find out where the code is in memory and access the offset accordingly. Because segment registers are only 16 bits wide, they cannot hold a full address; they hold a selector that the processor uses to look up where the segment starts. Segmentation is largely unused in 64-bit mode; however, registers such as FS and GS remain important, for example for pointing to per-thread data structures on Windows.
There are 6 segment registers:
|Code Segment (CS)||The code segment (CS) contains the executable instructions of an object file. The CS is sometimes called the text segment. Because the CS has read and execute permissions, but not write permission, multiple instances of the program can run concurrently. The code segment register often points to an offset containing the start address of the executable code for a given process|
|Stack segment (SS)||The Stack Segment (SS) register keeps the location of the stack. Specifically, the SS register generally points to the region of memory holding the stack, while the stack pointer (RSP) points to the top of the stack frame in use.|
|Data segment (DS)||There are four segment registers with the ability to point to different segments of data. The four registers are the data segment (DS), the extra segment (ES), FS and GS. FS has a notable use with Windows.|
An assembly program can be divided into three distinct sections:
|Text section||Contains program code|
|Data section||Contains initialized variables|
|BSS section||Contains uninitialized variables|