This is my first post of the series and like a good book, I’d like to start it off with a brief introduction to everything. I’d like to go ahead and mention that the point of this blog is to spread what I’ve learned or will learn along the way; mainly three subjects: assembly, reverse-engineering and pwn.

Prerequisites:

Linux
Basic C

What’s the CPU?

At the most elementary level, the CPU is just the tool that executes the instructions a program is made of.

Why should I learn assembly?

Assembly is inherently non-portable. Its readability is pretty low. Hand-written assembly will hardly ever beat the compiler, unless a skilled programmer is the one writing it, of course.

After all these cons, why learn it? Well, assembly does give you access to a few things you can't easily reach otherwise, such as hardware interrupts, memory mappings, etc., but the real reason to learn assembly is to understand how the CPU works.

Instruction Set Architectures, a.k.a ISA

There are a lot of architectures, such as ARM, MIPS, x86, etc. These are all distinct and, of course, have different instruction sets. We will focus on the x86/x86-64 ISA. Each CPU family has its own instruction set, although some are very similar to one another.

Registers

To start, I'd like to clarify that x86 (32-bit) has 8 GPRs (general purpose registers), and x64 has 16 GPRs.

[Figure: Registers]

Here are the general purpose registers used by AMD64/x64.

As you might have noticed, the registers divide. For example, let's take RAX, the full 64-bit register. RAX can be narrowed to its lower half, the 32-bit EAX. The upper half of RAX exists, but it has no name and cannot be accessed directly; you need to shift the bits right to get at its value. The same goes for EAX: its upper 16 bits have no name, but its lower half does, which would be AX, and so on down to the 8-bit AL (and AH, the upper byte of AX).

Why is this important? Because instructions such as movzx and movsx, which we will get into later on, overwrite the upper parts of registers. Likewise, writing to a 32-bit register such as EAX zeroes the upper half of the corresponding 64-bit register.
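
To make the register splitting a bit more concrete, here is a small C sketch that treats a plain 64-bit integer as a stand-in for RAX. The helper names are made up for illustration; nothing here touches real registers, it only mimics the masking and shifting described above.

```c
#include <stdint.h>

/* Hypothetical helpers: a uint64_t stands in for RAX. */

/* EAX: the lower 32 bits. */
uint32_t low32(uint64_t rax) { return (uint32_t)rax; }

/* AX: the lower 16 bits. */
uint16_t low16(uint64_t rax) { return (uint16_t)rax; }

/* AL: the lower 8 bits. */
uint8_t low8(uint64_t rax) { return (uint8_t)rax; }

/* AH: the upper byte of AX (bits 8..15). */
uint8_t high8_of_low16(uint64_t rax) { return (uint8_t)(rax >> 8); }

/* The upper 32 bits have no name, so they must be shifted down. */
uint32_t upper32(uint64_t rax) { return (uint32_t)(rax >> 32); }

/* Writing a 32-bit register zero-extends into the full 64-bit one,
   so the old contents of rax are discarded entirely. */
uint64_t write_eax(uint64_t rax, uint32_t value)
{
    (void)rax;
    return (uint64_t)value;
}

/* Writing a 16-bit register leaves the upper 48 bits untouched. */
uint64_t write_ax(uint64_t rax, uint16_t value)
{
    return (rax & ~0xFFFFULL) | value;
}
```

With 0x1122334455667788 as the starting value, low32 gives 0x55667788 and upper32 gives 0x11223344, which is the "shift right" trick mentioned above.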

[Figure: MoreRegisters]

Here we can see each register with their respective sizes, ranging from 64bits to the lower 8bits.

The most important register, though, is RIP, the instruction pointer. It holds the address of the very next instruction the CPU will run, and so it controls the flow of execution. Other noteworthy registers are RBP and RSP, which we'll talk more about later on. In essence, RBP is the base pointer: it is usually set to a function's initial stack pointer (mov rbp, rsp), so it points to the start of the current stack frame, while RSP, the stack pointer, keeps changing throughout the program.

The stack, parameters, registers and System V

The System V ABI defines the “calling convention” used on x64 Linux (and most other Unix-like systems). It is different from the conventions used on 32-bit x86, but we won't get into why. Just learn the reality of things for now; how it all comes down and works. The main things are:

The stack grows downward, toward lower addresses. The first six integer or pointer parameters to a function are passed in the registers rdi, rsi, rdx, rcx, r8 and r9, in that order; further values are passed on the stack in reverse order. The stack must be 16-byte aligned before a call is made.

The following registers: rbx, rsp, rbp, r12, r13, r14, and r15 are preserved across calls. This means that they're non-volatile (callee-saved): a function that uses them has to restore their original values before it returns. Every other register is volatile and may be overwritten by any call. For example, RAX is volatile since it holds the return value, so its lower 8 bits, AL, have to be set again before each use.
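
As a quick way to remember the argument order, here is a tiny C sketch. The function name int_arg_location is hypothetical and only covers integer and pointer arguments (floating-point arguments go elsewhere and are not covered here); it simply encodes the register list given above.

```c
#include <stddef.h>

/* Integer/pointer argument registers, in System V x64 order. */
static const char *const INT_ARG_REGS[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

/* Hypothetical helper: where does the Nth (0-based) integer or
   pointer argument of a call end up? */
const char *int_arg_location(size_t index)
{
    size_t count = sizeof INT_ARG_REGS / sizeof INT_ARG_REGS[0];

    /* The first six go in registers, everything after that on the stack. */
    return index < count ? INT_ARG_REGS[index] : "stack";
}
```

So for a call like f(a, b, c, d, e, f, g), the helper reports rdi for a (index 0) and "stack" for g (index 6).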

Practice makes perfect

Let's start off with very basic code and look at the assembly it compiles to. You can use Godbolt (Compiler Explorer) through its web interface, or gdb.

NOTE that assembly has two syntax styles: AT&T and Intel. I prefer the Intel format, so that is the one I will use.

Let’s start off very easily, and of course, the easiest function is that which does absolutely nothing. The following is C code.

int main(){
 return;
}

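The assembly equivalent, generated by GCC without optimizations, is the following (GCC complains about the bare return in a function returning int, and newer releases may reject it outright):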

main:
        push    rbp
        mov     rbp, rsp
        nop
        nop
        pop     rbp
        ret

Let's try and break this down. The first line is what you call a label. It's basically just defining the name of the function main here in the code. A label can name anything, either a function like this one, or a variable. Labels do not allocate space; they just give a symbolic name, or alias, to an address. Next, push rbp puts rbp on the stack to “save” it for later, and then mov rbp, rsp copies the stack pointer into the rbp register, thus “saving the state” of the current stack pointer in rbp.

When this happens, and it happens quite frequently (although the ABI does not require a frame pointer, and optimized code often omits it), what is set up is called a stack frame. A stack frame lets you access local variables and arguments that live on the stack at fixed offsets from rbp. [rbp] = the caller's saved rbp, [rbp + 8] = return address.

The next instruction is NOP and is summed up quite nicely by Kip Irvine:

The safest (and the most useless) instruction you can write is called NOP (no operation). It takes up 1 byte of program storage and doesn’t do any work. It is sometimes used by compilers and assemblers to align code to even-address boundaries.

Then it pops rbp, which removes the value saved at the start from the stack and puts it back into the rbp register, thus restoring rbp.

It then finally executes ret, which pops the return address off the stack and jumps to it. Easy enough, eh?
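
If the push/pop dance is hard to picture, here is a toy model of it in C. Everything in it (the struct, the function names, the 16-slot stack) is made up for illustration and has nothing to do with how a real CPU is implemented; it only mirrors what push rbp, mov rbp, rsp and pop rbp do to the two registers and the stack.

```c
#include <stdint.h>

#define STACK_SLOTS 16

/* Toy machine: a hypothetical model, not real CPU state. */
struct machine {
    uint64_t stack[STACK_SLOTS]; /* 8-byte slots */
    uint64_t rsp;                /* byte offset into stack, starts at the top */
    uint64_t rbp;
};

void machine_init(struct machine *m, uint64_t initial_rbp)
{
    m->rsp = sizeof m->stack;    /* empty stack: rsp sits just past the last slot */
    m->rbp = initial_rbp;
}

/* push: move rsp down by 8, then store the value there. */
void sim_push(struct machine *m, uint64_t value)
{
    m->rsp -= 8;
    m->stack[m->rsp / 8] = value;
}

/* pop: load the value at rsp, then move rsp back up by 8. */
uint64_t sim_pop(struct machine *m)
{
    uint64_t value = m->stack[m->rsp / 8];
    m->rsp += 8;
    return value;
}

/* push rbp ; mov rbp, rsp */
void prologue(struct machine *m)
{
    sim_push(m, m->rbp);
    m->rbp = m->rsp;
}

/* pop rbp */
void epilogue(struct machine *m)
{
    m->rbp = sim_pop(m);
}
```

Calling prologue and then epilogue on a freshly initialised machine leaves both rsp and rbp exactly where they started, which is the whole point of the pair.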

What would happen if you had returned a number?

int main(){
 return 0;
}

would generate

main:
        push    rbp
        mov     rbp, rsp
        mov     eax, 0
        pop     rbp
        ret

It spices it up a bit, right? The function prologue kicks in with the first two lines (function prologue = stack frame setup). It then puts the return value, in this case 0, into the eax register (by convention, rax/eax is used for return values). But why eax and not rax? Because main returns an int, which is 32 bits wide, and, as mentioned earlier, you can use eax in 64-bit mode too. It's also quite useful for saving space: writing eax zeroes the upper half of rax anyway, and mov eax, 0 is a 5-byte instruction, while mov rax, 0 takes 7 bytes (or 10 with a full 64-bit immediate).
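
For the curious, the size difference can be checked by writing out the machine code by hand. The byte arrays below are the standard encodings of the three instructions; the array names are made up for this sketch, and the bytes are only stored as data, never executed.

```c
/* mov eax, 0: opcode B8 followed by a 32-bit immediate. */
const unsigned char MOV_EAX_0[] = {
    0xB8, 0x00, 0x00, 0x00, 0x00
};

/* mov rax, 0 with a sign-extended 32-bit immediate:
   REX.W prefix (48), opcode C7, ModRM C0, then the immediate. */
const unsigned char MOV_RAX_0_IMM32[] = {
    0x48, 0xC7, 0xC0, 0x00, 0x00, 0x00, 0x00
};

/* mov rax, 0 with a full 64-bit immediate (movabs):
   REX.W prefix (48), opcode B8, then eight immediate bytes. */
const unsigned char MOV_RAX_0_IMM64[] = {
    0x48, 0xB8, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};
```

Taking sizeof of each array gives 5, 7 and 10, so the eax form is the shortest of the three.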

On to bigger things

Let’s now go with a typical hello world.

#include <stdio.h>
int main(){
 printf("hello, world\n");
 return 0;
}

generates

.LC0:
        .string "hello, world"
main:
        push    rbp
        mov     rbp, rsp
        mov     edi, OFFSET FLAT:.LC0
        call    puts
        mov     eax, 0
        pop     rbp
        ret

The first line is a label. The period at the start of the name marks it as a local label that GCC generated to refer to something, in this case, a constant string literal.

The program then proceeds by labeling main and going along with the function prologue. It then puts a pointer to our string, “hello, world”, into edi. Why edi over any other register? Because, as stated above, the x64 ABI dictates the order in which arguments are passed in registers. The string being the first argument, it goes into rdi. GCC uses edi rather than rdi to save space, just like before; this works here because the address of the string fits in 32 bits. The keen observer might have noticed that it calls puts, which is a glibc function (we won't be getting into the dynamic linker right now), instead of printf. This is because a printf call whose only argument is a constant string with no format specifiers and a trailing newline is equivalent to a puts call on the same string without the newline (puts appends one itself), so GCC swaps it in. That is also why there is no newline in .LC0. It then puts the return value 0 into eax, pops rbp and finally executes ret.
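
Here is a small C sketch of that equivalence. The two function names are made up; both are expected to write the same bytes to stdout. The return values are used on purpose: as far as I know, GCC only performs the printf-to-puts swap when the result of printf is ignored, so treat that detail as an assumption.

```c
#include <stdio.h>

/* Both functions write "hello, world" followed by a newline. */

int greet_with_printf(void)
{
    /* printf returns the number of characters written. */
    return printf("hello, world\n");
}

int greet_with_puts(void)
{
    /* puts appends the newline itself and returns a
       non-negative value on success. */
    return puts("hello, world");
}
```

The output is identical, but the return values differ, which is why the compiler can only swap one for the other when nobody looks at the result.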

The stack with many arguments

#include <stdio.h>
int main() {
    printf("%d %d %d %d %d %d %d %d\n", 1, 2, 3, 4, 5, 6, 7, 8);
    return 0;
}

generates

.LC0:
        .string "%d %d %d %d %d %d %d %d\n"
main:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 8
        push    8
        push    7
        push    6
        mov     r9d, 5
        mov     r8d, 4
        mov     ecx, 3
        mov     edx, 2
        mov     esi, 1
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 0
        call    printf
        add     rsp, 32
        mov     eax, 0
        leave
        ret

You should know what the first 5 lines do by now, so I'm not getting into them… but what does sub rsp, 8 do? Remember that the stack grows downward, so subtracting from rsp allocates 8 bytes on the stack. Here they are padding: together with the three 8-byte pushes that follow, they keep the stack 16-byte aligned at the call. It then pushes the values 8, 7 and 6 onto the stack in reverse order, and then, in reverse order too, puts 5 in r9d, 4 in r8d, and so on, as the ABI states. It finally puts 0 in eax, which is not a return value here: for variadic functions such as printf, al tells the callee how many vector registers hold arguments, and we use none. It then calls printf so it can print the message. After the call it adds 32 to rsp, which deallocates the 8 bytes of padding plus the 24 bytes of pushed arguments. It then puts 0 into eax again, this time as main's return value, overwriting the return value of printf (printf returns the number of chars printed). It then does leave, which is basically:

mov rsp, rbp
pop rbp

It resets rsp and restores rbp; the ret that follows then returns to the caller. This has to happen, since both are modified throughout the function.
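
The numbers 8 and 32 in the listing above fall out of the 16-byte alignment rule. Here is a minimal C sketch of the arithmetic, assuming rsp is already 16-byte aligned right after the prologue; the function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: bytes of padding needed so that pushing
   stack_arg_bytes of arguments leaves rsp a multiple of 16 at the
   call, assuming rsp was 16-byte aligned before anything was pushed. */
size_t alignment_padding(size_t stack_arg_bytes)
{
    return (16 - stack_arg_bytes % 16) % 16;
}

/* Total bytes to release after the call: padding plus the arguments. */
size_t cleanup_bytes(size_t stack_arg_bytes)
{
    return alignment_padding(stack_arg_bytes) + stack_arg_bytes;
}
```

Three pushed arguments take 24 bytes, so alignment_padding(24) is 8 (the sub rsp, 8) and cleanup_bytes(24) is 32 (the add rsp, 32).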

Conclusion

You should now have very basic assembly knowledge! No reversing really happened in this tutorial, since I wanted to go from bare-basics to…basics, but I plan on statically and dynamically reversing binaries later on, once I surpass basic assembly!

Exercise if you want

Write the assembly for these:

int main(){
 return 1 + 1;
}

#include <stdio.h>
int main(){
 printf("win");
 return 0;
}

#include <stdio.h>
int main(){
 int a = 5; // I haven't mentioned variables yet; can you figure it out?
 printf("%d", a);
 return 0;
}