x86 32-bit function calls in assembly explained.
Hello there. In this article I will explain how function calls work in assembly from the stack’s point of view, i.e. how the stack grows and shrinks as functions are called and return.
The prerequisite for this article is the following piece of C code.
#include <stdio.h>
int sub(int a, int b){
    return a - b;
}
int main(){
    sub(8,2);
    return 0x1337;
}
You must be able to read this piece of code and explain what it does as a C programmer. If you can follow it in assembly too, that’s cool, but understanding it as a C programmer is a must.
int main() {
00C21020 push ebp
00C21021 mov ebp,esp
00C21023 mov ecx,offset _5B900AD5_example1@c (0C25000h)
00C21028 call __CheckForDebuggerJustMyCode (0C21050h)
sub(8,2);
00C2102D push 2
00C2102F push 8
00C21031 call sub (0C21000h)
00C21036 add esp,8
return 0x1337;
00C21039 mov eax,1337h
}
00C2103E pop ebp
00C2103F ret
This is what the code above compiles down to at the assembly level. I’m only showing the main function here; sub is not included yet.
So before we move on to analyzing this code, I want you to understand a few things.
Registers
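Registers are small, very fast storage locations that live inside the processor itself rather than in main memory (and, like RAM, they are volatile). I count 10 registers on a 32-bit x86 processor: the eight general-purpose registers (eax, ebx, ecx, edx, esi, edi, ebp, esp) plus the flags register and the instruction pointer. The exact set varies from processor to processor. And as the name suggests, each of these registers can store a value of up to 32 bits.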
Some registers are explained below.
- eip (extended instruction pointer): This register stores the address of the next instruction the processor will execute.
- ebp (extended base pointer): This register stores the base address of the current function’s stack frame, i.e. the fixed point on the stack from which the currently executing function locates its data. Stack memory that belongs to no frame holds no defined values (well, technically something is stored there), and a program should not read or rely on it.
- esp (extended stack pointer): This register points to the top of the stack, i.e. the most recently pushed item. On x86 the stack grows toward lower addresses, so whenever a new item is pushed, esp is decremented to point to that item.
Below are some other common registers that are helpful to know. Keep in mind that the roles described here are conventions: a compiler is free not to follow them, but following them helps fellow devs (and us reverse engineers) understand what the code does.
- eax (extended accumulator register): This register is generally used to store function return values, i.e. return 3; typically becomes mov eax,3 before the function returns.
- ecx (extended counter register): As the name suggests, it’s commonly used as the counter in loops and string operations.
Okay, I mentioned the instruction pointer but not the flags register. Basically, after most arithmetic and logic operations, individual flags get set (1) or cleared (0) based on the result. For example, there’s a flag called SF (sign flag) which copies the most significant bit of the result: it is 0 if the result is zero or positive, and 1 if the result is negative when read as a signed number. Each flag is a single named bit of the eflags register.
This is confusing, so I attached one image below.
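To make the idea of "flags are just bits" concrete, here is a minimal C sketch that works out three flags for a 32-bit subtraction. The bit positions (CF is bit 0, ZF is bit 6, SF is bit 7) are the real eflags positions, but the helper name flags_after_sub and the FLAG_* macros are my own, made up for illustration. It only models these three flags, not the whole register.

```c
#include <stdint.h>

/* Bit positions of three flags inside eflags. */
#define FLAG_CF (1u << 0)   /* carry flag: unsigned borrow/carry */
#define FLAG_ZF (1u << 6)   /* zero flag: result was zero */
#define FLAG_SF (1u << 7)   /* sign flag: top bit of the result */

/* Hypothetical helper: returns the CF, ZF and SF bits that
   "sub a, b" would produce for two 32-bit operands. */
uint32_t flags_after_sub(uint32_t a, uint32_t b)
{
    uint32_t result = a - b;   /* wraps modulo 2^32, like the CPU does */
    uint32_t flags = 0;

    if (a < b)
        flags |= FLAG_CF;      /* a borrow was needed */
    if (result == 0)
        flags |= FLAG_ZF;      /* the result is zero */
    if (result & 0x80000000u)
        flags |= FLAG_SF;      /* the most significant bit is set */
    return flags;
}
```

For 8 - 2 none of the three flags is set, for 5 - 5 only ZF is set, and for 2 - 8 both CF and SF are set because the result wraps around to a value with the top bit set.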
A quick refresher on some assembly instructions.
mov
This instruction copies data from register to register, memory to register, or register to memory. It can also load a constant value into a register or memory. Keep in mind that memory to memory can’t be done with a single mov.
- mov eax,3 ; move 3 to register eax.
- mov eax,ebx; move value from ebx to register eax
- mov [eax], ebx; move value from ebx to memory address stored in the register eax.
But this is not allowed.
- mov [eax] , [ebx] ; memory to memory data flow is not allowed.
So to copy a value from one memory location to another you go memory -> register -> memory using two mov instructions, or you skip the copy and just change the address held in a register.
push
This is a very simple instruction: it decrements the stack pointer and then stores the value at the new top of the stack, so esp points to the new value.
- push 3; decrement the stack pointer by 4 bytes and store 3 there, as a push in 32-bit code takes 4 bytes.
- push ebp; push the value of ebp to the stack, decrementing esp by 4 as above.
pop
This removes the most recently pushed element from the stack and saves it in a register, then increments esp by 4.
- pop ebp; remove last value from stack and put it in the ebp register.
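If it helps, push and pop can be modelled in a few lines of C. This is only a toy sketch under my own assumptions: the Machine struct, the 64-byte stack and the names machine_init, push32 and pop32 are hypothetical, esp is an offset into an array instead of a real address, and there are no overflow checks.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Hypothetical toy machine: a byte array as stack memory plus esp. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;   /* offset into mem; STACK_BYTES means "empty stack" */
} Machine;

void machine_init(Machine *m)
{
    memset(m->mem, 0, sizeof m->mem);
    m->esp = STACK_BYTES;   /* start at the highest address */
}

/* push: decrement esp by 4, then store the value at the new top. */
void push32(Machine *m, uint32_t value)
{
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &value, 4);
}

/* pop: read the value at the top, then increment esp by 4. */
uint32_t pop32(Machine *m)
{
    uint32_t value;
    memcpy(&value, &m->mem[m->esp], 4);
    m->esp += 4;
    return value;
}
```

Starting from an empty stack (esp is 64), pushing 3 moves esp to 60 and pushing 0x1337 moves it to 56. Popping twice gives back 0x1337 and then 3, and esp is back at 64.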
add / sub
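Basically just addition and subtraction of two values. The first operand is a register or memory location and receives the result; the second operand can be a register, a memory location or a constant (but both operands can’t be memory).
- add eax , 3; add 3 to the current value of eax.
- sub eax, 3; subtract 3 from current value of eax.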
call and ret
These instructions have a slightly more complicated job to do. call needs to jump into another piece of code (a function), but control must be able to continue from where it left off when that other code block executes ret.
So first of all, the address of the instruction that follows the call (the return address) is pushed on the stack, and then eip is set to the function’s first instruction.
And whenever the callee returns using ret, the value on top of the stack is popped into eip, and that value is the address of the next instruction to execute.
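The same two steps can be sketched in C. Again this is a toy model, not how a CPU is actually built: the Cpu struct and the names do_call and do_ret are hypothetical, and the stack is an array of 4-byte slots with esp as an index into it.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical toy CPU: a small stack, a stack pointer and eip. */
typedef struct {
    uint32_t stack[STACK_WORDS]; /* each slot stands for 4 bytes */
    uint32_t esp;                /* index of the top slot; STACK_WORDS = empty */
    uint32_t eip;                /* address of the next instruction */
} Cpu;

/* call: push the return address, then jump to the target. */
void do_call(Cpu *c, uint32_t return_addr, uint32_t target)
{
    c->esp -= 1;
    c->stack[c->esp] = return_addr;
    c->eip = target;
}

/* ret: pop the top of the stack into eip. */
void do_ret(Cpu *c)
{
    c->eip = c->stack[c->esp];
    c->esp += 1;
}
```

With the addresses from the main listing, do_call(&c, 0x00C21036, 0x00C21000) leaves 0x00C21036 on top of the stack and eip at 0x00C21000. A following do_ret(&c) puts eip back at 0x00C21036, which is the add esp,8 right after the call.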
Now we’ll see some stack diagrams to understand how function calls work.
- A stack frame is the region of the stack that belongs to the currently executing function.
- First of all, I want you to know that, apart from global variables and dynamically allocated memory, variables live on the stack, i.e. local variables are normally stored on the stack (an optimizing compiler may keep some of them in registers only). Now, if this is our main function, the stack will look like this.
All the local variables you define in the function get space at the top of the stack. Then there’s something called caller-saved registers, which we won’t discuss here. After that, the arguments are pushed onto the stack from right to left, just like this in the code block above.
006B102D push 2
006B102F push 8
006B1031 call sub (06B1000h)
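The hex value on the left is the instruction’s address and the text on the right is the instruction in assembly. As we called sub(8,2), you can see the arguments are pushed in right-to-left (RTL) order: 2 first, then 8. Once sub returns, the add esp,8 in main’s listing removes the two 4-byte arguments from the stack again.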
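Here is a small C sketch of what the stack holds at the moment sub starts running, and how it is cleaned up afterwards. The ArgStack struct and the helper names push_word, enter_sub_8_2 and return_and_cleanup are hypothetical, made up for this illustration; the stack is modelled as 4-byte slots, so one slot stands for 4 bytes of esp.

```c
#include <stdint.h>

#define SLOTS 16

/* Hypothetical toy stack: each slot stands for 4 bytes. */
typedef struct {
    uint32_t stack[SLOTS];
    uint32_t esp;   /* index of the top slot; SLOTS means "empty" */
} ArgStack;

void push_word(ArgStack *s, uint32_t v)
{
    s->esp -= 1;
    s->stack[s->esp] = v;
}

/* What main does up to and including "call sub" for sub(8,2). */
void enter_sub_8_2(ArgStack *s, uint32_t return_addr)
{
    s->esp = SLOTS;
    push_word(s, 2);            /* push 2: rightmost argument first */
    push_word(s, 8);            /* push 8: leftmost argument last */
    push_word(s, return_addr);  /* pushed by the call instruction */
}

/* What happens on the way back: ret, then add esp,8 in main. */
void return_and_cleanup(ArgStack *s)
{
    s->esp += 1;   /* ret pops the return address */
    s->esp += 2;   /* add esp,8 drops the two 4-byte arguments */
}
```

Right after enter_sub_8_2, the top slot holds the return address, the slot above it holds 8 and the one above that holds 2. That is why pushing right to left leaves the first argument closest to the top of the stack.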
If we look at the complete listing, we see this first.
int main() {
006B1020 push ebp
006B1021 mov ebp,esp
Here, ebp still holds the base pointer of whichever function called our main function. Our main function needs to save that value and then set up a stack frame of its own, i.e.
the caller’s ebp is pushed onto the stack and ebp is set to esp, so that this function addresses everything on its stack relative to the current stack pointer and doesn’t modify the previous function’s values. The saved ebp is popped at the end so that the calling function gets its own stack frame back after this function has done its job.
Now we have a clear understanding of the main function. What the sub function does is essentially this…
int sub(int a, int b) {
00051001 push ebp
00051002 mov ebp, esp
return a - b;
0005100D mov eax,dword ptr [a]
00051010 sub eax,dword ptr [b]
}
00051013 pop ebp
00051014 ret
So it’s creating its own stack frame by pushing the previous base pointer onto the stack and pointing ebp at the new frame. As I said previously, eax is responsible for carrying the return value back to the caller. You can see that the mov and sub instructions simply leave a - b in the eax register. The debugger shows the operands by name; with this frame layout, [a] is the memory at ebp+8 and [b] is the memory at ebp+0Ch, just above the saved ebp and the return address.
Then at the end ebp is popped off the stack, so the caller gets its own frame base back, and ret returns control to the instruction that follows the call in the caller.
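To tie everything together, here is a toy C simulation of the whole sub(8,2) call. Everything in it is an assumption for illustration: the Regs struct and the names push_slot, pop_slot, run_sub and call_sub are hypothetical, the stack is an array of 4-byte slots (so ebp+8 bytes becomes ebp + 2 slots and ebp+0Ch becomes ebp + 3 slots), and eip is not modelled at all.

```c
#include <stdint.h>

#define SLOTS 16

/* Hypothetical toy register file plus stack; one slot = 4 bytes. */
typedef struct {
    uint32_t stack[SLOTS];
    uint32_t esp;   /* index of the top slot */
    uint32_t ebp;   /* index of the current frame base */
    uint32_t eax;   /* return value register */
} Regs;

void push_slot(Regs *r, uint32_t v)
{
    r->esp -= 1;
    r->stack[r->esp] = v;
}

uint32_t pop_slot(Regs *r)
{
    uint32_t v = r->stack[r->esp];
    r->esp += 1;
    return v;
}

/* The body of sub, entered right after the call instruction. */
void run_sub(Regs *r)
{
    push_slot(r, r->ebp);             /* push ebp */
    r->ebp = r->esp;                  /* mov ebp,esp */
    r->eax = r->stack[r->ebp + 2];    /* mov eax,[ebp+8]   (a) */
    r->eax -= r->stack[r->ebp + 3];   /* sub eax,[ebp+0Ch] (b) */
    r->ebp = pop_slot(r);             /* pop ebp */
    (void)pop_slot(r);                /* ret: discard the return address */
}

/* main's side of the call; returns whatever ends up in eax. */
uint32_t call_sub(Regs *r, uint32_t a, uint32_t b)
{
    push_slot(r, b);             /* push 2 */
    push_slot(r, a);             /* push 8 */
    push_slot(r, 0x00C21036u);   /* call: push the return address */
    run_sub(r);
    r->esp += 2;                 /* add esp,8 */
    return r->eax;
}
```

Calling call_sub(&r, 8, 2) on a Regs with esp and ebp both set to SLOTS returns 6, and afterwards esp and ebp hold the same values they had before the call, which is the whole point of the prologue and epilogue.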
Reference:
Professional Assembly Language, Richard Blum.