x86 calling conventions

This article describes the calling conventions used on the x86 architecture.

cdecl

The cdecl calling convention is used by many C and C++ systems for the x86 architecture. In cdecl, function parameters are pushed onto the stack in right-to-left order. Function return values are returned in the EAX register. Registers EAX, ECX, and EDX are available for use in the function; the called function need not preserve them.

For instance, the following C code function prototype and function call:

int function(int, int, int);
int a, b, c, x;
...
x = function(a, b, c);

will produce the following x86 assembly code (written in MASM syntax):

push c
push b
push a
call function
add esp, 12 ;Stack clearing
mov x, eax

The calling function cleans the stack after the function call returns.
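The sequence above can be traced with a small model of the stack. The following is only an illustrative sketch in portable C, not real machine code: the ToyCpu type and the toy_* functions are hypothetical names, the return address is a placeholder value, and every argument is assumed to occupy one 4-byte slot.

```c
#include <stdint.h>

enum { STACK_BYTES = 256 };

/* Hypothetical toy machine: a downward-growing stack of 32-bit slots. */
typedef struct {
    uint32_t slots[STACK_BYTES / 4];
    uint32_t esp;   /* byte offset of the current top of the stack */
    uint32_t eax;   /* models the return-value register */
} ToyCpu;

void toy_init(ToyCpu *cpu)
{
    cpu->esp = STACK_BYTES;   /* empty stack */
    cpu->eax = 0;
}

void toy_push(ToyCpu *cpu, uint32_t value)
{
    cpu->esp -= 4;
    cpu->slots[cpu->esp / 4] = value;
}

/* Callee: [esp] holds the return address, [esp+4] the first argument. */
void toy_function(ToyCpu *cpu)
{
    uint32_t a = cpu->slots[(cpu->esp + 4) / 4];
    uint32_t b = cpu->slots[(cpu->esp + 8) / 4];
    uint32_t c = cpu->slots[(cpu->esp + 12) / 4];

    cpu->eax = a * 100 + b * 10 + c;   /* result goes in "EAX" */
    cpu->esp += 4;                     /* plain ret: pops only the return address */
}

/* Caller: mirrors the push/call/add sequence shown above. */
uint32_t toy_cdecl_call(ToyCpu *cpu, uint32_t a, uint32_t b, uint32_t c)
{
    toy_push(cpu, c);      /* push c */
    toy_push(cpu, b);      /* push b */
    toy_push(cpu, a);      /* push a */
    toy_push(cpu, 0);      /* call: pushes a (placeholder) return address */
    toy_function(cpu);
    cpu->esp += 12;        /* add esp, 12: the caller removes the arguments */
    return cpu->eax;       /* mov x, eax */
}
```

After toy_cdecl_call returns, esp is back at its starting value, which is the balance that "add esp, 12" restores in the real code.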

The cdecl calling convention is usually the default calling convention for x86 C compilers, although many compilers provide options to change the default. To declare a function as cdecl explicitly, some compilers support the following syntax:

void _cdecl function(params);

The _cdecl modifier must be included in both the function prototype and the function definition to override any other settings that might be in place.
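Because the spelling of the modifier differs between compilers, portable code often hides it behind a macro. The following is a minimal sketch: the macro name CALL_CDECL and the function add3 are hypothetical, and only Microsoft and GCC-compatible compilers targeting 32-bit x86 are detected. On any other target the macro expands to nothing.

```c
/* CALL_CDECL is a hypothetical macro name chosen for this sketch. */
#if defined(_MSC_VER) && defined(_M_IX86)
#  define CALL_CDECL __cdecl                  /* Microsoft keyword */
#elif defined(__GNUC__) && defined(__i386__)
#  define CALL_CDECL __attribute__((cdecl))   /* GCC attribute syntax */
#else
#  define CALL_CDECL                          /* not 32-bit x86: no effect */
#endif

/* The modifier appears on the prototype ... */
int CALL_CDECL add3(int a, int b, int c);

/* ... and again on the definition. */
int CALL_CDECL add3(int a, int b, int c)
{
    return a + b + c;
}
```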

Pascal

The Pascal calling convention is the reverse of the C calling convention. The parameters are pushed on the stack in left-to-right order and the callee is responsible for balancing the stack before return.

The callee balances the stack with the instruction "ret freestack", where freestack is a constant integer giving the number of bytes of arguments to remove.

Register (fastcall)

The Register or fastcall calling convention has not been standardized, so for historical reasons its details vary from compiler to compiler. In general, however, the first few arguments that fit into a processor register (i.e. with a size of up to 32 bits on the x86 architecture) are passed in registers instead of being pushed onto the stack. The remaining arguments are passed right-to-left on the stack (as in cdecl). Return values are passed in the AL, AX, or EAX register. The stack is usually cleaned by the callee unless the function takes a variable number of parameters. Most run-time library (RTL) functions take only a small number of parameters, so all of their arguments fit in registers and there is no stack to clean at all.

  • The Microsoft or GCC[1] __fastcall[2] convention (also known as __msfastcall) passes the first two arguments in ECX and EDX;
  • The Borland __fastcall convention passes the first three arguments in EAX, EDX and ECX;
  • The Watcom __fastcall convention passes the first four arguments in EAX, EDX, EBX and ECX, which keeps the most arguments off the stack of the three variants while still leaving spare registers for the function body.

The Watcom C/C++ compiler also uses the #pragma aux[3] directive that allows you to specify your own calling convention. According to its manual, "Very few users are likely to need this method, but if it is needed, it can be a lifesaver".

stdcall

The stdcall[4] calling convention is the de facto standard calling convention for the Microsoft Win32 application programming interface. Function parameters are pushed onto the stack right-to-left. Registers EAX, ECX, and EDX are designated for use within the function and need not be preserved by it. Return values are stored in the EAX register. Unlike cdecl, the called function cleans the stack, instead of the calling function. Because the callee must remove a fixed number of bytes, stdcall functions cannot support variable-length argument lists.
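The restriction on variable-length argument lists follows from who knows the argument count: only the call site does. The sketch below uses the standard <stdarg.h> facilities to show a function whose argument count differs from call to call; the function name sum_ints is hypothetical.

```c
#include <stdarg.h>

/* Hypothetical example: sums `count` int arguments that follow `count`. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);
    for (i = 0; i < count; i++) {
        total += va_arg(args, int);   /* fetch the next int argument */
    }
    va_end(args);
    return total;
}
```

On 32-bit x86, the call sum_ints(3, 1, 2, 3) places 16 bytes of arguments on the stack while sum_ints(1, 5) places 8, so no single constant in the callee's return instruction could be right for both. With caller cleanup, each call site removes exactly what it pushed.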

On a Microsoft Windows system, a function may be declared to be stdcall using the following syntax in both the function prototype and the function definition:

void __stdcall function(params);

stdcall functions are easy to recognize in assembly code because they remove their own arguments from the stack when returning. The x86 ret instruction accepts an optional 16-bit immediate operand that specifies the number of bytes to remove from the stack after the return address has been popped. For a function that takes three 4-byte arguments, such code looks like this:

ret 12
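The operand of ret is the total size of the arguments as they sit on the stack. A minimal sketch of that arithmetic follows, assuming the usual 32-bit x86 rule that each argument is padded to a multiple of 4 bytes; the function name stack_arg_bytes is hypothetical.

```c
#include <stddef.h>

/* Bytes that `n` arguments occupy on the 32-bit x86 stack, assuming each
 * argument is rounded up to a multiple of 4 bytes. */
size_t stack_arg_bytes(const size_t *arg_sizes, size_t n)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        total += (arg_sizes[i] + 3u) & ~(size_t)3u;   /* round up to 4 */
    }
    return total;
}
```

Three 4-byte arguments give 12, which matches the operand in the example above.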

safecall

In Borland Delphi on Microsoft Windows, the safecall calling convention encapsulates COM (Component Object Model) error handling, so that exceptions are not leaked out to the caller but are instead reported in the HRESULT return value, as required by COM/OLE. When calling a safecall function from Delphi code, Delphi also automatically checks the returned HRESULT and raises an exception if necessary. Together with language-level support for COM interfaces and automatic IUnknown handling (implicit AddRef/Release/QueryInterface calls), the safecall calling convention considerably simplifies COM/OLE programming in Delphi.
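Seen from another language, a safecall routine behaves roughly like a function that returns an HRESULT and hands back its real result through an extra output parameter. The C sketch below illustrates that shape only; it is not what Delphi literally generates. ToyHResult, toy_divide and toy_failed are hypothetical names, and the two status values are defined locally (mirroring S_OK and E_INVALIDARG) so that no Windows headers are needed.

```c
#include <stdint.h>

typedef int32_t ToyHResult;   /* hypothetical stand-in for HRESULT */
#define TOY_S_OK         ((ToyHResult)0)
#define TOY_E_INVALIDARG ((ToyHResult)0x80070057)

/* Callee side: a failure becomes a status code, never an escaping
 * exception; the real result travels through the `result` pointer. */
ToyHResult toy_divide(int32_t num, int32_t den, int32_t *result)
{
    if (den == 0) {
        return TOY_E_INVALIDARG;
    }
    *result = num / den;
    return TOY_S_OK;
}

/* Caller side: the check performed after each call. A negative value
 * means the failure bit is set. */
int toy_failed(ToyHResult hr)
{
    return hr < 0;
}
```

In Delphi both halves are implicit: the compiler wraps the body so that exceptions become status codes, and it raises an exception at the call site when the check fails.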

thiscall

This calling convention is used for calling C++ non-static member functions. There are two primary versions of thiscall used depending on the compiler and whether or not the function uses variable arguments.

For the GCC compiler, thiscall is almost identical to cdecl: the calling function cleans the stack, and the parameters are passed in right-to-left order. The difference is the addition of the this pointer, which is pushed onto the stack last, as if it were the first parameter in the function prototype.
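The idea of the this pointer as an extra first parameter can be written out in plain C. This is a sketch of the concept rather than of any compiler's actual output; the Counter structure and the Counter_add function are hypothetical.

```c
/* Hypothetical C rendering of a C++ class with one member function:
 *     struct Counter { int value; int add(int n); };
 */
struct Counter {
    int value;
};

/* `self` plays the role of the implicit `this` pointer and comes first
 * in the parameter list, so under right-to-left pushing it is pushed last. */
int Counter_add(struct Counter *self, int n)
{
    self->value += n;
    return self->value;
}
```

A C++ call such as c.add(5) then corresponds to Counter_add(&c, 5).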

On the Microsoft Visual C++ compiler, the this pointer is passed in ECX and it is the callee that cleans the stack, mirroring the stdcall convention used in C for this compiler and in Windows API functions. When functions use a variable number of arguments, it is the caller that cleans the stack (cf. cdecl).

The thiscall calling convention can only be explicitly specified on Microsoft Visual C++ 2005 and later. On any other compiler thiscall is not a keyword. Disassemblers such as IDA nevertheless need a way to express it, so IDA uses the keyword __thiscall__ for this purpose.

Intel ABI

The Intel Application Binary Interface is a computer programming standard that most compilers and languages follow. According to the Intel ABI, the registers EAX, EDX, and ECX are free for use within a procedure or function and need not be preserved.

Microsoft x64 calling convention

The Microsoft x64 calling convention takes advantage of the additional registers available on the AMD64 / Intel EM64T platform. The registers RCX, RDX, R8 and R9 are used for integer and pointer arguments, and XMM0, XMM1, XMM2 and XMM3 are used for floating-point arguments. Additional arguments are pushed onto the stack. The return value is stored in RAX.

AMD64 ABI convention

The calling convention of the AMD64 application binary interface is followed on Linux and other non-Microsoft operating systems. The registers RDI, RSI, RDX, RCX, R8 and R9 are used for integer and pointer arguments, while XMM0 through XMM7 are used for floating-point arguments. As in the Microsoft x64 calling convention, additional arguments are pushed onto the stack and the return value is stored in RAX.
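The two 64-bit conventions differ in how they hand out registers as well as in which registers they use. The sketch below relies on a detail not spelled out above: in the Microsoft convention an argument's position selects its register (position 1 is either RDX or XMM1), whereas the AMD64 ABI counts integer and floating-point registers separately. The ArgKind type and the function names are hypothetical, and structures, variadic calls and other special cases are ignored.

```c
#include <stddef.h>

typedef enum { ARG_INT, ARG_FLOAT } ArgKind;   /* hypothetical classification */

static const char *const MS_INT[]   = { "RCX", "RDX", "R8", "R9" };
static const char *const MS_FP[]    = { "XMM0", "XMM1", "XMM2", "XMM3" };
static const char *const SYSV_INT[] = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };
static const char *const SYSV_FP[]  = { "XMM0", "XMM1", "XMM2", "XMM3",
                                        "XMM4", "XMM5", "XMM6", "XMM7" };

/* Microsoft x64: the argument's position alone selects the register slot. */
const char *ms_x64_location(const ArgKind *kinds, size_t index)
{
    if (index >= 4) {
        return "stack";
    }
    return kinds[index] == ARG_FLOAT ? MS_FP[index] : MS_INT[index];
}

/* AMD64 ABI: count only the earlier arguments of the same kind. */
const char *sysv_x64_location(const ArgKind *kinds, size_t index)
{
    size_t used = 0;
    size_t i;

    for (i = 0; i < index; i++) {
        if (kinds[i] == kinds[index]) {
            used++;
        }
    }
    if (kinds[index] == ARG_FLOAT) {
        return used < 8 ? SYSV_FP[used] : "stack";
    }
    return used < 6 ? SYSV_INT[used] : "stack";
}
```

For a function taking (int, double, int), this model places the arguments in RCX, XMM1 and R8 under the Microsoft convention, and in RDI, XMM0 and RSI under the AMD64 ABI.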

Standard Exit and Entry Sequences

The Standard Entry Sequence to a function is as follows:

_function:
    push ebp       ;store the old base pointer
    mov ebp, esp   ;make the base pointer point to the current stack location - at
                   ;the top of the stack is the old ebp, followed by the return
                   ;address and then the parameters.
    sub esp, x     ;x is the size, in bytes, of all "automatic variables"
                   ;in the function

This sequence preserves the original base pointer ebp; points ebp to the current top of the stack (where the old ebp is stored, followed by the return address and then the function parameters); and then reserves space for automatic variables on the stack. Local variables are created on the stack with each call to the function and are discarded when the function returns. This behavior allows functions to be called recursively. In C and C++, local variables with automatic storage duration (the default for variables declared inside a function) are created in this way.

The Standard Exit Sequence goes as follows:

   mov esp, ebp   ;reset the stack to "clean" away the local variables
   pop ebp        ;restore the original base pointer
   ret            ;return from the function

The following C function:

int _cdecl MyFunction(int i){
    int k = 1;
    return i + k;
}

would produce the equivalent asm code:

   ;entry sequence
   push ebp
   mov ebp, esp
   sub esp, 4     ;reserve 4 bytes for local variable k

   ;function code
   mov dword ptr [ebp - 4], 1
                  ;initialize k
   mov eax, [ebp + 8]
                  ;move parameter i to accumulator
   add eax, [ebp - 4]
                  ;add k to i
                  ;answer is returned in eax

   ;exit sequence
   mov esp, ebp
   pop ebp
   ret
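The offsets [ebp + 8] and [ebp - 4] used above follow a simple pattern once the frame is set up. The sketch below computes them under simplifying assumptions: every parameter and every local variable is 4 bytes wide, and locals are laid out in declaration order without padding, which real compilers do not guarantee. Both function names are hypothetical.

```c
/* Offset from EBP of the parameter with the given zero-based index.
 * The first 8 bytes above EBP hold the saved EBP and the return address. */
int param_offset(int index)
{
    return 8 + 4 * index;
}

/* Offset from EBP of the local variable with the given zero-based index.
 * Locals live below EBP, so the offsets are negative. */
int local_offset(int index)
{
    return -4 * (index + 1);
}
```

With these assumptions, parameter i of MyFunction is at param_offset(0), which is 8, and local variable k is at local_offset(0), which is -4.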

External links