Compilation stages

Julian Torres
Jun 10, 2020

The compilation process involves four successive stages: preprocessing, compilation, assembly, and linking. To go from a human-written source program to an executable file, these four stages must be performed in order. The gcc and g++ commands can perform the whole process at once; in this article we will use gcc. The four stages are:

Preprocessing:
At this stage, the preprocessor directives are interpreted. Among other things, macros defined with #define are replaced in the code with their value wherever their name appears.
We will use this simple program, saved as circle.c, as an example:

#include <stdio.h>

#define PI 3.1416

int main(void)
{
    float area, radius;

    radius = 10;
    area = PI * (radius * radius);
    printf("Circle.");
    printf("%s%f\n\n", "Circle area radius 10: ", area);
    return 0;
}

Preprocessing can be requested with either of the following commands; cpp refers specifically to the preprocessor.

$ gcc -E circle.c > circle.pp
$ cpp circle.c > circle.pp

If we examine circle.pp, we can see that the PI macro has been replaced by its value, 3.1416, as set by the #define directive.

Compilation:

Compilation transforms the C code into the assembly language of our machine’s processor. The following command creates the assembly file circle.s:

$ gcc -S circle.c

Assembly:

Assembly transforms the program written in assembly language into object code, a binary file in the processor's machine language.
The assembler is invoked as:

$ as -o circle.o circle.s

This creates the object code file circle.o from the assembly language file circle.s. It is uncommon to run the assembler on its own; usually all the preceding stages are performed together to obtain the object code, like this:

$ gcc -c circle.c

This creates the circle.o file from circle.c.
In large programs, where the C code is spread over many source files, it is very common to use gcc or g++ with the -c option to compile each source file separately, and then link all the resulting object modules. These operations are automated by placing them in a file called a makefile, which is interpreted by the make command. make takes care of performing only the minimum necessary updates whenever any portion of code is modified in any of the source files.

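Linking:
The last stage combines one or more object code modules with the code that exists in the libraries.
Linking is done with the ld command. The exact paths and options depend on the system and the gcc version; on an i386 Linux machine with gcc 2.95.2, producing an executable requires something like this: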

$ ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o circle /usr/lib/crt1.o /usr/lib/crti.o /usr/lib/gcc-lib/i386-linux/2.95.2/crtbegin.o -L/usr/lib/gcc-lib/i386-linux/2.95.2 circle.o -lgcc -lc -lgcc /usr/lib/gcc-lib/i386-linux/2.95.2/crtend.o /usr/lib/crtn.o

How to do it all in one step?

In a program with a single source file, the whole process above can be done in one step:

$ gcc -o circle circle.c

The circle.o file is not kept; the intermediate object code is created and deleted without the user ever seeing it, but the executable program is produced and works.
It is instructive to use the -v option of gcc to get a detailed report of every build step:

$ gcc -v -o circle circle.c
