Detailed explanation of program compilation, linking and preprocessing


preface

When a program goes from the test.c file (source file / source code) to the executable test.exe file, it passes through compilation and linking, and it produces its results only when it is finally run. A program therefore lives in two different environments:
The first is the translation environment, in which the source code is converted into executable machine instructions. Translation is divided into two processes: compilation (which includes preprocessing, compilation proper and assembly) and linking.
The second is the execution environment, in which the code is actually executed.

1, Program translation environment and execution environment

The program compilation process is shown in the following figure:

  • Each source file that makes up a program is converted into object code by the compilation process.
  • The object files are bound together by the linker to form a single, complete executable program.
  • The linker also brings in any functions from the standard C library that the program uses. It can also search the programmer's personal libraries and link the functions the program needs.

1. Several stages of compilation:

precompile

Generates the test.i file. Only preprocessing (precompiling) is performed, and translation stops afterwards. Preprocessing includes:

  • Inclusion of header files (#include).
  • Replacement of symbols defined with #define.
  • Removal of comments.

All three are pure text operations.

compile

Generates the test.s file. The compilation stage converts C code into assembly code. It mainly performs lexical analysis, syntax analysis, semantic analysis and symbol summary. Symbol summary collects the global symbols.

assembly

Generates the test.o file. This stage converts the assembly code into binary instructions (machine instructions) and also builds the symbol table.

link

This stage mainly performs:

  • Merging of segment tables.
  • Merging and relocation of symbol tables.

After these stages, the binary executable is finally generated (named a.out by default with gcc on Linux).

2. Execution environment

Process of program execution:

  1. The program must be loaded into memory. In an environment with an operating system, this is usually done by the operating system. In a stand-alone environment, loading must be arranged manually, or it may be done by placing the executable code in read-only memory.
  2. Execution of the program begins, and the main function is called.
  3. The program code starts to run. The program uses a run-time stack to store the local variables and return addresses of functions. It can also use static memory; variables stored in static memory keep their values throughout the execution of the program.
  4. The program terminates. The main function may terminate normally or unexpectedly.

2, Preprocessing explained

2.1 predefined symbols

__FILE__ // Name of the source file being compiled
__LINE__ // Current line number in the file
__DATE__ // Date on which the file was compiled
__TIME__ // Time at which the file was compiled
__STDC__ // 1 if the compiler follows ANSI C, otherwise undefined
These predefined symbols are built into the language.

#include <stdio.h>

int main()
{
    //printf("%s\n", __FILE__); // Print the name of the current source file
    //printf("%d\n", __LINE__);
    //printf("%s\n", __DATE__);
    //printf("%s\n", __TIME__);
    int i = 0;
    int arr[10] = { 0 };
    FILE* pf = fopen("log.txt", "w");
    if (pf == NULL)
    {
        perror("fopen"); // Stop if the log file cannot be opened
        return 1;
    }
    for (i = 0; i < 10; i++)
    {
        arr[i] = i;
        // Log the file name, line number, compile date and time, and the current value of i
        fprintf(pf, "file:%s line:%d date:%s time:%s i=%d\n", __FILE__, __LINE__, __DATE__, __TIME__, i);
        // Print the name of the enclosing function (__FUNCTION__ is a compiler extension; C99 spells it __func__)
        printf("%s\n", __FUNCTION__);
    }
    fclose(pf);
    pf = NULL;
    for (i = 0; i < 10; i++)
    {
        printf("%d ", arr[i]);
    }
    return 0;
}

2.2 #define preprocessing instruction

Defining identifiers with #define

#define MAX 1000
#define reg register          // create a short name for the keyword register
#define do_forever for(;;)    // replace an implementation with a more vivid symbol
#define CASE break;case       // when writing a case statement, the break is written automatically

// If the defined stuff is too long, it can be split over several lines. Every line except the last ends with a backslash (continuation character).
#define DEBUG_PRINT printf("file:%s\tline:%d\t" \
                           "date:%s\ttime:%s\n", \
                           __FILE__, __LINE__, \
                           __DATE__, __TIME__)

When defining an identifier with #define, it is best not to add a semicolon (;) at the end.

#define macros

The #define mechanism includes a provision that allows parameters to be substituted into the text. A definition of this kind is usually called a macro, or a define macro.
Macro declaration:
#define name(parameter-list) stuff
where parameter-list is a comma-separated list of symbols that may appear in stuff.
[note]:
The left parenthesis of the parameter list must be immediately adjacent to name. If there is any white space between the two, the parameter list is interpreted as part of stuff.

#include <stdio.h>

#define SQUARE(X) X*X

int main()
{
    int ret = SQUARE(5);
    printf("%d\n", ret); // ret = 5*5 = 25
    return 0;
}

Macro arguments are substituted as text; they are not evaluated and passed like function arguments.

#include <stdio.h>

#define SQUARE(X) X*X

int main()
{
    int ret = SQUARE(5 + 1); // 5+1*5+1 = 11
    printf("%d\n", ret);     // 11
    return 0;
}

Because multiplication has higher precedence than addition, the substituted text 5 + 1 * 5 + 1 is not evaluated as intended. The fix is to put parentheses around every use of the parameter in the macro body:

#define SQUARE(X) (X)*(X)

The code becomes as follows:

#include <stdio.h>

#define SQUARE(X) (X)*(X)

int main()
{
    int ret = SQUARE(5 + 1); // (5+1)*(5+1) = 36
    printf("%d\n", ret);     // 36
    return 0;
}

Macro definitions used to evaluate numeric expressions should be bracketed in this way to avoid unexpected interactions between operators in parameters or adjacent operators when using macros.
Let's take another example:

#include <stdio.h>

#define DOUBLE(X) X+X

int main()
{
    int a = 5;
    int ret = 10 * DOUBLE(a); // 10*5+5
    printf("%d\n", ret);      // 55
    return 0;
}

Without parentheses around the macro body, the program prints 55.

#include <stdio.h>

#define DOUBLE(X) ((X)+(X))

int main()
{
    int a = 5;
    int ret = 10 * DOUBLE(a); // 10*((5)+(5))
    printf("%d\n", ret);      // 100
    return 0;
}

With the parentheses added, the result is 100. A macro that evaluates an expression should therefore parenthesize both each parameter and the whole body.

#define replacement rules

When #define symbols and macros are expanded in a program, several steps are involved:

  1. When a macro is invoked, its arguments are first checked for symbols defined by #define. If there are any, they are replaced first.
  2. The replacement text is then inserted into the program in place of the original text. For macros, parameter names are replaced by their values.
  3. Finally, the resulting text is scanned again to see whether it contains any symbols defined by #define. If so, the process is repeated.

[note]:

  1. Macro arguments and #define definitions may contain symbols defined by other #defines. Macros, however, cannot be recursive.
  2. When the preprocessor searches for #define symbols, the contents of string constants are not searched.

# and ##

  • Using #, you can turn a macro argument into the corresponding string literal. Adjacent string literals are then joined into one string.

#include <stdio.h>

#define PRINT(X) printf("the value of " #X " is %d\n", X)

int main()
{
    int a = 5;
    int b = 20;
    PRINT(a); // printf("the value of " "a" " is %d\n", a);
    PRINT(b); // printf("the value of " "b" " is %d\n", b);
    return 0;
}

The program prints:

the value of a is 5
the value of b is 20

Role of ##:

  • ## combines the symbols on both sides of it into one symbol.
  • It allows macro definitions to create identifiers from separate text fragments.

Example:

#include <stdio.h>

#define CAT(X,Y) X##Y

int main()
{
    int a99 = 20;
    printf("%d\n", CAT(a, 99)); // 20
    // CAT(a, 99) is replaced by a##99, which is pasted into the identifier a99:
    // printf("%d\n", a99);
    return 0;
}

2.3 macro parameters with side effects

When a macro parameter appears more than once in the macro definition, if the parameter has side effects, you may be in danger when using this macro, resulting in unpredictable consequences. A side effect is a permanent effect that occurs when an expression is evaluated.

#include <stdio.h>

#define MAX(X,Y) ((X)>(Y)?(X):(Y))

int main()
{
    int a = 10;
    int b = 11;
    int max = MAX(a++, b++);
    // int max = ((a++) > (b++) ? (a++) : (b++));
    printf("%d\n", max); // 12
    printf("%d\n", a);   // 11
    printf("%d\n", b);   // 13
    return 0;
}

In the above program, int max = MAX(a++, b++); expands to int max = ((a++) > (b++) ? (a++) : (b++));. The comparison (a++) > (b++) is evaluated first: 10 > 11 is false, and afterwards a is 11 and b is 12. Because the condition is false, (b++) is evaluated, so max receives the current value of b, which is 12, and b is then incremented again to 13.

2.4 macro and function comparison

Macros are usually used to perform simple operations. For example, find the larger of the two numbers.

#define MAX(a, b) ((a)>(b)?(a):(b))

Then why not use functions to do this?
There are two reasons for this:

  1. The code used to call and return from functions may take more time than it actually takes to perform this small computational work. So macros are better than functions in terms of program size and speed.
  2. More importantly, the parameters of a function must be declared as a specific type. Therefore, functions can only be used on expressions of the appropriate type. On the contrary, this macro can be used for integer, long integer, floating-point and other types that can be compared with >. Macros are type independent.

Of course, compared with functions, macros also have disadvantages:

  1. Each time you use a macro, a copy of the macro definition code is inserted into the program. Unless the macro is short, this may greatly increase the length of the program.
  2. Macros cannot be debugged.
  3. Macros are not rigorous enough because they are type independent.
  4. Macros may cause operator precedence problems, which makes programs error prone.

Macros can sometimes do things that functions can't. For example, a macro parameter can be a type, but a function parameter cannot.

Comparison table of macros and functions

| Attribute | #define macro | Function |
|---|---|---|
| Code length | Macro code is inserted into the program every time it is used. Except for very small macros, the length of the program increases significantly. | Function code appears in only one place; every use of the function calls that same code. |
| Execution speed | Faster. | There is extra overhead for function calls and returns, so it is relatively slow. |
| Operator precedence | Macro arguments are evaluated in the context of all surrounding expressions. Unless parentheses are added, the precedence of adjacent operators may have unpredictable consequences, so macros should be written with plenty of parentheses. | A function argument is evaluated only once, when the function is called, and its value is passed to the function. The result of the expression is easier to predict. |
| Parameters with side effects | An argument may be substituted at several places in the macro body, so evaluating an argument with side effects can produce unpredictable results. | Function arguments are evaluated only once when they are passed, so the results are easier to control. |
| Parameter type | Macro parameters are independent of type. As long as the operations on the arguments are legal, the macro can be used with any argument type. | Function parameters are tied to a type. If the argument types differ, different functions are required, even if they perform the same task. |
| Debugging | Macros are inconvenient to debug. | Functions can be debugged statement by statement. |
| Recursion | Macros cannot be recursive. | Functions can be recursive. |

Naming convention

Generally speaking, functions and macros are used with very similar syntax, so the language itself cannot help us tell the two apart.
A common convention is therefore:
1. Write macro names in all capitals.
2. Do not write function names in all capitals.

2.5 #undef

#undef NAME // If an existing name needs to be redefined, its old definition must first be removed

This directive removes a macro definition.

2.6 command line definition

Many C compilers allow symbols to be defined on the command line when the compilation is started.
This feature is useful, for example, when we want to build different versions of a program from the same source file. (Suppose a program declares an array of a certain length. On a machine with limited memory we need a small array, while on a machine with more memory the array can be larger.)

#include <stdio.h>

int main()
{
    int array[ARRAY_SIZE];
    int i = 0;
    for (i = 0; i < ARRAY_SIZE; i++)
    {
        array[i] = i;
    }
    for (i = 0; i < ARRAY_SIZE; i++)
    {
        printf("%d ", array[i]);
    }
    printf("\n");
    return 0;
}

Compile command on Linux:

gcc -D ARRAY_SIZE=10 programe.c

2.7 conditional compilation

Conditional compilation directives make it easy to compile or skip a statement (or a group of statements) when building a program.
Debugging code is a typical example: deleting it would be a pity, but leaving it in gets in the way, so we compile it selectively.

#include <stdio.h>

#define BUG

int main()
{
    int arr[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 };
    int i = 0;
    for (i = 0; i < 10; i++)
    {
        arr[i] = 0;
#ifdef BUG
        printf("%d ", arr[i]); // To see if the array assignment is successful.
#endif
    }
    return 0;
}

When BUG is not defined, the print statement is not compiled into the program. Only when BUG is defined does the print statement take part in compilation and get executed.

Common conditional compilation instructions

1. Conditional compilation of a single branch

#if constant expression
//...
#endif

If the constant expression is true, the statements between #if and #endif are compiled. If it is false, the preprocessor discards them.

#include <stdio.h>

int main()
{
    int arr[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 };
    int i = 0;
    for (i = 0; i < 10; i++)
    {
        arr[i] = 0;
#if 0 // 0 is false, so the following statement is not compiled
        printf("%d ", arr[i]); // To see if the array assignment is successful.
#endif
    }
    return 0;
}

2. Conditional compilation of multiple branches

#if constant expression
//...
#elif constant expression
//...
#else
//...
#endif

Example:

#include <stdio.h>

int main()
{
#if 1==1
    printf("nihao\n"); // This line is compiled and printed
#elif 2==1
    printf("haha\n");
#else
    printf("12\n");
#endif
    return 0;
}

3. Testing whether a symbol is defined

#if defined(symbol)
#ifdef symbol       // equivalent to the line above

#if !defined(symbol)
#ifndef symbol      // equivalent to the line above

Example:

#include <stdio.h>

#define DEBUG

int main()
{
#if defined(DEBUG) // same as #ifdef DEBUG
    printf("haha\n");
#endif
    return 0;
}

4. Nested directives

#if defined(OS_UNIX)
    #ifdef OPTION1
        unix_version_option1();
    #endif
    #ifdef OPTION2
        unix_version_option2();
    #endif
#elif defined(OS_MSDOS)
    #ifdef OPTION2
        msdos_version_option2();
    #endif
#endif

2.8 file inclusion

We already know that the #include directive causes another file to be compiled as if its contents actually appeared at the place of the #include directive.
The replacement is simple:
The preprocessor deletes the directive and replaces it with the contents of the included file.
If a file is included 10 times, it is actually compiled 10 times.

2.8.1 how header files are included

Local file inclusion

#include "filename"

Search strategy: the compiler first searches the directory where the source file is located. If the header file is not found there, it looks in the standard locations, just as it does for library header files.

Library file inclusion

#include <filename.h>

The header file is looked up directly in the standard locations. If it cannot be found, a compilation error is reported.
Library files can also be included with "", but the search is less efficient, and it also becomes harder to tell library files from local files.

2.8.2 nested file inclusion

Wrap the contents of each header file like this:

#ifndef __TEST_H__
#define __TEST_H__
// Contents of header file
#endif // __TEST_H__

Or:

#pragma once

Either form prevents a header file from being included more than once.

3 October 2021, 18:51 | Views: 9672
