Super detailed! Process of program preprocessing

● 🧐 Copyright: This is an original article by [you Shuai, you say first.], first published on CSDN. Infringement will be pursued.

1. Translation environment and execution environment of the program

In any implementation of ANSI C, there are two different environments.

The first is the translation environment, in which the source code is converted into executable machine instructions.
The second is the execution environment, which is used to actually execute code.

That's too abstract. Let's explain it with a diagram.

2. Compilation and linking in detail

2.1 translation environment

Each source file that makes up a program is converted into object code by the compilation process.
The object files are then bound together by the linker to form a single, complete executable program.
The linker also brings in any functions from the standard C library that the program uses, and it can search the programmer's own libraries and link the functions the program needs.

2.2 compilation phase

Look directly at the code

#include <stdio.h>
extern int Add(int x, int y);
int main()
{
	int a = 10;
	int b = 20;
	int ret = 0;
	ret = Add(a, b);
	printf("ret = %d\n", ret);

	return 0;
}

int Add(int x, int y)
{
	return x + y;
}

Next, let's illustrate what happened to this program

This is only a rough explanation of these stages. If you want the details, see the book "Programmer's Self-Cultivation", which explains them thoroughly.

2.3 execution environment

How a program is executed:

  1. The program must be loaded into memory. In an environment with an operating system, this is usually done by the operating system. In a freestanding environment, loading must be arranged manually, or it may be done by placing the executable code in read-only memory.
  2. Execution of the program begins, and the main function is called.
  3. The program code starts running. The program uses a run-time stack to store the local variables and return addresses of its functions. It can also use static memory; variables stored in static memory keep their values throughout the execution of the program.
  4. The program terminates. It terminates normally when the main function returns; it can also terminate unexpectedly.
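
Step 3 can be illustrated with a small sketch. The names next_id and print_ids are made up for this illustration; the only point is that a local variable lives on the run-time stack and is created anew on every call, while a static variable lives in static memory and keeps its value for the whole run.

```c
#include <stdio.h>

/* Hypothetical helper for illustration: returns a new id on every call. */
int next_id(void)
{
	static int counter = 0; /* static memory: initialized once, value kept between calls */
	int step = 1;           /* run-time stack: created anew on every call */

	counter += step;
	return counter;
}

/* Prints three consecutive ids to show that counter keeps its value. */
void print_ids(void)
{
	int i = 0;
	for (i = 0; i < 3; i++)
		printf("%d ", next_id());
	printf("\n");
}
```

If print_ids is the first code to call next_id, it prints 1 2 3, because counter is never reset between calls.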

3. Preprocessing in detail

3.1 predefined symbols

__FILE__ // the source file being compiled
__LINE__ // the current line number in the file
__DATE__ // the date the file was compiled
__TIME__ // the time the file was compiled
__STDC__ // 1 if the compiler conforms to ANSI C; otherwise undefined

For example 🌰:
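
Here is a minimal sketch of how the predefined symbols can be used. The function names print_build_info and current_line are assumptions made for this example, and the printed values depend on your file name and on when you compile.

```c
#include <stdio.h>

/* Prints the predefined symbols; the values depend on the file name,
   the line, and the moment of compilation. */
void print_build_info(void)
{
	printf("file:%s line:%d\n", __FILE__, __LINE__);
	printf("date:%s time:%s\n", __DATE__, __TIME__);
#ifdef __STDC__
	printf("__STDC__ = %d\n", __STDC__);
#endif
}

/* Returns the number of the source line on which __LINE__ appears. */
int current_line(void)
{
	return __LINE__;
}
```

These symbols are handy in log messages, because they record where a message came from and which build produced it.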

3.2 #define

3.2.1 defining identifiers

We are already familiar with this operation; it was covered in earlier articles, so it is not explained again here.
Here are some less common definitions:

#define reg register // create a short name for the keyword register
#define do_forever for(;;) // replace a construct with a more descriptive symbol
#define CASE break;case // when writing a case statement, the break is written automatically
// If the replacement text is too long, it can be split across several lines. Every line except the last ends with a backslash (the continuation character).
#define DEBUG_PRINT printf("file:%s\tline:%d\t \
date:%s\ttime:%s\n" ,\
__FILE__, __LINE__ ,\
__DATE__, __TIME__ )

When defining identifier constants, be careful not to add a trailing semicolon. For example, with #define MAX 100; every MAX is replaced with 100; during replacement, so the resulting statement ends up with two semicolons.
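
The extra semicolon is more than a cosmetic problem. The sketch below uses the correct definition and explains in a comment what would go wrong with the faulty one. The function name clamp_to_max is made up for this illustration.

```c
#define MAX 100 /* correct: no trailing semicolon */

/* Clamps value so that it never exceeds MAX. */
int clamp_to_max(int value)
{
	int result = 0;

	/* With "#define MAX 100;" the condition below would expand to
	   "value > 100;" inside the parentheses, and "result = MAX;"
	   would expand to "result = 100;;", which cuts the else off
	   from its if. Either way the code would not compile. */
	if (value > MAX)
		result = MAX;
	else
		result = value;

	return result;
}
```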

3.2.2 defining macros

Here is how macros are declared:

#define name( parameter-list ) stuff

The parameter-list is a comma-separated list of symbols, which may appear in stuff.
For example:

#define SQUARE( x ) (x) * (x)
SQUARE( 5 );
Here the preprocessor replaces SQUARE( 5 ) with (5) * (5).

The end result is 25.
Some readers will wonder why each x is wrapped in parentheses. For example 🌰:

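#define SQUARE( x ) x*x
int main()
{
	int a = 5;
	printf("%d\n", SQUARE(a + 1)); // expands to a + 1*a + 1
	return 0;
}
After replacement the compiler sees a + 1*a + 1, which is 5 + 5 + 1 = 11 rather than 36.
So the parentheses ensure that each argument passed in is treated as a whole.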
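
The two variants can be put side by side. This is only a sketch: the names SQUARE_BAD, SQUARE_OK, and the two wrapper functions are invented here so that both definitions can coexist in one file.

```c
#define SQUARE_BAD(x) x * x        /* no parentheses around the parameter */
#define SQUARE_OK(x)  ((x) * (x))  /* parameter and whole body parenthesized */

/* Expands to a + 1 * a + 1, which is a + a + 1. */
int square_bad_of_next(int a)
{
	return SQUARE_BAD(a + 1);
}

/* Expands to ((a + 1) * (a + 1)). */
int square_ok_of_next(int a)
{
	return SQUARE_OK(a + 1);
}
```

With a = 5 the first function returns 11 and the second returns 36. The outer pair of parentheses in SQUARE_OK matters too: with a body of just (x) * (x), the expression 100 / SQUARE(5) would expand to 100 / (5) * (5), which is 100 instead of 4.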

3.2.3 #define replacement rules

Expanding #define symbols and macros in a program involves several steps.

  1. When a macro is invoked, its arguments are checked first to see whether they contain any symbols defined by #define. If so, those are replaced first.
  2. The replacement text is then inserted in place of the original text. For macros, parameter names are replaced by their values.
  3. Finally, the result is scanned again to see whether it contains any symbols defined by #define. If so, the process above is repeated.

Note:
1. Symbols defined by other #define directives can appear in macro arguments and in #define definitions. Macros, however, cannot be recursive.
2. When the preprocessor searches for symbols defined by #define, it does not look inside string constants.
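
The rules above can be checked with a small sketch. BASE, LIMIT, DOUBLE, and the two functions are names assumed for this example only.

```c
#include <stdio.h>

#define BASE 10
#define LIMIT (BASE * 2)        /* a #define may use another #define symbol */
#define DOUBLE(x) ((x) + (x))

/* The argument LIMIT is expanded to (10 * 2) first and then substituted,
   giving (((10 * 2)) + ((10 * 2))). */
int doubled_limit(void)
{
	return DOUBLE(LIMIT);
}

/* BASE inside the string constant is left alone; only the argument is replaced. */
void print_base(void)
{
	printf("BASE = %d\n", BASE);   /* prints: BASE = 10 */
}
```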

3.2.4 # and ##

First, #: the # operator turns a macro argument into the corresponding string.

#define PRINT(n) printf("the value of "#n" is %d\n", n)
int main()
{
	int a = 10;
	PRINT(a);
	// here #n is replaced by the string "a", so the output is: the value of a is 10
	int b = 20;
	PRINT(b);
	// here #n is replaced by the string "b", so the output is: the value of b is 20
	return 0;
}

Next, ##: the ## operator joins the symbols on either side of it into a single symbol.
For example 🌰:
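
A minimal sketch of ## in action. The macro name CAT, the variable class101, and the function read_class101 are assumptions made for this illustration.

```c
#define CAT(x, y) x##y   /* glue the two tokens x and y into one token */

/* CAT(class, 101) becomes the single identifier class101. */
int read_class101(void)
{
	int class101 = 2021;
	return CAT(class, 101);
}
```

The result of ## must be a valid symbol. Here class and 101 join into class101, which is the name of the local variable, so the function returns 2021.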

3.2.5 macro parameters with side effects

When a macro parameter appears more than once in the macro definition and the argument has side effects, using the macro can be dangerous and lead to unpredictable results. A side effect is a lasting effect that occurs when an expression is evaluated.
For example:

x + 1; // no side effects
x++;   // has side effects

#define MAX(X,Y) ((X)>(Y)?(X):(Y))
int main()
{
	int a = 5;
	int b = 8;
	// Macro arguments are substituted as text, without being evaluated first;
	// they are evaluated only after the replacement
	int m = MAX(a++, b++);
	// int m = ((a++) > (b++) ? (a++) : (b++));
	// The condition compares 5 > 8 (after which a is 6 and b is 9) and is false,
	// so only the last (b++) is evaluated: it yields 9 and b becomes 10
	// So b is incremented twice: in the end m is 9, a is 6, and b is 10
	return 0;
}

3.2.6 macro and function comparison

#define MAX(a, b) ((a)>(b)?(a):(b))

Take the example from a moment ago.
Despite the risk of side effects, a macro is a better fit than a function for a simple operation like this, for two reasons:

1. The code needed to call and return from a function may take more time than the small calculation itself, so the macro wins on both program size and speed.
2. More importantly, function parameters must be declared with specific types, so a function can only be used with expressions of the matching type. This macro, by contrast, works with integers, long integers, floating-point values, and any other type that can be compared with >. Macros are type independent.

Of course, compared with functions, macros also have disadvantages:

  1. Each time a macro is used, a copy of its definition is inserted into the program. Unless the macro is short, this may greatly increase the length of the program.
  2. Macros cannot be debugged.
  3. Macros are not rigorous enough, because they are type independent.
  4. Macros may cause operator precedence problems, which makes programs error prone.
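
The trade-offs above can be seen side by side in a small sketch. The function max_int and the two b_after_... helpers are names assumed for this example.

```c
#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))

/* Function version: each argument is evaluated exactly once, but only int works. */
int max_int(int x, int y)
{
	return x > y ? x : y;
}

/* Returns the final value of b after MAX(a++, b++) with a = 5, b = 8. */
int b_after_macro(void)
{
	int a = 5;
	int b = 8;
	int m = MAX(a++, b++);      /* b++ is evaluated twice: m is 9, b ends up 10 */
	(void)m;
	return b;
}

/* Returns the final value of b after max_int(a++, b++) with a = 5, b = 8. */
int b_after_function(void)
{
	int a = 5;
	int b = 8;
	int m = max_int(a++, b++);  /* b++ is evaluated once: m is 8, b ends up 9 */
	(void)m;
	return b;
}
```

The macro also accepts MAX(2.5, 1.5) without any change, while max_int would need a second version for double. That is the type independence mentioned earlier, and also the reason the compiler cannot check the argument types for you.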

But macros can do things that functions can't. For example, a macro argument can be a type, which a function argument cannot.
For example 🌰:

#include <stdlib.h>
#define MALLOC(num, type) \
	(type*)malloc(num * sizeof(type))

int main()
{
	int* p = MALLOC(100, int);
	//int* p = (int*)malloc(100 * sizeof(int));
	free(p);
	return 0;
}

A comparison between macros and functions

Naming convention
Generally speaking, macros and functions are used with very similar syntax, so the language itself cannot help us tell the two apart.
A common convention is therefore:

Write macro names in all capitals
Do not write function names in all capitals


3.3 #undef

This directive removes a macro definition.

#undef NAME
//If an existing name needs to be redefined, its old definition must first be removed.
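
A short sketch of redefining a name with #undef. BUFFER_SIZE and the two functions are hypothetical names chosen for this example.

```c
#define BUFFER_SIZE 16

/* Compiled while BUFFER_SIZE is still 16. */
int old_buffer_size(void)
{
	return BUFFER_SIZE;
}

#undef BUFFER_SIZE       /* remove the old definition first */
#define BUFFER_SIZE 64   /* then give the name a new meaning */

/* Compiled after the redefinition, so BUFFER_SIZE is 64 here. */
int new_buffer_size(void)
{
	return BUFFER_SIZE;
}
```

A macro definition applies from its #define down to the matching #undef (or the end of the file), which is why the two functions see different values.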

3.4 command line definition

Many C compilers let you define symbols on the command line; the definitions take effect when compilation starts.
This is useful, for example, when we want to build different versions of a program from the same source file. (Suppose a program declares an array of a certain length. If one machine has limited memory, we need a small array, but if another machine has more memory, the array can be larger.)

#include <stdio.h>
int main()
{
    int array [SZ];
    int i = 0;
    for(i = 0; i< SZ; i ++)
        array[i] = i;
    for(i = 0; i< SZ; i ++)
        printf("%d " ,array[i]);
    printf("\n" );
    return 0;
}

Compile command:

gcc -D SZ=10 programe.c
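
One possible refinement, shown here only as a sketch: give SZ a fallback value, so the file still compiles when no -D option is passed. The default of 10 and the function name sum_of_indices are assumptions made for this example.

```c
#ifndef SZ
#define SZ 10   /* assumed fallback, used only when -D SZ=... is not given */
#endif

/* Fills an array of SZ elements with 0..SZ-1 and returns their sum. */
int sum_of_indices(void)
{
	int array[SZ];
	int i = 0;
	int sum = 0;

	for (i = 0; i < SZ; i++)
		array[i] = i;
	for (i = 0; i < SZ; i++)
		sum += array[i];

	return sum;
}
```

Because the definition is wrapped in #ifndef, a value given on the command line takes priority over the fallback.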

3.5 conditional compilation

Code is compiled only when a condition is met.

int main()
{
#if 1 // compiled only if the condition is true
	printf("hello world\n");
#endif
	return 0;
}

Common conditional compilation directives:

1. Single branch
#if constant expression
 // ...
#endif
//The constant expression is evaluated by the preprocessor.
For example:
#define __DEBUG__ 1
#if __DEBUG__
 // ...
#endif
2. Multiple branches
#if constant expression
 // ...
#elif constant expression
 // ...
#else
 // ...
#endif
3. Testing whether a symbol is defined (each form is closed with #endif)
#if defined(symbol)
#ifdef symbol
#if !defined(symbol)
#ifndef symbol
4. Nested directives
#if defined(OS_UNIX)
 #ifdef OPTION1
 // ...
 #endif
 #ifdef OPTION2
 // ...
 #endif
#elif defined(OS_MSDOS)
 #ifdef OPTION2
 // ...
 #endif
#endif
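
A typical use of these directives is switching debug output on and off. The following is only a sketch; DEBUG_LOG, add_logged, and build_kind are names assumed for this example.

```c
#include <stdio.h>

#define DEBUG_LOG 1   /* hypothetical switch: set to 0 to compile the logging out */

/* Adds two numbers; the printf is part of the program only when DEBUG_LOG is nonzero. */
int add_logged(int x, int y)
{
#if DEBUG_LOG
	printf("add_logged(%d, %d)\n", x, y);
#endif
	return x + y;
}

/* Reports which branch the preprocessor kept. */
const char *build_kind(void)
{
#if defined(DEBUG_LOG) && DEBUG_LOG
	return "debug";
#else
	return "release";
#endif
}
```

The code in the branch that is not selected never reaches the compiler, so it costs nothing in the final program.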

3.6 file inclusion

Including local files

#include "filename"

Search strategy: the directory containing the source file is searched first. If the header file is not found there, the compiler looks for it in the standard locations, just as it does for library header files.
If it still cannot be found, a compilation error is reported.
Including library files

#include <filename.h>

Search strategy: the header file is searched for directly in the standard locations. If it cannot be found, a compilation error is reported.
By this logic, can library files also be included with ""? Yes, but it is not recommended: the lookup is less efficient, and it becomes harder to tell library files from local files.

Nested file inclusion

comm.h and comm.c form a common module.
test1.h and test1.c use the common module.
test2.h and test2.c use the common module.
test.h and test.c use the test1 and test2 modules.
As a result, two copies of comm.h end up in the final program, so the file's contents are duplicated.
How can this be solved?
The first method is conditional compilation:

#ifndef __TEST_H__
#define __TEST_H__
//Contents of header file
#endif   //__TEST_H__

The second method

#pragma once

Either method prevents a header file from being included more than once.
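
To see why the guard works, the sketch below simulates including the same header twice inside a single file. The guard name COMM_H and the function comm_add are assumptions made for this illustration. A real header would normally hold only declarations; a definition is used here because defining it twice would be a compile error without the guard.

```c
/* ---- first inclusion of the (hypothetical) header comm.h ---- */
#ifndef COMM_H
#define COMM_H
int comm_add(int x, int y)
{
	return x + y;
}
#endif /* COMM_H */

/* ---- second inclusion: COMM_H is already defined, so everything
        up to #endif is skipped and comm_add is not defined twice ---- */
#ifndef COMM_H
#define COMM_H
int comm_add(int x, int y)
{
	return x + y;
}
#endif /* COMM_H */
```

If you delete the #ifndef, #define, and #endif lines, the compiler reports a redefinition of comm_add, which is exactly the duplication problem described above.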

That's the end of this article.


Tags: C

Posted on Sun, 10 Oct 2021 02:04:55 -0400 by kattar