[Nginx source code analysis] analysis of Nginx configuration file

An in-depth analysis of how nginx parses its configuration file, stores the configuration, and looks it up.

Before reading this article, readers can think about two questions:

  • 1. Code like this can be seen everywhere in the nginx source code.
//Get the rate-limiting configuration
lrcf = ngx_http_get_module_loc_conf(r, ngx_http_limit_req_module);
//Get the fastcgi configuration
flcf = ngx_http_get_module_loc_conf(r, ngx_http_fastcgi_module);

Why can the rate-limiting and fastcgi configurations be obtained in this way?

  • 2. A server configuration block can contain multiple location blocks. What is their matching priority? For example, suppose three locations are configured: "location ^~ /a {}", "location /a/b {}" and "location ~ /a/* {}". If the request URL is /a/b, which location block is finally matched?

After working through this article, you should be able to answer both questions.

Before learning the parsing process of nginx configuration file, you need to understand some basic knowledge of nginx modules and instructions.

nginx configuration instructions can be divided into two categories: instruction blocks (such as events, http, server and location) and single instructions (such as worker_processes, root, rewrite, etc.).

nginx specifies that instruction blocks can be nested (for example, server instruction can be nested in http block and location instruction can be nested in server block), and instructions can appear in different instruction blocks at the same time (for example, root instruction can appear in http, server and location instruction blocks at the same time).

This hierarchical structure is what makes parsing and storing the configuration complex.

1.1 nginx module

Structure ngx_module_t is used to define an nginx module. Here we need to focus on the following fields.

struct ngx_module_s {
    ngx_uint_t            ctx_index; //Used to number modules of the same type
    ngx_uint_t            index;  //Used to number all modules

    void                 *ctx;  //Module context (very important); each module type points to its own kind of structure, usually a set of function pointers
    ngx_command_t        *commands; //Instruction array
    ngx_uint_t            type;  //Module type code
};

Parameter Description:
The type field indicates the module type code.
ctx points to the module context structure, and different types of modules usually point to different types of structures, which usually contain several function pointers.

Common nginx modules fall into the following categories: core modules, event modules and http modules (conf and mail modules are not considered for now). See the table below.

It can be seen from the three types of module context structures listed in the above table:

  • 1) The core module context structure has only three fields: name is the name of the core module; create_conf creates the module configuration structure; init_conf initializes the module configuration structure;
  • 2) The first three fields of the event module context structure are the same as those of the core module, but it has one more field of type ngx_event_actions_t. That structure contains several function pointers, which are the APIs provided by the event module, such as adding and deleting events; they are not detailed here;
  • 3) As we all know, HTTP-related configuration can be divided into three levels: http instruction block, server instruction block and location instruction block. The corresponding configuration structures are called main_conf, srv_conf and loc_conf. The methods create_main_conf, create_srv_conf and create_loc_conf create these structures, while init_main_conf, merge_srv_conf and merge_loc_conf initialize or merge them.

The preconfiguration and postconfiguration methods of the HTTP module context structure perform initialization related to HTTP request processing.
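The three context structures described above can be sketched roughly as follows. This is a simplified approximation for illustration only: every demo_ name, including the stand-in types and the helper demo_http_has_loc_conf, is an assumption and not nginx's real definition; only the field names follow the description above.

```c
#include <stddef.h>

/* Stand-in types so the sketch compiles on its own (assumptions, not nginx's). */
typedef struct { size_t len; unsigned char *data; } demo_str_t;
typedef struct demo_cycle_s demo_cycle_t;   /* opaque: global cycle */
typedef struct demo_conf_s  demo_conf_t;    /* opaque: parser state */

/* 1) Core module context: a name plus create/init hooks. */
typedef struct {
    demo_str_t   name;
    void      *(*create_conf)(demo_cycle_t *cycle);
    char      *(*init_conf)(demo_cycle_t *cycle, void *conf);
} demo_core_module_t;

/* Event API table; only two of the function pointers are shown. */
typedef struct {
    int (*add)(void *ev, int event, unsigned flags);
    int (*del)(void *ev, int event, unsigned flags);
} demo_event_actions_t;

/* 2) Event module context: the same three fields plus the actions table. */
typedef struct {
    demo_str_t             name;
    void                *(*create_conf)(demo_cycle_t *cycle);
    char                *(*init_conf)(demo_cycle_t *cycle, void *conf);
    demo_event_actions_t   actions;
} demo_event_module_t;

/* 3) HTTP module context: one create hook per configuration level. */
typedef struct {
    int    (*preconfiguration)(demo_conf_t *cf);
    int    (*postconfiguration)(demo_conf_t *cf);
    void  *(*create_main_conf)(demo_conf_t *cf);
    char  *(*init_main_conf)(demo_conf_t *cf, void *conf);
    void  *(*create_srv_conf)(demo_conf_t *cf);
    char  *(*merge_srv_conf)(demo_conf_t *cf, void *prev, void *conf);
    void  *(*create_loc_conf)(demo_conf_t *cf);
    char  *(*merge_loc_conf)(demo_conf_t *cf, void *prev, void *conf);
} demo_http_module_t;

/* A module that leaves a hook NULL has no configuration at that level. */
int demo_http_has_loc_conf(const demo_http_module_t *m)
{
    return m->create_loc_conf != NULL;
}
```

A module that only needs location-level settings would fill in create_loc_conf and merge_loc_conf and leave the other hooks NULL.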

1.2 nginx configuration instruction

nginx's processing power comes from the combination of its modules, each of which implements one specific function.
For example, rate limiting is implemented by ngx_http_limit_conn_module or ngx_http_limit_req_module; FastCGI forwarding is implemented by ngx_http_fastcgi_module;
proxy forwarding is implemented by ngx_http_proxy_module (forwarding also relies on ngx_http_upstream_module).

When we configure the instruction proxy_pass or fastcgi_pass, which module should parse the instruction? Obviously, it should be parsed by the module that implements this function. That is, the parsing of nginx configuration files is distributed to each module.

Each module has a commands array that stores all configuration instructions that can be parsed by the module. The instruction structure is defined by ngx_command_t:

struct ngx_command_s {
    ngx_str_t             name;
    ngx_uint_t            type;
    char               *(*set)(ngx_conf_t *cf, ngx_command_t *cmd, void *conf);
    ngx_uint_t            conf;
    ngx_uint_t            offset;
    void                 *post;
};
  • name: the name of the configuration instruction, such as "proxy_pass";
  • type: the instruction type, which encodes two kinds of information:
    1) where the instruction can appear: at the top level of the configuration file (outside any instruction block), or in the http, events, server or location instruction block;
    2) how many parameters the instruction takes and whether it is an instruction block, for example NGX_CONF_NOARGS, NGX_CONF_TAKE1 or NGX_CONF_BLOCK;
  • set: the processing function, executed when the configuration instruction is read;
  • conf and offset both represent offsets, but they serve different purposes. They are described in detail when parsing specific instructions and are skipped for now;
  • post can point to a variety of structures depending on the instruction; in most cases it is NULL. It is also described later with specific instructions and skipped here.
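To make the roles of name, type, set and offset more concrete, here is a minimal self-contained sketch of a commands array and a generic set handler. All demo_ names and the flag values are assumptions made for illustration; the handler shows one plausible use of offset (writing a field inside the module's configuration structure), not nginx's actual code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative flag values; treat them as assumptions. */
#define DEMO_MAIN_CONF   0x01000000u   /* where: top level of the file */
#define DEMO_CONF_TAKE1  0x00000002u   /* how many: exactly one parameter */

typedef struct demo_command_s demo_command_t;

struct demo_command_s {
    const char  *name;     /* instruction name; NULL terminates the array */
    unsigned     type;     /* allowed contexts OR-ed with parameter count */
    const char *(*set)(const demo_command_t *cmd, const char *arg, void *conf);
    size_t       offset;   /* byte offset of the target field inside conf */
};

typedef struct {
    long  worker_processes;
} demo_core_conf_t;

/* Generic handler: parse a number and store it at conf + cmd->offset. */
const char *demo_set_num(const demo_command_t *cmd, const char *arg, void *conf)
{
    char *end;
    long  n = strtol(arg, &end, 10);

    if (end == arg || *end != '\0') {
        return "invalid number";
    }
    *(long *) ((char *) conf + cmd->offset) = n;
    return NULL;   /* NULL means success in this sketch */
}

const demo_command_t demo_core_commands[] = {
    { "worker_processes", DEMO_MAIN_CONF | DEMO_CONF_TAKE1,
      demo_set_num, offsetof(demo_core_conf_t, worker_processes) },
    { NULL, 0, NULL, 0 }
};

/* Linear search of a module's commands array by instruction name. */
const demo_command_t *demo_find_command(const demo_command_t *cmds,
                                        const char *name)
{
    for (; cmds->name != NULL; cmds++) {
        if (strcmp(cmds->name, name) == 0) {
            return cmds;
        }
    }
    return NULL;
}
```

Because the handler only knows an offset, the same function can serve any numeric instruction of any module; only the array entry changes.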

The following figure shows the basic classification of instructions (distinguished by color; the text in each color describes the instruction type and the type of module that can parse it):

Notes: module->type classifies modules; cmd->type classifies instructions into main conf, event conf and http conf (main, srv, loc).

1.3 scheme design of configuration storage format

This section focuses on the scheme design of http related configuration storage.

As mentioned earlier, each module is responsible for parsing and storing the configuration instructions it cares about. That is, each module should have a structure that stores its configuration, created by the create_conf, create_main_conf, create_srv_conf or create_loc_conf function of the module context structure.

For example, the following table:

The problem is that each module creates its own configuration structure, and the storage is completely decentralized. How can we quickly find these configuration structures?

The easiest thing to think of is to declare a void * array. The number of array elements is the number of modules. The index field of the module is used as the index of the array. Each element of the array points to the configuration structure of the corresponding module.
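This first idea can be sketched in a few lines. The names demo_module_t, DEMO_MAX_MODULE and the three helper functions are hypothetical; the sketch only shows the "one void * slot per module, indexed by the module's index" layout described above.

```c
#include <stdlib.h>

#define DEMO_MAX_MODULE 3   /* hypothetical total number of modules */

typedef struct {
    unsigned  index;        /* position of the module among all modules */
} demo_module_t;

/* One slot per module; every slot starts out NULL. */
void **demo_create_conf_ctx(void)
{
    return calloc(DEMO_MAX_MODULE, sizeof(void *));
}

/* Store the configuration structure of module m in its slot. */
void demo_set_conf(void **conf_ctx, const demo_module_t *m, void *conf)
{
    conf_ctx[m->index] = conf;
}

/* Look up the configuration structure of module m in constant time. */
void *demo_get_conf(void **conf_ctx, const demo_module_t *m)
{
    return conf_ctx[m->index];
}
```

The lookup is a single array access, which is why this layout is attractive even though each module allocates its own structure separately.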

But don't forget that the nginx configuration file has a hierarchical structure. For example, multiple single instructions and multiple server instruction blocks can be declared in the http instruction block, multiple single instructions and multiple location instruction blocks can be declared in the server instruction block, and multiple single instructions can be declared in the location configuration.

We can design it this way:

  • 1) The configuration file can contain multiple instructions, and an instruction block can also contain multiple instructions. Therefore, we can define an instruction scope, or instruction context;
  • 2) Nesting instruction blocks is equivalent to nesting contexts. A context is represented as a structure, so nested instruction blocks can be implemented by structures that reference each other;
  • 3) An instruction or instruction block can only be parsed by a specific type of module. For example, all instructions in the configuration file context can only be parsed by core modules (NGX_CORE_MODULE); all instructions in the events instruction block can only be parsed by event modules (NGX_EVENT_MODULE); all instructions or instruction blocks in the http instruction block can only be parsed by HTTP modules (NGX_HTTP_MODULE);
  • 4) HTTP modules can parse instructions in the http, server and location instruction blocks. Therefore, the configuration structures of an HTTP module are divided into three types: main_conf, srv_conf and loc_conf, created by the functions create_main_conf, create_srv_conf and create_loc_conf respectively.

Referring to these four designs, we can simply draw the http configuration storage structure diagram:

 

This structure seems to be possible, but we have forgotten one thing: some instructions can appear in http instruction block, server instruction block and location instruction block at the same time.

That is, the type of an instruction in the http block can be NGX_HTTP_MAIN_CONF, or NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF, or NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF;

the type of an instruction in the server block can be NGX_HTTP_SRV_CONF, or NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF. (The bitwise OR indicates that the instruction belongs to several types at once.)

For example, the type of the instruction root is NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF. Its value should be stored in the loc_conf configuration structure, yet it may be configured in the http, server or location instruction block.
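The bitwise OR encoding can be checked with a single AND. The following sketch uses hypothetical DEMO_ constants (the numeric values are illustrative assumptions) and a hypothetical helper demo_allowed_in to show how one type field can say "allowed in several blocks".

```c
/* Illustrative bit values; treat them as assumptions. */
#define DEMO_HTTP_MAIN_CONF  0x02000000u
#define DEMO_HTTP_SRV_CONF   0x04000000u
#define DEMO_HTTP_LOC_CONF   0x08000000u

/* An instruction such as root may appear in all three blocks. */
#define DEMO_ROOT_TYPE \
    (DEMO_HTTP_MAIN_CONF | DEMO_HTTP_SRV_CONF | DEMO_HTTP_LOC_CONF)

/* An instruction such as server may only appear directly in the http block. */
#define DEMO_SERVER_TYPE  DEMO_HTTP_MAIN_CONF

/* Nonzero when the instruction type permits the block being parsed. */
int demo_allowed_in(unsigned cmd_type, unsigned current_block)
{
    return (cmd_type & current_block) != 0;
}
```

Here current_block plays the role of the "which block am I in" value that the parser carries around while reading instructions.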

To this end, we modify the above structure as follows:

  Notes: each void * points to one specific configuration structure, for example the one holding "listen 80" in srv_conf. The http context has three layers (main, srv, loc), the server context two (srv, loc), and the location context one (loc).

Above, we have analyzed the possible storage formats of all instructions inside the http instruction block. The internal storage format of the events instruction block is much simpler. Readers can try to draw a picture.

So is this the storage format adopted by nginx? It can be said that it is very similar to the figure above. The configuration storage format designed by nginx is shown in the figure below. Two questions are left here for the time being:

  • 1) How http_ctx nests srv_ctx, and how srv_ctx nests loc_ctx;
  • 2) When an instruction appears in http instruction block, server instruction block and location instruction block at the same time, which configuration shall prevail.

  Notes: how the nesting is implemented is not obvious yet; it becomes clear in the source code analysis.

Source code analysis

1.1 configuration analysis process

The entry function for parsing the configuration is ngx_conf_parse(ngx_conf_t *cf, ngx_str_t *filename). Its input parameter filename is the path of the configuration file; if it is NULL, an instruction block is being parsed.

So what is cf? Let's first look at its structure declaration:

struct ngx_conf_s {
    char                 *name; //Currently read instruction name
    ngx_array_t          *args; //Currently read instruction parameters
    ngx_cycle_t          *cycle; //Point to global cycle
    ngx_pool_t           *pool;  //Memory pool
    ngx_conf_file_t      *conf_file; //configuration file
    void                 *ctx;   //context
    ngx_uint_t            module_type; //Module type
    ngx_uint_t            cmd_type;   //Instruction type
    ngx_conf_handler_pt   handler; //Generally, it is NULL. It doesn't matter for the time being
};

Focus on these fields:

  • 1) name and args store the currently read instruction information;
  • 2) ctx is the instruction context mentioned above. Without ctx, there would be no way to find the final storage location of an instruction;
  • 3) module_type and cmd_type indicate the module type and instruction type respectively. When an instruction is read, the instruction arrays of all modules are traversed; these two fields filter out the modules and instructions that should not be considered.

The logic of ngx_conf_parse is relatively simple: it reads one complete instruction and calls ngx_conf_handler to process it.
The main logic of ngx_conf_handler is to traverse the modules whose type is cf->module_type and search each module's instruction array for an instruction with a matching name whose type matches cf->cmd_type. If none is found, it logs an error and returns an error; if one is found, it also verifies that the instruction parameters are valid; finally it calls the set function.
These steps are relatively simple. The difficulty is how to obtain the final storage location of the configuration from ctx.
The following code needs to be read together with the figure above. The configuration must be stored in a structure, so the corresponding structure has to be found through ctx.

if (cmd->type & NGX_DIRECT_CONF) {
    //cf->ctx can only be conf_ctx here; take the element at index directly, because it already points to a structure
    conf = ((void **) cf->ctx)[ngx_modules[i]->index];

} else if (cmd->type & NGX_MAIN_CONF) {
    //cf->ctx can only be conf_ctx here; take the address of the element at index, because the element is still NULL at this time
    conf = &(((void **) cf->ctx)[ngx_modules[i]->index]);

} else if (cf->ctx) {  //Here cf->ctx may be events_ctx, http_ctx, srv_ctx or loc_ctx

    //If cf->ctx is http_ctx, cmd->conf is the offset of the field main_conf, srv_conf or loc_conf in the structure ngx_http_conf_ctx_t
    confp = *(void **) ((char *) cf->ctx + cmd->conf);

    if (confp) {
        conf = confp[ngx_modules[i]->ctx_index]; //Likewise take the element at ctx_index; it must already point to a structure
    }
}

rv = cmd->set(cf, cmd, conf); //Call the set function; note that conf is passed in as the third argument

Notes: for NGX_DIRECT_CONF and NGX_MAIN_CONF instructions, cf->ctx is conf_ctx (the configuration file context); otherwise cf->ctx may be events_ctx, http_ctx, srv_ctx or loc_ctx.
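The three branches can be tried out in isolation. The following is a minimal sketch with stand-in types: demo_resolve_conf, demo_http_conf_ctx_t and the DEMO_ flag values are assumptions for illustration. It returns the value that would be passed as the third argument of the set function.

```c
#include <stddef.h>

/* Illustrative flag values; treat them as assumptions. */
#define DEMO_DIRECT_CONF  0x00010000u
#define DEMO_MAIN_CONF    0x01000000u

typedef struct {
    void  **main_conf;
    void  **srv_conf;
    void  **loc_conf;
} demo_http_conf_ctx_t;

/*
 * ctx       - current context (top-level array or a block context)
 * cmd_type  - type bits of the instruction
 * cmd_conf  - byte offset of the array pointer inside a block context
 * index     - module index among all modules
 * ctx_index - module index among modules of the same type
 */
void *demo_resolve_conf(void *ctx, unsigned cmd_type, size_t cmd_conf,
                        unsigned index, unsigned ctx_index)
{
    void **confp;

    if (cmd_type & DEMO_DIRECT_CONF) {
        /* slot already points to a structure: pass the structure */
        return ((void **) ctx)[index];
    }

    if (cmd_type & DEMO_MAIN_CONF) {
        /* slot may still be NULL: pass the slot's address so set can fill it */
        return &((void **) ctx)[index];
    }

    if (ctx != NULL) {
        /* block context: find the array by offset, then index by ctx_index */
        confp = *(void ***) ((char *) ctx + cmd_conf);
        if (confp != NULL) {
            return confp[ctx_index];
        }
    }

    return NULL;
}
```

Note that the DIRECT check comes first, so an instruction that carries both flags receives the structure itself, while a block instruction that carries only the MAIN flag receives the address of its slot.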

1.2 analysis of configuration file

The function ngx_init_cycle calls ngx_conf_parse to start parsing the configuration file.

To parse the configuration file, you first need to create the configuration file context and initialize the structure ngx_conf_t;

//Create the configuration file context and initialize the elements of the context array
cycle->conf_ctx = ngx_pcalloc(pool, ngx_max_module * sizeof(void *));//ngx_max_module is the total number of modules
//Traverse all core modules, call create_conf to create each configuration structure, and store it in the context array
for (i = 0; ngx_modules[i]; i++) {
    if (ngx_modules[i]->type != NGX_CORE_MODULE) {
        continue;
    }
    module = ngx_modules[i]->ctx;
    if (module->create_conf) {
        rv = module->create_conf(cycle);
        cycle->conf_ctx[ngx_modules[i]->index] = rv;
    }
}
//Initialize structure ngx_conf_t
conf.ctx = cycle->conf_ctx;
conf.module_type = NGX_CORE_MODULE;
conf.cmd_type = NGX_MAIN_CONF;

Readers can look up the code to see which core modules have create_conf method. After performing this step, you can draw the following figure:

  Combined with the ngx_conf_handler logic shown earlier, it is easy to see that the configuration instructions of the core module ngx_core_module are all marked with NGX_DIRECT_CONF, and that element 0 of the conf_ctx array points to its configuration structure ngx_core_conf_t.

if (cmd->type & NGX_DIRECT_CONF) {
    conf = ((void **) cf->ctx)[ngx_modules[i]->index];
}
rv = cmd->set(cf, cmd, conf);

Take the worker_processes instruction (which sets the number of worker processes) as an example. Its instruction structure is defined as follows:

{ ngx_string("worker_processes"),
  NGX_MAIN_CONF|NGX_DIRECT_CONF|NGX_CONF_TAKE1,
  ngx_set_worker_processes,
  0,
  0,
  NULL }

Note that the third input parameter of ngx_set_worker_processes already points to the structure ngx_core_conf_t, so it can be cast directly:

static char * ngx_set_worker_processes(ngx_conf_t *cf, ngx_command_t *cmd, void *conf){
    ngx_core_conf_t  *ccf;
    ccf = (ngx_core_conf_t *) conf;
    //... parse cf->args and fill in ccf (omitted) ...
    return NGX_CONF_OK;
}

1.3 parsing of events instruction block

The events instruction structure is defined in ngx_events_module (a core module), as follows:

{ ngx_string("events"),
  NGX_MAIN_CONF|NGX_CONF_BLOCK|NGX_CONF_NOARGS,
  ngx_events_block,
  0,
  0,
  NULL }

The events instruction is handled by ngx_events_block. From its type, we know that ngx_conf_handler takes the following branch when calling this function:

else if (cmd->type & NGX_MAIN_CONF) {
    conf = &(((void **) cf->ctx)[ngx_modules[i]->index]);  //At this time, cf->ctx is still conf_ctx
}

rv = cmd->set(cf, cmd, conf);

That is, the third input parameter of ngx_events_block is the address of the element at index in the conf_ctx array, and that element is still NULL.

The function ngx_events_block mainly does three things:
1) Create the events_ctx context;
2) Call the create_conf method of all event modules to create their configuration structures;
3) Modify cf->ctx (note that the configuration context changes when parsing the events block), cf->module_type and cf->cmd_type, and call ngx_conf_parse to parse the configuration in the events block.

Notes: the two key variables are the module type and the command type.
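The third step follows a save, switch, parse, restore pattern. A minimal sketch of that pattern, detached from nginx, might look like this; demo_conf_t, demo_enter_block and demo_record_ctx are hypothetical names, and the body callback stands in for the recursive call to ngx_conf_parse.

```c
typedef struct {
    void      *ctx;           /* current configuration context */
    unsigned   module_type;   /* which module type may parse instructions */
    unsigned   cmd_type;      /* which instruction type is allowed */
} demo_conf_t;

typedef void (*demo_block_body_pt)(demo_conf_t *cf, void *data);

/* Parse a nested block: switch context, run the body, restore the context. */
void demo_enter_block(demo_conf_t *cf, void *block_ctx, unsigned module_type,
                      unsigned cmd_type, demo_block_body_pt body, void *data)
{
    demo_conf_t saved = *cf;          /* save the caller's parser state */

    cf->ctx = block_ctx;
    cf->module_type = module_type;
    cf->cmd_type = cmd_type;

    body(cf, data);                   /* stands in for the recursive parse */

    *cf = saved;                      /* restore the caller's parser state */
}

/* Example body: record which context was active while inside the block. */
void demo_record_ctx(demo_conf_t *cf, void *data)
{
    *(void **) data = cf->ctx;
}
```

Copying the whole structure by value keeps the restore step trivial: whatever the nested parse changed is undone by one assignment.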

static char *
ngx_events_block(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    //Create the configuration context events_ctx, which is just a single void * pointer
    ctx = ngx_pcalloc(cf->pool, sizeof(void *));
    //Array pointing to the configuration structures created by all event modules; ngx_event_max_module is the number of event modules
    *ctx = ngx_pcalloc(cf->pool, ngx_event_max_module * sizeof(void *));
    //conf is the address of an element of the conf_ctx array; make that element point to the configuration context events_ctx
    *(void **) conf = ctx;
    //Traverse all event modules to create a configuration structure
    for (i = 0; ngx_modules[i]; i++) {
        if (ngx_modules[i]->type != NGX_EVENT_MODULE) {
            continue;
        }
        m = ngx_modules[i]->ctx; 
        if (m->create_conf) {
            (*ctx)[ngx_modules[i]->ctx_index] = m->create_conf(cf->cycle);
        }
    } 
    //Modify cf's configuration context, module type and instruction type; pcf temporarily saves the original cf
    pcf = *cf;
    cf->ctx = ctx;
    cf->module_type = NGX_EVENT_MODULE;
    cf->cmd_type = NGX_EVENT_CONF;
    //Parsing the configuration in the events block
    rv = ngx_conf_parse(cf, NULL); 
    //Restore cf
    *cf = pcf;
}

On Linux, when nginx is compiled with the default options, the event modules are usually only ngx_event_core_module and ngx_epoll_module, and both have a create_conf method. After executing the above code, you can draw the following configuration storage structure diagram:
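The resulting layout is a chain of pointers: an element of the top-level array points to events_ctx, events_ctx points to a per-event-module array, and each element of that array points to a configuration structure. A minimal sketch of this chain, using hypothetical names (demo_events_block, demo_get_event_conf, DEMO_EVENT_MAX_MODULE) and plain calloc instead of a memory pool:

```c
#include <stdlib.h>

#define DEMO_EVENT_MAX_MODULE 2   /* hypothetical number of event modules */

/* slot is the address of one element of the top-level context array. */
void ***demo_events_block(void **slot)
{
    void ***ctx = calloc(1, sizeof(void **));

    if (ctx == NULL) {
        return NULL;
    }

    *ctx = calloc(DEMO_EVENT_MAX_MODULE, sizeof(void *));
    if (*ctx == NULL) {
        free(ctx);
        return NULL;
    }

    *slot = ctx;    /* the top-level element now points to events_ctx */
    return ctx;
}

/* Follow the chain: top-level array -> events_ctx -> array -> structure. */
void *demo_get_event_conf(void **conf_ctx, unsigned events_index,
                          unsigned ctx_index)
{
    void ***ctx = conf_ctx[events_index];

    if (ctx == NULL) {
        return NULL;
    }
    return (*ctx)[ctx_index];
}
```

The extra level of indirection is what allows the top-level slot to be filled lazily, only when an events block is actually encountered.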

Take the connections instruction in ngx_event_core_module (which sets the number of connections in the connection pool) as an example. Its structure is defined as follows:

{ ngx_string("connections"),
  NGX_EVENT_CONF|NGX_CONF_TAKE1,
  ngx_event_connections,
  0,
  0,
  NULL }

The connections instruction is handled by ngx_event_connections. From its type, we know that ngx_conf_handler takes the following branch when calling this function:

else if (cf->ctx) {  //Here cf->ctx is events_ctx

    //confp is the first address of the array
    confp = *(void **) ((char *) cf->ctx + cmd->conf);

    if (confp) {
        conf = confp[ngx_modules[i]->ctx_index];  //Get the array element
    }
}

rv = cmd->set(cf, cmd, conf); //The ctx_index of ngx_event_core_module is 0, so conf points to the structure ngx_event_conf_t

The implementation of ngx_event_connections is relatively simple: it only needs to assign the corresponding field of the structure ngx_event_conf_t. Note that the input parameter conf points to ngx_event_conf_t, so it can be cast directly.

static char *
ngx_event_connections(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    ngx_event_conf_t  *ecf = conf;
}

1.4 parsing of HTTP instruction block

The sections above covered the parsing of the events instruction block. The parsing of the http, server and location instruction blocks is very similar.

The http instruction structure is defined in ngx_http_module (a core module), as follows:

{ ngx_string("http"),
  NGX_MAIN_CONF|NGX_CONF_BLOCK|NGX_CONF_NOARGS,
  ngx_http_block,
  0,
  0,
  NULL }

The http instruction is handled by ngx_http_block. From its type, we know that ngx_conf_handler takes the following branch when calling this function:

else if (cmd->type & NGX_MAIN_CONF) {
    conf = &(((void **) cf->ctx)[ngx_modules[i]->index]);    //At this time, cf->ctx is still conf_ctx
}

rv = cmd->set(cf, cmd, conf);

That is, the third input parameter of ngx_http_block is the address of the element at index in the conf_ctx array, and that element is still NULL.

The function ngx_http_block mainly does three things:
1) Create the http_ctx context;
2) Call the create_main_conf, create_srv_conf and create_loc_conf methods of all HTTP modules to create their configuration structures;
3) Modify cf->ctx (note that the configuration context changes when parsing the http block), cf->module_type and cf->cmd_type, and call ngx_conf_parse to parse the configuration in the http block.
Notes: these are the same three steps used for the events block.

static char * ngx_http_block(ngx_conf_t *cf, ngx_command_t *cmd, void *conf){
    //1 Create the http_ctx configuration context
    ctx = ngx_pcalloc(cf->pool, sizeof(ngx_http_conf_ctx_t));
    //conf is the address of an element of the conf_ctx array; make that element point to the http_ctx configuration context
    *(ngx_http_conf_ctx_t **) conf = ctx;
    //Initialize the main_conf, srv_conf and loc_conf arrays; ngx_http_max_module is the number of HTTP modules
    ctx->main_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);
    ctx->srv_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);
    ctx->loc_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);
    //2 Call the create_main_conf, create_srv_conf and create_loc_conf methods of all HTTP modules to create the corresponding configuration structures
    for (m = 0; ngx_modules[m]; m++) {
        //Skip modules that are not HTTP modules
        if (ngx_modules[m]->type != NGX_HTTP_MODULE) {
            continue;
        }

        module = ngx_modules[m]->ctx;
        mi = ngx_modules[m]->ctx_index;

        if (module->create_main_conf) {
            ctx->main_conf[mi] = module->create_main_conf(cf);
        }

        if (module->create_srv_conf) {
            ctx->srv_conf[mi] = module->create_srv_conf(cf);
        }

        if (module->create_loc_conf) {
            ctx->loc_conf[mi] = module->create_loc_conf(cf);
        }
    }

    //3 Modify cf's configuration context, module type and instruction type; pcf temporarily saves the original cf
    pcf = *cf;
    cf->ctx = ctx;
    cf->module_type = NGX_HTTP_MODULE;
    cf->cmd_type = NGX_HTTP_MAIN_CONF;
    //4 Parse the configuration in the http block
    rv = ngx_conf_parse(cf, NULL);
    //5 Restore cf
    *cf = pcf;
}

After executing the above code, you can draw the following configuration storage structure diagram:

The http_ctx configuration context is of type ngx_http_conf_ctx_t. It has only three fields, main_conf, srv_conf and loc_conf; each points to an array, and each element of the array points to the corresponding configuration structure.

For example, ngx_http_core_module is the first HTTP module, and the configuration structure created by its create_main_conf method is ngx_http_core_main_conf_t.
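With this layout, finding a module's configuration is a matter of choosing one of the three arrays and indexing it with the module's ctx_index. The sketch below uses hypothetical demo_ macros and types; the same indexing idea appears to underlie lookups such as the ngx_http_get_module_loc_conf call shown at the beginning of the article.

```c
typedef struct {
    unsigned  ctx_index;    /* position among modules of the same type */
} demo_module_t;

typedef struct {
    void  **main_conf;
    void  **srv_conf;
    void  **loc_conf;
} demo_http_conf_ctx_t;

/* Pick one module's structure out of each array by its ctx_index. */
#define demo_get_module_main_conf(ctx, module) \
    ((ctx)->main_conf[(module).ctx_index])

#define demo_get_module_srv_conf(ctx, module) \
    ((ctx)->srv_conf[(module).ctx_index])

#define demo_get_module_loc_conf(ctx, module) \
    ((ctx)->loc_conf[(module).ctx_index])
```

Each macro expands to a plain array access, so a lookup costs the same no matter how many modules are compiled in.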

Take the keepalive_timeout instruction of ngx_http_core_module as an example (it can appear in the location, server and http blocks; suppose it is added in the http block). Its instruction structure is defined as follows:

{ ngx_string("keepalive_timeout"),
  NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE12,
  ngx_http_core_keepalive,
  NGX_HTTP_LOC_CONF_OFFSET,
  0,
  NULL }
 
#define NGX_HTTP_LOC_CONF_OFFSET   offsetof(ngx_http_conf_ctx_t, loc_conf)

You can see that the fourth field of the instruction structure is not 0 but the offset of the loc_conf field in the structure ngx_http_conf_ctx_t.

The keepalive_timeout instruction is handled by ngx_http_core_keepalive. From its type, we know that ngx_conf_handler takes the following branch when calling this function:

else if (cf->ctx) {  //Here cf->ctx is http_ctx

    //cmd->conf is the offset of the loc_conf field in the structure ngx_http_conf_ctx_t; confp is the first address of the loc_conf array
    confp = *(void **) ((char *) cf->ctx + cmd->conf);

    if (confp) {
        conf = confp[ngx_modules[i]->ctx_index];  //Get the array element
    }
}

rv = cmd->set(cf, cmd, conf); //The ctx_index of ngx_http_core_module is 0, so conf points to the structure ngx_http_core_loc_conf_t

The implementation of ngx_http_core_keepalive is relatively simple and is not detailed here.

1.5 parsing of server instruction block

The server instruction structure is defined in ngx_http_core_module, as follows:

{ ngx_string("server"),
  NGX_HTTP_MAIN_CONF|NGX_CONF_BLOCK|NGX_CONF_NOARGS,
  ngx_http_core_server,
  0,
  0,
  NULL }

The server instruction is handled by ngx_http_core_server. From its type, we know that ngx_conf_handler takes the following branch when calling this function:

else if (cf->ctx) {  //Here cf->ctx is http_ctx

    //cmd->conf is 0; confp is the first address of the main_conf array
    confp = *(void **) ((char *) cf->ctx + cmd->conf);

    if (confp) {
        conf = confp[ngx_modules[i]->ctx_index];  //Get the array element
    }
}

rv = cmd->set(cf, cmd, conf); //The ctx_index of ngx_http_core_module is 0, so conf points to the structure ngx_http_core_main_conf_t

The function ngx_http_core_server mainly does four things:
1) Create the srv_ctx context;
2) Call the create_srv_conf and create_loc_conf methods of all HTTP modules to create their configuration structures;
3) Add the srv_ctx context to the http_ctx configuration context;
4) Modify cf->ctx (note that the configuration context changes when parsing the server block) and cf->cmd_type, while cf->module_type stays NGX_HTTP_MODULE, and call ngx_conf_parse to parse the configuration in the server block.

static char * ngx_http_core_server(ngx_conf_t *cf, ngx_command_t *cmd, void *dummy){
    //1 Create the srv_ctx configuration context
    ctx = ngx_pcalloc(cf->pool, sizeof(ngx_http_conf_ctx_t));

    //cf->ctx is the http_ctx configuration context
    http_ctx = cf->ctx;

    //2 main_conf is shared with http_ctx (a server block cannot contain instructions that are only NGX_HTTP_MAIN_CONF, so no new main_conf array is needed)
    ctx->main_conf = http_ctx->main_conf;
    ctx->srv_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);
    ctx->loc_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);

    //3 Traverse all HTTP modules and call their create_srv_conf and create_loc_conf methods to create the corresponding configuration structures
    for (i = 0; ngx_modules[i]; i++) {
        //Skip modules that are not HTTP modules
        if (ngx_modules[i]->type != NGX_HTTP_MODULE) {
            continue;
        }

        module = ngx_modules[i]->ctx;

        if (module->create_srv_conf) {
            mconf = module->create_srv_conf(cf);
            ctx->srv_conf[ngx_modules[i]->ctx_index] = mconf;
        }

        if (module->create_loc_conf) {
            mconf = module->create_loc_conf(cf);
            ctx->loc_conf[ngx_modules[i]->ctx_index] = mconf;
        }
    }

    //4 Add the srv_ctx context to the http_ctx configuration context; this code is hard to follow, so refer to the diagram below
    //ngx_http_core_module is the first HTTP module. Get the srv_conf-type configuration structure it created, ngx_http_core_srv_conf_t, and point its ctx field to the srv_ctx configuration context
    cscf = ctx->srv_conf[ngx_http_core_module.ctx_index];
    cscf->ctx = ctx;

    //main_conf is the array shared with the http_ctx context; get the main_conf-type configuration structure created there, ngx_http_core_main_conf_t,
    //and append the ngx_http_core_srv_conf_t structure of the srv_ctx context to its servers array
    cmcf = ctx->main_conf[ngx_http_core_module.ctx_index];
    cscfp = ngx_array_push(&cmcf->servers);
    *cscfp = cscf;

    //5 Modify cf's configuration context and instruction type; pcf temporarily saves the original cf
    pcf = *cf;
    cf->ctx = ctx;
    cf->cmd_type = NGX_HTTP_SRV_CONF;

    //Parse the configuration in the server block; note that the configuration context is now srv_ctx
    rv = ngx_conf_parse(cf, NULL);

    //Restore cf
    *cf = pcf;
}

After executing the above code, you can draw the following configuration storage structure diagram. Only the http_ctx and srv_ctx configuration contexts are drawn here:

Note the red arrows in the figure above. By following them, the srv_ctx configuration context can be reached from the http_ctx configuration context. The storage structure may look complex, and it becomes more complex once location instruction blocks are parsed. However, this is only the storage structure used during parsing; it is optimized afterwards, and lookups are not performed on this structure. Parsing the instructions inside the server block is relatively simple and is not described in detail here.
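The two links (the core srv_conf structure pointing back to its srv_ctx, and the core main_conf structure collecting every server) can be reproduced in a small stand-alone sketch. All names here are hypothetical, calloc replaces the memory pool, and a fixed-size array replaces the dynamic servers array to keep the example short.

```c
#include <stdlib.h>

#define DEMO_HTTP_MAX_MODULE  2   /* hypothetical number of HTTP modules */
#define DEMO_CORE_CTX_INDEX   0   /* the HTTP core module comes first */
#define DEMO_MAX_SERVERS      8   /* fixed capacity keeps the sketch short */

typedef struct {
    void  **main_conf;
    void  **srv_conf;
    void  **loc_conf;
} demo_http_conf_ctx_t;

typedef struct {
    demo_http_conf_ctx_t  *ctx;   /* back-pointer to the owning srv_ctx */
} demo_core_srv_conf_t;

typedef struct {
    demo_core_srv_conf_t  *servers[DEMO_MAX_SERVERS];
    size_t                 nservers;
} demo_core_main_conf_t;

/* Create a srv_ctx for one server block and register it under http_ctx. */
demo_http_conf_ctx_t *demo_add_server(demo_http_conf_ctx_t *http_ctx)
{
    demo_core_main_conf_t *cmcf = http_ctx->main_conf[DEMO_CORE_CTX_INDEX];
    demo_http_conf_ctx_t  *ctx;
    demo_core_srv_conf_t  *cscf;

    if (cmcf->nservers == DEMO_MAX_SERVERS) {
        return NULL;
    }

    ctx  = calloc(1, sizeof(*ctx));
    cscf = calloc(1, sizeof(*cscf));
    if (ctx == NULL || cscf == NULL) {
        goto failed;
    }

    ctx->main_conf = http_ctx->main_conf;   /* shared with the http block */
    ctx->srv_conf  = calloc(DEMO_HTTP_MAX_MODULE, sizeof(void *));
    ctx->loc_conf  = calloc(DEMO_HTTP_MAX_MODULE, sizeof(void *));
    if (ctx->srv_conf == NULL || ctx->loc_conf == NULL) {
        goto failed;
    }

    ctx->srv_conf[DEMO_CORE_CTX_INDEX] = cscf;
    cscf->ctx = ctx;                          /* srv_conf -> srv_ctx */
    cmcf->servers[cmcf->nservers++] = cscf;   /* main_conf -> srv_conf */
    return ctx;

failed:
    if (ctx != NULL) {
        free(ctx->srv_conf);
        free(ctx->loc_conf);
    }
    free(ctx);
    free(cscf);
    return NULL;
}
```

Starting from the http-level context, every server can then be reached as servers[n]->ctx, which is the path the red arrows describe.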

1.6 parsing of location instruction block

The location instruction structure is defined in ngx_http_core_module, as follows:

{ ngx_string("location"),
  NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_BLOCK|NGX_CONF_TAKE12,
  ngx_http_core_location,
  NGX_HTTP_SRV_CONF_OFFSET,
  0,
  NULL }
 
#define NGX_HTTP_SRV_CONF_OFFSET   offsetof(ngx_http_conf_ctx_t, srv_conf)

It can be seen that the location instruction can appear in the server instruction block and the location instruction block (that is, the location itself can be nested); Location configuration can consist of one or two parameters; Note that the fourth parameter of the instruction structure is SRV instead of 0_ The conf field is in the structure NGX_ http_ conf_ ctx_ Offset in t; The instruction processing function is ngx_http_core_location.
According to its type, you can know in NGX_ conf_ When calling this function, the handler takes the following branches:

else if (cf->ctx) {  //In this case, cf->ctx is srv_ctx

    //cmd->conf is the offset of the srv_conf field within the structure ngx_http_conf_ctx_t; confp is the first address of the srv_conf array
    confp = *(void **) ((char *) cf->ctx + cmd->conf);

    if (confp) {
        conf = confp[ngx_modules[i]->ctx_index];  //Get the array element
    }
}

rv = cmd->set(cf, cmd, conf); //The ctx_index of ngx_http_core_module is 0, so conf points to the structure ngx_http_core_srv_conf_t
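The offset-based lookup in this branch can be tried out in isolation. The following is a minimal sketch, not nginx code: demo_conf_ctx_t, the DEMO_* macros and demo_get_conf are hypothetical names that only mimic the shape of ngx_http_conf_ctx_t and the confp lookup shown above.

```c
#include <stddef.h>

/* Hypothetical stand-in for ngx_http_conf_ctx_t: three arrays of per-module pointers. */
typedef struct {
    void **main_conf;
    void **srv_conf;
    void **loc_conf;
} demo_conf_ctx_t;

/* Byte offsets of the three arrays, in the spirit of NGX_HTTP_SRV_CONF_OFFSET. */
#define DEMO_MAIN_CONF_OFFSET offsetof(demo_conf_ctx_t, main_conf)
#define DEMO_SRV_CONF_OFFSET  offsetof(demo_conf_ctx_t, srv_conf)
#define DEMO_LOC_CONF_OFFSET  offsetof(demo_conf_ctx_t, loc_conf)

/* Select one of the three arrays by byte offset, then index it by the
 * module's position (ctx_index). Returns NULL if the array is absent. */
static void *demo_get_conf(void *ctx, size_t conf_offset, size_t ctx_index)
{
    void **confp = *(void ***) ((char *) ctx + conf_offset);

    return confp ? confp[ctx_index] : NULL;
}
```

Because the instruction only stores an offset, the same generic lookup works for every instruction: which array is used depends purely on the offset recorded in the instruction structure.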

The function ngx_http_core_location mainly handles four things:
1) Create the loc_ctx context;
2) Call the create_loc_conf method of all http modules to create the configuration structures;
3) Add the loc_ctx context to the srv_ctx configuration context;
4) Modify cf->ctx (note that the configuration context changes when a location block is parsed), cf->module_type and cf->cmd_type, and call the ngx_conf_parse function to parse the configuration in the location block.

static char * ngx_http_core_location(ngx_conf_t *cf, ngx_command_t *cmd, void *dummy){
    //Create the loc_ctx context
    ctx = ngx_pcalloc(cf->pool, sizeof(ngx_http_conf_ctx_t));
    //cf->ctx points to the srv_ctx context
    pctx = cf->ctx;
    //main_conf and srv_conf are shared with the srv_ctx context;
    //(NGX_HTTP_MAIN_CONF and NGX_HTTP_SRV_CONF configurations cannot appear in a location block, so separate main_conf and srv_conf arrays are not required.)
    ctx->main_conf = pctx->main_conf;
    ctx->srv_conf = pctx->srv_conf;
    ctx->loc_conf = ngx_pcalloc(cf->pool, sizeof(void *) * ngx_http_max_module);
 
    //Traverse all http modules and call their create_loc_conf method to create the corresponding configuration structure
    for (i = 0; ngx_modules[i]; i++) {
        if (ngx_modules[i]->type != NGX_HTTP_MODULE) {
            continue;
        }
 
        module = ngx_modules[i]->ctx;
 
        if (module->create_loc_conf) {
            ctx->loc_conf[ngx_modules[i]->ctx_index] = module->create_loc_conf(cf);
        }
    }
 
    //ngx_http_core_module is the first http module; get the first element of the loc_conf array of the loc_ctx configuration context, i.e. the ngx_http_core_loc_conf_t structure
    //The loc_conf field of this structure points to the first address of the loc_conf array of the loc_ctx configuration context
    clcf = ctx->loc_conf[ngx_http_core_module.ctx_index];
    clcf->loc_conf = ctx->loc_conf;

    //Get the first element of the loc_conf array of the srv_ctx configuration context, i.e. the ngx_http_core_loc_conf_t structure
    pclcf = pctx->loc_conf[ngx_http_core_module.ctx_index];

    //Add the ngx_http_core_loc_conf_t structure of the loc_ctx configuration context to the locations field of the ngx_http_core_loc_conf_t structure of the srv_ctx configuration context
    //locations is a two-way linked list. Its structure is quite interesting; interested readers can study it
    if (ngx_http_add_location(cf, &pclcf->locations, clcf) != NGX_OK) {
        return NGX_CONF_ERROR;
    }
}

After executing the above code, you can draw the following configuration storage structure diagram. Only the srv_ctx and loc_ctx configuration contexts are drawn here:

 

Note the red arrow in the figure above. Following the red arrow, the loc_ctx configuration context can be found from the srv_ctx configuration context. Strictly speaking, this statement is not rigorous: from the srv_ctx configuration context, only the loc_conf array of the loc_ctx configuration context can be found.

The reason is that all configurations are actually stored in the main_conf, srv_conf and loc_conf arrays.
The main_conf and srv_conf arrays of the loc_ctx configuration context hold no configuration of their own.
So only the loc_conf array of the loc_ctx configuration context is needed.
One problem remains: the parsing of the location parameter, which is also the focus of our attention and is described in section 1.8. The parsing of the configuration inside the location instruction block is relatively simple and is not described in detail here.
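The way a location context shares main_conf and srv_conf with its parent while owning a fresh loc_conf array can be modeled in a few lines. This is only a sketch under assumptions: it uses calloc instead of the nginx memory pool, and demo_conf_ctx_t, DEMO_MAX_MODULE, demo_new_loc_ctx and demo_free_loc_ctx are hypothetical names.

```c
#include <stdlib.h>

#define DEMO_MAX_MODULE 4   /* hypothetical number of http modules */

/* Hypothetical stand-in for ngx_http_conf_ctx_t. */
typedef struct {
    void **main_conf;
    void **srv_conf;
    void **loc_conf;
} demo_conf_ctx_t;

/* Build the context for a location block: main_conf and srv_conf are
 * shared with the parent, only loc_conf is newly allocated (zero-filled).
 * Returns NULL on allocation failure. */
static demo_conf_ctx_t *demo_new_loc_ctx(const demo_conf_ctx_t *parent)
{
    demo_conf_ctx_t *ctx = calloc(1, sizeof(*ctx));

    if (ctx == NULL) {
        return NULL;
    }

    ctx->main_conf = parent->main_conf;
    ctx->srv_conf = parent->srv_conf;
    ctx->loc_conf = calloc(DEMO_MAX_MODULE, sizeof(void *));

    if (ctx->loc_conf == NULL) {
        free(ctx);
        return NULL;
    }

    return ctx;
}

/* Release only what demo_new_loc_ctx allocated; the shared arrays belong to the parent. */
static void demo_free_loc_ctx(demo_conf_ctx_t *ctx)
{
    if (ctx != NULL) {
        free(ctx->loc_conf);
        free(ctx);
    }
}
```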

1.7 configuration consolidation

At this step, the configuration file has been parsed, but the http related storage structure is too complex.

There is also another problem:
Both the http_ctx and srv_ctx configuration contexts have a srv_conf array, and both store configuration of type NGX_HTTP_SRV_CONF.
Likewise, the http_ctx, srv_ctx and loc_ctx configuration contexts all have a loc_conf array, and all store configuration of type NGX_HTTP_LOC_CONF.
When the same configuration appears in multiple configuration contexts at the same time, which one prevails?

Recall the introduction of the nginx module in Section 1.1. Most http modules have the two methods merge_srv_conf and merge_loc_conf, which are used to merge the same configuration from different configuration contexts.

The configuration merge here is actually the merging of two srv_conf arrays or of two loc_conf arrays.

The function ngx_http_block performs the merge operation after parsing all the configuration inside the http block.

//ctx here is the http_ctx configuration context. If this is unclear, refer to the schematic diagram above.
cmcf = ctx->main_conf[ngx_http_core_module.ctx_index];
cscfp = cmcf->servers.elts;
//Traverse all http modules (in effect, traverse each element of the srv_conf and loc_conf arrays being merged)
for (m = 0; ngx_modules[m]; m++) {
    //Only http modules have an ngx_http_module_t context
    if (ngx_modules[m]->type != NGX_HTTP_MODULE) {
        continue;
    }
    module = ngx_modules[m]->ctx;
    mi = ngx_modules[m]->ctx_index;
    //init_main_conf sets default values: configuration items that were not assigned need a default value
    if (module->init_main_conf) {
        rv = module->init_main_conf(cf, ctx->main_conf[mi]);
    }
    //merge
    rv = ngx_http_merge_servers(cf, cmcf, module, mi);
}

The merge operation is implemented by the function ngx_http_merge_servers:

static char * ngx_http_merge_servers(ngx_conf_t *cf, ngx_http_core_main_conf_t *cmcf,
    ngx_http_module_t *module, ngx_uint_t ctx_index) {
 
    //Array of ngx_http_core_srv_conf_t pointers
    cscfp = cmcf->servers.elts;
    //1 cf->ctx points to the http_ctx configuration context
    ctx = (ngx_http_conf_ctx_t *) cf->ctx;
    saved = *ctx;

    //2 Traverse all ngx_http_core_srv_conf_t structures (one per server configuration)
    for (s = 0; s < cmcf->servers.nelts; s++) {
        //Through ngx_http_core_srv_conf_t, the srv_conf array of each srv_ctx configuration context can be found
        ctx->srv_conf = cscfp[s]->ctx->srv_conf;
        //Merge the configuration in the srv_conf array of the http_ctx configuration context into the srv_conf array of the srv_ctx configuration context
        if (module->merge_srv_conf) {
            rv = module->merge_srv_conf(cf, saved.srv_conf[ctx_index],cscfp[s]->ctx->srv_conf[ctx_index]);
        }

        if (module->merge_loc_conf) {
            //Through ngx_http_core_srv_conf_t, the loc_conf array of each srv_ctx configuration context can be found
            ctx->loc_conf = cscfp[s]->ctx->loc_conf;
            //Merge the configuration in the loc_conf array of the http_ctx configuration context into the loc_conf array of the srv_ctx configuration context
            rv = module->merge_loc_conf(cf, saved.loc_conf[ctx_index],
                                        cscfp[s]->ctx->loc_conf[ctx_index]);
            //Merge the configuration in the loc_conf array of the srv_ctx configuration context into the loc_conf array of each loc_ctx configuration context
            clcf = cscfp[s]->ctx->loc_conf[ngx_http_core_module.ctx_index];
            rv = ngx_http_merge_locations(cf, clcf->locations, cscfp[s]->ctx->loc_conf, module, ctx_index);
        }
    }
}
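What a single merge_loc_conf callback typically does can be illustrated with a small model. This is a sketch under assumptions, not the code of any real module: demo_loc_conf_t and its two fields are invented for illustration, and DEMO_CONF_UNSET plays the role that the NGX_CONF_UNSET marker plays in nginx. The rule shown is that a value set in the inner block wins, an unset value is inherited from the outer block, and a default applies when neither is set.

```c
#define DEMO_CONF_UNSET (-1)   /* "not configured" marker, modeled on NGX_CONF_UNSET */

/* Hypothetical per-module location configuration with two integer settings. */
typedef struct {
    int enable;
    int timeout;
} demo_loc_conf_t;

/* Keep the child's value if it was set; otherwise inherit the parent's
 * value; if the parent is unset as well, fall back to the default. */
static int demo_merge_value(int child, int parent, int dflt)
{
    if (child != DEMO_CONF_UNSET) {
        return child;
    }

    return (parent != DEMO_CONF_UNSET) ? parent : dflt;
}

/* Plays the role of a module's merge_loc_conf(cf, parent, child) callback:
 * the result is written into the child (inner) configuration. */
static void demo_merge_loc_conf(const demo_loc_conf_t *parent, demo_loc_conf_t *child)
{
    child->enable = demo_merge_value(child->enable, parent->enable, 0);
    child->timeout = demo_merge_value(child->timeout, parent->timeout, 60);
}
```

Calling the merge first for http into server and then for server into location reproduces the top-down order used above, so a location ends up with a fully resolved value for every setting.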

The implementation of the function ngx_http_merge_locations is basically similar to that of ngx_http_merge_servers and is not detailed here. The merging diagram is as follows:

The final http related configuration is stored in the main_conf array of the single http_ctx configuration context, the srv_conf arrays of the multiple srv_ctx configuration contexts, and the loc_conf arrays of the multiple loc_ctx configuration contexts; this is the shaded part in the figure.

For the reference relationships between http_ctx, srv_ctx and loc_ctx, follow the red arrows.

The remaining question is how to find the srv_conf array of the right srv_ctx configuration context and the loc_conf array of the right loc_ctx configuration context among the many; this is described in Section 3.

1.8 location configuration optimization

The syntax rule of location configuration is: location [=|~|~*|^~] /uri/ {...}.
location configuration can be simply divided into three types: exact matching, maximum prefix matching and regular matching.

The classification rules are as follows:
1) Those starting with "=" are exact matches;
2) Those starting with "~" and "~*" are case-sensitive regular matching and case-insensitive regular matching respectively;
3) Those starting with "^~" are maximum prefix matching;
4) Those with only the /uri parameter are also maximum prefix matching.
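The four classification rules can be written down as a small function. This is an illustrative sketch only: the enum and demo_classify_location are hypothetical names, and the function assumes the modifier has already been separated from the uri as its own token, which is a simplification of how nginx reads the location parameters.

```c
#include <string.h>

/* Hypothetical labels for the location types listed above. */
typedef enum {
    DEMO_LOC_EXACT,           /* "="  : type 1 */
    DEMO_LOC_REGEX,           /* "~"  : type 2, case sensitive */
    DEMO_LOC_REGEX_NOCASE,    /* "~*" : type 2, case insensitive */
    DEMO_LOC_PREFIX_NOREGEX,  /* "^~" : type 3 */
    DEMO_LOC_PREFIX,          /* no modifier: type 4 */
    DEMO_LOC_INVALID          /* unknown modifier */
} demo_loc_kind_t;

/* Classify a location by its modifier token; NULL or "" means the
 * location was written with only a uri parameter. */
static demo_loc_kind_t demo_classify_location(const char *modifier)
{
    if (modifier == NULL || modifier[0] == '\0') {
        return DEMO_LOC_PREFIX;
    }
    if (strcmp(modifier, "=") == 0) {
        return DEMO_LOC_EXACT;
    }
    if (strcmp(modifier, "~") == 0) {
        return DEMO_LOC_REGEX;
    }
    if (strcmp(modifier, "~*") == 0) {
        return DEMO_LOC_REGEX_NOCASE;
    }
    if (strcmp(modifier, "^~") == 0) {
        return DEMO_LOC_PREFIX_NOREGEX;
    }

    return DEMO_LOC_INVALID;
}
```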

You can see that both type 3 and type 4 are maximum prefix matches. What is the difference between the two? This becomes clear when looking for a matching location.

So when we configure multiple locations and the request uri satisfies the matching rules of several of them, which configuration should be chosen? Different location types have different matching priorities.

Let's first look at the classification of location configuration. Obviously, it can be classified according to the first character. The parameters and types of location configuration are stored in the following fields of ngx_http_core_loc_conf_t:

struct ngx_http_core_loc_conf_s {
    ngx_str_t     name;   //Name, which is the uri parameter of the location configuration
    ngx_http_regex_t  *regex;  //Compiled regular expression; identifies type 2

    unsigned      exact_match:1;  //Identifies a location configuration starting with "=", type 1
    unsigned      noregex:1;   //Used when looking for a matching location: once this location matches, no attempt is made to match locations of regular type; type 3 has this flag

    ngx_http_location_tree_node_t   *static_locations; //As the name suggests, this is a tree (storing location configurations of types 1, 3 and 4)
    ngx_http_core_loc_conf_t       **regex_locations;   //Stores all regular type locations
};

As mentioned when parsing the location instruction block in Section 1.6, the first element of the loc_conf array of the srv_ctx context points to an ngx_http_core_loc_conf_t structure, whose locations field is a two-way linked list that stores all locations configured within the current server instruction block.

The two-way linked list node is defined as follows:

typedef struct {
    ngx_queue_t                      queue; //Common head shared by all two-way linked list nodes; it maintains the prev and next pointers
    ngx_http_core_loc_conf_t        *exact; //location configurations of types 1 and 2 are stored in this field of the linked list node
    ngx_http_core_loc_conf_t        *inclusive; //location configurations of types 3 and 4 are stored in this field of the linked list node
} ngx_http_location_queue_t;
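Placing the list head as the first member is what allows a pointer to the embedded node to be cast back to the enclosing structure. The following is a minimal sketch of such an intrusive two-way linked list; demo_queue_t, demo_location_t and the helper functions are hypothetical names modeled on the idea behind ngx_queue_t, not the nginx implementation itself.

```c
#include <stddef.h>

/* Minimal intrusive two-way linked list node. */
typedef struct demo_queue_s demo_queue_t;
struct demo_queue_s {
    demo_queue_t *prev;
    demo_queue_t *next;
};

/* Hypothetical list element: the queue node is the first member, so a
 * demo_queue_t * that points into an element can be cast to demo_location_t *. */
typedef struct {
    demo_queue_t queue;
    const char  *name;
} demo_location_t;

/* An empty list is a sentinel node that points to itself. */
static void demo_queue_init(demo_queue_t *h)
{
    h->prev = h;
    h->next = h;
}

/* Link x in just before the sentinel, i.e. at the tail of the list. */
static void demo_queue_insert_tail(demo_queue_t *h, demo_queue_t *x)
{
    x->prev = h->prev;
    x->next = h;
    h->prev->next = x;
    h->prev = x;
}

/* Count the elements by walking from the sentinel until it is reached again. */
static size_t demo_queue_count(const demo_queue_t *h)
{
    size_t n = 0;
    const demo_queue_t *q;

    for (q = h->next; q != h; q = q->next) {
        n++;
    }

    return n;
}
```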

Locations have been marked by type and stored in a two-way linked list. In order to achieve efficient priority lookup of locations, it is necessary to sort the location configurations and form multiple locations into a tree at the same time.

These operations are performed by the function ngx_http_block after all configurations within the http block have been parsed.

static char * ngx_http_block(ngx_conf_t *cf, ngx_command_t *cmd, void *conf){
     
    //The pointer relationships are intricate; refer to the red arrows in the diagram above.

    //1 ctx points to the http_ctx configuration context
    cmcf = ctx->main_conf[ngx_http_core_module.ctx_index];
    cscfp = cmcf->servers.elts;

    //2 Traverse all srv_ctx contexts
    for (s = 0; s < cmcf->servers.nelts; s++) {
 
        clcf = cscfp[s]->ctx->loc_conf[ngx_http_core_module.ctx_index];
 
        //This method realizes the sorting of location configuration and cuts out the regular location configuration in the two-way linked list
        if (ngx_http_init_locations(cf, cscfp[s], clcf) != NGX_OK) {
            return NGX_CONF_ERROR;
        }
 
        //Only the location configurations of types 1, 3 and 4 are left in the two-way linked list
        if (ngx_http_init_static_location_trees(cf, clcf) != NGX_OK) {
            return NGX_CONF_ERROR;
        }
    }
}

The following analyzes the sorting of location configuration, the clipping of regular type location configuration, and the formation of location tree:

  • 1) The location configuration is sorted by the function ngx_queue_sort(ngx_queue_t *queue, ngx_int_t (*cmp)(const ngx_queue_t *, const ngx_queue_t *)). The input parameter queue is a two-way linked list, and cmp is the comparison function of the linked list nodes. The ngx_queue_sort function sorts in ascending order and uses a stable sorting algorithm (when two elements are equal, their order after sorting is the same as before sorting).

The comparison function of locations two-way linked list node is ngx_http_cmp_locations. You can know the sorting rules of location configuration through this function. The implementation is as follows:

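//Like a typical comparison function: 1 means one is greater than two; 0 means the two are equal; -1 means one is less than two
static ngx_int_t ngx_http_cmp_locations(const ngx_queue_t *one, const ngx_queue_t *two) {
    lq1 = (ngx_http_location_queue_t *) one;
    lq2 = (ngx_http_location_queue_t *) two;
    first = lq1->exact ? lq1->exact : lq1->inclusive;
    second = lq2->exact ? lq2->exact : lq2->inclusive;
    //(omitted here: locations of regular type are shifted to the end of the list)
    rc = ngx_filename_cmp(first->name.data, second->name.data,
                        ngx_min(first->name.len, second->name.len) + 1);
    //(omitted here: when rc is 0, an exact match is placed before the same maximum prefix match)
    return rc;
}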

After sorting according to the rules of the above comparison function, location configurations of the regular type are always arranged at the end of the two-way linked list.
Exact matching and maximum prefix matching locations are arranged in alphabetical order of their uris, and when two uris are identical, the exact matching type is arranged in front of the maximum prefix matching type.

  • 2) After the first step, the location has been arranged in order, and the regular types are arranged at the end of the two-way linked list, so it is easy to cut out the location configuration of all regular types. You only need to traverse the two-way linked list from beginning to end until you find the location configuration of the regular type, and split the two-way linked list from this position.
  • 3) In the two-way linked list, only the location configurations of exact matching type and maximum prefix matching type are left, and they are sorted in alphabetical order of uri. These configurations will be organized into a tree for easy search.
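Steps 1) and 2) can be sketched on a plain array instead of the two-way linked list. The code below is a simplified model under stated assumptions: demo_loc_t and all function names are hypothetical, strcmp stands in for ngx_filename_cmp, and named or unnamed locations are ignored. The comparison follows the ordering rules described above (regular type last and left in configuration order, then uri order, exact before prefix for identical uris), and the sort is a stable insertion sort.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one location entry. */
typedef struct {
    const char *name;   /* uri parameter, or the pattern for regular type */
    int         regex;  /* 1 for "~" and "~*" locations */
    int         exact;  /* 1 for "=" locations */
} demo_loc_t;

/* Returns a negative value, 0 or a positive value like a usual comparison function. */
static int demo_cmp_locations(const demo_loc_t *one, const demo_loc_t *two)
{
    int rc;

    if (one->regex && !two->regex) {
        return 1;                       /* regular type goes to the end */
    }
    if (!one->regex && two->regex) {
        return -1;
    }
    if (one->regex && two->regex) {
        return 0;                       /* keep configuration order */
    }

    rc = strcmp(one->name, two->name);

    if (rc == 0 && !one->exact && two->exact) {
        return 1;                       /* exact before prefix for the same uri */
    }

    return rc;
}

/* Stable insertion sort: an element only moves past neighbours that
 * compare strictly greater, so equal elements keep their order. */
static void demo_sort_locations(demo_loc_t *v, size_t n)
{
    size_t i, j;

    for (i = 1; i < n; i++) {
        demo_loc_t key = v[i];

        for (j = i; j > 0 && demo_cmp_locations(&v[j - 1], &key) > 0; j--) {
            v[j] = v[j - 1];
        }

        v[j] = key;
    }
}

/* After sorting, the regular type entries form the tail; return the
 * index where that tail starts (n if there are none). */
static size_t demo_split_regex(const demo_loc_t *v, size_t n)
{
    size_t i = 0;

    while (i < n && !v[i].regex) {
        i++;
    }

    return i;
}
```

Stability matters here because regular type locations compare as equal to each other, so they must come out of the sort in the order in which they were written in the configuration file.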

The resulting tree is a ternary tree. Each node has three children: left, tree and right. left must be less than node; right must be greater than node; tree has node as its prefix, and the uri of the tree child must be longer than that of node.

Note that only maximum prefix matching configurations have a tree child.

Think about why the tree child exists, and why only maximum prefix matching configurations can have one: after node matches successfully, a longer location under its tree child may still match successfully.
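To make the role of the tree child concrete, here is a minimal lookup sketch over such a ternary tree. It is a simplified model, not the nginx lookup function: demo_node_t and demo_find_location are hypothetical names, memcmp stands in for ngx_filename_cmp, the tree is assumed to be built already, and each node is assumed to store only the part of the uri that follows its parent's prefix.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical simplified tree node. */
typedef struct demo_node_s demo_node_t;
struct demo_node_s {
    const demo_node_t *left;       /* names that sort before this node */
    const demo_node_t *right;      /* names that sort after this node */
    const demo_node_t *tree;       /* longer locations that have this node as prefix */
    const char        *name;       /* uri part relative to the parent prefix */
    size_t             len;        /* length of name */
    const char        *exact;      /* label of the "=" location here, or NULL */
    const char        *inclusive;  /* label of the prefix location here, or NULL */
};

/* Returns the label of an exact match if there is one, otherwise the
 * longest matching prefix location, otherwise NULL. */
static const char *demo_find_location(const demo_node_t *node, const char *uri)
{
    const char *best = NULL;
    size_t      len = strlen(uri);

    while (node != NULL) {
        size_t n = (len <= node->len) ? len : node->len;
        int    rc = memcmp(uri, node->name, n);

        if (rc != 0) {
            node = (rc < 0) ? node->left : node->right;

        } else if (len > node->len) {
            if (node->inclusive == NULL) {
                node = node->right;     /* an exact-only node cannot match a longer uri */

            } else {
                best = node->inclusive; /* remember it, then try longer prefixes */
                uri += n;
                len -= n;
                node = node->tree;
            }

        } else if (len == node->len) {
            return node->exact ? node->exact : node->inclusive;

        } else {
            node = node->left;          /* the uri is shorter than the node name */
        }
    }

    return best;
}
```

The loop descends into the tree child only after a prefix location has matched, which is why exact-only nodes never need one: a uri longer than an exact location cannot match it or anything below it.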

The tree formation process is not detailed here. Interested readers can study the implementation of the function ngx_http_init_static_location_trees.

So far, the configuration file has been parsed. The configurations related to http, server and location are finally stored in the main_conf array, the multiple srv_conf arrays and the multiple loc_conf arrays. However, when the server receives a client request, how does it find the corresponding srv_conf array and loc_conf array?


Posted on Sat, 06 Nov 2021 14:09:33 -0400 by flameche