A first look at the principles of cgroups

cgroups subsystems

  • cpu subsystem: limits the CPU utilization of processes
  • cpuacct subsystem: generates reports on the CPU usage of processes in a cgroup
  • cpuset subsystem: assigns dedicated CPUs and memory nodes to processes in a cgroup
  • memory subsystem: limits the memory usage of processes
  • blkio subsystem: limits the block device I/O of processes
  • devices subsystem: controls which devices processes can access
  • net_cls subsystem: tags the network packets of processes in a cgroup so that the tc (traffic control) module can then control those packets
  • freezer subsystem: suspends or resumes processes in a cgroup
  • ns subsystem: lets processes in different cgroups use different namespaces
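
To see which of these subsystems a given kernel actually offers, one option is to read /proc/cgroups. The following is a minimal sketch, assuming the usual four-column layout of that file (subsys_name, hierarchy, num_cgroups, enabled); the function name list_enabled_subsystems is made up for illustration.

```shell
# List the subsystems that the running kernel reports as enabled.
# The optional file argument exists only to make the function easy to test;
# it defaults to /proc/cgroups.
list_enabled_subsystems() {
    # Skip the '#' header line and print the name of every row whose
    # fourth column (enabled) is 1.
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}

# Usage: list_enabled_subsystems
```

The set of names printed depends on the kernel configuration, so it may not match the list above exactly.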

After the cgroups structure is created, processes can be added to the task list of a node. All processes in a node's task list are subject to the resource limits of that node. A process can also be added to nodes in different cgroups hierarchies at the same time.
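
As a rough sketch of what "adding a process to a node" looks like from user space, assuming a cgroup (v1) hierarchy is already mounted and that the node exposes its task list as a file named tasks: the helper name add_task and the example path are hypothetical.

```shell
# Add a process to a cgroup node by writing its PID to the node's task list.
# node_dir is a directory inside a mounted cgroup hierarchy (assumption: v1,
# where the task list is the file "tasks").
add_task() {
    node_dir=$1
    pid=$2
    if [ ! -d "$node_dir" ]; then
        echo "no such node: $node_dir" >&2
        return 1
    fi
    # One PID per write; the kernel moves that task into the node.
    echo "$pid" > "$node_dir/tasks"
}

# Usage (as root, hypothetical path): add_task /cgroup/cpu/halfapi "$$"
# Afterwards, "cat /proc/$$/cgroup" shows one line per hierarchy, which is
# how membership in several hierarchies at once becomes visible.
```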

Each process descriptor has a pointer to an auxiliary data structure, css_set (cgroups subsystem set). The process is added to the process linked list of that css_set. A process belongs to only one css_set, while a css_set can contain multiple processes.

A css_set is in turn associated many-to-many with cgroups nodes through auxiliary data structures. Notably, a css_set cannot be associated with multiple nodes in the same cgroups hierarchy, because cgroups does not allow multiple limit configurations for the same resource.

When a css_set is associated with nodes in multiple cgroups hierarchies, the processes under that css_set are subject to control over several kinds of resources. When a cgroups node is associated with multiple css_sets, the process lists under all of those css_sets are restricted by the same resource limit.
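
The css_set itself is not visible from user space, but its effect can be approximated. The sketch below groups processes by the content of /proc/PID/cgroup, on the assumption that processes with an identical combination of nodes share one css_set in the kernel; it only observes membership and does not read any kernel structure. The function name group_by_membership is hypothetical.

```shell
# Print one line per distinct cgroup membership: "<process count> <membership>".
# proc_root defaults to /proc; the argument exists to allow testing.
group_by_membership() {
    proc_root=${1:-/proc}
    for f in "$proc_root"/[0-9]*/cgroup; do
        # Join the per-hierarchy lines into a single key such as
        # "2:cpu:/web;3:memory:/". A process may exit while we scan,
        # so entries that yield nothing are skipped.
        key=$(sort "$f" 2>/dev/null | paste -sd ';' -)
        if [ -n "$key" ]; then
            printf '%s\n' "$key"
        fi
    done | sort | uniq -c | sort -rn
}

# Usage: group_by_membership | head
```

Each output line then corresponds to one combination of nodes, and the count in front of it is the number of processes sharing that combination.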

cgroups exposes its functionality to user space through the VFS (virtual file system), which gives user space a unified API.

VFS

The VFS common file model consists of four metadata structures:

  • Superblock object: stores information about a registered file system, such as the basic disk file systems ext2 and ext3, the socket file system used to read and write sockets, and the cgroups file system used to read and write cgroups
  • Inode object: stores information about a specific file. For ordinary disk file systems, the inode stores the location of the file's storage blocks on disk. For the socket file system, the inode stores the relevant attributes of the socket. For special file systems such as cgroups, the inode stores the attributes of the cgroups node
  • File object: represents a file opened by a process and is stored in the process's file descriptor table. Its most important part is the file_operations structure, which describes the read and write implementation of the specific file system. When a process reads or writes a file descriptor, file_operations points to ordinary read/write operations for an ordinary disk file system, to send/recv and similar operations for the socket file system, and to the cgroups-specific implementation for the cgroups file system
  • Dentry (directory entry) object: in every file system, when the kernel looks up a file by path, it creates a dentry object for each component of that path

// Structure of the cgroups file system type
static struct file_system_type cgroup_fs_type = {
    .name = "cgroup",
    .mount = cgroup_mount,
    .kill_sb = cgroup_kill_sb,
};

// Operations defined for the cgroups superblock object
static const struct super_operations cgroup_ops = {
    .statfs = simple_statfs,
    .drop_inode = generic_delete_inode,
    .show_options = cgroup_show_options,
    .remount_fs = cgroup_remount,
};

// cgroups-specific operations defined for inode objects and file objects
static const struct inode_operations cgroup_dir_inode_operations = {
    .lookup = cgroup_lookup,
    .mkdir = cgroup_mkdir,
    .rmdir = cgroup_rmdir,
    .rename = cgroup_rename,
};

static const struct file_operations cgroup_file_operations = {
    .read  = cgroup_file_read,
    .write = cgroup_file_write,
    .llseek = generic_file_llseek,
    .open = cgroup_file_open,
    .release = cgroup_file_release,
};
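
Because these operation tables hang off ordinary VFS entry points, plain file commands are enough to drive them. The following sketch walks a node through its life cycle; the function name cgroup_node_demo and its arguments are hypothetical, and the mapping in the comments is my reading of the tables above rather than a trace of the kernel.

```shell
# Create, inspect and remove a cgroup node using ordinary file commands.
# parent is a directory in a mounted cgroup hierarchy, name a new node name.
cgroup_node_demo() {
    parent=$1
    name=$2
    # Creating a directory goes through .mkdir (cgroup_mkdir).
    mkdir "$parent/$name" || return 1
    # On a real hierarchy the kernel has now populated the control files.
    ls "$parent/$name"
    # Reading a control file goes through .open and .read
    # (cgroup_file_open, cgroup_file_read).
    if [ -r "$parent/$name/tasks" ]; then
        cat "$parent/$name/tasks"
    fi
    # Removing the directory goes through .rmdir (cgroup_rmdir); this is
    # expected to fail while the node still contains tasks.
    rmdir "$parent/$name"
}

# Usage (as root, hypothetical path): cgroup_node_demo /cgroup/cpu demo
```

Writing a value with echo into a control file would exercise .write (cgroup_file_write) in the same way.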

In Linux, users can mount the cgroups file system with the mount command:

// For example, to mount the cpuset, cpu, cpuacct and memory subsystems on the /cgroup/cpu_and_mem directory (the remount option is only needed when changing a hierarchy that is already mounted), you can use
mount -t cgroup -o cpu,cpuset,cpuacct,memory cpu_and_mem /cgroup/cpu_and_mem
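
To check what ended up mounted, one possibility is to filter /proc/mounts for entries of type cgroup. This is a small sketch, assuming the standard whitespace-separated layout of that file (device, mount point, type, options); the function name cgroup_mounts is made up for illustration.

```shell
# Print "<mount point> <options>" for every mounted cgroup hierarchy.
# The optional file argument defaults to /proc/mounts and exists for testing.
cgroup_mounts() {
    awk '$3 == "cgroup" { print $2, $4 }' "${1:-/proc/mounts}"
}

# Usage: cgroup_mounts
```

The options column lists the subsystems attached to each hierarchy, which makes it easy to confirm that a mount command attached the ones that were intended.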

Actual use case:

Docker implements resource isolation and control between containers through cgroups, chroot, namespaces and other technologies. For the processes in one container, resource limits can be applied by adding the process PIDs to a child node of cgroups.

// Example of limiting CPU resources to 50%
First, create a child node named halfapi under the cpu subsystem, owned by user and group abc: cgcreate -a abc:abc -g cpu:halfapi

Then write the quota into the node's configuration file: echo 50000 > /cgroup/cpu/halfapi/cpu.cfs_quota_us
The default value of cpu.cfs_period_us is 100000, so writing 50000 to cpu.cfs_quota_us means that the group can use only 50% of one CPU's run time. cpu.cfs_quota_us itself defaults to -1, which means no limit.

Finally, start the task in this cgroup: cgexec -g "cpu:/halfapi" php halfapi.php half >/dev/null 2>&1
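
The quota arithmetic used in this example can be sketched as a small helper, assuming the default period of 100000 microseconds in cpu.cfs_period_us unless another period is passed in. The function name quota_for_percent is hypothetical.

```shell
# Compute the cpu.cfs_quota_us value for a given percentage of one CPU.
# percent may exceed 100 on multi-core machines (200 means two full CPUs).
quota_for_percent() {
    percent=$1
    period_us=${2:-100000}
    # quota = period * percent / 100, in microseconds (integer arithmetic)
    echo $(( period_us * percent / 100 ))
}

# Usage (as root):
#   quota_for_percent 50 > /cgroup/cpu/halfapi/cpu.cfs_quota_us
```

With the default period, quota_for_percent 50 yields the 50000 written in the example.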

Before cgroups was introduced into the kernel, CPU resources could be controlled by adjusting a process's priority with the nice command and by limiting its CPU utilization with the cpulimit command.

Tags: Linux kernel

Posted on Fri, 24 Sep 2021 08:09:40 -0400 by tearrek