[Linux high concurrency server] memory mapping

References:
Niuke C++ project course
Carefully analyze mmap: what is it, why and how to use it
Linux Process Communication: Memory Map
The second reference blog is very useful. I suggest you take a look.

Memory mapping

Memory-mapped I/O
Memory mapping maps the data of a disk file into memory. A program can then modify the disk file by modifying memory, without calling read/write and similar operations.

Advantages of memory mapping

From the second reference blog

1. Reading a file through a mapping works directly on the page cache, so the extra copy into a user-space buffer is avoided. I/O reads and writes are replaced with memory reads and writes, which improves file reading efficiency.

2. It provides efficient interaction between user space and kernel space. Modifications made on either side are reflected directly in the mapped area, so the other side sees them promptly.

3. It gives processes a way to share memory and communicate with each other. Both parent-child processes and unrelated processes can map the same file into their own user space, and related processes can also share an anonymous mapping. By changing the mapped area, the processes communicate and share data.

In addition, if process A and process B both map region C, then when A reads C for the first time, a page fault copies the file page from disk into memory. When B later reads the same page of C, a page fault still occurs, but the data no longer has to be copied from disk; the file data already held in memory is used directly.

4. It can be used for efficient large-scale data transfer. Insufficient memory is one of the factors that restrict big-data operations, and the usual solution is to supplement memory with hard disk space. That in turn causes a large number of file I/O operations, which greatly hurts efficiency. mmap mapping can solve this problem. In other words, mmap is useful whenever disk space has to stand in for memory.
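As a rough illustration of point 4, the sketch below keeps an array in a temporary file instead of on the heap and works on it with ordinary memory reads and writes. The helper name file_backed_sum and the /tmp path template are assumptions made for this example, not something from the referenced blog.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: stores n longs in a temporary file, fills them with
// 0..n-1 through the mapping and returns their sum. Returns -1 on error.
long file_backed_sum(size_t n) {
    char path[] = "/tmp/mmap_array_XXXXXX";   // assumed scratch location
    int fd = mkstemp(path);
    if (fd == -1) return -1;
    unlink(path);                             // the file is removed once it is no longer in use

    size_t bytes = n * sizeof(long);
    if (bytes == 0 || ftruncate(fd, (off_t)bytes) == -1) {
        close(fd);
        return -1;
    }

    long *a = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                                // the mapping stays valid after close
    if (a == MAP_FAILED) return -1;

    for (size_t i = 0; i < n; i++) a[i] = (long)i;   // plain memory writes
    long sum = 0;
    for (size_t i = 0; i < n; i++) sum += a[i];      // plain memory reads

    munmap(a, bytes);
    return sum;
}
```

Calling file_backed_sum(1000) from main should return 499500. The array lives in the file's pages, so the kernel can write them back to disk and reclaim the memory under pressure.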

Memory mapping related calls

The mmap function creates a memory mapping, and the munmap function releases it

/*
    #include <sys/mman.h>
    void *mmap(void *addr, size_t length, int prot, int flags,int fd, off_t offset);
        - Function: map the data of a file or device to memory
        - Parameters:
            - void *addr: Usually NULL, so that the kernel chooses the address
            - length : The length of the data to be mapped. This value cannot be 0. Using the length of the file is recommended.
                    Get the length of the file: stat, lseek
            - prot : Operation permission on the requested memory mapping area
                -PROT_EXEC : Executable permissions
                -PROT_READ : Read permission
                -PROT_WRITE : Write permission
                -PROT_NONE : No permission
                To operate on the mapped memory you must have read permission, so use
                PROT_READ or PROT_READ | PROT_WRITE
            - flags :
                - MAP_SHARED : Changes to the mapping area are synchronized to the disk file. This option must be set for inter-process communication
                - MAP_PRIVATE : Not synchronized. If the data in the mapping area is changed, the original file is not modified; the process gets its own private copy of the changed pages (copy on write)
            - fd: The file descriptor of the file that needs to be mapped
                - Obtained through open; the file that is opened is a disk file
                - Note: the file size cannot be 0, and the permissions specified by open cannot conflict with the prot parameter.
                    prot: PROT_READ                open: read-only or read-write
                    prot: PROT_READ | PROT_WRITE   open: read-write
            - offset: Offset into the file, generally not used. Must be an integer multiple of the page size (usually 4K); 0 means no offset.
        - Return value: on success, the start address of the mapped memory
            On failure, MAP_FAILED, i.e. (void *) -1

    int munmap(void *addr, size_t length);
        - Function: release a memory mapping
        - Parameters:
            - addr : The start address of the mapping to be released
            - length : The size of the mapping to be released; it should be the same as the length parameter passed to mmap.
*/
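To make the difference between the two flags concrete, here is a minimal sketch. It assumes a Linux system and a writable /tmp, and the helper name first_byte_after_write is made up for this example. It writes through a mapping and then checks what an ordinary read of the file sees.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: creates a temporary file containing "hello", maps it
// with the given flags (MAP_SHARED or MAP_PRIVATE), overwrites the first byte
// with 'J' through the mapping, and returns the first byte that a normal
// read of the file sees afterwards. Returns -1 on error.
int first_byte_after_write(int flags) {
    char path[] = "/tmp/mmap_flags_XXXXXX";   // assumed scratch location
    int fd = mkstemp(path);                   // mkstemp opens the file read-write
    if (fd == -1) return -1;
    unlink(path);

    if (write(fd, "hello", 5) != 5) { close(fd); return -1; }

    char *p = mmap(NULL, 5, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }

    p[0] = 'J';                               // modify memory, not the file directly
    munmap(p, 5);

    char c;
    ssize_t n = pread(fd, &c, 1, 0);          // read the file the ordinary way
    close(fd);
    return n == 1 ? c : -1;
}
```

With MAP_SHARED the function should return 'J', because the change reaches the file. With MAP_PRIVATE it should return 'h', because only the private copy of the page was changed.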

Related interprocess communication

1. Related processes (parent and child)
-While there is still no child process
-The parent process first creates the memory mapped area
-Once the memory mapped area exists, the parent creates the child process
-The parent and child processes share the created memory mapped area

#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <sys/wait.h>

int main() {

    // 1. Open a file (test.txt must already exist and must not be empty)
    int fd = open("test.txt", O_RDWR);
    if(fd == -1) {
        perror("open");
        exit(1);
    }
    off_t size = lseek(fd, 0, SEEK_END);  // Get the size of the file

    // 2. Create a memory mapping area
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if(ptr == MAP_FAILED) {
        perror("mmap");
        exit(1);
    }

    // 3. Create a child process
    pid_t pid = fork();
    if(pid > 0) {
        wait(NULL);  // Block until the child has written its data and exited, then reap it
        // Parent process reads data
        char buf[64];
        strcpy(buf, (char *)ptr);
        printf("read data : %s\n", buf);
       
    }else if(pid == 0){
        // Child process (writes data)
        strcpy((char *)ptr, "nihao a, son!!!");
    }else {
        perror("fork");
    }

    // Release the memory mapping area and close the file
    munmap(ptr, size);
    close(fd);

    return 0;
}

Unrelated interprocess communication

2. Communication between unrelated processes
-Prepare a disk file whose size is not 0
-Process 1 creates a memory mapped area from the disk file
-and gets a pointer to the memory
-Process 2 creates a memory mapped area from the same disk file
-and gets a pointer to the memory
-The two processes communicate through the memory mapped area

Reference: the third blog

Write end

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
 
typedef struct _data {
    int a;
    char b[64];
} Data;
 
 
int main() {
    Data *addr;
    Data data = { 10, "Hello World\n" };
    int fd;
 	
    // Prepare a file whose size is not 0
    fd = open("mmap_temp_file", O_RDWR|O_CREAT|O_TRUNC, 0644);
    if (fd == -1) {
        perror("open failed");
        exit(EXIT_FAILURE);
    }
    if (ftruncate(fd, sizeof(data)) == -1) { // Extend the new file to the size of data
        perror("ftruncate failed");
        exit(EXIT_FAILURE);
    }
 
    // Create a memory mapped area using fd
    addr = (Data *)mmap(NULL, sizeof(data), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap failed!");
        exit(EXIT_FAILURE);
    }
    close(fd); // After mapping, the file can be closed
 
    memcpy(addr, &data, sizeof(data)); // Write data to the mapping area
    munmap(addr, sizeof(data)); // Release mapping area
    return 0;
}

Read end

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
 
typedef struct _data {
    int a;
    char b[64];
} Data;
 
 
int main() {
    Data *addr;
    int fd;
 
    fd = open("mmap_temp_file", O_RDONLY);
    if (fd == -1) {
        perror("open failed");
        exit(EXIT_FAILURE);
    }
 
    // Create a memory mapped area using fd
    addr = (Data *)mmap(NULL, sizeof(Data), PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap failed!");
        exit(EXIT_FAILURE);
    }
    close(fd); // After mapping, the file can be closed
 
    printf("read from mmap: a = %d, b = %s\n", addr->a, addr->b); // Read data from the mapping area
    munmap(addr, sizeof(Data)); // Release mapping area
    return 0;
}

Case: file copy by memory mapping

// Use memory mapping to realize the function of file copy
/*
    Idea:
        1.Memory mapping of the original file
        2.Create a new file (expand the file)
        3.Map the data of the new file to memory
        4.Copy the memory data of the first file to the new file memory through memory copy
        5.Release resources
*/
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int main() {

    // 1. Perform memory mapping on the original file
    int fd = open("english.txt", O_RDWR);
    if(fd == -1) {
        perror("open");
        exit(0);
    }

    // Gets the size of the original file
    int len = lseek(fd, 0, SEEK_END);

    // 2. Create a new file (expand the file)
    int fd1 = open("cpy.txt", O_RDWR | O_CREAT, 0664);
    if(fd1 == -1) {
        perror("open");
        exit(0);
    }
    
    // Extend the newly created file to the size of the original file
    if(ftruncate(fd1, len) == -1) {
        perror("ftruncate");
        exit(1);
    }

    // 3. Do memory mapping respectively
    void * ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void * ptr1 = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd1, 0);

    if(ptr == MAP_FAILED) {
        perror("mmap");
        exit(0);
    }

    if(ptr1 == MAP_FAILED) {
        perror("mmap");
        exit(0);
    }

    // Memory Copy 
    memcpy(ptr1, ptr, len);
    
    // Release resources
    munmap(ptr1, len);
    munmap(ptr, len);

    close(fd1);
    close(fd);

    return 0;
}

Note that resources are usually released in the reverse order of acquisition: what was opened first is released last, and what was opened last is released first.

Anonymous mapping

Anonymous mapping: a memory mapping that is not backed by any file
An anonymous mapping can only be shared between related processes (for example parent and child); unrelated processes cannot use it

#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <sys/wait.h>

int main() {

    // 1. Create an anonymous memory mapping area
    int len = 4096;
    void * ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if(ptr == MAP_FAILED) {
        perror("mmap");
        exit(0);
    }

    // Parent child interprocess communication
    pid_t pid = fork();

    if(pid > 0) {
        // Parent process
        strcpy((char *) ptr, "hello, world");
        wait(NULL);
    }else if(pid == 0) {
        // Child process
        sleep(1);
        printf("%s\n", (char *)ptr);
    }

    // Free memory mapped area
    int ret = munmap(ptr, len);

    if(ret == -1) {
        perror("munmap");
        exit(0);
    }
    return 0;
}

Notes

1. If you perform a ++ operation (ptr++) on the return value (ptr) of mmap, can munmap succeed?
void * ptr = mmap(...);
ptr++; // The ++ operation itself is allowed
munmap(ptr, len); // Error: munmap needs the original address, so save it before modifying ptr

2. If the file is opened with O_RDONLY and the prot parameter of mmap specifies PROT_READ | PROT_WRITE, what happens?
Error: for a MAP_SHARED mapping, MAP_FAILED is returned
The permissions given to open() should be consistent with the permissions in the prot parameter.
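The following sketch tries note 2 out on Linux. The helper map_readonly_file and the /tmp path are assumptions for illustration. It opens a file with O_RDONLY and reports whether mmap accepts a given combination of prot and flags.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: maps a file that was opened with O_RDONLY.
// Returns 0 if mmap succeeds, the errno value if mmap fails,
// or -1 if the temporary file could not be prepared.
int map_readonly_file(int prot, int flags) {
    char path[] = "/tmp/mmap_ro_XXXXXX";   // assumed scratch location
    int tmp = mkstemp(path);
    if (tmp == -1) return -1;
    if (write(tmp, "data", 4) != 4) { close(tmp); unlink(path); return -1; }
    close(tmp);

    int fd = open(path, O_RDONLY);         // read-only descriptor
    unlink(path);
    if (fd == -1) return -1;

    int err = 0;
    void *p = mmap(NULL, 4, prot, flags, fd, 0);
    if (p == MAP_FAILED)
        err = errno;
    else
        munmap(p, 4);
    close(fd);
    return err;
}
```

With PROT_READ | PROT_WRITE and MAP_SHARED the call should fail with EACCES. The same prot with MAP_PRIVATE succeeds, because writes to a private mapping never reach the file.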

3. What happens if the file offset is 1000?
The offset must be an integer multiple of the page size (usually 4K), so MAP_FAILED is returned
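A quick way to check note 3, sketched under the assumption of a Linux system: the helper mmap_errno_for_offset is a made-up name. It maps one page of a two-page temporary file at the requested offset.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: returns 0 if one page can be mapped at the given file
// offset, the errno value if mmap fails, or -1 if setup fails.
int mmap_errno_for_offset(off_t offset) {
    long page = sysconf(_SC_PAGESIZE);      // real page size of this system
    char path[] = "/tmp/mmap_off_XXXXXX";   // assumed scratch location
    int fd = mkstemp(path);
    if (fd == -1) return -1;
    unlink(path);
    if (ftruncate(fd, 2 * page) == -1) { close(fd); return -1; }

    int err = 0;
    void *p = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, offset);
    if (p == MAP_FAILED)
        err = errno;
    else
        munmap(p, (size_t)page);
    close(fd);
    return err;
}
```

An offset of 1000 should give EINVAL, while 0 and sysconf(_SC_PAGESIZE) are both accepted.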

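4. Under what circumstances will the mmap call fail?
-Second parameter: length = 0
-The third parameter prot conflicts with the way the fifth parameter fd was opened:
-Only write permission is available: the file was opened with O_WRONLY (a file mapping always requires a descriptor that is open for reading)
-prot is PROT_READ | PROT_WRITE with MAP_SHARED, but the file was opened with O_RDONLY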
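The failure cases in note 4 can be probed with a sketch like the one below. It assumes Linux, and the helper name try_mmap and the 4096-byte file size are assumptions for this example. It reopens a temporary file with the given open flags and attempts a MAP_SHARED mapping.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: returns 0 if the MAP_SHARED mapping succeeds, the
// errno value if mmap fails, or -1 if the temporary file cannot be prepared.
int try_mmap(int open_flags, size_t length, int prot) {
    char path[] = "/tmp/mmap_fail_XXXXXX";   // assumed scratch location
    int tmp = mkstemp(path);
    if (tmp == -1) return -1;
    if (ftruncate(tmp, 4096) == -1) { close(tmp); unlink(path); return -1; }
    close(tmp);

    int fd = open(path, open_flags);         // reopen with the requested access mode
    unlink(path);
    if (fd == -1) return -1;

    int err = 0;
    void *p = mmap(NULL, length, prot, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        err = errno;
    else
        munmap(p, length);
    close(fd);
    return err;
}
```

A length of 0 should give EINVAL, and a descriptor opened with O_WRONLY should give EACCES even when only PROT_WRITE is requested.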

5. Can a new file created by open with O_CREAT be used to create the mapping area?
-Yes, but if the size of the created file is 0, it certainly won't work
-The new file can be extended first
- lseek() followed by write()
- truncate()

6. Does closing the file descriptor after mmap affect the mmap mapping?
int fd = open("XXX");
mmap(..., fd, 0);
close(fd);
The mapping area still exists; closing the fd that created it has no impact. The mapping refers to the file itself rather than to the file descriptor, so it stays valid until munmap is called. At the same time, the address space available for inter-process communication is not strictly limited by the size of the mapped file, because memory is mapped in whole pages.

7. What happens if ptr is accessed out of bounds?
void * ptr = mmap(NULL, 100, ...);
The kernel maps whole pages, so 4K is actually mapped here
Accessing memory outside the mapped pages is illegal -> segmentation fault

8. Process communication implemented through memory mapping is non-blocking

9. A key point to note when using mmap is that mappings are managed in units of the page size (usually 4K bytes). The offset must be an integral multiple of the page size, and the kernel rounds the length of the mapping up to whole pages. The reason is that the minimum granularity of memory management is the page, and the mapping between a process's virtual address space and physical memory is also done in pages. To match this, mmap maps files into the virtual address space in units of pages as well.
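Rather than hard-coding 4K, the page size can be queried at run time. The sketch below uses a hypothetical helper name, round_up_to_page, to show how a requested length relates to the amount of address space that actually backs the mapping.

```c
#include <stddef.h>
#include <unistd.h>

// Hypothetical helper: rounds len up to the next multiple of the system
// page size, which is how much address space a mapping of len bytes covers.
size_t round_up_to_page(size_t len) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);   // page size reported by the system
    return (len + page - 1) / page * page;
}
```

For example, on a system with 4096-byte pages, round_up_to_page(100) is 4096, which matches the "4K" mentioned in note 7.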

Posted on Sat, 23 Oct 2021 10:49:50 -0400 by PinkFloyd007