Overview
The previous article, "[Network Programming] IO Multiplexing: select", covered IO multiplexing and the use of select. This article tries out epoll.
In terms of implementation, an epoll object has two core data structures: the eventpoll event pool (a red-black tree) and the rdlist ready list (a doubly linked list). In terms of usage, epoll exposes three interfaces: epoll_create, epoll_ctl and epoll_wait:
epoll_create is used to create an epoll object;
epoll_ctl is used to add or delete events to the event pool and modify existing events in the event pool;
epoll_wait is used to check the ready list and obtain ready IO events;
Interface epoll_create
int epoll_create(int size);
epoll_create has only one parameter, size, which was originally meant to tell the kernel how many file descriptors would be added to the epoll object. Since Linux 2.6.8 the kernel no longer needs this hint and ignores the argument. For backward compatibility, however, the argument must still be greater than 0 (see man epoll_create).
If creating the epoll object fails, epoll_create returns -1. If it succeeds, epoll_create returns the handle of the epoll object (which is also a file descriptor and should be closed after use, otherwise the handle is leaked).
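As a minimal sketch of the lifecycle described above, the two helpers below create an epoll object and release it again. The names open_epoll_object and close_epoll_object are made up for this example, and the size argument 1 is arbitrary: it only has to be greater than 0.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/epoll.h>

/* Hypothetical helper: returns the epoll handle, or -1 on failure. */
int open_epoll_object(void)
{
    /* The size hint is ignored since Linux 2.6.8 but must still be > 0. */
    int epfd = epoll_create(1);
    if (epfd == -1) {
        fprintf(stderr, "epoll_create failed: %s\n", strerror(errno));
    }
    return epfd;
}

/* Hypothetical helper: the handle is a file descriptor, so close() frees it. */
int close_epoll_object(int epfd)
{
    return close(epfd);
}
```

A caller would pair every successful open_epoll_object with exactly one close_epoll_object, in the same way as for a socket.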
Interface epoll_ctl
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
The first parameter of epoll_ctl is the handle of the epoll object returned by epoll_create (each epoll object has its own event pool, so to operate on an event pool you must say which epoll object it belongs to).
The second parameter is the operation to perform on the event pool: EPOLL_CTL_ADD, EPOLL_CTL_MOD and EPOLL_CTL_DEL correspond to adding an event to the event pool, modifying an existing event, and deleting an event.
The third parameter is the handle (file descriptor) to monitor.
The fourth parameter is the event to monitor, passed as a pointer to an epoll_event structure.
If the function succeeds, it returns 0; if it fails, it returns -1.
epoll_event
The epoll_event structure is used by both the epoll_ctl and epoll_wait interfaces: it registers the events of interest and returns the pending events. The epoll_event structure has two members, events and data:
struct epoll_event {
    __uint32_t events;  // Events of interest and epoll's working mode
    epoll_data_t data;  // User data variable
};
events holds the events registered with the epoll object (the events of interest to monitor) together with epoll's working mode, expressed as flag values: EPOLLIN means the event of interest is that the file descriptor passed as the third parameter is readable; EPOLLOUT means the event of interest is that this file descriptor is writable; EPOLLET puts the event in edge trigger mode. When the parameter is actually passed, several flags are combined with bitwise OR and assigned to the events member, e.g.
// Register an event with the epoll object: watch whether the handle is readable,
// and set epoll's working mode to edge trigger
stEpEvent.events = EPOLLIN | EPOLLET;
stEpEvent.data.fd = iLsnFd_5197;
epoll_ctl(iEpObj, EPOLL_CTL_ADD, iLsnFd_5197, &stEpEvent);
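Because the flags are combined with bitwise OR, each one is tested again with bitwise AND. The sketch below shows this for the three flags mentioned above; the helper names are invented for the example.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical helpers: each tests one bit of an events bitmask. */
int wants_read(uint32_t events)
{
    return (events & EPOLLIN) != 0;
}

int wants_write(uint32_t events)
{
    return (events & EPOLLOUT) != 0;
}

int is_edge_triggered(uint32_t events)
{
    return (events & EPOLLET) != 0;
}
```

The same AND test applies to the events member of the structures filled in by epoll_wait, which is how a program tells a readable event from a writable one.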
data is a union. Its fd member carries the socket handle: when registering an event, it tells epoll which handle we care about; when an event becomes ready, fd tells us which handle the ready event belongs to.
typedef union epoll_data {
    void *ptr;
    int fd;
    __uint32_t u32;
    __uint64_t u64;
} epoll_data_t;
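Since data is a union, only one member can be used per registration. The article's program uses fd; as a sketch of the alternative, the code below stores a pointer to a per-connection record in ptr and gets the same pointer back from epoll_wait. The conn_ctx structure and both function names are assumptions made for this example only.

```c
#include <stddef.h>
#include <sys/epoll.h>

/* Hypothetical per-connection record; not part of the article's program. */
struct conn_ctx {
    int fd;
    unsigned long bytes_seen;
};

/* Register ctx->fd for readability and remember ctx itself in data.ptr. */
int watch_ctx(int epfd, struct conn_ctx *ctx)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.ptr = ctx;   /* union: set ptr OR fd, never both */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, ctx->fd, &ev);
}

/* Wait for one event and return the record stored at registration time.
 * Returns NULL on timeout or error. */
struct conn_ctx *wait_ctx(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return (n == 1) ? (struct conn_ctx *)ev.data.ptr : NULL;
}
```

With this layout the handle is still reachable as ctx->fd, so nothing is lost by giving up data.fd.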
Working mode
epoll has two working modes: level trigger and edge trigger. They are the two ways in which epoll notifies the application about ready events:
Level Trigger (LT) is epoll's default working mode. As long as a ready event remains on a file descriptor, every call to epoll_wait returns it. The drawback is that if there are many ready file descriptors that do not currently need to be read or written, every call to epoll_wait still reports them, which makes it harder for the application to find and handle the file descriptors it actually cares about.
Edge Trigger (ET) is the "high-speed mode". It notifies only when the state of the socket changes (the receive buffer goes from empty to non-empty, or the send buffer goes from full to not full). In other words, in ET mode a ready event is reported only once. This means that in ET mode you must "finish the job in one go", reading all readable data or sending all pending data, because otherwise the next call to epoll_wait will not remind you again until the state changes once more.
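The difference between the two modes can be observed with a small experiment. The sketch below is not part of the article's program and uses a pipe instead of a socket to stay self-contained: it writes one byte, never reads it, and then calls epoll_wait twice with a timeout of 0. The function name count_wakeups is hypothetical.

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>

/* Returns how many of two consecutive epoll_wait calls report the pipe
 * as readable while one unread byte sits in it, or -1 on setup failure. */
int count_wakeups(int edge_triggered)
{
    int pipefd[2];
    int wakeups = -1;
    uint32_t flags = EPOLLIN;
    struct epoll_event ev = {0}, out;

    if (pipe(pipefd) == -1) {
        return -1;
    }
    int epfd = epoll_create(1);
    if (epfd == -1) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    if (edge_triggered) {
        flags |= EPOLLET;
    }
    ev.events = flags;
    ev.data.fd = pipefd[0];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0 &&
        write(pipefd[1], "x", 1) == 1) {
        wakeups = 0;
        for (int i = 0; i < 2; i++) {
            /* Timeout 0: only inspect the ready list, never block. */
            if (epoll_wait(epfd, &out, 1, 0) == 1) {
                wakeups++;
            }
        }
    }

    close(epfd);
    close(pipefd[0]);
    close(pipefd[1]);
    return wakeups;
}
```

In LT mode (count_wakeups(0)) both calls report the unread byte; in ET mode (count_wakeups(1)) only the first call does, which is why ET code normally reads until the descriptor is drained.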
Interface epoll_wait
int epoll_wait(int epfd, struct epoll_event * events, int maxevents, int timeout);
The first parameter of epoll_wait is the handle of the epoll object returned by epoll_create;
The second parameter is the array in which pending events are returned, passed as a pointer to epoll_event structures;
The third parameter is the number of events that can be handled per call, that is, it tells the kernel the maximum number of file descriptors to return this time;
The fourth parameter is the timeout for waiting for I/O events, in milliseconds. Pass 0 to return immediately and -1 to block indefinitely;
When an event becomes ready, it is added to the rdlist (the ready list). To check whether any event has occurred, epoll_wait only needs to check whether the rdlist contains data. If it does (some events are ready), the number of ready events is returned; if it does not, the application blocks and waits; if no event is ready when the timeout expires, 0 is returned; if an error occurs while the interface executes, -1 is returned.
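The three kinds of return value can be seen in a small wrapper. This is only a sketch: poll_once and MAX_BATCH are names invented for the example, and retrying on EINTR (a signal interrupting the wait) is an addition that the article's program does not make.

```c
#include <errno.h>
#include <sys/epoll.h>

#define MAX_BATCH 8   /* hypothetical batch size for this sketch */

/* Returns the number of ready events (> 0), 0 on timeout, -1 on error. */
int poll_once(int epfd, int timeout_ms)
{
    struct epoll_event events[MAX_BATCH];
    int n;

    do {
        n = epoll_wait(epfd, events, MAX_BATCH, timeout_ms);
    } while (n == -1 && errno == EINTR);   /* interrupted: wait again */

    return n;
}
```

Note that the retry restarts the full timeout, which is acceptable for a sketch but makes the total wait longer than timeout_ms when signals arrive.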
The return value of epoll_wait only says how many registered events have occurred; it does not say what the events are or which sockets they belong to. This is where the data.fd member of the epoll_event structure mentioned above comes in: it is used to decide whether the event belongs to a socket of interest (many sockets and events may be registered with epoll, but not every ready event on every registered socket matters at a given moment), and the data is then read or written through that fd.
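As a sketch of this lookup, the function below walks the array filled in by epoll_wait and copies out the handles that are readable. It assumes every handle was registered with data.fd set, as in this article; collect_readable and MAX_BATCH are hypothetical names.

```c
#include <sys/epoll.h>

#define MAX_BATCH 8   /* hypothetical batch size for this sketch */

/* Stores the readable handles in out[] and returns how many were stored,
 * 0 on timeout, or -1 on error. */
int collect_readable(int epfd, int *out, int out_len, int timeout_ms)
{
    struct epoll_event events[MAX_BATCH];
    int limit = out_len < MAX_BATCH ? out_len : MAX_BATCH;
    int count = 0;

    if (limit <= 0) {
        return -1;   /* epoll_wait requires maxevents > 0 */
    }

    int n = epoll_wait(epfd, events, limit, timeout_ms);
    if (n < 0) {
        return -1;
    }

    for (int i = 0; i < n; i++) {
        /* events says what happened, data.fd says on which handle. */
        if (events[i].events & EPOLLIN) {
            out[count++] = events[i].data.fd;
        }
    }
    return count;
}
```

A caller can compare each returned handle with its listening sockets to decide between calling accept and calling recv.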
Main process structure
epoll can monitor not only whether a connection handle is readable or writable, but also whether a listening handle has connections ready to be accepted. So, building on the experiment in "[Network Programming] IO Multiplexing: select", the server listens on one more port, 5198, and the number of clients is increased to four: two clients connect to port 5197 and two connect to port 5198. The server's listening thread creates an epoll object that monitors ports 5197 and 5198 at the same time and checks for available connections. The server's worker thread creates another epoll object that monitors the established connections for readable data. After the listening thread accepts a connection, it adds the connection handle to the event pool of the epoll object created by the worker thread. The main flow is structured as follows:
Experimental results
The listening thread on the server detects available connections twice, and the worker thread detects four readable events in a single call:
Complete code implementation
Header files
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/epoll.h>
#include <sys/syscall.h>
Macro definition
#define LOCAL_IP_ADDR "127.0.0.1"
#define SERVER_LISTEN_PORT_5197 5197
#define SERVER_LISTEN_PORT_5198 5198
#define MAX_LISTEN_EVENTS 16
#define NET_MSG_BUF_LEN 128
#define CLINET_SEND_MSG "Hello Server~"
#define SERVER_SEND_MSG "Hello Client~"
Function to get thread ID
pid_t gettid(void) { return syscall(SYS_gettid); }
Client thread entry function
void* client(void* param)
{
    int iRes = 0;
    int iConnFd = 0;
    int iNetMsgLen = 0;
    int servLsnPort = *(int *)param;
    // gettid() returns the kernel thread ID (a pid_t), not a pthread_t
    pid_t thdId = gettid();
    char szNetMsg[NET_MSG_BUF_LEN] = {0};
    struct sockaddr_in stServAddr;

    iConnFd = socket(AF_INET, SOCK_STREAM, 0);
    if (-1 == iConnFd) {
        printf("Client[%d] failed to create socket, err[%s]\n", thdId, strerror(errno));
        return NULL;
    }

    // Fill in the target address structure: protocol family, target port and target host IP address
    memset(&stServAddr, 0, sizeof(stServAddr));
    stServAddr.sin_family = AF_INET;
    stServAddr.sin_port = htons(servLsnPort);
    stServAddr.sin_addr.s_addr = inet_addr(LOCAL_IP_ADDR);

    // connect: parameter 1 is the socket handle, parameter 2 is a pointer to the target
    // address structure, parameter 3 is the size of the address structure
    while (1) {
        iRes = connect(iConnFd, (struct sockaddr *)&stServAddr, sizeof(stServAddr));
        if (0 != iRes) {
            printf("Client[%d] failed to connect to[%s:%d], err[%s]\n", thdId, LOCAL_IP_ADDR, servLsnPort, strerror(errno));
            sleep(2);
            continue;
        } else {
            printf("Client[%d] succeeded to connect to[%s:%d]\n", thdId, LOCAL_IP_ADDR, servLsnPort);
            break;
        }
    }

    iNetMsgLen = send(iConnFd, CLINET_SEND_MSG, strlen(CLINET_SEND_MSG), 0);
    if (iNetMsgLen < 0) {
        printf("Client[%d] failed to send msg to server, err[%s]\n", thdId, strerror(errno));
        close(iConnFd);
        return NULL;
    }

    // Leave room for the terminating '\0' so the reply can be printed as a string
    iNetMsgLen = recv(iConnFd, szNetMsg, sizeof(szNetMsg) - 1, 0);
    if (iNetMsgLen < 0) {
        printf("Client[%d] failed to read from network, err[%s]\n", thdId, strerror(errno));
    } else {
        printf("Client[%d] recv reply[%s]\n", thdId, szNetMsg);
    }

    close(iConnFd);
    return NULL;
}
Function that handles readable events
int recvEventProc(int iEventNum, struct epoll_event *pastEvents)
{
    int iRes = 0, iIndex = 0, iNetMsgLen = 0, iConnFd = 0;
    char szNetMsg[NET_MSG_BUF_LEN] = {0};

    for (iIndex = 0; iIndex < iEventNum; iIndex++) {
        // Simplify the code with a temporary variable
        iConnFd = pastEvents[iIndex].data.fd;

        // Receive the client message; clear the buffer first and leave room for '\0'
        memset(szNetMsg, 0, sizeof(szNetMsg));
        iNetMsgLen = recv(iConnFd, szNetMsg, sizeof(szNetMsg) - 1, 0);
        if (iNetMsgLen < 0) {
            printf("Server work failed to recv from network, err[%s]\n", strerror(errno));
            break;
        }
        printf("Server work recv msg[%s]\n", szNetMsg);

        // Reply to the client
        iNetMsgLen = send(iConnFd, SERVER_SEND_MSG, strlen(SERVER_SEND_MSG), 0);
        if (iNetMsgLen < 0) {
            printf("Server work failed to reply client, err[%s]\n", strerror(errno));
            break;
        }

        close(iConnFd);
    }

    // An error occurred and the for loop was left early
    if (iIndex < iEventNum) {
        close(iConnFd);
        iRes = -1;
    }

    return iRes;
}
Server worker thread entry function
void* serverWork(void* param)
{
    int iRes = 0, iEventNum = 0;
    struct epoll_event astEvents[MAX_LISTEN_EVENTS] = {0};
    int *piEpConnFd = (int *)param;

    *piEpConnFd = epoll_create(MAX_LISTEN_EVENTS);
    if (-1 == *piEpConnFd) {
        printf("Server work failed to create epoll obj, err[%s]\n", strerror(errno));
        return NULL;
    }

    while (1) {
        printf("Server work start wait recv event.\n");
        // -1 means block indefinitely
        iEventNum = epoll_wait(*piEpConnFd, astEvents, MAX_LISTEN_EVENTS, -1);
        if (-1 == iEventNum) {
            printf("Server work failed to get recv event, err[%s]\n", strerror(errno));
            break;
        }
        printf("Server work get [%d] recv event\n", iEventNum);

        iRes = recvEventProc(iEventNum, astEvents);
        if (-1 == iRes) {
            printf("Server work failed to proc recv event\n");
            break;
        }
    }

    return NULL;
}
Function that handles available connections
int connEventProc(int iEventNum, struct epoll_event *pastEvents, int iEpConnFd)
{
    int iRes = 0, iIndex = 0, iConnFd = 0;
    socklen_t iSockAddrLen = 0;
    struct sockaddr_in stCliAddr = {0};
    struct epoll_event stEpConnEvent = {0};

    for (iIndex = 0; iIndex < iEventNum; iIndex++) {
        // iSockAddrLen must be reset before every accept, otherwise accept fails with an invalid argument error
        iSockAddrLen = sizeof(stCliAddr);
        // accept: parameter 1 is the listening handle, parameter 2 is a pointer to the address structure
        // that receives the client's address, parameter 3 passes in the size of the address structure
        iConnFd = accept(pastEvents[iIndex].data.fd, (struct sockaddr*)&stCliAddr, &iSockAddrLen);
        if (-1 == iConnFd) {
            printf("Server lsn failed to accept conn request, err[%s]\n", strerror(errno));
            iRes = -1;
            break;
        }
        printf("Server lsn accept connect request from[%s:%u]\n", inet_ntoa(stCliAddr.sin_addr), ntohs(stCliAddr.sin_port));

        // Register the event with the worker thread's epoll object
        stEpConnEvent.events = EPOLLIN;
        stEpConnEvent.data.fd = iConnFd;
        iRes = epoll_ctl(iEpConnFd, EPOLL_CTL_ADD, iConnFd, &stEpConnEvent);
        if (-1 == iRes) {
            printf("Server lsn failed to add epoll event, err[%s]\n", strerror(errno));
            // The connection cannot be monitored, so close it to avoid leaking the handle
            close(iConnFd);
            break;
        }
    }

    return iRes;
}
Function that starts listening on a port
int serverStartLsn(int iLsnPort, int *piLsnFd)
{
    int iRes = 0, iReusePort = 0;
    struct sockaddr_in stLsnAddr;

    // Create the socket
    *piLsnFd = socket(AF_INET, SOCK_STREAM, 0);
    if (-1 == *piLsnFd) {
        printf("Server lsn failed to create socket, err[%s]\n", strerror(errno));
        return -1;
    }

    // Enable port reuse (SO_REUSEPORT)
    iReusePort = 1;
    iRes = setsockopt(*piLsnFd, SOL_SOCKET, SO_REUSEPORT, &iReusePort, sizeof(iReusePort));
    if (-1 == iRes) {
        printf("Server lsn failed set reuse attr, err[%s]\n", strerror(errno));
        close(*piLsnFd);
        return -1;
    }

    memset(&stLsnAddr, 0, sizeof(stLsnAddr));
    stLsnAddr.sin_family = AF_INET;
    stLsnAddr.sin_port = htons(iLsnPort);
    stLsnAddr.sin_addr.s_addr = htonl(INADDR_ANY);

    // Bind the port
    iRes = bind(*piLsnFd, (struct sockaddr*)&stLsnAddr, sizeof(stLsnAddr));
    if (-1 == iRes) {
        printf("Server lsn failed to bind port[%d], err[%s]\n", iLsnPort, strerror(errno));
        close(*piLsnFd);
        return -1;
    } else {
        printf("Server lsn succeeded to bind port[%d], start listen.\n", iLsnPort);
    }

    iRes = listen(*piLsnFd, MAX_LISTEN_EVENTS);
    if (-1 == iRes) {
        printf("Server lsn failed to listen port[%d], err[%s]\n", iLsnPort, strerror(errno));
        close(*piLsnFd);
        return -1;
    }

    return 0;
}
Server listening thread entry function
void* serverLsn(void* param)
{
    int iRes = 0;
    int iLsnFd_5197 = 0, iLsnFd_5198 = 0;
    int iEpObj = 0, iEventNum = 0;
    int *piEpConnFd = (int *)param;
    struct epoll_event stEpEvent;
    struct epoll_event astEvents[MAX_LISTEN_EVENTS] = {0};

    // Make sure the worker thread's epoll object is valid
    if (NULL == piEpConnFd || -1 == *piEpConnFd) {
        printf("Server lsn get invalid ep conn fd\n");
        return NULL;
    }
    // Wait until the worker thread has created its epoll object
    while (0 == *piEpConnFd) {
        sleep(1);
    }

    iRes = serverStartLsn(SERVER_LISTEN_PORT_5197, &iLsnFd_5197);
    if (-1 == iRes) {
        printf("Server lsn failed to start lsn port[%d]\n", SERVER_LISTEN_PORT_5197);
        return NULL;
    }
    iRes = serverStartLsn(SERVER_LISTEN_PORT_5198, &iLsnFd_5198);
    if (-1 == iRes) {
        printf("Server lsn failed to start lsn port[%d]\n", SERVER_LISTEN_PORT_5198);
        close(iLsnFd_5197);
        return NULL;
    }

    // Create the epoll object used for listening
    iEpObj = epoll_create(MAX_LISTEN_EVENTS);
    if (-1 == iEpObj) {
        printf("Server lsn failed to create lsn epoll obj, err[%s]\n", strerror(errno));
        close(iLsnFd_5197);
        close(iLsnFd_5198);
        return NULL;
    }

    // Register both listening handles with the epoll object
    memset(&stEpEvent, 0, sizeof(stEpEvent));
    stEpEvent.events = EPOLLIN;
    stEpEvent.data.fd = iLsnFd_5197;
    iRes = epoll_ctl(iEpObj, EPOLL_CTL_ADD, iLsnFd_5197, &stEpEvent);
    if (0 == iRes) {
        stEpEvent.data.fd = iLsnFd_5198;
        iRes = epoll_ctl(iEpObj, EPOLL_CTL_ADD, iLsnFd_5198, &stEpEvent);
    }
    if (-1 == iRes) {
        printf("Server lsn failed to add epoll event, err[%s]\n", strerror(errno));
    }

    // Monitor
    while (0 == iRes) {
        printf("Server lsn start wait conn event.\n");
        memset(astEvents, 0, sizeof(astEvents));
        iEventNum = epoll_wait(iEpObj, astEvents, MAX_LISTEN_EVENTS, -1);
        if (-1 == iEventNum) {
            printf("Server lsn failed to wait conn event, err[%s].\n", strerror(errno));
            break;
        }
        printf("Server lsn get [%d] conn event\n", iEventNum);

        iRes = connEventProc(iEventNum, astEvents, *piEpConnFd);
        if (0 != iRes) {
            printf("Server lsn failed to proc conn event.\n");
        }
    }

    // Release the epoll object and both listening handles on every exit path
    close(iEpObj);
    close(iLsnFd_5197);
    close(iLsnFd_5198);
    return NULL;
}
Main function
int main()
{
    // Thread IDs; on Linux pthread_t is essentially an unsigned long integer
    pthread_t thdServerWork = 101;
    pthread_t thdServerLsn = 102;
    pthread_t thdClient1 = 1;
    pthread_t thdClient2 = 2;
    pthread_t thdClient3 = 3;
    pthread_t thdClient4 = 4;
    // Handle of the epoll object used to monitor whether sockets are readable
    int iEpConnFd = 0;
    // Ports to connect to
    int serverPort_5197 = 5197;
    int serverPort_5198 = 5198;

    // pthread_create: parameter 1 receives the thread ID, parameter 2 sets the thread attributes,
    // parameter 3 specifies the thread entry function, and parameter 4 is the argument passed to the entry function
    pthread_create(&thdServerWork, NULL, serverWork, &iEpConnFd);
    pthread_create(&thdServerLsn, NULL, serverLsn, &iEpConnFd);
    pthread_create(&thdClient1, NULL, client, &serverPort_5197);
    pthread_create(&thdClient2, NULL, client, &serverPort_5197);
    pthread_create(&thdClient3, NULL, client, &serverPort_5198);
    pthread_create(&thdClient4, NULL, client, &serverPort_5198);

    // pthread_join: parameter 1 is the thread ID, parameter 2 receives the return value of the
    // thread entry function; pass NULL if the return value is not needed
    pthread_join(thdServerWork, NULL);
    pthread_join(thdServerLsn, NULL);
    pthread_join(thdClient1, NULL);
    pthread_join(thdClient2, NULL);
    pthread_join(thdClient3, NULL);
    pthread_join(thdClient4, NULL);

    return 0;
}