1, Get data
When doing network security or traffic inspection work, we often need to capture packets. Familiar tools include tcpdump and Wireshark. Here we show how to use a raw socket in a C program to capture link-layer frames carrying IP packets on a Linux system.
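First, let's get familiar with an important function: socket(). Its description can be found in the man pages (man 2 socket). The excerpt below comes from a BSD-derived system (note PF_NDRV and connectx(2)); the Linux page lists a different set of domains, but the three parameters work the same way.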
#include <sys/socket.h>

int socket(int domain, int type, int protocol);

DESCRIPTION

socket() creates an endpoint for communication and returns a descriptor.

The domain parameter specifies a communications domain within which communication will take place; this selects the protocol family which should be used. These families are defined in the include file <sys/socket.h>. The currently understood formats are

    PF_LOCAL    Host-internal protocols, formerly called PF_UNIX,
    PF_UNIX     Host-internal protocols, deprecated, use PF_LOCAL,
    PF_INET     Internet version 4 protocols,
    PF_ROUTE    Internal Routing protocol,
    PF_KEY      Internal key-management function,
    PF_INET6    Internet version 6 protocols,
    PF_SYSTEM   System domain,
    PF_NDRV     Raw access to network device

The socket has the indicated type, which specifies the semantics of communication. Currently defined types are:

    SOCK_STREAM
    SOCK_DGRAM
    SOCK_RAW

A SOCK_STREAM type provides sequenced, reliable, two-way connection based byte streams. An out-of-band data transmission mechanism may be supported. A SOCK_DGRAM socket supports datagrams (connectionless, unreliable messages of a fixed (typically small) maximum length). SOCK_RAW sockets provide access to internal network protocols and interfaces. The type SOCK_RAW, which is available only to the super-user.

The protocol specifies a particular protocol to be used with the socket. Normally only a single protocol exists to support a particular socket type within a given protocol family. However, it is possible that many protocols may exist, in which case a particular protocol must be specified in this manner. The protocol number to use is particular to the communication domain in which communication is to take place; see protocols(5).

Sockets of type SOCK_STREAM are full-duplex byte streams, similar to pipes. A stream socket must be in a connected state before any data may be sent or received on it. A connection to another socket is created with a connect(2) or connectx(2) call. Once connected, data may be transferred using read(2) and write(2) calls or some variant of the send(2) and recv(2) calls. When a session has been completed a close(2) may be performed. Out-of-band data may also be transmitted as described in send(2) and received as described in recv(2).

The communications protocols used to implement a SOCK_STREAM insure that data is not lost or duplicated. If a piece of data for which the peer protocol has buffer space cannot be successfully transmitted within a reasonable length of time, then the connection is considered broken and calls will indicate an error with -1 returns and with ETIMEDOUT as the specific code in the global variable errno. The protocols optionally keep sockets ``warm'' by forcing transmissions roughly every minute in the absence of other activity. An error is then indicated if no response can be elicited on an otherwise idle connection for a extended period (e.g. 5 minutes). A SIGPIPE signal is raised if a process sends on a broken stream; this causes naive processes, which do not handle the signal, to exit.

SOCK_DGRAM and SOCK_RAW sockets allow sending of datagrams to correspondents named in send(2) calls. Datagrams are generally received with recvfrom(2), which returns the next datagram with its return address.

An fcntl(2) call can be used to specify a process group to receive a SIGURG signal when the out-of-band data arrives. It may also enable non-blocking I/O and asynchronous notification of I/O events via SIGIO.

The operation of sockets is controlled by socket level options. These options are defined in the file <sys/socket.h>. Setsockopt(2) and getsockopt(2) are used to set and get options, respectively.
In short, socket() creates a socket descriptor, and what kind of socket is created depends on its three parameters.
Parameter description:
- domain: selects the protocol family used for communication. If you have written TCP or UDP programs, PF_INET will be familiar: it means the socket communicates at the IPv4 network layer, and for IPv6 you would choose PF_INET6. To obtain MAC addresses we need a communication domain at the link layer, so here we choose PF_PACKET (Linux-specific, see packet(7)).
- type: specifies the communication semantics. For TCP we would choose the byte-stream type SOCK_STREAM, and for UDP we would choose SOCK_DGRAM. SOCK_RAW provides access to internal network protocols and interfaces, and it is the type we need for packet capture.
- protocol: usually this can be 0, because the first two parameters already determine the protocol. For a packet socket, however, it selects which link-layer protocol is received and must be given in network byte order. Here we choose ETH_P_IP, so only frames carrying IPv4 are delivered.
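Here is the code for creating the socket. Because it is a raw socket, the program must run as root (or with the CAP_NET_RAW capability):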
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <linux/if_ether.h>
#include <netinet/in.h>
#include <unistd.h>
#include <arpa/inet.h>

int main() {
    int sock;
    //htons(ETH_P_IP): deliver only frames that carry IPv4
    if ((sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP))) < 0) {
        perror("create socket error");   //appends the reason taken from errno
        exit(EXIT_FAILURE);
    }
}
According to the description above, SOCK_RAW sockets can receive data with recvfrom(), which is how we obtain the IP data:
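unsigned char buffer[2048];
ssize_t n_read;

memset(buffer, 0, sizeof(buffer));
//Each call returns one complete frame; the sender address is not needed here
n_read = recvfrom(sock, buffer, sizeof(buffer), 0, NULL, NULL);
if (n_read < 0) {
    perror("recvfrom");
    exit(EXIT_FAILURE);
}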
At this point the buffer holds one complete link-layer frame, starting with the Ethernet header.
2, TCP packet analysis
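As shown in the red box in the figure, a link-layer frame consists of:
- Link-layer header: destination address + source address + type
- Link-layer data: the IP packet
- CRC check (frame check sequence); on Linux the network card normally strips it, so it does not appear in the buffer returned by recvfrom()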
However, what we want to analyze is TCP data. A TCP segment belongs to the transport layer and is carried inside the network-layer IP packet, which in turn is carried inside the link-layer frame, so we have to parse the data from the bottom layer upward.
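The bottom-up walk described above comes down to offset arithmetic on the raw buffer. The following is a minimal sketch, assuming an untagged Ethernet II frame that carries IPv4 and TCP; the names `layer_offsets`, `find_layers` and `ETH_HEADER_LEN` are hypothetical and do not come from the original program.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HEADER_LEN 14  /* 6-byte dest MAC + 6-byte source MAC + 2-byte type */

/* Offsets of each layer inside a captured frame (hypothetical helper type). */
typedef struct {
    size_t ip;       /* start of the IPv4 header */
    size_t tcp;      /* start of the TCP header  */
    size_t payload;  /* start of the TCP payload */
} layer_offsets;

/*
 * Walk the frame from the link layer upward.
 * Returns 0 on success, -1 if the frame is too short or a header length is invalid.
 * Assumption: untagged Ethernet II frame carrying IPv4 + TCP.
 */
int find_layers(const unsigned char *frame, size_t len, layer_offsets *out)
{
    assert(frame != NULL && out != NULL);

    if (len < ETH_HEADER_LEN + 20)
        return -1;                          /* no room for a minimal IPv4 header */

    /* Low nibble of the first IP byte is the header length in 32-bit words. */
    size_t ip_len = (size_t)(frame[ETH_HEADER_LEN] & 0x0f) * 4;
    if (ip_len < 20 || len < ETH_HEADER_LEN + ip_len + 20)
        return -1;                          /* bad length, or no room for TCP */

    size_t tcp_off = ETH_HEADER_LEN + ip_len;

    /* High nibble of TCP byte 12 is the data offset in 32-bit words. */
    size_t tcp_len = (size_t)(frame[tcp_off + 12] >> 4) * 4;
    if (tcp_len < 20 || len < tcp_off + tcp_len)
        return -1;

    out->ip = ETH_HEADER_LEN;
    out->tcp = tcp_off;
    out->payload = tcp_off + tcp_len;
    return 0;
}
```

Checking every length against the number of bytes actually received keeps a truncated or malformed frame from causing reads past the end of the buffer.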
Following the link-layer frame format, we define a structure for the link-layer header to make parsing easier:
//MAC header
typedef struct {
    unsigned char DesMacAddr[6];   //6-byte destination MAC address
    unsigned char SrcMacAddr[6];   //6-byte source MAC address
    unsigned short LengthOrType;   //2-byte length or EtherType, in network byte order
}__attribute__((packed)) MAC_HEADER, *PMAC_HEADER;

#define MAC_HEADER_SIZE 14         //6 + 6 + 2 bytes; offset of the IP header in the frame
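As an illustration of how the fields of this header can be read, here is a small sketch that formats a MAC address in the usual colon-separated form and extracts the type field. It works on raw bytes instead of the struct so that it does not depend on host byte order; `format_mac` and `frame_ethertype` are hypothetical helper names, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a 6-byte MAC address as "aa:bb:cc:dd:ee:ff"; out must hold 18 bytes. */
void format_mac(const unsigned char mac[6], char out[18])
{
    assert(mac != NULL && out != NULL);
    snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
             mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}

/*
 * Read the 2-byte type field that follows the two MAC addresses.
 * The field is stored in network byte order (big-endian), so it is
 * assembled byte by byte. Returns -1 if the frame is shorter than
 * the 14-byte header.
 */
long frame_ethertype(const unsigned char *frame, size_t len)
{
    assert(frame != NULL);
    if (len < 14)
        return -1;
    return ((long)frame[12] << 8) | frame[13];
}
```

A value of 0x0800 in the type field means the frame carries IPv4, which is what a socket opened with ETH_P_IP delivers.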
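IP header
//IP header (the bit-field order below assumes a little-endian host)
typedef struct {
    unsigned char hdr_len: 4;        //header length in 32-bit words
    unsigned char version: 4;        //IP version, 4 for IPv4
    unsigned char tos;               //type of service
    unsigned short total_len;        //total length of header + data, in bytes
    unsigned short identifier;       //identification
    unsigned short frag_and_flags;   //3-bit flags + 13-bit fragment offset
    unsigned char ttl;               //time to live
    unsigned char protocol;          //upper-layer protocol (TCP, UDP, ICMP, ...)
    unsigned short checksum;         //header checksum
    unsigned int source_ip;          //source address
    unsigned int dest_ip;            //destination address
}__attribute__((packed)) IP_HEADER, *PIP_HEADER;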
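The checksum field of this header can be used to reject corrupted headers before parsing further. The original program does not do this; the sketch below is one possible way to compute the standard Internet checksum (RFC 1071) over the header bytes, with `ipv4_checksum` being a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Internet checksum (RFC 1071) over an IPv4 header of hdr_len bytes.
 * Over a header whose checksum field is zero, the result is the value
 * to store in that field. Over a received header (checksum field
 * included), the result is 0 when the header is intact.
 */
uint16_t ipv4_checksum(const unsigned char *hdr, size_t hdr_len)
{
    assert(hdr != NULL && hdr_len % 2 == 0);

    uint32_t sum = 0;
    for (size_t i = 0; i < hdr_len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];  /* 16-bit big-endian words */

    while (sum >> 16)                                 /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

In the parsing code this could be called as `ipv4_checksum(buffer + MAC_HEADER_SIZE, ip_header_len)`, treating any nonzero result as a damaged header. Note that with checksum offloading enabled, locally generated packets may legitimately show an unfilled checksum when captured on the sending host.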
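TCP header
//TCP header. Only m_sSourPort, m_sDestPort and m_uiHeadOff are used by the parsing code;
//the other field names are assumed, and the layout follows the standard TCP header.
typedef struct {
    unsigned short m_sSourPort;      //16-bit source port
    unsigned short m_sDestPort;      //16-bit destination port
    unsigned int m_uiSeqNum;         //32-bit sequence number
    unsigned int m_uiAckNum;         //32-bit acknowledgment number
    unsigned char m_uiHeadOff;       //high 4 bits: header length in 32-bit words
    unsigned char m_ucFlags;         //control flags (FIN, SYN, RST, PSH, ACK, URG, ...)
    unsigned short m_usWindow;       //window size
    unsigned short m_usChecksum;     //checksum
    unsigned short m_usUrgentPtr;    //urgent pointer
}__attribute__((packed)) TCP_HEADER, *PTCP_HEADER;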
Here is the code that parses each part:
/********************mac header*******************/
PMAC_HEADER pmacHeader = (PMAC_HEADER) buffer;
printf("Source Mac:");
for (i = 0; i < 6; ++i) {
    printf("%02x", pmacHeader->SrcMacAddr[i]);
}
printf(" ");
printf("Dest Mac:");
for (i = 0; i < 6; ++i) {
    printf("%02x", pmacHeader->DesMacAddr[i]);
}
printf("\n");

/********************ip header**********************/
PIP_HEADER pipHeader = (PIP_HEADER) (buffer + MAC_HEADER_SIZE);
int total_len = ntohs(pipHeader->total_len);
ip_header_len = pipHeader->hdr_len * 4;
//A valid IPv4 header is between 20 and 60 bytes long
if (ip_header_len < 20 || ip_header_len > 60) {
    exit(0);
}
memcpy(&des_addr, &pipHeader->dest_ip, 4);
memcpy(&src_addr, &pipHeader->source_ip, 4);
int proto = pipHeader->protocol;
switch (proto) {
    case IPPROTO_ICMP:
        printf("ICMP\n");
        break;
    case IPPROTO_IGMP:
        printf("IGMP\n");
        break;
    case IPPROTO_IPIP:
        printf("IPIP\n");
        break;
    case IPPROTO_TCP: {
        printf("TCP:");
        PTCP_HEADER tcpHeader = (PTCP_HEADER) (buffer + MAC_HEADER_SIZE + ip_header_len);
        tcp_header_len = ((tcpHeader->m_uiHeadOff & 0xf0) >> 4) * 4;
        int data_len = total_len - ip_header_len - tcp_header_len;
        //inet_ntoa() returns a pointer to a static buffer, so print each address in its own call.
        //Ports are in network byte order and need ntohs().
        printf("%s.%d-->", inet_ntoa(src_addr), ntohs(tcpHeader->m_sSourPort));
        printf("%s.%d Len:%d\n", inet_ntoa(des_addr), ntohs(tcpHeader->m_sDestPort), data_len);
        int tcp_data_index = MAC_HEADER_SIZE + ip_header_len + tcp_header_len;
        unsigned char *p = buffer + tcp_data_index;
        //Never read past the bytes actually received
        if (data_len > n_read - tcp_data_index) {
            data_len = n_read - tcp_data_index;
        }
        if (data_len > 0) {
            printf("Data:");
            for (int k = 0; k < data_len; ++k) {
                printf("%02x ", p[k]);
            }
            //printf("\n");
            for (int k = 0; k < data_len; ++k) {
                printf("%c", p[k]);
            }
            printf("\n");
        }
        break;
    }
    case IPPROTO_UDP:
        printf("UDP\n");
        break;
    case IPPROTO_RAW:
        printf("RAW\n");
        break;
    default:
        printf("Unknown\n");
}
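One detail of the address printing deserves a closer look: inet_ntoa() returns a pointer to a single static buffer, so two calls within one printf() end up printing the same address twice. A possible alternative, sketched here under the assumption that both addresses are available as `struct in_addr`, is inet_ntop(), which writes into a caller-supplied buffer. `format_endpoints` is a hypothetical helper name.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "src-->dst" into out. inet_ntop() fills a buffer supplied by the
 * caller, so both addresses can be formatted before a single print.
 * Returns 0 on success, -1 on conversion failure or if out is too small.
 */
int format_endpoints(struct in_addr src, struct in_addr dst,
                     char *out, size_t out_len)
{
    char s[INET_ADDRSTRLEN], d[INET_ADDRSTRLEN];

    assert(out != NULL);
    if (inet_ntop(AF_INET, &src, s, sizeof s) == NULL ||
        inet_ntop(AF_INET, &dst, d, sizeof d) == NULL)
        return -1;

    int n = snprintf(out, out_len, "%s-->%s", s, d);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Unlike inet_ntoa(), inet_ntop() also handles IPv6 addresses when called with AF_INET6 and a buffer of INET6_ADDRSTRLEN bytes.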
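Operation result from a sample run. This output was most likely captured with both inet_ntoa() calls inside a single printf() and with the ports printed without ntohs(): both endpoints of each TCP line show the same address, and the destination port 5632 (0x1600) is port 22 (0x0016) with its bytes swapped.
IP Source Mac:9ca615de20d0 Dest Mac:94c6919aa8f4
TCP:222.131.155.252.56539-->222.131.155.252.5632 Len:0
IP Source Mac:b888e3dc810e Dest Mac:ffffffffffff
UDP
IP Source Mac:9ca615de20d0 Dest Mac:94c6919aa8f4
TCP:222.131.155.252.56539-->222.131.155.252.5632 Len:52
Data:c7 e4 e1 18 5c b8 44 91 34 bc a2 2d b8 da ae 64 52 8f ab 3d f8 70 db db 65 2c 2d 2c 9a cd 2d 02 54 e5 db d2 5c 54 8c 7d 1d fe 05 1e c7 d8 e4 b9 23 d8 09 fc ���\�D�4��-�ڮdR��=�p��e,-,��-T���\T�}����#� �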