The capture_session structure in Wireshark represents one capture process. By default, when the user selects a network interface and starts a capture, Wireshark launches a dumpcap child process. dumpcap sends messages to the Wireshark main program over a pipe using a custom protocol. These messages include the path of the capture file, changes in working state (for example pause, stop, or error), and notifications of newly captured packets.
(1) Initialize capture session
The capture_session_init() function is called at program startup to initialize the basic members of the session structure. The call stack is as follows:
capture_session_init(...)
capture_input_init(capture_session *cap_session, capture_file *cf)
MainWindow::MainWindow(...)
The cap_session passed to capture_input_init() here is a private member of MainWindow, and cf is a global object. capture_session_init() mainly does the following:
- Set cap_session->cf to the global capture_file passed as the second parameter of capture_input_init().
- Initialize the capture child process ID to -1.
- Initialize the pipe-related fds to -1.
- Set the current session state to CAPTURE_STOPPED.
- Set the captured packet count to 0.
- Set the "session is about to restart" flag to FALSE.
- Set up several callback functions to call when the corresponding events occur. These functions are defined in capture.c:
  - capture_session.new_file = capture_input_new_file
  - capture_session.new_packets = capture_input_new_packets
  - capture_session.drops = capture_input_drops
  - capture_session.error = capture_input_error
  - capture_session.cfilter_error = capture_input_cfilter_error
  - capture_session.closed = capture_input_closed
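The initialization steps listed above can be sketched in a few lines of C. This is only an illustration: the struct is a cut-down stand-in for the real capture_session, session_init_sketch() is a hypothetical name, and only one of the six callbacks is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the real Wireshark types (assumptions). */
typedef struct capture_file { int unused; } capture_file;

typedef enum {
    CAPTURE_STOPPED,
    CAPTURE_PREPARING,
    CAPTURE_RUNNING
} capture_state;

typedef struct capture_session capture_session;
typedef void (*new_packets_fn)(capture_session *cap_session, int to_read);

struct capture_session {
    capture_file  *cf;                   /* capture file fed by this session */
    int            fork_child;           /* child process id, -1 = none */
    int            signal_pipe_write_fd; /* pipe fd, -1 = not open */
    capture_state  state;                /* current session state */
    unsigned       count;                /* packets captured so far */
    bool           session_will_restart; /* restart pending? */
    new_packets_fn new_packets;          /* one of the six callbacks */
};

/* Hypothetical helper that mirrors the steps in the list above. */
void session_init_sketch(capture_session *s, capture_file *cf,
                         new_packets_fn new_packets)
{
    assert(s != NULL);
    s->cf = cf;
    s->fork_child = -1;
    s->signal_pipe_write_fd = -1;
    s->state = CAPTURE_STOPPED;
    s->count = 0;
    s->session_will_restart = false;
    s->new_packets = new_packets;
}
```

Storing the callbacks as function pointers in the session is what lets the pipe-reading code stay independent of the UI: it only calls whatever was registered here.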
(2) Start capture (capture_start())
When the user chooses the network card interface to start capture, the call stack is as follows:
Wireshark.exe!capture_start(...)
Wireshark.exe!MainWindow::startCapture()
capture_start() mainly does the following:
- Set the session state to capture_session.state = CAPTURE_PREPARING in preparation for the capture.
- Reset the packet count: capture_session.count = 0.
- Build a string describing the capture source (the interface list) from capture_options and store it in capture_session.cf.source.
- Call sync_pipe_start(), which does the following:
  - Build the command-line arguments of the dumpcap child process from the capture_options object.
  - Create a pipe for communicating with the dumpcap child process.
  - Create the dumpcap child process.
  - Save the child process handle to capture_session.fork_child.
  - Save the pipe write fd to capture_session.signal_pipe_write_fd.
  - Reset the child process exit code: capture_session.fork_child_status = 0.
  - Save the capture_options object to capture_session.capture_opts.
  - Initialize capture_session.cap_data_info to point to MainWindow::info_data_, which appears to be used for packet statistics.
  - Call pipe_input_set_handler() to set the pipe read event handler to sync_pipe_input_cb(), which is called later whenever a message arrives on the pipe.
  - Initialize capture_session.rec with wtap_rec_init(). It is used in the read routine capture_session.new_packets to hold packet metadata (the wtap_rec structure is described in detail later).
  - Initialize capture_session.buf with ws_buffer_init() and request 1514 bytes for it (presumably the 1500-byte Ethernet MTU plus the 14-byte Ethernet header). It stores packet data and is also used in the read routine capture_session.new_packets.
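The pipe-plus-child-process pattern described above can be sketched with plain POSIX calls. This is not Wireshark's code: child_session, start_child_sketch() and drain_child_sketch() are hypothetical names, the child here just writes one string instead of running dumpcap, and the parent keeps a read fd for simplicity.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the session fields filled in at start-up. */
typedef struct {
    pid_t fork_child;        /* child process id */
    int   fork_child_status; /* exit status, reset to 0 at start */
    int   pipe_read_fd;      /* parent reads child messages here */
} child_session;

/* Start a child that writes one message to the pipe and exits.
 * Returns 0 on success, -1 on failure. */
int start_child_sketch(child_session *cs, const char *msg)
{
    int fds[2];

    assert(cs != NULL && msg != NULL);
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: plays the role of dumpcap. */
        close(fds[0]);
        ssize_t n = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(n < 0 ? 1 : 0);
    }

    /* Parent: keep the read end and remember the child. */
    close(fds[1]);
    cs->fork_child = pid;
    cs->fork_child_status = 0;
    cs->pipe_read_fd = fds[0];
    return 0;
}

/* Read everything the child wrote, then reap it. Returns bytes read. */
ssize_t drain_child_sketch(child_session *cs, char *buf, size_t cap)
{
    size_t total = 0;
    ssize_t n;

    while (total < cap &&
           (n = read(cs->pipe_read_fd, buf + total, cap - total)) > 0)
        total += (size_t)n;
    close(cs->pipe_read_fd);
    waitpid(cs->fork_child, &cs->fork_child_status, 0);
    return (ssize_t)total;
}
```

The real program does not block in a read loop like this. It registers a handler on the pipe so the UI event loop calls back when data is available.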
(3) Opening data capture files
After the capture has been started (that is, after the dumpcap child process is running), the child process sends an SP_FILE message through the pipe (see the sync_pipe_input_cb() function for the processing details). This message carries the name of the capture file into which the dumpcap child process writes packets in the agreed format. When this message is processed, the capture_session.new_file callback is called to initialize the capture_file structure.
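To make the idea of a framed pipe message concrete, here is a minimal sketch of encoding and decoding a message header. It assumes a 4-byte header made of one indicator byte followed by a 24-bit big-endian payload length, and assumes 'F' as the indicator for a new-file message; both should be checked against sync_pipe.h and capture_sync.c before relying on them. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SP_FILE_SKETCH 'F'  /* assumed indicator for "new capture file" */
#define SP_HEADER_LEN  4    /* 1 indicator byte + 3 length bytes (assumed) */

/* Write the header: indicator, then payload length as 24-bit big-endian. */
void sp_encode_header(uint8_t out[SP_HEADER_LEN], char indicator, uint32_t len)
{
    assert(len <= 0xFFFFFFu);  /* must fit in 3 bytes */
    out[0] = (uint8_t)indicator;
    out[1] = (uint8_t)(len >> 16);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)len;
}

/* Read the header back into an indicator and a payload length. */
void sp_decode_header(const uint8_t in[SP_HEADER_LEN],
                      char *indicator, uint32_t *len)
{
    assert(indicator != NULL && len != NULL);
    *indicator = (char)in[0];
    *len = ((uint32_t)in[1] << 16) | ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}
```

A reader first consumes the fixed-size header, then reads exactly the announced number of payload bytes (for a new-file message, the file name) before dispatching on the indicator.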
capture_session.new_file (implemented by capture_input_new_file()) does the following:
- Check the session state to ensure it is CAPTURE_PREPARING or CAPTURE_RUNNING.
- Check capture_opts to determine whether this is temporary-file mode. If the user has not configured Wireshark to save to a specific file, the capture is in temporary-file mode (the usual case). Set capture_session.cf.is_tempfile accordingly.
- Set capture_opts.save_file to the file name received from the pipe.
- Execute cf_open() to open the temporary file and set the session up to read from it:
  - Execute wtap_open_offline() to open the file and to allocate and initialize a wtap structure. During this process the wiretap library determines the specific file type and sets the corresponding handler functions in the wtap structure. (For the default pcapng capture format, for example, these are pcapng_read, pcapng_seek_read, and pcapng_close; see the "About wtap_open_offline" section below.)
  - Close capture_file with cf_close() and reset the state associated with the previously open file.
  - Initialize the record metadata (capture_file.rec, of type wtap_rec) using wtap_rec_init().
  - Initialize capture_file.buf using ws_buffer_init().
  - Set the capture_file state to FILE_READ_IN_PROGRESS to indicate that reading of the file is about to start.
  - Assign the wtap structure pointer returned by wtap_open_offline() to capture_file.provider.wth.
  - Initialize the f_datalen, filename, is_tempfile, unsaved_changes, computed_elapsed, cd_t, open_type, and other members of capture_file.
  - Initialize capture_file.provider.frames, where all frames read from the file are stored. This is a four-level radix tree with 1024 child nodes per node, so a tree of one, two, three, or four levels can hold 1024, 1024^2, 1024^3, or 1024^4 frames respectively.
  - Create a new epan (the packet dissection module handle) with capture_file as the parameter.
  - Notify the UI to redraw the packet list.
  - Set additional wiretap callbacks such as wtap_set_cb_new_ipv4 and wtap_set_cb_new_ipv6 (name resolution) and wtap_set_cb_new_secrets (applies to pcapng only).
- Set the session state to CAPTURE_RUNNING.
- Call capture_callback_invoke() to run the callbacks registered by the UI so that the corresponding UI elements are updated.
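The state handling in the steps above (accept the file only while preparing or running, decide on temporary-file mode, then move to CAPTURE_RUNNING) can be sketched as follows. session_sketch and on_new_file_sketch() are hypothetical names, and the file-opening work is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    CAPTURE_STOPPED,
    CAPTURE_PREPARING,
    CAPTURE_RUNNING
} capture_state;

/* Cut-down stand-in for the session and capture options (assumption). */
typedef struct {
    capture_state state;
    bool          is_tempfile; /* true when the user did not pick a file */
    const char   *save_file;   /* name announced by the child process */
} session_sketch;

/* Returns false if the session is in a state that may not receive a file. */
bool on_new_file_sketch(session_sketch *s, const char *new_file,
                        bool user_chose_file)
{
    assert(s != NULL && new_file != NULL);
    if (s->state != CAPTURE_PREPARING && s->state != CAPTURE_RUNNING)
        return false;

    s->is_tempfile = !user_chose_file;
    s->save_file = new_file;
    /* ... here the real code opens the file and resets capture_file ... */
    s->state = CAPTURE_RUNNING;
    return true;
}
```

Accepting the message in CAPTURE_RUNNING as well as CAPTURE_PREPARING allows a new file to be announced while a capture is already in progress.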
About wtap_open_offline
This function opens a file, then allocates and prepares a wtap structure. If do_random is TRUE, the file is opened twice; the second open lets the application do random-access I/O without moving the seek offset used for sequential I/O. Wireshark relies on this so that it can read sequentially from a capture file that is still being written as new packets arrive, independently of the random-access reads done to display the protocol tree of a packet when it is selected.
- Allocate a wtap structure.
- Execute file_open() and assign the result to wtap.fh.
- If random reading is allowed (do_random = TRUE), execute file_open() again and assign the result to wtap.random_fh.
- Initialize some members of wtap:
  - Set the timestamp precision: file_tsprec = WTAP_TSPREC_USEC.
  - Initialize the shb_hdrs list and add the first wtap_block to it.
  - Set the path name to that of the capture file created by dumpcap.
  - Initialize ispipe, file_encap, subtype_sequential_close, subtype_close, priv, wslua_data, and so on to default values.
  - Initialize interface_data, an array containing the list of interfaces. This is required for pcapng_open and erf_open (and for libpcap_open with ERF encapsulation types). It is always initialized here to avoid NULL pointer checks later.
  - Initialize next_interface_data, the next interface data that wtap_get_next_interface_description() will return.
- If random reading is allowed, call file_set_random_access() to enable random access on the file.
- Loop over all entries in the open_routines list to find a routine that can open the file, which determines the capture file type. open_routines is initialized from the open_info_base[] array (see init_open_routines()), which describes the various capture file types (typically by magic number or extension). By libwiretap convention, the routine that recognizes the file also sets the format-specific handlers in the wtap structure: wtap.subtype_read (sequential read), wtap.subtype_seek_read (random read), and wtap.subtype_close. For the pcapng format that Wireshark uses by default, these are pcapng_read(), pcapng_seek_read(), and pcapng_close(). So when wtap.subtype_read is invoked, the pcapng_read() routine is what actually runs.
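The probing loop in the last step can be illustrated with a small table of "open" routines that each inspect the first bytes of a file. This is a simplified sketch, not libwiretap code: the *_sketch names are hypothetical, detection uses only the leading four bytes, and the routines record a type name instead of installing read handlers. The magic values themselves are the standard ones (a pcapng file starts with a Section Header Block of type 0x0A0D0D0A; classic pcap uses 0xA1B2C3D4).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader state; the real wtap also gets its handlers set. */
typedef struct {
    const char *file_type; /* set by the routine that claims the file */
} reader_sketch;

typedef int (*open_fn)(reader_sketch *r, const uint8_t *head, size_t len);

static int match_magic(const uint8_t *head, size_t len, const uint8_t magic[4])
{
    return len >= 4 && memcmp(head, magic, 4) == 0;
}

int pcapng_open_sketch(reader_sketch *r, const uint8_t *head, size_t len)
{
    /* Section Header Block type; the bytes read the same in either order. */
    static const uint8_t magic[4] = { 0x0A, 0x0D, 0x0D, 0x0A };
    if (!match_magic(head, len, magic))
        return 0;
    r->file_type = "pcapng";
    return 1;
}

int pcap_open_sketch(reader_sketch *r, const uint8_t *head, size_t len)
{
    /* 0xA1B2C3D4 as stored by a little-endian writer. */
    static const uint8_t magic[4] = { 0xD4, 0xC3, 0xB2, 0xA1 };
    if (!match_magic(head, len, magic))
        return 0;
    r->file_type = "pcap";
    return 1;
}

static const open_fn open_routines_sketch[] = {
    pcapng_open_sketch,
    pcap_open_sketch
};

/* Try each routine in turn; the first one that accepts the file wins. */
const char *detect_type_sketch(const uint8_t *head, size_t len)
{
    reader_sketch r = { NULL };
    size_t n = sizeof open_routines_sketch / sizeof open_routines_sketch[0];

    for (size_t i = 0; i < n; i++) {
        if (open_routines_sketch[i](&r, head, len))
            return r.file_type;
    }
    return NULL; /* no routine recognized the file */
}
```

Keeping the routines in a table means supporting a new format only requires adding one entry, which matches the role the text gives to open_info_base[].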
(4) Reading packets from the file
By default, captured data is stored in the pcapng (pcap next generation) format. When the dumpcap child process has newly captured data (saved to a temporary file), it notifies the Wireshark main process through the pipe. The main process then calls the capture_input_new_packets() function in capture.c to read and parse these packets. It is declared as follows:
    /* The capture child tells us that there are new packets to read */
    static void capture_input_new_packets(capture_session *cap_session, int to_read);
The first parameter is the capture session, and the second parameter is the number of newly arrived packets that need to be read. Inside the function, the records are read as follows:
    if (capture_opts->real_time_mode) {
        /* Read records from the capture file. The child process tells us how
           many records it has added (passed in the to_read parameter).
           cap_session->rec and cap_session->buf are the buffers that
           receive the new packets. */
        switch (cf_continue_tail((capture_file *)cap_session->cf, to_read,
                                 &cap_session->rec, &cap_session->buf, &err)) {
        ...
        }
    }
cf_continue_tail() is declared as follows:
    /**
     * Read packets from the current end of the capture file.
     *
     * @param cf      Capture file to read from
     * @param to_read Number of packets to read
     * @param rec     Pointer to a wtap_rec structure used for reading
     * @param buf     Pointer to a Buffer structure used for reading
     * @param err     The error code, if an error occurred
     * @return one of cf_read_status_t
     */
    cf_read_status_t cf_continue_tail(capture_file *cf, volatile int to_read,
                                      wtap_rec *rec, Buffer *buf, int *err);
This function is defined in the file.c file.
The key points of this function are:
    // Omit some code...
    epan_dissect_t edt;
    // Omit some code...

    /* Initialize an epan_dissect_t */
    epan_dissect_init(&edt, cf->epan, create_proto_tree, FALSE);

    TRY {
        gint64 data_offset = 0;
        column_info *cinfo;

        /* If any tap listeners require the columns, construct them. */
        cinfo = (tap_flags & TL_REQUIRES_COLUMNS) ? &cf->cinfo : NULL;

        // Loop over the to_read packets
        while (to_read != 0) {
            wtap_cleareof(cf->provider.wth);
            // Read this packet into rec and buf
            if (!wtap_read(cf->provider.wth, rec, buf, err, &err_info, &data_offset)) {
                break;
            }
            if (cf->state == FILE_READ_ABORTED) {
                /* The user decided to exit Wireshark; leave the loop */
                break;
            }
            // Dissect the packet (and filter it with dfcode) using the epan_dissect_t
            if (read_record(cf, rec, buf, dfcode, &edt, cinfo, data_offset)) {
                newly_displayed_packets++;
            }
            to_read--;
        }
    }
    CATCH(OutOfMemoryError) {
        ...... // Error handling; exit the program if the error is serious.
    }
    ENDTRY;

    // ...omit part of the code

    // After parsing is complete
    epan_dissect_cleanup(&edt);
The two key calls here are wtap_read() and read_record().
The role of wtap_read() is as follows:
- Initialize a wtap_rec structure, then read the packet's metadata from the capture file into it. Metadata here means information such as the offset of the packet in the file and the length of the packet.
- Read the packet data from the file into a Buffer structure. Buffer.data points to the actual data.
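The split between metadata and packet data can be illustrated by parsing a pcapng Enhanced Packet Block from memory. This is a simplified sketch rather than what pcapng_read() does: it assumes a little-endian section, ignores options, and uses hypothetical names (rec_sketch, read_epb_sketch). The field order follows the pcapng block layout: block type 6, total length, interface ID, timestamp high and low words, captured length, original length, then the packet bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical metadata record, loosely modelled on wtap_rec. */
typedef struct {
    uint32_t interface_id;
    uint64_t timestamp; /* raw units; resolution depends on the interface */
    uint32_t caplen;    /* bytes stored in the file */
    uint32_t len;       /* bytes on the wire */
} rec_sketch;

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill rec with the block's metadata and return a pointer to the packet
 * bytes, or NULL if the buffer does not hold a valid block. */
const uint8_t *read_epb_sketch(const uint8_t *blk, size_t n, rec_sketch *rec)
{
    assert(blk != NULL && rec != NULL);
    /* 28 bytes of fixed fields plus the 4-byte trailing length. */
    if (n < 32 || rd_le32(blk) != 0x00000006u)
        return NULL;

    uint32_t total = rd_le32(blk + 4);
    if (total < 32 || total > n)
        return NULL;

    rec->interface_id = rd_le32(blk + 8);
    rec->timestamp = ((uint64_t)rd_le32(blk + 12) << 32) | rd_le32(blk + 16);
    rec->caplen = rd_le32(blk + 20);
    rec->len = rd_le32(blk + 24);
    if (rec->caplen > total - 32)
        return NULL;

    return blk + 28; /* packet data starts right after the fixed fields */
}
```

The caller ends up with exactly the two things the text describes: a small metadata record and a pointer to the raw bytes.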
The role of read_record() is as follows:
- If a filter is set, test the packet against it; if the packet is to be filtered out, skip it and return FALSE.
- Build a frame_data from the wtap_rec structure and add it to the radix tree capture_file.provider.frames (radix 1024), which is the only place where frames are stored. Note that frame_data does not contain the packet data. Keeping every packet in memory would exhaust memory very quickly, so capture_file.provider.frames is only an index.
- Dissect this frame by passing the Buffer, the wtap_rec, and related data to epan_dissect_run_with_taps().
- Call packet_list_append() to add this frame to the packet list (it is added to the PacketListModel object so that the UI can draw it; see PacketListModel::appendPacket()).
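How a frame index maps onto a radix-1024 tree can be shown with a little bit arithmetic. This is an illustrative sketch with hypothetical helper names, assuming zero-based frame indices and 10 bits consumed per level (2^10 = 1024); it does not reproduce the actual tree code.

```c
#include <assert.h>
#include <stdint.h>

#define LOG2_NODES_PER_LEVEL 10
#define NODES_PER_LEVEL (1u << LOG2_NODES_PER_LEVEL) /* 1024 */

/* Slot to follow at a given level (0 = leaf level) for a frame index. */
unsigned radix_slot(uint32_t index, unsigned level)
{
    assert(level < 4); /* the tree has at most four levels */
    return (index >> (LOG2_NODES_PER_LEVEL * level)) & (NODES_PER_LEVEL - 1);
}

/* Number of levels needed to address `count` frames (1 to 4). */
unsigned radix_levels_needed(uint32_t count)
{
    unsigned levels = 1;
    uint64_t capacity = NODES_PER_LEVEL;

    while (capacity < count) {
        capacity *= NODES_PER_LEVEL;
        levels++;
    }
    return levels;
}
```

Because each level consumes a fixed 10 bits of the index, looking up a frame takes at most four pointer hops regardless of how many frames the capture holds.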
With regard to the metadata wtap_rec, it is defined as follows:
    typedef struct {
        guint     rec_type;        /* Record type, usually REC_TYPE_PACKET */
        guint32   presence_flags;  /* Which values are meaningful: a combination of
                                      WTAP_HAS_TS (1), WTAP_HAS_CAP_LEN (2) and
                                      WTAP_HAS_INTERFACE_ID (4). Ethernet frames
                                      normally have all three, so the value
                                      should be 7 */
        nstime_t  ts;              /* Time stamp */
        int       tsprec;          /* Timestamp precision, WTAP_TSPREC_XXX */
        union {
            wtap_packet_header packet_header;           // Valid for REC_TYPE_PACKET
            wtap_ft_specific_header ft_specific_header; // Ignore for now
            wtap_syscall_header syscall_header;         // Ignore for now
            wtap_systemd_journal_export_header systemd_journal_export_header; // Ignore for now
            wtap_custom_block_header custom_block_header; // Ignore for now
        } rec_header;

        wtap_block_t block;              /* packet block; holds comments and verdicts in its options */
        gboolean     block_was_modified; /* TRUE if ANY aspect of the block has been modified */

        /*
         * Buffer used to save record-related options; can be ignored for now.
         */
        Buffer    options_buf;     /* file-type specific data */
    } wtap_rec;

    typedef struct {
        guint32   caplen;          /* Length of the data in the file; caplen is always <= len */
        guint32   len;             /* Length of the data on the wire; equal to caplen
                                      if the packet was not truncated */
        int       pkt_encap;       /* WTAP_ENCAP_XXX encapsulation type; for Ethernet
                                      this should be WTAP_ENCAP_ETHERNET */
        guint32   interface_id;    /* Network interface ID */
        union wtap_pseudo_header pseudo_header; /* Pseudo-header; ignore for now */
    } wtap_packet_header;

    /* Describes a data block that is allocated when data is read from a file.
       Its exact purpose has not been examined yet. */
    struct wtap_block {
        wtap_blocktype_t* info;    // See the description below
        void* mandatory_data;
        GArray* options;
        gint ref_count;
    #ifdef DEBUG_COUNT_REFS
        guint id;
    #endif
    };

    /* Describes the type of a data block.
       Its exact purpose has not been examined yet. */
    typedef struct {
        wtap_block_type_t block_type;  /* Block type; a common value is WTAP_BLOCK_PACKET */
        const char *name;              /* Block name; a common value is "EPB/SPB/PB" */
        const char *description;       /* Readable block name; a common value is "Packet Block" */
        wtap_block_create_func create;
        wtap_mand_free_func free_mand;
        wtap_mand_copy_func copy_mand;
        GHashTable *options;           /**< hash table of known options */
    } wtap_blocktype_t;
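As a small worked example of the presence_flags field, the sketch below combines the three flag values quoted in the listing (1, 2, and 4) and tests them individually. The macros are redefined locally so the snippet stands alone, and rec_has_sketch() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as quoted in the presence_flags comment in the listing. */
#define WTAP_HAS_TS           0x1u
#define WTAP_HAS_CAP_LEN      0x2u
#define WTAP_HAS_INTERFACE_ID 0x4u

/* True if the given flag bit is set in presence_flags. */
bool rec_has_sketch(uint32_t presence_flags, uint32_t flag)
{
    assert(flag != 0);
    return (presence_flags & flag) != 0;
}
```

With all three bits set the field is 1 | 2 | 4 = 7, which is the value the comment in the listing expects for captured Ethernet frames.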
The definition of frame_data is as follows:
    typedef struct _frame_data {
        guint32      num;          /**< Frame number; the "No." column (first column) in the default packet list */
        guint32      pkt_len;      /**< Packet length; corresponds to column 6 */
        guint32      cap_len;      /**< Actual captured length, usually equal to pkt_len */
        guint32      cum_bytes;    /**< Cumulative number of captured bytes */
        gint64       file_off;     /**< Offset in the file. Important: the file is read
                                        at this offset whenever the packet has to be parsed */
        /* These two are pointers, meaning 64-bit on LP64 (64-bit UN*X) and
           LLP64 (64-bit Windows) platforms. Put them here, one after the other,
           so they don't require padding between them. */
        GSList      *pfd;          /**< Per frame proto data */
        const struct _color_filter *color_filter; /**< Per-packet matching color_filter_t object */
        guint16      subnum;       /**< subframe number, for protocols that require this */
        /* Keep the bitfields below to 16 bits, so this plus the previous field
           are 32 bits. */
        unsigned int passed_dfilter         : 1; /**< 1 = display, 0 = no display */
        unsigned int dependent_of_displayed : 1; /**< 1 if a displayed frame depends on this frame */
        /* Do NOT use packet_char_enc enum here: MSVC compiler does not handle
           an enum in a bit field properly */
        unsigned int encoding               : 1; /**< Character encoding (ASCII, EBCDIC...) */
        unsigned int visited                : 1; /**< Has this packet been visited yet? 1 = yes, 0 = no */
        unsigned int marked                 : 1; /**< 1 = marked by user, 0 = normal */
        unsigned int ref_time               : 1; /**< 1 = marked as a reference time frame, 0 = normal */
        unsigned int ignored                : 1; /**< 1 = ignore this frame, 0 = normal */
        unsigned int has_ts                 : 1; /**< 1 = has time stamp, 0 = no time stamp */
        unsigned int has_phdr_block         : 1; /**< 1 = there's a block (possibly with options) for this packet */
        unsigned int has_modified_block     : 1; /**< 1 = block for this packet has been modified */
        unsigned int need_colorize          : 1; /**< 1 = need to (re-)calculate packet color */
        unsigned int tsprec                 : 4; /**< Timestamp precision */
        nstime_t     abs_ts;        /**< Time stamp; the "Time" column (second column) in the default packet list */
        nstime_t     shift_offset;  /**< Timestamp offset; ignore for now */
        guint32      frame_ref_num; /**< Previous reference frame (0 if this is one) */
        guint32      prev_dis_num;  /**< Previous displayed frame (0 if first one) */
    } frame_data;
When a frame_data needs to be displayed in the UI, the program calls either cf_read_record_no_alert() or cf_read_record(), which re-reads the metadata (wtap_rec) and the packet data (Buffer) from the file using the information stored in the frame_data, and then executes epan_dissect_run() to build the packet's protocol tree. This logic is contained in the member function PacketListRecord::dissect().
By default, the packet list in the main window displays the columns [No., Time, Source, Destination, Protocol, Length, Info]. Of these, the number, time, and length are already in frame_data; the remaining columns are obtained by executing epan_dissect_run().
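The idea of keeping only an offset in the index and re-reading the bytes on demand can be sketched with standard C file I/O. This is an illustration under simplifying assumptions: frame_index_sketch is a hypothetical two-field stand-in for frame_data, the "records" are raw bytes with no headers, and stdio is used in place of wiretap's seek-read routine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical index entry: where the record is and how long it is. */
typedef struct {
    uint32_t cap_len;
    long     file_off;
} frame_index_sketch;

/* Append a record to the file and remember its offset. 0 on success. */
int append_record_sketch(FILE *f, const void *data, uint32_t n,
                         frame_index_sketch *fd)
{
    assert(f != NULL && data != NULL && fd != NULL);
    if (fseek(f, 0, SEEK_END) != 0)
        return -1;
    long off = ftell(f);
    if (off < 0)
        return -1;
    if (fwrite(data, 1, n, f) != n)
        return -1;
    fd->cap_len = n;
    fd->file_off = off;
    return 0;
}

/* Re-read a record using only the index entry. 0 on success. */
int reread_record_sketch(FILE *f, const frame_index_sketch *fd,
                         void *buf, size_t cap)
{
    assert(f != NULL && fd != NULL && buf != NULL);
    if (fd->cap_len > cap)
        return -1;
    if (fseek(f, fd->file_off, SEEK_SET) != 0)
        return -1;
    return fread(buf, 1, fd->cap_len, f) == fd->cap_len ? 0 : -1;
}
```

The index entry stays a few bytes in size no matter how large the packet is, which is the reason the text gives for not storing packet contents in frame_data.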
To summarize:
- Wireshark captures packets through a child process and receives notifications from the child process through a pipe.
- When notified of new packets, it reads the temporary file into which the child process writes the packets.
- Wireshark reads packets through the wiretap library, which can read capture files in various formats. By default, Wireshark uses the pcapng format.
- The wiretap library reads a packet through wtap_read() and returns its metadata (wtap_rec) and its packet data (Buffer).
- The basic data type behind the packet list in the UI is the frame_data structure. Each wtap_rec is repackaged into a frame_data, and this is where the filter takes effect: if the packet is to be filtered out, the frame_data is discarded right there; otherwise it is added to the radix tree capture_file.provider.frames and inserted into the model of the tree view control.