Audio and video development journey (60) - debugging FFmpeg and analyzing its common structures (demuxing part)


  1. Breakpoint debugging of ffplay
  2. Common structures and their relationships (demuxing part)
  3. References
  4. Takeaways

As the saying goes, to do a good job you must first sharpen your tools. Breakpoint debugging is very helpful for tracing the flow of a program and troubleshooting problems. FFmpeg can be debugged conveniently in IDEs such as Xcode, VS Code, and Qt Creator. This article uses Xcode as the example: we first walk through breakpoint debugging of ffplay, and all analysis is based on FFmpeg 4.4.

1, Breakpoint debugging of ffplay

First, download and compile FFmpeg. For details, see "Audio and video development journey (33) - cross compile FFmpeg (3.x and 4.x) used by android". The difference is that this time we are not cross-compiling; we compile, install, and debug directly on the Mac.

./configure --enable-static --disable-shared --enable-debug --disable-doc --disable-x86asm --enable-nonfree  --enable-libvpx --enable-gpl  --enable-opengl --enable-libx264  --enable-libx265 --enable-libvmaf
make -j8
sudo make install

After the build succeeds, you will see three important executables: ffmpeg_g, ffprobe_g, and ffplay_g. These are the unstripped binaries that keep their debug symbols, and they are the ones we run and debug below. For how to configure Xcode to debug the FFmpeg source code, see the references at the end of this article.

Set a breakpoint in the main function of ffplay.c and step through from there to see which structures are used along the read_thread path of ffplay.

Open the media stream

VideoState *stream_open(const char *filename, const AVInputFormat *iformat)

Structure involved: AVInputFormat

Start read_thread to begin reading

    is->read_tid     = SDL_CreateThread(read_thread, "read_thread", is);

Allocate the AVFormatContext

    AVFormatContext *ic = avformat_alloc_context();

Open the media file

int avformat_open_input(AVFormatContext **ps, const char *filename,
                        const AVInputFormat *fmt, AVDictionary **options)

Structures involved: AVFormatContext, AVInputFormat, AVDictionary

Get stream information

int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options)

Structures involved: AVStream, AVCodecParameters, AVRational

Read packets in a loop

    for (;;) {
        ret = av_read_frame(ic, pkt);  /* int av_read_frame(AVFormatContext *s, AVPacket *pkt) */
    }

Structures involved: AVFormatContext, AVPacket, etc.

That covers the demuxing (unpacking) flow for now. To study the ffplay source code, you first need to understand the main flow and the key structures it involves. The next section analyzes these structures in detail.

2, Common structures and their relationships (demuxing part)

2.1 Common structures and their relationships

FFmpeg has many structures. The most critical ones fall into the following categories:
a)        Protocol handling (http, rtsp, rtmp, mms)

AVIOContext, URLProtocol, and URLContext mainly store the type and state of the protocol used by the audio/video input. URLProtocol describes the protocol itself, and each protocol corresponds to one URLProtocol structure. (Note: FFmpeg also treats a local file as a protocol, "file".)

b)        Demuxing (flv, avi, rmvb, mp4)

AVFormatContext mainly stores the information contained in the audio/video container format. AVInputFormat describes the container format of the input. Each container format corresponds to one AVInputFormat structure.

c)        Decoding (h264, mpeg2, aac, mp3)

Each AVStream stores the data related to one video or audio stream. Each AVStream corresponds to one AVCodecContext, which stores the data related to how that stream is decoded. Each AVCodecContext corresponds to one AVCodec, which describes the decoder for that stream. Each decoder corresponds to one AVCodec structure.

d)        Data storage

For video, each structure usually stores one frame; for audio, it may store several frames.

Data before decoding: AVPacket

Data after decoding: AVFrame

Quoted from:

[Figure: the relationships among these structures. Picture from: The relationship between the most critical structures in FFMPEG]

2.2 AVFormatContext

This structure is defined in libavformat/avformat.h. It runs through the whole process and is passed as a parameter to many functions. Its main variables are as follows:

struct AVInputFormat *iformat: Container format of the input data
struct AVOutputFormat *oformat: Container format of the output data
AVIOContext *pb: I/O context that buffers the input data
unsigned int nb_streams: Number of audio and video streams
AVStream **streams: The audio and video streams
char filename[1024]: File name (deprecated in FFmpeg 4.x in favor of char *url)
int64_t duration: Duration in microseconds; divide by 1000000 to get seconds
int64_t bit_rate: Bit rate in bps; divide by 1000 to get kbps
AVDictionary *metadata: Metadata
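
The unit conversions for duration and bit_rate can be sketched as follows. This is a minimal, self-contained illustration that does not link against FFmpeg: DEMO_TIME_BASE mirrors the value of FFmpeg's AV_TIME_BASE (1000000), and the helper names are hypothetical, not part of the FFmpeg API.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the value of FFmpeg's AV_TIME_BASE; redefined here so the
 * sketch builds without the FFmpeg headers. */
#define DEMO_TIME_BASE 1000000

/* Hypothetical helper: a duration in microseconds, as stored in
 * AVFormatContext.duration, converted to seconds. */
static double demo_duration_seconds(int64_t duration_us)
{
    return (double)duration_us / DEMO_TIME_BASE;
}

/* Hypothetical helper: a bit rate in bits per second, as stored in
 * AVFormatContext.bit_rate, converted to kbps (truncating). */
static int64_t demo_bitrate_kbps(int64_t bit_rate_bps)
{
    assert(bit_rate_bps >= 0);
    return bit_rate_bps / 1000;
}
```

In real code you would pass ic->duration and ic->bit_rate to helpers like these after avformat_find_stream_info() has filled them in, and you would first check whether the duration is known at all.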

2.3 AVInputFormat

This structure is also defined in libavformat/avformat.h. It describes a demuxer. Its main variables are as follows:

const char *name: Name of the format
const char *mime_type: MIME type, such as video/avc, video/hevc, or audio/aac

It also contains a series of function pointers:
int (*read_probe)(const AVProbeData *);
int (*read_packet)(struct AVFormatContext *, AVPacket *pkt);
int (*read_close)(struct AVFormatContext *);
int (*read_seek)(struct AVFormatContext *,
                 int stream_index, int64_t timestamp, int flags);
int (*read_play)(struct AVFormatContext *);
int (*read_pause)(struct AVFormatContext *);
int (*read_seek2)(struct AVFormatContext *s, int stream_index,
                  int64_t min_ts, int64_t ts, int64_t max_ts, int flags);
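
To make the role of these function pointers more concrete, here is a rough sketch of how a table of demuxers can be probed to pick a format. Everything in it is a simplified stand-in: DemoProbeData and DemoInputFormat only imitate the shape of AVProbeData and AVInputFormat, the two probe functions check just the well-known "FLV" and "RIFF....AVI " file signatures, and DEMO_PROBE_SCORE_MAX mirrors the value of AVPROBE_SCORE_MAX (100). FFmpeg's own probing logic is considerably more involved.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for AVProbeData: the first bytes of the input. */
typedef struct DemoProbeData {
    const unsigned char *buf;
    int buf_size;
} DemoProbeData;

/* Simplified stand-in for AVInputFormat: a name plus a probe callback. */
typedef struct DemoInputFormat {
    const char *name;
    int (*read_probe)(const DemoProbeData *pd); /* score, 0 means no match */
} DemoInputFormat;

#define DEMO_PROBE_SCORE_MAX 100

/* FLV files start with the signature "FLV". */
static int demo_flv_probe(const DemoProbeData *pd)
{
    if (pd->buf_size >= 3 && memcmp(pd->buf, "FLV", 3) == 0)
        return DEMO_PROBE_SCORE_MAX;
    return 0;
}

/* AVI files start with "RIFF", a 4-byte size, then "AVI ". */
static int demo_avi_probe(const DemoProbeData *pd)
{
    if (pd->buf_size >= 12 &&
        memcmp(pd->buf, "RIFF", 4) == 0 &&
        memcmp(pd->buf + 8, "AVI ", 4) == 0)
        return DEMO_PROBE_SCORE_MAX;
    return 0;
}

static const DemoInputFormat demo_formats[] = {
    { "flv", demo_flv_probe },
    { "avi", demo_avi_probe },
};

/* Ask every registered format for a score and keep the best one.
 * Returns NULL when no format recognizes the data. */
static const DemoInputFormat *demo_probe_input_format(const DemoProbeData *pd)
{
    const DemoInputFormat *best = NULL;
    int best_score = 0;
    size_t i;

    for (i = 0; i < sizeof(demo_formats) / sizeof(demo_formats[0]); i++) {
        int score = demo_formats[i].read_probe(pd);
        if (score > best_score) {
            best_score = score;
            best = &demo_formats[i];
        }
    }
    return best;
}
```

The point of the design is that the caller never needs to know which container it is dealing with: once a format has been selected, reading and seeking go through the same set of function pointers for every container.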

2.4 AVStream

Each AVStream stores the data related to one video or audio stream. It is a stream object separated out by the demuxer, that is, the product of demuxing, and it is stored in the AVFormatContext.

The structure is also defined in libavformat/avformat.h. The main variables are as follows:

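int index: Index of the stream
int id: Stream ID
void *priv_data: Private data of the stream
AVRational time_base: Time base, used to convert PTS and DTS into real time: time in seconds = PTS * time_base
int64_t duration: Duration of the stream, in time_base units
AVRational sample_aspect_ratio: Sample (pixel) aspect ratio, not the sampling rate
AVRational avg_frame_rate: Average frame rate
AVCodecContext *codec: Points to the AVCodecContext of this video/audio stream (one-to-one). This field is deprecated in FFmpeg 4.x; the codec parameters are exposed through AVCodecParameters *codecpar instead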
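
The time_base conversion can be sketched in a few lines. This is a self-contained illustration under stated assumptions: DemoRational is a stand-in for AVRational (a num/den pair), demo_q2d follows the same idea as FFmpeg's av_q2d(), and demo_rescale is a naive, truncating version of what av_rescale_q() does with proper rounding and overflow handling. All demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for FFmpeg's AVRational: a numerator/denominator pair. */
typedef struct DemoRational {
    int num;
    int den;
} DemoRational;

/* Rational to double, the same idea as av_q2d(). */
static double demo_q2d(DemoRational q)
{
    assert(q.den != 0);
    return q.num / (double)q.den;
}

/* Real time in seconds = timestamp * time_base. */
static double demo_ts_to_seconds(int64_t ts, DemoRational time_base)
{
    assert(time_base.den != 0);
    return (double)ts * time_base.num / time_base.den;
}

/* Convert a timestamp from one time base to another.
 * Naive sketch: truncates and ignores overflow for very large values. */
static int64_t demo_rescale(int64_t ts, DemoRational from, DemoRational to)
{
    assert(from.den != 0 && to.num != 0);
    return ts * from.num * to.den / ((int64_t)from.den * to.num);
}
```

For example, with the 1/90000 time base that MPEG-TS streams commonly use, a PTS of 180000 corresponds to 2 seconds, or 2000 in a 1/1000 (millisecond) time base.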

AVStream is both the output of the demuxing stage and the input of the decoding stage. Each AVStream corresponds to one AVCodecContext, which stores the data related to how the stream is decoded. Each AVCodecContext corresponds to one AVCodec, which describes the decoder for that stream, and each decoder corresponds to one AVCodec structure. The data structures of the decoding part are analyzed in the next article.

2.5 AVPacket

AVPacket stores compressed (encoded) data, that is, the data after demuxing and before decoding, together with PTS, DTS, duration, stream_index, and other information. The structure is defined in libavcodec/packet.h. The main variables are as follows:

uint8_t *data: Compressed data. For H.264, the data of one AVPacket usually holds one compressed frame, made up of one or more NAL units
int size: Size of data in bytes
int64_t pts: Presentation timestamp
int64_t dts: Decoding timestamp
AVPacketSideData *side_data: Additional (side) information
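
The difference between pts and dts is easiest to see with a small example. When a video stream contains B-frames, packets arrive from the demuxer in decoding order (increasing dts), while their display order follows pts. The sketch below uses a hypothetical DemoPacket stand-in that keeps only three of AVPacket's fields. Sorting the packets by pts is done purely for illustration; in a real player the decoder takes care of reordering frames.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in that keeps only the timestamp and size fields of AVPacket. */
typedef struct DemoPacket {
    int64_t pts;  /* presentation timestamp */
    int64_t dts;  /* decoding timestamp */
    int size;     /* payload size in bytes */
} DemoPacket;

/* qsort comparator: ascending pts. */
static int demo_cmp_pts(const void *a, const void *b)
{
    const DemoPacket *pa = a;
    const DemoPacket *pb = b;
    return (pa->pts > pb->pts) - (pa->pts < pb->pts);
}

/* Returns 1 if any packet has pts != dts, which indicates that decoding
 * order and display order differ (typically because of B-frames). */
static int demo_has_reordering(const DemoPacket *pkts, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (pkts[i].pts != pkts[i].dts)
            return 1;
    return 0;
}

/* Rearrange packets from decoding order into display order. */
static void demo_sort_for_display(DemoPacket *pkts, size_t n)
{
    qsort(pkts, n, sizeof(*pkts), demo_cmp_pts);
}
```

With an I, P, B, B sequence whose dts values are 0, 1, 2, 3 and whose pts values are 1, 4, 2, 3, the display order becomes I, B, B, P.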

3, References

  1. Android audio and video development - Chapter 8 Xcode debugging ffmpeg source code (XV)
  2. The relationship between the most critical structures in FFMPEG
  3. FFMPEG structure analysis: AVFormatContext
  4. FFMPEG structure analysis: AVStream
  5. FFMPEG structure analysis: AVPacket

4, Takeaways

Through the study and practice in this article, we have learned:

  1. How to debug FFmpeg with breakpoints in Xcode and trace the demuxing flow of ffplay
  2. How the common structures relate to each other: the structures for protocol handling, demuxing, and decoding, and the relationships between them
  3. The main variables and functions of the key structures related to demuxing: AVFormatContext, AVInputFormat, AVStream, and AVPacket

Thank you for reading. The next article analyzes the common structures of the FFmpeg decoding part. You are welcome to follow the official account "audio and video development tour" to learn and grow together. Feel free to get in touch.

Posted on Fri, 03 Dec 2021 22:23:06 -0500 by Daegalus