Streaming media network protocol -- HLS

Introduction to HLS

HLS (HTTP Live Streaming) is an HTTP-based media streaming protocol proposed by Apple for transmitting audio and video streams, including live streams.

1. Principle

The server cuts the stream into small media segments (usually about 10 seconds each) that can be downloaded over HTTP, and provides an accompanying playlist file (the M3U8 file). The client downloads the playlist, parses it according to the specification, obtains the address of each media segment and downloads the segments in order, which produces the effect of playing one continuous stream.
Here HTTP is the network protocol, the media playlist is the M3U8 file format defined by Apple, and the media file format is TS or fMP4.

2. HLS multimedia system

The system structure is shown in the figure below:

It mainly consists of four parts:

  • Media data input source
    Captures audio and video data.
  • Media encoding and packaging server
    Encodes, packages and cuts the raw stream into a continuous series of short media files.
  • Distributor
    A standard web server that receives client requests and distributes resources (M3U8 files and media files).
  • Client
    Plays the stream.

3. Advantages and disadvantages of HLS:

  • Advantages:
    1. Data is transmitted over HTTP, so it is rarely blocked by firewalls and is easy to distribute.
    2. Bit rate adaptation improves playback smoothness.
    3. The protocol is stateless, which makes load balancing easy.
    4. Good browser support; no plug-ins need to be installed.
  • Disadvantages:
    Poor real-time performance and high latency, for two reasons:
    1> Media is delivered as segment files. A segment can only be published after it has been fully encoded, so a delay of at least one segment is unavoidable.
    2> Data is requested over short HTTP connections, so the client keeps reconnecting to the server. Because HTTP runs over TCP, every new connection costs a three-way handshake and a four-way teardown, which makes the interaction slow.

Media description file

The M3U8 file is the HLS media description file, encoded in UTF-8.
It can be either a master playlist or a media playlist.
The following figure shows the relationship between the playlists and the media files, which form a two-level list.

1. Master playlist

It lists the media playlist URLs, and related information, of several streams encoded at different bit rates.
For example:

#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1280000,CODECS="avc1.640028,mp4a.40.2"
http://example.com/low.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2560000,CODECS="avc1.640028,mp4a.40.2"
http://example.com/mid.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=7680000,CODECS="avc1.640028,mp4a.40.2"
http://example.com/hi.m3u8

Tag and format: meaning
#EXTM3U: Indicates that the file is an extended M3U file. It must be on the first line.
#EXT-X-STREAM-INF: Describes one variant stream; the URI on the next line points to the media playlist that carries it.
It is generally used for nested M3U8 files and applies only to the URI immediately following it.
Its attribute list has the following parameters:
·BANDWIDTH: bandwidth in bits per second, required. It is the key parameter for bit rate adaptation.
·PROGRAM-ID: decimal integer that uniquely identifies a particular presentation within the scope of the playlist file.
·CODECS: the video and audio codecs, optional.
·RESOLUTION: the video resolution, optional.
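
To make the role of BANDWIDTH concrete, here is a minimal sketch in Python of how a client might read the master playlist above and choose a variant. The helper names (`parse_attributes`, `parse_master`, `pick_variant`) and the selection rule (highest BANDWIDTH that does not exceed the measured throughput) are assumptions made for illustration, not something the HLS specification prescribes.

```python
import re

MASTER = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1280000,CODECS="avc1.640028,mp4a.40.2"
http://example.com/low.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2560000,CODECS="avc1.640028,mp4a.40.2"
http://example.com/mid.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=7680000,CODECS="avc1.640028,mp4a.40.2"
http://example.com/hi.m3u8
"""

# A value is either a quoted string (which may contain commas) or a bare token.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attributes(text):
    """Turn 'KEY=value,KEY="quoted,value"' into a dict (hypothetical helper)."""
    return {key: value.strip('"') for key, value in ATTR_RE.findall(text)}


def parse_master(text):
    """Return one dict per variant: its URI, BANDWIDTH and raw attributes."""
    variants = []
    pending = None  # attributes of the last EXT-X-STREAM-INF still waiting for its URI
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            pending = parse_attributes(line.split(":", 1)[1])
        elif line and not line.startswith("#") and pending is not None:
            # The tag is only valid for the URI immediately following it.
            variants.append({"uri": line,
                             "bandwidth": int(pending["BANDWIDTH"]),
                             "attrs": pending})
            pending = None
    return variants


def pick_variant(variants, measured_bps):
    """Assumed rule: highest BANDWIDTH not above the measured rate, else the lowest."""
    ordered = sorted(variants, key=lambda v: v["bandwidth"])
    fitting = [v for v in ordered if v["bandwidth"] <= measured_bps]
    return fitting[-1] if fitting else ordered[0]
```

With a measured throughput of 3,000,000 bit/s, `pick_variant(parse_master(MASTER), 3_000_000)["uri"]` returns `http://example.com/mid.m3u8`.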

2. Media playlist

It lists the segment URLs, and related information, of one stream at a single bit rate.

  • On-demand media playlist (the text after // is explanatory annotation, not part of the file):
    #EXTM3U
    #EXT-X-PLAYLIST-TYPE:VOD          // playlist type
    #EXT-X-VERSION:3                  // protocol version
    #EXT-X-TARGETDURATION:11          // maximum segment duration, in seconds
    #EXT-X-MEDIA-SEQUENCE:0           // sequence number of the first segment in the file
    #EXTINF:10.922578,                // actual duration of the first segment
    test000.ts                        // the first segment file, sequence number 0
    #EXTINF:9.929578,                 // actual duration of the second segment
    test001.ts                        // the second segment file, sequence number 1
    ...
    #EXT-X-ENDLIST                    // end-of-list marker
    
  • For an encrypted stream, an EXT-X-KEY tag is added to give the encryption method, the request address of the key and the IV (initialization vector).
    Note that the IV must be written in hexadecimal (char --> hex).
    Its media playlist looks like this:
    #EXTM3U
    #EXT-X-VERSION:3
    #EXT-X-TARGETDURATION:10
    #EXT-X-MEDIA-SEQUENCE:12616381
    #EXT-X-KEY:METHOD=AES-128,URI="3M_key",IV=0x4F1E7B58678D094361DF3F0FFDEDD333
    #EXTINF:10,
    720p_aes_0.ts
    #EXT-X-KEY:METHOD=AES-128,URI="3M_key",IV=0x4F1E7B58678D094361DF3F0FFDEDD333
    #EXTINF:10,
    720p_aes_1.ts
    #EXT-X-ENDLIST
    
  • For a live stream, the media playlist has no EXT-X-ENDLIST tag at the end.
    In the example below, EXT-X-PLAYLIST-TYPE is EVENT and the list does not end with an EXT-X-ENDLIST tag:
    #EXTM3U
    #EXT-X-PLAYLIST-TYPE:EVENT
    #EXT-X-VERSION:3
    #EXT-X-TARGETDURATION:11
    #EXT-X-MEDIA-SEQUENCE:0
    #EXTINF:10.922578,
    test000.ts
    #EXTINF:9.929578,
    test001.ts               
    ...
    
  • The meaning of each tag:
    Tag and format: meaning
    #EXTM3U: Marks the file as an extended M3U file. It must be on the first line.
    #EXT-X-PLAYLIST-TYPE: Playlist type. There are two values: VOD --> on demand; EVENT --> live broadcast.
    #EXT-X-VERSION:n: Version number of the protocol. n = 3 means HLS v3; n = 4 means HLS v4.
    #EXT-X-TARGETDURATION: The maximum duration of any segment in the list.
    #EXT-X-MEDIA-SEQUENCE: The sequence number of the first segment in the file. If absent, the default value is 0.
    #EXTINF:duration,[title]: The actual duration of a segment (a float) and an optional title. The segment file name is on the following line; the segment URL is obtained by combining that file name with the playlist URL.
    #EXT-X-KEY: Decryption information, with three main parameters:
    • METHOD: encryption algorithm. The HLS protocol defines three values: NONE, AES-128 (CBC) and SAMPLE-AES
    • URI: path of the decryption key, possibly relative
    • IV: initialization vector used for decryption
    #EXT-X-ENDLIST: End-of-file marker for the M3U8 file. It is absent in the live broadcast scenario.
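
The tags described above can be read with a few lines of Python. This is a minimal sketch that only handles the tags discussed in this section; the function names (`parse_key`, `parse_media_playlist`) and the shape of the returned dictionaries are assumptions chosen for illustration. The sample input is the encrypted playlist shown earlier.

```python
import re

PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:12616381
#EXT-X-KEY:METHOD=AES-128,URI="3M_key",IV=0x4F1E7B58678D094361DF3F0FFDEDD333
#EXTINF:10,
720p_aes_0.ts
#EXT-X-KEY:METHOD=AES-128,URI="3M_key",IV=0x4F1E7B58678D094361DF3F0FFDEDD333
#EXTINF:10,
720p_aes_1.ts
#EXT-X-ENDLIST
"""

# A value is either a quoted string or a bare token without commas.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_key(attr_text):
    """Parse the attribute list of EXT-X-KEY; the IV hex string becomes raw bytes."""
    attrs = {k: v.strip('"') for k, v in ATTR_RE.findall(attr_text)}
    iv_hex = attrs.get("IV")
    iv = bytes.fromhex(iv_hex[2:]) if iv_hex else None  # drop the leading "0x"
    return {"method": attrs.get("METHOD"), "uri": attrs.get("URI"), "iv": iv}


def parse_media_playlist(text):
    """Collect the header values and the ordered segment list of a media playlist."""
    info = {"target_duration": None, "media_sequence": 0,
            "playlist_type": None, "ended": False, "segments": []}
    duration = None  # duration from the most recent EXTINF
    key = None       # EXT-X-KEY currently in effect
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-TARGETDURATION:"):
            info["target_duration"] = int(line.split(":", 1)[1])
        elif line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            info["media_sequence"] = int(line.split(":", 1)[1])
        elif line.startswith("#EXT-X-PLAYLIST-TYPE:"):
            info["playlist_type"] = line.split(":", 1)[1].strip()
        elif line.startswith("#EXT-X-KEY:"):
            key = parse_key(line.split(":", 1)[1])
        elif line.startswith("#EXTINF:"):
            duration = float(line.split(":", 1)[1].split(",", 1)[0])
        elif line == "#EXT-X-ENDLIST":
            info["ended"] = True
        elif line and not line.startswith("#") and duration is not None:
            # Sequence numbers count up from EXT-X-MEDIA-SEQUENCE.
            seq = info["media_sequence"] + len(info["segments"])
            info["segments"].append({"seq": seq, "duration": duration,
                                     "uri": line, "key": key})
            duration = None
    return info
```

For the sample playlist, the two segments get sequence numbers 12616381 and 12616382, and the IV is decoded into 16 bytes, which matches the AES block size.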

3. URL generation rules in playlist:

There are several different URL generation rules in the playlist.

  1. Absolute URL
    The playlist gives the full URL, which can be requested as-is. For example:

    #EXTM3U
    #EXT-X-TARGETDURATION:10
    #EXT-X-VERSION:3
    #EXTINF:9.009,
    http://media.example.com/first.ts
    #EXTINF:9.009,
    http://media.example.com/second.ts
    #EXTINF:3.003,
    http://media.example.com/third.ts
    #EXT-X-ENDLIST
    

    Requesting http://media.example.com/first.ts directly returns the segment.

  2. Relative path with only a file name
    Only the file name is given, which means the resource file is in the same directory as the m3u8 file and the URL has to be assembled. For example:

    #EXTM3U
    #EXT-X-TARGETDURATION:10
    #EXT-X-VERSION:3
    #EXTINF:9.009,
    first.ts
    #EXTINF:9.009,
    second.ts
    #EXTINF:3.003,
    third.ts
    #EXT-X-ENDLIST
    

    Assembly rule: remove the m3u8 file name from the URL used to request the m3u8 file, then append the segment file name. The result is the correct URL.

    If the m3u8 URL is: http://media.example.com/index.m3u8
    then the prefix of each segment URL is: http://media.example.com/
    and the URL of the first segment is: http://media.example.com/first.ts
    
  3. Relative path with directories
    The path of the resource file is given, which means the resource file and the m3u8 file are not in the same directory and the URL has to be assembled. For example:

    #EXTM3U
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=800000,RESOLUTION=1080x608
    1000k/hls/index.m3u8
    Or:
    #EXTM3U
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=800000,RESOLUTION=1080x608
    /1000k/hls/index.m3u8
    

    The assembly rule is the same as in 2.

    If the m3u8 URL is: http://media.example.com/index.m3u8
    then the prefix of each media playlist URL is: http://media.example.com/
    and the media playlist URL is: http://media.example.com/1000k/hls/index.m3u8
    

    There is another case:

    #EXTM3U
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=800000,RESOLUTION=1080x608
    /15467_73a719b2/1000k/hls/index.m3u8
    

    The URL used to request the m3u8 is:

    http://media.example.com/123/15467_73a719b2/index.m3u8

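    The file path and the request URL overlap, so the duplicated part has to be removed.
    Applying the rule from 2 gives:

    http://media.example.com/123/15467_73a719b2/15467_73a719b2/1000k/hls/index.m3u8

    This is clearly wrong. After removing the duplicated directory it becomes:

    http://media.example.com/123/15467_73a719b2/1000k/hls/index.m3u8

    Note that this de-duplication is a workaround for sources laid out this way. Under standard URL resolution (RFC 3986), a path that begins with a single / is resolved from the root of the host, which would give http://media.example.com/15467_73a719b2/1000k/hls/index.m3u8.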

  4. Protocol-relative path starting with a double slash:
    The double slash is normally followed directly by the domain name, such as:

    #EXTM3U
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=800000,RESOLUTION=1080x608
    //douban.donghongzuida.com/20210109/15467_73a719b2/1000k/hls/index.m3u8
    

    Assembly rule: prepend the protocol that was used to request this m3u8 list.

    If the m3u8 is http://media.example.com/index.m3u8,
    then the result is: http://douban.donghongzuida.com/20210109/15467_73a719b2/1000k/hls/index.m3u8
    If the m3u8 is https://media.example.com/index.m3u8,
    then the result is: https://douban.donghongzuida.com/20210109/15467_73a719b2/1000k/hls/index.m3u8
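
The four cases above can be sketched in Python. Cases 1, 2 and 4, and case 3 without a leading slash, are covered by the standard library's `urllib.parse.urljoin`, which implements standard relative URL resolution. The de-duplication case is not standard behaviour, so `resolve_with_dedup` below is a hypothetical heuristic written for this article's example, under the assumption that the source repeats trailing directories of the playlist URL at the start of the path.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def resolve(playlist_url, ref):
    """Standard resolution of a playlist entry against the playlist URL."""
    return urljoin(playlist_url, ref)


def resolve_with_dedup(playlist_url, ref):
    """Hypothetical heuristic: merge overlapping directories instead of
    resolving a leading-slash path from the host root."""
    if not ref.startswith("/") or ref.startswith("//"):
        return urljoin(playlist_url, ref)
    base = urlsplit(playlist_url)
    # Directories of the playlist URL, without the playlist file name.
    base_dirs = [p for p in base.path.split("/")[:-1] if p]
    ref_parts = [p for p in ref.split("/") if p]
    # Look for the longest tail of base_dirs that equals a head of ref_parts.
    for k in range(min(len(base_dirs), len(ref_parts)), 0, -1):
        if base_dirs[-k:] == ref_parts[:k]:
            path = "/" + "/".join(base_dirs + ref_parts[k:])
            return urlunsplit((base.scheme, base.netloc, path, "", ""))
    # No overlap found: fall back to standard resolution.
    return urljoin(playlist_url, ref)


if __name__ == "__main__":
    base = "http://media.example.com/index.m3u8"
    print(resolve(base, "http://media.example.com/first.ts"))  # case 1
    print(resolve(base, "first.ts"))                           # case 2
    print(resolve(base, "1000k/hls/index.m3u8"))               # case 3
    print(resolve("https://media.example.com/index.m3u8",      # case 4
                  "//douban.donghongzuida.com/20210109/15467_73a719b2/1000k/hls/index.m3u8"))
    print(resolve_with_dedup(
        "http://media.example.com/123/15467_73a719b2/index.m3u8",
        "/15467_73a719b2/1000k/hls/index.m3u8"))
```

A player would normally try the standard result first and only fall back to a heuristic like this when the standard URL fails, since the heuristic can guess wrong for sources that follow the standard.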
    

4. Expansion

  1. Higher versions of HLS support the ISO base media file format, that is, fragmented MP4 (fMP4). HLS adds the EXT-X-MAP tag for this purpose.
    For example:

    #EXTM3U
    #EXT-X-TARGETDURATION:15
    #EXT-X-ALLOW-CACHE:YES
    #EXT-X-PLAYLIST-TYPE:VOD
    #EXT-X-VERSION:6
    #EXT-X-MEDIA-SEQUENCE:1
    #EXT-X-MAP:URI="init-v1-a1.mp4"
    #EXTINF:13.000,
    seg-1-v1-a1.m4s
    #EXTINF:12.000,
    seg-2-v1-a1.m4s
    #EXT-X-ENDLIST
    

    In fragmented MP4 the moov and the mdat are separated. In HLS, the segment carrying the moov is called the init segment, and a segment carrying mdat is called a media segment.
    To distinguish the two, the EXT-X-MAP tag gives the URL of the init segment. A media segment may contain both audio and video frames.
    EXT-X-MAP can also be used with TS. In that case the init segment holds the PAT and PMT packets, which improves the payload efficiency of the TS segments.

  2. Implementing low-latency HLS
    1> The LHLS scheme proposed by third-party companies
    It is based on HTTP/1.1 chunked transfer encoding.
    Reference link: https://www.theoplayer.com/low-latency-hls-streaming

    2> The Low-Latency HLS scheme officially proposed by Apple
    The biggest improvement is that the latest segment is split into smaller parts. The requested data granularity is smaller, so both the time to produce an accessible part and the delay are reduced.
    The #EXT-X-PART tag was added for this purpose.
    For example:

    #EXTINF:6.003,
    LLHLS_Video1_67750710.mp4
    #EXT-X-PROGRAM-DATE-TIME:2021-03-18T09:20:29.482Z
    #EXT-X-PART:DURATION=1.000,URI="LLHLS_Video1_67750711.0.mp4",INDEPENDENT=YES
    #EXT-X-PART:DURATION=1.000,URI="LLHLS_Video1_67750711.1.mp4",INDEPENDENT=YES
    #EXT-X-PART:DURATION=1.000,URI="LLHLS_Video1_67750711.2.mp4",INDEPENDENT=YES
    #EXT-X-PART:DURATION=1.000,URI="LLHLS_Video1_67750711.3.mp4",INDEPENDENT=YES
    #EXT-X-PART:DURATION=1.000,URI="LLHLS_Video1_67750711.4.mp4",INDEPENDENT=YES
    #EXT-X-PART:DURATION=1.000,URI="LLHLS_Video1_67750711.5.mp4",INDEPENDENT=YES
    

    As shown, the segment after LLHLS_Video1_67750710.mp4 (number 67750711) is published as smaller parts, so the client does not have to wait for the whole segment to be generated before accessing it.

  3. Building an HLS server
    Use nginx or Apache2 to set up the HTTP server.

  4. How to insert advertisements during HLS playback?
    The same advertising video can be inserted in front of any segment of any HLS source, so there is no guarantee that its coding format, bit rate and other properties match those of the HLS source, and a mismatch would disturb normal playback in the client player. The player therefore has to be told at which point the advertisement is inserted into the HLS source, together with the advertisement's metadata. HLS defines the #EXT-X-DISCONTINUITY tag for this: it indicates that the segments before and after it are not continuous, and the player has to handle the change. When the player detects this tag it needs to reinitialize the decoder, because the encoding of the segments may have changed.
    Example:
    An ordinary M3U8 file:

    #EXTM3U
    #EXT-X-VERSION:3
    #EXT-X-TARGETDURATION:11
    #EXT-X-MEDIA-SEQUENCE:0
    #EXTINF:10.922578,
    test000.ts
    #EXTINF:9.929578,
    test001.ts
    ...
    

    If you want to insert an ad before the beginning of the original video:

    #EXTM3U
    #EXT-X-VERSION:3
    #EXT-X-TARGETDURATION:11
    #EXT-X-MEDIA-SEQUENCE:0
    #EXTINF:10.0,
    ad0.ts
    #EXTINF:8.0,
    ad1.ts
    #EXT-X-DISCONTINUITY
    #EXTINF:10.922578,
    test000.ts
    #EXTINF:9.929578,
    test001.ts
    ...
    

    If you want to insert an advertisement in the middle of the original video:

    #EXTM3U
    #EXT-X-VERSION:3
    #EXT-X-TARGETDURATION:11
    #EXT-X-MEDIA-SEQUENCE:0
    #EXTINF:10.922578,
    test000.ts
    #EXT-X-DISCONTINUITY
    #EXTINF:10.0,
    ad0.ts
    #EXTINF:8.0,
    ad1.ts
    #EXT-X-DISCONTINUITY
    #EXTINF:9.929578,
    test001.ts
    ...
    
  5. Bit rate switching strategy:
    Consider both the network bandwidth and the buffer level.
    When the current bandwidth is lower than the current playback bit rate and the buffer is running low (less than one segment), switch down to the bit rate that matches the bandwidth.
    When the current bandwidth is higher than the current playback bit rate and the buffer is sufficient (more than one segment), switch up to the bit rate that matches the bandwidth.
    How can the number of segments in the current buffer be calculated?
    Estimate the buffered duration from the amount of data in the buffer and the average bit rate of the last downloaded segment, then compare it with the target duration of one segment.
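
The switching rule above can be written down as a small Python sketch. The function names and parameters are hypothetical, and "the bit rate that matches the bandwidth" is interpreted here as the highest rung of the bit rate ladder that does not exceed the measured bandwidth; that interpretation is an assumption, not something the text above fixes.

```python
def buffered_seconds(buffered_bytes, last_segment_bytes, last_segment_duration):
    """Estimate how many seconds of media are buffered, using the average
    bit rate of the last downloaded segment."""
    avg_bps = last_segment_bytes * 8 / last_segment_duration
    return buffered_bytes * 8 / avg_bps


def next_bitrate(ladder, current_bps, bandwidth_bps, buffer_s, target_duration_s):
    """Return the bit rate to use for the next segment request.

    ladder            -- available BANDWIDTH values from the master playlist
    current_bps       -- bit rate currently being played
    bandwidth_bps     -- measured network bandwidth
    buffer_s          -- buffered media, in seconds
    target_duration_s -- duration of one segment (EXT-X-TARGETDURATION)
    """
    ladder = sorted(ladder)
    fitting = [b for b in ladder if b <= bandwidth_bps]
    if bandwidth_bps < current_bps and buffer_s < target_duration_s:
        # Bandwidth too low and less than one segment buffered: switch down.
        return fitting[-1] if fitting else ladder[0]
    if bandwidth_bps > current_bps and buffer_s > target_duration_s:
        # Bandwidth to spare and more than one segment buffered: switch up.
        return fitting[-1] if fitting else current_bps
    # Otherwise keep the current bit rate.
    return current_bps
```

For example, with the ladder `[1280000, 2560000, 7680000]`, a current rate of 2560000 and a 10-second target duration, `next_bitrate(ladder, 2560000, 1500000, 4.0, 10)` returns 1280000, while the same bandwidth with 25 seconds buffered keeps 2560000. Requiring both conditions before switching avoids oscillating on a short bandwidth dip while the buffer is still healthy.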

Problems encountered in practical application

  1. What is the interval between playlist requests during a live broadcast?
    A: Generally, the playlist is refreshed once per segment duration, which reduces the load on the server.

  2. During a live broadcast, which segment should be played first?
    A: The HLS protocol says that during a live broadcast the client should not start from a segment that begins less than three target durations before the end of the playlist.
    First_req_segment_sequence <= last_segment_in_list_sequence - 3.
    To balance real-time performance and smoothness, the client should generally start playing from the third-to-last or second-to-last segment in the m3u8 file.

  3. How to tell whether a source is live or on demand?
    A: There are two ways.
    1> Check whether the media playlist has an #EXT-X-ENDLIST tag: if yes, it is on demand; if no, it is live.
    2> Check #EXT-X-PLAYLIST-TYPE:

    #EXT-X-PLAYLIST-TYPE:VOD      // on-demand source
    #EXT-X-PLAYLIST-TYPE:EVENT    // live broadcast source

  4. How to tell whether an M3U8 file is a master playlist or a media playlist?
    A: Check whether EXT-X-STREAM-INF appears in the list. If it does, the file is a master playlist; if not, it is a media playlist.

  5. During a live broadcast, how to tell whether the newly requested m3u8 file contains new segment URLs compared with the previously requested one?
    A: An efficient method is to compare the MD5 digest of the previously requested m3u8 file with that of the current one. If they are the same there is no update and parsing can be skipped; if they differ, the playlist has been updated.

  6. What does the downloader do if the m3u8 list cannot be fetched?
    A: It retries twice, waiting 10 seconds before each retry. If the request still fails, it returns an error to the player.

  7. Why is the real-time performance of HLS poor?
    A: Two reasons:
    1> Media is delivered as segments with coarse granularity. A segment can only be requested after it has been encoded, so a delay of at least one segment is unavoidable.
    2> Data is requested over short HTTP connections, so the client has to keep sending requests to the server. Each request costs at least one RTT, which makes the interaction slow.
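
Several of the answers above (questions 2 to 5) translate directly into small checks. The following Python sketch shows one possible form; all names (`playlist_kind`, `live_start_sequence`, `PlaylistWatcher`) are hypothetical, and the back-off of three segments simply follows the inequality given in answer 2.

```python
import hashlib


def playlist_kind(text):
    """Classify playlist text as 'master', 'on-demand' or 'live'
    using the tag checks from answers 3 and 4."""
    lines = [line.strip() for line in text.splitlines()]
    if any(line.startswith("#EXT-X-STREAM-INF") for line in lines):
        return "master"
    return "on-demand" if "#EXT-X-ENDLIST" in lines else "live"


def live_start_sequence(first_seq, last_seq, back_off=3):
    """Sequence number to start a live stream from: back_off segments
    before the last one, but never before the first listed segment."""
    return max(first_seq, last_seq - back_off)


class PlaylistWatcher:
    """Remembers the MD5 digest of the last playlist body (answer 5)."""

    def __init__(self):
        self._last_digest = None

    def changed(self, body):
        """Return True when body (bytes) differs from the previous one."""
        digest = hashlib.md5(body).hexdigest()
        if digest == self._last_digest:
            return False  # identical playlist: parsing can be skipped
        self._last_digest = digest
        return True
```

A live download loop would call `changed()` on every refreshed playlist body and only re-parse when it returns True. The digest is used purely as a cheap change detector here, not for security.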

Tags: Network Protocol hls

Posted on Sun, 05 Dec 2021 20:44:14 -0500 by mj_23