Android audio and video development -- screen recording and live broadcasting technology

Brief introduction

When we watch a mobile game live stream, what we see as viewers is the content of the player's screen. How is this done? In this post we write a small live streaming demo that achieves an effect similar to mobile game streaming.

Obtaining the screen data is easy, because Android provides a system service for it. The difficulty lies in transmitting the data to the live streaming server. We use RtmpDump to send the RTMP data. Because RtmpDump is written in C, Java alone is not enough and we also need NDK development. If you don't mind the extra work, you can instead compile FFmpeg and use it for RTMP streaming; Bilibili's open source ijkplayer player is also built on FFmpeg.

Result

In the end, we push the stream to a Bilibili live room, where the phone's screen can be watched in real time.

Basic process

  • Obtain the screen recording data
  • Encode the data as H.264
  • Package the encoded data into RTMP packets
  • Upload the packets to the live streaming server's push address

Obtain screen recording data

We obtain the MediaProjectionManager system service, launch its screen capture Intent, and receive a MediaProjection in the result callback. The raw screen data comes from the virtual display created on that MediaProjection.

    private void initLive() {
        mediaProjectionManager = (MediaProjectionManager) getSystemService(Context.MEDIA_PROJECTION_SERVICE);
        Intent screenRecordIntent = mediaProjectionManager.createScreenCaptureIntent();
        // Ask the user for permission to capture; 100 is the request code checked below
        startActivityForResult(screenRecordIntent, 100);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode == 100 && resultCode == Activity.RESULT_OK) {
            // MediaProjection -> source of the screen recording data
            mediaProjection = mediaProjectionManager.getMediaProjection(resultCode, data);
            // Passing mediaProjection on to the encoder is not shown in the original
        }
    }

Encode the data as H.264

We need to encode the raw frames obtained through MediaProjection as H.264. Here we use Android's native MediaCodec for hardware encoding.

    public void start(MediaProjection mediaProjection){
        this.mediaProjection = mediaProjection;
        // Configure MediaCodec for H.264 (AVC) at the capture size
        MediaFormat mediaFormat = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, width, height);
        // Color format: input frames arrive through a Surface
        mediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
        mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, 400_000);
        mediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 15);
        // Request a keyframe every 2 s
        mediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 2);

        // Create the encoder
        try {
            mediaCodec = MediaCodec.createEncoderByType("video/avc");
            // configure() must be called before createInputSurface()
            mediaCodec.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
            Surface surface = mediaCodec.createInputSurface();
            // Mirror the screen onto the encoder's input Surface
            mediaProjection.createVirtualDisplay(TAG, width, height, 1,
                    DisplayManager.VIRTUAL_DISPLAY_FLAG_PUBLIC, surface, null, null);
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Launching the thread that drains the encoder is not shown in the original
    }

    public void run() {
        isLiving = true;
        // Start the configured encoder (skip this if it was already started elsewhere)
        mediaCodec.start();
        MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();

        while (isLiving){
            // If more than 2 s have passed, ask the encoder for a new I-frame
            if (System.currentTimeMillis() - timeStamp >= 2000){
                // The request is passed to the codec in a Bundle
                Bundle msgBundle = new Bundle();
                msgBundle.putInt(MediaCodec.PARAMETER_KEY_REQUEST_SYNC_FRAME, 0);
                mediaCodec.setParameters(msgBundle);
                timeStamp = System.currentTimeMillis();
            }
            // Standard MediaCodec output handling. No input buffers are queued here,
            // because the input Surface feeds the encoder internally
            int outputBufferIndex = mediaCodec.dequeueOutputBuffer(bufferInfo,100_000);
            if (outputBufferIndex >=0){
                // An encoded buffer is ready: copy it out
                ByteBuffer byteBuffer = mediaCodec.getOutputBuffer(outputBufferIndex);
                byte[] outData = new byte[bufferInfo.size];
                byteBuffer.get(outData);
                // Hand outData to the RTMP sender here (the call is not shown in the original)
                mediaCodec.releaseOutputBuffer(outputBufferIndex, false);
            }
        }
    }
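
As a rough sanity check on the encoder settings used above (400 kbit/s, 15 fps, a keyframe every 2 s), the following sketch works out how many frames fall between keyframes and the average size budget per frame. It is written in C++ to match the native half of the demo. The function names are illustrative and are not part of MediaCodec.

```cpp
#include <cassert>

// Number of frames between two keyframes for a given frame rate (fps)
// and I-frame interval (seconds).
inline int framesPerGop(int frameRate, int iFrameIntervalSec) {
    assert(frameRate > 0 && iFrameIntervalSec > 0);
    return frameRate * iFrameIntervalSec;
}

// Average number of encoded bytes available per frame at a target
// bit rate given in bits per second (integer division, rounded down).
inline int avgBytesPerFrame(int bitRate, int frameRate) {
    assert(bitRate > 0 && frameRate > 0);
    return bitRate / 8 / frameRate;
}
```

With the values from this post, framesPerGop(15, 2) is 30 and avgBytesPerFrame(400000, 15) is 3333, so each frame gets a little over 3 KB on average.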

RTMP packaging

After the two steps above we have the encoded H.264 data. Packaging it for RTMP is the painful part (most of us have half forgotten our NDK knowledge).

First, we import the RtmpDump source code into the project's cpp directory. RtmpDump is the third-party library we use to connect to the live streaming server and push RTMP data. Its code base is small enough that we can copy the source directly into the Android project. FFmpeg cannot be used this way; its .so libraries have to be compiled in advance. Keep in mind that what we have at this point is still a raw H.264 stream, which cannot be sent as-is and has to be wrapped in RTMP packets.

For now we don't need a deep understanding of RTMP, since it is easy to get lost in the details. To carry H.264 over RTMP, the protocol already specifies how the SPS, PPS and key frames are laid out; we only need to fill in the RTMP data accordingly in NDK code.

Use of RtmpDump

  • Connect to the server
  1. RTMP_Init(RTMP *r): initialize the RTMP object
  2. RTMP_EnableWrite(RTMP *r): enable writing, that is, publishing data
  3. RTMP_Connect(RTMP *r, RTMPPacket *cp): connect to the server
  4. RTMP_ConnectStream(RTMP *r, int seekTime): connect the stream
  • Send data
  1. RTMPPacket_Alloc(RTMPPacket *p, int nSize): allocate the packet body
  2. RTMP_SendPacket(RTMP *r, RTMPPacket *packet, int queue): send the packet
  3. RTMPPacket_Free(RTMPPacket *p): free the packet body

Connect to the live broadcast server

This step requires the live streaming (push) address, which must be prepared in advance. With it in hand, we implement the native method:

    extern "C" JNIEXPORT jboolean JNICALL
    Java_com_bailun_kai_rtmplivedemo_RtmpPack2Remote_connectLiveServer(JNIEnv *env, jobject thiz,
                                                                       jstring url) {
        // Convert the Java string to a C string before using it
        const char *live_url = env->GetStringUTFChars(url, 0);
        int result = 0;

        do {
            // Allocate the LivePack structure
            livePack = (LivePack *) (malloc(sizeof(LivePack)));
            if (!livePack) break;
            // Clear dirty data in the new memory
            memset(livePack, 0, sizeof(LivePack));
            // Allocate and initialize the RTMP object
            livePack->rtmp = RTMP_Alloc();
            RTMP_Init(livePack->rtmp);
            // Set RTMP parameters such as timeout and URL
            livePack->rtmp->Link.timeout = 10;
            LOGI("connect %s", live_url);

            if (!(result = RTMP_SetupURL(livePack->rtmp, (char *) live_url))) break;
            // Enable RTMP write (publish) mode
            RTMP_EnableWrite(livePack->rtmp);
            LOGI("RTMP_Connect");
            if (!(result = RTMP_Connect(livePack->rtmp, 0))) break;
            LOGI("RTMP_ConnectStream");
            if (!(result = RTMP_ConnectStream(livePack->rtmp, 0))) break;
            LOGI("connect success");
        } while (0);

        // On failure, release what was allocated
        if (!result && livePack) {
            if (livePack->rtmp) RTMP_Free(livePack->rtmp);
            free(livePack);
            livePack = nullptr;
        }
        env->ReleaseStringUTFChars(url, live_url);
        return result;
    }

Send data to live broadcast server

Interestingly, RTMP does not transmit the H.264 delimiters (the 0 0 1 start code): each NAL unit is sent without its start code and is prefixed with its length instead. The first RTMP video packet of a pushed stream must carry the SPS and PPS.
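
To make the start code handling concrete, here is a minimal sketch that splits an encoder output buffer into NAL units and reports the type of each one. It assumes the buffer uses 3-byte (0 0 1) or 4-byte (0 0 0 1) start codes; NalUnit and splitAnnexB are illustrative names, not part of RtmpDump or the demo project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One NAL unit located inside a start-code-delimited buffer.
struct NalUnit {
    size_t offset;  // index of the NAL header byte (start code excluded)
    size_t size;    // length of the NAL unit in bytes, header included
    int type;       // nal_unit_type: 7 = SPS, 8 = PPS, 5 = IDR slice
};

// Returns the length of the start code beginning at index i (3 or 4),
// or 0 if no start code begins there.
static size_t startCodeAt(const uint8_t *d, size_t len, size_t i) {
    if (i + 3 <= len && d[i] == 0 && d[i + 1] == 0 && d[i + 2] == 1) return 3;
    if (i + 4 <= len && d[i] == 0 && d[i + 1] == 0 && d[i + 2] == 0 && d[i + 3] == 1) return 4;
    return 0;
}

// Splits the buffer into NAL units, dropping the start codes.
std::vector<NalUnit> splitAnnexB(const uint8_t *d, size_t len) {
    assert(d != nullptr || len == 0);
    std::vector<NalUnit> out;
    size_t i = 0;
    while (i < len) {
        size_t sc = startCodeAt(d, len, i);
        if (sc == 0) { ++i; continue; }
        size_t begin = i + sc;
        size_t end = begin;
        // Advance to the next start code or to the end of the buffer
        while (end < len && startCodeAt(d, len, end) == 0) ++end;
        if (end > begin) {
            NalUnit nal = {begin, end - begin, d[begin] & 0x1F};
            out.push_back(nal);
        }
        i = end;
    }
    return out;
}
```

The type is taken from the low five bits of the NAL header byte, which is why 0x67 is an SPS (type 7), 0x68 a PPS (type 8) and 0x65 an IDR slice (type 5).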

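    // Send RTMP data to the server
    // (return type assumed to be jint; the original listing omits it)
    extern "C" JNIEXPORT jint JNICALL
    Java_com_bailun_kai_rtmplivedemo_RtmpPack2Remote_sendData2Server(JNIEnv *env, jobject thiz,
                                                                     jbyteArray buffer, jint length,
                                                                     jlong tms) {
        int result;
        // Get a native view (possibly a copy) of the Java byte array
        jbyte *bufferArray = env->GetByteArrayElements(buffer, 0);
        result = sendDataInner(bufferArray, length, tms);
        // Release the native buffer without copying anything back
        env->ReleaseByteArrayElements(buffer, bufferArray, JNI_ABORT);
        return result;
    }

    int sendDataInner(jbyte *array, jint length, jlong tms) {
        int result = 0;
        // SPS/PPS buffer: NAL header 0x67 right after the 4-byte start code
        if (array[4] == 0x67) {
            // Read SPS and PPS and cache them in the LivePack structure
            readSpsPps(array, length, livePack);
            return result;
        }

        // I-frame (0x65): send the cached SPS/PPS packet first
        if (array[4] == 0x65) {
            RTMPPacket *spsPpsPacket = createRtmpSteramPack(livePack);
            sendPack(spsPpsPacket);
        }

        // I-frames and all other frames are sent as ordinary video packets
        RTMPPacket *rtmpPacket = createRtmpPack(array, length, tms, livePack);
        result = sendPack(rtmpPacket);
        return result;
    }

    int sendPack(RTMPPacket *pPacket) {
        int result = RTMP_SendPacket(livePack->rtmp, pPacket, 1);
        // Both the packet body and the packet itself were heap-allocated
        RTMPPacket_Free(pPacket);
        free(pPacket);
        return result;
    }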

    // Build the RTMP packet that carries the SPS and PPS (AVC sequence header)
    RTMPPacket *createRtmpSteramPack(LivePack *pack) {
        // Create the RTMPPacket structure defined by the RtmpDump library
        int body_size = 16 + pack->sps_len + pack->pps_len;
        RTMPPacket *rtmpPacket = static_cast<RTMPPacket *>(malloc(sizeof(RTMPPacket)));
        // Allocate m_body before writing to it
        RTMPPacket_Alloc(rtmpPacket, body_size);
        int index = 0;
        // Key frame (1) + AVC codec id (7)
        rtmpPacket->m_body[index++] = 0x17;
        // AVCPacketType 0x00 = AVC sequence header
        rtmpPacket->m_body[index++] = 0x00;
        // Composition time, 3 bytes, always 0 here
        rtmpPacket->m_body[index++] = 0x00;
        rtmpPacket->m_body[index++] = 0x00;
        rtmpPacket->m_body[index++] = 0x00;
        // configurationVersion
        rtmpPacket->m_body[index++] = 0x01;

        rtmpPacket->m_body[index++] = pack->sps[1]; // profile: baseline, main, high, ...
        rtmpPacket->m_body[index++] = pack->sps[2]; // profile_compatibility
        rtmpPacket->m_body[index++] = pack->sps[3]; // level
        rtmpPacket->m_body[index++] = 0xFF; // reserved (111111) + lengthSizeMinusOne (2 bits) = 3
        rtmpPacket->m_body[index++] = 0xE1; // reserved (111) + number of SPS (5 bits) = 1
        // SPS length: high byte, then low byte
        rtmpPacket->m_body[index++] = (pack->sps_len >> 8) & 0xFF;
        rtmpPacket->m_body[index++] = pack->sps_len & 0xff;
        // Copy the SPS content
        memcpy(&rtmpPacket->m_body[index], pack->sps, pack->sps_len);
        index += pack->sps_len;
        // Number of PPS
        rtmpPacket->m_body[index++] = 0x01;
        // PPS length: high byte, then low byte
        rtmpPacket->m_body[index++] = (pack->pps_len >> 8) & 0xff;
        rtmpPacket->m_body[index++] = pack->pps_len & 0xff;
        // Copy the PPS content
        memcpy(&rtmpPacket->m_body[index], pack->pps, pack->pps_len);
        // Packet type: video
        rtmpPacket->m_packetType = RTMP_PACKET_TYPE_VIDEO;
        rtmpPacket->m_nBodySize = body_size;
        // Channel 0x04 is used for video
        rtmpPacket->m_nChannel = 0x04;
        rtmpPacket->m_nTimeStamp = 0;
        rtmpPacket->m_hasAbsTimestamp = 0;
        rtmpPacket->m_headerType = RTMP_PACKET_SIZE_LARGE;
        rtmpPacket->m_nInfoField2 = pack->rtmp->m_stream_id;
        return rtmpPacket;
    }
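
The byte layout of the SPS/PPS packet is easier to check without the RtmpDump types in the way. The sketch below builds the same 16 + sps_len + pps_len byte body into a std::vector, assuming the SPS and PPS are passed without their start codes. buildAvcSequenceHeader is an illustrative name and not a function of RtmpDump.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the body of the first RTMP video packet (AVC sequence header).
// sps and pps are raw NAL units without start codes.
std::vector<uint8_t> buildAvcSequenceHeader(const std::vector<uint8_t> &sps,
                                            const std::vector<uint8_t> &pps) {
    assert(sps.size() >= 4 && sps.size() <= 0xFFFF);
    assert(!pps.empty() && pps.size() <= 0xFFFF);
    std::vector<uint8_t> body;
    body.reserve(16 + sps.size() + pps.size());
    body.push_back(0x17);    // key frame (1) + AVC codec id (7)
    body.push_back(0x00);    // AVCPacketType 0: sequence header
    body.push_back(0x00);    // composition time, 3 bytes, all zero
    body.push_back(0x00);
    body.push_back(0x00);
    body.push_back(0x01);    // configurationVersion
    body.push_back(sps[1]);  // profile
    body.push_back(sps[2]);  // profile_compatibility
    body.push_back(sps[3]);  // level
    body.push_back(0xFF);    // reserved bits + lengthSizeMinusOne = 3
    body.push_back(0xE1);    // reserved bits + number of SPS = 1
    body.push_back(static_cast<uint8_t>((sps.size() >> 8) & 0xFF));  // SPS length, high byte
    body.push_back(static_cast<uint8_t>(sps.size() & 0xFF));         // SPS length, low byte
    body.insert(body.end(), sps.begin(), sps.end());
    body.push_back(0x01);    // number of PPS
    body.push_back(static_cast<uint8_t>((pps.size() >> 8) & 0xFF));  // PPS length, high byte
    body.push_back(static_cast<uint8_t>(pps.size() & 0xFF));         // PPS length, low byte
    body.insert(body.end(), pps.begin(), pps.end());
    return body;
}
```

For a 4-byte SPS and a 4-byte PPS the body is 24 bytes long, which matches the 16 + sps_len + pps_len size used for the packet.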

    RTMPPacket *createRtmpPack(jbyte *array, jint length, jlong tms, LivePack *pack) {
        // Skip the 4-byte start code; RTMP carries the NAL unit without it
        array += 4;
        length -= 4;
        RTMPPacket *packet = (RTMPPacket *) malloc(sizeof(RTMPPacket));
        // 9 header bytes + NAL unit
        int body_size = length + 9;
        RTMPPacket_Alloc(packet, body_size);
        if (array[0] == 0x65) {
            // Key frame + AVC
            packet->m_body[0] = 0x17;
            LOGI("Send keyframe data");
        } else {
            // Inter frame + AVC
            packet->m_body[0] = 0x27;
            LOGI("Send non-keyframe data");
        }
        // AVCPacketType 1 (NAL unit), then a 3-byte composition time of 0
        packet->m_body[1] = 0x01;
        packet->m_body[2] = 0x00;
        packet->m_body[3] = 0x00;
        packet->m_body[4] = 0x00;

        // NAL unit length, 4 bytes, big-endian
        packet->m_body[5] = (length >> 24) & 0xff;
        packet->m_body[6] = (length >> 16) & 0xff;
        packet->m_body[7] = (length >> 8) & 0xff;
        packet->m_body[8] = (length) & 0xff;

        memcpy(&packet->m_body[9], array, length);
        packet->m_packetType = RTMP_PACKET_TYPE_VIDEO;
        packet->m_nBodySize = body_size;
        packet->m_nChannel = 0x04;
        packet->m_nTimeStamp = tms;
        packet->m_hasAbsTimestamp = 0;
        packet->m_headerType = RTMP_PACKET_SIZE_LARGE;
        packet->m_nInfoField2 = pack->rtmp->m_stream_id;

        return packet;
    }
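
The 9-byte header placed in front of each frame can be sketched the same way, independent of RtmpDump. The function below assumes the NAL unit is passed without its start code and decides between key frame and non-key frame from the NAL type. buildAvcFrameBody is an illustrative name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Builds the body of an RTMP video packet that carries one H.264 NAL unit.
// nal points at the NAL header byte (start code already removed).
std::vector<uint8_t> buildAvcFrameBody(const uint8_t *nal, uint32_t nalSize) {
    assert(nal != nullptr && nalSize > 0);
    // nal_unit_type 5 is an IDR slice, i.e. a key frame
    const bool isKeyFrame = (nal[0] & 0x1F) == 5;
    std::vector<uint8_t> body;
    body.reserve(9 + nalSize);
    body.push_back(isKeyFrame ? 0x17 : 0x27);  // frame type + AVC codec id
    body.push_back(0x01);                      // AVCPacketType 1: NAL unit
    body.push_back(0x00);                      // composition time, 3 bytes, all zero
    body.push_back(0x00);
    body.push_back(0x00);
    // NAL unit length as a 4-byte big-endian integer
    body.push_back(static_cast<uint8_t>((nalSize >> 24) & 0xFF));
    body.push_back(static_cast<uint8_t>((nalSize >> 16) & 0xFF));
    body.push_back(static_cast<uint8_t>((nalSize >> 8) & 0xFF));
    body.push_back(static_cast<uint8_t>(nalSize & 0xFF));
    body.insert(body.end(), nal, nal + nalSize);
    return body;
}
```

Note that the length written into bytes 5 to 8 is the size of the NAL unit alone, so a 4-byte start code must be subtracted from the encoder's buffer length before it is used here.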

    void readSpsPps(jbyte *array, jint length, LivePack *pack) {
        for (int i = 0; i + 4 < length; i++) {
            // Find the start code that precedes the PPS (NAL header 0x68)
            if (array[i] == 0x00
                && array[i+1] == 0x00
                && array[i+2] == 0x00
                && array[i+3] == 0x01
                && array[i+4] == 0x68) {
                // Save SPS: everything between the first start code and this one
                pack->sps_len = i - 4;
                pack->sps = static_cast<int8_t *>(malloc(pack->sps_len));
                memcpy(pack->sps, array + 4, pack->sps_len);
                // Save PPS: everything after the second start code
                pack->pps_len = length - (pack->sps_len + 4) - 4;
                pack->pps = static_cast<int8_t *>(malloc(pack->pps_len));
                memcpy(pack->pps, array + 4 + pack->sps_len + 4, pack->pps_len);
                LOGI("sps:%d pps:%d", pack->sps_len, pack->pps_len);
                break;
            }
        }
    }

Use of pointers

  • malloc only allocates memory and does not initialize it, so the contents of a newly allocated block are indeterminate. The block it returns is contiguous.
  • malloc returns void *, a pointer of unspecified type. In C, a void * converts implicitly to any other object pointer type, but C++ requires an explicit cast: int *p = malloc(sizeof(int)); fails to compile in C++ because a void * cannot be assigned to an int * variable. The result must therefore be cast, for example with (int *).
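
Both points can be seen in a few lines of C++. The sketch below allocates a structure with malloc, casts the returned void * explicitly, and zeroes the memory before use. Buffer, allocZeroed and makeBuffer are illustrative names and are not part of the demo project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical structure, similar in spirit to the LivePack used above.
struct Buffer {
    unsigned char *data;
    int len;
};

// Allocates one T with malloc and clears it, because malloc leaves
// the contents of the new block indeterminate.
template <typename T>
T *allocZeroed() {
    // The explicit cast from void * is required in C++
    T *p = static_cast<T *>(std::malloc(sizeof(T)));
    if (p != nullptr) std::memset(p, 0, sizeof(T));
    return p;
}

// Creates a zeroed Buffer and attaches a payload of len bytes to it.
// Returns nullptr if the structure itself cannot be allocated.
Buffer *makeBuffer(int len) {
    assert(len > 0);
    Buffer *b = allocZeroed<Buffer>();
    if (b == nullptr) return nullptr;
    b->data = static_cast<unsigned char *>(std::malloc(static_cast<size_t>(len)));
    b->len = (b->data != nullptr) ? len : 0;
    return b;
}
```

The caller owns the result and releases it with free, first the data member and then the structure itself.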


To summarize: we first obtain the phone's screen image through a system service. The raw data cannot be sent over the network as-is, so we encode it as H.264, wrap the result in RTMP packets, and transmit them in the way the RTMP protocol specifies.


Tags: Java Android Design Pattern

Posted on Thu, 21 Oct 2021 09:55:02 -0400 by mastermike707