Deep optimization of Taote's Flutter streaming scenarios

Author: Jiang Zejun (true meaning)

Taote uses Flutter in many business scenarios, and those scenarios are themselves fairly complex. As a result, when users scroll through streaming (feed) scenarios on low-end devices, Flutter stutters and drops frames noticeably more than native (Android/iOS) development. By analyzing the business-layer performance problems at each stage of the Flutter rendering process and applying a series of in-depth optimizations, we raised the average frame rate above 50 frames, surpassing the native implementation. The jank rate, however, still fell short of the best experience: we had hit a bottleneck that was hard to break through and that called for new technical attempts.

This article covers the underlying principles, the optimization ideas, the optimization strategies for the actual scenario, the implementation of the core technology, and the optimization results. We hope it offers some inspiration and help. Feedback and corrections are welcome, so that we can build a better Flutter technology community together.

Rendering mechanism

Native vs Flutter

Flutter is itself built on top of the native system, so its rendering mechanism is very close to Native's. The following figure comes from a talk by Xiao Yu of the Google Flutter team [1]:

Rendering process

As the left side of the figure shows, after receiving the VSync signal Flutter goes through 8 stages in total; the data is submitted to the GPU after the Compositing stage.

The Semantics stage passes the information of RenderObjects marked as needing a semantics update to the system to support accessibility features. Through the semantics interface, users with visual impairments can understand the UI content. This stage has little to do with the overall drawing process.

The Finalize Tree stage unmounts all inactive elements that were added to _inactiveElements. It also has little to do with the overall drawing process.

Therefore, Flutter's overall rendering process mainly consists of the stages shown on the right side of the figure above:

GPU Vsync

After receiving the vertical synchronization signal, the Flutter engine notifies the Flutter framework to begin the frame, which enters the Animation stage.

Animation

This stage mainly runs the transientCallbacks. The Flutter engine then notifies the Flutter framework to draw the frame, which enters the Build stage.

Build

Builds the data structure of the UI component tree to be rendered, that is, creates the corresponding Widgets and their Elements.

Layout

Calculates the actual size each node occupies for layout, and then updates the layout information of all dirty render objects.

Compositing Bits

Updates the RenderObjects that are marked as needing a compositing-bits update.

Paint

Generates the Layer Tree. The Layer Tree cannot be used directly; it still has to be composited into a Scene and rasterized. Layers are merged because Flutter usually has many of them, and passing each layer to the GPU directly would be inefficient, so they are composited first. After rasterization, the result is handed over to the Flutter engine.

Compositing

Composites the Layer Tree into a Scene and creates a raster image of the Scene's current state (rasterization), which is then submitted to the Flutter engine. Finally, Skia submits the data to the GPU through the OpenGL or Vulkan interface, and the GPU processes and displays it.

Core rendering phase

Widget

Most of the code we write is Widgets. A Widget can be understood as the data structure of a component tree, and it is the main part of the Build stage. The depth of the Widget Tree, how sensibly setState is used in StatefulWidgets, unreasonable logic inside build functions, and the use of widgets that call saveLayer are frequent sources of performance problems.

Element

An Element associates a Widget with a RenderObject: each Widget gets a corresponding Element that stores its context information. Flutter traverses the Elements to generate the RenderObject tree that backs the UI structure.

RenderObject

A RenderObject determines the layout information in the Layout stage and generates the corresponding Layer in the Paint stage, which shows how important it is. Most drawing performance optimization in Flutter therefore happens here. The data built from the RenderObject tree is added to the LayerTree that the Engine needs.

Performance optimization ideas

With an understanding of the underlying rendering mechanism and the core rendering stages, optimization can be divided into three layers:

The optimization details of each layer are not discussed here. This article focuses on the actual scenario.

Streaming scene

Streaming component principle

In native development, list scenarios are usually built with RecyclerView/UICollectionView. In Flutter development, the Flutter framework provides the ListView component, which is essentially a SliverList.

Core source code

We analyze it from the core source code of SliverList:

class SliverList extends SliverMultiBoxAdaptorWidget {

  @override
  RenderSliverList createRenderObject(BuildContext context) {
    final SliverMultiBoxAdaptorElement element = context as SliverMultiBoxAdaptorElement;
    return RenderSliverList(childManager: element);
  }
}

abstract class SliverMultiBoxAdaptorWidget extends SliverWithKeepAliveWidget {

  final SliverChildDelegate delegate;

  @override
  SliverMultiBoxAdaptorElement createElement() => SliverMultiBoxAdaptorElement(this);

  @override
  RenderSliverMultiBoxAdaptor createRenderObject(BuildContext context);
}

By viewing the source code of SliverList, we can see that SliverList is a RenderObjectWidget with the following structure:

Let's first look at the core source code of its RenderObject:

class RenderSliverList extends RenderSliverMultiBoxAdaptor {

  RenderSliverList({
    @required RenderSliverBoxChildManager childManager,
  }) : super(childManager: childManager);

  @override
  void performLayout(){
    ...
    // Layout constraints imposed by the parent node on its children
    final SliverConstraints constraints = this.constraints;
    final double scrollOffset = constraints.scrollOffset + constraints.cacheOrigin;
    final double remainingExtent = constraints.remainingCacheExtent;
    final double targetEndScrollOffset = scrollOffset + remainingExtent;
    final BoxConstraints childConstraints = constraints.asBoxConstraints();
    ...
    insertAndLayoutLeadingChild(childConstraints, parentUsesSize: true);
    ...
    insertAndLayoutChild(childConstraints,after: trailingChildWithLayout,parentUsesSize: true);
    ...
    collectGarbage(leadingGarbage, trailingGarbage);
    ...
  }
}

abstract class RenderSliverMultiBoxAdaptor extends RenderSliver ...{
  @protected
  RenderBox insertAndLayoutChild(BoxConstraints childConstraints, {@required RenderBox after,...}) {
    _createOrObtainChild(index, after: after);
    ...
  }

  @protected
  RenderBox insertAndLayoutLeadingChild(BoxConstraints childConstraints, {bool parentUsesSize = false}) {
    _createOrObtainChild(index, after: null);
    ...
  }

  @protected
  void collectGarbage(int leadingGarbage, int trailingGarbage) {
    _destroyOrCacheChild(firstChild);
    ...
  }

  void _createOrObtainChild(int index, { RenderBox after }) {
    _childManager.createChild(index, after: after);
    ...
  }

  void _destroyOrCacheChild(RenderBox child) {
    if (childParentData.keepAlive) {
      // For better performance we do not use keepAlive, so the else branch runs
      ...
    } else {
      _childManager.removeChild(child);
      ...
    }
  }
}

Looking at the source code of RenderSliverList, we find that children are created and removed through its parent class RenderSliverMultiBoxAdaptor, which in turn delegates to _childManager, that is, the SliverMultiBoxAdaptorElement. Throughout the layout of SliverList, the size of each child is constrained by the parent node.

In a streaming scenario:

  • During scrolling, each new child entering the visible area (that is, each item card in the business scenario) is created through SliverMultiBoxAdaptorElement.createChild;
  • During scrolling, old children that are no longer in the visible area are removed through SliverMultiBoxAdaptorElement.removeChild.

Let's take a look at the core source code of SliverMultiBoxAdaptorElement:

class SliverMultiBoxAdaptorElement extends RenderObjectElement implements RenderSliverBoxChildManager {
  final SplayTreeMap<int, Element> _childElements = SplayTreeMap<int, Element>();

  @override
  void createChild(int index, { @required RenderBox after }) {
    ...
    Element newChild = updateChild(_childElements[index], _build(index), index);
    if (newChild != null) {
      _childElements[index] = newChild;
    } else {
      _childElements.remove(index);
    }
    ...
  }

  @override
  void removeChild(RenderBox child) {
    ...
    final Element result = updateChild(_childElements[index], null, index);
    _childElements.remove(index);
    ...
  }

  @override
  Element updateChild(Element child, Widget newWidget, dynamic newSlot) {
    ...
    final Element newChild = super.updateChild(child, newWidget, newSlot);
    ...
  }
}

By checking the source code of SliverMultiBoxAdaptorElement, we can see that the operations on a child are actually carried out through updateChild of the parent class Element.

Next, let's look at the core code of Element:

abstract class Element extends DiagnosticableTree implements BuildContext {
  @protected
  Element updateChild(Element child, Widget newWidget, dynamic newSlot) {
    if (newWidget == null) {
      if (child != null)
        deactivateChild(child);
      return null;
    }
    Element newChild;
    if (child != null) {
      ...
      bool hasSameSuperclass = oldElementClass == newWidgetClass;
      if (hasSameSuperclass && child.widget == newWidget) {
        if (child.slot != newSlot)
          updateSlotForChild(child, newSlot);
        newChild = child;
      } else if (hasSameSuperclass && Widget.canUpdate(child.widget, newWidget)) {
        if (child.slot != newSlot)
          updateSlotForChild(child, newSlot);
        child.update(newWidget);
        newChild = child;
      } else {
        deactivateChild(child);
        newChild = inflateWidget(newWidget, newSlot);
      }
    } else {
      newChild = inflateWidget(newWidget, newSlot);
    }
    ...
    return newChild;
  }

  @protected
  Element inflateWidget(Widget newWidget, dynamic newSlot) {
    ...
    final Element newChild = newWidget.createElement();
    newChild.mount(this, newSlot);
    ...
    return newChild;
  }

  @protected
  void deactivateChild(Element child) {
    child._parent = null;
    child.detachRenderObject(); 
    owner._inactiveElements.add(child); // this eventually calls child.deactivate() & child.unmount()
    ...
  }
}
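The branches of updateChild can be easier to follow in isolation. The following is a minimal sketch in plain, null-safe Dart, not the framework code: Widget, Text, Image, Element and UpdateResult here are simplified stand-ins invented for illustration, and only the canUpdate rule (same runtime type and same key) is taken from Flutter itself.

```dart
// Simplified stand-ins for Flutter's Widget and Element; not the real classes.
class Widget {
  const Widget({this.key});
  final Object? key;

  // Same rule as Flutter's Widget.canUpdate: same runtime type and same key.
  static bool canUpdate(Widget oldWidget, Widget newWidget) =>
      oldWidget.runtimeType == newWidget.runtimeType &&
      oldWidget.key == newWidget.key;
}

class Text extends Widget {
  const Text(this.data, {super.key});
  final String data;
}

class Image extends Widget {
  const Image(this.url, {super.key});
  final String url;
}

class Element {
  Element(this.widget);
  Widget widget;
}

enum UpdateResult { deactivated, kept, updated, inflated }

/// Mirrors the branches of Element.updateChild and reports which one ran.
(Element?, UpdateResult) updateChild(Element? child, Widget? newWidget) {
  if (newWidget == null) {
    // No widget for this slot any more: the old element is deactivated.
    return (null, UpdateResult.deactivated);
  }
  if (child != null) {
    if (child.widget == newWidget) {
      return (child, UpdateResult.kept); // same widget instance, nothing to do
    }
    if (Widget.canUpdate(child.widget, newWidget)) {
      child.widget = newWidget; // reuse the element, swap its configuration
      return (child, UpdateResult.updated);
    }
  }
  // No child, or an incompatible widget: build a brand-new element.
  return (Element(newWidget), UpdateResult.inflated);
}
```

The point to take away for the streaming scenario is the last branch: whenever there is no existing child for an index, a complete new Element (and with it a RenderObject) is inflated, which is the cost the later optimizations try to avoid.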

You can see that it mainly calls the mount and detachRenderObject of Element. Here, let's look at the source code of these two methods of RenderObjectElement:

abstract class RenderObjectElement extends Element {
  @override
  void mount(Element parent, dynamic newSlot) {
    super.mount(parent, newSlot);
    ...
    _renderObject = widget.createRenderObject(this);
    attachRenderObject(newSlot);
    ...
  }

  @override
  void attachRenderObject(dynamic newSlot) {
    ...
    _ancestorRenderObjectElement = _findAncestorRenderObjectElement();
    _ancestorRenderObjectElement?.insertChildRenderObject(renderObject, newSlot);
    ...
  }

  @override
  void detachRenderObject() {
    if (_ancestorRenderObjectElement != null) {
      _ancestorRenderObjectElement.removeChildRenderObject(renderObject);
      _ancestorRenderObjectElement = null;
    }
    ...
  }
}

Tracing through the source code above, we can see that:

In a streaming scenario:

  • During scrolling, a new child is created by creating a new Element and mounting it into the Element Tree; the corresponding RenderObject is then created and _ancestorRenderObjectElement?.insertChildRenderObject is called;
  • During scrolling, an old child that is no longer in the visible area is removed by unmounting the corresponding Element from the Element Tree; _ancestorRenderObjectElement.removeChildRenderObject is then called.

This _ancestorRenderObjectElement is in fact the SliverMultiBoxAdaptorElement. Let's look at SliverMultiBoxAdaptorElement again:

class SliverMultiBoxAdaptorElement extends RenderObjectElement implements RenderSliverBoxChildManager {

  @override
  void insertChildRenderObject(covariant RenderObject child, int slot) {
    ...
    renderObject.insert(child as RenderBox, after: _currentBeforeChild);
    ...
  }

  @override
  void removeChildRenderObject(covariant RenderObject child) {
    ...
    renderObject.remove(child as RenderBox);
  }
}

In fact, both calls end up in methods of ContainerRenderObjectMixin. Let's look at ContainerRenderObjectMixin:

mixin ContainerRenderObjectMixin<ChildType extends RenderObject, ... {
  void insert(ChildType child, { ChildType after }) {
        ...
    adoptChild(child);// attach render object
    _insertIntoChildList(child, after: after);
  }

  void remove(ChildType child) {
    _removeFromChildList(child);
    dropChild(child);// detach render object
  }
}

ContainerRenderObjectMixin maintains a doubly linked list that holds the current child RenderObjects, so every creation and removal during scrolling is mirrored in that doubly linked list.
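To make the list bookkeeping concrete, here is a minimal sketch of such a doubly linked child list in plain Dart. Child and ChildList are hypothetical names for illustration; in Flutter the sibling links live in each child's parent data, and the real mixin also adopts and drops the child, which this sketch leaves out.

```dart
/// A child node that stores links to its neighbours.
class Child {
  Child(this.name);
  final String name;
  Child? previousSibling;
  Child? nextSibling;
}

class ChildList {
  Child? firstChild;
  Child? lastChild;
  int childCount = 0;

  /// Inserts [child] right after [after]; a null [after] means "at the front".
  void insert(Child child, {Child? after}) {
    if (after == null) {
      child.nextSibling = firstChild;
      firstChild?.previousSibling = child;
      firstChild = child;
      lastChild ??= child;
    } else {
      child.previousSibling = after;
      child.nextSibling = after.nextSibling;
      after.nextSibling?.previousSibling = child;
      after.nextSibling = child;
      if (identical(lastChild, after)) lastChild = child;
    }
    childCount++;
  }

  /// Unlinks [child] in constant time by rewiring its two neighbours.
  void remove(Child child) {
    final prev = child.previousSibling;
    final next = child.nextSibling;
    if (prev == null) {
      firstChild = next;
    } else {
      prev.nextSibling = next;
    }
    if (next == null) {
      lastChild = prev;
    } else {
      next.previousSibling = prev;
    }
    child.previousSibling = null;
    child.nextSibling = null;
    childCount--;
  }

  /// Names of the children from first to last, for inspection.
  List<String> get names {
    final result = <String>[];
    for (var c = firstChild; c != null; c = c.nextSibling) {
      result.add(c.name);
    }
    return result;
  }
}
```

Both operations only touch the neighbours of the affected node, which suits a scrolling list where children are added at one end and removed at the other.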

Finally, it can be summarized as follows:

  • During scrolling, a new child is created by creating a new Element and mounting it into the Element Tree; the corresponding RenderObject is then created, SliverMultiBoxAdaptorElement.insertChildRenderObject attaches it to the Render Tree, and the RenderObject is added to the doubly linked list maintained by the ContainerRenderObjectMixin of the SliverMultiBoxAdaptorElement's renderObject;
  • During scrolling, an old child that is no longer in the visible area is removed by unmounting the corresponding Element from the Element Tree; SliverMultiBoxAdaptorElement.removeChildRenderObject then removes the corresponding RenderObject from the mixin's doubly linked list and detaches it from the Render Tree.

Rendering principle

Through the analysis of the core source code, we can classify the elements of the streaming scene as follows:

Let's look at the overall rendering process when a user scrolls up to view more item cards and triggers loading of the next page of data:

  • When scrolling up, cards 0 and 1 at the top move out of the Viewport area (Visible Area + Cache Area), which we define as entering the Detach Area. Once there, the corresponding RenderObject is detached from the Render Tree, the corresponding Element is unmounted from the Element Tree, and the removal is mirrored in the doubly linked list;
  • Whether the next page needs loading is determined by listening to the ScrollController and computing the scroll position. The Loading Footer component at the bottom then enters the visible area or cache area: the childCount of SliverChildBuilderDelegate has to be incremented by 1 so that the last child returns the Loading Footer component, and setState is called to refresh the whole SliverList. update calls performRebuild, so every item in the user's visible area in the middle is updated; the new Element and RenderObject for the Loading Footer component are then created and added to the doubly linked list;
  • When loading finishes and the data returns, setState is called again to refresh the whole SliverList. update calls performRebuild, so every item in the user's visible area is updated again; the Loading Footer component's RenderObject is then detached from the Render Tree, its Element is unmounted from the Element Tree, and the removal is mirrored in the doubly linked list;
  • The new items at the bottom enter the visible area or cache area, so new Elements and RenderObjects are created for them and added to the doubly linked list.
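The churn described in these steps can be estimated with a small model. The sketch below is plain Dart and rests on a simplifying assumption that the article does not make: every item has the same fixed extent. The function names liveIndices and diff are hypothetical. It computes which item indices fall inside the visible area plus cache area for a given scroll offset, and which children one scroll step would create and remove when nothing is reused.

```dart
import 'dart:math';

/// Indices of fixed-height items that intersect the viewport plus cache area.
Set<int> liveIndices({
  required double scrollOffset,
  required double viewportExtent,
  required double cacheExtent,
  required double itemExtent,
  required int itemCount,
}) {
  final start = max(0.0, scrollOffset - cacheExtent);
  final end = scrollOffset + viewportExtent + cacheExtent;
  final first = (start / itemExtent).floor();
  final last = min(itemCount - 1, (end / itemExtent).ceil() - 1);
  return {for (var i = first; i <= last; i++) i};
}

/// What one scroll step costs without reuse: children to create and to remove.
({Set<int> created, Set<int> removed}) diff(Set<int> before, Set<int> after) =>
    (created: after.difference(before), removed: before.difference(after));
```

With items 100 units tall, a 300-unit viewport and a 100-unit cache area, scrolling from offset 0 to offset 250 gives created = {4, 5, 6} and removed = {0}. Each created index corresponds to one Element and RenderObject being built from scratch in the default flow.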

Optimization strategy

In the scenario above, where the user scrolls up to view more item cards and triggers loading of the next page of data, we can optimize in five directions:

Load More

Listening to the ScrollController means computing continuously while the list scrolls. It is better to need no such check at all: the component should recognize automatically that the next page has to be loaded and then fire the loadMore() callback. We create a new ReuseSliverChildBuilderDelegate that adds loadMore and footerBuilder at the same level as itemBuilder and includes the Loading Footer component by default. Inside SliverMultiBoxAdaptorElement.createChild(int index,...), it decides whether to invoke loadMore() and builds the footer component automatically.
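The idea can be sketched without Flutter. In the following plain-Dart model, LoadMoreDelegate and all of its members are hypothetical names, and widgets are represented by strings; it is an illustration of the approach, not the ReuseSliverChildBuilderDelegate itself. The list asks the delegate for the child at an index, and the request for the slot just past the last item is what triggers loading.

```dart
/// Hypothetical, Flutter-free model of a builder delegate with a footer.
class LoadMoreDelegate {
  LoadMoreDelegate({
    required this.itemBuilder,
    required this.footerBuilder,
    required this.onLoadMore,
    required this.childCount,
    this.hasMore = true,
  });

  final String Function(int index) itemBuilder;
  final String Function() footerBuilder;
  final void Function() onLoadMore;
  int childCount;
  bool hasMore;
  bool _loading = false;

  /// Called by the list when it needs the child at [index].
  /// Returns null when there is nothing to build at that index.
  String? build(int index) {
    if (index < childCount) return itemBuilder(index);
    if (index == childCount && hasMore) {
      // The footer slot is being built, so the user has reached the end:
      // request the next page once, with no scroll-offset arithmetic.
      if (!_loading) {
        _loading = true;
        onLoadMore();
      }
      return footerBuilder();
    }
    return null;
  }

  /// Call when the next page has arrived.
  void pageLoaded(int newChildCount, {required bool hasMore}) {
    childCount = newChildCount;
    this.hasMore = hasMore;
    _loading = false;
  }
}
```

Because the trigger is the build request for the footer slot, no scroll listener has to run on every frame, and the _loading flag keeps repeated builds of the footer from firing duplicate requests.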

Local refresh

Following the earlier long-list fluency optimization by Xianyu (Idle Fish) [2]: when setState is called to refresh the whole SliverList after the next page of data comes back, every item in the user's visible area in the middle goes through an update. In fact, only the newly created part needs to be refreshed. Optimizing SliverMultiBoxAdaptorElement.update(SliverMultiBoxAdaptorWidget newWidget) achieves this partial refresh. As shown below:

Element & RenderObject reuse

This follows Xianyu's earlier long-list fluency optimization [2] and the ViewHolder reuse design of Google's Android RecyclerView [3]: when a new item is created, components are held and reused in the way a RecyclerView ViewHolder is. From the analysis of the rendering mechanism, a Widget in Flutter can be understood as the data structure of a component tree; it is mostly a data description of a component's structure. So we cache the Elements and RenderObjects of removed items, grouped by component type, and when a new item is created we first take a match from the cache and reuse it. This does not break the design of Flutter's own Key: if an item uses a Key, only an Element and RenderObject with the same Key are reused. In a streaming scenario, however, every list entry carries different data, so an item that uses a Key can never be reused. When Element and RenderObject reuse is enabled, item components are therefore advised not to use a Key.
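The caching scheme resembles an object pool keyed by component type. The sketch below shows that idea in plain Dart under stated assumptions: ItemElement, ElementPool, GoodsCard and BannerCard are hypothetical names, and a single object stands in for an item's Element together with its RenderObject.

```dart
/// Hypothetical stand-in for an item's Element (plus its RenderObject).
class ItemElement {
  ItemElement(this.widgetType);
  final Type widgetType;
  Object? slot;
}

class ElementPool {
  final Map<Type, List<ItemElement>> _cache = {};
  int created = 0;
  int reused = 0;

  /// Returns a cached element of the same widget type, or creates one.
  ItemElement obtain(Type widgetType, Object slot) {
    final bucket = _cache[widgetType];
    final ItemElement element;
    if (bucket != null && bucket.isNotEmpty) {
      element = bucket.removeLast();
      reused++;
    } else {
      element = ItemElement(widgetType);
      created++;
    }
    element.slot = slot;
    return element;
  }

  /// Called when an item scrolls out of the viewport and cache area.
  void recycle(ItemElement element) {
    element.slot = null; // cleared so the element can take a new slot later
    _cache.putIfAbsent(element.widgetType, () => []).add(element);
  }
}

// Two illustrative card types used as pool keys.
class GoodsCard {}

class BannerCard {}
```

Keying the pool by type matters because an Element can only be updated with a widget of the same runtime type; a recycled GoodsCard element is of no use for a BannerCard.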

We add a cache state to the classification of elements in the original streaming scenario:

As shown below:

GC suppression

Dart has its own GC mechanism, similar to Java's generational collection. One option is to suppress GC while scrolling and customize the GC algorithm. We discussed this with Google's Flutter experts: unlike Java, Dart does not switch between multiple threads for garbage collection, and its single-threaded collection is faster and lighter. Suppressing GC would also require deep changes to the Flutter engine. Given the small expected benefit, we have not pursued it for now.

Asynchronization

The Flutter engine restricts non-main isolates from calling Platform-related APIs, so all logic that does not interact with the Platform thread can be moved into new isolates. Frequently creating and recycling isolates also has some performance cost: Flutter's compute<Q, R>(ComputeCallback<Q, R> callback, Q message, {String debugLabel}) creates a new Isolate on every call and recycles it once the task finishes. We therefore implemented an Isolate pool, similar to a thread pool, to process non-view tasks. In actual tests the improvement was not obvious, so it is not described further.

Core technology implementation

The code on the call chain can be grouped as follows:

All of the core rendering logic lives in SliverMultiBoxAdaptorElement, which inherits from RenderObjectElement. To avoid breaking the original functional design and the structure of the Flutter framework, we added a new Element, ReuseSliverMultiBoxAdaptorElement, to implement the optimization strategies. It can be used directly with the RenderSliverList of the original SliverList, or with the RenderObject of a custom streaming component (for example, a waterfall-flow component).

Local refresh

Call chain optimization

In the update method of ReuseSliverMultiBoxAdaptorElement, determine whether this is a local refresh. If it is not, call performRebuild; if it is, only create the newly added item.

Core code

@override
void update(covariant ReuseSliverMultiBoxAdaptorWidget newWidget) {
  ...
  // Local refresh
  if (_isPartialRefresh(oldDelegate, newDelegate)) {
    ...
    int index = _childElements.lastKey() + 1;
    Widget newWidget = _buildItem(index);
    // do not create child when new widget is null
    if (newWidget == null) {
      return;
    }
    _currentBeforeChild = _childElements[index - 1].renderObject as RenderBox;
    _createChild(index, newWidget);
  } else {
    // need to rebuild
    performRebuild();
  }
}
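The article does not show the body of _isPartialRefresh. As an illustration only, one plausible criterion is that the new data equals the old data with items appended at the end. The sketch below expresses that assumption in plain Dart; isAppendOnly, chooseRefresh and RefreshKind are hypothetical names and are not taken from the author's implementation.

```dart
enum RefreshKind { partial, full }

/// Hypothetical criterion: the new data is the old data plus appended items.
bool isAppendOnly<T>(List<T> oldItems, List<T> newItems) {
  if (newItems.length <= oldItems.length) return false;
  for (var i = 0; i < oldItems.length; i++) {
    if (oldItems[i] != newItems[i]) return false;
  }
  return true;
}

/// Chooses between creating only the new tail and rebuilding everything.
RefreshKind chooseRefresh<T>(List<T> oldItems, List<T> newItems) =>
    isAppendOnly(oldItems, newItems) ? RefreshKind.partial : RefreshKind.full;
```

Any change to an existing entry, or a list that did not grow, falls back to the full rebuild, which keeps the fast path safe.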

Element & RenderObject reuse

Call chain optimization

  • Create: in the createChild method of ReuseSliverMultiBoxAdaptorElement, read _cacheElements and reuse a cached Element of the corresponding component type; if no reusable Element of the same type exists, create a new Element and RenderObject.
  • Remove: in the removeChild method of ReuseSliverMultiBoxAdaptorElement, remove the RenderObject from the doubly linked list without deactivating the Element or detaching the RenderObject, set the _slot of the corresponding Element to null so that it can be reused normally next time, and then cache the Element in the list for its component type in _cacheElements.

Note: skipping deactivation of the Element can be done simply by not calling it, but skipping detach of the RenderObject cannot be done directly. A new method, removeOnly, has to be added to object.dart in the Flutter framework layer; it only removes the RenderObject from the doubly linked list without detaching it.

Core code

  • Create
// New method; createChild calls it
_createChild(int index, Widget newWidget) {
  ...
  Type delegateChildRuntimeType = _getWidgetRuntimeType(newWidget);
  if (_cacheElements[delegateChildRuntimeType] != null
      && _cacheElements[delegateChildRuntimeType].isNotEmpty) {
    child = _cacheElements[delegateChildRuntimeType].removeAt(0);
  } else {
    child = _childElements[index];
  }
  ...
  newChild = updateChild(child, newWidget, index);
  ...
}
  • Remove
@override
void removeChild(RenderBox child) {
  ...
  removeChildRenderObject(child); // call removeOnly
  ...
  removeElement = _childElements.remove(index);
  _performCacheElement(removeElement);
}

Load More

Call chain optimization

When creating a child, determine whether the footer should be built and handle it accordingly.

Core code

@override
void createChild(int index, { @required RenderBox after }) {
    ...
    Widget newWidget;
    if(_isBuildFooter(index)){ // call footerBuilder & call onLoadMore
      newWidget = _buildFooter();
    }else{
      newWidget = _buildItem(index);
    }
    ...
    _createChild(index, newWidget);
    ...
}

Overall structure design

  • Centralize the core optimization capabilities in the Element layer to provide the underlying capabilities;
  • Take ReuseSliverMultiBoxAdaptorWidget as the base class and return the optimized Element by default;
  • The loadMore and footerBuilder capabilities are exposed to the upper layer in one place by ReuseSliverChildBuilderDelegate, which inherits from SliverChildBuilderDelegate;
  • If you have your own custom streaming component Widget, you can simply change its base class from RenderObjectWidget to ReuseSliverMultiBoxAdaptorWidget, as in a custom single-column list component (ReuseSliverList) or a waterfall-flow component (ReuseWaterFall).

Optimization results

Building on the earlier series of in-depth optimizations and on switching the Flutter engine to UC Hummer, we isolated the streaming-scenario optimizations as the only variable and used PerfDog to collect fluency data for comparison:

The overall performance data improved. Averaged together with the test data from before the Engine was replaced, the frame rate rose by 2-3 frames and the jank rate fell by 1.5 percentage points.

Summary

Usage

Usage is the same as for the native SliverList: replace the Widget with the corresponding reusable component (ReuseSliverList / ReuseWaterFall / CustomSliverList). If footer and loadMore are needed, use ReuseSliverChildBuilderDelegate; otherwise, the native SliverChildBuilderDelegate can be used directly.

Scenario that requires paging

return ReuseSliverList( // ReuseWaterFall or CustomSliverList
  delegate: ReuseSliverChildBuilderDelegate(
    (BuildContext context, int index) {
      return getItemWidget(index);
    },
    // Build the footer
    footerBuilder: (BuildContext context) {
      return DetailMiniFootWidget();
    },
    // Add the loadMore listener
    addUnderFlowListener: loadMore,
    childCount: dataOfWidgetList.length,
  ),
);

Scenario without paging

return ReuseSliverList( // ReuseWaterFall or CustomSliverList
  delegate: SliverChildBuilderDelegate(
    (BuildContext context, int index) {
      return getItemWidget(index);
    },
    childCount: dataOfWidgetList.length,
  ),
);

Notes

Do not add a Key to item/footer components; otherwise only an Element with the same Key is considered for reuse. Because the Element is reused, the Widget that describes the component tree's data is updated every time, but the State of a StatefulElement is created together with the Element and is reused as well. This is consistent with Flutter's own design, so any data cached in the State has to be fetched again from the Widget in didUpdateWidget(covariant T oldWidget).
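The consequence for State can be shown with a small model. The sketch below is plain Dart with simplified stand-ins: PriceCard, PriceCardState and rebind are hypothetical names that only imitate how a reused Element keeps its State and hands it a new widget. It shows why a value derived from the widget and cached in the State has to be refreshed in didUpdateWidget.

```dart
// Minimal stand-ins for a StatefulWidget and its State; not the Flutter classes.
class PriceCard {
  const PriceCard(this.priceInCents);
  final int priceInCents;
}

class PriceCardState {
  PriceCardState(this.widget) {
    initState();
  }

  PriceCard widget;
  late String label; // derived data cached in the State

  /// Runs once, when the State is first created.
  void initState() => label = _format(widget.priceInCents);

  /// Runs when a reused element receives a new widget.
  void didUpdateWidget(PriceCard oldWidget) {
    if (oldWidget.priceInCents != widget.priceInCents) {
      label = _format(widget.priceInCents); // refresh the cached value
    }
  }

  static String _format(int cents) => (cents / 100).toStringAsFixed(2);
}

/// What a reused element does: keep the State, swap the widget.
void rebind(PriceCardState state, PriceCard newWidget) {
  final old = state.widget;
  state.widget = newWidget;
  state.didUpdateWidget(old);
}
```

Without the refresh in didUpdateWidget, a card recycled for a different product would keep showing the label computed for the previous one, because initState does not run again for a reused State.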

Reuse Element Lifecycle

The state of each item is reported through callbacks so that the upper layer can run its own logic and release resources. For example, the handling of data cached in the State that was previously done in didUpdateWidget(covariant T oldWidget) can be placed in onDisappear, and so can stopping an auto-playing video stream:

/// Reuse lifecycle callbacks
mixin ReuseSliverLifeCycle {

  // Visible in the foreground
  void onAppear() {}

  // Invisible in the background
  void onDisappear() {}
}
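As a usage illustration, an item's state object could mix in these callbacks to start and stop playback. The sketch below repeats the mixin so that it is self-contained; VideoCardState and its fields are hypothetical, and a plain class is used instead of a Flutter State so the example runs without the framework.

```dart
mixin ReuseSliverLifeCycle {
  void onAppear() {}
  void onDisappear() {}
}

/// Hypothetical item state that plays a video only while it is on screen.
class VideoCardState with ReuseSliverLifeCycle {
  bool isPlaying = false;
  final List<String> log = [];

  @override
  void onAppear() {
    isPlaying = true;
    log.add('play');
  }

  @override
  void onDisappear() {
    // Stop playback and release resources when the item is cached for reuse.
    isPlaying = false;
    log.add('pause');
  }
}
```

Since a cached Element is never unmounted, dispose is not a reliable place for this cleanup; the disappear callback is the natural point to release what the item holds.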

References

[1] Xiao Yu, Google Flutter team: Flutter Performance Profiling and Theory: https://files.flutter-io.cn/events/gdd2018/Profiling_your_Flutter_Apps.pdf

[2] Yun Cong, Xianyu (Idle Fish): He doubled the fluency of long lists in the Xianyu app

[3] Google Android RecyclerView.ViewHolder: RecyclerView.Adapter#onCreateViewHolder: https://developer.android.com/reference/androidx/recyclerview/widget/RecyclerView.Adapter#onCreateViewHolder(android.view.ViewGroup,%20int)


Posted on Fri, 26 Nov 2021 03:49:54 -0500 by newburcj