Detailed explanation of the source code behind the new back pressure metrics in Flink 1.13

Back pressure (also written backpressure) is one of Flink's most important monitoring metrics. It shows directly whether a downstream task can keep up with the data it receives. For details on back pressure, see the official documentation: Monitoring back pressure.

Note: the back pressure page in the official 1.13 documentation still describes the calculation method used in 1.12.

Up to 1.12, Flink used output stack sampling to decide whether a task was back pressured. In 1.13 this was replaced by timing based on the task Mailbox, and the job graph display in the UI was reimplemented. Flink now shows how busy and how back pressured each task is through colors and values: idle tasks are shown in blue, fully busy tasks in red, and fully back pressured tasks in black, with shades in between. In addition, two new metrics, Idle and Busy, were added to the BackPressure tab of the task details (as shown in the figure below); they show how much of its time the task spends idle and busy. The rest of this article looks at how these two metrics are collected.

Here is how the official documentation describes the two metrics:

Metrics | Description | Type
idleTimeMsPerSecond | The time (in milliseconds) this task is idle (has no data to process) per second. Idle time excludes back pressured time, so if the task is back pressured it is not idle. | Meter
busyTimeMsPerSecond | The time (in milliseconds) this task is busy (neither idle nor back pressured) per second. Can be NaN, if the value could not be calculated. | Meter

From this description, both metrics express per-second figures for how busy or idle a task is. A natural guess is that the two values add up to 1000 ms, and that the two percentages in the figure above add up to 100%. Next, let's look at the source code to see how the metrics are collected and whether they really are complementary.

The quickest way to find this metric in the source code is to search for idleTimeMsPerSecond in the org.apache.flink.runtime.metrics package. This leads directly to the class TaskIOMetricGroup, which shows that the metric belongs to the task's IO metrics. The relevant definitions are as follows (unrelated code is omitted for clarity):

private final TimerGauge idleTimePerSecond;
private final Gauge busyTimePerSecond;
private final TimerGauge backPressuredTimePerSecond;

public TaskIOMetricGroup(TaskMetricGroup parent) {
    super(parent);
    this.idleTimePerSecond = gauge(MetricNames.TASK_IDLE_TIME, new TimerGauge());
    this.backPressuredTimePerSecond =
            gauge(MetricNames.TASK_BACK_PRESSURED_TIME, new TimerGauge());
    this.busyTimePerSecond = gauge(MetricNames.TASK_BUSY_TIME, this::getBusyTimePerSecond);
}

idleTimePerSecond and backPressuredTimePerSecond are both of type TimerGauge, and each is created directly with new in the constructor. busyTimePerSecond, in contrast, gets its value from the getBusyTimePerSecond method.

Note: the gauge method does not affect how a metric is defined; it only binds the metric name to the metric value, so it is not shown here.

The getBusyTimePerSecond method is implemented as follows:

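private double getBusyTimePerSecond() {
    // Despite its name, busyTime holds the time that was NOT busy (idle + back pressured).
    double busyTime = idleTimePerSecond.getValue() + backPressuredTimePerSecond.getValue();
    // Busy time is what remains of the 1000 ms; NaN when busy time is disabled.
    return busyTimeEnabled ? 1000.0 - Math.min(busyTime, 1000.0) : Double.NaN;
}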

The value of busyTimePerSecond depends on busyTimeEnabled, which is set from outside through a setter. Two classes call the setter: SourceStreamTask passes false and StreamTask passes true. The busyTimeMsPerSecond of a source task is therefore NaN. For other stream tasks it is derived from idleTimePerSecond and backPressuredTimePerSecond: busy time is 1000 minus their sum, with the sum capped at 1000. So the earlier guess holds only when backPressuredTimePerSecond is 0. In general it is idle, busy, and back pressured time together that add up to 1000 ms, as long as idle plus back pressured time does not exceed 1000.

Next, focus on idleTimePerSecond and how its value is computed. The gauge is handed to other objects through a getter, and the only caller of that getter is StreamTask, as follows:
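To make the arithmetic concrete, here is a minimal sketch that repeats the calculation from getBusyTimePerSecond on plain numbers. The class name BusyTimeSketch and the method signature are made up for illustration and are not part of Flink.

```java
// Hypothetical helper, not Flink code: the same arithmetic on plain numbers.
final class BusyTimeSketch {
    private BusyTimeSketch() {}

    /** Returns the busy time in ms per second, or NaN when busy time is disabled. */
    static double busyTimeMsPerSecond(long idleMs, long backPressuredMs, boolean busyTimeEnabled) {
        // Time that was not busy: idle plus back pressured.
        double notBusyMs = idleMs + backPressuredMs;
        // Cap at 1000 so the result can never become negative.
        return busyTimeEnabled ? 1000.0 - Math.min(notBusyMs, 1000.0) : Double.NaN;
    }
}
```

With 300 ms idle and 200 ms back pressured, the busy time is 500 ms, so idle and busy alone add up to only 800. With 400 ms idle and no back pressure, the busy time is 600 ms and the two add up to 1000.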

if (!recordWriter.isAvailable()) {
    timer = ioMetrics.getBackPressuredTimePerSecond();
    resumeFuture = recordWriter.getAvailableFuture();
} else {
    timer = ioMetrics.getIdleTimeMsPerSecond();
    resumeFuture = inputProcessor.getAvailableFuture();
}
assertNoException(
        resumeFuture.thenRun(
                new ResumeWrapper(controller.suspendDefaultAction(timer), timer)));

Here StreamTask picks a timer, backPressuredTimePerSecond when the record writer is unavailable and idleTimePerSecond otherwise, and then uses it in two places: as the argument of suspendDefaultAction and as an argument of the ResumeWrapper constructor. Let's look at what ResumeWrapper does with it first.

private static class ResumeWrapper implements Runnable {
    private final Suspension suspendedDefaultAction;
    private final TimerGauge timer;

    public ResumeWrapper(Suspension suspendedDefaultAction, TimerGauge timer) {
        this.suspendedDefaultAction = suspendedDefaultAction;
        timer.markStart();
        this.timer = timer;
    }

    @Override
    public void run() {
        timer.markEnd();
        suspendedDefaultAction.resume();
    }
}

The constructor calls markStart, and run calls markEnd. In other words, timing starts as soon as the object is created and stops at the beginning of run, which executes once resumeFuture completes.
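The start-in-constructor, stop-in-run pattern can be tried in isolation. The sketch below is a simplified stand-in, not Flink code: WaitTimingSketch, the injected clock, and the AtomicLong accumulator are all assumptions made for illustration, and the timer is reduced to a single start timestamp.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical stand-in for ResumeWrapper: measures the time between creation and run().
final class WaitTimingSketch implements Runnable {
    private final LongSupplier clock;  // assumed time source, in milliseconds
    private final AtomicLong waitedMs; // accumulated waiting time
    private final Runnable resume;     // what to do once waiting is over
    private final long startMs;

    WaitTimingSketch(LongSupplier clock, AtomicLong waitedMs, Runnable resume) {
        this.clock = clock;
        this.waitedMs = waitedMs;
        this.resume = resume;
        this.startMs = clock.getAsLong(); // plays the role of timer.markStart()
    }

    @Override
    public void run() {
        waitedMs.addAndGet(clock.getAsLong() - startMs); // plays the role of timer.markEnd()
        resume.run();
    }
}
```

Passing an instance to CompletableFuture.thenRun reproduces the shape of the StreamTask code: the wait is measured from the moment the wrapper is created until the future completes and run is invoked.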

Next, let's look at the specific implementation of suspendDefaultAction:

@Override
public MailboxDefaultAction.Suspension suspendDefaultAction(
        TimerGauge suspensionIdleTimer) {
    return mailboxProcessor.suspendDefaultAction(suspensionIdleTimer);
}

This suspendDefaultAction method simply delegates to mailboxProcessor.suspendDefaultAction. Its implementation, together with two related helper methods, is as follows:

private void maybePauseIdleTimer() {
    if (suspendedDefaultAction != null && suspendedDefaultAction.suspensionTimer != null) {
        suspendedDefaultAction.suspensionTimer.markEnd();
    }
}

private void maybeRestartIdleTimer() {
    if (suspendedDefaultAction != null && suspendedDefaultAction.suspensionTimer != null) {
        suspendedDefaultAction.suspensionTimer.markStart();
    }
}

/**
 * Calling this method signals that the mailbox-thread should (temporarily) stop invoking the
 * default action, e.g. because there is currently no input available.
 */
private MailboxDefaultAction.Suspension suspendDefaultAction(
        @Nullable TimerGauge suspensionTimer) {

    checkState(
            mailbox.isMailboxThread(),
            "Suspending must only be called from the mailbox thread!");

    checkState(suspendedDefaultAction == null, "Default action has already been suspended");
    if (suspendedDefaultAction == null) {
        suspendedDefaultAction = new DefaultActionSuspension(suspensionTimer);
        ensureControlFlowSignalCheck();
    }

    return suspendedDefaultAction;
}

As you can see, mailboxProcessor.suspendDefaultAction wraps the timer it receives in a DefaultActionSuspension and stores it in the field suspendedDefaultAction. That field is then used by maybePauseIdleTimer and maybeRestartIdleTimer: maybePauseIdleTimer pauses the timing and maybeRestartIdleTimer restarts it. The two methods are called as follows:

private boolean processMailsWhenDefaultActionUnavailable() throws Exception {
    boolean processedSomething = false;
    Optional<Mail> maybeMail;
    while (isDefaultActionUnavailable() && isNextLoopPossible()) {
        maybeMail = mailbox.tryTake(MIN_PRIORITY);
        if (!maybeMail.isPresent()) {
            maybeMail = Optional.of(mailbox.take(MIN_PRIORITY));
        }
        maybePauseIdleTimer();
        maybeMail.get().run();
        maybeRestartIdleTimer();
        processedSomething = true;
    }
    return processedSomething;
}

private boolean processMailsNonBlocking(boolean singleStep) throws Exception {
    long processedMails = 0;
    Optional<Mail> maybeMail;

    while (isNextLoopPossible() && (maybeMail = mailbox.tryTakeFromBatch()).isPresent()) {
        if (processedMails++ == 0) {
            maybePauseIdleTimer();
        }
        maybeMail.get().run();
        if (singleStep) {
            break;
        }
    }
    if (processedMails > 0) {
        maybeRestartIdleTimer();
        return true;
    } else {
        return false;
    }
}

Both methods pause the timer before mail is run and restart it after the mail has been processed, so the time spent processing mail is not counted as idle time.
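The effect of pausing and restarting can be checked with a small simulation. This is only a sketch under simplifying assumptions: MailLoopSketch, IdleTimer, and drain are hypothetical names, the mailbox is a plain Queue, and the clock is injected so that time can be advanced by hand.

```java
import java.util.Queue;
import java.util.function.LongSupplier;

// Hypothetical simulation, not Flink code: mail run time is excluded from idle time.
final class MailLoopSketch {
    /** Minimal stand-in for the suspension timer. */
    static final class IdleTimer {
        private final LongSupplier clock; // assumed time source, in milliseconds
        private long startMs = -1;        // -1 means "not running"
        private long idleMs;

        IdleTimer(LongSupplier clock) {
            this.clock = clock;
        }

        void markStart() {
            if (startMs < 0) {
                startMs = clock.getAsLong();
            }
        }

        void markEnd() {
            if (startMs >= 0) {
                idleMs += clock.getAsLong() - startMs;
                startMs = -1;
            }
        }

        long idleMs() {
            return idleMs;
        }
    }

    /** Runs all queued mails, pausing the idle timer while each one executes. */
    static void drain(Queue<Runnable> mails, IdleTimer timer) {
        Runnable mail;
        while ((mail = mails.poll()) != null) {
            timer.markEnd();   // corresponds to maybePauseIdleTimer()
            mail.run();
            timer.markStart(); // corresponds to maybeRestartIdleTimer()
        }
    }
}
```

If a task is suspended for 100 ms in total and spends 40 ms of that running a mail, the timer ends up with 60 ms of idle time.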

Note: to understand the specific design and implementation of MailBox, please refer to this article: Flink StreamTask thread model based on MailBox implementation

Now that we know how idleTimePerSecond is used, let's look at how it is implemented. idleTimePerSecond is declared as a TimerGauge, so here are the relevant parts of the TimerGauge class:

public TimerGauge() {
    this(SystemClock.getInstance());
}

public TimerGauge(Clock clock) {
    this.clock = clock;
}

public synchronized void markStart() {
    if (currentMeasurementStart == 0) {
        currentMeasurementStart = clock.absoluteTimeMillis();
    }
}

public synchronized void markEnd() {
    if (currentMeasurementStart != 0) {
        currentCount += clock.absoluteTimeMillis() - currentMeasurementStart;
        currentMeasurementStart = 0;
    }
}

idleTimePerSecond is created with the no-argument TimerGauge() constructor, so its clock is SystemClock, whose absoluteTimeMillis() method returns the current timestamp. When timing starts, the current timestamp is saved in currentMeasurementStart. When timing stops, the timestamp is read again, currentMeasurementStart is subtracted from it, the difference is added to currentCount with +=, and currentMeasurementStart is reset to 0. Through repeated start and end calls, the time the task spends idle (or back pressured) is accumulated, and from it the task's busy time and load can be derived.
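The excerpt above does not show how the accumulated currentCount becomes a per-second value. The sketch below is one plausible way to do it, assuming a periodic update step that averages the count over the update interval and clamps it to the range 0 to 1000. The class name TimerGaugeSketch, the update(intervalSeconds) signature, and the injected clock are assumptions for illustration, not a copy of Flink's class.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of a timer gauge with an assumed periodic update step.
final class TimerGaugeSketch {
    private final LongSupplier clock;     // assumed time source, in milliseconds
    private long currentMeasurementStart; // 0 means "no measurement running"
    private long currentCount;            // ms accumulated since the last update
    private long lastValue;               // last reported ms-per-second value

    TimerGaugeSketch(LongSupplier clock) {
        this.clock = clock;
    }

    synchronized void markStart() {
        if (currentMeasurementStart == 0) {
            currentMeasurementStart = clock.getAsLong();
        }
    }

    synchronized void markEnd() {
        if (currentMeasurementStart != 0) {
            currentCount += clock.getAsLong() - currentMeasurementStart;
            currentMeasurementStart = 0;
        }
    }

    /** Assumed periodic step: fold in a running measurement, then average per second. */
    synchronized void update(long intervalSeconds) {
        if (currentMeasurementStart != 0) {
            long now = clock.getAsLong();
            currentCount += now - currentMeasurementStart;
            currentMeasurementStart = now; // keep measuring into the next interval
        }
        lastValue = Math.max(0, Math.min(currentCount / intervalSeconds, 1000));
        currentCount = 0;
    }

    synchronized long getValue() {
        return lastValue;
    }
}
```

For example, 2000 ms of measured time within a 5 second interval gives a value of 400 ms per second. Because 0 is used as the "not running" marker, as in the excerpt, a clock that returns exactly 0 would be treated as if no measurement had started.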


Posted on Fri, 03 Sep 2021 20:47:51 -0400 by phphelpme