Getting started with Flink: state updates

The requirement is to count how many times each line is clicked per hour and to write the statistics to Redis every 30 seconds.
A sliding window does the counting, but there is a problem when a key has no new messages:
even though the value of the (key, value) pair has not changed, the sliding window still outputs the count every 30 seconds, and an unchanged value does not need to be written to Redis again.
To avoid this, a ValueState is used to remember the last value for each key, so that only changed values are passed on.

The code is as follows:

        DataStream<Tuple2<String, Integer>> result = exposure.map(new MapFunction<PlanDetailBO, Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> map(PlanDetailBO value) throws Exception {
                String durationData = value.getFirstFromDate();
                String firstTrafficCode = value.getFirstNo();
                String secondTrafficCode = value.getSecondNo();
                String startStationCode = value.getFirstFromStationCode();
                String transferArriveStationCode = value.getFirstToStationCode();
                String transferLeaveStationCode = value.getSecondFromStationCode();
                String endStationCode = value.getSecondToStationCode();
                String key = String.format("%s+%s+%s+%s+%s+%s+%s", durationData, firstTrafficCode, secondTrafficCode, startStationCode, transferArriveStationCode, transferLeaveStationCode, endStationCode);
                return Tuple2.of(key, 1);
            }
        }).keyBy(s -> s.f0).window(SlidingEventTimeWindows.of(Time.seconds(60*60), Time.seconds(30))).sum(1);
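
To see why a key keeps producing output long after its last click, it helps to count how many windows one event falls into. The following is a minimal plain-Java sketch of the window assignment arithmetic, not Flink's own code; the class and method names are made up for illustration, and it assumes non-negative timestamps and no window offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: mimics how a sliding window
// assigner maps one event timestamp to all windows that contain it.
class SlidingWindowSketch {
    static final long SIZE_MS = 60 * 60 * 1000L; // window size: 1 hour
    static final long SLIDE_MS = 30 * 1000L;     // slide: 30 seconds

    /** Returns the start timestamps of every window [start, start + SIZE_MS) containing the event. */
    static List<Long> windowStarts(long timestampMs) {
        List<Long> starts = new ArrayList<>();
        // Latest window start that is not after the event, aligned to the slide.
        long lastStart = timestampMs - (timestampMs % SLIDE_MS);
        // Walk backwards while the window still covers the event.
        for (long start = lastStart; start > timestampMs - SIZE_MS; start -= SLIDE_MS) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        List<Long> starts = windowStarts(7_212_345L);
        System.out.println("windows containing the event: " + starts.size());
        System.out.println("latest window start:   " + starts.get(0));
        System.out.println("earliest window start: " + starts.get(starts.size() - 1));
    }
}
```

With a 1-hour size and a 30-second slide, every event belongs to 3600 / 30 = 120 windows, so a single click can keep its key's count appearing in the output for up to an hour.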

Result: the window covers the past hour and fires every 30 seconds. Whether or not there are new messages under a key, its count is output every 30 seconds.
So a ValueState is defined to track the last value and emit only the changes:

        DataStream<Tuple2<String, Integer>> updateResult = result.keyBy(s -> s.f0).map(new RichMapFunction<Tuple2<String, Integer>, Tuple2<String, Integer>>() {
            // Last count seen for the current key
            private transient ValueState<Integer> counts;
            @Override
            public void open(Configuration parameters) throws Exception {
                super.open(parameters);
                // Set the TTL of the ValueState to 1 hour, refreshed on every write; expired contents are never returned and are cleared automatically
                StateTtlConfig ttlConfig = StateTtlConfig.newBuilder(org.apache.flink.api.common.time.Time.minutes(60))
                        .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                        .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                        .build();
                // Register the ValueState; no default value is set, so value() returns null until the first update
                ValueStateDescriptor<Integer> descriptor = new ValueStateDescriptor<>("plan_num", Integer.class);
                descriptor.enableTimeToLive(ttlConfig);
                counts = getRuntimeContext().getState(descriptor);
            }

            @Override
            public Tuple2<String, Integer> map(Tuple2<String, Integer> value) throws Exception {
                Integer num = value.f1;
                String key = value.f0 + "status";
                // Output null if the statistic has not changed.
                // equals() is used because == on Integer objects compares references, not values.
                if (num.equals(counts.value())) {
                    return Tuple2.of(key, null);
                }
                counts.update(num);
                return Tuple2.of(key, num);
            }
        }).filter(s -> s.f1 != null); // Filter out nulls so that only changed values are output

        updateResult.print();
        // Write to Redis (omitted)

This completes the update: a key's count is passed on to Redis only when it has changed.
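
The change-detection step can also be tried outside Flink. The sketch below is a plain-Java stand-in under stated assumptions: a HashMap plays the role of the keyed ValueState, there is no TTL, and the class and method names are hypothetical. It compares counts with Objects.equals, because == on boxed Integer values compares object references and is only reliable for small cached values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical stand-in for the keyed state logic; not a Flink class.
class ChangeFilterSketch {
    // One remembered count per key, playing the role of ValueState<Integer>.
    private final Map<String, Integer> lastCount = new HashMap<>();

    /** Returns the count if it differs from the last one stored for this key, otherwise empty. */
    Optional<Integer> emitIfChanged(String key, Integer count) {
        Integer previous = lastCount.get(key); // null when the key is seen for the first time
        if (Objects.equals(previous, count)) {
            return Optional.empty();           // unchanged: nothing to write
        }
        lastCount.put(key, count);
        return Optional.of(count);
    }

    public static void main(String[] args) {
        ChangeFilterSketch filter = new ChangeFilterSketch();
        System.out.println(filter.emitIfChanged("k", 200)); // first value for the key: emitted
        System.out.println(filter.emitIfChanged("k", 200)); // same value again: suppressed
        System.out.println(filter.emitIfChanged("k", 201)); // changed value: emitted
    }
}
```

The counts in the example are deliberately above 127, the range where a reference comparison of two boxed values is most likely to go wrong.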

Tags: Redis Apache

Posted on Wed, 06 Nov 2019 09:43:47 -0500 by rashpal