Installation of a Kafka Cluster (Traditional & Docker) & Spring Boot Integration of Kafka

Contents

1. What is Kafka

2. Installation of Kafka

Traditional way

Docker installation

3. Integrating Spring Boot

General Mode Consumption

Producer Callback Mode

Kafka Transactions

Consumer Bulk Consumption Message

Consumer Manual Confirmation

Specified consumption

Specify a custom partitioner

Consumer side exception handling

Message Filter

1. What is Kafka

* Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the action-stream data of a consumer-scale website. These actions (web browsing, searches, and other user activities) are a key ingredient of many social features on the modern web. Because of the throughput requirements, this data is usually handled by log processing and log aggregation. For log data and offline analysis systems such as Hadoop, which nevertheless have real-time processing constraints, Kafka is a feasible solution. Kafka's goal is to unify online and offline message processing through Hadoop's parallel loading mechanism, and to provide real-time messages across a cluster. - From Baidu Encyclopedia

2. Installation of Kafka

Traditional way

Download address: Apache Download Mirrors

Installation environment:

1. Java 8+. Reference: Install JDK 17 & JDK 8 on a Linux System - Well-done Snail Blog - CSDN Blog

2. Install ZooKeeper. Reference: Build a ZooKeeper 3.7.0 Cluster (Traditional & Docker) - Well-rounded Snail Blog - CSDN Blog

3. Unzip Files

[root@localhost ~]# tar -zxvf kafka_2.13-3.0.0.tgz 

4. Move to /usr/local/kafka

[root@localhost ~]# mv kafka_2.13-3.0.0 /usr/local/kafka

5. Modify the Kafka configuration file

[root@localhost config]# vim server.properties 

broker1

broker.id=0
#Listener
listeners=PLAINTEXT://192.168.139.155:9092
#ZooKeeper address
zookeeper.connect=192.168.139.155:2181,192.168.139.155:2182,192.168.139.155:2183

broker2

broker.id=1
#Listener
listeners=PLAINTEXT://192.168.139.156:9092
#ZooKeeper address
zookeeper.connect=192.168.139.155:2181,192.168.139.155:2182,192.168.139.155:2183

broker3

broker.id=2
#Listener
listeners=PLAINTEXT://192.168.139.157:9094
#ZooKeeper address
zookeeper.connect=192.168.139.155:2181,192.168.139.155:2182,192.168.139.155:2183

6. Start Kafka on each broker

[root@localhost kafka]# ./bin/kafka-server-start.sh -daemon config/server.properties

7. Create a topic on one of the brokers

[root@localhost kafka]# ./bin/kafka-topics.sh --bootstrap-server 192.168.139.155:9092 --create --topic test-topic --partitions 3   --replication-factor 3 

The ZooKeeper visualization tool shows that the partitions have been created.
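You can also verify the topic from the command line; the --describe option ships with the same Kafka distribution used above:

[root@localhost kafka]# ./bin/kafka-topics.sh --bootstrap-server 192.168.139.155:9092 --describe --topic test-topic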

8. Testing

Send messages

[root@localhost kafka]# ./bin/kafka-console-producer.sh --topic test-topic --bootstrap-server 192.168.139.155:9092

Consume the messages on another broker

[root@localhost kafka]# ./bin/kafka-console-consumer.sh --topic test-topic --from-beginning --bootstrap-server 192.168.139.156:9092

Docker installation

1. Pull the image

[root@localhost ~]# docker pull wurstmeister/kafka 

2. Installation

Broker1

docker run -d --name kafka1 \
-p 9092:9092 \
-e KAFKA_BROKER_ID=1 \
-e KAFKA_ZOOKEEPER_CONNECT=192.168.139.155:2181,192.168.139.155:2182,192.168.139.155:2183 \
-e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://192.168.139.155:9092 \
-e KAFKA_LISTENERS=PLAINTEXT://0.0.0.0:9092 -t wurstmeister/kafka

Broker2

docker run -d --name kafka2 \
-p 9093:9093 \
-e KAFKA_BROKER_ID=2 \
-e KAFKA_ZOOKEEPER_CONNECT=192.168.139.155:2181,192.168.139.155:2182,192.168.139.155:2183 \
-e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://192.168.139.155:9093 \
-e KAFKA_LISTENERS=PLAINTEXT://0.0.0.0:9093 -t wurstmeister/kafka

Broker3

docker run -d --name kafka3 \
-p 9094:9094 \
-e KAFKA_BROKER_ID=3 \
-e KAFKA_ZOOKEEPER_CONNECT=192.168.139.155:2181,192.168.139.155:2182,192.168.139.155:2183 \
-e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://192.168.139.155:9094 \
-e KAFKA_LISTENERS=PLAINTEXT://0.0.0.0:9094 -t wurstmeister/kafka

3. Testing

Enter the container

[root@bogon ~]# docker container exec -it 2d4be3823f16 /bin/bash 

Enter the /opt/kafka_2.13-2.7.1/bin directory

Create a topic

bash-5.1# ./kafka-topics.sh --bootstrap-server 192.168.139.155:9092 --create --topic my-topic --partitions 3   --replication-factor 3 

Produce messages

bash-5.1# ./kafka-console-producer.sh --topic my-topic --bootstrap-server 192.168.139.155:9092 

To consume, enter another container and run the console consumer

bash-5.1# ./kafka-console-consumer.sh --topic my-topic --from-beginning --bootstrap-server 192.168.139.155:9093

3. Integrating Spring Boot

General Mode Consumption

By default, offsets are committed automatically. This can be changed with the following consumer property:

enable-auto-commit: false 
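For reference, a minimal application.yml sketch that matches the examples below; the broker addresses, group id, and String (de)serializers are assumptions, adjust them to your own cluster:

spring:
  kafka:
    # assumed broker list; point this at your own cluster
    bootstrap-servers: 192.168.139.155:9092,192.168.139.156:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
    consumer:
      group-id: xiaojie_group
      enable-auto-commit: false
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer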

Producer
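The producer snippets below assume a KafkaTemplate injected along these lines (field name and generic types are assumptions):

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;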

    public void sendMsg(){
        System.out.println(">>>>>>>>>>>>>>>>>");
        for (int i=0;i<5;i++){
            kafkaTemplate.send("xiaojie-topic","test message>>>>>>>>>>>>>>>>>>>>>>"+i);
        }
    }

Consumer

  @KafkaListener(groupId = "xiaojie_group",topics = {"xiaojie-topic"})
    public void onMessage(ConsumerRecord<?, ?> record) {
        log.info("Consumer Theme>>>>>>{},Consumption zoning>>>>>>>>{},Consumption offset>>>>>{},Message Content>>>>>{}",
                record.topic(), record.partition(), record.offset(), record.value());
    }

Producer Callback Mode

The producer callback lets you confirm whether a message was successfully delivered to the broker; on failure you can retry or compensate manually to make sure the message reaches the broker. There are two ways to register the callback.

Mode 1:

    public void sendMsgCallback(String callbackMessage){
        kafkaTemplate.send("callback-topic","xiaojie_key",callbackMessage).addCallback(success -> {
        //Callback function when message is successfully sent
            // Topics to which messages are sent
            String topic = success.getRecordMetadata().topic();
            // Partitions to which messages are sent
            int partition = success.getRecordMetadata().partition();
            // offset of message in partition
            long offset = success.getRecordMetadata().offset();
            System.out.println("Send message successfully>>>>>>>>>>>>>>>>>>>>>>>>>" + topic + "-" + partition + "-" + offset);
        }, failure -> {
            //Callback function for message sending failure
            System.out.println("Message sending failed and can be compensated manually");
        });
    }

Mode 2

public void sendMsgCallback1(String callbackMessage){
        kafkaTemplate.send("callback-topic","xiaojie_key",callbackMessage).addCallback(new ListenableFutureCallback<SendResult<String, String>>() {
            @Override
            public void onFailure(Throwable ex) {
                //fail in send
                System.out.println("Sending failed.");
            }
            @Override
            public void onSuccess(SendResult<String, String> result) {
                //Partition Information
                Integer partition = result.getRecordMetadata().partition();
                //theme
                String topic=result.getProducerRecord().topic();
                String key=result.getProducerRecord().key();
                //Send Successfully
                System.out.println("The sending was successful......... The partition is:"+partition+",theme topic:"+topic+",key:"+key);
            }
        });
    }

Kafka Transactions

Application Scenarios

  1. The simplest requirement is that several messages from a producer form one transaction: they must become visible, or stay invisible, to consumers at the same time.
  2. A producer may send messages to multiple topics and multiple partitions, and these messages also need to fit into a single transaction, which makes this a typical distributed transaction.
  3. Kafka applications often consume from one topic, process the data, and send the result to another topic. This consume-transform-produce flow needs to run inside one transaction; for example, the consumer offset must not be committed if processing or sending fails.
  4. The application hosting the producer may crash, and a new producer needs to know how to handle the previously unfinished transactions after it starts.
  5. A streaming topology may be deep; if a downstream stage can only read a message after the upstream transaction commits, end-to-end latency grows and throughput drops significantly, so both read_committed and read_uncommitted isolation levels need to be supported.

Enable transaction management by setting spring.kafka.producer.transaction-id-prefix: tx

Note: when transactions are enabled, retries cannot be 0 and acks must be -1 (all).
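Put together, the producer part of application.yml might look like this (a sketch; the prefix tx and the retry count are illustrative values):

spring:
  kafka:
    producer:
      transaction-id-prefix: tx   # enables Kafka transactions for KafkaTemplate
      retries: 3                  # must not be 0 when transactions are enabled
      acks: all                   # must be all (-1) when transactions are enabled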

    /**
     * @description: Kafka local transaction; executeInTransaction does not require an external transaction manager
     * @param:
     * @return: void
     * @author xiaojie
     * @date: 2021/10/14 21:35
     */
    public void sendTx(){
        kafkaTemplate.executeInTransaction(kafkaOperations -> {
            String msg="This is data for a test transaction......";
            kafkaOperations.send("tx-topic",msg);
            int i=1/0; //The exception aborts the transaction, so the message never becomes visible on the broker
            return null;
        });
    }
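Besides executeInTransaction, spring-kafka also supports declarative transactions: once transaction-id-prefix is set, Spring Boot auto-configures a KafkaTransactionManager, so a sketch like the following (method name and messages are purely illustrative) should behave the same way:

    @Transactional("kafkaTransactionManager")
    public void sendTxDeclarative() {
        kafkaTemplate.send("tx-topic", "first message of the transaction");
        kafkaTemplate.send("tx-topic", "second message of the transaction");
        //an exception thrown here aborts the transaction, so neither message
        //becomes visible to consumers using the read_committed isolation level
    }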

Consumer Bulk Consumption Message

When batch consumption is enabled for a topic, consumers receive the messages in batches instead of one record at a time.

Set spring.kafka.listener.type: batch to enable batch consumption.
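In application.yml this looks roughly like the following (max-poll-records is an optional, assumed knob to cap how many records arrive per batch):

spring:
  kafka:
    listener:
      type: batch            # hand records to the listener as a List
    consumer:
      max-poll-records: 50   # optional upper bound on records returned per poll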

    /**
     * @description: Bulk consumption message
     * @param:
     * @param: records
     * @return: void
     * @author xiaojie
     * @date: 2021/10/14 21:52
     */
    @KafkaListener(topics = "xiaojie-topic")
    public void batchOnMessage(List<ConsumerRecord<?, ?>> records) {
        for (ConsumerRecord<?, ?> record : records) {
            log.info("Bulk consumption message>>>>>>>>>>>>>>>>>{}", record.value());
        }
    }

Consumer Manual Confirmation

Unlike RabbitMQ, Kafka does not delete a message from the queue once it has been consumed. Kafka usually decides how long data is retained based on time. The default is configured with the log.retention.hours parameter, whose default value is 168 hours (one week). There are two related parameters, log.retention.minutes and log.retention.ms; all three decide after how long messages are deleted, and log.retention.ms is the recommended one. If more than one of them is specified, the parameter with the smaller unit takes precedence. Kafka tracks consumption by offset: if a consumer does not acknowledge (commit) its offset, it will resume from the last committed position the next time it consumes.
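For example, in the broker's server.properties (values are illustrative; 168 hours is also the default):

# keep messages for one week
log.retention.hours=168
# finer-grained alternative; if set, it takes precedence over the minutes/hours settings
#log.retention.ms=604800000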

Set the consumer's auto-commit to false: enable-auto-commit: false

Configure Factory Class

/*
    RECORD: commit the offset after each record has been processed by the listener
    BATCH: commit the offset after each batch of records returned by poll() has been processed (the default)
    TIME: commit after the configured ack time has elapsed
    COUNT: commit after the configured number of records has been processed
    COUNT_TIME: commit when either the time or the count condition is met
    MANUAL_IMMEDIATE: commit immediately when the listener calls Acknowledgment.acknowledge()
    MANUAL: like MANUAL_IMMEDIATE, but the ack is queued and committed together with the next batch commit
 */
@Bean("manualListenerContainerFactory")
    public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> manualListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        factory.getContainerProperties().setPollTimeout(1500);
        factory.setBatchListener(true); //Enable batch listening so the consumer receives records in batches
        //Configure manual submission offset
        factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
        return factory;
    }

Consumer

    /*
     * Consumer with manual offset commit
     * @param record the batch of records delivered to the listener
     * @param ack    used to manually acknowledge (commit) the offsets
     * containerFactory: the factory above that enables manual offset commits
     * errorHandler: the consumer exception handler
     * @author xiaojie
     * @date 2021/10/14
     * @return void
     */
    @KafkaListener(containerFactory = "manualListenerContainerFactory", topics = "xiaojie-topic",
            errorHandler = "consumerAwareListenerErrorHandler"
    )
    public void onMessageManual(List<ConsumerRecord<?, ?>> record, Acknowledgment ack) {
        for (int i=0;i<record.size();i++){
            System.out.println(record.get(i).value());
        }
        ack.acknowledge();//Manually commit the offset
    }

Specified consumption

  /**
     * @description: id: consumer ID;
     * groupId: consumer group ID;
     * topics: the topics to listen to; cannot be used together with topicPartitions;
     * topicPartitions: allows more detailed listening configuration; topic, partition and initial offset can all be specified.
     * @param:
     * @param: record
     * @return: void
     * @author xiaojie
     * @date: 2021/10/14 21:50
     */
    @KafkaListener(groupId = "xiaojie_group",topicPartitions = {
            @TopicPartition(topic = "test-topic", partitions = {"1"}),
            @TopicPartition(topic = "xiaojie-test-topic", partitions = {"1"},
                    partitionOffsets = @PartitionOffset(partition = "2", initialOffset = "15"))
    })
    public void onMessage1(ConsumerRecord<?, ?> record) {
        //Specify the topic, partition and starting offset to consume
        //Consumes partition 1 of test-topic, and partitions 1 and 2 of xiaojie-test-topic, with partition 2 consumed starting from offset 15
        log.info("Consumer Theme>>>>>>:{},Consumption zoning>>>>>>>>:{},Consumption offset>>>>>:{},Message Content>>>>>:{}",
                record.topic(), record.partition(), record.offset(), record.value());
    }

Specify a custom partitioner

We know that each topic in Kafka is divided into multiple partitions, so which partition does a message get appended to when the producer sends it to the topic? This is decided by the partition strategy. Kafka provides a default partition strategy and also supports custom strategies. The routing rules are:
1. If a partition is specified when the message is sent (i.e. a custom partition strategy), the message is appended directly to that partition;
2. If no partition is specified but a key is (Kafka allows one key per message), the key is hashed and the message is routed to a partition based on the result. In this case all messages with the same key are guaranteed to land in the same partition, which can be used to achieve in-order consumption;
3. If neither a partition nor a key is specified, Kafka's default partition strategy picks a partition in a round-robin fashion.

package com.xiaojie.config;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.springframework.stereotype.Component;

import java.util.Map;

/**
 * @Description: Custom partitioner. Each topic in Kafka is divided into multiple partitions, and the partition strategy
 * decides which partition a message is appended to when the producer sends it. Kafka provides a default strategy and also
 * supports custom ones. The routing rules are:
 * If a partition is specified when the message is sent, the message is appended directly to that partition;
 * If no partition is specified but a key is (Kafka allows one key per message), the key is hashed and the message is routed to a partition based on the result.
 * In this case all messages with the same key land in the same partition, which can be used to achieve in-order consumption;
 * If neither a partition nor a key is specified, Kafka's default strategy picks a partition in a round-robin fashion.
 * @author: xiaojie
 * @date: 2021.10.14
 */
@Component
public class CustomizePartitioner implements Partitioner {
    @Override
    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        //Decide the partition: key "weixin" on topic "test-topic" goes to partition 1, everything else to partition 0
        System.out.println("key>>>>>>>>>>>>>"+key);
        if ("weixin".equals(key)&&"test-topic".equals(topic)){
            return 1;
        }
        return 0;
    }

    @Override
    public void close() {

    }

    @Override
    public void configure(Map<String, ?> configs) {

    }
}
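To make the producer actually use this class, one option is to pass it through as a raw producer property in application.yml (a sketch; the property pass-through map is standard Spring Boot, and the class name matches the example above):

spring:
  kafka:
    producer:
      properties:
        partitioner.class: com.xiaojie.config.CustomizePartitioner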

Consumer side exception handling

package com.xiaojie.config;

import org.springframework.context.annotation.Bean;
import org.springframework.kafka.listener.ConsumerAwareListenerErrorHandler;
import org.springframework.stereotype.Component;

/**
 * @author xiaojie
 * @version 1.0
 * @description: With an exception handler we can handle exceptions thrown while the consumer is consuming messages.
 * Reference this handler's bean name in the errorHandler attribute of the @KafkaListener annotation.
 * @date 2021/10/14 21:56
 */
@Component
public class MyErrorHandler {
    @Bean
    ConsumerAwareListenerErrorHandler consumerAwareListenerErrorHandler(){
        return (message, e, consumer) -> {
            System.out.println("Message consumption exception"+message.getPayload());
            System.out.println("Exception Information>>>>>>>>>>>>>>>>>"+e);
            return null;
        };
    }
}

Usage

    /*
     * Consumer with manual offset commit
     * @param record the batch of records delivered to the listener
     * @param ack    used to manually acknowledge (commit) the offsets
     * containerFactory: the factory above that enables manual offset commits
     * errorHandler: the consumer exception handler
     * @author xiaojie
     * @date 2021/10/14
     * @return void
     */
    @KafkaListener(containerFactory = "manualListenerContainerFactory", topics = "xiaojie-topic",
            errorHandler = "consumerAwareListenerErrorHandler"
    )
    public void onMessageManual(List<ConsumerRecord<?, ?>> record, Acknowledgment ack) {
        for (int i=0;i<record.size();i++){
            System.out.println(record.get(i).value());
        }
        ack.acknowledge();//Manually commit the offset
    }

Message Filter

  @Bean("filterFactory")
    public ConcurrentKafkaListenerContainerFactory filterFactory(ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory factory = new ConcurrentKafkaListenerContainerFactory();
        factory.setConsumerFactory(consumerFactory);
        factory.setAckDiscarded(true);
        factory.setRecordFilterStrategy(consumerRecord -> {
            String value = (String) consumerRecord.value();
            if (value.contains("hello")) {
                //Return false: the message is not filtered out and will be delivered to the listener
                return false;
            }
            System.out.println("....................");
            //Return true: the message is filtered out and discarded
            return true;
        });
        return factory;
    }

Usage

  /** 
     * @description: Consumer filter
     * @param: 
     * @param: record
     * @return: void
     * @author xiaojie
     * @date: 2021/10/16 1:04
     */
    @KafkaListener(topics = "filter-topic",containerFactory = "filterFactory")
    public void filterOnmessage(ConsumerRecord<?,?> record){
        log.info("The consumption news is:{}",record.value());
    }

Reference resources: SpringBoot Integrated kafka Full Actual_Felix-CSDN Blog
