Spring Boot Kafka serialization with Avro

[requirement]: the producer serializes data with Avro and sends it to Kafka; the consumer deserializes it with Avro and stores the data in the database through MyBatis.

1, Pom

[1] Apache Avro 1.8; [2] Spring Kafka 1.2; [3] Spring Boot 1.5; [4] Maven 3.5

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.codenotfound</groupId>
  <artifactId>spring-kafka-avro</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <name>spring-kafka-avro</name>
  <description>Spring Kafka - Apache Avro Serializer Deserializer Example</description>
  <url>https://www.codenotfound.com/spring-kafka-apache-avro-serializer-deserializer-example.html</url>

  <parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>1.5.4.RELEASE</version>
  </parent>

  <properties>
    <java.version>1.8</java.version>

    <spring-kafka.version>1.2.2.RELEASE</spring-kafka.version>
    <avro.version>1.8.2</avro.version>

  </properties>

  <dependencies>
    <!-- spring-boot -->
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-test</artifactId>
      <scope>test</scope>
    </dependency>
    <!-- spring-kafka -->
    <dependency>
      <groupId>org.springframework.kafka</groupId>
      <artifactId>spring-kafka</artifactId>
      <version>${spring-kafka.version}</version>
    </dependency>
    <dependency>
      <groupId>org.springframework.kafka</groupId>
      <artifactId>spring-kafka-test</artifactId>
      <version>${spring-kafka.version}</version>
      <scope>test</scope>
    </dependency>
    <!-- avro -->
    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro</artifactId>
      <version>${avro.version}</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <!-- spring-boot-maven-plugin -->
      <plugin>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-maven-plugin</artifactId>
      </plugin>
      <!-- avro-maven-plugin -->
      <plugin>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-maven-plugin</artifactId>
        <version>${avro.version}</version>
        <executions>
          <execution>
            <phase>generate-sources</phase>
            <goals>
              <goal>schema</goal>
            </goals>
            <configuration>
              <sourceDirectory>${project.basedir}/src/main/resources/avro/</sourceDirectory>
              <outputDirectory>${project.build.directory}/generated/avro</outputDirectory>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

2, Avro schema file

[1] Avro relies on a schema defined in JSON and built from primitive types. This example adapts the User schema from the Apache Avro Getting Started guide, shown below. Schemas are stored in .avsc files under src/main/resources/avro; here the file is electronicsPackage.avsc. The namespace determines the package of the generated Java class, and name determines the name of the generated class and file.

{"namespace": "com.yd.cyber.protocol.avro",
 "type": "record",
 "name": "ElectronicsPackage",
 "fields": [
     {"name":"package_number","type":["string","null"],"default": null},
     {"name":"frs_site_code","type":["string","null"],"default": null},
     {"name":"frs_site_code_type","type":["string","null"],"default":null},
     {"name":"end_allocate_code","type":["string","null"],"default": null},
     {"name":"code_1","type":["string","null"],"default": null},
     {"name":"aggregat_package_code","type":["string","null"],"default": null}
    ]
}

[2] Avro ships with code generation, which creates Java classes from the schema defined above; once the classes are generated, the program no longer needs to work with the schema directly. Classes can be generated with avro-tools.jar or through Maven: the avro-maven-plugin configured in the pom binds its schema goal to the generate-sources phase, so running mvn generate-sources (or building from the IDE's Maven Projects panel) produces the ElectronicsPackage.java file under target/generated/avro/com/yd/cyber/protocol/avro/.

[3] The generated ElectronicsPackage.java class contains the schema together with builder methods for constructing ElectronicsPackage objects, as sketched below.
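As a quick illustration, the generated class can be used through its builder (a minimal sketch; the snake_case field names from the schema are mapped to CamelCase accessors by the Avro compiler, and the values are made up):

ElectronicsPackage pkg = ElectronicsPackage.newBuilder()
        .setPackageNumber("PKG-001")
        .setFrsSiteCode("SITE-A")
        .build();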

3, Generate Avro messages for Kafka topics

Kafka stores and transfers byte arrays in its topics, so when we work with Avro objects we need to convert to and from these byte arrays. Before version 0.9.0.0 the Kafka Java API handled this with implementations of the Encoder/Decoder interfaces; in the new API these have been replaced by implementations of the Serializer/Deserializer interfaces. Kafka ships with a number of built-in (de)serializers, but none for Avro. To fill the gap, we create an AvroSerializer class that implements the Serializer interface for Avro objects. Its serialize() method takes the topic name and a data object as input; here the object extends the Avro base class SpecificRecordBase. The method serializes the Avro object into a byte array and returns the result. The class is generic, so it can be configured once and reused for every record type.

package com.yd.cyber.web.avro;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;

import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumWriter;
import org.apache.avro.specific.SpecificRecordBase;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Serializer;

/**
 * Avro serializer, generic over any SpecificRecordBase subtype
 * @author zzx
 * @create 2020-03-11-19:17
 */
public class AvroSerializer<T extends SpecificRecordBase> implements Serializer<T> {
    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> arg0, boolean arg1) {}

    @Override
    public byte[] serialize(String topic, T data) {
        if(data == null) {
            return null;
        }
        DatumWriter<T> writer = new SpecificDatumWriter<>(data.getSchema());
        ByteArrayOutputStream byteArrayOutputStream  = new ByteArrayOutputStream();
        BinaryEncoder binaryEncoder  = EncoderFactory.get().directBinaryEncoder(byteArrayOutputStream , null);
        try {
            writer.write(data, binaryEncoder);
            binaryEncoder.flush();
            byteArrayOutputStream.close();
        }catch (IOException e) {
            throw new SerializationException(e.getMessage());
        }
        return byteArrayOutputStream.toByteArray();
    }
}

4, AvroConfig configuration class

The producer's Avro settings live in the AvroConfig configuration class, which wires in our custom serializer. This is done by setting the VALUE_SERIALIZER_CLASS_CONFIG property to the AvroSerializer class. We also change the generic types of ProducerFactory and KafkaTemplate to specify ElectronicsPackage instead of String. If several record types must be serialized, the configuration class needs a corresponding factory and template for each target type.

package com.yd.cyber.web.avro;

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;

import com.yd.cyber.protocol.avro.ElectronicsPackage;

/**
 * @author zzx
 * @create 2020-03-11-20:23
 */
@Configuration
@EnableKafka
public class AvroConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Value("${spring.kafka.producer.max-request-size}")
    private String maxRequestSize;

    @Bean
    public Map<String, Object> avroProducerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, maxRequestSize);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, AvroSerializer.class);
        return props;
    }

    @Bean
    public ProducerFactory<String, ElectronicsPackage> elProducerFactory() {
        return new DefaultKafkaProducerFactory<>(avroProducerConfigs());
    }

    @Bean
    public KafkaTemplate<String, ElectronicsPackage> elKafkaTemplate() {
        return new KafkaTemplate<>(elProducerFactory());
    }
}
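For reference, the two values injected with @Value above must exist in application.properties; a minimal sketch, with illustrative values rather than the original setup:

spring.kafka.bootstrap-servers=localhost:9092
spring.kafka.producer.max-request-size=2097152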

5, Send messages via KafkaTemplate

Finally, a controller calls the send() method of kafkaTemplate, passing an Avro ElectronicsPackage object as the message value. Notice that the KafkaTemplate field also uses the updated generic types.

package com.yd.cyber.web.controller.aggregation;

import com.yd.cyber.protocol.avro.ElectronicsPackage;
import com.yd.cyber.web.vo.ElectronicsPackageVO;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.BeanUtils;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import javax.annotation.Resource;

/**
 * <p>
 * ElectronicsPackage front-end controller
 * </p>
 *
 * @author zzx
 * @since 2020-04-19
 */
@RestController
@RequestMapping("/electronicsPackageTbl")
public class ElectronicsPackageController {

    //Log
    private static final Logger log = LoggerFactory.getLogger(ElectronicsPackageController.class);

    @Resource
    private KafkaTemplate<String,ElectronicsPackage> kafkaTemplate;

    @GetMapping("/push")
    public void push(){
        ElectronicsPackageVO electronicsPackageVO = new ElectronicsPackageVO();
        electronicsPackageVO.setElectId(9);
        electronicsPackageVO.setAggregatPackageCode("9");
        electronicsPackageVO.setCode1("9");
        electronicsPackageVO.setEndAllocateCode("9");
        electronicsPackageVO.setFrsSiteCodeType("9");
        electronicsPackageVO.setFrsSiteCode("9");
        electronicsPackageVO.setPackageNumber("9");
        ElectronicsPackage electronicsPackage = new ElectronicsPackage();
        BeanUtils.copyProperties(electronicsPackageVO,electronicsPackage);
        //send message
        kafkaTemplate.send("Electronics_Package",electronicsPackage);
        log.info("Electronics_Package TOPIC Sent successfully");
    }
}
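Note that send() is asynchronous: in Spring Kafka 1.2 it returns a ListenableFuture, so instead of logging success unconditionally, delivery can be confirmed with a callback. A hedged sketch, not part of the original controller:

kafkaTemplate.send("Electronics_Package", electronicsPackage).addCallback(
        result -> log.info("Sent to partition {}", result.getRecordMetadata().partition()),
        ex -> log.error("Send failed", ex));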

6, Consuming and deserializing Avro messages from the Kafka topic

Received messages need to be deserialized back into Avro objects. For this we create an AvroDeserializer class that implements the Deserializer interface. Its deserialize() method takes the topic name and a byte array as input and decodes them back into an Avro object. The schema used for decoding comes from the targetType class parameter, which must be passed to the AvroDeserializer constructor.

package com.yd.cyber.web.avro;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Map;

import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificRecordBase;
import org.apache.kafka.common.errors.SerializationException;
import org.apache.kafka.common.serialization.Deserializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.xml.bind.DatatypeConverter;

/**
 * Avro deserializer; the target class supplies the schema used for decoding
 * @author fuyx
 * @create 2020-03-12-15:19
 */
public class AvroDeserializer<T extends SpecificRecordBase> implements Deserializer<T> {
    //Log system
    private static final Logger LOGGER = LoggerFactory.getLogger(AvroDeserializer.class);

    protected final Class<T> targetType;

    public AvroDeserializer(Class<T> targetType) {
        this.targetType = targetType;
    }
    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> arg0, boolean arg1) {}

    @Override
    public T deserialize(String topic, byte[] data) {
        if (data == null) {
            return null;
        }
        try {
            LOGGER.debug("data='{}'", DatatypeConverter.printHexBinary(data));
            ByteArrayInputStream in = new ByteArrayInputStream(data);
            // Build the reader from the target class; its embedded schema drives decoding
            DatumReader<T> datumReader = new SpecificDatumReader<>(targetType);
            BinaryDecoder decoder = DecoderFactory.get().directBinaryDecoder(in, null);
            T result = datumReader.read(null, decoder);
            LOGGER.debug("deserialized data='{}'", result);
            return result;
        } catch (Exception ex) {
            throw new SerializationException(
                    "Can't deserialize data '" + Arrays.toString(data) + "' from topic '" + topic + "'", ex);
        }
    }
}
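A quick way to sanity-check the serializer and deserializer together is a round trip; a minimal sketch, assuming the generated ElectronicsPackage class from section 2:

AvroSerializer<ElectronicsPackage> serializer = new AvroSerializer<>();
AvroDeserializer<ElectronicsPackage> deserializer = new AvroDeserializer<>(ElectronicsPackage.class);

ElectronicsPackage original = ElectronicsPackage.newBuilder().setPackageNumber("PKG-001").build();
byte[] bytes = serializer.serialize("Electronics_Package", original);
ElectronicsPackage copy = deserializer.deserialize("Electronics_Package", bytes);
// copy should now equal original field-for-field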

7, Deserializer configuration class

I put the consumer (deserializer) configuration in the same AvroConfig class as the producer configuration; the consumer side is shown below with package and imports omitted. Here the VALUE_DESERIALIZER_CLASS_CONFIG property is set to the AvroDeserializer class, and the generic types of ConsumerFactory and ConcurrentKafkaListenerContainerFactory are changed to specify ElectronicsPackage instead of String. Because AvroDeserializer needs a Class<?> targetType to look up the schema, the DefaultKafkaConsumerFactory is created with a new AvroDeserializer taking ElectronicsPackage.class as its constructor argument; this is what lets the deserializer turn the consumed byte[] into the proper target object (the ElectronicsPackage class in this example).

@Configuration
@EnableKafka
public class AvroConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Value("${spring.kafka.producer.max-request-size}")
    private String maxRequestSize;


    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        // The factory below supplies a configured AvroDeserializer instance, which takes
        // precedence over this entry (AvroDeserializer has no no-arg constructor)
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, AvroDeserializer.class);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "avro");

        return props;
    }

    @Bean
    public ConsumerFactory<String, ElectronicsPackage> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs(), new StringDeserializer(),
                new AvroDeserializer<>(ElectronicsPackage.class));
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, ElectronicsPackage> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, ElectronicsPackage> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());

        return factory;
    }

}
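Because the bean is named kafkaListenerContainerFactory, which is the default factory name that @KafkaListener resolves, the listener in the next section needs no explicit containerFactory attribute. If the bean were named differently, it would have to be referenced explicitly (a hedged sketch):

@KafkaListener(topics = {"Electronics_Package"}, containerFactory = "kafkaListenerContainerFactory")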

8, Consuming messages

The consumer listens on the corresponding topic through @KafkaListener. One caveat: examples found online usually receive the concrete generated type directly, e.g. an ElectronicsPackage parameter, but whenever I tried that the error log kept reporting serialization problems, so I receive a GenericRecord instead, which the object produced by my deserializer implements without trouble. The received message is then stored in the database through MyBatis-Plus.

package com.zzx.cyber.web.controller.dataSource.intercompany;

import com.zzx.cyber.web.service.ElectronicsPackageService;
import com.zzx.cyber.web.vo.ElectronicsPackageVO;
import org.apache.avro.generic.GenericRecord;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.BeanUtils;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Controller;

import javax.annotation.Resource;

/**
 * @desc: consumes Electronics_Package messages and stores them via MyBatis-Plus
 * @author: zzx
 * @creatdate 2020/4/19 12:21
 */
@Controller
public class ElectronicsPackageConsumerController {

    //Log
    private static final Logger log  = LoggerFactory.getLogger(ElectronicsPackageConsumerController.class);

    //Service layer
    @Resource
    private ElectronicsPackageService electronicsPackageService;
    /**
     * Scan data test
     * @param genericRecord the deserialized Avro message
     */
    @KafkaListener(topics = {"Electronics_Package"})
    public void receive(GenericRecord genericRecord) throws Exception {
        log.info("Data reception: electronicsPackage = {}", genericRecord);
        //Map the received record onto the VO generated by MyBatis-Plus
        ElectronicsPackageVO electronicsPackageVO = new ElectronicsPackageVO();
        //Copy the received data
        BeanUtils.copyProperties(genericRecord, electronicsPackageVO);
        try {
            //Persist to the database
            log.info("Data warehousing");
            electronicsPackageService.save(electronicsPackageVO);
        } catch (Exception e) {
            throw new Exception("Insert exception: " + e.getMessage(), e);
        }
    }
}

Tags: Java Big Data kafka Spring Boot
