03 elastic log system - building Filebeat + Kafka + Logstash + Elasticsearch + Kibana 6.8.0

1. Introduction

An earlier article in this series used Redis as the intermediate cache and message queue to smooth out peaks in the log flow. While log volume was low this worked without any problem, but once the logs kept growing and the Redis queue kept piling up, trouble appeared: Redis is an in-memory database, so when Logstash cannot consume the queue in real time, memory usage keeps climbing until OOM and the system crashes. The rate at which Logstash consumes logs is also a factor. The fix described here is to replace the single-node Redis with a three-node Kafka cluster and to adjust the Elasticsearch startup parameters. The following only describes the Kafka-related configuration and the problems encountered; refer to the previous articles for the other configuration.

2. Preparations


2.1 Software versions

All Elastic components are installed from the 6.8.0 RPM packages
zookeeper: 3.4.14 (download command below)
kafka: 2.11-2.4.0 (download command below)

System version: CentOS Linux release 7.7.1908 (Core)

2.2 Log flow

Filebeat > Kafka cluster > Logstash > Elasticsearch cluster > Kibana

3. Configure zookeeper cluster

We run a zookeeper cluster separate from kafka, although the kafka package also ships with a zookeeper component. Reference: https://www.cnblogs.com/longBlogs/p/10340251.html

Configuration is described below

wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
tar -xvf zookeeper-3.4.14.tar.gz -C /usr/local
cd /usr/local
ln -sv zookeeper-3.4.14 zookeeper
cd zookeeper/conf
cp zoo_sample.cfg zoo.cfg
mkdir -pv /usr/local/zookeeper/{data,logs}

Edit the configuration file zoo.cfg (the file is identical on all three nodes)

# Specify the data folder and log folder (dataDir, dataLogDir)


# Each server.N entry lists two ports: the first (2888 by default) is used for communication between the leader and followers; the second (3888 by default) is used for leader election, both at cluster startup and when the current leader goes down
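Putting those settings together, a minimal zoo.cfg could look like the sketch below. The node addresses zk1/zk2/zk3 are placeholders and must be replaced with the real IPs of the three nodes:

```properties
tickTime=2000
initLimit=10
syncLimit=5
# data and transaction-log folders created earlier with mkdir
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs
clientPort=2181
# one entry per node: server.<myid>=<host>:<peer port>:<election port>
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
```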

Configure the node id

echo "1" > /usr/local/zookeeper/data/myid   # on node 1; must match the server.1 entry configured above
echo "2" > /usr/local/zookeeper/data/myid   # on node 2; each node is different, must match the server.2 entry configured above
echo "3" > /usr/local/zookeeper/data/myid   # on node 3; each node is different, must match the server.3 entry configured above

Start/stop zookeeper

# start
/usr/local/zookeeper/bin/zkServer.sh start
# stop
/usr/local/zookeeper/bin/zkServer.sh stop
# check status
/usr/local/zookeeper/bin/zkServer.sh status

Configure zookeeper service

# cd /usr/lib/systemd/system
# vim zookeeper.service

[Unit]
Description=zookeeper server daemon
After=network.target

[Service]
Type=forking
ExecStart=/usr/local/zookeeper/bin/zkServer.sh start
ExecReload=/usr/local/zookeeper/bin/zkServer.sh restart
ExecStop=/usr/local/zookeeper/bin/zkServer.sh stop

[Install]
WantedBy=multi-user.target


# systemctl start  zookeeper
# systemctl enable zookeeper

4. Configure kafka cluster

Download and install

wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.4.0/kafka_2.11-2.4.0.tgz
tar -xvf kafka_2.11-2.4.0.tgz  -C /usr/local
cd /usr/local
ln -sv kafka_2.11-2.4.0 kafka
cd kafka/config

Modify configuration

# vim server.properties
broker.id=1  # unique id of this broker in the cluster; an integer, different on each of the three nodes
host.name=  # new item, set to this node's IP (deprecated in newer versions in favour of listeners)
num.network.threads=3  # number of threads handling network requests
num.partitions=3  # default number of partitions per topic; more partitions allow more parallel operations
log.dirs=/var/log/kafka  # data (log segment) folder
log.retention.hours=168  # maximum retention time of a segment file in hours; older segments are deleted, i.e. data from more than 7 days ago is cleaned up
log.segment.bytes=1073741824  # size in bytes of each segment file, 1G by default
log.cleaner.enable=true  # enable log cleanup
zookeeper.connect=,,  # addresses of the zookeeper cluster; multiple entries are allowed

A Kafka node uses 1G of heap memory by default. To change it, edit kafka-server-start.sh,
find the KAFKA_HEAP_OPTS line, and modify it, for example:
export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"
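As an alternative sketch to editing the script: kafka-server-start.sh only applies its built-in default heap (-Xmx1G -Xms1G) when KAFKA_HEAP_OPTS is empty, so the variable can also be exported in the environment before starting the broker:

```shell
# kafka-server-start.sh sets its default heap only when KAFKA_HEAP_OPTS
# is unset, so an exported value takes precedence.
export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"
echo "$KAFKA_HEAP_OPTS"
# then start the broker as usual:
# ./bin/kafka-server-start.sh -daemon ./config/server.properties
```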

Start kafka

cd /usr/local/kafka
./bin/kafka-server-start.sh -daemon ./config/server.properties

Configure the kafka service

# cd /usr/lib/systemd/system
# vim kafka.service

[Unit]
Description=kafka server daemon
After=network.target zookeeper.service

[Service]
Type=forking
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
ExecReload=/bin/sh -c '/usr/local/kafka/bin/kafka-server-stop.sh && sleep 2 && /usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties'
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh

[Install]
WantedBy=multi-user.target


# systemctl start kafka
# systemctl enable kafka

Create a topic
Create it with 3 partitions and 3 replicas

cd /usr/local/kafka
./bin/kafka-topics.sh --create --zookeeper,, --replication-factor 3 --partitions 3 --topic java

Frequently used commands

1) Stop kafka
./bin/kafka-server-stop.sh

2) Create a topic
./bin/kafka-topics.sh --create --zookeeper,, --replication-factor 1 --partitions 1 --topic topic_name

Expand the partitions of a topic
./bin/kafka-topics.sh --zookeeper,, --alter --topic java --partitions 40

3) List topics
./bin/kafka-topics.sh --list --zookeeper,,

4) Describe a topic
./bin/kafka-topics.sh --describe --zookeeper,, --topic topic_name

5) Produce messages from the console
./bin/kafka-console-producer.sh --broker-list --topic topic_name

6) Consume messages from the console
./bin/kafka-console-consumer.sh --bootstrap-server,, --topic topic_name

7) Delete a topic
./bin/kafka-topics.sh --delete --topic topic_name --zookeeper,,

8) Inspect the __consumer_offsets topic (per-partition consumer offsets)
./bin/kafka-topics.sh --describe --zookeeper,, --topic __consumer_offsets

5. Configure filebeat output

For details, please refer to https://www.elastic.co/guide/en/beats/filebeat/current/kafka-output.html

output.kafka:
  enabled: true
  hosts: ["","",""]
  topic: java
  required_acks: 1
  compression: gzip
  max_message_bytes: 500000000  # maximum permitted size in bytes of a single event; larger events are dropped

Restart filebeat

systemctl restart filebeat

6. Configure logstash input

Install the kafka input plugin first

/usr/share/logstash/bin/logstash-plugin install logstash-input-kafka

Add a configuration file:

vim /etc/logstash/conf.d/kafka.conf
input {
    kafka {
        bootstrap_servers => ""
        group_id => "java"
        auto_offset_reset => "latest"
        consumer_threads => "5"
        decorate_events => "false"
        topics => ["java"]
        codec => json
    }
}

output {
    elasticsearch {
        hosts => ["","",""]
        user => "elastic"
        password => "changeme"
        index => "logs-other-%{+YYYY.MM.dd}"
        http_compression => true
    }
}
After adding, test the configuration file

/usr/share/logstash/bin/logstash -t -f  /etc/logstash/conf.d/kafka.conf

If the test passes, restart logstash

systemctl restart logstash

7. Possible problems

filebeat errors

  1. WARN producer/broker/0 maximum request accumulated, waiting for space
    Reference: https://linux.xiao5tech.com/bigdata/elk/elk_2.2.1_error_filebeat_kafka_waiting_for_space.html
    Reason: the configured max_message_bytes buffer value is too small

  2. dropping too large message of size
    Reference: https://www.cnblogs.com/zhaosc-haha/p/12133699.html
    Reason: a message exceeded the configured size limit. Reduce the log scan frequency, or check whether the log output is abnormal or unnecessary; oversized logs can seriously hurt kafka performance.
    Setting value: 10000000 (10MB)
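For reference, a sketch of where that 10MB limit would go if it is applied on the broker side (an assumption; the article does not say which side was changed). message.max.bytes and replica.fetch.max.bytes are standard kafka broker settings, and the replica fetch size must be at least as large as the biggest accepted message:

```properties
# kafka server.properties: raise the per-message size limit on the broker
message.max.bytes=10000000
# replicas must be able to fetch the largest accepted message
replica.fetch.max.bytes=10485760
```

On the filebeat side, max_message_bytes in the kafka output should stay at or below the broker's message.max.bytes, otherwise the broker rejects the oversized produce requests.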


Tags: kafka Zookeeper Java Redis

Posted on Sun, 19 Jan 2020 04:46:47 -0500 by XenoPhage