03 Elastic log system - Filebeat + Kafka + Logstash + Elasticsearch + Kibana 6.8.0 build process
- 1. Introduction
- 2. Preparations
- 3. Configure the ZooKeeper cluster
- 4. Configure the Kafka cluster
- 5. Configure the Filebeat output
- 6. Configure the Logstash input
- 7. Possible problems
1. Introduction
A previous article described using Redis as the intermediate cache and message queue to smooth out peaks in the log flow. With the small log volume of the early stage this worked without problems, but as the volume kept rising and the Redis queue kept piling up, trouble began: Redis is an in-memory database, and when logs cannot be consumed in real time its memory usage keeps growing until the system OOMs and crashes. The rate at which Logstash consumes logs is also a factor. The approach here is to replace the single-node Redis with a three-node Kafka cluster and adjust the Elasticsearch startup parameters. The following covers only the Kafka configuration and the problems encountered; refer to the previous articles for the other components.
2. Preparations
Nodes:
192.168.72.56
192.168.72.57
192.168.72.58
2.1 Software versions
All Elastic components are installed from the 6.8.0 RPM packages.
zookeeper: 3.4.14, download address
kafka: 2.11-2.4.0, download address
System version: CentOS Linux release 7.7.1908 (Core)
2.2 Log flow
Filebeat > Kafka cluster > Logstash > Elasticsearch cluster > Kibana
3. Configure the ZooKeeper cluster
We use a ZooKeeper cluster external to Kafka (the Kafka installation package also ships with its own ZooKeeper component). Reference: https://www.cnblogs.com/longBlogs/p/10340251.html
The configuration is described below.
```shell
wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
tar -xvf zookeeper-3.4.14.tar.gz -C /usr/local
cd /usr/local
ln -sv zookeeper-3.4.14 zookeeper
cd zookeeper/conf
cp zoo_sample.cfg zoo.cfg
mkdir -pv /usr/local/zookeeper/{data,logs}
```
On each node, edit the configuration file zoo.cfg:
```properties
# Specify the data and log folders
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs
clientPort=2181
server.1=192.168.72.56:2888:3888
server.2=192.168.72.57:2888:3888
server.3=192.168.72.58:2888:3888
# In server.N=ip:2888:3888, the first port (2888 by default) is used for
# communication between the leader and followers; the second port (3888 by
# default) is used for leader election, both at cluster startup and when a
# new leader must be elected after the current one goes down.
```
Configure the node id:
```shell
echo "1" > /usr/local/zookeeper/data/myid   # on server 1; must match server.1 configured above
echo "2" > /usr/local/zookeeper/data/myid   # on server 2; must match server.2 configured above
echo "3" > /usr/local/zookeeper/data/myid   # on server 3; must match server.3 configured above
```
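Because a mismatched myid silently breaks the cluster, the mapping can be scripted. A convenience sketch (the IP-to-id table comes from the node list above; the function name is made up for illustration):

```shell
# Map a node IP to its ZooKeeper id, per the server.N entries above.
myid_for_ip() {
    case "$1" in
        192.168.72.56) echo 1 ;;
        192.168.72.57) echo 2 ;;
        192.168.72.58) echo 3 ;;
        *) echo "unknown node: $1" >&2; return 1 ;;
    esac
}

myid_for_ip 192.168.72.57    # prints 2
# On a real node you would write the result to the myid file, e.g.:
#   myid_for_ip "$(hostname -I | awk '{print $1}')" > /usr/local/zookeeper/data/myid
```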
Start/stop ZooKeeper:
```shell
# Start
/usr/local/zookeeper/bin/zkServer.sh start
# Stop
/usr/local/zookeeper/bin/zkServer.sh stop
# Check status
/usr/local/zookeeper/bin/zkServer.sh status
```
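On a healthy cluster member, zkServer.sh status prints a Mode: line showing leader or follower. A minimal sketch that pulls that field out, run here against an illustrative sample string rather than live output:

```shell
# Extract the "Mode:" field from zkServer.sh status output.
zk_mode() {
    printf '%s\n' "$1" | awk -F': ' '/^Mode:/ {print $2}'
}

# Illustrative sample (not captured from this cluster):
sample='ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower'

zk_mode "$sample"    # prints: follower
```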
Configure the ZooKeeper systemd service:
Create /usr/lib/systemd/system/zookeeper.service:

```ini
[Unit]
Description=zookeeper server daemon
After=network.target

[Service]
Type=forking
ExecStart=/usr/local/zookeeper/bin/zkServer.sh start
ExecReload=/bin/bash -c '/usr/local/zookeeper/bin/zkServer.sh stop && sleep 2 && /usr/local/zookeeper/bin/zkServer.sh start'
ExecStop=/usr/local/zookeeper/bin/zkServer.sh stop
Restart=always

[Install]
WantedBy=multi-user.target
```

Then start and enable the service:

```shell
systemctl start zookeeper
systemctl enable zookeeper
```

4. Configure the Kafka cluster
Download and install
```shell
wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.4.0/kafka_2.11-2.4.0.tgz
tar -xvf kafka_2.11-2.4.0.tgz -C /usr/local
cd /usr/local
ln -sv kafka_2.11-2.4.0 kafka
cd kafka/config
```
Modify configuration
Edit server.properties:

```properties
broker.id=1                 # unique positive id of this broker in the cluster; different on each of the three nodes
host.name=192.168.72.56     # added entry: this node's IP
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/var/log/kafka     # log folder
num.partitions=3            # default number of partitions per topic; more partitions allow more parallel operations
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168     # maximum retention of a segment file in hours; expired segments are deleted, i.e. data older than 7 days is cleaned up
log.segment.bytes=1073741824       # size in bytes of each segment file, 1 GB by default
log.retention.check.interval.ms=300000
log.cleaner.enable=true     # enable log cleanup
zookeeper.connect=192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181   # ZooKeeper cluster addresses; several can be listed
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
```
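The only lines that differ between the three nodes are broker.id and host.name, so the rest of server.properties can be copied verbatim and those two lines rewritten per node. A hedged helper sketch (the function is a convenience, not part of the Kafka distribution):

```shell
# Rewrite broker.id and host.name for a given node; all other
# server.properties lines pass through unchanged (reads stdin, writes stdout).
render_node_cfg() {   # args: broker_id node_ip
    sed -e "s/^broker\.id=.*/broker.id=$1/" \
        -e "s/^host\.name=.*/host.name=$2/"
}

# Example: turn node 1's two lines into node 2's
printf 'broker.id=1\nhost.name=192.168.72.56\n' | render_node_cfg 2 192.168.72.57
# On a real node:  render_node_cfg 2 192.168.72.57 < server.properties > server.properties.new
```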
A Kafka node needs 1 GB of heap memory by default. To change it, edit the setting in kafka-server-start.sh.
Find the KAFKA_HEAP_OPTS entry and modify it, for example:
export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"
Start kafka
```shell
cd /usr/local/kafka
./bin/kafka-server-start.sh -daemon ./config/server.properties
```
Configure the Kafka systemd service so it starts on boot:
Create /usr/lib/systemd/system/kafka.service:

```ini
[Unit]
Description=kafka server daemon
After=network.target zookeeper.service

[Service]
Type=forking
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
ExecReload=/bin/bash -c '/usr/local/kafka/bin/kafka-server-stop.sh && sleep 2 && /usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties'
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh
Restart=always

[Install]
WantedBy=multi-user.target
```

Then start and enable the service:

```shell
systemctl start kafka
systemctl enable kafka
```
Create topic
Create a topic with 3 partitions and a replication factor of 3:
```shell
cd /usr/local/kafka
./bin/kafka-topics.sh --create --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --replication-factor 3 --partitions 3 --topic java
```
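Every kafka-topics.sh invocation here repeats the same --zookeeper connect string; a tiny hypothetical helper builds it from the node list so the IPs are typed only once:

```shell
# Build a ZooKeeper connect string ("ip:port,ip:port,...") from a node list.
zk_connect() {   # args: port node_ip...
    port=$1; shift
    out=""
    for host in "$@"; do
        out="${out:+$out,}$host:$port"
    done
    printf '%s\n' "$out"
}

ZK=$(zk_connect 2181 192.168.72.56 192.168.72.57 192.168.72.58)
echo "$ZK"   # prints: 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181
# Usage (unchanged otherwise):
#   ./bin/kafka-topics.sh --create --zookeeper "$ZK" --replication-factor 3 --partitions 3 --topic java
```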
Frequently used commands
```shell
# 1) Stop Kafka
./bin/kafka-server-stop.sh

# 2) Create a topic
./bin/kafka-topics.sh --create --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --replication-factor 1 --partitions 1 --topic topic_name
# Expand the partitions of a topic
./bin/kafka-topics.sh --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --alter --topic java --partitions 40

# 3) List topics
./bin/kafka-topics.sh --list --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181

# 4) Describe a topic
./bin/kafka-topics.sh --describe --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --topic topic_name

# 5) Produce messages
./bin/kafka-console-producer.sh --broker-list 192.168.72.56:9092 --topic topic_name

# 6) Consume messages
./bin/kafka-console-consumer.sh --bootstrap-server 192.168.72.56:9092,192.168.72.57:9092,192.168.72.58:9092 --topic topic_name

# 7) Delete a topic
./bin/kafka-topics.sh --delete --topic topic_name --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181

# 8) Inspect the partitions of __consumer_offsets (which consumer hosts you can connect to)
./bin/kafka-topics.sh --describe --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --topic __consumer_offsets
```

5. Configure the Filebeat output
For details, refer to https://www.elastic.co/guide/en/beats/filebeat/current/kafka-output.html
```yaml
output.kafka:
  enabled: true
  hosts: ["192.168.72.56:9092", "192.168.72.57:9092", "192.168.72.58:9092"]
  topic: java
  required_acks: 1
  compression: gzip
  max_message_bytes: 500000000   # maximum number of bytes per message; larger events are dropped
```
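Events larger than the Kafka output's message-size limit are dropped (see the problems section), so it can be useful to check how large a suspicious log line actually is. A minimal sketch using wc -c for the byte count (the limit values are examples only):

```shell
# Succeed (exit 0) if the string fits within the byte limit, fail otherwise.
fits_limit() {   # args: limit_bytes string
    [ "$(printf '%s' "$2" | wc -c)" -le "$1" ]
}

if fits_limit 1000000 "a short log line"; then
    echo "fits"
else
    echo "would be dropped"
fi
# prints: fits
```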
Restart filebeat
systemctl restart filebeat

6. Configure the Logstash input
First install the Kafka input plugin:
/usr/share/logstash/bin/logstash-plugin install logstash-input-kafka
Add a configuration file:
vim /etc/logstash/conf.d/kafka.conf

```
input {
  kafka {
    bootstrap_servers => "192.168.72.56:9092"   # several brokers can be listed, comma separated
    group_id => "java"
    auto_offset_reset => "latest"
    consumer_threads => 5
    decorate_events => false
    topics => ["java"]
    codec => json
  }
}

output {
  elasticsearch {
    hosts => ["192.168.72.56:9200", "192.168.72.57:9200", "192.168.72.58:9200"]
    user => "elastic"
    password => "changeme"
    index => "logs-other-%{+YYYY.MM.dd}"
    http_compression => true
  }
}
```
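The index name pattern %{+YYYY.MM.dd} is Logstash's Joda-style date format and resolves against the event timestamp in UTC. The equivalent suffix can be reproduced in shell to check which index today's events land in (a sketch, assuming a date command that supports -u):

```shell
# Reproduce the logs-other-%{+YYYY.MM.dd} index name for today's UTC date.
index_name() {
    echo "logs-other-$(date -u +%Y.%m.%d)"
}

index_name    # e.g. logs-other-2020.02.11
```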
After adding it, test the configuration file:
/usr/share/logstash/bin/logstash -t -f /etc/logstash/conf.d/kafka.conf
If the test passes, restart Logstash (systemctl restart logstash).
7. Possible problems
Filebeat errors:

- WARN producer/broker/0 maximum request accumulated, waiting for space
Reference: https://linux.xiao5tech.com/bigdata/elk/elk_2.2.1_error_filebeat_kafka_waiting_for_space.html
Reason: the configured max_message_bytes buffer value is too small.

- dropping too large message of size
Reference: https://www.cnblogs.com/zhaosc-haha/p/12133699.html
Reason: the number of bytes in a message exceeded the limit. Reduce the log scan frequency, or check whether the log output is abnormal or unnecessary; oversized messages can seriously degrade Kafka's performance.
Suggested setting: 10000000 (10 MB)
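On the broker side, the matching limit is the message.max.bytes property in server.properties. A hedged fragment applying the 10 MB value suggested above (replica.fetch.max.bytes is raised alongside it, since follower replicas must also be able to fetch messages of this size; restart the broker after changing these):

```properties
# Broker-side cap on the size of a single message batch (bytes); 10000000 ≈ 10 MB
message.max.bytes=10000000
# Replica fetch size should be >= message.max.bytes, or large messages
# will never replicate to followers
replica.fetch.max.bytes=10485760
```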