ELK log analysis platform (Elasticsearch + Logstash + Kibana): an illustrated, comprehensive guide

Introduction to ELK platform

Elasticsearch + Logstash + Kibana (ELK) is an open-source log management solution. For ordinary website traffic analysis we usually embed a JavaScript snippet from Google / Baidu / CNZZ to collect statistics. But when access is abnormal or the site is under attack, we need to analyze the logs on the server side, for example the Nginx logs. Single-node tools such as Nginx log rotation, GoAccess, or AWStats are relatively simple and handle that case well, but for distributed clusters or large volumes of data they fall short. This is where ELK lets us face the new challenges calmly.

Logs mainly include system logs, application logs, and security logs. Operations staff and developers can use logs to learn about a server's software and hardware status and to track down configuration errors and their causes. Analyzing logs regularly also gives insight into the server's load, performance, and security, so that problems can be corrected in time.

Logs are usually scattered across different devices. If you manage dozens or hundreds of servers and still log in to each machine in turn, the traditional way, it quickly becomes cumbersome and inefficient. Centralized log management, for example with the open-source syslog, which collects and aggregates the logs from all servers, becomes essential.

Once logs are centralized, searching and analyzing them becomes the next problem. We can use Linux commands such as grep, awk, and wc for retrieval and simple statistics, but for more demanding requirements such as querying, sorting, and aggregating across a large number of machines, this approach quickly reaches its limits.
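For example, a quick ad-hoc analysis of an Nginx access log with these tools might look like the sketch below (the log path and the default combined log format are assumptions; adjust them to your environment):

# count today's requests that returned 404 (field 9 is the status code in the combined format)
grep "$(date +%d/%b/%Y)" /var/log/nginx/access.log | awk '$9 == 404' | wc -l

# top 10 client IPs by request count
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10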

The open-source real-time log analysis platform ELK solves these problems well. ELK is composed of Elasticsearch, Logstash, and Kibana.

Official website: https://www.elastic.co/products

  • Elasticsearch is an open-source distributed search engine. Its features include: distributed operation, zero configuration, automatic discovery, automatic index sharding, index replication, a RESTful interface, multiple data sources, automatic search load balancing, and more.
  • Logstash is a completely open-source tool. It can collect, filter, and store your logs for later use (such as searching).
  • Kibana is also an open-source, free tool. It provides a friendly web interface for log analysis on top of Logstash and Elasticsearch, helping you aggregate, analyze, and search important log data.

Extended reading

CentOS 7.x install ELK(Elasticsearch+Logstash+Kibana)- http://www.chenshake.com/centos-install-7-x-elk-elasticsearchlogstashkibana/

Centos 6.5 installing nginx log analysis system elasticsearch + logstash+ redis + kibana -http://blog.chinaunix.net/xmlrpc.php?r=blog/article&uid=17291169&id=4898582

logstash-forwarder and grok examples - https://www.ulyaoth.net/threads/logstash-forwarder-and-grok-examples.32413/

chenlinux blog - http://chenlinux.com/

elastic - https://www.elastic.co/guide

LTMP index- http://wsgzao.github.io/index/#LTMP

Logs after containerization

With our services containerized and running on a CentOS server that has Docker and docker-compose installed, there was no good way to collect logs centrally; the only option was to log in to the server and run docker logs containerID, which is very inconvenient. We had taken note of ELK before, but were busy developing and delivering system features. Now that there is some free time, the task of viewing logs with ELK comes up again.

Note: I plan to write an ELK blog post based on my own hands-on work in a high-availability environment. The content below comes from other sources and is for reference only; I will supplement it after completing the practical work myself.

Schematic diagram of ELK operation

A schematic diagram of ELK operation is drawn:

As shown in the figure, Logstash collects the logs generated by the AppServer and stores them in the Elasticsearch cluster, while Kibana queries data from the ES cluster, generates charts, and returns them to the Browser.

ELK platform construction

System environment

System: CentOS release 6.7 (Final)

ElasticSearch: 2.1.0

Logstash: 2.1.1

Kibana: 4.3.0

Java: openjdk version "1.8.0_65"

Note: Logstash depends on a Java environment, and Logstash 1.5 and above requires Java 1.7 or later, so the latest version of Java is recommended. Since only the Java runtime is needed, installing the JRE is sufficient, although I use the JDK here. Please search for and install it yourself.
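Before continuing, you can confirm that a suitable Java runtime is on the PATH:

# verify the Java runtime that Logstash and Elasticsearch will use
java -version
# expected output similar to: openjdk version "1.8.0_65"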

ELK Download: https://www.elastic.co/downloads/

ElasticSearch

To configure ElasticSearch:

tar -zxvf elasticsearch-2.1.0.tar.gz
cd elasticsearch-2.1.0

Install the Head plug-in (Optional):

./bin/plugin install mobz/elasticsearch-head

Then edit the ES configuration file:

vi config/elasticsearch.yml

Modify the following configuration items:

cluster.name: es_cluster
node.name: node0
path.data: /tmp/elasticsearch/data
path.logs: /tmp/elasticsearch/logs
# the hostname or IP address of this machine (centos2 here)
network.host: centos2
http.port: 9200

Other options remain the default, and then start ES:

./bin/elasticsearch

You can see that the transport port used to communicate with other nodes is 9300, and the port that accepts HTTP requests is 9200.

Press Ctrl+C to stop it. You can also start ES as a background process:

./bin/elasticsearch &

Then you can open the page localhost:9200, and you will see the following:

The response shows the configured cluster_name and node name, as well as the version of ES installed.
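The same information can be fetched from the command line; the JSON below is only indicative of what ES 2.x returns and may differ slightly in your environment:

# query the ES HTTP endpoint (host and port as configured in elasticsearch.yml)
curl http://centos2:9200
# returns JSON roughly like:
# {
#   "name" : "node0",
#   "cluster_name" : "es_cluster",
#   "version" : { "number" : "2.1.0", ... },
#   "tagline" : "You Know, for Search"
# }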

The Head plugin installed above is a browser-based tool for interacting with the ES cluster. It can show the cluster status and the documents stored in it, and can run searches and plain REST requests. You can open localhost:9200/_plugin/head now to view the ES cluster status:

You can see that there is no index or type in the ES cluster, so these two are empty.

Logstash

The functions of Logstash are as follows:

It is essentially just a collector; we need to specify an Input and an Output for it (and there can be more than one of each). Since we want to send the Log4j output of our Java code to Elasticsearch, the Input here is log4j and the Output is elasticsearch.

Configure Logstash:

tar -zxvf logstash-2.1.1.tar.gz
cd logstash-2.1.1

Write the configuration file (the name and location can be arbitrary. Here I put it in the config directory and call it log4j_to_es.conf):

mkdir config
vi config/log4j_to_es.conf

Enter the following:

# For detail structure of this file
# Set: https://www.elastic.co/guide/en/logstash/current/configuration-file-structure.html
input {
  # For detail config for log4j as input, 
  # See: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-log4j.html
  log4j {
    mode => "server"
    host => "centos2"
    port => 4567
  }
}
filter {
  #Only matched data are sent to output.
}
output {
  # For detail config for elasticsearch as output, 
  # See: https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html
  elasticsearch {
    action => "index"          #The operation on ES
    hosts  => "centos2:9200"   #ElasticSearch host, can be array.
    index  => "applog"         #The index to write data to.
  }
}

The logstash command has only two parameters:

Start it with the agent command, specifying the configuration file with -f:

./bin/logstash agent -f config/log4j_to_es.conf
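Optionally, the configuration can be checked for syntax errors first; Logstash 1.5/2.x ships a --configtest flag for this (verify with ./bin/logstash agent --help on your version):

# validate the pipeline configuration without actually starting Logstash
./bin/logstash agent -f config/log4j_to_es.conf --configtest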

At this point, we can use Logstash to collect logs and save them to ES. Let's take a look at the project code.

Java Project

As usual, look at the project structure chart first:

pom.xml is very simple and uses only the Log4j Library:

<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.17</version>
</dependency>

log4j.properties configures Log4j to send its logs to a SocketAppender, as the documentation of the log4j input plugin recommends:

log4j.rootLogger=INFO,console

# for package com.demo.elk, log would be sent to socket appender.
log4j.logger.com.demo.elk=DEBUG, socket

# appender socket
log4j.appender.socket=org.apache.log4j.net.SocketAppender
log4j.appender.socket.Port=4567
log4j.appender.socket.RemoteHost=centos2
log4j.appender.socket.layout=org.apache.log4j.PatternLayout
log4j.appender.socket.layout.ConversionPattern=%d [%-5p] [%l] %m%n
log4j.appender.socket.ReconnectionDelay=10000

# appender console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d [%-5p] [%l] %m%n

Note: the port number here must match the port Logstash listens on, 4567 in this example.

Application.java, use LOGGER of Log4j to print the log:

package com.demo.elk;

import org.apache.log4j.Logger;

public class Application {
    private static final Logger LOGGER = Logger.getLogger(Application.class);
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 10; i++) {
            LOGGER.error("Info log [" + i + "].");
            Thread.sleep(500);
        }
    }
}
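To generate a few test events, run this class in any way you prefer; for example, with Maven installed, something like the following should work (the exec:java goal comes from the Maven exec plugin, which Maven resolves automatically):

# compile the demo project and run the main class so that log events are sent to Logstash
mvn compile exec:java -Dexec.mainClass="com.demo.elk.Application"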

Use the Head plug-in to view the ES status and content

Run Application.java and look at the output of the console first (of course, this output is only for verification, and it is OK not to output to the console):

Let's look at the head page of ES:

Switch to Browser tab:

Click a document (doc) to display all the information of the document:

You can see that in addition to the basic message field, Logstash adds many fields for us, which is also clearly stated in https://www.elastic.co/guide/en/logstash/current/plugins-inputs-log4j.html:
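If you prefer the command line to the Head plugin, the index can also be queried directly through the REST API (host and index name follow the configuration above):

# search the applog index for documents whose message field contains "log"
curl 'http://centos2:9200/applog/_search?q=message:log&pretty'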

The above uses the ES Head plugin to observe the status and data of the ES cluster, but it is only a simple page for interacting with ES and cannot generate reports or charts. Next, Kibana is used to run searches and generate charts.

Kibana

Configure Kibana:

tar -zxvf kibana-4.3.0-linux-x86.tar.gz
cd kibana-4.3.0-linux-x86
vi config/kibana.yml

Modify the following items (since it is a stand-alone version, the value of host can also be replaced by localhost, which is only used for demonstration):

server.port: 5601
server.host: "centos2"
elasticsearch.url: http://centos2:9200
kibana.index: ".kibana"

Start kibana:

./bin/kibana
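Before opening the browser you can check that Kibana answers on its configured port (host and port follow kibana.yml above):

# a simple HTTP check against Kibana
curl -I http://centos2:5601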

Open http://centos2:5601 with a browser:

Before Kibana can be used, you need to configure at least one Index name or pattern, which tells Kibana which ES indices to analyze. Here I enter the index name applog configured earlier. Kibana automatically loads the document fields under that index and selects a suitable field as the time field:

After clicking Create, you can see that the configured Index name is added on the left:

Next, switch to the Discover tab. Note that the upper right corner is the query time range. If no data is found, you may need to adjust this time range. Here I choose Today:

Next, you can see the data in ES:

Perform a search and see:

Click the Save button on the right to save the query as search_all_logs. Next, go to the Visualize page, click to create a Vertical Bar Chart, and select the saved query search_all_logs; Kibana will generate a histogram similar to the figure below (there are only 10 log entries in the same time window, so it looks sparse, but it is enough to illustrate the point):

You can set various parameters of the graph on the left, click the Apply Changes button, and the graph on the right will be updated. Similarly, other types of graphics can be updated in real time.

Click Save on the right to save the figure and name it search_all_logs_visual. Next, switch to the Dashboard page:

Click the New button and select the search_all_logs_visual visualization you just saved; it will be displayed on the panel:

If there is more data, we can add multiple charts on the Dashboard page according to business needs and concerns: column chart, line chart, map, pie chart, etc. Of course, we can set the update frequency to make the chart update automatically:

If the set time interval is short enough, it is close to real-time analysis.

Here, ELK platform deployment and basic testing have been completed.

Docker Compose ELK + Filebeat: viewing Docker and container logs

Note: I plan to write an ELK blog post based on my own hands-on work in a high-availability environment. The content below comes from other sources and is for reference only; I will supplement it after completing the practical work myself.

Original text:

https://www.cnblogs.com/weschen/p/11046906.html

At present, my company's development team is fairly small. I built a small system for the factories under the group, running on a CentOS server with Docker and docker-compose installed. For log handling, however, there was no good way to collect logs centrally; the only option was to log in to the server and run docker logs containerID, which is very inconvenient. I had taken note of ELK before, but was busy developing system features. Now that there is some free time, the task of viewing logs with ELK comes up again.

Project structure and code; repository address: https://github.com/ChenWes/docker-elk

docker-compose.yml


version: '3'

services:
  filebeat:
    hostname: filebeat
    image: weschen/filebeat
    build:
      context: filebeat
      dockerfile: Dockerfile
    volumes:
      # needed to access all docker logs (read only)
      - "/var/lib/docker/containers:/usr/share/dockerlogs/data:ro"
      # needed to access additional information about containers
      - "/var/run/docker.sock:/var/run/docker.sock"
    links:
      - logstash
  kibana:
    image: docker.elastic.co/kibana/kibana:6.5.2
    environment:
      - "LOGGING_QUIET=true"
    links:
      - elasticsearch
    ports:
      - 5601:5601
  logstash: 
    hostname: logstash 
    image: weschen/logstash
    build:
      context: logstash
      dockerfile: Dockerfile
    ports:
      - 5044:5044
    environment:
      LOG_LEVEL: error
    links:
      - elasticsearch
  elasticsearch:
    hostname: elasticsearch
    image: weschen/elasticsearch
    build:
      context: elasticsearch
      dockerfile: Dockerfile
    environment:
      - cluster.name=docker-elk-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms256m -Xmx256m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - 9200:9200


1.Elasticsearch

File elasticsearch/Dockerfile

FROM docker.elastic.co/elasticsearch/elasticsearch:6.5.2
COPY --chown=elasticsearch:elasticsearch elasticsearch.yml /usr/share/elasticsearch/config/

CMD ["elasticsearch", "-Elogger.level=INFO"]

File elasticsearch/elasticsearch.yml


cluster.name: ${cluster.name}
network.host: 0.0.0.0

# minimum_master_nodes need to be explicitly set when bound on a public IP
# set to 1 to allow single node clusters
# Details: https://github.com/elastic/elasticsearch/pull/17288
discovery.zen.minimum_master_nodes: 1


2.Logstash

File logstash/Dockerfile

FROM docker.elastic.co/logstash/logstash:6.5.2

RUN rm -f /usr/share/logstash/pipeline/logstash.conf
COPY pipeline /usr/share/logstash/pipeline/

File logstash/pipeline/logstash.conf


input { 
    beats {
        port => 5044
        host => "0.0.0.0"
      }
} 

output { 
    elasticsearch { 
        hosts => ["elasticsearch:9200"]
        manage_template => false
        index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    } 
    
    stdout { codec => rubydebug }
}
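Once the stack is running, a quick way to confirm that this Beats input is reachable is to probe the published port from the Docker host (nc is used purely as an illustration and may need to be installed separately):

# check that the Logstash beats input is listening on the published port 5044
nc -zv localhost 5044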


3.Filebeat

File filebeat/Dockerfile


FROM docker.elastic.co/beats/filebeat:6.5.2

# Copy our custom configuration file
COPY filebeat.yml /usr/share/filebeat/filebeat.yml

USER root
# Create a directory to map volume with all docker log files
RUN mkdir /usr/share/filebeat/dockerlogs
RUN chown -R root /usr/share/filebeat/
RUN chmod -R go-w /usr/share/filebeat/


File filebeat/filebeat.yml


filebeat.inputs:
- type: docker
  combine_partial: true
  containers:
    path: "/usr/share/dockerlogs/data"
    stream: "stdout"
    ids:
      - "*"
  exclude_files: ['\.gz$']
  ignore_older: 10m

processors:
  # decode the log field (sub JSON document) if JSON encoded, then map its fields to elasticsearch fields
- decode_json_fields:
    fields: ["log", "message"]
    target: ""
    # overwrite existing target elasticsearch fields while decoding json fields    
    overwrite_keys: true
- add_docker_metadata:
    host: "unix:///var/run/docker.sock"

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

# setup filebeat to send output to logstash
output.logstash:
  hosts: ["logstash"]

# Write Filebeat own logs only to file to avoid catching them with itself in docker log files
logging.level: error
logging.to_files: false
logging.to_syslog: false
logging.metrics.enabled: false
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644
ssl.verification_mode: none


Step 1: enter the project directory and build the images

# build the elasticsearch image
cd elasticsearch
docker build -t weschen/elasticsearch .
cd ..

# build the filebeat image
cd filebeat
docker build -t weschen/filebeat .
cd ..

# build the logstash image
cd logstash
docker build -t weschen/logstash .
cd ..

View the built images as follows:

Step 2: run the services with docker-compose up -d
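A minimal sketch of bringing the stack up and checking it, assuming the commands are run from the project root on the Docker host:

# start all services in the background
docker-compose up -d

# list the containers and their state
docker-compose ps

# Elasticsearch can take a while to become available
curl http://localhost:9200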

Note: if Elasticsearch fails to start due to insufficient memory, refer to "An error occurred while docker started the elasticsearch container".

Open [host IP]:9200 in the browser. If the following interface appears, the Elasticsearch service has started; if not, wait for it to finish starting.

Then open [host IP]:5601 in the browser, which is the Kibana log viewing platform.

Open [Management] in the system menu and go to [Index Patterns].

When using Kibana for the first time, you need to create at least one index pattern; the steps are shown below. You can also create an index pattern from the Discover menu, where the following prompt appears.

After creating the index pattern, you should be able to view the logs.

Viewing logs from the home page

Source address: https://github.com/ChenWes/docker-elk

References:

http://baidu.blog.51cto.com/71938/1676798

http://blog.csdn.net/cnweike/article/details/33736429

https://my.oschina.net/itblog/blog/547250
