Contents

1, Introduction to ELK log analysis system
  1. Composition of ELK log analysis system
  2. Log processing steps
2, Introduction to the three software components
  1. Elasticsearch
    1.1 Overview
    1.2 Core concepts
  2. Logstash
    2.1 Introduction to Logstash
    2.2 Main components of Logstash
    2.3 Logstash host classification
  3. Kibana
    3.1 Introduction to Kibana
    3.2 Main functions of Kibana
3, ELK deployment
  1. Experimental environment
  2. Configure Elasticsearch environment
    2.1 Set the local host mapping file and check the Java environment on all three hosts
    2.2 Install the elasticsearch RPM package
    2.3 Modify the main elasticsearch configuration file
    2.4 Create the data storage path and grant ownership
    2.5 Start Elasticsearch
    2.6 View the node information in the host machine's browser
    2.7 Configure node2 the same way as node1
  3. Install the elasticsearch-head plug-in on node1 and node2
    3.1 Compile and install the node component dependencies
    3.2 Install the phantomjs front-end framework
    3.3 Install the elasticsearch-head data visualization tool
    3.4 Modify the main configuration file
    3.5 Start elasticsearch-head
    3.6 View in a browser on the host machine
    3.7 Create an index on the node to test
    3.8 After the index is created, view it in the web page
  4. Deploy Logstash on the Apache server
    4.1 Install and start the httpd service
    4.2 Install and start the logstash service
    4.3 Test the connection to Elasticsearch (node)
    4.4 Access node1 from the host browser to view the index information
    4.5 Create the integration configuration
  5. Install Kibana on node1
  6. Access test
    6.1 Create an index pattern matching the newly created index
    6.2 Connect the Apache log files (access log, error log) on the Apache host
    6.3 Visit the Apache website and check the new indexes
    6.4 Create the index patterns in the Kibana interface
Preface
Log analysis is the main means by which operations engineers troubleshoot system faults and locate problems. Logs mainly include system logs, application logs and security logs. Operations staff and developers can use logs to learn about a server's software and hardware state and to check for errors made during configuration and their causes. Analyzing logs regularly also gives insight into the server's load, performance and security, so that timely measures can be taken to correct problems.
1, Introduction to ELK log analysis system
The ELK log analysis system is a collection of the open source tools Logstash, Elasticsearch and Kibana. Together they form an open source log management solution that can search, analyze and visualize logs from any source and in any format.
1. Composition of ELK log analysis system
- Elasticsearch (ES): runs as a cluster; stores and indexes the log data
- Logstash: collects logs and ships them to ES
- Kibana: displays the log information as views, which is more user-friendly
2. Log processing steps
- Centralize log management (Logstash)
- Format the logs and output them to Elasticsearch (Logstash)
- Index and store the formatted data (Elasticsearch)
- Display the data in the front end (Kibana)
2, Introduction to the three software components
1. Elasticsearch
1.1 Overview
- Provides a distributed, multi-user full-text search engine
1.2 Core concepts
(1) Near real time (NRT)
Elasticsearch is a near-real-time search platform, which means there is only a slight delay (normally about one second) between the moment a document is indexed and the moment it becomes searchable.
(2) Cluster
A cluster is organized as one or more nodes that together hold all of your data and jointly provide indexing and search. One of the nodes is the master node, which is chosen by election, and the cluster provides federated indexing and search across its nodes. A cluster is identified by a unique name, which is elasticsearch by default.
The cluster name is very important: each node joins its cluster based on this name, so make sure you use different cluster names in different environments.
A cluster may consist of a single node, but configuring Elasticsearch as a multi-node cluster is strongly recommended.
(3) Node
A node is a single server that is part of the cluster; it stores data and participates in the cluster's indexing and search functions. Like a cluster, a node is identified by a name. By default the name is randomly assigned when the node starts, although you can of course define it yourself. This name is also important, since it identifies which server in the cluster corresponds to which node.
A node joins a cluster by specifying the cluster name. By default every node is set to join a cluster named elasticsearch, so if multiple nodes are started on the network and can discover each other, they will automatically form a cluster called elasticsearch.
(4) Index
An index is a collection of documents with somewhat similar characteristics. For example, you can have one index for customer data, another for a product catalog, and another for order data. An index is identified by a name (which must be all lowercase), and this name is used whenever you index, search, update or delete the documents in it. In a cluster you can define as many indexes as you want.
An index is analogous to a database in a relational database system.
(5) Type
In an index, you can define one or more types. A type is a logical classification / partition of your index, and its semantics is entirely up to you. Usually, a type is defined for documents with a set of common fields. For example, let's assume that you run a blog platform and store all your data in an index.
In this index, you can define one type for user data, another type for blog data, and of course, another type for comment data.
A type is analogous to a table in a relational database.
(6) Document
A document is a basic information unit that can be indexed. For example, you can have a customer's document, a product's document, and, of course, an order's document. The document is represented in JSON(Javascript Object Notation), which is a ubiquitous Internet data interaction format.
In an index/type you can store as many documents as you want. Note that although a document physically resides in an index, it must actually be indexed into, and assigned a type within, that index.
A document is analogous to a row in a relational database table.
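For illustration only, a single document can be fetched by index, type and id with a plain curl call (this sketch reuses the index-demo/test/1 document that is created later in this article):

```bash
# Fetch one document by index, type and id; ?pretty formats the JSON response
curl -XGET 'localhost:9200/index-demo/test/1?pretty'
```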
(7) Shards & replicas
In practice, the data stored in an index may exceed the hardware limits of a single node. For example, an index of one billion documents might require 1 TB of space, which may not fit on a single node's disk, or a single node may simply be too slow to serve search requests. To solve this problem, Elasticsearch can divide an index into multiple shards. When creating an index you can define the number of shards you want. Each shard is a fully functional, independent index that can be placed on any node in the cluster.
Two main reasons for sharding:
- Scale out horizontally to increase storage capacity
- Distribute and parallelize operations across shards to improve performance and throughput
How shards are distributed and how the documents matching a search request are aggregated back together are handled entirely by Elasticsearch and are transparent to the user.
Network and other failures can occur at any time, so for robustness a failover mechanism is strongly recommended in case a shard or node becomes unavailable. For this purpose, Elasticsearch lets you make one or more copies of an index's shards, which are called replica shards, or simply replicas.
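As a minimal sketch (the index name and the numbers below are only illustrative, not part of this deployment), the shard and replica counts can be set when an index is created:

```bash
# Create an index named "test-index" with 3 primary shards and 1 replica per shard
curl -XPUT 'localhost:9200/test-index?pretty' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}'
```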
2. Logstash
2.1 Introduction to Logstash
- A powerful data processing tool
- Handles data transport, format processing and formatted output
- Works in three stages: data input (collection from the source), data processing (filtering, rewriting, etc.) and data output (to the Elasticsearch cluster)
2.2 Main components of Logstash
- shipper: the log collector; monitors local log files for changes and collects the latest contents in time. Usually a remote agent only needs to run this component.
- indexer: the log store; receives logs and writes them to local files.
- broker: the log hub; connects multiple shippers and indexers.
- search and storage: allows events to be searched and stored.
- web interface: a web-based presentation interface.
2.3 Logstash host classification
- Agent hosts: act as event shippers, sending the various log data to the central host; they only need to run the Logstash agent program.
- Central host: runs components such as the broker, indexer, search and storage, and web interface, and receives, processes and stores the log data.
3. Kibana
3.1 Introduction to Kibana
- An open source analysis and visualization platform for Elasticsearch
- Search and view data stored in Elasticsearch index
- Advanced data analysis and display through various charts
3.2 Main functions of Kibana
- Seamless Elasticsearch integration. The Kibana architecture is customized for Elasticsearch: any structured or unstructured data can be added to an Elasticsearch index, and Kibana takes full advantage of Elasticsearch's powerful search and analysis capabilities.
- Integrates your data. Kibana handles massive amounts of data well and can create bar charts, line charts, scatter plots, histograms, pie charts and maps.
- Complex data analysis. Kibana extends Elasticsearch's analysis capabilities, allowing data to be analyzed more intelligently, transformed mathematically, and sliced and diced as required.
- Benefits more team members. The powerful data visualization interface lets every business role benefit from the collected data.
- Flexible, easy-to-share interface. Kibana makes it convenient to create, save and share data and to communicate visualizations quickly.
- Simple configuration. Kibana is very simple to configure and enable, and the user experience is friendly. Kibana ships with its own web server, so it can be up and running quickly.
- Visualizes multiple data sources. Kibana can easily bring data from Logstash, ES-Hadoop, Beats or third-party technologies into Elasticsearch.
- Simple data export. Kibana makes it easy to export the data of interest, merge it with other data sets, and quickly model and analyze it to discover new results.
3, ELK deployment
1. Experimental environment
The lab uses three hosts; the service allocation, host names and IP addresses are as follows:

| Host name | IP address      | Installed services                        |
|-----------|-----------------|-------------------------------------------|
| node1     | 192.168.179.123 | Elasticsearch, elasticsearch-head, Kibana |
| node2     | 192.168.179.124 | Elasticsearch, elasticsearch-head         |
| apache    | (not listed)    | httpd, Logstash                           |
2. Configure Elasticsearch environment
2.1 Set the local host mapping file and check the Java environment on all three hosts
```bash
[root@node1 ~]# vim /etc/hosts
192.168.179.123 node1
192.168.179.124 node2
[root@node1 ~]# java -version        # view the java version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
```
2.2 Install the elasticsearch RPM package
```bash
[root@node1 ~]# rpm -ivh elasticsearch-5.5.0.rpm
```
2.3 Modify the main elasticsearch configuration file
```bash
[root@node1 ~]# cp /etc/elasticsearch/elasticsearch.yml /etc/elasticsearch/elasticsearch.yml.bak
[root@node1 ~]# vim /etc/elasticsearch/elasticsearch.yml
17 cluster.name: my-elk-cluster                           # cluster name
23 node.name: node1                                       # node name
33 path.data: /data/elk_data                              # data storage path
37 path.logs: /var/log/elasticsearch/                     # log storage path
43 bootstrap.memory_lock: false                           # do not lock memory at startup
55 network.host: 0.0.0.0                                  # IP address the service binds to; 0.0.0.0 means all addresses
59 http.port: 9200                                        # listening port
68 discovery.zen.ping.unicast.hosts: ["node1", "node2"]   # cluster discovery via unicast
[root@node1 ~]# grep -v "^#" /etc/elasticsearch/elasticsearch.yml    # view the configuration just modified
cluster.name: my-elk-cluster
node.name: node1
path.data: /data/elk_data
path.logs: /var/log/elasticsearch/
bootstrap.memory_lock: false
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["node1", "node2"]
```
2.4 Create the data storage path and grant ownership
```bash
[root@node1 ~]# mkdir -p /data/elk_data
[root@node1 ~]# chown elasticsearch:elasticsearch /data/elk_data/
```
2.5 Start Elasticsearch
```bash
[root@node1 ~]# systemctl start elasticsearch.service
[root@node1 ~]# netstat -ntap | grep 9200
tcp6       0      0 :::9200        :::*        LISTEN      20500/java
```
2.6 View the node information by opening http://192.168.179.123:9200 in the host machine's browser
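If a browser is not handy, the same node information can be fetched from the command line (a quick sketch using the node1 address above):

```bash
# The REST root returns the node name, cluster name and Elasticsearch version
curl http://192.168.179.123:9200
```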
2.7 Configure node2 the same way as node1; the only difference is that node.name must be changed to node2 in the configuration file
```bash
[root@node2 ~]# vim /etc/elasticsearch/elasticsearch.yml
node.name: node2
[root@node2 ~]# netstat -ntap | grep 9200
tcp6       0      0 :::9200        :::*        LISTEN      10459/java
```
Check the cluster health and status.
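The health and state checks mentioned above can also be performed with curl against the cluster APIs (a quick sketch using the node1 address from this lab):

```bash
# Cluster health: green/yellow/red status, node count and shard counts
curl 'http://192.168.179.123:9200/_cluster/health?pretty'
# Cluster state: detailed cluster metadata
curl 'http://192.168.179.123:9200/_cluster/state?pretty'
```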
3. Install the elasticsearch-head plug-in on node1 and node2
3.1 Compile and install the node component dependencies
```bash
[root@node1 ~]# yum install gcc gcc-c++ make -y
[root@node1 ~]# tar zxvf node-v8.2.1.tar.gz
[root@node1 ~]# cd node-v8.2.1/
[root@node1 node-v8.2.1]# ./configure
[root@node1 node-v8.2.1]# make -j3        # compile with 3 parallel jobs
[root@node1 node-v8.2.1]# make install
```
3.2 Install the phantomjs front-end framework
```bash
[root@node1 ~]# tar jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2 -C /usr/local/src/
[root@node1 ~]# cd /usr/local/src/phantomjs-2.1.1-linux-x86_64/bin/
[root@node1 bin]# cp phantomjs /usr/local/bin/    # so the system can find the phantomjs command
```
3.3 Install the elasticsearch-head data visualization tool
```bash
[root@node1 ~]# tar zxvf elasticsearch-head.tar.gz -C /usr/local/src/
[root@node1 ~]# cd /usr/local/src/elasticsearch-head/
[root@node1 elasticsearch-head]# npm install      # initialize the project
npm WARN deprecated fsevents@1.2.13: fsevents 1 will break on node v14+ and could be using insecure binaries. Upgrade to fsevents 2.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@^1.0.0 (node_modules/karma/node_modules/chokidar/node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for fsevents@1.2.13: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})
npm WARN elasticsearch-head@0.0.0 license should be a valid SPDX license expression
up to date in 3.495s
```
3.4 Modify the main configuration file
```bash
[root@node1 ~]# vim /etc/elasticsearch/elasticsearch.yml
# add at the end of the file
http.cors.enabled: true          # enable cross-origin access support (default: false)
http.cors.allow-origin: "*"      # domains/addresses allowed cross-origin access; regular expressions are supported
[root@node1 ~]# systemctl restart elasticsearch.service
```
3.5 Start elasticsearch-head
```bash
[root@node1 ~]# cd /usr/local/src/elasticsearch-head/
[root@node1 elasticsearch-head]# npm run start &        # start the project in the background
[1] 121858
[root@node1 elasticsearch-head]#
> elasticsearch-head@0.0.0 start /usr/local/src/elasticsearch-head
> grunt server

Running "connect:server" (connect) task
Waiting forever...
Started connect web server on http://localhost:9100
[root@node1 elasticsearch-head]# netstat -ntap | grep 9200
tcp6       0      0 :::9200        :::*         LISTEN      121753/java
[root@node1 elasticsearch-head]# netstat -ntap | grep 9100
tcp        0      0 0.0.0.0:9100   0.0.0.0:*    LISTEN      121868/grunt
```
3.6 View in a browser on the host machine (http://192.168.179.123:9100)
3.7 Create an index on the node to test
```bash
[root@node1 ~]# curl -XPUT 'localhost:9200/index-demo/test/1?pretty&pretty' -H 'Content-Type: application/json' -d '{"user":"cllt","mesg":"hello world"}'
{
  "_index" : "index-demo",
  "_type" : "test",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "created" : true
}
```
3.8 After the index is created, view it in the web page
4. Deploy Logstash on the Apache server
4.1 Install and start the httpd service
```bash
[root@apache ~]# yum install httpd -y
[root@apache ~]# systemctl start httpd
[root@apache ~]# netstat -ntap | grep 80
tcp6       0      0 :::80          :::*         LISTEN      67743/httpd
```
4.2 Install and start the logstash service
```bash
[root@apache ~]# rpm -ivh logstash-5.5.1.rpm
[root@apache ~]# systemctl start logstash.service
[root@apache ~]# ln -s /usr/share/logstash/bin/logstash /usr/local/bin/
```
4.3 Test the connection to Elasticsearch (node)
Test with the logstash command. Option descriptions:
- -f: specify a Logstash configuration file; Logstash configures itself according to that file
- -e: the argument is a configuration string (if it is empty, stdin is used as input and stdout as output by default)
- -t: test that the configuration file is correct, then exit
```bash
# The input uses standard input and the output uses standard output, for testing
[root@apache ~]# logstash -e 'input { stdin{} } output { stdout{} }'
...
www.baidu.com                      # type a URL to test; exit once the output looks correct
2020-09-14T10:34:54.770Z apache www.baidu.com
www.sina.com.cn
2020-09-14T10:35:03.852Z apache www.sina.com.cn
www.taobao.com
2020-09-14T10:35:11.564Z apache www.taobao.com
```
```bash
# Test: use the rubydebug codec for detailed output (a codec is an encoder/decoder)
[root@apache ~]# logstash -e 'input { stdin{} } output { stdout{ codec=>rubydebug } }'
...
18:36:54.709 [Api Webserver] INFO  logstash.agent - Successfully started Logstash API endpoint {:port=>9600}
www.baidu.com                      # type a URL
{
    "@timestamp" => 2020-09-14T10:37:04.225Z,
      "@version" => "1",
          "host" => "apache",
       "message" => "www.baidu.com"
}
```
```bash
# Use logstash to write the input to Elasticsearch
[root@apache ~]# logstash -e 'input { stdin{} } output { elasticsearch { hosts=>["192.168.179.123:9200"] } }'
18:38:20.805 [Api Webserver] INFO  logstash.agent - Successfully started Logstash API endpoint {:port=>9600}
www.baidu.com                      # type URLs to test
www.sina.com
www.google.com.cn
```
4.4 Access node1 from the host browser to view the index information
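As an alternative to the browser and the elasticsearch-head page, the index information can also be listed from the command line (a quick sketch, again using the node1 address):

```bash
# List all indexes with their health, document count and size
curl 'http://192.168.179.123:9200/_cat/indices?v'
```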
4.5 Create the integration configuration
A Logstash configuration file consists mainly of three sections: input, filter (processing, used as needed) and output.
```bash
[root@apache ~]# chmod o+r /var/log/messages      # give other users read permission
[root@apache ~]# ll /var/log/messages
-rw----r--. 1 root root 601686 Sep 14 18:41 /var/log/messages
[root@apache ~]# vim /etc/logstash/conf.d/system.conf     # edit the configuration file
input {                                            # input section
    file{
        path => "/var/log/messages"                # absolute path of the log file
        type => "system"                           # define a type name
        start_position => "beginning"              # read from the beginning of the file
        }
    }
output {                                           # output to Elasticsearch
    elasticsearch {
        hosts => ["192.168.179.123:9200"]          # IP address of the master node
        index => "system-%{+YYYY.MM.dd}"           # index to build
        }
    }
[root@apache ~]# systemctl restart logstash.service
```
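Optionally, the configuration can be syntax-checked with the -t option described in 4.3 before (re)starting the service (a small sketch, assuming the file path used above):

```bash
# Test the configuration and exit without starting a pipeline
[root@apache ~]# logstash -f /etc/logstash/conf.d/system.conf -t
```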
5. Install Kibana on node1
Install Kibana:
```bash
[root@node1 ~]# mv kibana-5.5.1-x86_64.rpm /usr/local/src/
[root@node1 ~]# cd /usr/local/src/
[root@node1 src]# rpm -ivh kibana-5.5.1-x86_64.rpm
[root@node1 src]# cd /etc/kibana/
[root@node1 kibana]# cp kibana.yml kibana.yml.bak            # back up the configuration file
[root@node1 kibana]# vim kibana.yml                          # modify the Kibana configuration file
2 server.port: 5601                                          # port Kibana listens on
7 server.host: "0.0.0.0"                                     # Kibana listening address
21 elasticsearch.url: "http://192.168.179.123:9200"          # connect to Elasticsearch
30 kibana.index: ".kibana"                                   # add the .kibana index in Elasticsearch
[root@node1 kibana]# systemctl start kibana.service
[root@node1 kibana]# systemctl enable kibana.service
Created symlink from /etc/systemd/system/multi-user.target.wants/kibana.service to /etc/systemd/system/kibana.service.
[root@node1 kibana]# netstat -ntap | grep 5601
tcp        0      0 0.0.0.0:5601    0.0.0.0:*    LISTEN      122556/node
```
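As an optional sanity check (a sketch; the address and port come from the kibana.yml settings above), Kibana's status API can be queried once the service is running:

```bash
# Returns Kibana's overall status and plugin states as JSON
curl http://192.168.179.123:5601/api/status
```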
6. Access test
6.1 Create an index pattern in Kibana matching the newly created index name
6.2 Connect the Apache log files (access log, error log) on the Apache host
```bash
[root@apache ~]# cd /etc/logstash/conf.d/
[root@apache conf.d]# vim apache_log.conf
input {
    file{
        path => "/etc/httpd/logs/access_log"
        type => "access"
        start_position => "beginning"
        }
    file{
        path => "/etc/httpd/logs/error_log"
        type => "error"
        start_position => "beginning"
        }
    }
output {
    if [type] == "access" {
        elasticsearch {
            hosts => ["192.168.179.123:9200"]
            index => "apache_access-%{+YYYY.MM.dd}"
            }
        }
    if [type] == "error" {
        elasticsearch {
            hosts => ["192.168.179.123:9200"]
            index => "apache_error-%{+YYYY.MM.dd}"
            }
        }
    }
[root@apache conf.d]# /usr/share/logstash/bin/logstash -f apache_log.conf    # run logstash with the specified configuration file for testing
```
6.3 After visiting the Apache website, open http://192.168.179.123:9100 and the two new indexes can be found
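To make sure the access log has fresh entries before checking, the Apache site can be requested a few times and the new indexes confirmed from the command line (a sketch; the Apache server's IP address is not listed in this article, so substitute your own):

```bash
# Generate access-log entries (replace the placeholder with the Apache server's IP)
curl http://<apache-server-ip>/
# Confirm the apache_access-* and apache_error-* indexes exist on node1
curl 'http://192.168.179.123:9200/_cat/indices/apache*?v'
```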
6.4 Create the index patterns in the Kibana interface
Summary
Deploy Logstash on every service whose logs need to be collected. The Logstash agent (Logstash shipper) monitors and filters the collected logs and sends the filtered content to the Logstash indexer; the indexer gathers the logs and hands them to the full-text search service Elasticsearch. Elasticsearch can then be used for custom searches, and Kibana combines those custom searches for display on its pages.