ELK log analysis platform -- Elasticsearch

1, elasticsearch practice

Elasticsearch is an open-source distributed search and analytics engine built on Apache Lucene, a full-text search engine library.
Elasticsearch is more than Lucene, and more than a full-text search engine:
A distributed document store in which every field can be indexed and searched
A distributed real-time analytics and search engine
It can scale out to hundreds of server nodes and handle PB-level structured or unstructured data.

Basic modules:

cluster: manages cluster state and maintains cluster-level configuration information
allocation: encapsulates the functions and policies related to shard allocation
discovery: discovers the nodes in the cluster and elects the master node
gateway: persists the cluster state data received from the master's broadcasts
indices: manages index settings at the global level
http: allows access to the ES API through JSON over HTTP
transport: used for internal communication between nodes in the cluster
engine: encapsulates the Lucene operations and the translog calls

Elasticsearch application scenarios:
Information retrieval
Log analysis
Business data analysis
Database acceleration
Operations metrics monitoring

Official website: https://www.elastic.co/cn/

1. Installation
https://elasticsearch.cn/download

[root@server1 elk]# rpm -ivh elasticsearch-7.7.0-x86_64.rpm
[root@server1 elk]# cd /etc/elasticsearch/
[root@server1 elasticsearch]# vim elasticsearch.yml    #Edit the configuration file
cluster.name: my-es    #Cluster name
node.name: server1    #Name of this node
bootstrap.memory_lock: true    #Lock the process memory
network.host: 172.25.1.1    #Address to bind to
http.port: 9200    #HTTP listening port
discovery.seed_hosts: ["server1", "server3","server5"]    #At least one host is required, or the service won't work

[root@server1 elasticsearch]# vim /etc/security/limits.conf
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
elasticsearch - nofile 65535
elasticsearch - nproc 4096

[root@server1 elasticsearch]# vim /usr/lib/systemd/system/elasticsearch.service
LimitNPROC=4096
LimitMEMLOCK=infinity

[root@server1 elasticsearch]# swapoff -a    #Disable swap
[root@server1 elasticsearch]# netstat -antlp
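
Once the service is running (the listing above does not show that step; on a systemd host it is usually systemctl daemon-reload followed by systemctl start elasticsearch), it can help to confirm that the node answers on port 9200 and that the memory lock took effect. The following is a minimal sketch, assuming curl is installed. The ES_URL variable and the function names are made up for this example; _cluster/health and the mlockall filter on _nodes are standard Elasticsearch endpoints.

```shell
#!/usr/bin/env bash
# Address taken from network.host and http.port above; override ES_URL if yours differs.
ES_URL="${ES_URL:-http://172.25.1.1:9200}"

# Join the base URL and an API path, tolerating an extra slash on either side.
es_endpoint() {
  printf '%s/%s\n' "${ES_URL%/}" "${1#/}"
}

# Print the cluster health document.
check_health() {
  curl -s "$(es_endpoint '_cluster/health?pretty')"
}

# Print the mlockall flag of each node; true means bootstrap.memory_lock worked.
check_mlockall() {
  curl -s "$(es_endpoint '_nodes?filter_path=**.mlockall&pretty')"
}

# Usage (needs a running node):
#   check_health
#   check_mlockall
```

Until a master has been elected, the health call may return an error even though the node process itself is up.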

2. Graphical operation
(1) Installation

[root@server1 elk]# ls
elasticsearch-7.7.0-x86_64.rpm  elasticsearch-head-master  nodejs-9.11.2-1nodesource.x86_64.rpm
https://mirrors.tuna.tsinghua.edu.cn/nodesource/rpm_9.x/el/7/x86_64/    #Download address
[root@server1 elk]# yum install -y nodejs-9.11.2-1nodesource.x86_64.rpm
[root@server1 elk]# node -v
v9.11.2
[root@server1 elk]# npm -v
5.6.0

yum install unzip
unzip elasticsearch-head-master.zip
cd elasticsearch-head-master
npm install --registry=https://registry.npm.taobao.org

yum install -y bzip2
tar jxf phantomjs-2.1.1-linux-x86_64.tar.bz2
cd phantomjs-2.1.1-linux-x86_64/bin/
mv phantomjs /usr/local/bin/
phantomjs
yum provides */libfontconfig.so.1
yum install -y fontconfig-2.13.0-4.3.el7.x86_64
phantomjs
cd /root/elk/elasticsearch-head-master

npm install --registry=https://registry.npm.taobao.org

(2) Start

[root@server1 elasticsearch-head-master]# cd _site/
[root@server1 _site]# ls
app.css  app.js  background.js  base  fonts  i18n.js  index.html  lang  manifest.json  vendor.css  vendor.js
[root@server1 _site]# vim app.js
this.base_uri = this.config.base_uri || this.prefs.get("app-base_uri") || "http://172.25.1.1:9200";
[root@server1 elasticsearch-head-master]# npm run start &

Open http://172.25.1.1:9100/ in a browser.
At this point the page cannot connect to the cluster.

(3) Enable cross-origin access (CORS) in ES

[root@server1 elasticsearch-head-master]# vim /etc/elasticsearch/elasticsearch.yml
http.cors.enabled: true
http.cors.allow-origin: "*"
[root@server1 elasticsearch-head-master]# systemctl restart elasticsearch

If the connection still fails, modify the configuration file:

[root@server1 elasticsearch]# vim elasticsearch.yml
discovery.seed_hosts: ["server1", "server3","server5"]
#
#Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["server1"]
[root@server1 elasticsearch]# systemctl restart elasticsearch

After this change the connection succeeds.
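
To check the cross-origin settings from step (3) without a browser, one option is to send a request that carries an Origin header and look for an Access-Control-Allow-Origin header in the response. This is a sketch under assumptions: curl is installed, the function names are invented for this example, and the exact headers returned may vary between Elasticsearch versions.

```shell
#!/usr/bin/env bash
# Succeeds if the HTTP headers read from stdin contain Access-Control-Allow-Origin.
has_cors_header() {
  grep -qi '^access-control-allow-origin:'
}

# Send a request with an Origin header, as a browser on the head page would,
# and inspect only the response headers.
#   $1 = Elasticsearch URL, $2 = origin of the head page
check_cors() {
  curl -s -o /dev/null -D - -H "Origin: $2" "$1" | has_cors_header
}

# Usage (needs a running node):
#   check_cors http://172.25.1.1:9200/ http://172.25.1.1:9100 && echo "CORS enabled"
```

Keeping the header check separate from the curl call makes it possible to try the check on saved output as well.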

(4) Add more nodes (the procedure is the same as before)

[root@server3 ~]# cd elk/
[root@server3 elk]# ls
elasticsearch-7.7.0-x86_64.rpm
[root@server3 elk]# rpm -ivh elasticsearch-7.7.0-x86_64.rpm

[root@server5 ~]# cd elk/
[root@server5 elk]# ls
elasticsearch-7.7.0-x86_64.rpm
[root@server5 elk]# rpm -ivh elasticsearch-7.7.0-x86_64.rpm

[root@server1 elasticsearch]# scp -p elasticsearch.yml server3:/etc/elasticsearch/elasticsearch.yml
[root@server1 elasticsearch]# scp -p elasticsearch.yml server5:/etc/elasticsearch/elasticsearch.yml
//On each node, change the host name and IP address in the copied file to that node's own
discovery.seed_hosts: ["server1", "server3","server5"]
#
#Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["server1","server3","server5"]

[root@server1 security]# scp limits.conf server3:/etc/security/limits.conf
[root@server1 security]# scp limits.conf server5:/etc/security/limits.conf
[root@server3 elasticsearch]# vim /usr/lib/systemd/system/elasticsearch.service
LimitNPROC=4096
LimitMEMLOCK=infinity
[root@server5 elasticsearch]# vim /usr/lib/systemd/system/elasticsearch.service
LimitNPROC=4096
LimitMEMLOCK=infinity

[root@server3 elk]# systemctl start elasticsearch
[root@server2 elk]# systemctl restart elasticsearch
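
After the new nodes have started, a quick way to confirm that they joined the cluster is to count the lines returned by the _cat/nodes API, which prints one line per node. A minimal sketch, assuming curl is available; the function names are hypothetical and the URL is the one configured on server1 earlier.

```shell
#!/usr/bin/env bash
# Count the non-empty lines read from stdin (one line per node in _cat/nodes output).
count_nodes() {
  grep -c .
}

# Ask the node at $1 for the node names and print how many nodes it sees.
cluster_size() {
  curl -s "${1%/}/_cat/nodes?h=name" | count_nodes
}

# Usage (needs a running cluster):
#   cluster_size http://172.25.1.1:9200
```

With the three hosts used in this article, the expected count is 3 once every node has joined.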

Elasticsearch node roles
Master: mainly responsible for creating and deleting indexes in the cluster and for rebalancing data. The master does not index or retrieve data, so its load is light. When the master node loses contact or goes down, the ES cluster automatically elects a new master from the other master-eligible nodes.
Data Node: mainly responsible for indexing and retrieving data in the cluster. It is usually under heavy load.
Coordinating Node: formerly called the Client node; its main function is to distribute requests and merge results. By default every node is a coordinating node, and this role cannot be turned off.
Ingest Node: preprocesses documents before they are indexed

3. Optimize nodes
By default, all three nodes are master-eligible. This step optimizes the three nodes by giving each one a clearly defined role.

(1)
In a production environment, if you do not change the role settings of the Elasticsearch nodes, the cluster is prone to split brain and other problems under high data volume and high concurrency. By default, each node in an Elasticsearch cluster is eligible to become the master node, also stores data, and can serve queries.
Node roles are controlled by the following attributes:

node.master false/true
node.data false/true
node.ingest true/false
search.remote.connect true/false

By default, the values of these properties are true.

(2)

node.master: whether the node is eligible to become the master node. Note: a value of true does not mean that the node is the master node, because the actual master is elected from among all master-eligible nodes.
node.data: whether the node stores data
node.ingest: whether the node preprocesses documents
search.remote.connect: whether cross-cluster search is enabled (newer documentation names this setting cluster.remote.connect)

(3)
First combination: (default)
node.master: true
node.data: true
node.ingest: true
search.remote.connect: true
This combination means that the node is master-eligible and also stores data. If it is elected as the actual master node, it still stores data, so the load on it is higher. This is fine in a test environment, but it is not recommended in production.
(4)
Second combination: (Data node)
node.master: false
node.data: true
node.ingest: false
search.remote.connect: false
This combination means that the node is not master-eligible, so it does not take part in the election and only stores data.
Such a node is called a data node. A cluster needs several of them, set up separately, to store data and to provide storage and query services.

(5)
Third combination: (master node)
node.master: true
node.data: false
node.ingest: false
search.remote.connect: false
This combination means that this node will not store data, has the qualification to become a master node, can participate in the election, and may become a real master node.
This node is called the master node.

(6)
The fourth combination: (Coordinating Node)
node.master: false
node.data: false
node.ingest: false
search.remote.connect: false
This combination means that the node will neither become the master node nor store data.
It acts as a coordinating node, which helps balance the load when there are massive numbers of requests.

(7)
The fifth combination: (Ingest Node)
node.master: false
node.data: false
node.ingest: true
search.remote.connect: false
This combination means that the node will neither become the master node nor store data.
It acts as an ingest node, which preprocesses documents before they are indexed.
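
The five combinations above can be summarized as a small lookup, which may be convenient when preparing elasticsearch.yml for several hosts. This is only an illustrative sketch: the function name and the role labels (default, data, master, coordinating, ingest) are invented here, and the setting names are copied from this article as written.

```shell
#!/usr/bin/env bash
# Print the role lines for elasticsearch.yml for one of the combinations above.
# The role labels are hypothetical names chosen for this sketch.
role_settings() {
  local master data ingest remote=false
  case "$1" in
    default)      master=true;  data=true;  ingest=true; remote=true ;;
    data)         master=false; data=true;  ingest=false ;;
    master)       master=true;  data=false; ingest=false ;;
    coordinating) master=false; data=false; ingest=false ;;
    ingest)       master=false; data=false; ingest=true ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
  # search.remote.connect is the setting name used in this article.
  printf 'node.master: %s\nnode.data: %s\nnode.ingest: %s\nsearch.remote.connect: %s\n' \
    "$master" "$data" "$ingest" "$remote"
}

# Usage:
#   role_settings master    # prints the four lines for a dedicated master node
```

The output is meant to be reviewed and pasted into the configuration file by hand rather than appended automatically, since existing role lines would otherwise be duplicated.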

In a production cluster, the responsibilities can be divided among the nodes as follows.
It is recommended to set up at least 3 nodes in the cluster as dedicated master nodes, which are only responsible for acting as master and maintaining the state of the whole cluster.
Then set up a batch of data nodes according to the amount of data. These nodes only store data and serve indexing and query requests. If users send requests frequently, the load on these nodes will be high.
Therefore, it is recommended to set up another batch of coordinating nodes in the cluster, which only handle user requests and take care of request forwarding, load balancing and similar functions.

Node requirements
master node: an ordinary server is sufficient (moderate CPU and memory consumption)
data node: mainly consumes disk and memory.
path.data: data1,data2,data3
Configuring several data paths like this may lead to uneven data writes, so it is recommended to specify only one data path. The disks can be combined into a RAID 0 array; expensive SSDs are not required.
coordinating node: high demands on CPU and memory

Experiment
server1:

[root@server1 elasticsearch]# vim elasticsearch.yml
node.name: server1
node.master: true
node.data: false
node.ingest: false
search.remote.connect: false
[root@server1 elasticsearch]# systemctl restart elasticsearch

server2\3:

[root@server2 elasticsearch]# vim elasticsearch.yml
node.master: true
node.data: true
node.ingest: false
search.remote.connect: false
[root@server2 elasticsearch]# systemctl restart elasticsearch

After the settings take effect, the master node changes to server1.
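
Which node currently holds the master role can also be checked from the command line. The _cat/nodes API marks the elected master with an asterisk in its master column. A minimal sketch, assuming curl and awk are installed; the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Read "name master" lines (the output of _cat/nodes?h=name,master) from stdin
# and print the name of the node flagged with "*", which is the elected master.
elected_master() {
  awk '$2 == "*" { print $1 }'
}

# Query the node at $1 and print the name of the elected master.
show_master() {
  curl -s "${1%/}/_cat/nodes?h=name,master" | elected_master
}

# Usage (needs a running cluster):
#   show_master http://172.25.1.1:9200
```

Adding node.role to the h= column list additionally shows the roles each node was started with.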

21 June 2020, 06:22