Elasticsearch installation (traditional & Docker) and Spring Boot integration

Contents

1, Introduction to Elasticsearch

2, Elasticsearch cluster installation

  Traditional way

  Docker installation

3, Configure the Chinese (IK) analyzer

  Traditional way

  Docker mode

4, Integrate Spring Boot

1, Introduction to Elasticsearch

Elasticsearch is a distributed document store. Instead of storing information as rows and columns, it stores complex data structures that have been serialized into JSON documents. When a cluster has multiple nodes, the stored documents are distributed across the whole cluster and can be accessed immediately from any node.

When a document is stored, it is indexed and becomes searchable in near real time (within about 1 second). Elasticsearch uses a data structure called an inverted index, which supports very fast full-text search. An inverted index lists every unique word that appears in any document and identifies which documents each word occurs in.

An index can be regarded as an optimized collection of documents, and each document is a collection of fields, which are key-value pairs containing data. By default, Elasticsearch builds an index over all data in every field, and each indexed field has a dedicated, optimized data structure. For example, text fields are stored in inverted indexes, while numeric and geo fields are stored in BKD trees. It is precisely this use of per-field data structures that makes Elasticsearch so fast at search.
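
To make the inverted index concrete, here is a toy sketch in Java (the class and method names are illustrative): it maps each term to the set of document IDs containing that term, which is the core lookup a full-text search performs. It only illustrates the idea; Elasticsearch's real implementation (Lucene) adds text analysis, compression, relevance scoring, and the per-field structures described above.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// A toy inverted index: term -> set of document ids that contain the term.
public class TinyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Split the text into lowercase terms and record the document id for each term
    public void index(int docId, String text) {
        for (String term : text.toLowerCase().split("\\W+")) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // Look up the documents containing a single term
    public Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyInvertedIndex idx = new TinyInvertedIndex();
        idx.index(1, "Elasticsearch stores JSON documents");
        idx.index(2, "Documents are distributed across the cluster");
        System.out.println(idx.search("documents")); // prints [1, 2]
    }
}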

2, Elasticsearch cluster installation

The version installed in this article is 7.15.2 (the Docker example below uses the 7.14.2 image); some parameters differ in older versions. The JDK used is JDK 17.

Traditional way

1. Install the JDK environment

See: "Installing JDK 17 & JDK 8 on Linux" (Familiar Snail's blog, CSDN)

2. Unzip the installation package

#Unzip the archive
tar -zxvf elasticsearch-7.15.2-linux-x86_64.tar.gz -C /usr/local/
#Rename the directory
mv /usr/local/elasticsearch-7.15.2 /usr/local/elasticsearch

3. Create a user and group

Since Elasticsearch cannot be started as the root account, a dedicated account needs to be created.

#Create the user group
groupadd es
#Create the user
useradd -g es snail_es
#Grant ownership of the installation directory
chown -R snail_es:es /usr/local/elasticsearch/

4. Create the ES data directory for data and logs, and grant ownership

#Create the directory and grant ownership
mkdir /usr/local/es
chown -R snail_es:es /usr/local/es

5. Modify the configuration file (adjust the settings below for each node accordingly)

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#Cluster name. The names of the three nodes are the same
cluster.name: my-es
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#The name of each node is different
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#Data directory
path.data: /usr/local/es/data
#
# Path to log files:
#Log directory
path.logs: /usr/local/es/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#Current host ip
network.host: 192.168.139.160
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#External port number
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#Cluster discovery, the default port is 9300
discovery.seed_hosts: ["192.168.139.160","192.168.139.161", "192.168.139.162"]
# Bootstrap the cluster using an initial set of master-eligible nodes:
#Names of the initial master-eligible nodes
#
cluster.initial_master_nodes: ["node-1", "node-2","node-3"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true

6. Modify the operating system parameters, otherwise errors will be reported at startup

1. Elasticsearch uses a large number of file descriptors (file handles). Running out of file descriptors can be catastrophic and may result in data loss.
Make sure that the limit on the number of open file descriptors for the user running Elasticsearch is raised to 65536 or higher
(in /etc/security/limits.conf, set nofile to at least 65536).

2. Elasticsearch uses many thread pools for different types of operations. It is important that it can create new threads whenever needed.
Make sure that the user running Elasticsearch can create at least 4096 threads.
This can be done by running ulimit -u 4096 as root before starting Elasticsearch, or by setting nproc to 4096 in /etc/security/limits.conf.


#Solution

vi /etc/security/limits.conf and add the following:
* soft nofile 65536
* hard nofile 131072
* soft nproc 4096
* hard nproc 4096

Then restart the server (or start a new login session) for the changes to take effect.


3. By default, Elasticsearch uses an mmapfs directory to store its indices. The default operating system limit on mmap counts is likely too low, which may result in out-of-memory exceptions. A temporary fix is: sysctl -w vm.max_map_count=262144

#Permanent solution:
Add the following line at the end of /etc/sysctl.conf:
vm.max_map_count=262144
Then run /sbin/sysctl -p for it to take effect immediately.

7. Switch to the snail_es user created earlier, go to the installation directory, and start each of the three nodes

./bin/elasticsearch
#Background start
./bin/elasticsearch -d

8. Verify the result by opening the following URL in a browser:

http://192.168.139.160:9200/_cat/nodes?pretty
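
Besides the browser check, the cluster can also be verified programmatically with the Elasticsearch high-level REST client (the same client used later in the Spring Boot section). A minimal sketch, assuming the 192.168.139.160 node from this article and a 7.x client on the classpath; the class name is illustrative:

import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.cluster.health.ClusterHealthRequest;
import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class ClusterCheck {
    public static void main(String[] args) throws Exception {
        // Adjust the host/port to your own node; 192.168.139.160 is the example host used above
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("192.168.139.160", 9200, "http")))) {
            ClusterHealthResponse health =
                    client.cluster().health(new ClusterHealthRequest(), RequestOptions.DEFAULT);
            System.out.println("status=" + health.getStatus()
                    + ", nodes=" + health.getNumberOfNodes());
        }
    }
}

If the three nodes have joined the cluster, the node count printed here should be 3.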

  Docker installation

1. Pull the image

[root@bogon ~]# docker pull elasticsearch:7.14.2

2. Create the mount directories and grant permissions

[root@localhost ~]# mkdir -p /data/es/{conf,data,logs,plugins}
#Grant permissions
[root@localhost ~]# chmod 777 -R /data/

3. The configuration file is largely the same as in the traditional installation; only node.name needs to differ on each node (node-1, node-2, node-3):

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#Cluster name. The names of the three nodes are the same
cluster.name: my-es
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#The name of each node is different
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#Data directory
#path.data: /usr/local/es/data
#
# Path to log files:
#Log directory
#path.logs: /usr/local/es/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#Current host ip
network.host: 0.0.0.0
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#External port number
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#Cluster discovery, the default port is 9300
discovery.seed_hosts: ["192.168.139.160","192.168.139.161", "192.168.139.162"]
# Bootstrap the cluster using an initial set of master-eligible nodes:
#Names of the initial master-eligible nodes
#
cluster.initial_master_nodes: ["node-1", "node-2","node-3"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true

4. Apply the OS parameter changes from step 6 of the traditional installation first, then create the Docker container

docker run --name elasticsearch  --net=host \
 -v /data/es/conf/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
 -v /data/es/data:/usr/share/elasticsearch/data \
 -v /data/es/logs:/usr/share/elasticsearch/logs \
 -v /data/es/plugins:/usr/share/elasticsearch/plugins \
 -d elasticsearch:7.14.2

  5. Verify

http://192.168.139.160:9200/_cat/nodes?pretty

 

3, Configure the Chinese (IK) analyzer

Download the IK analyzer plugin matching your Elasticsearch version:
https://github.com/medcl/elasticsearch-analysis-ik/releases

Traditional way

1. Create ik directory

[root@bogon plugins]# mkdir -p  /usr/local/elasticsearch/plugins/ik

2. Unzip the downloaded plugin into the ik directory

[root@bogon ik]# unzip /root/elasticsearch-analysis-ik-7.15.2.zip -d /usr/local/elasticsearch/plugins/ik/

3. Start Elasticsearch and verify that the analyzer is loaded

  Docker mode

1. Unzip the plugin into the mounted plugins directory

[root@bogon plugins]# unzip /root/elasticsearch-analysis-ik-7.14.2.zip -d /data/es/plugins/ik

  2. Restart the container

[root@bogon plugins]# docker start elasticsearch 

3. Verification is the same as above; a programmatic check is sketched below
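
In either installation mode, one way to confirm that the IK plugin is loaded is to call the _analyze API with the ik_max_word (or ik_smart) analyzer that the plugin registers. A minimal sketch using the Java high-level REST client, assuming the node address used earlier; the class name and sample text are arbitrary:

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.AnalyzeRequest;
import org.elasticsearch.client.indices.AnalyzeResponse;

public class IkAnalyzerCheck {
    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("192.168.139.160", 9200, "http")))) {
            // ik_max_word and ik_smart are the analyzers provided by the IK plugin
            AnalyzeRequest request =
                    AnalyzeRequest.withGlobalAnalyzer("ik_max_word", "中华人民共和国国歌");
            AnalyzeResponse response = client.indices().analyze(request, RequestOptions.DEFAULT);
            // Print each token produced by the analyzer
            response.getTokens().forEach(token -> System.out.println(token.getTerm()));
        }
    }
}

If the plugin is installed correctly, the sample sentence is split into meaningful Chinese terms instead of single characters.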

4, Integrate Spring Boot

Maven dependencies

  <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
        <!-- Swagger dependency -->
        <dependency>
            <groupId>io.springfox</groupId>
            <artifactId>springfox-boot-starter</artifactId>
            <version>3.0.0</version>
        </dependency>
    </dependencies>
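
The helper classes below need a RestHighLevelClient. One way to provide it is to register the client as a Spring bean; a minimal sketch, assuming the three node addresses used earlier (the package, class, and bean names are illustrative, and the linked repository may configure this differently):

package com.xiaojie.es.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchConfig {

    //Register the high-level REST client as a bean; adjust the node addresses to your own cluster
    @Bean(destroyMethod = "close")
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(RestClient.builder(
                new HttpHost("192.168.139.160", 9200, "http"),
                new HttpHost("192.168.139.161", 9200, "http"),
                new HttpHost("192.168.139.162", 9200, "http")));
    }
}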

Core code

package com.xiaojie.es.service;

import com.xiaojie.es.entity.User;
import com.xiaojie.es.mapper.UserMapper;
import com.xiaojie.es.util.ElasticSearchUtils;
import org.apache.commons.lang3.RandomStringUtils;
import org.apache.commons.lang3.RandomUtils;
import org.apache.commons.lang3.StringUtils;
import org.elasticsearch.common.Strings;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.FetchSourceContext;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.List;
import java.util.Map;

/**
 * @Description:
 * @author: yan
 * @date: 2021.11.30
 */
@Service
public class UserService {
    @Autowired
    private UserMapper userMapper;
    @Autowired
    private ElasticSearchUtils elasticSearchUtils;

    //Add user
    public void add() throws IOException {
//        elasticSearchUtils.createIndex("user");
        for (int i = 0; i < 100; i++) {
            User user = new User();
            //Pool of characters used to generate random 3-character names
            String chars = "11 On June 29, in the women's doubles final of the 2021 World Table Tennis Championships in Houston, the United States, Chinese team sun yingsha Wang Manyu beat Japanese team ITO Meicheng Zaotian Xina 3-0 to win the championship";
            user.setName(RandomStringUtils.random(3, chars));
            user.setAge(RandomUtils.nextInt(18, 40));
            userMapper.add(user);
            //Add to es
            elasticSearchUtils.addData(user, "user");
        }
    }

    /**
     * @todo Query users
     * @author yan
     * @date 2021/11/30 16:24
     * @return java.util.List<java.util.Map<java.lang.String,java.lang.Object>>
     */
    public List<Map<String, Object>> search() throws IOException {
        //Build query criteria
        BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
        //Exact query (commented-out example)
        //boolQueryBuilder.must(QueryBuilders.termQuery("name", "Zhang San"));
        // Fuzzy query
        boolQueryBuilder.filter(QueryBuilders.wildcardQuery("name", "king"));
        // Range query: from/to are inclusive by default; gt = > (exclusive), gte = >= (inclusive), lt = < (exclusive), lte = <= (inclusive)
        boolQueryBuilder.filter(QueryBuilders.rangeQuery("age").from(18).to(32));
        SearchSourceBuilder query = new SearchSourceBuilder();
        query.query(boolQueryBuilder);
        //Fields to return; if left empty, all fields are returned by default
        String fields = "";
        //Fields to highlight
        String highlightField = "name";
        if (StringUtils.isNotBlank(fields)) {
            //Query only specific fields. If all fields need to be queried, this item is not set.
            query.fetchSource(new FetchSourceContext(true, fields.split(","), Strings.EMPTY_ARRAY));
        }
        //Paging parameter, equivalent to pageNum
        Integer from = 0;
        //Paging parameter, equivalent to pageSize
        Integer size = 10;
        //Set paging parameters
        query.from(from);
        query.size(size);
        //Set the sort field and order. Note: text fields must be sorted on their .keyword sub-field
        //query.sort("age", SortOrder.DESC);
        query.sort("name" + ".keyword", SortOrder.ASC);
        return elasticSearchUtils.searchListData("user", query, highlightField);
    }
}
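
The ElasticSearchUtils helper referenced above is not shown in this excerpt; the full version lives in the linked repository. Below is a possible minimal sketch that matches the calls made by UserService (createIndex, addData, and searchListData with highlighting), built on the high-level REST client; details such as the JSON conversion and highlight tags are assumptions:

package com.xiaojie.es.util;

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

@Component
public class ElasticSearchUtils {

    @Autowired
    private RestHighLevelClient client;

    private final ObjectMapper objectMapper = new ObjectMapper();

    //Create an index with default settings and mappings
    public void createIndex(String index) throws IOException {
        client.indices().create(new CreateIndexRequest(index), RequestOptions.DEFAULT);
    }

    //Serialize the entity to a map and index it into the given index
    public void addData(Object data, String index) throws IOException {
        Map<String, Object> source =
                objectMapper.convertValue(data, new TypeReference<Map<String, Object>>() {});
        client.index(new IndexRequest(index).source(source), RequestOptions.DEFAULT);
    }

    //Run the query and return each hit's source, replacing the highlighted field with its fragments
    public List<Map<String, Object>> searchListData(String index,
                                                    SearchSourceBuilder query,
                                                    String highlightField) throws IOException {
        if (highlightField != null && !highlightField.isEmpty()) {
            query.highlighter(new HighlightBuilder()
                    .field(highlightField)
                    .preTags("<span style='color:red'>")
                    .postTags("</span>"));
        }
        SearchResponse response =
                client.search(new SearchRequest(index).source(query), RequestOptions.DEFAULT);

        List<Map<String, Object>> result = new ArrayList<>();
        for (SearchHit hit : response.getHits().getHits()) {
            Map<String, Object> source = hit.getSourceAsMap();
            HighlightField field = hit.getHighlightFields().get(highlightField);
            if (field != null && field.fragments() != null) {
                StringBuilder sb = new StringBuilder();
                for (Text fragment : field.fragments()) {
                    sb.append(fragment);
                }
                source.put(highlightField, sb.toString());
            }
            result.add(source);
        }
        return result;
    }
}

In this sketch, searchListData overwrites the highlighted field in each hit's source with the highlighted fragments, which is why UserService can return the hit maps directly.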

Full code: see the es module of the author's spring-boot sample repository (which also integrates Redis, message middleware, and other examples).

References:

Elasticsearch Chinese documentation | Elasticsearch technology forum

https://www.cnblogs.com/wqp001/p/14478900.html
