Real-time monitoring system based on Stream Computing Oceanus and Elasticsearch Service

This article describes how to use Tencent Cloud big data components to design and implement a real-time monitoring system. By collecting and analyzing the CPU, memory and other resource consumption data of Cloud Virtual Machines (CVM) and their applications in real time, the stable operation of the system can be effectively guaranteed. Using cloud components such as Kafka, Flink and ES greatly reduces the development, operations and maintenance workload.

Scheme description

1. Overview

This scheme combines Tencent Cloud CKafka, Stream Computing Oceanus (Flink), Elasticsearch, Grafana and other services. The Filebeat tool in Beats collects system and application monitoring data in real time and sends it to CKafka; the CKafka data is then consumed by Stream Computing Oceanus (Flink), processed with simple business logic, and written to Elasticsearch. Finally, the results can be queried on the Kibana page, and CVM or business application metrics can be monitored with the cloud Grafana service.

Beats is a family of lightweight data shippers. It currently includes a variety of tools such as Metricbeat, Filebeat and Heartbeat. Each Beat has a simple task: collect logs or metrics and send them to the output destination.

2. Scheme structure

Preparation

Before implementing this scheme, please ensure that the corresponding big data components have been created and configured.

1. Create a private network VPC

A Virtual Private Cloud (VPC) is a logically isolated network space that you define on Tencent Cloud. When creating CKafka, Stream Computing Oceanus, Elasticsearch clusters and other services, it is recommended to choose the same VPC. For specific creation steps, please refer to the help documentation.

2. Create a CKafka instance

For CKafka, it is recommended to select the latest version 2.4.1, which has good compatibility with the Filebeat collection tool.

After the purchase, create the Kafka topic: topic-app-info.
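The topic can be created on the topic management page of the CKafka console. Alternatively, if the open-source Kafka command-line tools are available on a CVM in the same VPC, the following is a minimal sketch of creating it from the CLI (the broker address, partition and replica counts are placeholders, not values from this scheme):

# Create the topic used by this example (replace the broker address with the CKafka access point)
bin/kafka-topics.sh --create \
  --bootstrap-server xx.xx.xx.xx:xxxx \
  --partitions 1 --replication-factor 1 \
  --topic topic-app-info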

3. Create a Stream Computing Oceanus cluster

Stream Computing Oceanus is a powerful tool for real-time analysis in the big data product ecosystem. It is an enterprise-grade real-time big data analysis platform based on Apache Flink, featuring one-stop development, seamless connection, sub-second latency, low cost, security and stability. Stream Computing Oceanus aims to maximize the value of enterprise data and accelerate the real-time digitalization of enterprises.

Create a cluster on the Cluster Management -> New Cluster page of the Stream Computing Oceanus console. For specific steps, please refer to the help documentation.

4. Create Elasticsearch instance

On the Elasticsearch console, click [New] in the upper left corner to create a cluster. For specific steps, please refer to the help documentation.

5. Create an independent Grafana resource

The standalone Grafana service is currently in beta and must be purchased separately on the Grafana management page in order to display custom business monitoring metrics. When purchasing, you still need to select the same VPC as the other resources.

6. Install and configure Filebeat

Filebeat is a lightweight log data collection tool that collects log information by monitoring files at specified locations on CVM instances. Filebeat can be installed in two ways.

  • Installation method 1: download Filebeat from the official Filebeat download address and install it manually (a download sketch follows this list).
  • Installation method 2: use the Filebeat provided on the Elasticsearch management page -> Beats management.
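For reference, a minimal sketch of method 1 on a CVM, assuming the Linux x86_64 tarball of Filebeat 7.14.0 (the version that appears in the sample output later) and the standard Elastic artifacts URL:

# Download and unpack Filebeat (URL and version are assumptions; use the official download address)
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.14.0-linux-x86_64.tar.gz
tar -zxvf filebeat-7.14.0-linux-x86_64.tar.gz
cd filebeat-7.14.0-linux-x86_64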

In this example, method 1 is adopted. Download Filebeat to the CVM and configure it by adding the following configuration items to the filebeat.yml file:

# Monitoring log file configuration
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /tmp/test.log
# Monitoring data output item configuration
output.kafka:
  version: 2.0.0                           # Kafka protocol version (see the note below)
  hosts: ["xx.xx.xx.xx:xxxx"]              # Please fill in the actual broker IP address and port
  topic: 'topic-app-info'                  # Please fill in the actual topic

Please configure the corresponding filebeat.yml file according to the actual business requirements. Refer to Filebeat official documentation.

Note: the example uses CKafka version 2.4.1, and the version configured here is 2.0.0. With a mismatched version setting, an error such as "ERROR kafka kafka/client.go:341 Kafka (topic=topic-app-info): dropping invalid message" may appear.

Scheme implementation

Next, a case is used to show how to implement customized monitoring with Stream Computing Oceanus.

1. Filebeat collects data

(1) Enter the Filebeat root directory and start Filebeat to collect data. In this example, the CPU, memory and other information displayed by the top command is collected. You can also collect logs of JAR applications, JVM usage, listening ports, etc.; for details, refer to the official Filebeat documentation.

# filebeat startup
./filebeat -e -c filebeat.yml

# Write the monitoring system information to the test.log file
top -d 10 >>/tmp/test.log

(2) Go to the CKafka console and click [Message Query] on the left to query messages of the corresponding topic and verify whether data has been collected.
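If you prefer the command line, the data can also be checked with the open-source Kafka console consumer from a CVM in the same VPC (a sketch; the broker address is a placeholder):

# Consume a few messages from the topic to verify that Filebeat data is arriving
bin/kafka-console-consumer.sh --bootstrap-server xx.xx.xx.xx:xxxx \
  --topic topic-app-info --from-beginning --max-messages 5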

Format of data collected by Filebeat:

{
	"@timestamp": "2021-08-30T10:22:52.888Z",
	"@metadata": {
		"beat": "filebeat",
		"type": "_doc",
		"version": "7.14.0"
	},
	"input": {
		"type": "log"
	},
	"host": {
		"ip": ["xx.xx.xx.xx", "xx::xx:xx:xx:xx"],
		"mac": ["xx:xx:xx:xx:xx:xx"],
		"hostname": "xx.xx.xx.xx",
		"architecture": "x86_64",
		"os": {
			"type": "linux",
			"platform": "centos",
			"version": "7(Core)",
			"family": "redhat",
			"name": "CentOSLinux",
			"kernel": "3.10.0-1062.9.1.el7.x86_64",
			"codename": "Core"
		},
		"id": "0ea734564f9a4e2881b866b82d679dfc",
		"name": "xx.xx.xx.xx",
		"containerized": false
	},
	"agent": {
		"name": "xx.xx.xx.xx",
		"type": "filebeat",
		"version": "7.14.0",
		"hostname": "xx.xx.xx.xx",
		"ephemeral_id": "6c0922a6-17af-4474-9e88-1fc3b1c3b1a9",
		"id": "6b23463c-0654-4f8b-83a9-84ec75721311"
	},
	"ecs": {
		"version": "1.10.0"
	},
	"log": {
		"offset": 2449931,
		"file": {
			"path": "/tmp/test.log"
		}
	},
	"message": "(B[m16root0-20000S0.00.00:00.00kworker/1:0H(B[m[39;49m[K"
}

2. Create a Stream Computing Oceanus job

In Oceanus, the data ingested from Kafka is processed and stored in Elasticsearch.

(1) Define Source

Construct the Flink Source according to the format of the JSON messages produced by Filebeat.

 CREATE TABLE DataInput (
     `@timestamp` VARCHAR,
     `host`       ROW<id VARCHAR, ip ARRAY<VARCHAR>>,
     `log`        ROW<`offset` INTEGER, file ROW<path VARCHAR>>,
     `message`    VARCHAR
 ) WITH (
     'connector' = 'kafka',         -- Optional 'kafka' or 'kafka-0.11'; select the matching built-in Connector
     'topic' = 'topic-app-info',    -- Replace with the topic you want to consume
     'scan.startup.mode' = 'earliest-offset',            -- One of latest-offset / earliest-offset / specific-offsets / group-offsets
     'properties.bootstrap.servers' = '10.0.0.29:9092',  -- Replace with your Kafka connection address
     'properties.group.id' = 'oceanus_group2',           -- Required parameter; be sure to specify the Group ID
     -- Define the data format (JSON)
     'format' = 'json',
     'json.ignore-parse-errors' = 'true',     -- Ignore JSON structure parsing errors
     'json.fail-on-missing-field' = 'false'   -- If true, an error is raised when a field is missing; if false, missing fields are set to null
 );

(2) Define Sink

CREATE TABLE es_output (
    `id`         VARCHAR,
    `ip`         ARRAY<VARCHAR>,
    `path`       VARCHAR,
    `num`        INTEGER,
    `message`    VARCHAR,
    `createTime` VARCHAR
) WITH (
    'connector.type' = 'elasticsearch',             -- Output to Elasticsearch
    'connector.version' = '6',                      -- Specify the Elasticsearch version, e.g. '6' or '7'
    'connector.hosts' = 'http://10.0.0.175:9200',   -- Elasticsearch connection address
    'connector.index' = 'oceanus_test2',            -- Elasticsearch index name
    'connector.document-type' = '_doc',             -- Elasticsearch document type
    'connector.username' = 'elastic',
    'connector.password' = 'yourpassword',
    'update-mode' = 'upsert',                       -- 'append' mode (without primary key) or 'upsert' mode (with primary key)
    'connector.key-delimiter' = '$',                -- Optional; delimiter for composite primary keys (default is '_', e.g. key1_key2_key3)
    'connector.key-null-literal' = 'n/a',           -- Literal used when the primary key is null; default is 'null'
    'connector.failure-handler' = 'retry-rejected', -- Optional error handling: 'fail' (throw an exception), 'ignore' (ignore all errors) or 'retry-rejected' (retry)
    'connector.flush-on-checkpoint' = 'true',       -- Optional; disallow batch flush during snapshots; default is true
    'connector.bulk-flush.max-actions' = '42',      -- Optional; maximum number of records per batch
    'connector.bulk-flush.max-size' = '42 mb',      -- Optional; maximum accumulated size per batch (only mb is supported)
    'connector.bulk-flush.interval' = '60000',      -- Optional; interval between bulk writes (ms)
    'connector.connection-max-retry-timeout' = '1000',    -- Maximum timeout per request (ms)
    'format.type' = 'json'                          -- Output data format; currently only 'json' is supported
);

(3) Business logic

INSERT INTO es_output
SELECT 
  host.id       AS `id`,
  host.ip       AS `ip`,
  log.file.path AS `path`,
  log.`offset`  AS `num`,
  message,
  `@timestamp`  AS `createTime`
FROM DataInput;

(4) Operation parameters

[Built-in Connector] Select flink-connector-elasticsearch6 and flink-connector-kafka.

Note: new Flink 1.13 clusters do not require users to select the built-in Connector; older clusters require selecting the Connector of the corresponding version.

(5) ES data query

Query the data on the Kibana page of the ES console, or log in to a CVM in the same subnet and use the following command:

# Query the index; replace username:password with the actual account and password
curl -XGET -u username:password http://xx.xx.xx.xx:xxxx/oceanus_test2/_search -H 'Content-Type: application/json' -d'
{
    "query": { "match_all": {}},
    "size":  10
}
'

For more access methods, please refer to Access ES cluster.

3. Business indicator monitoring

The application business data collected by Filebeat has now been processed by the Oceanus service and stored in ES; the business data can be monitored with ES + Grafana.

(1) Configure the ES data source in Grafana. Enter the Grafana console (currently in grayscale release), open the newly created Grafana service, find its public network address, open it and log in. The Grafana account is admin. After logging in, click [Configuration], click [Add data source], search for elasticsearch, fill in the relevant ES instance information, and add the data source.

(2) Click [Dashboards] on the left, click [Manage], and click [New Dashboard] in the upper right corner to create a new panel and edit the panel.

(3) The display effect is as follows:

  • Real-time monitoring of total data written: monitors the total amount of data written to the data source;
  • Real-time monitoring of a single data source: monitors the amount of data written from a specific log;
  • Field average value monitoring: monitors the average value of a field;
  • num field maximum value monitoring: monitors the maximum value of the num field (see the query sketch below).

Note: this section is only an example and has no actual business meaning
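For reference, the "num field maximum value" panel corresponds to an aggregation of the following form against the oceanus_test2 index written above (a sketch of the kind of query Grafana issues; replace the account, password and address with actual values):

# Maximum value of the num field, similar to what the Grafana panel queries
curl -XGET -u username:password http://xx.xx.xx.xx:xxxx/oceanus_test2/_search -H 'Content-Type: application/json' -d'
{
    "size": 0,
    "aggs": { "max_num": { "max": { "field": "num" } } }
}
'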

Summary

In this scheme, CVM system data is collected in real time by the Filebeat tool and sent to CKafka; the data is then extracted, cleaned and transformed by Stream Computing Oceanus and stored in ES; finally, the data in ES is monitored and displayed in real time by Grafana. Note that:

  • There is no strict correspondence between the CKafka version and the open-source Kafka version. In this scheme, CKafka 2.4.1 and open-source Filebeat 7.14 were debugged successfully.
  • The Prometheus service in Cloud Monitoring already embeds a Grafana monitoring service, but it does not support custom data sources; that embedded Grafana can only access Prometheus. Connecting ES data to Grafana can only be done with the standalone (beta) Grafana service.
