Ambari 2.7.4.0 + HDP 3.1.4.0 installation, HDFS data backup and recovery, Hive data backup and recovery, HBase data backup and recovery

Contents

1 Ambari + HDP offline installation
1.1 INTRODUCTION
1.1.1 introduction to ambari
1.1.2 HDP
1.1.3 HDP-UTILS
1.2 address of ambari official website
1.3 Ambari and HDP Downloads
1.4 system requirements
1.4.1 software requirements
1.5 modify the maximum number of open files
1.6 cluster node planning
1.7 firewall settings
1.8 turn off selinux
1.9 install jdk
1.10 setting hostname
1.11 set up Alibaba open source image yum source
1.12 install time synchronization service (ntp)
1.13 install mysql
1.13.1 uninstall the original mysql
1.13.2 preliminary preparation
1.13.3 use yum command to complete the installation
1.13.4 login
1.13.5 you need to change permissions to connect to MYSQL database remotely
1.14 create corresponding users and DB in mysql database
1.14.1 create ambari database and user name and password of the database
1.14.2 create hive database and user name and password of hive database
1.14.3 user name and password to create oozie database and oozie database
1.14.4 when using ranger, you need to configure the database
1.14.5 configure SAM and schema registry metadata stores in MySQL
1.14.6 Druid and Superset need relational data store to store metadata. Use the following command to create
1.14.7 download MySQL connection Java (executed on all three servers):
1.15 install Ambari on bigdata1 machine
1.15.1 install yum related tools
1.15.2 install Apache httpd
1.16 configure local Repo
1.16.1 configuring Ambari
1.16.2 configure HDP and HDP-UTILS
1.16.3 configure repo of HDP-GPL
1.16.4 distribution of Ambari.repo and HDP.repo
1.16.5 generate local source
1.17 install ambari server
1.17.1 Bigdata1 node installation
1.17.2 log in to MySQL and initialize the table of ambari
1.17.3 start ambari server
1.18 install Agent
1.19 visit Ambari web page
1.20 start cluster installation
1.21 hadoop HA configuration
1.22 other knowledge points
1.22.1 start, view status and stop of ambari server
1.22.2 data configuration detection during ambari startup
1.22.3 precautions when restarting after modifying the yarn configuration
1.22.4 get Ambari's operation log:
1.1. Ambari uninstallation
1.1.1. Close all components
1.1.2. Close ambari server and ambari agent
1.1.3. yum delete all Ambari components
1.1.4. Delete various files and users
1.1.5. Clean up the database
1.1.6. Reinstall ambari
1.23 other configurations
1.23.1 setting dfs permissions
1.23.2 Cannot modify mapred.job.name at runtime of hive
1.23.3 modify the file format in hive-site.xml to solve the problem of error in importing SQOOP
1.23.4 /usr/hdp/3.1.4.0-315/accumulo does not exist! Accumulo imports will fail
1.23.5 solutions to problems related to lack of access of spark to hive database
1.23.6 sqoop imports MySql data into hive, which is stuck
1.23.7 kernel:NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s!
1.23.8 Spark cannot read the attributes in hive3.x:
1.23.9 sqoop imports mysql data into hive, but spark SQL cannot access it
1.23.10 configure hive to integrate hbase (defined in hive's custom configuration)
1.23.11 error communicating with the metadata (state = 42000, code = 10280) after hive data migration
1.23.12 KeeperErrorCode = NoNode for /hbase/hbaseid problem
1.23.13 configure virtual vcores
1.23.14 when sqoop imports data from oracle, the following problems occur due to the ojdbc version
1.23.15 solve the permission problem of hive add jar
1.23.16 problems in integration of Ogg and Ambari Hbase 2.0.2
1.23.17 configure hive.aux.jars.path on hive interface
1.23.18 can't start with home(s) in ambari
1.23.19 Caused by: java.lang.OutOfMemoryError: unable to create new native thread
1.23.20 /usr/bin/hdp-select set zookeeper-server 3.1.4.0-315' returned 1. symlink target /usr/hdp/current/zookeeper-server for zookeeper already exists and it is not a symlink
1.23.21 set the number of copies of Block replication to 3
1.23.22 Error: java.lang.IllegalArgumentException: KeyValue size too large
1.23.23 solve 475175982519323-1/-ext-10000/000000_ : the file is not owned by hive and load data is also not ran as hive
1.24 Hive1.1.0 upgrade to hive3.x
1.24.1 find two versions of hive
1.24.2 upgrade hive library 1.1.0 to hive 3.1.0
1.25 backup and recovery of HDFS data
1.26 hbase data backup and recovery
1.26.1 Export process
1.26.2 Import process
1.26.3 count the number of hbase table rows
1.27 Hive and ElasticSearch integration (on 3 servers)
1.27.1 configure auxlib
1.27.2 verify the configuration is correct
1.28 oracle golden gate and Ambari hbase integration
1.29 references

1 Ambari + HDP offline installation

1.1 INTRODUCTION

1.1.1 introduction to ambari

Like Hadoop and other open-source software, Ambari is a project of the Apache Software Foundation, and a top-level project at that. Functionally, Ambari creates, manages and monitors Hadoop clusters. Here "Hadoop" means the whole Hadoop ecosystem (Hive, HBase, Sqoop, Zookeeper, etc.), not just Hadoop itself. In short, Ambari is a tool that makes Hadoop and related big-data software easier to use.

Ambari itself is also a distributed application, composed of two main parts: Ambari Server and Ambari Agent. Simply put, the user tells Ambari Server what to install, and Ambari Server instructs the Ambari Agents to install the corresponding software; each Agent periodically reports the status of the software modules on its machine back to Ambari Server, and these statuses are displayed in the Ambari GUI so that users can see the state of the cluster and perform the necessary maintenance.

1.1.2 HDP

HDP is Hortonworks' software stack; it bundles the projects of the Hadoop ecosystem, such as HBase, Zookeeper, Hive, Pig, etc.

1.1.3 HDP-UTILS

HDP-UTILS is a tool class library.

1.2 address of ambari official website

https://docs.cloudera.com/HDPDocuments/Ambari/Ambari-2.4.2.0/index.html

Click Ambari to enter:

Click Installation

Click Apache Ambari Installation:

1.3 Ambari and HDP Downloads

Click Ambari Repositories:
Namely: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.4.0/bk_ambari-installation/content/ambari_repositories.html

Click HDP3.1.4 Repositories to download HDP related content:
Namely: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.4.0/bk_ambari-installation/content/hdp_314_repositories.html

1.4 system requirements

1.4.1 software requirements

1.5 modify the maximum number of open files

The related configuration is as follows.

1. Modify the maximum number of open files allowed by the Linux operating system.
The file to modify is /etc/security/limits.conf; add:

* soft nofile 162144
* hard nofile 162144
# change the memory-lock limit of linux (for the es user)
es soft memlock unlimited
es hard memlock unlimited

2. Modify the maximum number of processes/threads in Linux.
The file to modify is /etc/security/limits.d/20-nproc.conf:

* soft nproc unlimited
root soft nproc unlimited

3. Modify /etc/sysctl.conf on all cluster nodes (as the root user).
Increase the maximum number of memory map areas a process may have, and reduce swapping; add or modify:

vm.max_map_count = 162144
vm.swappiness = 1

And execute the following command:
sysctl -p

1.6 cluster node planning

1.7 firewall settings

systemctl status firewalld.service	        # View the status of the firewall
systemctl stop firewalld.service		    # Turn off firewall
systemctl disable firewalld.service		    # Set startup not to start
systemctl is-enabled firewalld.service		# Check whether the firewall service is set to power on and start

1.8 turn off selinux

https://www.linuxidc.com/Linux/2016-11/137723.htm
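If you prefer not to follow the link, here is a minimal sketch of the usual steps on CentOS 7 (run as root on every node; the exact contents of /etc/selinux/config may differ on your system):

setenforce 0                                                    # disable SELinux for the current session
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config    # disable it permanently
sestatus                                                        # verify; reports "disabled" after a reboot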

1.9 install jdk

Install jdk8 + (omitted here)
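For completeness, a minimal sketch of a manual JDK 8 installation (the archive name is a hypothetical example; the /home/installed path matches the JAVA_HOME used later in ambari-server setup):

mkdir -p /home/installed
tar -zxvf jdk-8u161-linux-x64.tar.gz -C /home/installed/    # hypothetical archive name
cat >> /etc/profile <<'EOF'
export JAVA_HOME=/home/installed/jdk1.8.0_161
export PATH=$JAVA_HOME/bin:$PATH
EOF
source /etc/profile
java -version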

1.10 setting hostname

[root@bigdata2 installed]# cat /etc/hosts
192.168.106.128    bigdata1
192.168.106.129    bigdata2
192.168.106.130    bigdata3
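Besides /etc/hosts, each node's own hostname has to match the entries above; a sketch (run the matching command on each node, then distribute the hosts file):

hostnamectl set-hostname bigdata1        # on 192.168.106.128
hostnamectl set-hostname bigdata2        # on 192.168.106.129
hostnamectl set-hostname bigdata3        # on 192.168.106.130
scp /etc/hosts root@bigdata2:/etc/hosts
scp /etc/hosts root@bigdata3:/etc/hosts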

1.11 set up Alibaba open source image yum source

Visit: https://developer.aliyun.com/mirror

Click to enter CentOS image: https://developer.aliyun.com/mirror/centos?spm=a2c6h.13651102.0.0.53322f70HdVnlQ


Run the following command on Linux:

mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
yum makecache

// Optional: the following does not have to be executed
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo

1.12 install time synchronization service (ntp)

Keep the time on Linux consistent with the network time, and avoid some time inconsistency problems (must be executed under the root command)

[root@bigdata1 ~]# yum install -y ntp
[root@bigdata1 ~]# ntpdate pool.ntp.org && hwclock -w
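ntpdate only syncs the clock once; to keep the clocks in sync it is usual to also enable the ntpd service (a sketch, run on every node):

systemctl enable ntpd      # start ntpd at boot
systemctl start ntpd
ntpq -p                    # check that time peers are reachable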

1.13 install mysql

1.13.1 uninstall the original mysql

Please refer to: https://blog.csdn.net/zhwyj1019/article/details/80274269. The main steps are as follows:

Check whether mysql is already installed:

rpm -qa | grep -i mysql       # check method 1
yum list installed mysql*     # check method 2

# Uninstall the mysql packages
yum remove mysql mysql-server mysql-libs compat-mysql51
yum remove mysql-community-release
rpm -e --nodeps mysql-community-libs-5.7.22-1.el7.x86_64
rpm -e --nodeps mysql57-community-release-el7-11.noarch

1.13.2 preliminary preparation

There is no mysql in the default yum repositories of CentOS, so you need to download the yum repo configuration file from the MySQL official website.
Download command:

wget https://dev.mysql.com/get/mysql57-community-release-el7-9.noarch.rpm

And then install repo

rpm -ivh mysql57-community-release-el7-9.noarch.rpm

After execution, two repo files mysql-community.repo mysql-community-source.repo will be generated in the directory / etc/yum.repos.d/

1.13.3 use yum command to complete the installation

Note: you must enter the /etc/yum.repos.d/ directory before executing the following script

1.13.3.1 installation command
yum install -y mysql-server
1.13.3.2 start mysql
systemctl start mysqld #Start MySQL
1.13.3.3 obtain the temporary password during installation (this password is used for the first login)

Be sure to start mysql first

[root@bigdata1 yum.repos.d]# grep 'temporary password' /var/log/mysqld.log
2019-12-18T06:14:31.848884Z 1 [Note] A temporary password is generated for root@localhost: hR_is6(nhhtt
[root@bigdata1 yum.repos.d]#

1.13.4 login

Login method:

[root@bigdata1 yum.repos.d]# mysql -u root -p
Enter password:      # Enter the temporary password obtained above
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.28

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
ERROR 1819 (HY000): Your password does not satisfy the current policy requirements
mysql> use mysql
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.
mysql> update user set authentication_string = PASSWORD('123456') where user = 'root';
ERROR 1046 (3D000): No database selected
//If the above errors occur, the solution is:
mysql> set global validate_password_policy=0;
Query OK, 0 rows affected (0.00 sec)

mysql> set global validate_password_length=1;
Query OK, 0 rows affected (0.00 sec)

mysql> alter user 'root'@'localhost' identified by 'Admin123456';
Query OK, 0 rows affected (0.00 sec)

mysql>
1.13.4.1 modify mysql configuration

Modify /etc/my.cnf (append the configuration shown in the last lines below)

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

collation_server=utf8_general_ci
character_set_server=utf8
default-storage-engine=INNODB
validate_password_policy=0
validate_password_length=1

[client]
default-character-set=utf8


Restart mysql:

systemctl restart mysqld
1.13.4.2 Possible values of validate_password_policy/validate_password_length (this subsection is general MySQL background, not part of the Ambari installation)


validate_password_policy can be 0 (LOW), 1 (MEDIUM) or 2 (STRONG). The default is 1, i.e. MEDIUM, so the password you set initially must meet the length requirement and must contain digits, lowercase and uppercase letters, and special characters.
Sometimes, for a test environment, you do not want such a complicated password; for example, you just want to set the root password to 123456.
Two global parameters must be modified.
First, change the value of the validate_password_policy parameter:

mysql> set global validate_password_policy=0;

With this setting the only criterion for a password is its length, which is controlled by the validate_password_length parameter.

mysql> select @@validate_password_length;
+----------------------------+
| @@validate_password_length |
+----------------------------+
|                          8 |
+----------------------------+
1 row in set (0.01 sec)
The default value of validate_password_length is 8. It has a lower bound, computed as:
validate_password_number_count
+ validate_password_special_char_count
+ (2 * validate_password_mixed_case_count)

Here validate_password_number_count specifies the minimum number of digits in the password, validate_password_special_char_count the minimum number of special characters, and validate_password_mixed_case_count the minimum number of upper- and lowercase letters.

Each of these parameters defaults to 1, so the minimum value of validate_password_length is 4. If you explicitly set validate_password_length to a value smaller than 4, no error is reported, but the value is silently set to 4, as follows:

mysql> select @@validate_password_length;
+----------------------------+
| @@validate_password_length |
+----------------------------+
|                          8 |
+----------------------------+
1 row in set (0.01 sec)

mysql> set global validate_password_length=1;
Query OK, 0 rows affected (0.00 sec)

mysql> select @@validate_password_length;
+----------------------------+
| @@validate_password_length |
+----------------------------+
|                          4 |
+----------------------------+
1 row in set (0.00 sec)

If you change any of validate_password_number_count, validate_password_special_char_count or validate_password_mixed_case_count, validate_password_length is adjusted dynamically:
mysql> select @@validate_password_length;
+----------------------------+
| @@validate_password_length |
+----------------------------+
|                          4 |
+----------------------------+
row in set (0.00 sec)

mysql> select @@validate_password_mixed_case_count;
+--------------------------------------+
| @@validate_password_mixed_case_count |
+--------------------------------------+
|                                    1 |
+--------------------------------------+
row in set (0.00 sec)

mysql> set global validate_password_mixed_case_count=2;
Query OK, 0 rows affected (0.00 sec)

mysql> select @@validate_password_mixed_case_count;
+--------------------------------------+
| @@validate_password_mixed_case_count |
+--------------------------------------+
|                                    2 |
+--------------------------------------+
row in set (0.00 sec)

mysql> select @@validate_password_length;
+----------------------------+
| @@validate_password_length |
+----------------------------+
|                          6 |
+----------------------------+
row in set (0.00 sec)

Of course, the precondition is that the validate_password plugin is installed; MySQL 5.7 installs it by default.
How do you verify that the validate_password plugin is installed? Check the following variables; if the plugin is not installed, the output will be empty.

mysql> SHOW VARIABLES LIKE 'validate_password%';
+--------------------------------------+-------+
| Variable_name                        | Value |
+--------------------------------------+-------+
| validate_password_dictionary_file    |       |
| validate_password_length             | 6     |
| validate_password_mixed_case_count   | 2     |
| validate_password_number_count       | 1     |
| validate_password_policy             | LOW   |
| validate_password_special_char_count | 1     |
+--------------------------------------+-------+
rows in set (0.00 sec)

1.13.5 you need to change permissions to connect to MYSQL database remotely

You can confirm this by:

mysql -u root -p     #Next, enter the password: Admin123456
mysql> alter user 'root'@'localhost' identified by 'Admin123456';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all privileges on *.* to 'root'@'%' identified by 'Admin123456' with grant option;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql>

To view the character set:

mysql> show variables like 'character_set_%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.01 sec)

mysql>

1.14 create corresponding users and DB in mysql database

1.14.1 create ambari database and user name and password of the database

mysql> set global validate_password_policy=0;
mysql> set global validate_password_length=1;

mysql> create database ambari character set utf8;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE USER 'ambari'@'%' IDENTIFIED BY 'Admin123456';
Query OK, 0 rows affected (0.00 sec)

mysql> GRANT ALL PRIVILEGES ON ambari.* TO 'ambari'@'%';
Query OK, 0 rows affected (0.00 sec)

mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.01 sec)

1.14.2 create hive database and user name and password of hive database

mysql> create database hive character set utf8;
Query OK, 1 row affected (0.00 sec)

mysql> CREATE USER 'hive'@'%' IDENTIFIED BY 'Admin123456';
Query OK, 0 rows affected (0.00 sec)

mysql> GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
Query OK, 0 rows affected (0.00 sec)

mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)

To avoid garbled (mojibake) Chinese comments in Hive, do the following.
The metastore database and its tables default to the latin1 character set, so we only need to change the character set of the columns that store comments from latin1 to utf8. Comments are used in three places: tables, partitions and views. The fix consists of two steps:
(1) Connect to the metastore database and execute the following 5 SQL statements.

① Modify table field notes and table notes

ALTER TABLE COLUMNS_V2 MODIFY COLUMN COMMENT VARCHAR(256) CHARACTER SET utf8;
ALTER TABLE TABLE_PARAMS MODIFY COLUMN PARAM_VALUE VARCHAR(4000) CHARACTER SET utf8;

② Modify partition field comments:

ALTER TABLE PARTITION_PARAMS MODIFY COLUMN PARAM_VALUE VARCHAR(4000) CHARACTER SET utf8 ;
ALTER TABLE PARTITION_KEYS MODIFY COLUMN PKEY_COMMENT VARCHAR(4000) CHARACTER SET utf8;

③ Modify index comments:
alter table INDEX_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;

If that is not enough, modify /etc/my.cnf and add the following:

[mysqld]
character-set-server=utf8 
[client]
default-character-set=utf8 
[mysql]
default-character-set=utf8

Pay attention to the order, and then restart mysql

systemctl restart mysqld.service

Or:

sudo service mysqld restart

1.14.3 user name and password to create oozie database and oozie database

mysql> create database oozie character set utf8;
Query OK, 1 row affected (0.00 sec)
 
mysql> CREATE USER 'oozie'@'%' IDENTIFIED BY 'Admin123456';
Query OK, 0 rows affected (0.00 sec)
 
mysql>  GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%';
Query OK, 0 rows affected (0.00 sec)
 
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)

1.14.4 when using ranger, you need to configure the database

CREATE USER 'rangerdba'@'localhost' IDENTIFIED BY 'Admin123456';

GRANT ALL PRIVILEGES ON *.* TO 'rangerdba'@'localhost';

CREATE USER 'rangerdba'@'%' IDENTIFIED BY 'Admin123456';

GRANT ALL PRIVILEGES ON *.* TO 'rangerdba'@'%';

GRANT ALL PRIVILEGES ON *.* TO 'rangerdba'@'localhost' WITH GRANT OPTION;

GRANT ALL PRIVILEGES ON *.* TO 'rangerdba'@'%' WITH GRANT OPTION;

FLUSH PRIVILEGES;

Note that when installing Ranger you must add the following to mysql's my.cnf:

log_bin_trust_function_creators=1

After Ranger is installed, reset log_bin_trust_function_creators to its original value; this parameter is only needed during the Ranger installation.

1.14.5 configure SAM and schema registry metadata stores in MySQL

create database registry;
create database streamline;

Create user accounts for Registry and SAM, and assign a new password using IDENTIFIED BY:

CREATE USER 'registry'@'%' IDENTIFIED BY 'Admin123456';
CREATE USER 'streamline'@'%' IDENTIFIED BY 'Admin123456';

Grant privileges:

GRANT ALL PRIVILEGES ON registry.* TO 'registry'@'%' WITH GRANT OPTION ;
GRANT ALL PRIVILEGES ON streamline.* TO 'streamline'@'%' WITH GRANT OPTION;

Commit the transaction:

commit;

1.14.6 Druid and Superset need a relational data store to store metadata. Use the following commands to create it

CREATE DATABASE druid DEFAULT CHARACTER SET utf8;
CREATE DATABASE superset DEFAULT CHARACTER SET utf8;

//Create user and assign password
CREATE USER 'druid'@'%' IDENTIFIED BY 'Admin123456';
CREATE USER 'superset'@'%' IDENTIFIED BY 'Admin123456';

//Permissions
GRANT ALL PRIVILEGES ON *.* TO 'druid'@'%' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'superset'@'%' WITH GRANT OPTION;

//Submission
commit;

1.14.7 download MySQL connection Java (executed on all three servers):

yum -y install mysql-connector-java

Check the /usr/share/java directory to confirm that mysql-connector-java is present:

[root@bigdata1 ~]# ls -a /usr/share/java

1.15 install Ambari on bigdata1 machine

1.15.1 install yum related tools

yum install yum-utils -y
yum repolist
yum install createrepo -y

1.15.2 install Apache httpd

Upload the ambari, HDP and HDP-UTILS packages required for the ambari installation to /home/software, as follows:

Install httpd online using yum.
[root@bigdata1 software]# yum install httpd -y

After the installation, the /var/www/html directory (equivalent to Tomcat's webapps directory) is created. Enter /var/www/html and create the ambari and hdp directories to hold the installation files.

[root@bigdata1 software]# cd /home/software
[root@bigdata1 software]# mkdir /var/www/html/ambari
[root@bigdata1 software]# mkdir /var/www/html/hdp
[root@bigdata1 software]# mkdir /var/www/html/hdp/HDP-UTILS-1.1.0.22
[root@bigdata1 software]# mkdir /var/www/html/hdp/HDP-GPL-3.1.4.0
[root@bigdata1 software]# tar -zxvf ambari-2.7.4.0-centos7.tar.gz -C /var/www/html/ambari/
[root@bigdata1 software]# tar -zxvf HDP-3.1.4.0-centos7-rpm.tar.gz -C /var/www/html/hdp/
[root@bigdata1 software]# tar -zxvf HDP-UTILS-1.1.0.22-centos7.tar.gz -C /var/www/html/hdp/HDP-UTILS-1.1.0.22/
[root@bigdata1 software]# tar -zxvf HDP-GPL-3.1.4.0-centos7-gpl.tar.gz -C /var/www/html/hdp/HDP-GPL-3.1.4.0

To start the httpd service:

systemctl start httpd          # Start httpd
systemctl status httpd		   # View httpd status
systemctl enable httpd	       # Set httpd to start automatically

Default port 80, browser input: http://192.168.106.128/


1.16 configure local Repo

1.16.1 configuring Ambari

download

wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.4.0/ambari.repo -O /etc/yum.repos.d/ambari.repo

Modify the configuration file vim /etc/yum.repos.d/ambari.repo

#VERSION_NUMBER=2.7.4.0-118
[ambari-2.7.4.0]
#json.url = http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json
name=ambari Version - ambari-2.7.4.0
#baseurl=http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.4.0
baseurl=http://192.168.106.128/ambari/ambari/centos7/
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.4.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://192.168.106.128/ambari/ambari/centos7/2.7.4.0-118/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

1.16.2 configure HDP and HDP-UTILS

Create configuration file: touch /etc/yum.repos.d/HDP.repo

Download repo (the download address can be found in the place where the tar package is downloaded):

wget -nv http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.4.0/hdp.repo -O /etc/yum.repos.d/HDP.repo

Modification content:

#VERSION_NUMBER=3.1.4.0-315
[HDP-3.1.4.0]
name=HDP Version - HDP-3.1.4.0
#baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.4.0
baseurl=http://192.168.106.128/hdp/HDP/centos7/
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.4.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://192.168.106.128/hdp/HDP/centos7/3.1.4.0-315/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1


[HDP-UTILS-1.1.0.22]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.22
#baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7
baseurl=http://192.168.106.128/hdp/HDP-UTILS-1.1.0.22/
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.4.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://192.168.106.128/hdp/HDP-UTILS-1.1.0.22/HDP-UTILS/centos7/1.1.0.22/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

1.16.3 configure repo of HDP-GPL

wget -nv http://public-repo-1.hortonworks.com/HDP-GPL/centos7/3.x/updates/3.1.4.0/hdp.gpl.repo -O /etc/yum.repos.d/hdp.gpl.repo

Modification:

[root@bigdata1 yum.repos.d]# cd /etc/yum.repos.d
[root@bigdata1 yum.repos.d]# vim hdp.gpl.repo

Modification:

#VERSION_NUMBER=3.1.4.0-315
[HDP-GPL-3.1.4.0]
name=HDP-GPL Version - HDP-GPL-3.1.4.0
#baseurl=http://public-repo-1.hortonworks.com/HDP-GPL/centos7/3.x/updates/3.1.4.0
baseurl=http://192.168.106.128/hdp/HDP-GPL-3.1.4.0/HDP-GPL/centos7/
gpgcheck=1
#gpgkey=http://public-repo-1.hortonworks.com/HDP-GPL/centos7/3.x/updates/3.1.4.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
gpgkey=http://192.168.106.128/hdp/HDP-GPL-3.1.4.0/HDP-GPL/centos7/3.1.4.0-315/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

1.16.4 distribution of Ambari.repo and HDP.repo

Distribute ambari.repo HDP.repo to the same directory of each node:

[root@bigdata1 yum.repos.d]# scp ambari.repo HDP.repo root@bigdata2:$PWD
ambari.repo                                                                                               100%  516    33.7KB/s   00:00    
HDP.repo                                                                                                  100%  845    52.4KB/s   00:00    
[root@bigdata1 yum.repos.d]# scp ambari.repo HDP.repo root@bigdata3:$PWD
ambari.repo                                                                                               100%  516   130.5KB/s   00:00    
HDP.repo                                                                                                  100%  845   272.2KB/s   00:00    
[root@bigdata1 yum.repos.d]# scp hdp.gpl.repo root@bigdata2:$PWD
hdp.gpl.repo                                                                             100%  490    60.0KB/s   00:00    
[root@bigdata1 yum.repos.d]# scp hdp.gpl.repo root@bigdata3:$PWD
hdp.gpl.repo

1.16.5 generate local source

Use the createrepo command to create a yum local source (software warehouse), that is, to build an index for many rpm packages stored in a specific local location, describe the dependency information required by each package, and form metadata.

[root@bigdata1 yum.repos.d]# createrepo /var/www/html/hdp/HDP/centos7/
Spawning worker 0 with 51 pkgs
Spawning worker 1 with 50 pkgs
Spawning worker 2 with 50 pkgs
Spawning worker 3 with 50 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete

[root@bigdata1 yum.repos.d]# createrepo /var/www/html/hdp/HDP-UTILS-1.1.0.22/
Spawning worker 0 with 4 pkgs
Spawning worker 1 with 4 pkgs
Spawning worker 2 with 4 pkgs
Spawning worker 3 with 4 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete


[root@bigdata1 yum.repos.d]# createrepo /var/www/html/ambari/ambari/centos7/
Spawning worker 0 with 4 pkgs
Spawning worker 1 with 3 pkgs
Spawning worker 2 with 3 pkgs
Spawning worker 3 with 3 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete

[root@bigdata1 yum.repos.d]# createrepo /var/www/html/hdp/HDP-GPL-3.1.4.0/HDP-GPL/centos7
Spawning worker 0 with 1 pkgs
Spawning worker 1 with 1 pkgs
Spawning worker 2 with 1 pkgs
Spawning worker 3 with 1 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete

Because this is an offline installation and we only want to use our own repos, move any other repo files under /etc/yum.repos.d/ to a backup directory, and then execute:

yum makecache
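As an optional sanity check (repo IDs taken from the .repo files configured above), confirm that the local repositories are now the ones being used:

yum repolist enabled | grep -Ei 'ambari|HDP'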

The final effect is as follows:

1.17 install ambari server

1.17.1 Bigdata1 node installation

[root@bigdata1 ~]# yum install -y ambari-server
Loaded plugins: fastestmirror
HDP-3.1.4.0                                                                                                          | 2.9 kB  00:00:00     
HDP-UTILS-1.1.0.22                                                                                                   | 2.9 kB  00:00:00     
ambari-2.7.4.0                                                                                                       | 2.9 kB  00:00:00     
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package ambari-server.x86_64 0:2.7.4.0-118 will be installed
--> Processing Dependency: postgresql-server >= 8.1 for package: ambari-server-2.7.4.0-118.x86_64
--> Running transaction check
---> Package postgresql-server.x86_64 0:9.2.24-1.el7_5 will be installed
--> Processing Dependency: postgresql-libs(x86-64) = 9.2.24-1.el7_5 for package: postgresql-server-9.2.24-1.el7_5.x86_64
--> Processing Dependency: postgresql(x86-64) = 9.2.24-1.el7_5 for package: postgresql-server-9.2.24-1.el7_5.x86_64
--> Processing Dependency: libpq.so.5()(64bit) for package: postgresql-server-9.2.24-1.el7_5.x86_64
--> Running transaction check
---> Package postgresql.x86_64 0:9.2.24-1.el7_5 will be installed
---> Package postgresql-libs.x86_64 0:9.2.24-1.el7_5 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

============================================================================================================================================
 Package                              Arch                      Version                             Repository                        Size
============================================================================================================================================
Installing:
 ambari-server                        x86_64                    2.7.4.0-118                         ambari-2.7.4.0                    370 M
Installing for dependencies:
 postgresql                           x86_64                    9.2.24-1.el7_5                      base                              3.0 M
 postgresql-libs                      x86_64                    9.2.24-1.el7_5                      base                              234 k
 postgresql-server                    x86_64                    9.2.24-1.el7_5                      base                              3.8 M

Transaction Summary
============================================================================================================================================
Install  1 Package (+3 Dependent packages)

Total size: 377 M
Installed size: 470 M
Is this ok [y/d/N]: y
Downloading packages:
warning: /var/cache/yum/x86_64/7/ambari-2.7.4.0/packages/ambari-server-2.7.4.0-118.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID 07513cad: NOKEY
Retrieving key from http://192.168.106.128/ambari/ambari/centos7/2.7.4.0-118/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
Importing GPG key 0x07513CAD:
 Userid     : "Jenkins (HDP Builds) <jenkin@hortonworks.com>"
 Fingerprint: df52 ed4f 7a3a 5882 c099 4c66 b973 3a7a 0751 3cad
 From       : http://192.168.106.128/ambari/ambari/centos7/2.7.4.0-118/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
Is this ok [y/N]: y
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : postgresql-libs-9.2.24-1.el7_5.x86_64                                                                                  1/4 
  Installing : postgresql-9.2.24-1.el7_5.x86_64                                                                                       2/4 
  Installing : postgresql-server-9.2.24-1.el7_5.x86_64                                                                                3/4 
  Installing : ambari-server-2.7.4.0-118.x86_64                                                                                       4/4 
  Verifying  : postgresql-server-9.2.24-1.el7_5.x86_64                                                                                1/4 
  Verifying  : postgresql-libs-9.2.24-1.el7_5.x86_64                                                                                  2/4 
  Verifying  : ambari-server-2.7.4.0-118.x86_64                                                                                       3/4 
  Verifying  : postgresql-9.2.24-1.el7_5.x86_64                                                                                       4/4 

Installed:
  ambari-server.x86_64 0:2.7.4.0-118                                                                                                        

Installed as a dependency:
  postgresql.x86_64 0:9.2.24-1.el7_5        postgresql-libs.x86_64 0:9.2.24-1.el7_5        postgresql-server.x86_64 0:9.2.24-1.el7_5       

Complete!
[root@bigdata1 ~]#

1.17.2 log in to MySQL and initialize the table of ambari

[root@bigdata1 ~]# mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.28 MySQL Community Server (GPL)

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| ambari             |
| hive               |
| mysql              |
| oozie              |
| performance_schema |
| sys                |
+--------------------+
7 rows in set (0.25 sec)

mysql> use ambari;
Database changed
mysql> show tables;
Empty set (0.00 sec)

mysql> source /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql
# Exit mysql. Then, on the Ambari host, register MySQL's JDBC driver with Ambari. The command is as follows:
[root@bigdata1 java]# ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar   #After adding this sentence, hive can connect to the library successfully
Using python  /usr/bin/python
Setup ambari-server
Copying /usr/share/java/mysql-connector-java.jar to /var/lib/ambari-server/resources/mysql-connector-java.jar
If you are updating existing jdbc driver jar for mysql with mysql-connector-java.jar. Please remove the old driver jar, from all hosts. Restarting services that need the driver, will automatically copy the new jar to the hosts.
JDBC driver was successfully initialized.
Ambari Server 'setup' completed successfully.
[root@bigdata1 java]#

[root@bigdata1 ~]# ambari-server setup
Using python  /usr/bin/python
Setup ambari-server
Checking SELinux...
SELinux status is 'disabled'
Customize user account for ambari-server daemon [y/n] (n)? y       
Enter user account for ambari-server daemon (root):root       
Adjusting ambari-server permissions and ownership...
Checking firewall status...
Checking JDK...
[1] Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8
[2] Custom JDK
==============================================================================
Enter choice (1): 2     #Define your own JDK path
WARNING: JDK must be installed on all hosts and JAVA_HOME must be valid on all hosts.
WARNING: JCE Policy files are required for configuring Kerberos security. If you plan to use Kerberos,please make sure JCE Unlimited Strength Jurisdiction Policy Files are valid on all hosts.
Path to JAVA_HOME: /home/installed/jdk1.8.0_161
Validating JDK on Ambari Server...done.
Check JDK version for Ambari Server...
JDK version found: 8
Minimum JDK version is 8 for Ambari. Skipping to setup different JDK for Ambari Server.
Checking GPL software agreement...
GPL License for LZO: https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Enable Ambari Server to download and install GPL Licensed LZO packages [y/n] (n)? y
Completing setup...
Configuring database...
Enter advanced database configuration [y/n] (n)? y
Configuring database...
==============================================================================
Choose one of the following options:
[1] - PostgreSQL (Embedded)
[2] - Oracle
[3] - MySQL / MariaDB
[4] - PostgreSQL
[5] - Microsoft SQL Server (Tech Preview)
[6] - SQL Anywhere
[7] - BDB
==============================================================================
Enter choice (1): 3
Hostname (localhost): bigdata1     #host name
Port (3306): 3306
Database name (ambari): ambari     #Database name
Username (ambari): ambari
Enter Database Password (bigdata):        #Enter the password of the self created database above
Re-enter password: 
Configuring ambari database...
Should ambari use existing default jdbc /usr/share/java/mysql-connector-java.jar [y/n] (y)? y
Configuring remote database connection properties...
WARNING: Before starting Ambari Server, you must run the following DDL directly from the database shell to create the schema: /var/lib/ambari-server/resources/Ambari-DDL-MySQL-CREATE.sql
Proceed with configuring remote database connection properties [y/n] (y)? y
Extracting system views...
ambari-admin-2.7.4.0.118.jar
....
Ambari repo file contains latest json url http://public-repo-1.hortonworks.com/HDP/hdp_urlinfo.json, updating stacks repoinfos with it...
Adjusting ambari-server permissions and ownership...
Ambari Server 'setup' completed successfully.
[root@bigdata1 ~]#

Modify the configuration file (this step can be skipped if the property already exists):

echo server.jdbc.driver.path=/usr/share/java/mysql-connector-java.jar >> /etc/ambari-server/conf/ambari.properties

1.17.3 start ambari server

If the startup fails, stop it with ambari-server stop and then start it again:

[root@bigdata1 ~]# ambari-server start
Using python  /usr/bin/python
Starting ambari-server
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Ambari database consistency check started...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start..............................
Server started listening on 8080

DB configs consistency check: no errors and warnings were found.
Ambari Server 'start' completed successfully.
[root@bigdata1 ~]#

1.18 install Agent

Install agent on 3 servers

yum -y install ambari-agent

Operation result:
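Before the agent is started, it must be pointed at the Ambari Server host. A minimal sketch (the default config path /etc/ambari-agent/conf/ambari-agent.ini is assumed; bigdata1 is the server in this cluster):

# point the [server] hostname entry at the Ambari Server node
sed -i 's/^hostname=.*/hostname=bigdata1/' /etc/ambari-agent/conf/ambari-agent.ini
grep '^hostname=' /etc/ambari-agent/conf/ambari-agent.ini    # should now print hostname=bigdata1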

Start ambari agent

[root@hadoop4 yum.repos.d]# ambari-agent start

1.19 visit Ambari web page

The default port is 8080; username: admin, password: admin. URL: http://XXX:8080/#/login

1.20 start cluster installation

Click "LAUNCH INSTALL WIZARD" to start the installation wizard to create a cluster and give it a name

Select version:

Select: Use Local Repository

Then click the "Next" button. Configure nodes and keys


Host confirmation


Select big data component, select here


Then click "Next" to allocate nodes:


Then click Next to configure the slave and client


The passwords of the following databases are Admin123456




After the above settings are set, click next


Then click next to confirm the content:

Click DEPLOY to start the deployment.

1.21 hadoop HA configuration

For the HA configuration it is best to stop the cluster first. Then select HDFS and, under Actions, click "Enable NameNode HA", that is:



Check configuration information


Log in to bigdata1 to execute:

[root@bigdata1 bin]# sudo su hdfs -l -c 'hdfs dfsadmin -safemode enter'
Safe mode is ON
[root@bigdata1 bin]# sudo su hdfs -l -c 'hdfs dfsadmin -saveNamespace'
Save namespace successful
[root@bigdata1 bin]#


Start HA configuration

Perform the following configuration on bigdata1:

sudo su hdfs -l -c 'hdfs namenode -initializeSharedEdits'

Start HA

Manually execute the following command:
Execute on bigdata1:

sudo su hdfs -l -c 'hdfs zkfc -formatZK'

Execute on bigdata2:
sudo su hdfs -l -c 'hdfs namenode -bootstrapStandby'


Final installation configuration:

Finally, after adjusting the service allocation, the configuration on bigdata1 is as follows:


The services configured on bigdata2 are as follows:

The configuration on bigdata3 is as follows:

1.22 other knowledge points

1.22.1 Ambari Server start, view status, stop

Start ambari server:
ambari-server start

Check the status of Ambari Server
ambari-server status

Stop Ambari Server:
ambari-server stop

1.22.2 data configuration detection during ambari startup

When Ambari Server starts, it runs a database consistency check. If any problem is found, the start is aborted and the message "DB configs consistency check failed" appears in the log; Ambari writes more detail to /var/log/ambari-server/ambari-server-check-database.log.
You can also start Ambari while skipping the database check by executing:

ambari-server start --skip-database-check

1.22.3 precautions when restarting after modifying the yarn configuration


sudo su hdfs -l -c 'hdfs dfsadmin -safemode enter'
sudo su hdfs -l -c 'hdfs dfsadmin -saveNamespace'
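After the restart completes, HDFS has to be taken out of safe mode again; that step is not shown above, so here is a hedged addition:

sudo su hdfs -l -c 'hdfs dfsadmin -safemode leave'
sudo su hdfs -l -c 'hdfs dfsadmin -safemode get'    # should report "Safe mode is OFF"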

1.22.4 get Ambari's operation log:

tail -f /var/log/ambari-server/ambari-server.log -n 500

1.1. Ambari uninstallation

1.1.1. Close all components

Use ambari to shut down all the components in the cluster. If that does not work, kill the processes directly with kill -9 XXX.

1.1.2. Close ambari server and ambari agent

ambari-server stop
ambari-agent stop

1.1.3. yum delete all Ambari components

sudo yum remove -y hadoop_3* ranger* zookeeper* atlas-metadata* ambari* spark* slide* hive* oozie* pig* tez* hbase* knox* storm* accumulo* falcon* ambari* smartsense*

1.1.4. Delete various files and users

Special note: please check carefully when deleting

When ambari installs the hadoop cluster it creates a number of users. When wiping the cluster you must also remove these users and delete the corresponding folders; otherwise you may hit file-permission problems when the cluster runs again. In short, delete everything ambari created, or the reinstall will fail with various "file not found" errors.

sudo userdel oozie
sudo userdel hive
sudo userdel ambari-qa
sudo userdel flume 
sudo userdel hdfs 
sudo userdel knox 
sudo userdel storm 
sudo userdel mapred
sudo userdel hbase 
sudo userdel tez 
sudo userdel zookeeper
sudo userdel kafka 
sudo userdel falcon
sudo userdel sqoop 
sudo userdel yarn 
sudo userdel hcat
sudo userdel atlas
sudo userdel spark
sudo userdel ams
sudo userdel zeppelin
 
sudo rm -rf /home/atlas
sudo rm -rf /home/accumulo
sudo rm -rf /home/hbase
sudo rm -rf /home/hive
sudo rm -rf /home/oozie
sudo rm -rf /home/storm
sudo rm -rf /home/yarn
sudo rm -rf /home/ambari-qa
sudo rm -rf /home/falcon
sudo rm -rf /home/hcat
sudo rm -rf /home/kafka
sudo rm -rf /home/mahout
sudo rm -rf /home/spark
sudo rm -rf /home/tez
sudo rm -rf /home/zookeeper
sudo rm -rf /home/flume
sudo rm -rf /home/hdfs
sudo rm -rf /home/knox
sudo rm -rf /home/mapred
sudo rm -rf /home/sqoop

//Be especially careful with the following three
sudo rm -rf /var/lib/ambari*
sudo rm -rf /usr/lib/ambari-*
sudo rm -rf /usr/lib/ams-hbase*

 
sudo rm -rf /etc/ambari-*
sudo rm -rf /etc/hadoop
sudo rm -rf /etc/hbase
sudo rm -rf /etc/hive*
sudo rm -rf /etc/sqoop 
sudo rm -rf /etc/zookeeper  
sudo rm -rf /etc/tez* 
sudo rm -rf /etc/spark2 
sudo rm -rf /etc/phoenix    
sudo rm -rf /etc/kafka  
 
sudo rm -rf /var/run/spark*
sudo rm -rf /var/run/hadoop*
sudo rm -rf /var/run/hbase
sudo rm -rf /var/run/zookeeper
sudo rm -rf /var/run/hive*
sudo rm -rf /var/run/sqoop
sudo rm -rf /var/run/ambari-*
sudo rm -rf /var/log/hadoop*
sudo rm -rf /var/log/hive*
sudo rm -rf /var/log/ambari-*
sudo rm -rf /var/log/hbase
sudo rm -rf /var/log/sqoop
 
sudo rm -rf /usr/lib/ambari-*
 
sudo rm -rf /usr/hdp

sudo rm -rf /usr/bin/zookeeper-*
sudo rm -rf /usr/bin/yarn  
sudo rm -rf /usr/bin/sqoop*  
sudo rm -rf /usr/bin/ranger-admin-start 
sudo rm -rf /usr/bin/ranger-admin-stop 
sudo rm -rf /usr/bin/ranger-kms 
sudo rm -rf /usr/bin/phoenix-psql 
sudo rm -rf /usr/bin/phoenix-*  
sudo rm -rf /usr/bin/mapred   
sudo rm -rf /usr/bin/hive 
sudo rm -rf /usr/bin/hiveserver2 
sudo rm -rf /usr/bin/hbase
sudo rm -rf /usr/bin/hcat 
sudo rm -rf /usr/bin/hdfs 
sudo rm -rf /usr/bin/hadoop  
sudo rm -rf /usr/bin/beeline 

sudo rpm -qa | grep ambari          # get the list of remaining ambari packages
sudo rpm -e --nodeps <package>      # remove each item in the list

sudo rpm -qa | grep zookeeper
sudo rpm -e --nodeps <package>      # remove each item in the list

1.1.5. Clean up the database

Delete ambari Library in mysql

drop database ambari;

1.1.6. Reinstall ambari

After the above cleanup, reinstalling ambari and the hadoop cluster (including HDFS, YARN+MapReduce2, Zookeeper, Ambari Metrics, Spark) succeeded.

1.23 other configurations

1.23.1 setting dfs permissions

Change the value of dfs.permissions.enabled to false.

That is, in the HDFS service configuration, set the dfs.permissions.enabled parameter to false, as sketched below.
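In hdfs-site.xml terms this corresponds to the following snippet (a sketch; in Ambari the same property is edited under HDFS > Configs):

<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>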

1.23.2 Cannot modify mapred.job.name at runtime of hive

The solution is to add:

hive.security.authorization.sqlstd.confwhitelist = mapred.|hive.|mapreduce.|spark.

hive.security.authorization.sqlstd.confwhitelist.append = mapred.|hive.|mapreduce.|spark.

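In hive-site.xml form, the two properties above correspond roughly to the following sketch (values as given above; not a verbatim copy of the original configuration screenshot):

<property>
  <name>hive.security.authorization.sqlstd.confwhitelist</name>
  <value>mapred.|hive.|mapreduce.|spark.</value>
</property>
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>mapred.|hive.|mapreduce.|spark.</value>
</property>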

1.23.3 modify the file format in hive-site.xml to solve the problem of error in importing SQOOP

Change the default table format from ORC to TextFile; the final result looks like the sketch below.
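The screenshot is omitted. The settings usually involved are hive.default.fileformat and, on HDP 3, hive.default.fileformat.managed (this is an assumption about which properties the screenshot showed, not a confirmed copy of it):

<property>
  <name>hive.default.fileformat</name>
  <value>TextFile</value>
</property>
<property>
  <!-- assumed; HDP 3 defaults managed tables to ORC -->
  <name>hive.default.fileformat.managed</name>
  <value>TextFile</value>
</property>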

1.23.4 /usr/hdp/3.1.4.0-315/accumulo does not exist! Accumulo imports will fail


Solution:

mkdir /var/lib/accumulo
echo "export ACCUMULO_HOME=/var/lib/accumulo" >> /etc/profile
source /etc/profile

1.23.5 solutions to problems related to lack of access to hive database by spark

Configure the hive.metastore.uris property of Hive

Change to hive:

1.23.6 sqoop imports MySQL data into hive, which gets stuck

The problem description is as follows:

Problem analysis:
In Hive 3, the hive command requires a user name and password; the hang is presumably caused by the missing credentials.
Solution:
Edit the host's beeline-site.xml with the following command:

[root@bigdata1 conf]# vim /usr/hdp/current/hive-client/conf/beeline-site.xml
//Add user=root;password=root; to the beeline.hs2.jdbc.url.container value

The amendment is as follows:

<configuration  xmlns:xi="http://www.w3.org/2001/XInclude">

    <property>
      <name>beeline.hs2.jdbc.url.container</name>
      <value>jdbc:hive2://bigdata3:2181,bigdata1:2181,bigdata2:2181/;serviceDiscoveryMode=zooKeeper;user=root;password=root;zooKeeperNamespace=hiveserver2</value>
    </property>

    <property>
      <name>beeline.hs2.jdbc.url.default</name>
      <value>container</value>
    </property>

</configuration>

After adding, and then operate again, it is found that it is normal.

After saving the modification, the Hive service does not need to be restarted; the change takes effect immediately. Run the Sqoop command again and everything should work.
Note: if Hive is reconfigured from the Ambari UI and restarted, the file is regenerated and the above change must be applied again; otherwise sqoop will hang again.

1.23.7 kernel: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s!

Solution:
# Write the new threshold directly (takes effect immediately)

echo 30 > /proc/sys/kernel/watchdog_thresh

Verify:

[root@git-node1 data]# tail -1 /proc/sys/kernel/watchdog_thresh
30

# Alternatively, apply it temporarily with sysctl

sysctl -w kernel.watchdog_thresh=30

# To make the change permanent, add the following to /etc/sysctl.conf:

kernel.watchdog_thresh=30

[root@bigdata1 ~]# sysctl -p

Reference website: https://blog.csdn.net/qq_/article/details/84756734

1.23.8 Spark cannot read the attributes in hive3.x:

The solution is to configure in hive-site.xml:

hive.strict.managed.tables=false 
hive.create.as.insert.only=false
metastore.create.as.acid=false

Then modify the configuration on the interface.

1.23.9 sqoop imports mysql data into hive, but spark SQL cannot access it

The solution is to manually configure the following in beeline-site.xml:

<property>
      <name>hive.strict.managed.tables</name>
      <value>false</value>
    </property>

    <property>
      <name>hive.create.as.insert.only</name>
      <value>false</value>
    </property>

    <property>
      <name>metastore.create.as.acid</name>
      <value>false</value>
    </property>
   
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>bigdata1:2181,bigdata2:2181,bigdata3:2181</value>
    </property>

1.23.10 configure hive to integrate hbase (defined in hive's custom configuration)
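The screenshot of the custom hive-site properties is omitted; a sketch of the kind of entries typically added for Hive/HBase integration (the zookeeper quorum value is taken from this cluster, the znode parent is an assumption, see section 1.23.12):

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>bigdata1:2181,bigdata2:2181,bigdata3:2181</value>
</property>
<property>
  <!-- assumed value; matches the CDH-compatible setting used later -->
  <name>zookeeper.znode.parent</name>
  <value>/hbase</value>
</property>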


1.23.11 error communicating with the metadata (state = 42000, code = 10280) after hive data migration

https://blog.csdn.net/weixin_38256474/article/details/92080701

1.23.12 KeeperErrorCode = NoNode for /hbase/hbaseid problem

Problem phenomenon:

Solution:
Modify hbase-site.xml and specify hbase.tmp.dir, so that HBase's tmp directory is not cleaned up periodically:

<configuration>

....(ellipsis)

<property>
        <name>hbase.tmp.dir</name>
        <value>/hbase/tmp</value>
        <description>Temporary directory on the local filesystem.</description>
</property>

</configuration>


Change the value as shown above (if the directory does not exist, create it manually).

In addition, if CDH was used before, zookeeper.znode.parent in hbase-site.xml was configured as /hbase. To stay compatible with it, this configuration is also changed here, as sketched below:
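A sketch of the corresponding hbase-site.xml entry (the /hbase value mirrors the old CDH setting mentioned above):

<property>
  <name>zookeeper.znode.parent</name>
  <value>/hbase</value>
</property>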

1.23.13 configure virtual vcores

The location is:
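The screenshot is omitted; the YARN properties usually adjusted for virtual cores are the following (an assumption based on standard YARN configuration, not a copy of the original screenshot):

<property>
  <!-- number of vcores a NodeManager may hand out; pick a value for your hardware -->
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value>
</property>
<property>
  <!-- largest vcore request a single container may make -->
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>8</value>
</property>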

1.23.14 when importing data from oracle, there are the following problems due to the ojdbc version


Solution:
Download ojdbc8.jar and put it into /usr/hdp/current/sqoop-client/lib

1.23.15 solve the permission problem of hive add jar

1.23.16 Problems in the integration of OGG and Ambari HBase 2.0.2

The problem is:

ERROR 2020-01-10 16:49:33.000116 [main] - org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'userExitDataSource' defined in class path resource [oracle/goldengate/dat
asource/DataSource-context.xml]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [oracle.goldengate.datasource
.GGDataSource]: Factory method 'getDataSource' threw exception; nested exception is java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(Lorg/apache/hadoop/conf/Confi
guration;)V
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'userExitDataSource' defined in class path resource [oracle/goldengate/datasource/DataSource-context.xml]: Bean insta
ntiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [oracle.goldengate.datasource.GGDataSource]: Factory method 'getDataSour
ce' threw exception; nested exception is java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(Lorg/apache/hadoop/conf/Configuration;)V
        at oracle.goldengate.datasource.DataSourceLauncher.<init>(DataSourceLauncher.java:168)
        at oracle.goldengate.datasource.UserExitMain.main(UserExitMain.java:124)
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [oracle.goldengate.datasource.GGDataSource]: Factory method 'getDataSource' threw exception; nested exception is java
.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(Lorg/apache/hadoop/conf/Configuration;)V
        at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:189)
        at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:588)
        ... 11 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(Lorg/apache/hadoop/conf/Configuration;)V
        at oracle.goldengate.handler.hbase.operations.HBase1Writer.open(HBase1Writer.java:64)
        at oracle.goldengate.handler.hbase.operations.HBaseWriterFactory.init(HBaseWriterFactory.java:32)
        at oracle.goldengate.handler.hbase.HBaseHandler.init(HBaseHandler.java:245)
        at oracle.goldengate.datasource.AbstractDataSource.addDataSourceListener(AbstractDataSource.java:592)
        at oracle.goldengate.datasource.factory.DataSourceFactory.getDataSource(DataSourceFactory.java:161)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:162)
        ... 12 more

org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'userExitDataSource' defined in class path resource [oracle/goldengate/datasource/DataSource-context.xml]: Bean insta
ntiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [oracle.goldengate.datasource.GGDataSource]: Factory method 'getDataSour
ce' threw exception; nested exception is java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(Lorg/apache/hadoop/conf/Configuration;)V
        at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:599) ~[spring-beans-4.3.20.RELEASE.jar:4.3.20.RELEASE]
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1178) ~[spring-beans-4.3.20.RELEASE.jar:4.3.20
.RELEASE]
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1072) ~[spring-beans-4.3.20.RELEASE.jar:4.3.20.RELEASE]
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:511) ~[spring-beans-4.3.20.RELEASE.jar:4.3.20.RELEASE]
        at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:481) ~[spring-beans-4.3.20.RELEASE.jar:4.3.20.RELEASE]
        at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:312) ~[spring-beans-4.3.20.RELEASE.jar:4.3.20.RELEASE]
        at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:230) ~[spring-beans-4.3.20.RELEASE.jar:4.3.20.RELEASE]
        at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:308) ~[spring-beans-4.3.20.RELEASE.jar:4.3.20.RELEASE]
        at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:197) ~[spring-beans-4.3.20.RELEASE.jar:4.3.20.RELEASE]
        at org.springframework.context.support.AbstractApplicationContext.getBean(AbstractApplicationContext.java:1080) ~[spring-context-4.3.20.RELEASE.jar:4.3.20.RELEASE]
        at oracle.goldengate.datasource.DataSourceLauncher.<init>(DataSourceLauncher.java:168) ~[ggdbutil-19.1.0.0.1.003.jar:19.1.0.0.1.003]
        at oracle.goldengate.datasource.UserExitMain.main(UserExitMain.java:124) [ggcmdui-19.1.0.0.1.003.jar:19.1.0.0.1.003]

This problem is caused by the fact that the method named in the error above does not exist in the org.apache.hadoop.hbase.client.HBaseAdmin class shipped in hbase-client-2.0.2.3.1.4.0-315.jar. The solution is to put the lib directory of hbase-1.4.11 into /usr/hdp/3.1.4.0-315/hbase-patch/hbase-1.4.11-lib.
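A minimal shell sketch of staging that patch directory (the tarball name and the temporary path are illustrative assumptions; use whatever hbase-1.4.11 distribution you already have):

# Assumption: an hbase-1.4.11 binary tarball has already been downloaded to the current directory
sudo mkdir -p /usr/hdp/3.1.4.0-315/hbase-patch
tar -xzf hbase-1.4.11-bin.tar.gz -C /tmp
sudo cp -r /tmp/hbase-1.4.11/lib /usr/hdp/3.1.4.0-315/hbase-patch/hbase-1.4.11-lib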

Then configure the following in the OGG configuration file /opt/ogg/dirprm/hbase.props:

#Sample gg.classpath for Apache HBase
#gg.classpath=/usr/hdp/current/hbase-client/lib/*:/usr/hdp/current/hbase-client/conf/
#Sample gg.classpath for CDH
#gg.classpath=/opt/cloudera/parcels/CDH/lib/hbase/lib/*:/etc/hbase/conf
#Sample gg.classpath for HDP
#gg.classpath=/usr/hdp/current/hbase-client/lib/*:/etc/hbase/conf
gg.classpath=/usr/hdp/3.1.4.0-315/hbase-patch/hbase-1.4.11-lib/*:/usr/hdp/3.1.4.0-315/hbase/lib/conf/:/usr/hdp/3.1.4.0-315/hadoop/client/*


1.23.17 configure hive.aux.jars.path on hive interface

Set hive.aux.jars.path in the Ambari Hive configuration interface to the following value:

file:///usr/hdp/3.1.4.0-315/hive/auxlib/elasticsearch-hadoop-6.7.1.jar,file:///usr/hdp/3.1.4.0-315/hive/auxlib/commons-httpclient-3.1.jar

1.23.18 "Can't start with home(s)" error in Ambari

Ambari's web UI rejects data directories located under /home or /homes; the check is implemented in /usr/lib/ambari-server/web/javascripts/app.js:

isAllowedDir: function(value) {
   var dirs = value.replace(/,/g,' ').trim().split(new RegExp("\\s+", "g"));
   for(var i = 0; i < dirs.length; i++){
     if(dirs[i].startsWith('/home') || dirs[i].startsWith('/homes')) {
       return false;
     }
   }
   return true;
 }

Change the method body to simply return true and the check no longer blocks such paths.
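A minimal shell sketch of applying that change (backing up the file and restarting ambari-server are precautions of mine rather than steps from the original notes; a browser cache refresh may also be needed):

# Back up the original file before editing it
sudo cp /usr/lib/ambari-server/web/javascripts/app.js /usr/lib/ambari-server/web/javascripts/app.js.bak
# Edit isAllowedDir so that its body is simply: return true;
sudo vi /usr/lib/ambari-server/web/javascripts/app.js
# Restart ambari-server so the UI serves the modified file
sudo ambari-server restart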

1.23.19 Caused by: java.lang.OutOfMemoryError: unable to create new native thread

Check the current limits with ulimit -a:

[admin@datacenter3 hbase-1.2.0-cdh5.7.0]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256459
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 262144
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8096
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
[admin@datacenter3 hbase-1.2.0-cdh5.7.0]$
//Modify the values of stack size and max user processes to:
[admin@datacenter3 hbase-1.2.0-cdh5.7.0]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256459
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 262144
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 131072
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
[admin@datacenter3 hbase-1.2.0-cdh5.7.0]$

That is, the value of stack size is 10240, and the value of max user processes is 131072
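A hedged sketch of making these limits permanent via /etc/security/limits.conf (the user name admin is taken from the shell prompt above and is an assumption; adjust the domain to the user that actually runs the services):

# Raise the limits for the current shell session only
ulimit -s 10240
ulimit -u 131072
# Persist them across logins
cat <<'EOF' | sudo tee -a /etc/security/limits.conf
admin soft stack 10240
admin hard stack 10240
admin soft nproc 131072
admin hard nproc 131072
EOF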

In addition, if the problem is an out-of-memory condition, it may also be caused by incorrect map/reduce memory settings. In that case, adjust the memory and virtual-memory configuration in mapred-site.xml and yarn-site.xml.
The changes in mapred-site.xml are as follows:

<!-- 2020-01-15 New, can be deleted -->
        <property>
            <name>mapreduce.map.java.opts</name>
            <value>-Xms3g -Xmx10g</value>
        </property>

        <property>
            <name>mapreduce.reduce.java.opts</name>
            <value>-Xms6g -Xmx20g</value>
        </property>

        <property>
            <name>mapreduce.map.memory.mb</name>
            <value>10240</value>
        </property>

        <property>
            <name>mapreduce.reduce.input.buffer.percent</name>
            <value>0.5</value>
        </property>

        <property>
            <name>mapreduce.reduce.memory.mb</name>
            <value>20480</value>
        </property>

        <!-- 2020-01-15 Corresponding configuration -->
        <property>
            <name>mapreduce.tasktracker.http.threads</name>
            <value>8192</value>
        </property>

The configuration in yarn-site.xml is as follows:

<!-- Maximum ratio of virtual memory to physical memory that a task may use; the default is 2.1 -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>5</value>
</property>

In addition, when restarting Hadoop it is important to check that the DataNodes are live after the restart, and HBase also needs to be restarted; otherwise an error similar to the following will occur:

Caused by: java.io.IOException: Couldn't set up IO streams
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:795)
	at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1524)
	at org.apache.hadoop.ipc.Client.call(Client.java:1447)
	... 43 more
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
	at java.lang.Thread.start0(Native Method)
	at java.lang.Thread.start(Thread.java:717)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:788)
	... 46 more
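A small sketch of the kind of post-restart check meant above (assuming the commands are run as a user with HDFS access; the grep pattern matches the dfsadmin report header):

# Confirm the DataNodes are live again before restarting HBase
hdfs dfsadmin -report | grep -A1 'Live datanodes'
# Optionally check the overall filesystem health as well
hdfs fsck / | tail -n 20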

1.23.20 '/usr/bin/hdp-select set zookeeper-server 3.1.4.0-315' returned 1. symlink target /usr/hdp/current/zookeeper-server for zookeeper already exists and it is not a symlink

File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'ambari-python-wrap /usr/bin/hdp-select set zookeeper-server 3.1.4.0-315' returned 1. symlink target /usr/hdp/current/zookeeper-server for zookeeper already exists and it is not a symlink

The solution:
Check whether /usr/hdp/current/zookeeper-server is a plain directory instead of a symlink. If it is not a symlink, delete it and rebuild the link with hdp-select:

[admin@datacenter2 current]$ sudo rm -rf zookeeper-server
[admin@datacenter2 current]$ sudo hdp-select set zookeeper-server 3.1.4.0-315
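A quick way to verify the result (a sketch; the exact output format may vary between HDP versions):

# The entry should now be a symlink pointing at the 3.1.4.0-315 package
ls -l /usr/hdp/current/zookeeper-server
# hdp-select should report the selected version for the component
hdp-select status zookeeper-server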

1.23.21 set the number of copies of Block replication to 3

In Ambari's HDFS configuration, set Block replication (dfs.replication) to 3.
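Changing dfs.replication only applies to newly written files. As a hedged sketch (run as a user with HDFS superuser rights), existing files can be brought to the new replication factor with:

# Re-replicate existing files to 3 copies; -w waits until replication finishes
hdfs dfs -setrep -w 3 /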

1.23.22 Error: java.lang.IllegalArgumentException: KeyValue size too large

Error content:

Error: java.lang.IllegalArgumentException: KeyValue size too large
	at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:952)
	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:191)
	at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:179)
	at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:143)
	at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:93)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:670)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
	at org.apache.hadoop.hbase.mapreduce.Import$Importer.processKV(Import.java:584)
	at org.apache.hadoop.hbase.mapreduce.Import$Importer.writeResult(Import.java:539)
	at org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:522)
	at org.apache.hadoop.hbase.mapreduce.Import$Importer.map(Import.java:505)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)

When inserting, HBase checks each column of the put one by one and verifies that the size of each cell is smaller than maxKeyValueSize; when a cell is larger than maxKeyValueSize, a KeyValue size too large exception is thrown. Now that the cause is located, let's see where maxKeyValueSize is set. In the same class that contains the validatePut method (HTable.class), you can find:

public static int getMaxKeyValueSize(Configuration conf) {
    return conf.getInt("hbase.client.keyvalue.maxsize", -1);
}

From this method we can see that maxKeyValueSize is read from the configuration, with the key hbase.client.keyvalue.maxsize.
The official documentation describes hbase.client.keyvalue.maxsize as follows:

hbase.client.keyvalue.maxsize
The maximum size of a KeyValue instance. It sets an upper bound on the size of a single entry in a storage file. Because a single KeyValue cannot be split, this prevents a region from becoming unsplittable because one entry is too large. It is wise to set it to a fraction of the maximum region size. Setting it to 0 or less disables the check.

Default: 10485760
In other words, the default hbase.client.keyvalue.maxsize is 10 MB; if a cell exceeds 10 MB, the KeyValue size too large error is reported.
Solutions:

Method 1. Following the hint in the official documentation, increase the value of hbase.client.keyvalue.maxsize in the configuration (override it in hbase-site.xml, or through Ambari's HBase configuration, rather than editing hbase-default.xml):

<property>
    <name>hbase.client.keyvalue.maxsize</name>
    <value>20971520</value>
  </property>

It is not recommended to modify the configuration file directly; after the modification HBase needs to be restarted (to be verified).
Method 2: modify the code and use the configuration object to modify the configuration:

Configuration conf = HBaseConfiguration.create();
conf.set("hbase.client.keyvalue.maxsize","20971520");


1.23.23 solve "...475175982519323-1/-ext-10000/000000_0 ... the file is not owned by hive and load data is also not ran as hive"

In hive-site.xml, add hive.load.data.owner = <the specific user that owns the data files, e.g. the user submitting the load>

1.24 Hive 1.1.0 upgrade to Hive 3.x

1.24.1 find two versions of hive

Enter the installation directory of Hive (or the location of the installation package). For example, the default installation directory of Hive under Ambari is /usr/hdp/3.1.4.0-315/hive, and the schema upgrade SQL scripts are in:
/usr/hdp/3.1.4.0-315/hive/scripts/metastore/upgrade/mysql

It can be seen from the above that the highest Hive schema version shipped there is 3.1.1000. To confirm the specific Hive version installed, you can open Ambari's web interface and check it as follows:
Step 1: click Add Service

Step 2: enter the Add Service Wizard interface to see the version of hive and other components you have installed.

According to the above interface, the installed Hive version is 3.1.0.

1.24.2 upgrade hive library 1.1.0 to hive 3.1.0

1.24.2.1 backup old database

Enter the MySQL client interface and create a new library:

CREATE DATABASE hive_1_1_0 DEFAULT CHARACTER SET utf8;

Then back up the old hive library into hive_1_1_0 (MySQL database names cannot contain dots, hence the underscores).
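A minimal sketch of that backup with mysqldump (host and credentials are assumptions; any equivalent dump/restore works):

# Dump the current hive metastore database to a file
mysqldump -u root -p hive > /home/workspace/hive_backup.sql
# Load the dump into the newly created backup database
mysql -u root -p hive_1_1_0 < /home/workspace/hive_backup.sql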

1.24.2.2 find upgrade script by comparing sql

By comparing the two versions, it is preliminarily found that the SQL files not yet applied to the old schema are the scripts that need to be run for the upgrade.
After sorting them out, the SQL scripts to be applied are the ones sourced in the next step.
Place the SQL to be applied in the directory /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql, that is:

1.24.2.3 upgrade hive Library

Log in to mysql and execute the script through the source command

[root@bigdata1 mysql]# mysql -u root -p

Enter password:

Then execute respectively:

mysql> use hive;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/create-user.mysql.sql
mysql> set global validate_password_policy=0;
mysql> set global validate_password_length=1;


mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/create-user.mysql.sql

mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/hive-txn-schema-1.3.0.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/hive-txn-schema-2.0.0.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/hive-txn-schema-2.1.0.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/hive-txn-schema-2.2.0.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/hive-txn-schema-2.3.0.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/hive-schema-3.1.1000.mysql.sql

mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/upgrade-1.1.0-to-1.2.0.mysql.sql

mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/upgrade-1.2.0-to-1.2.1000.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/upgrade-1.2.1000-to-2.0.0.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/upgrade-2.0.0-to-2.1.0.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/upgrade-2.1.0-to-2.1.1000.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/upgrade-2.1.1000-to-2.1.2000.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/upgrade-2.1.2000-to-3.0.0.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/upgrade-3.0.0-to-3.1.0.mysql.sql
mysql> source /home/workspace/ambari3.1.4.0-sqls/hive-sql/mysql/upgrade-3.1.0-to-3.1.1000.mysql.sql
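After all the scripts have run, a hedged sanity check is to read the metastore VERSION table (these column names are the ones used by the Hive metastore schema):

# SCHEMA_VERSION should now report a 3.1.x value
mysql -u root -p -e "SELECT SCHEMA_VERSION, VERSION_COMMENT FROM hive.VERSION;"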

1.25 backup and recovery of HDFS data

Copy the HDFS data from the root directory to a local folder:
hadoop fs -copyToLocal hdfs://tqHadoopCluster/ ./dataFolder

Put the local data folder back into the specified HDFS directory:

hdfs dfs -put -f dataFolder hdfs://tqHadoopCluster/
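A small hedged sketch for sanity-checking the copy (the HDFS figure is the logical size, which should roughly match the size of the local copy):

# Logical size of the data as seen by HDFS (first column of the output)
hdfs dfs -du -s -h hdfs://tqHadoopCluster/
# Size of the local copy
du -sh ./dataFolder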

1.26 hbase data backup and recovery

The following procedure targets the case where the old and new clusters are not connected online, so HBase Export/Import is used for data backup and migration.

1.26.1 export process

First, log in to a machine where HBase is installed, then run:

hbase shell

hbase(main):001:0> list
TABLE                                                                                                                                                                                                                                                                                                                                                
test_migration                                                                                                           

//Then execute:
hbase(main):003:0> describe 'test_migration'
Table test_migration is ENABLED
test_migration
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.1710 seconds

//From the above, it is found that there is a column family 'cf1'

hbase(main):004:0> scan test_migration
NameError: undefined local variable or method `test_migration' for #<Object:0x48268eec>

hbase(main):005:0> scan 'test_migration'
ROW                             COLUMN+CELL                                                                              
 1                              column=cf1:age, timestamp=1553738512684, value=18                                        
 1                              column=cf1:name, timestamp=1553738512531, value=zhangsan                                 
 1                              column=cf1:sex, timestamp=1553738512591, value=\xE7\x94\xB7                              
1 row(s) in 0.0950 seconds

Export the data (executed directly in bash):
hbase org.apache.hadoop.hbase.mapreduce.Export test_migration /tzq/hbaseData

//Copy data locally:
scp -r admin@datacenter1:/home/admin/hbaseDataTest /home/admin/ambari

//Then copy the data to your local computer via xftp
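The Export above writes its output to HDFS, so as a hedged sketch (the directory names are illustrative and differ slightly between the commands in these notes), the intermediate step of pulling the data down to the local filesystem of datacenter1 looks like:

# Pull the exported sequence files out of HDFS onto the local disk
hdfs dfs -copyToLocal /tzq/hbaseData /home/admin/hbaseData
# Then copy them to the target machine, e.g. with scp or xftp
scp -r /home/admin/hbaseData admin@<target-host>:/home/admin/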

1.26.2 import process

First create a namespace (this example covers the scenario with a namespace), and then create the corresponding table:

hbase>create_namespace 'test'

hbase(main):022:0> create 'test:test_migration','cf1'    (where cf1 is the column family)

hbase(main):023:0> list
TABLE                                                                                                                    
test:test_migration                                                                                                                                                                                     
user                                                                                                                     
3 row(s)
Took 0.0086 second

Put the local data onto HDFS:

hdfs dfs -put hbaseData /

Then import the data into 'test:test_migration' (the following is executed directly in bash):

hbase org.apache.hadoop.hbase.mapreduce.Import 'test:test_migration' /hbaseData

Finally, enter hbase to verify the data:

hbase(main):024:0> scan 'test:test_migration'
ROW                             COLUMN+CELL                                                                              
 1                              column=cf1:age, timestamp=1553738512684, value=18                                        
 1                              column=cf1:name, timestamp=1553738512531, value=zhangsan                                 
 1                              column=cf1:sex, timestamp=1553738512591, value=\xE7\x94\xB7

The data is consistent with the original.

1.26.3 count the number of hbase table rows

hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'test:test_migration'

1.27 hive and ElasticSearch integration (on 3 servers)

1.27.1 configure auxlib

Create an auxlib directory under /usr/hdp/current/hive-client, then put commons-httpclient-3.1.jar and elasticsearch-hadoop-6.7.1.jar into this folder, for example:
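A minimal sketch of that step, assuming the two jars have already been downloaded to /home/workspace (an illustrative path) and that it is repeated on all three servers:

sudo mkdir -p /usr/hdp/current/hive-client/auxlib
sudo cp /home/workspace/commons-httpclient-3.1.jar /home/workspace/elasticsearch-hadoop-6.7.1.jar /usr/hdp/current/hive-client/auxlib/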

If the two jars are still not available in Hive after add jar, restart HiveServer2.

1.27.2 verify the configuration is correct

After the configuration is completed, enter hive cli and execute as follows:

drop table if exists test_new_hive_02;
CREATE EXTERNAL TABLE test_new_hive_02 (
one_idcardnumber string comment 'ID card No.',
age bigint comment 'ID card No.',
one_birtyday String comment 'ID card No.',
one_name string comment 'Name: only the latest one can be used for more than one'
)STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' 
TBLPROPERTIES(
'es.resource' = 'test_population_info2/_doc',
'es.mapping.id'='one_idcardnumber',
'es.nodes'='192.168.110.182',
'es.index.auto.create' = 'true',
'es.index.read.missing.as.empty' = 'true',
'es.field.read.empty.as.null' = 'true'
);

Note that the index here is created automatically.

insert into test_new_hive_02 values('1112',25,'2019-01-01','Li Si');

If the whole process executes correctly, you can check the result in Kibana: http://192.168.110.182:5601/app/kibana#/dev_tools/console?_g=()

GET /test_population_info2/_search
{
  "query": {
    "match_all": {}
  }
}
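Alternatively, the same check can be done with curl from any of the servers (a sketch that assumes Elasticsearch's REST API is listening on the default port 9200 on that node):

curl -s 'http://192.168.110.182:9200/test_population_info2/_search?pretty' \
  -H 'Content-Type: application/json' \
  -d '{ "query": { "match_all": {} } }'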

1.28 Oracle golden gate and Ambari hbase integration

(omitted)
