Hive Foundation (installation)


1, What is Hive

Hive is an essential tool in the Hadoop ecosystem.
It can map structured data stored in HDFS to a table in a database and provides an SQL dialect to query it.
These SQL statements are eventually translated into MapReduce programs for execution. In essence, Hive is a framework created to spare users from writing MapReduce programs by hand. It does not store or compute data itself; it relies entirely on HDFS and MapReduce.

Hive provides an SQL dialect called Hive Query Language (HiveQL, or HQL for short) to query the data stored in a Hadoop cluster. Hive lowers the barrier to moving a traditional data-analysis system onto Hadoop: any developer who can write SQL can easily learn and use Hive. Without Hive, these developers would have to learn new languages and tools before being productive in the new environment. Even so, Hive differs from other SQL-based environments such as MySQL.
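As a hedged illustration of how this mapping works in practice (the HDFS path /user/hadoop/logs, the table name web_logs, and its column layout are assumptions for this sketch, and the command only runs once the installation described below is complete), a delimited text file in HDFS can be exposed as a table and queried with HiveQL:

# A minimal HiveQL sketch; the path, table name, and columns are illustrative assumptions.
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
  ip   STRING,
  url  STRING,
  hits INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hadoop/logs';

-- Hive compiles this SELECT into a MapReduce job behind the scenes.
SELECT url, SUM(hits) FROM web_logs GROUP BY url;
"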

1. Hive features

Hive is an application built on top of Hadoop, and because of Hadoop's design, it cannot provide complete database functionality. The biggest limitation is that Hive does not support row-level update, insert, or delete operations. In addition, because starting a MapReduce job takes a long time, every Hive query suffers from high latency: queries that finish within seconds in a traditional database often take far longer in Hive, even on relatively small data sets. Finally, Hive does not support transactions.
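Because row-level modifications are unavailable, changes are normally applied by rewriting data in bulk. The following is only a sketch of that pattern and assumes the web_logs table from the earlier example plus a second, already-created table named web_logs_clean:

# Hedged sketch: a bulk rewrite takes the place of row-level UPDATE/DELETE.
# Both table names are illustrative assumptions.
hive -e "
INSERT OVERWRITE TABLE web_logs_clean
SELECT * FROM web_logs WHERE ip IS NOT NULL;
"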

2, Hive installation

The operation of Hive depends on Hadoop, so Hadoop needs to be installed before installing Hive.

Hive's basic installation configuration includes the following steps:

1. Check Hadoop environment
2. Install MySQL
3. Install Hive
4. Configure Hive

1. Check Hadoop environment

(1) View Hadoop version

The code is as follows:

hadoop version

(2) Start the Hadoop processes

The current directory is /home/hadoop, so first switch to the Hadoop installation directory.
The code is as follows:

cd /usr/local/hadoop

Start the processes and check them with jps
The code is as follows:

./sbin/start-dfs.sh
./sbin/start-yarn.sh
jps
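If everything started correctly, jps should list the HDFS and YARN daemons; on a single-node setup these are typically NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, and Jps itself. An optional filter to check just those daemons:

# Optional sanity check: show only the Hadoop daemons in the jps output.
jps | grep -E 'NameNode|DataNode|ResourceManager|NodeManager'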

2. Install MySQL

(1) Install MySQL

The code is as follows:

sudo apt-get install mysql-server
(2) View account and password

The code is as follows:

sudo cat /etc/mysql/debian.cnf

(3) Log in to MySQL database with default account

The code is as follows:

mysql -u debian-sys-maint -p
(4) Create Hive account

The code is as follows:

CREATE USER 'hive'@'%' IDENTIFIED BY '123456';
(5) Grant Hive users permission to manipulate the database

The code is as follows:

GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
FLUSH PRIVILEGES;
(6) Exit MySQL database

The code is as follows:

exit
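The GRANT above refers to a hive database that does not exist yet. The hive-site.xml configured later uses createDatabaseIfNotExist=true, so the metastore database is created automatically on first use; creating it by hand is an optional alternative (a minimal sketch, assuming the same debian-sys-maint login as above):

# Optional: create the metastore database manually instead of relying on
# createDatabaseIfNotExist=true in the JDBC URL configured later.
mysql -u debian-sys-maint -p -e "CREATE DATABASE IF NOT EXISTS hive;"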

3. Install Hive

(1) Upload the Hive installation package to /home/hadoop

(2) Unzip Hive into /usr/local

The code is as follows:

sudo tar -xvf apache-hive-2.3.7-bin.tar.gz -C /usr/local
(3) Enter the /usr/local directory and rename the extracted directory to hive

The code is as follows:

cd /usr/local
sudo mv apache-hive-2.3.7-bin hive
(4) Modify the owner of hive to hadoop

The code is as follows:

sudo chown -R hadoop hive

4. Configure Hive

(1) Enter the Hive configuration file directory

The code is as follows:

cd /usr/local/hive/conf
(2) Create the hive-site.xml configuration file

The code is as follows:

vim hive-site.xml

The configuration contents are as follows:
You also need to create a tmp directory under the Hive directory; the commands for this follow the configuration below.

<configuration>
  <property>
    <name>system:java.io.tmpdir</name>
    <value>/usr/local/hive/tmp</value>
  </property>
  <property>
    <name>system:user.name</name>
    <value>hadoop</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
</configuration>
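The system:java.io.tmpdir property above points at /usr/local/hive/tmp, which does not exist after unpacking. One way to create it and hand it to the hadoop user, matching the ownership set earlier:

# Create the tmp directory referenced by system:java.io.tmpdir in hive-site.xml
# and give it to the hadoop user like the rest of the Hive directory.
sudo mkdir -p /usr/local/hive/tmp
sudo chown -R hadoop /usr/local/hive/tmp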
(3) Enter the dependency library directory of Hive

The code is as follows:

cd /usr/local/hive/lib
(4) Upload the MySQL JDBC driver file to the lib directory

(5) Enter the configuration file directory of the Hadoop software

The code is as follows:

cd /usr/local/hadoop/etc/hadoop
(6) Edit the core-site.xml file

The code is as follows:

vim core-site.xml

The configuration contents are as follows:

<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>*</value>
</property>
(7) Enter the hadoop user's home directory to edit the environment variable file

The code is as follows:

cd ~
vim .bashrc
(8) Add content to the environment variable file

Add the following:

export HADOOP_HOME=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive
export PATH=$HIVE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
(9) Refresh environment variables

The code is as follows:

source .bashrc
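As an optional check that the updated PATH took effect (both commands rely on $HIVE_HOME/bin being on the PATH as configured above):

# Optional: confirm that the hive command now resolves via the updated PATH.
which hive
hive --version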
(10) Initialize Hive

The code is as follows:

schematool -dbType mysql -initSchema

Initialization succeeded
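Optionally, the metastore schema version recorded in MySQL can be inspected with the same tool; it reads the connection settings from hive-site.xml:

# Optional: print the metastore schema version stored in MySQL.
schematool -dbType mysql -info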

(11) Query Hive default database list to verify installation

The code is as follows:

hive -e 'show databases'

Installation succeeded
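As a further, purely illustrative smoke test (the demo database and t1 table names are assumptions), a database and table can be created and listed:

# Optional smoke test with illustrative names.
hive -e "
CREATE DATABASE IF NOT EXISTS demo;
CREATE TABLE IF NOT EXISTS demo.t1 (id INT, name STRING);
SHOW TABLES IN demo;
"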

hive file:

https://pan.baidu.com/s/1iCjmb9hdhnnL1kI0VzaCxg
