3, Hive's DML data operation

3.1 data import

3.1.1 loading data into the table (Load)

(1) Grammar
load data [local] inpath 'path of data' [overwrite] into table
student [partition (partcol1=val1,...)];
load data: indicates loading data
Local: indicates loading data from local to hive table; Otherwise, load data from HDFS to hive table
inpath: indicates the path to load data
Overwrite: it means to overwrite the existing data in the table, otherwise it means to append
into table: indicates which table to load
student: represents a specific table
Partition: indicates to upload to the specified partition
(2) Case
1) . create a table

 create table student_xjl(id string,name string) 
row format delimited fields terminated by '\t';

2) . load local files into hive

 load data local inpath 
'/opt/module/hive/datas/student.txt' into table default.student;

3) . load the HDFS file into hive
Upload files to HDFS

dfs -put /opt/module/hive/data/student.txt 
/user/atguigu/hive;

Load data on HDFS

 load data inpath '/user/atguigu/hive/student.txt' into 
table default.student;

3) . load data to overwrite the existing data in the table
Upload files to HDFS

dfs -put /opt/module/data/student.txt /user/atguigu/hive;

Loading data overwrites the existing data in the table

 load data inpath '/user/atguigu/hive/student.txt' 
overwrite into table default.student;

3.1.2 inserting data into a table through a query statement (Insert)

(1) Create a table

 create table student_xjl(id int, name string)
row format delimited fields terminated by '\t'

(2) Insert basic data

insert into table student_xjl 
values (1,'zhangsan'),(2,'lisi'),(3,'wangwu');

Insert into: insert into a table or partition by appending data. The original data will not be deleted
insert overwrite: it will overwrite the existing data in the table
Note: insert does not support inserting partial fields

3.1.3 create tables and load data in query statements (As Select)

Create a table based on the query results (the query results are added to the newly created table)

create table if not exists student1_xjl
as select id,name from student_xjl;

3.1.4 specify the loading data path through Location when creating a table

(1) Upload data to hdfs

 hive (default)> dfs -mkdir /student;
hive (default)> dfs -put /opt/module/datas/student.txt /student;

(2) Create a table and specify the location on hdfs

create external table if not exists student5(
id int, name string)
row format delimited fields terminated by '\t'
location '/student; 

(3) Query data

hive (default)> select * from student5;

3.1.5 Import data into the specified Hive table

Note: export first, and then import the data.

 import table student2
from '/user/hive/warehouse/export/student';

3.2 data export

3.2.1 Insert export

(1) Export query results to local

 insert overwrite local directory 
'/opt/module/hive/data/export/student'
select * from student;

(2) Format and export query results to local

 insert overwrite local directory 
'/opt/module/hive/data/export/student1'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
select * from student;

(3) Export the query results to HDFS (no local)

  insert overwrite directory 
'/opt/module/hive/data/export/student1'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
select * from student;

3.2.2 exporting Hadoop commands to local

3.2.3 Hive Shell command export

Basic syntax: (hive -f/-e execute statement or script > file)

3.2.4 Export to HDFS


export and import are mainly used for Hive table migration between two Hadoop platform clusters.

3.2.5 clear data in the table (Truncate)

Note: Truncate can only delete management tables, not data in external tables

   truncate table student;

Tags: Big Data Hadoop hive

Posted on Sun, 05 Dec 2021 02:14:14 -0500 by shinyo