Differences between Hive internal and external tables

Tables created without the EXTERNAL keyword are managed (internal) tables, and tables created with the EXTERNAL keyword are external tables. The data of an internal table is managed by Hive itself, while the data of an external table is managed by HDFS. The data storage location of the internal table is hive.metas ...

Posted on Thu, 19 Mar 2020 11:47:10 -0400 by Jyotsna
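
As a rough illustration of the managed versus external distinction in the entry above, the following sketch builds the two kinds of Hive DDL statements as strings. The helper `build_create_table`, the table names, and the `/data/emp` path are hypothetical and chosen only for this example. The one Hive behavior it relies on is that the `EXTERNAL` keyword marks a table as external. In Hive, dropping a managed table also removes its data, while dropping an external table removes only the metadata.

```python
def build_create_table(name, columns, external=False, location=None):
    """Return a Hive CREATE TABLE statement as a string.

    Hypothetical helper for illustration: it only formats text and
    does not connect to Hive.
    """
    # The EXTERNAL keyword is what distinguishes the two table kinds.
    keyword = "CREATE EXTERNAL TABLE" if external else "CREATE TABLE"
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    ddl = f"{keyword} {name} ({cols})"
    # LOCATION points the table at an explicit storage directory.
    if location is not None:
        ddl += f" LOCATION '{location}'"
    return ddl


columns = [("id", "INT"), ("name", "STRING")]

# Managed (internal) table: Hive owns the data files.
managed_ddl = build_create_table("emp_managed", columns)

# External table: the files under the assumed path outlive DROP TABLE.
external_ddl = build_create_table(
    "emp_external", columns, external=True, location="/data/emp"
)

print(managed_ddl)
print(external_ddl)
```

The generated statements could be pasted into the Hive CLI or Beeline. The sketch deliberately leaves out storage format and partitioning clauses.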

Learn Sqoop II: exporting Hadoop data to MySQL with Sqoop, and Sqoop jobs

1. Sqoop exports Hadoop data to MySQL. Premise: before exporting data from the Hadoop ecosystem to an RDBMS, the target table must already exist in the target database. There are three export modes: 1.1 By default, the data in the files is inserted into the table using INSERT statements. Observe ...

Posted on Fri, 13 Mar 2020 04:01:51 -0400 by eskimowned
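
The default INSERT-mode export described in the entry above can be sketched as a command line assembled in Python. This is a minimal sketch, not the post's own script. The helper `build_sqoop_export`, the JDBC URL, the user name, the table name, and the HDFS directory are all assumptions made for illustration. The flags are standard `sqoop export` options. Because no update option is passed, Sqoop would use its default INSERT behavior, and the target table must already exist in MySQL.

```python
import shlex


def build_sqoop_export(connect, username, table, export_dir, field_sep=","):
    """Return the argument list for a default (INSERT-mode) sqoop export.

    Hypothetical helper: it only builds the arguments and does not run Sqoop.
    """
    return [
        "sqoop", "export",
        "--connect", connect,            # JDBC URL of the target database
        "--username", username,
        "--table", table,                # must already exist in the RDBMS
        "--export-dir", export_dir,      # HDFS directory holding the files
        "--input-fields-terminated-by", field_sep,
    ]


# All values below are placeholders for illustration only.
args = build_sqoop_export(
    connect="jdbc:mysql://localhost:3306/testdb",
    username="hypothetical_user",
    table="employees",
    export_dir="/user/hive/warehouse/employees",
)

# Quote each argument so the line can be pasted into a shell.
command_line = " ".join(shlex.quote(a) for a in args)
print(command_line)
```

Keeping the arguments as a list rather than a single string means they could be handed to `subprocess.run` directly on a machine where Sqoop is installed, without any shell quoting concerns.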

Integrating Flink SQL Client 1.10 with Hive to read real-time data

For those looking forward to processing streaming data in pure SQL: Flink 1.10 introduced production-ready Hive integration, along with stronger streaming SQL processing capabilities. Let's try it out. [Outline] 1. Environment preparation 2. SQL Client and Hive integration configuration 3. Reading Kafka data with ...

Posted on Wed, 04 Mar 2020 01:51:22 -0500 by xtheonex

Flume deployment and introduction case

1. Flume installation and deployment 1.1 Installation addresses 1. Flume official website: http://flume.apache.org/ 2. Download address: http://archive.apache.org/dist/flume/ 3. Documentation: http://flume.apache.org/FlumeUserGuide.html 1.2 Installation and deployment 1. Upload apache-flume-1 ...

Posted on Wed, 26 Feb 2020 01:02:44 -0500 by john_zakaria

Spark Learning Journey: Spark SQL

Table of contents 1. Spark SQL overview 2. The relationship and differences between RDD, DataFrame and Dataset in Spark 3. Overview of DataFrame 3.1. [official api](http://spark.apache.org/docs/2.4.1/sql-getting-started.html) 3.2. Infer Schema by reflection 3.3. Specify Schema directly through StructType ...

Posted on Fri, 21 Feb 2020 04:49:14 -0500 by alohatofu

Fast learning - Hive enterprise level tuning

Chapter 9 Enterprise-level optimization 9.1 Fetch. Fetch means that some Hive queries can be answered without running MapReduce. For example, for SELECT * FROM employees; Hive can simply read the files in the storage directory of the employees table and then output the query ...

Posted on Mon, 17 Feb 2020 22:19:15 -0500 by lazytiger

Sqoop of big data technology

Chapter 1 Introduction to Sqoop. Sqoop is an open source tool mainly used to transfer data between Hadoop (Hive) and traditional databases (MySQL, PostgreSQL). You can import data from a relational database (such as MySQL, Oracle, Postgres, etc.) into HDFS of Hadoop, or i ...

Posted on Wed, 12 Feb 2020 23:54:15 -0500 by vickie

27. Data governance of yiee data operation system of Duoyi Education - atlas deployment and use

Contents I. Atlas compilation and packaging II. Atlas installation and configuration 1. Compilation environment 2. Compilation steps 3. Installation steps 4. Hive hook configuration 5. Operation test III. Configuring the Atlas Hive hook IV. Introduction to Atlas 1. Basic Search ...

Posted on Tue, 11 Feb 2020 03:02:02 -0500 by bensonang

Ambari 2.7.0 + HDP 3.1.4.0 installation, HDFS data backup and recovery, Hive data backup and recovery, HBase data backup and recovery

Contents 1 Ambari + HDP offline installation 1.1 Introduction 1.1.1 Introduction to Ambari 1.1.2 HDP 1.1.3 HDP-UTILS 1.2 Ambari official website address 1.3 Ambari and HDP downloads 1.4 System requirements 1.4.1 Software requirements 1.5 Modify the maximum number of open files 1.6 Cluster node plann ...

Posted on Fri, 07 Feb 2020 06:30:42 -0500 by ryanyoungsma

How to integrate Hive and HBase

Version description: HDP 3.0.1.0, Hive 3.1.0, HBase 2.0.0. I. Preface. Before learning HBase, we had doubts. Although HBase can store hundreds of millions or even billions of rows of data, it is not well suited to data analysis. It only provides simple, fast lookups based on key values, and cannot perform a large number of conditional qu ...

Posted on Fri, 31 Jan 2020 14:23:36 -0500 by buck2bcr