Difference between internal table and external table
Tables created without the EXTERNAL keyword are managed (internal) tables, and tables created with the EXTERNAL keyword are external tables.
The data of an internal table is managed by Hive itself, while the data of an external table is managed by HDFS.
The data storage location of the internal table is hive.metas ...
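A minimal HiveQL sketch of the distinction (the table names and the HDFS path are made up for illustration; this needs a running Hive installation):

```shell
# Hypothetical example: managed vs. external table in Hive.
# Dropping the managed table deletes its data files; dropping the
# external table leaves the files in HDFS untouched.
hive -e "
CREATE TABLE managed_logs (id INT, msg STRING);            -- data owned by Hive

CREATE EXTERNAL TABLE external_logs (id INT, msg STRING)
LOCATION '/data/logs';                                     -- data stays in HDFS

DROP TABLE managed_logs;     -- removes metadata AND the warehouse files
DROP TABLE external_logs;    -- removes metadata only; /data/logs survives
"
```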
Posted on Thu, 19 Mar 2020 11:47:10 -0400 by Jyotsna
1. Sqoop exports Hadoop data to MySQL
Premise: before exporting data from the Hadoop ecosystem to an RDBMS, the target table must already exist in the target database.
There are three modes of export:
1.1 The default mode is to INSERT the data from the files into the table using INSERT statements.
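The default (INSERT) mode can be sketched as a single command; hostname, database, table, and HDFS path below are placeholders, and the target `staff` table must already exist in MySQL:

```shell
# Hypothetical export: push tab-delimited HDFS files into an existing
# MySQL table; each record becomes one INSERT into company.staff.
sqoop export \
  --connect jdbc:mysql://hadoop102:3306/company \
  --username root \
  --password 000000 \
  --table staff \
  --export-dir /user/company/staff \
  --input-fields-terminated-by "\t" \
  --num-mappers 1
```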
Posted on Fri, 13 Mar 2020 04:01:51 -0400 by eskimowned
Hoping to process streaming data in pure SQL, Flink 1.10 introduced Hive integration that is ready for production use and has stronger streaming SQL processing capability. Let's try it this time~~
[Outline] 1. Environment preparation 2. SQL Client and Hive integration configuration 3. Reading Kafka data with ...
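Step 3 of the outline can be sketched roughly as follows, assuming a Flink 1.10 SQL Client already configured with a Hive catalog; the topic name, broker address, and columns are invented for illustration, and the 1.10 legacy `connector.*` option style is used:

```shell
# Hypothetical sketch: declare a Kafka topic as a streaming table in the
# Flink SQL Client and query it with plain SQL.
./bin/sql-client.sh embedded <<'SQL'
CREATE TABLE user_log (
  user_id STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector.type' = 'kafka',
  'connector.version' = 'universal',
  'connector.topic' = 'user_log',
  'connector.properties.bootstrap.servers' = 'localhost:9092',
  'format.type' = 'json'
);

-- continuous streaming aggregation in pure SQL
SELECT user_id, COUNT(*) FROM user_log GROUP BY user_id;
SQL
```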
Posted on Wed, 04 Mar 2020 01:51:22 -0500 by xtheonex
1, Flume installation and deployment
1.1 installation address
1. Flume official website: http://flume.apache.org/
2. Download address: http://archive.apache.org/dist/flume/
3. Documentation: http://flume.apache.org/FlumeUserGuide.html
1.2 installation and deployment
1. Upload apache-flume-1 ...
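Once Flume is unpacked, the smallest possible agent is a good smoke test; the sketch below (agent name, port, and file paths are arbitrary) wires a netcat source to a logger sink through a memory channel:

```shell
# Hypothetical minimal agent: netcat source -> memory channel -> logger sink.
cat > job/netcat-logger.conf <<'EOF'
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

a1.channels.c1.type     = memory
a1.channels.c1.capacity = 1000

a1.sinks.k1.type = logger

a1.sources.r1.channels = c1
a1.sinks.k1.channel    = c1
EOF

# start the agent and log events to the console
bin/flume-ng agent --conf conf --name a1 \
  --conf-file job/netcat-logger.conf \
  -Dflume.root.logger=INFO,console
```

Anything typed into `nc localhost 44444` in another terminal should then appear in the agent's console log.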
Posted on Wed, 26 Feb 2020 01:02:44 -0500 by john_zakaria
1. Spark SQL overview
2. The relationship and differences among RDD, DataFrame and Dataset in Spark
3. Overview of dataframe
3.1. [official api](http://spark.apache.org/docs/2.4.1/sql-getting-started.html)
3.2. Infer Schema by reflection
3.3. Specify Schema directly through StructType
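Items 3.2 and 3.3 of the outline contrast the two ways of attaching a schema to a DataFrame; a hedged sketch in `spark-shell` (the case class and column names are made up):

```shell
# Hypothetical sketch: schema by reflection vs. explicit StructType.
spark-shell <<'SCALA'
import org.apache.spark.sql.types._
import org.apache.spark.sql.Row

// 3.2 Infer the schema by reflection from a case class
case class Person(name: String, age: Int)
val byReflection = Seq(Person("alice", 30)).toDF()
byReflection.printSchema()

// 3.3 Specify the schema directly through StructType
val schema = StructType(Seq(
  StructField("name", StringType,  nullable = true),
  StructField("age",  IntegerType, nullable = true)))
val rows = spark.sparkContext.parallelize(Seq(Row("bob", 25)))
val byStructType = spark.createDataFrame(rows, schema)
byStructType.printSchema()
SCALA
```

Reflection is convenient when the row type is known at compile time; StructType is the choice when the schema is only known at runtime.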
Posted on Fri, 21 Feb 2020 04:49:14 -0500 by alohatofu
Chapter 9 enterprise level optimization
Fetch means that some Hive queries can be answered without running MapReduce. For example: SELECT * FROM employees; in this case, Hive can simply read the files in the storage directory corresponding to the employees table and output the query ...
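This behavior is governed by the `hive.fetch.task.conversion` property; a short sketch (the `employees` table is taken from the example above, and a running Hive is assumed):

```shell
# Toggle Hive's fetch task conversion: with "more" (the modern default),
# simple selects, filters, and LIMIT queries skip MapReduce entirely;
# with "none", every query goes through MapReduce.
hive -e "
SET hive.fetch.task.conversion=more;   -- plain file read, no MR job
SELECT * FROM employees LIMIT 10;

SET hive.fetch.task.conversion=none;   -- same query now launches MR
SELECT * FROM employees LIMIT 10;
"
```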
Posted on Mon, 17 Feb 2020 22:19:15 -0500 by lazytiger
Sqoop of big data technology
Chapter 1 Introduction to Sqoop
Sqoop is an open source tool mainly used to transfer data between Hadoop (Hive) and traditional databases (MySQL, PostgreSQL). You can import data from a relational database (such as MySQL, Oracle, Postgres, etc.) into Hadoop's HDFS, or i ...
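The import direction described above looks like this in practice; hostname, database, table, and target path are placeholders:

```shell
# Hypothetical import: copy the MySQL table company.staff into HDFS
# as tab-delimited text files under /user/company/staff.
sqoop import \
  --connect jdbc:mysql://hadoop102:3306/company \
  --username root \
  --password 000000 \
  --table staff \
  --target-dir /user/company/staff \
  --fields-terminated-by "\t" \
  --num-mappers 1
```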
Posted on Wed, 12 Feb 2020 23:54:15 -0500 by vickie
1, atlas Compilation and packaging
2, atlas installation configuration
1. Compilation environment
2. Compilation steps
3. Installation steps
4. Hive hook configuration
5. Operation test
3, atlas configuration hive hook
4, Introduction to atlas
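Step 4 above (Hive hook configuration) usually amounts to registering Atlas's post-execution hook with Hive; a hedged sketch, with the `/opt/module` install paths assumed for illustration:

```shell
# Hypothetical sketch of the Hive hook configuration:
# 1) register the Atlas post-execution hook in hive-site.xml, so every
#    Hive DDL/DML is reported to Atlas as lineage metadata:
#      <property>
#        <name>hive.exec.post.hooks</name>
#        <value>org.apache.atlas.hive.hook.HiveHook</value>
#      </property>
# 2) make the hook jars and the Atlas client config visible to Hive:
export HIVE_AUX_JARS_PATH=/opt/module/atlas/hook/hive
cp /opt/module/atlas/conf/atlas-application.properties $HIVE_HOME/conf/
```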
Posted on Tue, 11 Feb 2020 03:02:02 -0500 by bensonang
1 Ambari + HDP offline installation
1.1.1 introduction to ambari
1.2 address of ambari official website
1.3 Ambari and HDP Downloads
1.4 system requirements
1.4.1 software requirements
1.5 modify the maximum number of open files
1.6 cluster node plann ...
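Section 1.5 (modifying the maximum number of open files) can be sketched as follows; the limit value 65536 is a common choice, not a requirement, and the change must be made on every cluster node:

```shell
# Hypothetical sketch: raise the per-user file-descriptor limit.
ulimit -n                      # show the current soft limit (often 1024)

cat >> /etc/security/limits.conf <<'EOF'
*  soft  nofile  65536
*  hard  nofile  65536
EOF
# log out and back in (or reboot) for the new limits to take effect
```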
Posted on Fri, 07 Feb 2020 06:30:42 -0500 by ryanyoungsma
Before learning HBase, we had doubts: although HBase can store hundreds of millions or even billions of rows of data, it is not very friendly for data analysis. It only provides simple, fast key-based lookups and cannot perform large-scale conditional qu ...
Posted on Fri, 31 Jan 2020 14:23:36 -0500 by buck2bcr