Data warehouse construction - DWD layer
1) Analyze user behavior data.
2) Remodel the business data using the dimensional model.
Log parsing idea
1) Log structure review
(1) Page tracking (buried-point) log
(2) Startup log
2) Log parsing idea
Usage of the get_json_object function
1) Data
[{"name":"Dalang","sex":"male","age":"25"},{"name":"ximen qing","sex":"male","age":"47"}]
2) Take out the first json object
hive (gmall)> select get_json_object('[{"name":"Dalang","sex":"male","age":"25"},{"name":"ximen qing","sex":"male","age":"47"}]','$[0]');
The result is: {"name": "Dalang", "sex": "male", "age": "25"}
3) Get the value of the age field of the first json
hive (gmall)> SELECT get_json_object('[{"name":"Dalang","sex":"male","age":"25"},{"name":"ximen qing","sex":"male","age":"47"}]',"$[0].age");
The result is: 25
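4) Get the name of the second json object (an added example for illustration; the same path syntax indexes any element of the array)
hive (gmall)> SELECT get_json_object('[{"name":"Dalang","sex":"male","age":"25"},{"name":"ximen qing","sex":"male","age":"47"}]','$[1].name');
The result should be: ximen qing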
Startup log table
Startup log analysis idea: each row in the startup log table corresponds to one startup record. A startup record contains the common information and the startup information from the log. First filter out all log lines that contain the start field, then use the get_json_object function to parse out each field.
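Before loading, the filter condition can be sanity-checked directly against ods_log. A minimal added sketch (assuming the 2020-06-14 partition of ods_log is already loaded) that counts how many raw lines qualify as startup logs:
hive (gmall)> select count(*) from ods_log where dt='2020-06-14' and get_json_object(line,'$.start') is not null;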
1) Create table statement
DROP TABLE IF EXISTS dwd_start_log;
CREATE EXTERNAL TABLE dwd_start_log(
    `area_code` STRING COMMENT 'Area code',
    `brand` STRING COMMENT 'Mobile phone brand',
    `channel` STRING COMMENT 'Channel',
    `is_new` STRING COMMENT 'Whether this is the first start',
    `model` STRING COMMENT 'Mobile phone model',
    `mid_id` STRING COMMENT 'Device id',
    `os` STRING COMMENT 'Operating system',
    `user_id` STRING COMMENT 'Member id',
    `version_code` STRING COMMENT 'App version number',
    `entry` STRING COMMENT 'Entry: icon (phone icon), notice (notification), install (started after installation)',
    `loading_time` BIGINT COMMENT 'Startup loading time',
    `open_ad_id` STRING COMMENT 'Opening ad id',
    `open_ad_ms` BIGINT COMMENT 'Total ad display time',
    `open_ad_skip_ms` BIGINT COMMENT 'Time point at which the user skipped the ad',
    `ts` BIGINT COMMENT 'Time'
) COMMENT 'Startup log table'
PARTITIONED BY (`dt` STRING)                      -- partition by date
STORED AS PARQUET                                 -- Parquet columnar storage
LOCATION '/warehouse/gmall/dwd/dwd_start_log'     -- storage location on HDFS
TBLPROPERTIES('parquet.compression'='lzo');       -- LZO compression
2) Data import
hive (gmall)>
insert overwrite table dwd_start_log partition(dt='2020-06-14')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.start.entry'),
    get_json_object(line,'$.start.loading_time'),
    get_json_object(line,'$.start.open_ad_id'),
    get_json_object(line,'$.start.open_ad_ms'),
    get_json_object(line,'$.start.open_ad_skip_ms'),
    get_json_object(line,'$.ts')
from ods_log
where dt='2020-06-14'
and get_json_object(line,'$.start') is not null;
3) View data
hive (gmall)> select * from dwd_start_log where dt='2020-06-14' limit 2;
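As an added sanity check, the row count of the new partition should match the count of startup records filtered from ods_log above:
hive (gmall)> select count(*) from dwd_start_log where dt='2020-06-14';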
Page log table
Page log analysis idea: each row in the page log table corresponds to one page visit record. A page visit record contains the common information and the page information from the log. First filter out all log lines that contain the page field, then use the get_json_object function to parse out each field.
1) Create table statement
DROP TABLE IF EXISTS dwd_page_log;
CREATE EXTERNAL TABLE dwd_page_log(
    `area_code` STRING COMMENT 'Area code',
    `brand` STRING COMMENT 'Mobile phone brand',
    `channel` STRING COMMENT 'Channel',
    `is_new` STRING COMMENT 'Whether this is the first start',
    `model` STRING COMMENT 'Mobile phone model',
    `mid_id` STRING COMMENT 'Device id',
    `os` STRING COMMENT 'Operating system',
    `user_id` STRING COMMENT 'Member id',
    `version_code` STRING COMMENT 'App version number',
    `during_time` BIGINT COMMENT 'Stay duration (ms)',
    `page_item` STRING COMMENT 'Target id',
    `page_item_type` STRING COMMENT 'Target type',
    `last_page_id` STRING COMMENT 'Previous page id',
    `page_id` STRING COMMENT 'Page id',
    `source_type` STRING COMMENT 'Source type',
    `ts` BIGINT COMMENT 'Time'
) COMMENT 'Page log table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_page_log'
TBLPROPERTIES('parquet.compression'='lzo');
2) Data import
hive (gmall)>
insert overwrite table dwd_page_log partition(dt='2020-06-14')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.page.during_time'),
    get_json_object(line,'$.page.item'),
    get_json_object(line,'$.page.item_type'),
    get_json_object(line,'$.page.last_page_id'),
    get_json_object(line,'$.page.page_id'),
    get_json_object(line,'$.page.source_type'),
    get_json_object(line,'$.ts')
from ods_log
where dt='2020-06-14'
and get_json_object(line,'$.page') is not null;
3) View data
hive (gmall)> select * from dwd_page_log where dt='2020-06-14' limit 2;
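Once the partition is loaded, the table can already answer simple questions. An added usage sketch that counts the day's visits per page:
hive (gmall)> select page_id, count(*) from dwd_page_log where dt='2020-06-14' group by page_id;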
Action log table
Action log analysis idea: each row in the action log table corresponds to one user action record. An action record contains the common information, the page information and the action information from the log. First filter out the log lines that contain the action field, then use a UDTF function to "explode" the action array (similar to the effect of the explode function), and finally use the get_json_object function to parse out each field.
1) Create table statement
DROP TABLE IF EXISTS dwd_action_log;
CREATE EXTERNAL TABLE dwd_action_log(
    `area_code` STRING COMMENT 'Area code',
    `brand` STRING COMMENT 'Mobile phone brand',
    `channel` STRING COMMENT 'Channel',
    `is_new` STRING COMMENT 'Whether this is the first start',
    `model` STRING COMMENT 'Mobile phone model',
    `mid_id` STRING COMMENT 'Device id',
    `os` STRING COMMENT 'Operating system',
    `user_id` STRING COMMENT 'Member id',
    `version_code` STRING COMMENT 'App version number',
    `during_time` BIGINT COMMENT 'Stay duration (ms)',
    `page_item` STRING COMMENT 'Page target id',
    `page_item_type` STRING COMMENT 'Page target type',
    `last_page_id` STRING COMMENT 'Previous page id',
    `page_id` STRING COMMENT 'Page id',
    `source_type` STRING COMMENT 'Source type',
    `action_id` STRING COMMENT 'Action id',
    `item` STRING COMMENT 'Action target id',
    `item_type` STRING COMMENT 'Action target type',
    `ts` BIGINT COMMENT 'Time'
) COMMENT 'Action log table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_action_log'
TBLPROPERTIES('parquet.compression'='lzo');
2) Creating the UDTF function -- design idea
3) Creating the UDTF function -- writing the code
(1) Create a Maven project: gmall-UDTF
(2) Create package name: com.huan
(3) Introduce the following dependencies
<dependencies>
    <!-- Add the Hive dependency -->
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>3.1.2</version>
    </dependency>
</dependencies>
(4) Code
package com.huan;

import java.util.ArrayList;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
import org.json.JSONArray;

public class ExplodeJSONArray extends GenericUDTF {

    private PrimitiveObjectInspector inputOI;

    @Override
    public StructObjectInspector initialize(ObjectInspector[] argOIs) throws UDFArgumentException {

        // 1. Validate the arguments: exactly one parameter is allowed
        if (argOIs.length != 1) {
            throw new UDFArgumentException("explode_json_array accepts exactly 1 argument");
        }

        ObjectInspector argOI = argOIs[0];

        // 2. The parameter must be a primitive type
        if (argOI.getCategory() != ObjectInspector.Category.PRIMITIVE) {
            throw new UDFArgumentException("explode_json_array only accepts arguments of a primitive type");
        }

        // Cast the ObjectInspector to a PrimitiveObjectInspector
        PrimitiveObjectInspector primitiveOI = (PrimitiveObjectInspector) argOI;
        inputOI = primitiveOI;

        // The parameter must be of STRING type
        if (primitiveOI.getPrimitiveCategory() != PrimitiveObjectInspector.PrimitiveCategory.STRING) {
            throw new UDFArgumentException("explode_json_array only accepts arguments of STRING type");
        }

        // 3. Define the name and type of the output column
        ArrayList<String> fieldNames = new ArrayList<String>();
        ArrayList<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>();
        fieldNames.add("item");
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs);
    }

    @Override
    public void process(Object[] args) throws HiveException {
        // 1. Get the incoming data
        Object arg = args[0];
        String jsonArrayStr = PrimitiveObjectInspectorUtils.getString(arg, inputOI);

        // 2. Convert the string into a JSON array
        JSONArray jsonArray = new JSONArray(jsonArrayStr);

        // 3. Iterate over the array and forward each JSON element as one output row
        for (int i = 0; i < jsonArray.length(); i++) {
            String json = jsonArray.getString(i);
            String[] result = {json};
            forward(result);
        }
    }

    @Override
    public void close() throws HiveException {
    }
}
4) Create function
(1) Package the project into a jar
(2) Upload gmall-UDTF-1.0-SNAPSHOT.jar to /opt/module on hadoop102, and then upload the jar package to the /user/hive/jars path on HDFS
[root@hadoop102 module]$ hadoop fs -mkdir -p /user/hive/jars
[root@hadoop102 module]$ hadoop fs -put gmall-UDTF-1.0-SNAPSHOT.jar /user/hive/jars
(3) Create a permanent function associated with the developed java class
create function explode_json_array as 'com.huan.ExplodeJSONArray'
using jar 'hdfs://hadoop102:8020/user/hive/jars/gmall-UDTF-1.0-SNAPSHOT.jar';
(4) Note: if the custom function is modified and the jar package is regenerated, simply replace the old jar package at the same HDFS path and restart the Hive client.
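After the function is registered in step (3), it can be smoke-tested on a literal JSON array before being applied to ods_log. A minimal added sketch (the array content below is made up for illustration):
hive (gmall)> select explode_json_array('[{"action_id":"cart_add"},{"action_id":"favor_add"}]');
Each element of the array should come back as one row containing that element as a string.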
5) Data import
insert overwrite table dwd_action_log partition(dt='2020-06-14')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.page.during_time'),
    get_json_object(line,'$.page.item'),
    get_json_object(line,'$.page.item_type'),
    get_json_object(line,'$.page.last_page_id'),
    get_json_object(line,'$.page.page_id'),
    get_json_object(line,'$.page.source_type'),
    get_json_object(action,'$.action_id'),
    get_json_object(action,'$.item'),
    get_json_object(action,'$.item_type'),
    get_json_object(action,'$.ts')
from ods_log
lateral view explode_json_array(get_json_object(line,'$.actions')) tmp as action
where dt='2020-06-14'
and get_json_object(line,'$.actions') is not null;
6) View data
select * from dwd_action_log where dt='2020-06-14' limit 2;
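An added usage sketch: count how many distinct devices performed each action on that day:
hive (gmall)> select action_id, count(distinct mid_id) from dwd_action_log where dt='2020-06-14' group by action_id;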
Exposure log table
Exposure log analysis idea: each row in the exposure log table corresponds to one exposure record. An exposure record contains the common information, the page information and the exposure information from the log. First filter out the log lines that contain the display field, then use the UDTF function to "explode" the display array (similar to the effect of the explode function), and finally use the get_json_object function to parse out each field.
1) Create table statement
DROP TABLE IF EXISTS dwd_display_log;
CREATE EXTERNAL TABLE dwd_display_log(
    `area_code` STRING COMMENT 'Area code',
    `brand` STRING COMMENT 'Mobile phone brand',
    `channel` STRING COMMENT 'Channel',
    `is_new` STRING COMMENT 'Whether this is the first start',
    `model` STRING COMMENT 'Mobile phone model',
    `mid_id` STRING COMMENT 'Device id',
    `os` STRING COMMENT 'Operating system',
    `user_id` STRING COMMENT 'Member id',
    `version_code` STRING COMMENT 'App version number',
    `during_time` BIGINT COMMENT 'Stay duration (ms)',
    `page_item` STRING COMMENT 'Page target id',
    `page_item_type` STRING COMMENT 'Page target type',
    `last_page_id` STRING COMMENT 'Previous page id',
    `page_id` STRING COMMENT 'Page id',
    `source_type` STRING COMMENT 'Source type',
    `ts` BIGINT COMMENT 'Time',
    `display_type` STRING COMMENT 'Exposure type',
    `item` STRING COMMENT 'Exposure object id',
    `item_type` STRING COMMENT 'Exposure object type',
    `order` BIGINT COMMENT 'Exposure order',
    `pos_id` BIGINT COMMENT 'Exposure position'
) COMMENT 'Exposure log table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_display_log'
TBLPROPERTIES('parquet.compression'='lzo');
2) Data import
insert overwrite table dwd_display_log partition(dt='2020-06-14')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.page.during_time'),
    get_json_object(line,'$.page.item'),
    get_json_object(line,'$.page.item_type'),
    get_json_object(line,'$.page.last_page_id'),
    get_json_object(line,'$.page.page_id'),
    get_json_object(line,'$.page.source_type'),
    get_json_object(line,'$.ts'),
    get_json_object(display,'$.display_type'),
    get_json_object(display,'$.item'),
    get_json_object(display,'$.item_type'),
    get_json_object(display,'$.order'),
    get_json_object(display,'$.pos_id')
from ods_log
lateral view explode_json_array(get_json_object(line,'$.displays')) tmp as display
where dt='2020-06-14'
and get_json_object(line,'$.displays') is not null;
3) View data
select * from dwd_display_log where dt='2020-06-14' limit 2;
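An added usage sketch: break the day's exposures down by exposure type:
hive (gmall)> select display_type, count(*) from dwd_display_log where dt='2020-06-14' group by display_type;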
Error log table
Error log analysis idea: each row in the error log table corresponds to one error record. To make error locating easier, an error record contains the corresponding common information, page information, exposure information, action information, startup information and error information from the log. First filter out the log lines that contain the err field, then use the get_json_object function to parse out all the fields.
1) Create table statement
DROP TABLE IF EXISTS dwd_error_log;
CREATE EXTERNAL TABLE dwd_error_log(
    `area_code` STRING COMMENT 'Area code',
    `brand` STRING COMMENT 'Mobile phone brand',
    `channel` STRING COMMENT 'Channel',
    `is_new` STRING COMMENT 'Whether this is the first start',
    `model` STRING COMMENT 'Mobile phone model',
    `mid_id` STRING COMMENT 'Device id',
    `os` STRING COMMENT 'Operating system',
    `user_id` STRING COMMENT 'Member id',
    `version_code` STRING COMMENT 'App version number',
    `page_item` STRING COMMENT 'Page target id',
    `page_item_type` STRING COMMENT 'Page target type',
    `last_page_id` STRING COMMENT 'Previous page id',
    `page_id` STRING COMMENT 'Page id',
    `source_type` STRING COMMENT 'Source type',
    `entry` STRING COMMENT 'Entry: icon (phone icon), notice (notification), install (started after installation)',
    `loading_time` STRING COMMENT 'Startup loading time',
    `open_ad_id` STRING COMMENT 'Opening ad id',
    `open_ad_ms` STRING COMMENT 'Total ad display time',
    `open_ad_skip_ms` STRING COMMENT 'Time point at which the user skipped the ad',
    `actions` STRING COMMENT 'Actions',
    `displays` STRING COMMENT 'Exposures',
    `ts` STRING COMMENT 'Time',
    `error_code` STRING COMMENT 'Error code',
    `msg` STRING COMMENT 'Error message'
) COMMENT 'Error log table'
PARTITIONED BY (`dt` STRING)
STORED AS PARQUET
LOCATION '/warehouse/gmall/dwd/dwd_error_log'
TBLPROPERTIES('parquet.compression'='lzo');
Note: here the action array and the exposure array are kept as raw strings. If the association between an error and a single action or exposure needs to be analyzed, the explode_json_array function can first be used to "explode" the arrays, and then the get_json_object function can be used to extract the specific fields.
2) Data import
insert overwrite table dwd_error_log partition(dt='2020-06-14')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.page.item'),
    get_json_object(line,'$.page.item_type'),
    get_json_object(line,'$.page.last_page_id'),
    get_json_object(line,'$.page.page_id'),
    get_json_object(line,'$.page.source_type'),
    get_json_object(line,'$.start.entry'),
    get_json_object(line,'$.start.loading_time'),
    get_json_object(line,'$.start.open_ad_id'),
    get_json_object(line,'$.start.open_ad_ms'),
    get_json_object(line,'$.start.open_ad_skip_ms'),
    get_json_object(line,'$.actions'),
    get_json_object(line,'$.displays'),
    get_json_object(line,'$.ts'),
    get_json_object(line,'$.err.error_code'),
    get_json_object(line,'$.err.msg')
from ods_log
where dt='2020-06-14'
and get_json_object(line,'$.err') is not null;
3) View data
hive (gmall)> select * from dwd_error_log where dt='2020-06-14' limit 2;
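As the note above suggests, the stored actions string can later be exploded to relate an error to the individual actions on the page. A hedged sketch (assuming the explode_json_array function created earlier):
hive (gmall)> select error_code, get_json_object(action,'$.action_id') from dwd_error_log lateral view explode_json_array(actions) tmp as action where dt='2020-06-14' and actions is not null;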
DWD layer user behavior data loading script
1) Script
(1) Create the script in the /root/bin directory of hadoop102
[root@hadoop102 bin]$ vim ods_to_dwd_log.sh
Write the following in the script
#!/bin/bash
APP=gmall

# If a date is passed in as the second argument, use it; otherwise use the day before the current date
if [ -n "$2" ]; then
    do_date=$2
else
    do_date=`date -d "-1 day" +%F`
fi

dwd_start_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_start_log partition(dt='$do_date')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.start.entry'),
    get_json_object(line,'$.start.loading_time'),
    get_json_object(line,'$.start.open_ad_id'),
    get_json_object(line,'$.start.open_ad_ms'),
    get_json_object(line,'$.start.open_ad_skip_ms'),
    get_json_object(line,'$.ts')
from ${APP}.ods_log
where dt='$do_date'
and get_json_object(line,'$.start') is not null;"

dwd_page_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_page_log partition(dt='$do_date')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.page.during_time'),
    get_json_object(line,'$.page.item'),
    get_json_object(line,'$.page.item_type'),
    get_json_object(line,'$.page.last_page_id'),
    get_json_object(line,'$.page.page_id'),
    get_json_object(line,'$.page.source_type'),
    get_json_object(line,'$.ts')
from ${APP}.ods_log
where dt='$do_date'
and get_json_object(line,'$.page') is not null;"

dwd_action_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_action_log partition(dt='$do_date')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.page.during_time'),
    get_json_object(line,'$.page.item'),
    get_json_object(line,'$.page.item_type'),
    get_json_object(line,'$.page.last_page_id'),
    get_json_object(line,'$.page.page_id'),
    get_json_object(line,'$.page.source_type'),
    get_json_object(action,'$.action_id'),
    get_json_object(action,'$.item'),
    get_json_object(action,'$.item_type'),
    get_json_object(action,'$.ts')
from ${APP}.ods_log
lateral view ${APP}.explode_json_array(get_json_object(line,'$.actions')) tmp as action
where dt='$do_date'
and get_json_object(line,'$.actions') is not null;"

dwd_display_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_display_log partition(dt='$do_date')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.page.during_time'),
    get_json_object(line,'$.page.item'),
    get_json_object(line,'$.page.item_type'),
    get_json_object(line,'$.page.last_page_id'),
    get_json_object(line,'$.page.page_id'),
    get_json_object(line,'$.page.source_type'),
    get_json_object(line,'$.ts'),
    get_json_object(display,'$.display_type'),
    get_json_object(display,'$.item'),
    get_json_object(display,'$.item_type'),
    get_json_object(display,'$.order'),
    get_json_object(display,'$.pos_id')
from ${APP}.ods_log
lateral view ${APP}.explode_json_array(get_json_object(line,'$.displays')) tmp as display
where dt='$do_date'
and get_json_object(line,'$.displays') is not null;"

dwd_error_log="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_error_log partition(dt='$do_date')
select
    get_json_object(line,'$.common.ar'),
    get_json_object(line,'$.common.ba'),
    get_json_object(line,'$.common.ch'),
    get_json_object(line,'$.common.is_new'),
    get_json_object(line,'$.common.md'),
    get_json_object(line,'$.common.mid'),
    get_json_object(line,'$.common.os'),
    get_json_object(line,'$.common.uid'),
    get_json_object(line,'$.common.vc'),
    get_json_object(line,'$.page.item'),
    get_json_object(line,'$.page.item_type'),
    get_json_object(line,'$.page.last_page_id'),
    get_json_object(line,'$.page.page_id'),
    get_json_object(line,'$.page.source_type'),
    get_json_object(line,'$.start.entry'),
    get_json_object(line,'$.start.loading_time'),
    get_json_object(line,'$.start.open_ad_id'),
    get_json_object(line,'$.start.open_ad_ms'),
    get_json_object(line,'$.start.open_ad_skip_ms'),
    get_json_object(line,'$.actions'),
    get_json_object(line,'$.displays'),
    get_json_object(line,'$.ts'),
    get_json_object(line,'$.err.error_code'),
    get_json_object(line,'$.err.msg')
from ${APP}.ods_log
where dt='$do_date'
and get_json_object(line,'$.err') is not null;"

case $1 in
    dwd_start_log )
        hive -e "$dwd_start_log"
    ;;
    dwd_page_log )
        hive -e "$dwd_page_log"
    ;;
    dwd_action_log )
        hive -e "$dwd_action_log"
    ;;
    dwd_display_log )
        hive -e "$dwd_display_log"
    ;;
    dwd_error_log )
        hive -e "$dwd_error_log"
    ;;
    all )
        hive -e "$dwd_start_log$dwd_page_log$dwd_action_log$dwd_display_log$dwd_error_log"
    ;;
esac
(2) Add execute permission to the script
[root@hadoop102 bin]$ chmod 777 ods_to_dwd_log.sh
2) Script use
(1) Execute script
[root@hadoop102 module]$ ods_to_dwd_log.sh all 2020-06-14
(2) Query import results
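For example, the partitions written by the script can be spot-checked (an added sketch):
hive (gmall)> show partitions dwd_start_log;
hive (gmall)> select count(*) from dwd_error_log where dt='2020-06-14';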