0865-5.16.2 - How to build a DolphinScheduler cluster and integrate it with a secure CDH cluster

1. Purpose of this document

Apache DolphinScheduler (hereinafter DS) is a distributed, decentralized, easily extensible visual DAG workflow task scheduling platform. It aims to resolve the complex dependencies that arise in data processing so that the scheduling system can be used out of the box. This document describes how to build a DolphinScheduler cluster and integrate it with a secure (Kerberos-enabled) CDH cluster.

High reliability: a decentralized, peer-to-peer architecture with multiple Masters and multiple Workers avoids excessive pressure on a single Master, and a task buffer queue protects against overload.

Easy to use: a DAG monitoring interface where all process definitions are visual; custom DAGs are built by dragging and dropping tasks; third-party systems integrate through the API; deployment is one-click.

Rich usage scenarios: supports multi-tenancy and pause/resume operations, fits closely with the big data ecosystem, and provides nearly 20 task types such as Spark, Hive, M/R, Python, sub_process and shell.

High scalability: supports user-defined task types. Scheduling is distributed, so scheduling capacity grows linearly with the cluster, and Masters and Workers can be brought online and taken offline dynamically.

  • Test environment description

1. CM and CDH versions are 5.16.2

2. Kerberos is enabled on the cluster

3. DolphinScheduler version is 1.3.8

4. HA is enabled for the cluster's HDFS and Yarn services

5. The operating system is RedHat 7.6

2. Deployment environment description

Three nodes are used here to deploy a highly available DolphinScheduler cluster. The deployment nodes and role assignments are as follows:

3. Prerequisites

1. Prepare the latest DolphinScheduler installation package. The address is as follows:


Upload the downloaded installation package to the /root directory of the cdh02.fayson.com node.

2. Modify the /etc/profile configuration file on all nodes of the cluster to add the JDK environment variables. The configuration is as follows:

export JAVA_HOME=/usr/java/jdk1.8.0_232-cloudera/
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$CLASSPATH

Run the source /etc/profile command to make the change take effect immediately.

Note: configure the Java environment variables on all nodes of the DolphinScheduler cluster.
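
As a quick sanity check for the step above, the following sketch tests whether a JDK home actually contains an executable bin/java. The function name check_java_home is a hypothetical helper introduced here for illustration; it is not shipped with DolphinScheduler or CDH.

```shell
# Hypothetical helper: succeeds only if the given directory, or $JAVA_HOME
# when no argument is passed, contains an executable bin/java.
check_java_home() {
  dir="${1:-${JAVA_HOME:-}}"
  [ -n "$dir" ] && [ -x "$dir/bin/java" ]
}

# Example, run on each node after `source /etc/profile`:
# check_java_home && echo "JAVA_HOME looks usable" || echo "JAVA_HOME is not set correctly"
```

Running the commented example on every node catches a missing or mistyped JAVA_HOME before the installer does.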

3. Ensure that the psmisc package is installed on all nodes of the cluster. The installation command is as follows:

yum -y install psmisc

4. DS cluster installation depends on ZooKeeper. Because Fayson's DS is integrated with the CDH cluster, the ZooKeeper service in that cluster is used and no standalone ZooKeeper is installed.

5. Create the DS deployment user and the hosts mapping on all nodes of the cluster. Configure the /etc/hosts mapping file on all nodes as follows:

On all nodes of the cluster, run the following commands as root to add a dolphin user to the operating system, set the user's password to dolphin123, and grant the user passwordless sudo permission:

useradd dolphin
echo "dolphin123" | passwd --stdin dolphin
echo 'dolphin  ALL=(ALL)  NOPASSWD: ALL' >> /etc/sudoers
sed -i 's/Defaults    requiretty/#Defaults    requiretty/g' /etc/sudoers

Execute the following commands on all nodes to switch to the dolphin user and verify that passwordless sudo is configured correctly:

su - dolphin
sudo ls /root/

Note: the deployment user dolphin must be created and granted passwordless sudo. Otherwise DS cannot switch to other Linux users through sudo -u, which is how it runs jobs for multiple tenants.
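
A possible alternative to appending to /etc/sudoers directly is a drop-in file under /etc/sudoers.d that visudo validates. The sketch below assumes the stock #includedir /etc/sudoers.d line is still present in /etc/sudoers; sudoers_rule is a hypothetical helper and the drop-in file name is only an example.

```shell
# Hypothetical helper: print a passwordless-sudo rule for the given user.
sudoers_rule() {
  printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1"
}

# Example (as root): write the rule to a drop-in file, restrict its mode,
# and let visudo check the syntax before relying on it.
# sudoers_rule dolphin > /etc/sudoers.d/dolphin
# chmod 440 /etc/sudoers.d/dolphin
# visudo -cf /etc/sudoers.d/dolphin
```

Keeping the rule in its own file makes it easy to remove later, and the visudo check reduces the risk of locking sudo with a syntax error.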

6. Select the cdh02.fayson.com node as the master node, and configure passwordless SSH login from node 02 to itself and to nodes 03 and 04 (the dolphin user is used here).

Execute the following commands on all nodes of the cluster to generate key files for the dolphin user:

su - dolphin
ssh-keygen -t rsa

Append the public key in the dolphin user's /home/dolphin/.ssh directory to the authorized_keys file, and set the file permission to 600:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ll ~/.ssh/authorized_keys 

Test whether passwordless login on node 02 itself works:

ssh localhost

Copy the authorized_keys file generated on node 02 to the /home/dolphin/.ssh directory of the other nodes:

scp ~/.ssh/authorized_keys dolphin@cdh03.fayson.com:/home/dolphin/.ssh/
scp ~/.ssh/authorized_keys dolphin@cdh04.fayson.com:/home/dolphin/.ssh/

After completing the above, verify that passwordless login from node 02 to the other nodes works. If no password is requested, the configuration is successful.
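
The verification described above can be scripted. This is a minimal sketch, assuming it is run as the dolphin user on node 02; check_passwordless_ssh and the SSH_CMD override are hypothetical names introduced for illustration. BatchMode=yes makes ssh fail instead of prompting for a password, so a missing key shows up as FAIL rather than as a hanging prompt.

```shell
# Hypothetical helper: try a key-based login to every host given as argument.
# SSH_CMD can be overridden (for example in a dry run); it defaults to ssh.
check_passwordless_ssh() {
  failed=0
  for host in "$@"; do
    if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
      printf 'OK %s\n' "$host"
    else
      printf 'FAIL %s\n' "$host"
      failed=1
    fi
  done
  return "$failed"
}

# Example, using the host names from this document:
# check_passwordless_ssh localhost cdh03.fayson.com cdh04.fayson.com
```

The function returns a non-zero status if any host fails, so it can gate later steps in a script.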

7. Extract the DolphinScheduler installation package uploaded to node 02 into the /home/dolphin directory

cd /home/dolphin
tar -zxvf apache-dolphinscheduler-1.3.8-bin.tar.gz
chown -R dolphin. apache-dolphinscheduler-1.3.8-bin

4. Install the MySQL service

1. Download the MySQL 5.7.32 installation package from the official MySQL website and upload it to the root directory of cdh02.fayson.com

2. Execute the following commands on the node where the MySQL service will be installed to add the mysql user

useradd mysql
id mysql

3. Extract the MySQL installation package and move it to the /var/lib directory

tar -zxvf mysql-5.7.32-el7-x86_64.tar.gz
mv mysql-5.7.32-el7-x86_64 /var/lib/mysql
chown -R mysql. /var/lib/mysql

4. Create the MySQL data directory

mkdir -p /var/lib/mysql/data
chown mysql. /var/lib/mysql/data/

5. Modify the MySQL configuration file /etc/my.cnf as follows:

# Set the default character set of mysql client

#Set 3306 port
port = 3306
# Set mysql installation directory
# Set the storage directory of mysql database data
# Maximum connections allowed
# The character set used by the server defaults to the 8-bit encoded latin1 character set
# The default storage engine that will be used when creating new tables
explicit_defaults_for_timestamp = 1

#Slow log location 
#Slow log time
#Enable slow log


6. Enter the MySQL installation directory and execute the following command to initialize the MySQL service

cd /var/lib/mysql/
./bin/mysqld --initialize --user=mysql --basedir=/var/lib/mysql/ --datadir=/var/lib/mysql/data/

Note: a random password for the root user is generated during initialization. Record it for the later login.

7. Execute the following commands to install the MySQL startup script as a system service

cp /var/lib/mysql/support-files/mysql.server /etc/init.d/mysqld
chmod +x /etc/init.d/mysqld

8. Execute the following commands to enable the mysqld service at boot and start the MySQL service

systemctl enable mysqld
systemctl start mysqld
systemctl status mysqld

9. Log in to the MySQL service as the root user to create the dolphin database

./bin/mysql -uroot -p

Execute the following SQL statement to modify the root user password

set password=password('!@!Q123456');
flush privileges;

5. Initialize the DolphinScheduler database

1. Since MySQL is used as the database, copy the MySQL JDBC driver to the extracted /home/dolphin/apache-dolphinscheduler-1.3.8-bin/lib/ directory before initialization

sudo chown dolphin. mysql-connector-java-5.1.49-bin.jar 
mv mysql-connector-java-5.1.49-bin.jar ./apache-dolphinscheduler-1.3.8-bin/lib/

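2. Log in to MySQL as root and execute the following SQL statements to create the DS database and grant privileges on it

-- utf8 is assumed here; adjust the character set and collation as needed
CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphins'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphins'@'localhost' IDENTIFIED BY 'password';
flush privileges;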

3. Modify the ./apache-dolphinscheduler-1.3.8-bin/conf/datasource.properties configuration file to set the database access information for DolphinScheduler


Note: the configuration file defaults to PostgreSQL and must be changed to the MySQL configuration.
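
For reference, the MySQL settings might look like the output of the sketch below. It is an illustration under assumptions, not the exact file used here: the property keys follow the spring.datasource.* entries found in the 1.3.x datasource.properties, the URL parameters are assumed, the host argument is a placeholder, and mysql_datasource_properties is a hypothetical helper.

```shell
# Hypothetical helper: print MySQL settings for conf/datasource.properties.
# Arguments: MySQL host, database user, database password.
# The driver class matches the mysql-connector-java 5.1.x jar copied earlier;
# the URL query parameters are an assumption.
mysql_datasource_properties() {
  cat <<EOF
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://$1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&allowMultiQueries=true
spring.datasource.username=$2
spring.datasource.password=$3
EOF
}

# Example, with the user and password from the GRANT statements:
# mysql_datasource_properties <mysql-host> dolphins password
```

The four printed lines would replace the PostgreSQL entries in the file; compare them against the comments in your own copy of datasource.properties before using them.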

4. After completing the above configuration modification, execute the following command to initialize the database

sh apache-dolphinscheduler-1.3.8-bin/script/create-dolphinscheduler.sh

Note: the above log indicates that the database initialization is successful.

6. DolphinScheduler cluster deployment

After completing the preparatory work for DS cluster deployment, you need to modify the cluster configuration and distribute the installation package to all nodes of the cluster.

6.1 Keytab file preparation

Because DS connects to a secure CDH cluster, it needs a keytab file. Prepare the keytab file and copy it to the same directory on all nodes of the cluster. The keytab file here is mainly used to access the HDFS service, whose storage DS uses for script files, data files and other resources.

1. Add an hdfs principal to the KDC service

[root@cdh01 ~]# kadmin.local 
kadmin.local:  addprinc hdfs@FAYSON.COM

2. Execute the following command in kadmin.local to generate a keytab file for hdfs (by default the keytab file is written to the current directory)

xst -norandkey -k hdfs.keytab hdfs

3. Execute the following command to test whether the generated keytab file is available

kinit -kt hdfs.keytab hdfs
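
The keytab test can be wrapped in a small function so that it can be repeated on each node once the file has been copied there. A minimal sketch, assuming the MIT Kerberos client tools klist and kinit are installed; verify_keytab is a hypothetical helper name.

```shell
# Hypothetical helper: check that a keytab is readable, list its entries,
# obtain a ticket with it, and show the resulting ticket cache.
# Arguments: keytab path, principal name.
verify_keytab() {
  keytab="$1"
  principal="$2"
  if [ ! -r "$keytab" ]; then
    echo "keytab not readable: $keytab" >&2
    return 1
  fi
  klist -kt "$keytab" && kinit -kt "$keytab" "$principal" && klist
}

# Example:
# verify_keytab hdfs.keytab hdfs
```

Checking readability first gives a clearer error than kinit does when the file is missing or owned by another user.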

4. Copy the generated keytab file to the /opt/keytabs directory on all nodes of the cluster, and change the owner of the directory to the dolphin user

sudo mkdir /opt/keytabs
sudo chown -R dolphin. /opt/keytabs/

Note: ensure that the keytabs directory is consistent on all nodes of the cluster, and that the directory owner is consistent with the deployment user.

6.2 Modifying configuration files

Since the installation package is placed on node 02, all configuration changes are made on node 02. The main configuration parameters to change are described below.

1. Modify the ./conf/env/dolphinscheduler_env.sh environment variable script. Adjust the environment variables to your actual situation. This document keeps only the Java-related environment variables and comments out the big-data-related ones

export JAVA_HOME=/usr/java/jdk1.8.0_232-cloudera
export PATH=$JAVA_HOME/bin:$PATH

The Spark and Flink tasks provided by DolphinScheduler are not used for now, so the big-data-related environment variables are not configured.

2. Modify the one-click deployment configuration file ./conf/config/install_config.conf, changing each setting according to the comments in the file

Database related configuration parameters:

# postgresql or mysql

Configure Zookeeper information, deployment path and deployment user parameters:


User local data directory, resource storage type and storage path parameters:


CDH cluster Yarn configuration parameters:


Kerberos related configuration parameters:


Role assignment parameters (in a production environment, high availability of each role needs to be considered, so adjust this configuration accordingly):

# api server port

3. Since HA is enabled for the HDFS service, copy the CDH cluster's HDFS client configuration files core-site.xml and hdfs-site.xml to the /home/dolphin/apache-dolphinscheduler-1.3.8-bin/conf directory

cp /etc/hadoop/conf/core-site.xml /home/dolphin/apache-dolphinscheduler-1.3.8-bin/conf
cp /etc/hadoop/conf/hdfs-site.xml /home/dolphin/apache-dolphinscheduler-1.3.8-bin/conf
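
With HA enabled, fs.defaultFS in core-site.xml normally names a logical nameservice rather than a single NameNode host, and the NameNodes behind that nameservice are defined in hdfs-site.xml, which is why both files are needed. The sketch below reads one property from a copied file to confirm this. get_hadoop_property is a hypothetical helper and assumes the simple layout in which each <name> and <value> element sits on a single line.

```shell
# Hypothetical helper: print the value of one property from a Hadoop XML
# configuration file. Arguments: file path, property name.
# Assumes <name>...</name> and <value>...</value> each fit on one line.
get_hadoop_property() {
  awk -v key="$2" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
    /<value>/ { if (n == key) { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v); print v; exit } }
  ' "$1"
}

# Example:
# get_hadoop_property /home/dolphin/apache-dolphinscheduler-1.3.8-bin/conf/core-site.xml fs.defaultFS
```

If the printed value is a logical name with no port, the client can only resolve it through the nameservice definitions in hdfs-site.xml.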

6.3 Distribute the installation package and start

1. Create the /opt/dolphinscheduler directory (i.e. the deployment directory) on all nodes of the cluster and make the dolphin user its owner

mkdir /opt/dolphinscheduler
chown dolphin. /opt/dolphinscheduler
ll /opt/

2. Create the DolphinScheduler user data directory on all nodes of the cluster (consistent with the path configured in the configuration file above)

mkdir -p /data3/dolphinscheduler
chown dolphin. /data3/dolphinscheduler
ll /data3/
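
Before running the installer it may be worth confirming, as the dolphin user, that both directories exist and are owned and writable by that user. A minimal sketch; check_deploy_dir is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if the path is a directory that is
# owned by the current user and writable by it.
check_deploy_dir() {
  [ -d "$1" ] && [ -O "$1" ] && [ -w "$1" ]
}

# Example, run as the dolphin user on every node:
# for d in /opt/dolphinscheduler /data3/dolphinscheduler; do
#   check_deploy_dir "$d" && echo "OK $d" || echo "FAIL $d"
# done
```

A FAIL here points to the ownership problems that otherwise surface later as failed file distribution or failed uploads in the resource center.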

3. In the /home/dolphin/apache-dolphinscheduler-1.3.8-bin directory on node 02, execute the following command to install

sh install.sh

Wait for the script to run successfully, and the log indicates that the service has been started

7. DolphinScheduler function verification

1. After the service is started successfully, enter in the browser

2. Enter admin / dolphinscheduler123 to log in

3. Create a fayson tenant (the tenant name corresponds to a Linux user on the cluster. If the user does not exist, DS creates it automatically)

4. Add an ordinary user, testa, and bind the user to the fayson tenant. This user is mainly used to log in to the DS front-end pages

5. Log in to the DS platform as the testa user

6. Enter the resource center, create a folder or upload a file, and test whether the storage function based on HDFS is normal

Directories and files are created normally. Confirm from the command line that they exist in HDFS

7. Create a test project

8. Enter the newly created test project and create a workflow

9. Create a test shell task that performs a simple kinit authentication

10. After saving, bring the workflow online and run the test

11. The task runs successfully

8. Summary

1. When DolphinScheduler uses HDFS as the resource center storage, the configuration parameters related to the CDH cluster must be set. If HA is enabled for HDFS, the core-site.xml and hdfs-site.xml configuration files must also be copied to the conf directory of the DolphinScheduler deployment directory.

2. When the one-click deployment script is run as a non-root user, the deployment directory must already exist on all nodes of the cluster and be owned by the user running the script. Otherwise the deployment files cannot be copied to all nodes during deployment.

3. The user's local data directory must be configured when deploying DolphinScheduler. The directory has to be created manually, and its owner must be the same user that deploys and starts the services. Otherwise the upload and other functions of the resource center fail.

4. Using the DolphinScheduler interface requires tenant information, and tenants on the DS platform correspond to Linux users on the cluster. When the user does not exist, DS creates it automatically.

5. After a user is bound to a tenant, all jobs submitted by that user run via sudo -u ${tenant}. When local resource files are used, the tenant must therefore have the corresponding access rights, otherwise the job fails.

6. When Hive, MapReduce and other cluster jobs are run through shell scripts or other methods, the Yarn configuration must be correct, otherwise the job fails to run (for jobs running on Yarn, DS determines whether the task succeeded from the job's application ID).

Posted on Tue, 02 Nov 2021 03:22:05 -0400 by daboymac