7. Hadoop 3.3.1 HA (High Availability) Cluster with QJM (NameNode High Availability + Yarn High Availability, based on Zookeeper)


Setup of a Hadoop 3.3.1 HA (high availability) cluster

(NameNode High Availability + Yarn High Availability based on Zookeeper)

NameNode HA with QJM

NameNode HA can be built with the Quorum Journal Manager (QJM) or with conventional shared storage; this guide uses QJM.

Hadoop HA Mode Setup (High Availability)

1. Cluster Planning

There are three virtual machines: master, worker1, and worker2.

There are three NameNodes; the ResourceManagers run on worker1 and worker2.

                  master    worker1   worker2
NameNode          yes       yes       yes
DataNode          no        yes       yes
JournalNode       yes       yes       yes
NodeManager       no        yes       yes
ResourceManager   no        yes       yes
Zookeeper         yes       yes       yes
ZKFC              yes       yes       yes

Because the virtual machines were not recreated but modified from the earlier setup, the hostnames are still hadoop1, hadoop2, and hadoop3:

hadoop1 = master

hadoop2 = worker1

hadoop3 = worker2

2. Zookeeper Cluster Setup

Reference resources: IV. Zookeeper3.7 Installation
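
For reference, the ensemble portion of zoo.cfg for these three hosts typically looks like the sketch below; the dataDir path is an assumption (use whatever your own Zookeeper installation uses), and each node also needs a matching myid file (1, 2, 3) inside dataDir:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/export/servers/data/zookeeper    # assumed path
clientPort=2181
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888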

3. Modify the Hadoop Cluster Configuration Files

Modify core-site.xml

vim core-site.xml

core-site.xml:

<configuration>
<!-- HDFS entry point. mycluster is just the logical name of the cluster and can be changed as you like, but it must
match the dfs.nameservices value in hdfs-site.xml -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>

<!-- By default hadoop.tmp.dir points to /tmp, which would leave all namenode and datanode data in a volatile directory, so it is changed here -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/export/servers/data/hadoop/tmp</value>
    </property>

<!-- Static user for the web UI; without this setting the web page reports errors -->
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>root</value>
    </property>

<!-- Zookeeper cluster address; a single node can be configured here, or a comma-separated list for a cluster -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
    </property>
    <!-- Timeout for Hadoop connections to Zookeeper -->
    <property>
        <name>ha.zookeeper.session-timeout.ms</name>
        <value>1000</value>
        <description>ms</description>
    </property>
</configuration>

Replace hadoop1, hadoop2, and hadoop3 in the Zookeeper address above with the hostnames of your own machines (you need to configure the hostname-to-IP mapping first) or with their IP addresses.
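
For example, /etc/hosts on every node might contain entries like these (the IP addresses below are placeholders; substitute your own):

192.168.1.101   hadoop1
192.168.1.102   hadoop2
192.168.1.103   hadoop3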

Modify hadoop-env.sh

vim hadoop-env.sh

hadoop-env.sh

When the cluster management scripts start daemons over ssh, the environment variables set in /etc/profile are not read by the non-interactive ssh session, so the java command would not be found. You therefore need to set the absolute JDK path explicitly in this configuration file (if the JDK path differs between nodes, set JAVA_HOME in each node's own hadoop-env.sh).
Hadoop 3.x also enforces strict role permissions and, unlike Hadoop 2.x, requires you to specify the user that each role runs as.
The settings below only cover the HDFS cluster. If YARN is involved, you should also modify the corresponding yarn-env.sh (a sketch follows the exports below).
Add the following at the end of the script:

export JAVA_HOME=/opt/jdk1.8.0_241
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_ZKFC_USER="root"
export HDFS_JOURNALNODE_USER="root"
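
As a sketch of the yarn-env.sh counterpart mentioned above (assuming the YARN daemons also run as root), the Hadoop 3.x user variables for YARN would be:

export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"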

Modify hdfs-site.xml

vim hdfs-site.xml

hdfs-site.xml

<configuration>

    <!-- Specify number of copies -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>

    <!-- NameNode and DataNode working directories (data storage locations) -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/export/servers/data/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/export/servers/data/hadoop/tmp/dfs/data</value>
    </property>

    <!-- Enable webhdfs -->
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>

    <!-- Logical nameservice of HDFS; must be consistent with fs.defaultFS in core-site.xml.
         dfs.ha.namenodes.[nameservice id] assigns a unique identifier to each NameNode in the nameservice:
         configure a comma-separated list of NameNode IDs so that DataNodes can identify all of the NameNodes.
         For example, with "mycluster" as the nameservice ID, "nn1", "nn2" and "nn3" are the NameNode identifiers.
    -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>

    <!-- The cluster has three NameNodes: nn1, nn2 and nn3 -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2,nn3</value>
    </property>

    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hadoop1:9000</value>
    </property>

    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hadoop1:9870</value>
    </property>

    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hadoop2:9000</value>
    </property>

    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hadoop2:9870</value>
    </property>

    <!-- RPC address of nn3 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn3</name>
        <value>hadoop3:9000</value>
    </property>

    <!-- HTTP address of nn3 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn3</name>
        <value>hadoop3:9870</value>
    </property>

    <!-- Shared storage location for the NameNode edits metadata, i.e. the JournalNode list.
         URL format: qjournal://host1:port1;host2:port2;host3:port3/journalId
         Using the nameservice as the journalId is recommended; the default port is 8485 -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/mycluster</value>
    </property>

    <!-- Local disk location where the JournalNode stores its edits -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/export/servers/data/hadoop/tmp/journaldata</value>
    </property>

    <!-- Enable automatic failover for the NameNode -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <!-- Failover proxy provider used by clients to find the active NameNode -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- Fencing methods; multiple mechanisms are separated by line breaks, one mechanism per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>

    <!-- Passwordless ssh key required by the sshfence mechanism -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>

    <!-- Timeout for the sshfence mechanism -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>

    <property>
        <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
        <value>60000</value>
    </property>
    
    
    <!-- Secondary NameNode address -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop3:9868</value>
    </property>
    
</configuration>

Create the journaldata folder (the dfs.journalnode.edits.dir path configured above) on each node.
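
For example, assuming the path configured above:

mkdir -p /export/servers/data/hadoop/tmp/journaldata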

Modify workers

In Hadoop 2.x this file was called slaves; it lists the host addresses of all DataNodes. Simply fill in every DataNode hostname:

hadoop1
hadoop2
hadoop3

Yarn High Availability

Modify mapred-site.xml

vim mapred-site.xml

<configuration>

        <!-- Run the MapReduce framework on YARN -->
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>

        <!-- MapReduce JobHistory Server address, default port 10020 -->
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>hadoop1:10020</value>
        </property>

        <!-- MapReduce JobHistory Server web UI address, default port 19888 -->
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>hadoop1:19888</value>
        </property>
</configuration>

Modify yarn-site.xml

vim yarn-site.xml

<configuration>
    <!-- Enable RM high availability -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>

    <!-- RM cluster id -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yrc</value>
    </property>

    <!-- Logical IDs of the RMs -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    <!-- Hostname of each RM -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop2</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop3</value>
    </property>

    <!-- Zookeeper cluster address -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
    </property>
<!-- How reducers obtain data (shuffle service) -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
<!-- Enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
<!--Log retention time set to 1 day-->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>86400</value>
    </property>

    <!-- Enable automatic recovery -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>

    <!-- Store ResourceManager state information in the Zookeeper cluster -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>

Once everything is modified, distribute the configuration files to the other cluster nodes (from the hadoop/etc path):
scp /export/servers/hadoop-3.3.1/etc/hadoop/* hadoop2:/export/servers/hadoop-3.3.1/etc/hadoop/

scp /export/servers/hadoop-3.3.1/etc/hadoop/* hadoop3:/export/servers/hadoop-3.3.1/etc/hadoop/

Start zookeeper cluster

Start on each machine:

zkServer.sh start
zkServer.sh status

Format namenode, zkfc

First, start journalnode on all virtual machines:

hdfs --daemon start journalnode

Once they are all started, format the namenode on the master (hadoop1) node:

hdfs namenode -format

The namenode only needs to be formatted once. Because this cluster was previously running as an ordinary fully distributed cluster, the namenode had already been formatted; however, the datanodes and namenodes in a cluster are tied together by the clusterID stored in current/VERSION. Formatting again here, starting the namenode, and then having the other two nodes synchronize from the freshly formatted namenode keeps the IDs from conflicting.

The same applies to formatting ZK (-formatZK).
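
If you want to inspect the clusterID mentioned here, it is recorded in the VERSION files under the name and data directories configured in hdfs-site.xml (paths as configured above; adjust if yours differ). After formatting, the datanode value must match the namenode's, otherwise the datanodes will not join the cluster:

grep clusterID /export/servers/data/hadoop/tmp/dfs/name/current/VERSION    # on a NameNode
grep clusterID /export/servers/data/hadoop/tmp/dfs/data/current/VERSION    # on a DataNode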

Then start the namenode on the master (it runs in the foreground here, so you can watch its log output):

hdfs namenode

Then, on the other two machines, synchronize the formatted namenode:

hdfs namenode -bootstrapStandby

The metadata transfer should be visible in the namenode output on the master.

After the transfer is complete, on the master node, format zkfc:

hdfs zkfc -formatZK
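
Optionally, you can confirm that formatZK created the HA znode in Zookeeper (the parent znode defaults to /hadoop-ha):

zkCli.sh -server hadoop1:2181
ls /hadoop-ha          # should list [mycluster]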

Start hdfs

On the master node, start dfs first:

start-dfs.sh

Then start yarn:

start-yarn.sh

Start the mapreduce task history server:

mapred --daemon start historyserver

You can check which processes each node is running with jps.
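
Based on the role plan in section 1 (note that the workers file above also lists hadoop1, so a DataNode may appear there as well), jps should roughly report the following process names (PIDs omitted; this is an expectation sketch, not captured output):

hadoop1: NameNode, JournalNode, DFSZKFailoverController, QuorumPeerMain, JobHistoryServer
hadoop2: NameNode, DataNode, JournalNode, NodeManager, ResourceManager, DFSZKFailoverController, QuorumPeerMain
hadoop3: NameNode, DataNode, JournalNode, NodeManager, ResourceManager, DFSZKFailoverController, QuorumPeerMain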

Try HA mode

First look at the status of each namenode host:

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
hdfs haadmin -getServiceState nn3

You can see that there are two standbies and one active.

On the node whose NameNode is currently active, kill the NameNode process:
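
For example (the PID is whatever jps reports for the NameNode on that node):

jps                      # find the NameNode PID on the active node
kill -9 <NameNode PID>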

Now check the NameNode states again.

As you can see, nn1 has switched to active; the Hadoop high availability cluster is now basically set up.
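
The two ResourceManagers configured in yarn-site.xml can be checked in the same spirit; killing the active one should likewise cause the standby to take over:

yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2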
