Preliminary note: check which package provides the vim commands, and install the full vim packages if needed:
rpm -qa | grep vim      // see which vim packages are installed, e.g. vim-minimal-7.4.160-4.el7.x86_64
yum install -y vim*     // install the vim-related packages
rpm -qa | grep vim      // check again
HADOOP learning notes
1, Install virtual machine (CentOS)
2, Modify host name (host name of current virtual machine)
1. View the current host name
Command: hostname
2. Modify host name
Command: vi /etc/hostname
(1) Be sure to be in vi's command mode (press the Esc key to leave edit mode)
:wq save and exit    :wq! force save and exit    :q quit    :q! force quit without saving
(2) Press i to enter edit (insert) mode at the current cursor position.
3. Reboot
Command: reboot
Supplement: a second way to modify the host name:
hostnamectl set-hostname <new host name>
bash    (open a new shell so the prompt shows the new name)
3, Modify ip address (ip address of current virtual machine)
Command:
vi /etc/sysconfig/network-scripts/ifcfg-ens33
1. Modify to obtain ip statically
BOOTPROTO="static"
2. Add IP address
IPADDR=192.168.1.100
3. Add gateway
GATEWAY=192.168.1.2
4. Add subnet mask
NETMASK=255.255.255.0
5. Add domain name parser
DNS1=192.168.1.2
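Taken together, the edited ifcfg-ens33 might look like the sketch below. The ONBOOT line is an assumption (not listed in the steps above) so that the interface comes up at boot; installer-generated lines such as TYPE, NAME, DEVICE and UUID stay as they are.
```
# /etc/sysconfig/network-scripts/ifcfg-ens33 (sketch)
BOOTPROTO="static"
ONBOOT="yes"
IPADDR=192.168.1.100
GATEWAY=192.168.1.2
NETMASK=255.255.255.0
DNS1=192.168.1.2
```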
4, Modify the mapping between ip address and host name
command: vi /etc/hosts
Add the corresponding ip and the corresponding host name
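For example, using the host names and addresses that appear later in these notes (the .101 and .102 addresses match the ZooKeeper configuration further down; adjust them to your own machines):
```
192.168.1.100 hadoop100
192.168.1.101 hadoop101
192.168.1.102 hadoop102
```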
1, Modify the network configuration of the virtual machine
2, Modify the network configuration of windows
3, Toggling the virtual machine's firewall (if ping already works, there is no need to turn it off; if ping fails, turn it off)
1. View firewall status
systemctl status firewalld
2. Turn off the firewall
systemctl stop firewalld
3. Prevent the firewall from starting on boot
systemctl disable firewalld
5, View the current ip address of the virtual machine
1. ifconfig -a
2. ip addr
From Windows, ping the virtual machine's IP address; if the ping gets through, the configuration succeeded.
Shutdown command: shutdown -h now
6, Open moba to create a new connection service
The following page appears
(1) If the earlier ping to 192.168.1.100 failed, the connection cannot be opened
(2) If pinging 192.168.1.128 succeeds instead, the new session must connect to 192.168.1.128
(3) If the virtual machine is not started, the same symptoms also appear; press R to refresh
We need to create two folders under /opt:
(1) software: holds the compressed software packages
Command to create the software folder: mkdir software
(2) module: holds the unpacked software directories (created the same way: mkdir module)
(1) Switch to the software folder
cd /opt/software
(2) Unzip the jdk into the module folder
command
tar -zxvf jdk-8u212-linux-x64.tar.gz -C /opt/module/
(4) Configure the JDK environment variables (be careful here)
1. Enter vi /etc/profile
2. Shift+g jumps to the last line; append the JAVA_HOME settings there (see the sketch after this list)
3. Press Esc, then :wq
4. Enter: source /etc/profile
5. Enter: java -version; if the version information appears,
it worked.
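The lines to append are not shown in the notes; a minimal sketch, assuming the JDK was unpacked to /opt/module/jdk1.8.0_212 (the path used elsewhere in these notes):
```
# appended at the end of /etc/profile
export JAVA_HOME=/opt/module/jdk1.8.0_212
export PATH=$PATH:$JAVA_HOME/bin
```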
(5) Configure the Hadoop environment variables
(1) Enter vi /etc/profile
(2) Shift+g jumps to the last line; append the HADOOP_HOME settings there (see the sketch after this list)
(3) Press Esc, then :wq
(4) Enter: source /etc/profile
(5) Enter: hadoop version
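Again the appended lines are not spelled out; a minimal sketch, assuming Hadoop was unpacked to /opt/module/hadoop-3.1.3 as in the rest of these notes:
```
# appended at the end of /etc/profile
export HADOOP_HOME=/opt/module/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```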
1, Local deployment of Hadoop
Goal 1: count the number of occurrences of each word
-
First, there must be a file containing some text to count.
1. Create a test directory under /opt
2. Create an input directory and an output directory under the /opt/test directory
3. Create a file with some content in the /opt/test/input directory (that is, edit text into a file)
The commands for these three steps are sketched below.
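The Command lines above were left blank in the notes; one plausible set of commands, with word.txt as a hypothetical file name:
```
mkdir /opt/test
mkdir /opt/test/input /opt/test/output
vi /opt/test/input/word.txt    # type a few words, then save with :wq
```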
-
Use hadoop to execute this file
Switch to the /opt/module/hadoop-3.1.3/share/hadoop/mapreduce directory and execute the file:
hadoop jar hadoop-mapreduce-examples-3.1.3.jar wordcount /opt/test/input/ /opt/test/output/count.txt
3. View the results after execution
Command: cat /opt/test/output/count.txt/part-r-00000
2, Pseudo distributed deployment of Hadoop
1. Configure the cluster environment
(1) Modify the first configuration
In / opt/module/hadoop-3.1.3/etc/hadoop directory
Set hadoop-env.sh file
vi hadoop-env.sh
In command mode, type / to search for JAVA_HOME, then set:
export JAVA_HOME=/opt/module/jdk1.8.0_212
(2) Modify the second configuration
In / opt/module/hadoop-3.1.3/etc/hadoop directory
Set up the core-site.xml file
vi core-site.xml
(3) Modify the third configuration
In / opt/module/hadoop-3.1.3/etc/hadoop directory
Set hdfs-site.xml file
Command: vi hdfs-site.xml
Specify the number of HDFS replicas in the configuration
```
<configuration>
    <!-- Number of HDFS replicas -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
```
\2. Start the cluster
(1) Format the namenode
Command: hdfs namenode -format
(2) Start the namenode
Command: hdfs --daemon start namenode
HDFS is for storage and YARN is for scheduling.
1. Switch to etc under hadoop (all configuration files are under etc)
2. Configure core-site.xml in hadoop
Command: vi core-site.xml
Specify the address of the HDFS NameNode; place the properties inside the configuration tag:
```
<configuration>
    <!-- Address of the NameNode in HDFS -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop100:9820</value>
    </property>
    <!-- Storage directory for files Hadoop generates at run time -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop-3.1.3/data/tmp</value>
    </property>
</configuration>
```
3. Configure hdfs-site.xml in hadoop
Command: vi hdfs-site.xml
Specify the number of HDFS replicas inside the configuration tag:
```
<configuration>
    <!-- Number of HDFS replicas -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
```
4. Format NameNode (format it at the first startup, and do not always format it later)
Format command: hdfs namenode -format
5. Start namenode
Command: hdfs --daemon start namenode
6. Start datanode
Command: hdfs --daemon start datanode
7. Configure yarn-site.xml
Command: vi yarn-site.xml
```
<configuration>
    <!-- Site specific YARN configuration properties -->
    <!-- How the Reducer gets data -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Address of the YARN ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop100</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>
```
8. Configure mapred-site.xml
Command: [root@hadoop100 hadoop]# vi mapred-site.xml
```
<configuration>
</configuration>
```
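The notes leave this configuration block empty; in the fully distributed setup later on, mapred-site.xml carries the property below so that MapReduce jobs run on YARN, and it most likely belongs here as well (an assumption for this pseudo-distributed walkthrough):
```
<!-- inside <configuration> ... </configuration> of mapred-site.xml -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
```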
9. Start the resourcemanager
Command:[root@hadoop100 hadoop]# yarn --daemon start resourcemanager
10. Start the nodemanager
Command: [root@hadoop100 hadoop]# yarn --daemon start nodemanager
11. Use jps to view the Java processes
Command: jps
12. Create the /user/input directory in HDFS
Command: hdfs dfs -mkdir -p /user/input
13. Upload files to HDFS
Command: hdfs dfs -put <file to upload> <HDFS destination path>
Case: hdfs dfs -put wcinput/wc.input /user/input/
14. Check the file directory of hdfs
Command: hdfs dfs -ls <file path>
Note that the root directory is not the root directory of linux
Case: hdfs dfs -ls /user/input/
15. View the file contents in hdfs
Command: hdfs dfs -cat <file name>
Case: hdfs dfs -cat /user/input/wc.input
16. Executive documents
Command: hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount execution file location output file location
Case: hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /user/input /user/output
17. View the results after execution
Command: hdfs dfs -cat output file path/*
Case:
hdfs dfs -cat /user/output/*
18. Stop the process: hdfs --daemon stop namenode
HDFS maintains an abstract directory tree.
3, Fully distributed deployment of Hadoop
1. Namenode: stores the metadata of the files.
2. Datanode: stores the file block data, and the checksums of those blocks, in the local file system.
3. Secondary Namenode: periodically backs up the Namenode's metadata.
| | hadoop100 | hadoop101 | hadoop102 |
|---|---|---|---|
| HDFS | Namenode, Datanode | Datanode | Secondary Namenode |
| YARN | NodeManager | ResourceManager, NodeManager | NodeManager |
Start:
1) Start the HDFS-related daemons:
hdfs --daemon start namenode
hdfs --daemon start datanode
2) Start the YARN-related daemons:
yarn --daemon start resourcemanager
yarn --daemon start nodemanager
YARN architecture
1) Main roles of the ResourceManager (RM):
(1) Processing client requests
(2) Starting and monitoring the ApplicationMaster
(3) Resource allocation and scheduling
1. Cluster configuration
Core profile
Configuration: hadoop-env.sh (in / opt/module/hadoop-3.1.3/etc/hadoop directory)
Get the installation path of JDK in Linux system:
[soft863@hadoop100 ~]# echo $JAVA_HOME
/opt/module/jdk1.8.0_212
Modify the JAVA_HOME path in the hadoop-env.sh file (use / in command mode to find it), then add:
export JAVA_HOME=/opt/module/jdk1.8.0_212
1. Configure core-site.xml namenode
cd $HADOOP_HOME/etc/hadoop
vim core-site.xml
The contents of the file are as follows:
```
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop1000:9820</value>
    </property>
    <!-- hadoop.data.dir is a custom variable, used by the configuration files below -->
    <property>
        <name>hadoop.data.dir</name>
        <value>/opt/module/hadoop-3.1.3/data</value>
    </property>
</configuration>
```
2.HDFS configuration file datanode
Configure hdfs-site.xml:
vim hdfs-site.xml
The contents of the file are as follows:
```
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- namenode data storage location -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file://${hadoop.data.dir}/name</value>
    </property>
    <!-- datanode data storage location -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file://${hadoop.data.dir}/data</value>
    </property>
    <!-- secondary namenode data storage location -->
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>file://${hadoop.data.dir}/namesecondary</value>
    </property>
    <!-- datanode restart timeout of 30 s; resolves compatibility issues, can be skipped -->
    <property>
        <name>dfs.client.datanode-restart.timeout</name>
        <value>30</value>
    </property>
    <!-- address for web access to the namenode -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop1000:9870</value>
    </property>
    <!-- address for web access to the secondary namenode -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop1002:9868</value>
    </property>
</configuration>
```
3.YARN configuration file
Configure yarn-site.xml:
vim yarn-site.xml
The contents of the file are as follows:
```
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop101</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>
```
4.MapReduce configuration file
Configure mapred-site.xml:
vim mapred-site.xml
The contents of the file are as follows:
```
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```
2. Cluster distribution
scp -r (recursive, full copy)
rsync -av (archive + verbose, differential copy)
Copy the /etc/hadoop/ directory to hadoop1001:
[root@hadoop1000 opt]# cd /opt
[root@hadoop1000 opt]# scp -r hadoop/ root@hadoop1001:/opt/module/hadoop-3.1.3/etc/
Copy the /etc/hadoop/ directory to hadoop1002:
[root@hadoop1000 opt]# scp -r hadoop/ root@hadoop1002:/opt/module/hadoop-3.1.3/etc/
Copy /etc/profile to hadoop100 and hadoop101:
[root@hadoop102 opt]# rsync -av /etc/profile hadoop101:/etc
[root@hadoop102 opt]# rsync -av /etc/profile hadoop100:/etc
On hadoop100 and hadoop101, run source /etc/profile separately:
[root@hadoop100 opt]# source /etc/profile
[root@hadoop101 opt]# source /etc/profile
3. Distributed cluster formatting
The distributed cluster should be formatted before starting for the first time
Before formatting, delete the data directory and logs directory under the hadoop installation directory on the three servers
[root@hadoop1001 opt]# cd /opt/module/hadoop-3.1.3
[root@hadoop1001 opt]# rm -rf data
[root@hadoop1001 opt]# rm -rf logs
Perform formatting on the server on which the specified namenode is running:
(the namenode is specified to run on hadoop1000)
[root@hadoop1000 hadoop-3.1.3]# hdfs namenode -format
ssh password free login
1. Generate public and private keys at each node and copy them
Hadoop1000:
Generate public and private keys
[root@hadoop100] ssh-keygen -t rsa
Then press Enter three times.
Copy the public key to the target machines for password-free login:
[root@hadoop1000] ssh-copy-id hadoop1000
[root@hadoop1000] ssh-copy-id hadoop1001
[root@hadoop1000] ssh-copy-id hadoop1002
Hadoop101:
Generate the public and private keys:
[root@hadoop1001] ssh-keygen -t rsa
Then press Enter three times.
Copy the public key to the target machines for password-free login:
[root@hadoop1001] ssh-copy-id hadoop1000
[root@hadoop1001] ssh-copy-id hadoop1001
[root@hadoop1001] ssh-copy-id hadoop1002
Hadoop102:
Generate the public and private keys:
[root@hadoop1002] ssh-keygen -t rsa
Then press Enter three times.
Copy the public key to the target machines for password-free login:
[root@hadoop1002] ssh-copy-id hadoop1000
[root@hadoop1002] ssh-copy-id hadoop1001
[root@hadoop1002] ssh-copy-id hadoop1002
Start the cluster with a script
1. Modify hadoop configuration file
Add a few lines of data at the top of the start-dfs.sh and stop-dfs.sh files on Hadoop 100
[root@hadoop100] cd /opt/module/hadoop-3.1.3/sbin
[root@hadoop100] vi start-dfs.sh
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
[root@hadoop100] vi stop-dfs.sh
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Add a few lines of data at the top of the start-yarn.sh and stop-yarn.sh files
[root@hadoop100] vi start-yarn.sh
[root@hadoop100] vi stop-yarn.sh
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
Modify workers on Hadoop 100:
[root@hadoop100] cd /opt/module/hadoop-3.1.3/etc/hadoop
[root@hadoop100] vi workers
hadoop100
hadoop101
hadoop102
Synchronize the above changes to Hadoop 101 and Hadoop 102:
[root@hadoop100] rsync -av /opt/module/hadoop-3.1.3/sbin/ hadoop101:/opt/module/hadoop-3.1.3/sbin/
[root@hadoop100] rsync -av /opt/module/hadoop-3.1.3/sbin/ hadoop102:/opt/module/hadoop-3.1.3/sbin/
[root@hadoop100] rsync -av /opt/module/hadoop-3.1.3/etc/hadoop/ hadoop101:/opt/module/hadoop-3.1.3/etc/hadoop/
[root@hadoop100] rsync -av /opt/module/hadoop-3.1.3/etc/hadoop/ hadoop102:/opt/module/hadoop-3.1.3/etc/hadoop/
Start stop cluster
Start the cluster:
If hadoop related programs have been started on the cluster, you can stop them first.
Execute the following script on Hadoop 100 to start hdfs:
[root@hadoop100] start-dfs.sh
Execute the following script on Hadoop 101 to start yarn:
[root@hadoop101] start-yarn.sh
Stop cluster:
Execute the following script on Hadoop 100 to stop hdfs:
[root@hadoop100] stop-dfs.sh
Execute the following script on Hadoop 101 to stop yarn:
[root@hadoop101] stop-yarn.sh
Hive's erection and installation
Mysql installation
1, Download mysql (also available in the DingTalk group)
https://dev.mysql.com/downloads/mysql/5.7.html#downloads
2, Upload it to / opt/software under linux
3, Check whether Mysql has been installed in the current system
rpm -qa | grep mariadb
mariadb-libs-5.5.56-2.el7.x86_64    // if this appears, uninstall it with the following command
4, rpm -e --nodeps mariadb-libs    // uninstall mariadb with this command
5, Unzip it to / opt/module
The command is: tar -xf <file to unpack> -C <destination directory>
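As a concrete sketch (the bundle file name is not given in the notes; the name below is inferred from the 5.7.36 rpm names used in the next step):
```
cd /opt/software
tar -xf mysql-5.7.36-1.el7.x86_64.rpm-bundle.tar -C /opt/module
```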
6, Install the corresponding rpm file
If a problem is reported:
1. Get plug-in: yum install -y libaio
2. Execute commands (in order):
sudo rpm -ivh --nodeps mysql-community-common-5.7.36-1.el7.x86_64.rpm
sudo rpm -ivh --nodeps mysql-community-libs-5.7.36-1.el7.x86_64.rpm
sudo rpm -ivh --nodeps mysql-community-libs-compat-5.7.36-1.el7.x86_64.rpm
sudo rpm -ivh --nodeps mysql-community-client-5.7.36-1.el7.x86_64.rpm
sudo rpm -ivh --nodeps mysql-community-server-5.7.36-1.el7.x86_64.rpm
7, Switch to / etc
8, cat my.cnf
9, Switch to / var/lib/mysql and delete all files rm -rf*
10, Reset mysql: mysqld --initialize --user=mysql
11, View the generated random password: cat /var/log/mysqld.log
12, Start the MySQL service
systemctl start mysqld
Log in to MySQL
mysql -uroot -p
Enter password: enter the temporarily generated password
Login succeeded
The password of the root user must be modified first, otherwise an error will be reported when performing other operations
mysql> set password = password("New password");
13. Modify the root user in the user table of the mysql database to allow connections from any IP
mysql> update mysql.user set host='%' where user='root';
mysql> flush privileges;
Hive installation
1. Download the installation package: apache-hive-3.1.2-bin.tar.gz
Upload to linux system / opt/software / path
2. Decompression software
cd /opt/software/
tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/module/
3. Modify system environment variables
vim /etc/profile
Add content:
export HIVE_HOME=/opt/module/apache-hive-3.1.2-bin
export PATH=$PATH:$HIVE_HOME/sbin:$HIVE_HOME/bin
Restart environment configuration:
source /etc/profile
4. Modify hive environment variable
cd /opt/module/apache-hive-3.1.2-bin/bin/
Edit the hive-config.sh file
vi hive-config.sh
New content:
export JAVA_HOME=/opt/module/jdk1.8.0_212
export HIVE_HOME=/opt/module/apache-hive-3.1.2-bin
export HADOOP_HOME=/opt/module/hadoop-3.2.0
export HIVE_CONF_DIR=/opt/module/apache-hive-3.1.2-bin/conf
5. Copy hive profile
cd /opt/module/apache-hive-3.1.2-bin/conf/
cp hive-default.xml.template hive-site.xml
6. Modify Hive configuration file and find the corresponding location for modification
```
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.1.100:3306/hive?useUnicode=true&amp;characterEncoding=utf8&amp;useSSL=false&amp;serverTimezone=GMT</value>
    <description>JDBC connect string for a JDBC metastore. To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL. For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.</description>
</property>
<property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
    <description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once. To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.</description>
</property>
<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>Enforce metastore schema version consistency. True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures proper metastore schema migration. (Default) False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.</description>
</property>
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/opt/module/apache-hive-3.1.2-bin/tmp/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
</property>
<property>
    <name>system:java.io.tmpdir</name>
    <value>/opt/module/apache-hive-3.1.2-bin/iotmp</value>
    <description/>
</property>
<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/opt/module/apache-hive-3.1.2-bin/tmp/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
    <name>hive.querylog.location</name>
    <value>/opt/module/apache-hive-3.1.2-bin/tmp/${system:user.name}</value>
    <description>Location of Hive run time structured log file</description>
</property>
<property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/opt/module/apache-hive-3.1.2-bin/tmp/${system:user.name}/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
<property>
    <name>hive.metastore.db.type</name>
    <value>mysql</value>
    <description>Expects one of [derby, oracle, mysql, mssql, postgres]. Type of database used by the metastore. Information schema &amp; JDBCStorageHandler depend on it.</description>
</property>
<property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
    <description>Whether to include the current database in the Hive prompt.</description>
</property>
<property>
    <name>hive.cli.print.header</name>
    <value>true</value>
    <description>Whether to print the names of the columns in query output.</description>
</property>
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/opt/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
</property>
```
7. Upload mysql driver package to / opt/module/apache-hive-3.1.2-bin/lib /
Driver package: mysql-connector-java-8.0.15.zip. Extract the jar package from it
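The notes do not show the commands for this step; one way to do it, assuming the zip was uploaded to /opt/software (the directory layout inside the zip is an assumption):
```
cd /opt/software
unzip mysql-connector-java-8.0.15.zip
cp mysql-connector-java-8.0.15/mysql-connector-java-8.0.15.jar /opt/module/apache-hive-3.1.2-bin/lib/
```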
8. Make sure there is a database named hive in the mysql database
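If the hive database does not exist yet, it can be created from the MySQL client (a sketch; the name must match the database in the ConnectionURL configured above):
```
mysql -uroot -p
mysql> create database hive;
```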
9. Initialize metabase
schematool -dbType mysql -initSchema
10. Make sure Hadoop starts
11. Start hive
hive
12. Check whether the startup is successful
show databases;
ZOOKEEPER!
Premise: turn off the firewall
1. Decompress
cd /opt/module/
tar -zxvf apache-zookeeper-3.5.5-bin.tar.gz
2. Create data files and catalog files
Create two folders data and log in the following directory of zookeeper
cd /opt/module/apache-zookeeper-3.5.5-bin/
mkdir data
mkdir log
3. Copy profile
cd /opt/module/apache-zookeeper-3.5.5-bin/conf/
cp zoo_sample.cfg zoo.cfg
Profile changes
vi zoo.cfg
```
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between sending a request and getting an acknowledgement
syncLimit=5
# The directory where the snapshot is stored. Do not use /tmp for storage, /tmp here is just example sakes.
dataDir=/opt/module/apache-zookeeper-3.5.5-bin/data
dataLogDir=/opt/module/apache-zookeeper-3.5.5-bin/log
# The port at which the clients will connect
clientPort=2181
# The maximum number of client connections. Increase this if you need to handle more clients
#maxClientCnxns=60
# Be sure to read the maintenance section of the administrator guide before turning on autopurge.
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours. Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.0=192.168.1.100:2888:3888
server.1=192.168.1.101:2888:3888
server.2=192.168.1.102:2888:3888
```
4. Create server myid
Create a myid file in the data directory. The value written in the file must correspond to the server.x id configured above for this node.
cd /opt/module/apache-zookeeper-3.5.5-bin/data/
touch myid
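For example, matching the server.0/1/2 entries in zoo.cfg above, the id can be written like this (a sketch; run the corresponding line on the corresponding host):
```
echo 0 > myid    # on 192.168.1.100 (server.0)
echo 1 > myid    # on 192.168.1.101 (server.1)
echo 2 > myid    # on 192.168.1.102 (server.2)
```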
5. Cluster copy
scp -r /opt/module/apache-zookeeper-3.5.5-bin root@hadoop101:/opt/module/apache-zookeeper-3.5.5-bin
scp -r /opt/module/apache-zookeeper-3.5.5-bin root@hadoop102:/opt/module/apache-zookeeper-3.5.5-bin
6. Cluster myid change
Enter each node and modify the myid value
Add cluster system environment variable: vi /etc/profile
export ZOOKEEPER_HOME=/opt/module/apache-zookeeper-3.5.5-bin
export PATH=$PATH:$ZOOKEEPER_HOME/bin
Save the system environment variable: source /etc/profile
Turn off the cluster firewall
7. Cluster startup
Enter each node to start
cd /opt/module/apache-zookeeper-3.5.5-bin
zkServer.sh start
zkServer.sh status
zkCli connection verification
zkCli.sh -server hadoop1001:2181
hbase build!
Supporting component versions:
Hadoop 3.1.3
Zookeeper3.5.7
Hbase2.2.0
1, File decompression
cd /opt/module/
tar -zxvf hbase-2.2.0-bin.tar.gz
2, Modify startup variable
System environment variable increase
vi /etc/profile
export HBASE_HOME=/opt/module/hbase-2.2.0
export PATH=$PATH:$HBASE_HOME/bin
Save the system environment variable: source /etc/profile
Modify hbase variable
cd /opt/module/hbase-2.2.0/conf/
vi hbase-env.sh
Use the search command (/) while editing to find and change the following settings:
export JAVA_HOME=/opt/module/jdk1.8.0_212/
export HBASE_MANAGES_ZK=false
3, Configuration file
Configure hbase-site.xml file
vi hbase-site.xml
```
<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hadoop1000:9820/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hadoop1000</value>
    </property>
    <property>
        <name>hbase.master.info.port</name>
        <value>60010</value>
    </property>
    <property>
        <name>hbase.master.maxclockskew</name>
        <value>180000</value>
        <description>Time difference of regionserver from master</description>
    </property>
    <property>
        <name>hbase.coprocessor.abortonerror</name>
        <value>false</value>
    </property>
    <property>
        <name>hbase.unsafe.stream.capability.enforce</name>
        <value>false</value>
    </property>
</configuration>
```
Note that if you use an external zk, hbase.cluster.distributed needs to be set to true
The regionservers configuration file lists the region server hosts: hadoop1000
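In other words, the conf/regionservers file would hold just that one host name (a sketch for this single-regionserver setup):
```
hadoop1000
```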
4, Start
Start in sequence (HBase only needs to be started on the master node)
Zookeeper,Hadoop,Hbase
Hbase startup mode:
start-hbase.sh
Note: if HRegionServer is still not started, you can try the following statement
bin/hbase-daemon.sh start regionserver
5, Check
Web view: http://hadoop100:60010/master-status
Note: master web does not run by default. You need to configure the port in the configuration file
If Zookeeper cannot be started, check / usr/local/soft/hbase-2.2.0/logs/ log information
Consider deleting all hbase nodes in zk, and then restart to try
6, hbase shell usage
hbase shell
Create a table named myHbase with one column family named myCard, keeping five versions:
create 'myHbase',{NAME => 'myCard',VERSIONS => 5}
View list
All table names and column names need to be enclosed in quotation marks
1. View status:
status
2. View all tables:
list
3. Exit:
quit