Simple Distributed Cluster Setup for HDFS

Preface: This post describes how to set up a simple, fully distributed HDFS cluster. It is called a simple distributed cluster because it is not a highly available HDFS; the next article will describe how to build an HA HDFS cluster. ...
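
A quick way to confirm that such a cluster is actually up is a small smoke test against the HDFS Java API. The sketch below is illustrative only: the NameNode address hdfs://master:9000 and the user name hadoop are assumptions, so substitute the values from your own core-site.xml.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // hdfs://master:9000 and the user "hadoop" are placeholders for this sketch.
        FileSystem fs = FileSystem.get(new URI("hdfs://master:9000"), conf, "hadoop");
        fs.mkdirs(new Path("/tmp/smoke-test"));          // create a test directory
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());        // list the root directory
        }
        fs.close();
    }
}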

Posted on Sat, 20 Jun 2020 21:56:51 -0400 by abhi_10_20

Day 3: HBase API

API calls: In day-to-day work it is more common to perform the same operations as the HBase shell by calling the HBase API. Environment preparation: IDEA + Maven + HBase. <?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.or ...
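
As a taste of what those API calls look like, here is a minimal sketch of a put and a get with the HBase Java client. The ZooKeeper quorum, the table name student, and the column family info are assumptions for illustration, not values taken from the post.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseApiSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Hypothetical ZooKeeper quorum; use the hosts of your own cluster.
        conf.set("hbase.zookeeper.quorum", "node1,node2,node3");
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("student"))) {
            // Write one cell: row 1001, family info, qualifier name.
            Put put = new Put(Bytes.toBytes("1001"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("zhangsan"));
            table.put(put);
            // Read it back.
            Result result = table.get(new Get(Bytes.toBytes("1001")));
            System.out.println(Bytes.toString(
                result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));
        }
    }
}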

Posted on Sat, 20 Jun 2020 03:37:46 -0400 by Ting

Hadoop high availability (HA) installation

High availability Hadoop installation (HA). Foreword. Installation procedure: 1. Distribute the JDK; 1.1 install the JDK 2. Synchronize the time 3. Check the configuration files before installation 4. Set up passwordless SSH from the NN to the other three machines 5. Set up passwordless SSH between the two NNs 6. Modify some c ...
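
Once such an HA pair is running, an HDFS client addresses the logical nameservice rather than a single NameNode. The sketch below shows the typical client-side settings; the nameservice mycluster and the hosts node01/node02 are assumptions for a two-NN layout, not values from the post.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HaClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Logical nameservice instead of a fixed NameNode host (names are placeholders).
        conf.set("fs.defaultFS", "hdfs://mycluster");
        conf.set("dfs.nameservices", "mycluster");
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "node01:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "node02:8020");
        // Client-side proxy that fails over between nn1 and nn2.
        conf.set("dfs.client.failover.proxy.provider.mycluster",
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        try (FileSystem fs = FileSystem.get(conf)) {
            System.out.println(fs.exists(new Path("/"))); // works whichever NN is active
        }
    }
}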

Posted on Mon, 15 Jun 2020 03:30:10 -0400 by ludachris

Building a highly available Hadoop distributed cluster environment on Linux CentOS 7.5

1. Linux environment preparation 1.1 Turn off the firewall (execute on all three virtual machines):
firewall-cmd --state                  # view the firewall status
systemctl start firewalld.service     # start the firewall
systemctl stop firewalld.service      # stop the firewall
systemctl disable firewalld.service   # do not start the firewall at boot
1.2 Configure static ...

Posted on Sun, 14 Jun 2020 05:40:55 -0400 by chris9902

An RPC framework based on Netty in 20 minutes

Netty is a high-performance network transport framework that is widely used as the underlying communication component of RPC frameworks. For example, the Dubbo protocol uses it for inter-node communication, and in Hadoop the Avro component uses it for data file sharing. So let's try to use Netty to impl ...
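
To give a feel for the server side of such a framework, here is a minimal Netty sketch: requests arrive as strings and a handler answers them, which is where a real RPC framework would decode the request and dispatch it to a registered service. The port 8888 and the string codec are assumptions for illustration.

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.string.StringDecoder;
import io.netty.handler.codec.string.StringEncoder;

public class RpcServerSketch {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup boss = new NioEventLoopGroup(1);
        EventLoopGroup worker = new NioEventLoopGroup();
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(boss, worker)
             .channel(NioServerSocketChannel.class)
             .childHandler(new ChannelInitializer<SocketChannel>() {
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     ch.pipeline()
                       .addLast(new StringDecoder())
                       .addLast(new StringEncoder())
                       .addLast(new SimpleChannelInboundHandler<String>() {
                           @Override
                           protected void channelRead0(ChannelHandlerContext ctx, String request) {
                               // A real RPC framework would deserialize the request here
                               // and dispatch it to a registered service implementation.
                               ctx.writeAndFlush("echo: " + request);
                           }
                       });
                 }
             });
            ChannelFuture f = b.bind(8888).sync();   // hypothetical port
            f.channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            worker.shutdownGracefully();
        }
    }
}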

Posted on Fri, 12 Jun 2020 05:13:55 -0400 by florida_guy99

Painlessly setting up a Hadoop cluster and running the WordCount program

Contents: Preliminary preparation; View local network information; View the network connection status; Change the network information; Change the host name; Clone the virtual machine to obtain the slave1 and slave2 nodes; Configure the parameters of slave1 and slave2; Map host names to IPs; Configure passwordless SSH login ...
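
For reference, the WordCount job the post runs on the finished cluster is roughly the classic Hadoop example below; the input and output paths are taken from the command line, and the class names are the usual textbook ones rather than anything specific to this post.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);               // emit (word, 1) for every token
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            context.write(key, new IntWritable(sum));   // total count per word
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory on HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory (must not exist)
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}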

Posted on Sun, 07 Jun 2020 06:55:31 -0400 by jrolands

MapReduce serialization job - sorting

1. Data source: the output of the previous job on simple statistics of mobile phone traffic. 2. Requirement: sort in descending order by the total traffic value and write out the result. 3. General logic: (1) FlowSort class: handles serialization and deserialization and implements the sorting-logic interface; (2) FlowSortMapper class: encapsulates the data; (3 ...
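
The heart of such a job is a bean that implements WritableComparable, so that the shuffle sorts by it. Below is a minimal sketch; the class name FlowBean and the fields upFlow, downFlow, and sumFlow are assumptions based on the post's description, not its actual code.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

// Hypothetical bean used as the map output key so the framework sorts by total flow.
public class FlowBean implements WritableComparable<FlowBean> {
    private long upFlow;
    private long downFlow;
    private long sumFlow;

    public FlowBean() { }                       // required for deserialization

    public void set(long upFlow, long downFlow) {
        this.upFlow = upFlow;
        this.downFlow = downFlow;
        this.sumFlow = upFlow + downFlow;
    }

    @Override
    public void write(DataOutput out) throws IOException {     // serialization
        out.writeLong(upFlow);
        out.writeLong(downFlow);
        out.writeLong(sumFlow);
    }

    @Override
    public void readFields(DataInput in) throws IOException {  // deserialization, same field order
        upFlow = in.readLong();
        downFlow = in.readLong();
        sumFlow = in.readLong();
    }

    @Override
    public int compareTo(FlowBean other) {
        // Descending order by total flow, matching the reverse-sort requirement.
        return Long.compare(other.sumFlow, this.sumFlow);
    }

    @Override
    public String toString() {
        return upFlow + "\t" + downFlow + "\t" + sumFlow;
    }
}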

Posted on Tue, 26 May 2020 10:10:14 -0400 by TwistedLogix

The break-in period of Flink and Hive

Many readers have given feedback that when they follow the previous article, "Hive is finally waiting, Flink", to deploy Flink and integrate it with Hive, they hit some bugs and compatibility problems. The wait may be over, but it is not yet usable. So I have added this article as a companion piece. Review: in the previous article, the author use ...
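
For orientation, registering Hive as a catalog in the Flink Table API looks roughly like the sketch below. This assumes a Flink 1.11-style API (executeSql, Blink planner); the catalog name, default database, and hive-conf directory are placeholders, and version and dependency mismatches are exactly where the compatibility problems discussed in the post tend to appear.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class FlinkHiveSketch {
    public static void main(String[] args) {
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()      // the Blink planner is required for Hive integration
                .inBatchMode()
                .build();
        TableEnvironment tableEnv = TableEnvironment.create(settings);

        // Catalog name, default database, and hive-site.xml directory are placeholders.
        HiveCatalog hive = new HiveCatalog("myhive", "default", "/opt/hive/conf");
        tableEnv.registerCatalog("myhive", hive);
        tableEnv.useCatalog("myhive");

        // Query an existing Hive table (the table name is hypothetical).
        tableEnv.executeSql("SELECT * FROM some_hive_table LIMIT 10").print();
    }
}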

Posted on Mon, 25 May 2020 03:57:37 -0400 by soccerstar_23

Submitting Spark tasks remotely to a YARN cluster

Reference article: How to submit Spark tasks to a YARN cluster remotely from IDEA. Several modes of running Spark tasks: 1. local mode: write the code in IDEA and run it directly. 2. standalone mode: package the program into a jar, upload it to the cluster, and submit it with spark-submit. 3. yarn mode (local, client, cluster): as above, it also requires a jar packa ...
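
One programmatic way to do such a remote submission (as opposed to running spark-submit by hand) is Spark's SparkLauncher API. The sketch below is illustrative, not the post's method: the Spark home, jar path, and main class are placeholders, and the machine running it needs the cluster's Hadoop/YARN configuration available.

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class RemoteYarnSubmitSketch {
    public static void main(String[] args) throws Exception {
        SparkAppHandle handle = new SparkLauncher()
                .setSparkHome("/opt/spark")                 // placeholder path
                .setAppResource("/path/to/your-app.jar")    // placeholder application jar
                .setMainClass("com.example.YourSparkApp")   // placeholder main class
                .setMaster("yarn")
                .setDeployMode("cluster")
                .setConf("spark.executor.memory", "2g")
                .startApplication();                        // submits asynchronously

        // Poll until the application reaches a terminal state.
        while (!handle.getState().isFinal()) {
            Thread.sleep(2000);
        }
        System.out.println("Final state: " + handle.getState());
    }
}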

Posted on Thu, 21 May 2020 20:09:40 -0400 by twilightnights

Using IDEA to submit MapReduce jobs remotely to pseudo-distributed Hadoop

Environment: VirtualBox 6.1, IntelliJ IDEA 2020.1.1, Ubuntu-18.04.4-live-server-amd64, jdk-8u251-linux-x64, hadoop-2.7.7. Install pseudo-distributed Hadoop. For the pseudo-distributed installation, refer to: Hadoop installation tutorial - single-machine / pseudo-distributed configuration, Hadoop 2.6.0 (2.7.1) / Ubuntu 14.04 (16.04). Let's not go over it he ...
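
The crux of submitting from IDEA to a remote (pseudo-distributed) Hadoop is the client-side Configuration. Below is a sketch of the settings that typically matter; the host name hadoop-server and the local jar path are assumptions, and the driver still needs its mapper, reducer, and input/output paths filled in exactly as in a normal WordCount driver.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class RemoteSubmitSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://hadoop-server:9000");        // placeholder NameNode host
        conf.set("mapreduce.framework.name", "yarn");                 // run on YARN, not locally
        conf.set("yarn.resourcemanager.hostname", "hadoop-server");   // placeholder RM host
        conf.set("mapreduce.app-submission.cross-platform", "true");  // IDE OS differs from cluster OS

        Job job = Job.getInstance(conf, "remote wordcount");
        // Build the job jar locally first; the path below is a placeholder.
        job.setJar("target/your-job.jar");
        // ... set mapper, reducer, key/value types, and input/output paths here ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}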

Posted on Thu, 14 May 2020 03:50:06 -0400 by Helios