Hadoop 8-day course, day 3: MapReduce in detail

MapReduce (MR) is well suited to offline processing of text files. MR plus YARN extends simple single-machine logic into distributed program logic for massive data sets. Anyone writing a distributed program faces two problems first: how to distribute the computing logic to other nodes, and how to aggregate the results from all nodes. The overall framework of MR ...
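The two questions in the excerpt (how to distribute logic, how to summarize results) correspond to the map and reduce phases. The following is a minimal single-process sketch in plain Python, not Hadoop code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and each inner list merely stands in for the input split one node would process.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: summarize all values collected for one key."""
    return key, sum(values)


def word_count(splits):
    # Each split stands in for the data a separate node would process (assumption).
    mapped = [pair for split in splits for line in split for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two splits of text lines.
splits = [["hello hadoop", "hello yarn"], ["hadoop mapreduce"]]
counts = word_count(splits)
print(counts)
```

In a real cluster the map calls run on the nodes that hold the data and the shuffle moves intermediate pairs across the network; here everything runs in one process purely to show the data flow.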

Posted on Sun, 05 Apr 2020 06:46:21 -0400 by Rohan Hill

Step by step: implementing a high-performance Vue Tree component that can render large data sets

Reference: http://bjbsair.com/2020-04-01/tech-info/18404.html Background: a project needed to render a tree component with 5000+ nodes, but performance was very poor after introducing the Element Tree component. The lag was obvious whether scrolling, expanding or collapsing nodes, or clicking on a node. R ...

Posted on Sat, 04 Apr 2020 10:01:51 -0400 by j4v1

Pandas Basic Properties and Methods

Series basic attributes: axes returns a list of the row axis labels. dtype returns the data type of the object. empty returns True if the Series is empty. ndim returns the number of dimensions of the underlying data, which is 1 by definition. size returns the number of elements in the underlying data. values returns the Series as ndar ...
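The attributes listed above can be tried on a small Series. This is a minimal sketch assuming pandas is installed; the sample values and index labels are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: three floats with string row labels.
s = pd.Series([1.5, 2.5, 3.5], index=["a", "b", "c"])

print(s.axes)    # a list holding the row index
print(s.dtype)   # data type of the elements
print(s.empty)   # True only when the Series has no elements
print(s.ndim)    # number of dimensions; always 1 for a Series
print(s.size)    # number of elements
print(s.values)  # the underlying data as a NumPy array

# An empty Series for comparison; dtype is given explicitly.
empty_series = pd.Series(dtype="float64")
print(empty_series.empty)
```

All of these are attributes rather than methods, so they are read without parentheses.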

Posted on Sat, 28 Mar 2020 02:19:13 -0400 by jronyagz

Object-oriented programming concepts in Java

1. What is object-oriented programming? Object-Oriented Programming (OOP) is not unique to Java; it is a programming paradigm supported by languages such as Java, C++, and Python. Its essence is the abstract thinking process, and the object-oriented method, embodied in building a model. The model is used to reflect the characteristics of things i ...

Posted on Fri, 27 Mar 2020 07:24:36 -0400 by azylka

In the middle of the night, I used Python to crawl an entire doutu (meme battle) website; challenge me if you dare

I always lose meme battles on QQ and WeChat, so I simply crawled a meme site directly. Now I have every image on the site, so challenge me if you dare. Without further ado: the site I chose is a meme image site. Let's take a brief look at the structure of the site. document info From the above picture, we can see that there are multiple ...

Posted on Wed, 25 Mar 2020 10:22:31 -0400 by activeradio

Java serialization 101: DataOutputStream and PrintStream methods in detail

1. java.io.DataOutputStream: the data byte output stream. 1. It can write an in-memory value such as "int i = 2" to a file on disk, not as a string but as typed binary data. package com.bjpowernode.java_learning; import java.io.*; public class D101_1_DataOutputStream { public static void main(String[] args) throws IO ...
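The distinction between writing an int as binary data and writing it as a string can be illustrated outside Java. The sketch below is a Python analogue, not the article's code: it uses `struct.pack(">i", ...)` to produce four big-endian bytes, which is the layout Java's `DataOutputStream.writeInt` produces. The file name `data.bin` and the variable names are hypothetical.

```python
import os
import struct
import tempfile

value = 2

# Binary form: a 4-byte big-endian signed int.
as_binary = struct.pack(">i", value)

# Text form: the character "2", a single byte in ASCII.
as_text = str(value).encode("ascii")

print(as_binary)  # four raw bytes, not human-readable text
print(as_text)    # the digit character

# Round trip through a file in a temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(as_binary)
with open(path, "rb") as f:
    (restored,) = struct.unpack(">i", f.read(4))

print(restored)
```

Because the binary form carries no type information, the reader has to know that the next four bytes are an int; that is why typed write and read calls come in matching pairs.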

Posted on Mon, 23 Mar 2020 10:51:14 -0400 by kcgame

About Kafka monitoring tools

Summary: Apache Kafka is a fast, scalable, high-throughput, fault-tolerant distributed publish-subscribe messaging system, written in Scala and Java, that delivers messages from one endpoint to another. Compared with traditional message middleware (such as ActiveMQ and RabbitMQ), Kafka has the characteristics of high throug ...

Posted on Fri, 20 Mar 2020 04:04:07 -0400 by jeev

Differences between Hive internal and external tables

Differences between internal and external tables: tables created without the EXTERNAL keyword are managed (internal) tables, and tables created with EXTERNAL are external tables. The data of an internal table is managed by Hive itself, whereas the data of an external table is managed by HDFS. The data storage location of the internal table is hive.metas ...

Posted on Thu, 19 Mar 2020 11:47:10 -0400 by Jyotsna

How to install CDP DC 7.0.3 on RedHat 7.7

Tags (space separated): building big data platform. Contents: 1: Overview of CDP DC 7.0.3; 2: System environment initialization; 3: Build CDP DC 7.0.3. 1: Overview of CDP DC 7.0.3. 1.1 CDP DC 7.0.3: CDP DC 7.0.3 is the first on-premises release integrating all components of CDH and HDP after the merger of Cloudera and Hortonworks. CDP Data Center i ...

Posted on Thu, 19 Mar 2020 07:12:15 -0400 by iceman400

Hadoop high availability installation and configuration

Cluster planning:
host | basic software | running processes
data1 | jdk, zk, hadoop | NameNode, zkfc, zk, journalNode, ResourceManager
data2 | jdk, zk, hadoop | NameNode, zkfc, zk, journalNode, ResourceManager, datanode, NodeManager
data3 | jdk, zk, hadoop | zk, journalNode, datanode, NodeManager
I. SSH password-free login 1. data1: SSH k ...

Posted on Thu, 05 Mar 2020 00:04:15 -0500 by eazyefolife