[HBase] Operating HBase in Java

Prerequisites: Java 1.8, Hadoop 2.7.7, HBase 2.0.5, Gradle 5.3.1. Dependency: compile group: 'org.apache.hbase', name: 'hbase-client', version: '2.0.5'. Source code: https://github.com/Hayaking/HBaseOperator Create a connection: @Logger public class HBaseUtils { static java.util.logging.Logger logger = getLogger( "haya" ); ...

Posted on Sun, 17 Nov 2019 13:20:34 -0500 by not_skeletor

Data service analysis of Spark project

First-time inclusion. Specific steps of the offline data calculation: the first run starts from an original data set. First, create a /people directory on HDFS and copy the CSV file into it. -- Create people_in on HDFS: bin/hadoop dfs -mkdir /people bin/hdfs dfs -mkdir -p /people -- Local people01.cs ...

Posted on Sun, 17 Nov 2019 11:30:57 -0500 by sirmanson

Installation of HBase on CentOS 7 and configuration of Eclipse+Maven

I. Configuration process of Eclipse+Maven 1. Install and configure the JDK: configure the JDK environment on Windows. 2. Install Eclipse. 3. Install Maven: extract the Maven package and put the extracted folder \apache-maven-3.6.0 in the Eclipse directory. Configure the Maven environment variable, add the installation Pat ...

Posted on Wed, 06 Nov 2019 11:59:54 -0500 by moneytree

Installation and configuration of the call-chain monitoring tool Pinpoint

Tags: APM. A first look at the Pinpoint call-chain tool. This article focuses on the architecture, installation, and deployment of Pinpoint. 1. Introduction to Pinpoint: W ...

Posted on Sun, 03 Nov 2019 01:37:11 -0400 by ubersmuck

Flume configuration TailDirSource, FileChannel, HDFSSink and KafkaSink stand-alone test

Contents: Version selection, Technology selection, Stand-alone configuration (ZooKeeper, Kafka, Flume), Startup (Hadoop, ZooKeeper, Kafka, Flume), Test. Version selection (component: version number): Scala: 2.11.x; Hadoop: 2.6.0-cdh5.7.0; Kafka (Apache): 2.11-0.10.2.2; Flume: 1.6.0-cdh5.7.0; ZooKeeper: 3. ...

Posted on Sun, 03 Nov 2019 00:07:28 -0400 by catalin.1975

MapReduce implementation: Tencent big data QQ common-friend recommendation system [people you may know]

An implementation based on the Tencent big data QQ common-friend recommendation system. Test data (the part before the colon is a QQ user; the part after it lists that user's QQ friends): A:B,C,D,F,E,O B:A,C,E,K C:F,A,D,I D:A,E,F,L E:B,C,D,M,L F:A,B,C,D,E,O,M G:A,C,D,E,F H:A,C,D,E,O I:A,O J:B,O K:A,C,D L:D,E,F M:E,F,G O:A,H,I,J It can be implemented in two st ...

Posted on Fri, 01 Nov 2019 23:21:05 -0400 by jinwu
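
The common-friend excerpt above is cut off before it names its two steps, so what follows is only a sketch of one common two-pass approach and may not match the article. It runs in memory rather than as Hadoop jobs; the class name CommonFriends and both method names are hypothetical. Pass 1 inverts each "user:friends" line into "friend -> users who list that friend"; pass 2 emits every pair of users that share a friend.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not taken from the linked article: an in-memory
// sketch of two passes that a pair of MapReduce jobs could perform.
final class CommonFriends {

    // Pass 1: invert "user:friend,friend,..." into
    // "friend -> sorted set of users who list that friend".
    static Map<String, TreeSet<String>> invert(List<String> lines) {
        Map<String, TreeSet<String>> friendToUsers = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(":");
            String user = parts[0].trim();
            for (String friend : parts[1].split(",")) {
                friendToUsers
                    .computeIfAbsent(friend.trim(), k -> new TreeSet<>())
                    .add(user);
            }
        }
        return friendToUsers;
    }

    // Pass 2: for each friend, every pair of users who list that friend
    // has it as a common friend. Keys look like "A-B", smaller name first.
    static Map<String, TreeSet<String>> commonFriends(List<String> lines) {
        Map<String, TreeSet<String>> pairToCommon = new TreeMap<>();
        for (Map.Entry<String, TreeSet<String>> e : invert(lines).entrySet()) {
            List<String> users = new ArrayList<>(e.getValue()); // already sorted
            for (int i = 0; i < users.size(); i++) {
                for (int j = i + 1; j < users.size(); j++) {
                    String pair = users.get(i) + "-" + users.get(j);
                    pairToCommon
                        .computeIfAbsent(pair, k -> new TreeSet<>())
                        .add(e.getKey());
                }
            }
        }
        return pairToCommon;
    }

    public static void main(String[] args) {
        // Test data copied from the excerpt.
        List<String> data = Arrays.asList(
            "A:B,C,D,F,E,O", "B:A,C,E,K", "C:F,A,D,I", "D:A,E,F,L",
            "E:B,C,D,M,L", "F:A,B,C,D,E,O,M", "G:A,C,D,E,F", "H:A,C,D,E,O",
            "I:A,O", "J:B,O", "K:A,C,D", "L:D,E,F", "M:E,F,G", "O:A,H,I,J");
        commonFriends(data).forEach(
            (pair, friends) -> System.out.println(pair + "\t" + friends));
    }
}
```

With the sample data, users A and B share friends C and E, so the entry for A-B contains exactly those two names. In a real MapReduce job each pass would be its own mapper/reducer pair, with the output of the first job feeding the second.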

Java client cannot upload file to HDFS

019-07-01 16:45:24,933 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 58.211.111.42:63048 Call#3 Retry#0 java.io.IOException: File /a1.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excl ...

Posted on Fri, 01 Nov 2019 13:23:10 -0400 by rmbarnes82

Big data case: MapReduce reduce-side table join (data skew)

Code download: https://github.com/tazhigang/big-data-github.git I. Requirement: merge the data from the product information table into the order data table by product ID. II. Data preparation: order.txt: 1001 01 1 1002 02 2 1003 03 3 1001 0 ...

Posted on Thu, 31 Oct 2019 17:03:26 -0400 by mfallon

Reading ODPS table data

Preface: This is my first blog post. At work these days I learn something new almost every day. As an ordinary person with an average memory, I want to use this blog to record the technical points I encounter in my daily work, and share them when I am ...

Posted on Tue, 29 Oct 2019 12:59:38 -0400 by fahrvergnuugen

XVII. Hadoop compression

I. The significance of data compression in Hadoop 1. Basic overview: Compression reduces the number of bytes read from and written to the underlying HDFS. It also reduces the network bandwidth consumed during data transfer and the disk space occupied. In MapReduce, both the shuffle and merge proc ...

Posted on Tue, 29 Oct 2019 00:14:36 -0400 by Shygirl