Scala -- examples of using the Iterable, Seq, Set, and Map collections

1. Iterable 1.1 General: Iterable represents a collection that can be iterated. It extends the Traversable trait and is also the parent trait of the other collections. Most importantly, it declares the method for obtaining an iterator: def iterator: Iterator[A]. This is an abstract method; a concrete implementing class must implement it to reali ...
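
The excerpt above can be sketched with a tiny custom collection. This is a minimal illustration, not code from the original post: the `Fibs` class and its internals are hypothetical, and the point is only that implementing the one abstract method `iterator` unlocks the rest of `Iterable`'s API.

```scala
// Minimal sketch: a custom collection that supplies Iterable's one
// abstract method, `iterator`. Class and names are illustrative.
class Fibs(n: Int) extends Iterable[Int] {
  // Iterator.iterate yields (0,1), (1,1), (1,2), (2,3), ...; we keep
  // the first component of each pair and take the first n values.
  def iterator: Iterator[Int] =
    Iterator.iterate((0, 1)) { case (a, b) => (b, a + b) }.map(_._1).take(n)
}

val fibs = new Fibs(5)
// Inherited Iterable methods now work through the iterator:
println(fibs.toList)  // List(0, 1, 1, 2, 3)
println(fibs.sum)     // 7
```

Everything else (`toList`, `sum`, `map`, `filter`, ...) comes for free from the trait, which is why `iterator` is the method the excerpt singles out.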

Posted on Mon, 06 Dec 2021 00:01:17 -0500 by lorddraco98

Summary of Scala collections

I. Overview: Scala's collections are similar to Java's, except that Scala reimplements them in its own syntax. They are divided into mutable and immutable collections. Common collections: List -- an ordered storage structure whose elements all have the same type ...
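
The mutable/immutable split mentioned above can be shown in a few lines. This is a generic sketch, not code from the summarized post:

```scala
// Immutable List: operations return a new list, the original is untouched.
val imm = List(1, 2, 3)
val grown = 0 :: imm          // new list; `imm` is unchanged

// Mutable counterpart: ListBuffer is modified in place.
import scala.collection.mutable
val buf = mutable.ListBuffer(1, 2, 3)
buf += 4

println(imm)    // List(1, 2, 3)
println(grown)  // List(0, 1, 2, 3)
println(buf)    // ListBuffer(1, 2, 3, 4)
```

The immutable variants live in `scala.collection.immutable` (imported by default); the mutable ones must be pulled in from `scala.collection.mutable`.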

Posted on Sat, 04 Dec 2021 12:32:38 -0500 by bombas79

Flink CDC Series - Build Streaming ETL on MySQL and Postgres

This tutorial will show you how to quickly build streaming ETL for MySQL and Postgres based on Flink CDC. Flink CDC project address: https://github.com/ververica/flink-cdc-connectors. This tutorial's demo is based on a Docker environment and will be done in the Flink SQL CLI, involving only SQL, without a single line of Java/Scala code, or with an ...

Posted on Wed, 01 Dec 2021 23:09:41 -0500 by abhishek

Memory overflow caused by Spark reading Snappy compressed files on HDFS

Some files on HDFS grow every day; they are currently compressed with Snappy. One day, an OOM suddenly occurred. 1. Cause: because Snappy files cannot be split into slices, each file is read by a single task. After the task reads and decompresses the file, the data expands many times; if the number of files is large and your parallelism is very high, this leads to ...
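
One common mitigation for the problem described above is to repartition immediately after reading, so the decompressed records are spread across more tasks instead of each whole file sitting in one task's memory. The sketch below assumes a Spark dependency on the classpath; the path and partition count are illustrative, not from the original post:

```scala
// Hedged sketch (requires Spark on the classpath; path and numbers
// are illustrative). Snappy files are not splittable, so each file
// becomes exactly one read task; repartitioning right after the read
// spreads the decompressed data over many tasks.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("snappy-read").getOrCreate()

val raw = spark.sparkContext.textFile("hdfs:///logs/*.snappy")
val spread = raw.repartition(200)  // shuffle once, then process widely
println(spread.count())
```

The repartition costs one shuffle, but it trades that for bounded per-task memory in every downstream stage.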

Posted on Fri, 19 Nov 2021 01:54:04 -0500 by tstout2

Is Scala public or private under the hood?

Anyone who has studied Scala knows that it drops many keywords, such as public, static, etc. But this also leaves us with some puzzles. This article answers one of them: under the hood, is Scala public or private? An attribute in a class behaves as "public" even without a scope qualifier We ...
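
A small sketch of the surface behavior the excerpt describes (the class and field names are illustrative). At the source level, a member with no modifier is public; note that for a `val`, the compiler actually emits a private field plus a public accessor method, which is the kind of detail the article digs into:

```scala
// No modifier: publicly accessible. `private`: visible only inside Person.
class Person {
  val name = "alice"       // no qualifier -> public accessor
  private val secret = 42  // private -> not visible outside the class
}

val p = new Person
println(p.name)        // ok: alice
// println(p.secret)   // does not compile: value secret is private
```

So "no keyword" in Scala means public by default, the opposite of Java's package-private default.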

Posted on Wed, 17 Nov 2021 00:14:26 -0500 by Nile

Simple introduction to Scala

Truly mastering a language is not easy, but picking up some basic usage and putting it to work should not be hard. Scala and Java have much in common; coming from Java, it should be easy to get into Scala. Configuring the Scala environment: here I configure Scala version 2.11.8. 1. First, configure the local environment variable ...

Posted on Thu, 04 Nov 2021 13:53:30 -0400 by dercof

Spark source code analysis (based on the YARN cluster mode) -- on RDD and dependencies

We know that RDD is a particularly important concept in Spark; it can be said that all of Spark's logic depends on RDDs. In this article, we briefly discuss the RDD in Spark. The definition of RDD in Spark is as follows: abstract class RDD[T: ClassTag]( @transient private var _sc: SparkContext, @transient private var deps: Seq[Depend ...
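
The `deps: Seq[Dependency[_]]` parameter in the truncated definition above can be observed directly from user code. This sketch assumes a Spark dependency on the classpath and is illustrative, not taken from the analyzed source:

```scala
// Hedged sketch (requires Spark on the classpath): inspecting the
// dependencies an RDD records, i.e. the `deps` from the definition.
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(
  new SparkConf().setMaster("local[*]").setAppName("deps"))

val nums = sc.parallelize(1 to 10, 2)
val mapped = nums.map(_ * 2)         // narrow: OneToOneDependency
val grouped = mapped.groupBy(_ % 3)  // wide: ShuffleDependency

println(mapped.dependencies)   // one OneToOneDependency on `nums`
println(grouped.dependencies)  // one ShuffleDependency on `mapped`
sc.stop()
```

Narrow dependencies let stages pipeline; a `ShuffleDependency` marks a stage boundary, which is why the dependency sequence matters to the scheduler.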

Posted on Tue, 02 Nov 2021 04:05:18 -0400 by Elle0000

Pattern Matching in Scala [simple pattern matching, matching types, guards, matching case classes, matching collections, pattern matching in variable declarations, matching in for expressions]

Scala has a very powerful pattern-matching mechanism, used for: judging fixed values, type queries, and quick data extraction. Simple pattern matching: a pattern match contains a series of alternatives, each starting with the keyword case, and each alternative contains a pattern and one or more expressions. The arrow symbol => se ...
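
The three uses listed in the excerpt (fixed values, type queries, data extraction), plus a guard, fit in one small match. This is a generic sketch, not code from the original post:

```scala
// One alternative per feature named above; `describe` is illustrative.
def describe(x: Any): String = x match {
  case 1               => "fixed value one"     // judging a fixed value
  case s: String       => s"a string: $s"       // type query
  case (a, b)          => s"pair of $a and $b"  // quick data extraction
  case n: Int if n < 0 => "negative int"        // guard
  case _               => "something else"      // default alternative
}

println(describe(1))        // fixed value one
println(describe("hi"))     // a string: hi
println(describe((3, 4)))   // pair of 3 and 4
println(describe(-5))       // negative int
```

Alternatives are tried top to bottom and the first matching pattern wins, so the catch-all `_` must come last.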

Posted on Sun, 31 Oct 2021 14:47:29 -0400 by bdlang

Action operators of RDD

An action operator is an operator that triggers computation; triggering an action means the data is actually computed. collect: collect gathers data from the executor side to the driver side. For example, a simple word-count program: object CollectAction { def main(args: Array[String]): Unit = { val conf: SparkConf = new SparkConf().set ...
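
A complete version of a word-count program in the same shape as the truncated excerpt might look as follows. This assumes a Spark dependency on the classpath; the input data is illustrative, not from the original post:

```scala
// Hedged sketch (requires Spark on the classpath): transformations are
// lazy; nothing runs until the `collect` action at the end.
import org.apache.spark.{SparkConf, SparkContext}

object CollectAction {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf =
      new SparkConf().setMaster("local[*]").setAppName("CollectAction")
    val sc = new SparkContext(conf)

    val counts = sc.parallelize(Seq("a b", "a c"))
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)

    // collect triggers the computation and ships the results from the
    // executors back to the driver as a local Array[(String, Int)].
    counts.collect().foreach(println)
    sc.stop()
  }
}
```

Because `collect` materializes the whole result on the driver, it is only safe for small outputs; large results should use actions like `saveAsTextFile` instead.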

Posted on Thu, 14 Oct 2021 20:09:17 -0400 by pkSML

The byKey family of RDD operators

Spark has a family of xxxByKey operators; let's take a look. groupByKey. Explanation: suppose we want to group a list of strings: object GroupByKeyOperator { def main(args: Array[String]): Unit = { val conf = new SparkConf().setMaster("local[*]").setAppName("RDD") val context = new SparkContext(conf) val rdd: RDD[String] = contex ...
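
Continuing the truncated excerpt, a complete `groupByKey` example might look like this. It assumes a Spark dependency on the classpath; the sample words and the first-letter key are illustrative choices, not necessarily those of the original post:

```scala
// Hedged sketch (requires Spark on the classpath): groupByKey collects
// all values sharing a key into one Iterable per key.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object GroupByKeyOperator {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("RDD")
    val context = new SparkContext(conf)

    val rdd: RDD[String] =
      context.parallelize(Seq("apple", "avocado", "banana", "blueberry"))

    // Key each word by its first letter, then group values per key.
    val grouped = rdd.map(w => (w.head, w)).groupByKey()
    grouped.collect().foreach { case (k, ws) =>
      println(s"$k -> ${ws.mkString(", ")}")
    }
    context.stop()
  }
}
```

Note that `groupByKey` shuffles every value; when a per-key aggregate (a count, a sum) is all that is needed, `reduceByKey` is the cheaper member of the family because it combines values on the map side first.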

Posted on Thu, 14 Oct 2021 19:38:43 -0400 by ErnesTo