Author: Yang Yunlong, core member of Hangao PG laboratory, Senior Database Engineer, good at HGDB, post gresql, Oracle and other mainstream databases.
Introduction to table connection methods (Nested Loop/Hash Join/Merge Join/Join) database version (oracle11.2.0.4 and PostgreSQL 13.1)
- Environmental structure
- Nested Loop
- Hash Join
- Merge Join
- Advantages and disadvantages of three connection methods
- Join connection type
- union introduction
- Self connection
- Reference documents
Environmental structure
-- oracle11.2.0.4 --City, country drop table country purge; CREATE TABLE country ( country_id int primary key, country_name VARCHAR(50) NOT NULL ); drop table city purge; CREATE TABLE city ( city_id int primary key, city_name VARCHAR(50) NOT NULL, country_id int NOT NULL ); begin for i in 1 .. 10 loop insert into country values(i,'country'||i); end loop; commit; end; begin for i in 1 .. 10000 loop insert into city values(i,'city'||i,ceil(i/1000)); end loop; commit; end; execute dbms_stats.gather_table_stats(ownname =>'SCOTT',tabname => 'CITY' ,estimate_percent=> 100 ,cascade => true); execute dbms_stats.gather_table_stats(ownname =>'SCOTT',tabname => 'COUNTRY' ,estimate_percent=> 100 ,cascade => true); -- PostgreSQL 13.1 drop table country; CREATE TABLE country ( country_id integer primary key, country_name text NOT NULL ); drop table city purge; CREATE TABLE city ( city_id integer primary key, city_name text NOT NULL, country_id integer NOT NULL ); insert into country values (generate_series(1,10),'country'||generate_series(1,10)); insert into city values(generate_series(1,10000),'city'||generate_series(1,10000),ceil(random()*(10-1)+1)); analyze city; analyze country;
Nested Loop
As shown in the figure above, it is introduced in Nested Loop mode. Its algorithm: the driver table returns a row of data, and transmits the value to the driven table through the connection column. The driven table will be scanned as many times as many rows the driver table returns. Driven table index mode index unique scan or range scan. When there are few driven table data subsets, Nested Loop is a better choice, that is, it is suitable for queries with small result set. Generally, more than 10000 rows are considered as large rows, which will become inefficient. Examples are as follows:
Oracle execution example
SQL> explain plan for select city_name,country_name from city,country where city.country_id=country.country_id and city.city_id=99; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 2738185913 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 93 | 2 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 1 | 93 | 2 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| CITY | 1 | 53 | 1 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | SYS_C0018242 | 1 | | 1 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID| COUNTRY | 1 | 40 | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | SYS_C0018239 | 1 | | 0 (0)| 00:00:01 | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("CITY"."CITY_ID"=99) 5 - access("CITY"."COUNTRY_ID"="COUNTRY"."COUNTRY_ID") 18 rows selected.
Sentence comparison
--The following way oracle Will choose hash Full table scan explain plan for select city_name,country_name from city,country where city.country_id=country.country_id and city.city_id>=10; --Mandatory use nested loop mode explain plan for select /*+ leading(city) use_nl(country) */ city_name,country_name from city,country where city.country_id=country.country_id and city.city_id>=10; SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 103883790 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9991 | 907K| 10005 (1)| 00:02:01 | | 1 | NESTED LOOPS | | 9991 | 907K| 10005 (1)| 00:02:01 | | 2 | NESTED LOOPS | | 9991 | 907K| 10005 (1)| 00:02:01 | |* 3 | TABLE ACCESS FULL | CITY | 9991 | 517K| 11 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | SYS_C0018239 | 1 | | 0 (0)| 00:00:01 | | 5 | TABLE ACCESS BY INDEX ROWID| COUNTRY | 1 | 40 | 1 (0)| 00:00:01 | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("CITY"."CITY_ID">=10) 4 - access("CITY"."COUNTRY_ID"="COUNTRY"."COUNTRY_ID") Note ----- - dynamic sampling used for this statement (level=2) 22 rows selected. --As shown above, the driver performs 9991 times, that is (10000)-9),Resource consumption increased significantly
Postgresql execution example
mydb=# explain analyze select city_name,country_name from city,country where city.country_id=country.country_id and city.city_id=99; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.44..16.49 rows=1 width=40) (actual time=0.031..0.036 rows=1 loops=1) -> Index Scan using city_pkey on city (cost=0.29..8.30 rows=1 width=12) (actual time=0.015..0.017 rows=1 loops=1) Index Cond: (city_id = 99) -> Index Scan using country_pkey on country (cost=0.15..8.17 rows=1 width=36) (actual time=0.006..0.006 rows=1 loops=1) Index Cond: (country_id = city.country_id) Planning Time: 0.621 ms Execution Time: 0.098 ms (7 rows) --Return multiple lines, mandatory nested loop,Than the default selection hash join More than 30 ms mydb=# set enable_hashjoin=off; SET mydb=# set enable_mergejoin=off; SET mydb=# explain analyze select city_name,country_name from city,country where city.country_id=country.country_id and city.city_id>=10; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.15..1934.29 rows=9991 width=40) (actual time=0.050..43.354 rows=9991 loops=1) -> Seq Scan on city (cost=0.00..188.00 rows=9991 width=12) (actual time=0.025..4.870 rows=9991 loops=1) Filter: (city_id >= 10) Rows Removed by Filter: 9 -> Index Scan using country_pkey on country (cost=0.15..0.17 rows=1 width=36) (actual time=0.002..0.002 rows=1 loops=9991) Index Cond: (country_id = city.country_id) Planning Time: 0.393 ms Execution Time: 45.366 ms (8 rows)
PG Nested loop description
Postgresql uses nested loops to execute queries. It can iterate to represent all entries in 1 and all entries in 2, and then issue a row when the row pairs in Table 1 and table 2 meet the filter criteria. Nested loops are the only connection algorithm Postgresql can use to handle any connection. For example, in the following statement, Oracle will select Merge Join, and Postgresql will select nested lo OP, please refer to the Merge Join section for details.
--Even if the amount of data is large, the following statement pg Also choose Nested Loop explain select a.city_name,b.city_id from city a,city2 b where a.country_id<b.country_id;
Hash Join
When two tables are associated, they mainly rely on hash operation to obtain the table connection mode of the result set, and only support equivalence. Algorithm: two tables are connected equivalently, return a large amount of data, select fewer tables as the driving table, and read the relevant columns of the driving table into the work area in PGA (PG is put into the memory work_mem) , then hash the connected columns of the drive table to generate a hash table, then read the driven table and hash the associated columns, and then probe the hash table in PGA to find the data association. If the hash table is too large to be constructed in memory at one time, it will be divided into several partition s and written to the temporary segment of the disk, which will increase the cost of writing and reduce the efficiency
The figure above shows the Hash process. If the statistics are accurate, the database will automatically select the best execution plan. When the ORDERED prompt is used, the first table in the FROM clause will be used to create a Hash table.
Oracle example
-- country As drive table,Whatever you use join(left/right/full)Way, Oracle The small table will be selected as the driving table. SQL> explain plan for select city_name,country_name from city,country where city.country_id=country.country_id; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------- Plan hash value: 114462077 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 10000 | 781K| 14 (0)| 00:00:01 | |* 1 | HASH JOIN | | 10000 | 781K| 14 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| COUNTRY | 10 | 400 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL| CITY | 10000 | 390K| 11 (0)| 00:00:01 | ------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("CITY"."COUNTRY_ID"="COUNTRY"."COUNTRY_ID") Note ----- - dynamic sampling used for this statement (level=2) 19 rows selected.
Postgresql example
mydb=# explain analyze select city_name,country_name from city,country where city.country_id=country.country_id; QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Hash Join (cost=38.58..227.91 rows=10000 width=40) (actual time=0.185..12.086 rows=10000 loops=1) Hash Cond: (city.country_id = country.country_id) -> Seq Scan on city (cost=0.00..163.00 rows=10000 width=12) (actual time=0.026..3.735 rows=10000 loops=1) -> Hash (cost=22.70..22.70 rows=1270 width=36) (actual time=0.064..0.067 rows=10 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 17kB -> Seq Scan on country (cost=0.00..22.70 rows=1270 width=36) (actual time=0.025..0.031 rows=10 loops=1) Planning Time: 1.953 ms Execution Time: 13.983 ms (8 rows)
In order to create a hash table using a small table, the optimizer will convert it. For example, right join is changed to Left Join
mydb=# explain analyze select * from country right join city on city.country_id=country.country_id; QUERY PLAN ------------------------------------------------------------------------------------------------------------------- Hash Left Join (cost=38.58..227.91 rows=10000 width=52) (actual time=0.087..11.883 rows=10000 loops=1) Hash Cond: (city.country_id = country.country_id) -> Seq Scan on city (cost=0.00..163.00 rows=10000 width=16) (actual time=0.020..3.035 rows=10000 loops=1) -> Hash (cost=22.70..22.70 rows=1270 width=36) (actual time=0.044..0.047 rows=10 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 17kB -> Seq Scan on country (cost=0.00..22.70 rows=1270 width=36) (actual time=0.024..0.028 rows=10 loops=1) Planning Time: 0.413 ms Execution Time: 13.588 ms (8 rows)
Merge Join
When two tables are connected, sorting and merging operations are used to obtain the connection mode of the result set. Sorting mainly deals with non equivalent management. Algorithm: first sort the two tables according to the connection column. The nested loop matches the data from the index of the driven table, and the sorting merge connection is from memory (work area in PGA) Strictly speaking, there are no driven tables in the matching data in the. Fewer tables can be considered as driven tables. HASH JOIN only needs to put the driven tables into PGA, but sorting join merging needs to put the result sets of both tables into PGA
The figure above shows the general process of Merge. If the statistics are accurate, the database will automatically select the best execution plan.
Oracle example
col PLAN_TABLE_OUTPUT for a100 set lines 200 pages 999 explain plan for select a.city_name,b.country_name from city a,country b where a.country_id<b.country_id; SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ Plan hash value: 1026867539 ------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 45000 | 1054K| | 62 (2)| 00:00:01 | | 1 | MERGE JOIN | | 45000 | 1054K| | 62 (2)| 00:00:01 | | 2 | SORT JOIN | | 10 | 120 | | 2 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| COUNTRY | 10 | 120 | | 2 (0)| 00:00:01 | | 4 | INDEX FULL SCAN | SYS_C0018239 | 10 | | | 1 (0)| 00:00:01 | |* 5 | SORT JOIN | | 10000 | 117K| 408K| 60 (2)| 00:00:01 | | 6 | TABLE ACCESS FULL | CITY | 10000 | 117K| | 11 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 5 - access(INTERNAL_FUNCTION("A"."COUNTRY_ID")<INTERNAL_FUNCTION("B"."COUNTRY_ID")) filter(INTERNAL_FUNCTION("A"."COUNTRY_ID")<INTERNAL_FUNCTION("B"."COUNTRY_ID")) 19 rows selected.
Postgresql example
merge join requires that the keys of two joined tables are sorted first. If there is an index, there is no sorting process. The advantage of merge join is that both tables are scanned only once. If there is a nested loop, one table is scanned once and the other table is cycled many times
In Postgresql, Merge Join is only applicable to natural connection and equivalent connection. As shown in the following example, even if enable_nestloop is turned off, pg will be transformed into Nested Loop in this connection mode, and the initial cost estimation is very large. Therefore, Postgresql is different from Oracle mechanism.
mydb=# analyze city; ANALYZE mydb=# analyze country; ANALYZE mydb=# set enable_nestloop=off; SET mydb=# explain analyze select a.city_name,b.country_name from city a,country b where a.country_id<b.country_id; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=10000000000.00..10000016637.12 rows=333333 width=18) (actual time=0.104..419.608 rows=400687 loops=1) Join Filter: (a.country_id < b.country_id) Rows Removed by Join Filter: 599313 -> Seq Scan on city a (cost=0.00..1636.00 rows=100000 width=13) (actual time=0.066..20.058 rows=100000 loops=1) -> Materialize (cost=0.00..1.15 rows=10 width=13) (actual time=0.000..0.001 rows=10 loops=100000) -> Seq Scan on country b (cost=0.00..1.10 rows=10 width=13) (actual time=0.014..0.019 rows=10 loops=1) Planning Time: 0.649 ms Execution Time: 457.499 ms (8 rows)
hash join is the fastest connection mode only when it is executed in memory. When pg thinks that the hash connection needs more memory, it will select merge join, which can use the disk more effectively. In Postgresql, merge join is only applicable to connections with equal conditions. Its working principle is better than the first row (minimum value) of each output , if they are equal, add a row to the result and delete the two rows that have been compared from input; if the fields are different, delete the one with the smallest id and continue to compare with the next smallest.
Postgres can perform merging on disk very effectively by saving a small part of the data in memory, merging the data, writing the merging results to disk, and then reading more input from disk to memory. This method reads and writes a large amount of data on disk at one time, which is more efficient than Hash random reading.
mydb=# explain select a.city_name,b.city_id from city a,city2 b where a.country_id=b.country_id and a.city_id<30000; QUERY PLAN -------------------------------------------------------------------------------------- Hash Join (cost=3274.21..3764393.24 rows=333242482 width=13) Hash Cond: (a.country_id = b.country_id) -> Index Scan using city_pkey on city a (cost=0.29..1052.87 rows=30033 width=13) Index Cond: (city_id < 30000) -> Hash (cost=1634.63..1634.63 rows=99863 width=8) -> Seq Scan on city2 b (cost=0.00..1634.63 rows=99863 width=8) (6 rows) mydb=# set enable_hashjoin=off; SET mydb=# explain select a.city_name,b.city_id from city a,city2 b where a.country_id=b.country_id and a.city_id<30000; QUERY PLAN -------------------------------------------------------------------------------------------- Merge Join (cost=13213.55..5012000.94 rows=333242482 width=13) Merge Cond: (a.country_id = b.country_id) -> Sort (cost=3286.46..3361.55 rows=30033 width=13) Sort Key: a.country_id -> Index Scan using city_pkey on city a (cost=0.29..1052.87 rows=30033 width=13) Index Cond: (city_id < 30000) -> Sort (cost=9927.09..10176.74 rows=99863 width=8) Sort Key: b.country_id -> Seq Scan on city2 b (cost=0.00..1634.63 rows=99863 width=8) (9 rows)
The Merge Join method works better when memory is not enough
When the amount of data is very large, Merge Join will be selected. If there is no index, Materialize will be selected to cache the results in memory or files to avoid multiple full table scans, as shown below:
mydb=# create unlogged table tbl1(id int, info text) with (autovacuum_enabled=off); CREATE TABLE mydb=# create unlogged table tbl2(id int, info text) with (autovacuum_enabled=off); CREATE TABLE mydb=# insert into tbl1 select generate_series(1,10000000); INSERT 0 10000000 mydb=# insert into tbl2 select generate_series(1,10000000); INSERT 0 10000000 mydb=# explain select * from tbl1,tbl2 where tbl1.id=tbl2.id; QUERY PLAN ---------------------------------------------------------------------------------- Merge Join (cost=2075531.84..2370522825.11 rows=157893676470 width=72) Merge Cond: (tbl1.id = tbl2.id) -> Sort (cost=1037765.92..1051814.66 rows=5619496 width=36) Sort Key: tbl1.id -> Seq Scan on tbl1 (cost=0.00..100442.96 rows=5619496 width=36) -> Materialize (cost=1037765.92..1065863.40 rows=5619496 width=36) -> Sort (cost=1037765.92..1051814.66 rows=5619496 width=36) Sort Key: tbl2.id -> Seq Scan on tbl2 (cost=0.00..100442.96 rows=5619496 width=36) (9 rows) mydb=# explain select * from tbl1,tbl2 where tbl1.id<tbl2.id; QUERY PLAN ---------------------------------------------------------------------------- Nested Loop (cost=0.00..720393977232.90 rows=10526245098005 width=72) Join Filter: (tbl1.id < tbl2.id) -> Seq Scan on tbl1 (cost=0.00..100442.96 rows=5619496 width=36) -> Materialize (cost=0.00..172443.44 rows=5619496 width=36) -> Seq Scan on tbl2 (cost=0.00..100442.96 rows=5619496 width=36) (5 rows) mydb=# set enable_mergejoin=off; SET mydb=# explain select * from tbl1,tbl2 where tbl1.id=tbl2.id; QUERY PLAN ---------------------------------------------------------------------------- Hash Join (cost=214589.66..5526742276.56 rows=157893676470 width=72) Hash Cond: (tbl1.id = tbl2.id) -> Seq Scan on tbl1 (cost=0.00..100442.96 rows=5619496 width=36) -> Hash (cost=100442.96..100442.96 rows=5619496 width=36) -> Seq Scan on tbl2 (cost=0.00..100442.96 rows=5619496 width=36) (5 rows) mydb=# explain select * from tbl1,tbl2 where tbl1.id=tbl2.id and tbl1.id<2000; QUERY PLAN ---------------------------------------------------------------------------------- Merge Join (cost=1449858.75..790941515.17 rows=52631216124 width=72) Merge Cond: (tbl1.id = tbl2.id) -> Sort (cost=412092.83..416775.74 rows=1873165 width=36) Sort Key: tbl1.id -> Seq Scan on tbl1 (cost=0.00..114491.70 rows=1873165 width=36) Filter: (id < 2000) -> Materialize (cost=1037765.92..1065863.40 rows=5619496 width=36) -> Sort (cost=1037765.92..1051814.66 rows=5619496 width=36) Sort Key: tbl2.id -> Seq Scan on tbl2 (cost=0.00..100442.96 rows=5619496 width=36) (10 rows)
Advantages and disadvantages of three connection methods
There is also a connection called cross join / Cartesian product. There are no conditions for the association of two tables. The result set is the product of the number of rows of two tables. This method will not be used. It will not be discussed this time.
category | Nested Loop | Hash Join | Merge Join |
---|---|---|---|
Service conditions | Any condition | Equivalent connection | Non equivalent, except '< >' |
Use resources | CPU, disk IO | Memory, temporary space | Memory, temporary space |
characteristic | High selection index or restricted search has high timeliness and can quickly return the first search results | Many table records, lack of effective index and high efficiency | Non equivalent connection, high efficiency |
shortcoming | Large amount of data and low efficiency, especially when there is no index or insufficient restrictions | Requires a lot of memory and may use disk | All tables need to be sorted, and the results are returned finally. PG sorts the memory in batch order |
Join connection type
inner Join
--The result contains only records that meet the connection conditions select a.id,a.col1,b.col2 from a,b where a.id=b.id; --or select a.id,a.col1,b.col2 from a inner join b on a.id=b.id; --or select id,a.col1,b.col2 from a inner join b using(id);
Full outer Join
SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name; ID NAME ID NAME ---------- ------ ---------- ------ 1 aaa 1 aaa 2 bbb 3 ccc --If there is no matching record, null As values, the same result set can be removed SELECT * FROM TableA FULL OUTER JOIN TableB ON TableA.name = TableB.name where tablea.id is null or tableb.id is null; ID NAME ID NAME ---------- ------ ------ ------ 2 bbb 3 ccc
left/right join
--As shown in the figure above, the A Table all results, B It's no use null As value,Except for the same value, right And left Just the opposite SELECT * FROM TableA left JOIN TableB ON TableA.name = TableB.name; ID NAME ID NAME ---------- ------ ---------- ------ 1 aaa 1 aaa 3 ccc --You can remove the same result set SELECT * FROM TableA left JOIN TableB ON TableA.name = TableB.name where tableb.id is null; ID NAME ID NAME ---------- ------ ---------- ------ 3 ccc
union introduction
The UNION operator is used to combine the result sets of two or more SELECT statements.
Note that the SELECT statement inside the UNION must have the same number of columns. Columns must also have similar data types. At the same time, the order of columns in each SELECT statement must be the same. UNION only selects records and has the function of de duplication, while UNION ALL lists all records.
SELECT name FROM TableA UNION SELECT name FROM TableB; NAME ------------------------------ aaa bbb SELECT name FROM TableA UNION all SELECT name FROM TableB; NAME ------------------------------ aaa aaa bbb
In some cases, union can be used instead of or to provide query efficiency
select * from t1 where owner='SCOTT' or object_id in (select object_id from t2); --The above statement uses FILTER,Can adjust sql,perhaps t2.object_id Create index select * from t1 where owner='SCOTT' union select * from t1 where object_id in (select object_id from t2); --Column to row example select employee,child_1 as child from personnel union all select employee,child_2 as child from personnel;
Self connection
Paging query
--Four layer paging is more suitable for paging queries with large pages. The small returned result set is used as the driven table and the driven table is scanned. with tmp as (SELECT * FROM ( SELECT rid, ROWNUM as RN FROM (SELECT rowid as rid FROM t1 where owner='SYS' order by object_id desc ) WHERE ROWNUM <= 500 ) WHERE RN > 490 ) select /*+ use_nl(a) leading(b) */ owner,object_name,object_id,rn from t1 a,tmp b where a.rowid=b.rid; --three layers select Is commonly used select * from (select owner,object_name,object_id,rownum as rn from(select owner,object_name,object_id from t1 where owner='SYS' order by object_id desc) where rownum<=500) where rn>490;
Delete duplicate lines
--Extreme function delete from products p1 where rowid < ( select max(p2.rowid) from products p2 where p1.name = p2.name and p1.price =p2.price ); --Use non equivalent connection delete from products p1 where exists ( select * from products p2 where p1.name = p2.name and p1.price = p2.price and p1.rowid < p2.rowid );
The associated subquery compares rows with rows eg
--Seek the same year as the previous year's turnover select year,sale from sales s1 where sale = ( select sale from sales s2 where s2.year = s1.year-1 ) order by year; --or select s1.year,s1.sale from sales s1,sales s2 where s2.sale =s1.sale and s2.year = s1.year-1 order by year; --Comparison of the same period select s1.year,s1.sale, case when s1.sale = s2.sale then 'Flat' when s1.sale > s2.sale then 'increase' when s1.sale < s2.sale then 'reduce' else '-' end as var from sales s1,sales s2 where s2.year = s1.year -1 order by year;
Reference documents
- Join introduce: https://www.cnblogs.com/xqzt/p/4469673.html
- Join introduce: https://logicalread.com/
- pg join method: http://www.jasongj.com/2015/03/07/Join1/
- pg join method: https://developer.aliyun.com/article/70992
- pg join method: https://www.interdb.jp/pg/pgsql03.html
- pg Merge Joins: https://malisper.me/postgres-merge-joins/
- Table connection method reference: Book core ideas of SQL optimization
- SQL connection example reference: Advanced SQL tutorial