Multi GROUP BY merge

1, GROUPING SETS

Grouping sets allows the user to specify multiple column lists to group. Sets the column of the given word list that does not belong to the grouping column to NULL
For example, the demand makes statistics on order sales by store grouping, order group grouping, store and order group grouping respectively, and obtains the result set of the three (inserted into the wide table). It can be implemented through union all multiple group by

select store_id,
       null,
       sum(coalesce(order_amount, 0))
from yp_dwb.dwb_order_detail
group by store_id
union all
select null,
       group_id,
       sum(coalesce(order_amount, 0))
from yp_dwb.dwb_order_detail
group by group_id
union all
select store_id,
       group_id,
       sum(coalesce(order_amouunt, 0))
from yp_dwb.dwb_order_detail
group by store_id, group_id;

Such SQL will be long. When the grouping dimensions of wide table requirements are more, the SQL will be longer and longer. At this time, the same operation can be realized through grouping sets, while reducing repeated SQL statements

select store_id,
       group_id,
       sum(coalesce(order_amount, 0))
from yp_dwb.dwb_order_detail
group by
    grouping sets (store_id, group_id, (store_id, group_id));

At the same time, SQL with complex group syntax such as grouping sets, cube and rollup will only read the basic data source once, while SQL with union all will read the basic data source three times (or more). Therefore, when the data source is changing frequently, the use of union all splicing may produce some strange results

2, CUBE

The CUBE operation will generate all possible grouping sets results for the provided column

select store_id,
       group_id,
       sum(coalesce(order_amount, 0))
from yp_dwb.dwb_order_detail
group by
cube (store_id, group_id);
-- Equivalent to
select store_id,
       group_id,
       sum(coalesce(order_amount, 0))
from yp_dwb.dwb_order_detail
group by
grouping sets ((store_id, group_id), store_id, group_id, ());

3, ROLLUP

rollup operation is special and will be phased out at the end

select store_id,
       group_id,
       sum(coalesce(order_amount, 0))
from yp_dwb.dwb_order_detail
group by
rollup (store_id, group_id);
-- Equivalent to
select store_id,
       group_id,
       sum(coalesce(order_amount, 0))
from yp_dwb.dwb_order_detail
group by
grouping sets ((store_id, group_id), (store_id), ())

That is, rollup((a),(b), ©) Equivalent to grouping sets((a, b, c), (a, b), (a), ())

4, GROUPING function

  • Grouping returns a collection of binary values converted to decimal numbers indicating which columns exist in the grouping. It must be used in conjunction with GROUPING SETS, ROLLUP, CUBE or GROUP BY, and its parameters must exactly match the columns referenced in the responding GROUPING SETS, ROLLUP, CUBE or GROUP BY clause
  • For a given grouping, if the grouping contains a response column, set the bit to 0, otherwise set it to 1
select group_id,
       store_id,
       sum(coalesce(goods_amount, 0)),
       grouping(group_id) as ging,
       grouping(store_id) as sing,
       grouping(group_id, store_id) as g_s_ing
from yp_dwb.dwb_order_detail
group by
grouping sets (group_id, store_id, (group_id, store_id));

  • When the grouping(group_id) column is 0, you can see that the group_id column has values. When it is 1, on the contrary, it proves that the current row is grouped and counted according to the group_id;
  • Similarly, for grouping(store_id), when it is 0, the styore_id column has a value, and when it is 1, the store_id is empty, which proves that the current row is grouped and counted according to the store_id;
  • grouping(group_id, store_id) is the number converted from the combination of binary values of grouping(group_id) and grouping(store_id):
    • Group according to group_id, then group_id=0, store_id=1. After combination, it is 01. Binary is converted to decimal to get the number 1;
    • Group by store_id, then group_id=1, store_id=0, the combined value is 10, and binary is converted to decimal to get the number 2;
    • Group by group_id and store_id at the same time, then group_id=0, store_id=0, 00 after combination, binary is converted to decimal to get the number 0
      Therefore, the goruping operation can be used to determine which field the current data is grouped according to

Tags: Database Big Data SQL

Posted on Thu, 04 Nov 2021 09:23:39 -0400 by jmicozzi