Day465. Trigger & other new features of mysql 8 mysql

trigger

In actual development, we often encounter such situations: there are two or more interrelated tables, such as commodity information and inventory information stored in two different data tables respectively. When we add a new commodity record, in order to ensure the integrity of the data, we must add an inventory record to the inventory table at the same time.

In this way, we must write the two associated operation steps into the program and wrap them with transactions to ensure that the two operations become an atomic operation, either all or none. In case of special circumstances, you may also need to manually maintain the data, so it is easy to forget one step, resulting in data loss.

At this time, we can use triggers** You can create a trigger to let the insertion of commodity information data automatically trigger the insertion of inventory data** In this way, you don't have to worry about missing data caused by forgetting to add inventory data.

1. Trigger overview

MySQL supports triggers from version 5.0.2. MySQL triggers, like stored procedures, are a program embedded in the MySQL server.

Triggers trigger an operation by events, including INSERT, UPDATE and DELETE events. The so-called event refers to the user's action or triggering a certain behavior. If a trigger program is defined, when the database executes these statements, it is equivalent to an event, and the trigger will be automatically fired to perform the corresponding operation.

Triggers can be used to automatically execute some database logic when performing insert, update and delete operations on the data in the data table.

2. Trigger creation

2.1 create trigger syntax

The syntax structure for creating triggers is:

CREATE TRIGGER Trigger Name  
{BEFORE|AFTER} {INSERT|UPDATE|DELETE} ON Table name 
FOR EACH ROW 
Statement block executed by trigger;

explain:

  • Table name: indicates the object monitored by the trigger.

  • BEFORE|AFTER: indicates the trigger time. BEFORE means triggered BEFORE the event; AFTER indicates that it is triggered AFTER the event.

  • INSERT|UPDATE|DELETE: indicates the triggered event.

    • INSERT indicates that it is triggered when a record is inserted;
    • UPDATE indicates that it is triggered when the record is updated;
    • DELETE indicates that it is triggered when a record is deleted.
  • Statement block executed by trigger: it can be a single SQL statement or a composite statement block composed of BEGIN... END structure.

2.2 code examples

Example 1:

1. Create data table:

CREATE TABLE test_trigger (
id INT PRIMARY KEY AUTO_INCREMENT,
t_note VARCHAR(30)
);


CREATE TABLE test_trigger_log (
id INT PRIMARY KEY AUTO_INCREMENT,
t_log VARCHAR(30)
);

2. Create trigger: create a trigger with the name before_ Trigger for insert to test_ Before inserting data into the trigger data table, add test_trigger_ Insert before in the log data table_ Log information of the insert.

DELIMITER //

CREATE TRIGGER before_insert
BEFORE INSERT ON test_trigger 
FOR EACH ROW
BEGIN
	INSERT INTO test_trigger_log (t_log)
	VALUES('before_insert');

END //

DELIMITER ;

3. Xiang test_ Insert data into trigger data table

INSERT INTO test_trigger (t_note) VALUES ('test BEFORE INSERT trigger');

4. View test_ trigger_ Data in log data table

mysql> SELECT * FROM test_trigger_log;
+----+---------------+
| id | t_log         |
+----+---------------+
|  1 | before_insert |
+----+---------------+
1 row in set (0.00 sec)

Example 2:

1. Create a file named after_ Trigger for insert to test_ After inserting data into the trigger data table, add the data to test_trigger_ Insert after into the log data table_ Log information of the insert.

DELIMITER //

CREATE TRIGGER after_insert
AFTER INSERT ON test_trigger
FOR EACH ROW
BEGIN
	INSERT INTO test_trigger_log (t_log)
	VALUES('after_insert');
END //

DELIMITER ;

2. Xiang test_ Insert data into the trigger data table.

INSERT INTO test_trigger (t_note) VALUES ('test AFTER INSERT trigger');

3. View test_ trigger_ Data in log data table

mysql> SELECT * FROM test_trigger_log;
+----+---------------+
| id | t_log         |
+----+---------------+
|  1 | before_insert |
|  2 | before_insert |
|  3 | after_insert  |
+----+---------------+
3 rows in set (0.00 sec)

**Example 3: * * define the trigger "salary_check_trigger". Based on the INSERT event of the employee table "employees", check whether the salary of the new employee to be added is greater than the salary of his leader before INSERT. If it is greater than the salary of the leader, report sqlstate_ The error with value 'HY000' makes the addition fail.

DELIMITER //

CREATE TRIGGER salary_check_trigger
BEFORE INSERT ON employees FOR EACH ROW
BEGIN
	DECLARE mgrsalary DOUBLE;
	SELECT salary INTO mgrsalary FROM employees WHERE employee_id = NEW.manager_id;

	IF NEW.salary > mgrsalary THEN
		SIGNAL SQLSTATE 'HY000' SET MESSAGE_TEXT = 'Salary higher than leader's salary error';
	END IF;
END //

DELIMITER ;

The NEW keyword in the above trigger declaration process represents the NEW record of the INSERT add statement.

3. View and delete triggers

3.1 viewing triggers

Viewing triggers is to view the definition, status and syntax information of triggers that already exist in the database.

Method 1: view the definitions of all triggers in the current database

SHOW TRIGGERS\G

Method 2: view the definition of a trigger in the current database

SHOW CREATE TRIGGER Trigger Name 

Method 3: information from the system library_ Query the trigger information of "salary_check_trigger" in the TRIGGERS table of schema.

SELECT * FROM information_schema.TRIGGERS;

3.2 delete trigger

Triggers are also database objects. DROP statements are also used to delete triggers. The syntax format is as follows:

DROP TRIGGER  IF EXISTS Trigger Name ;

4. Advantages and disadvantages of trigger

4.1 advantages

1. Triggers ensure data integrity.

Suppose we use the purchase order header table (demo.importhead) to save the overall information of the purchase order, including purchase order number, supplier number, warehouse number, total purchase quantity, total purchase amount and acceptance date.

Use the purchase order details (demo.importdetails) to save the details of purchased goods, including purchase order number, commodity number, purchase quantity, purchase price and purchase amount.

Whenever we enter, delete and modify a purchase order detail, the data in the purchase order detail will change. At this time, the total quantity and total amount in the purchase order header table must be recalculated. Otherwise, the total quantity and total amount in the purchase order header table are not equal to the total quantity and total amount in the purchase order detail table, which is the data inconsistency.

To solve this problem, we can use triggers to automatically trigger two-step operations whenever there are data insertion, modification and deletion operations in the purchase order details:

1) Recalculate the total quantity and total amount in the purchase order details;

2) Update the total quantity and total amount in the purchase order header table with the value calculated in step 1.

In this way, the value of total quantity and total amount in the purchase order header table is always the same as the value of total quantity and total amount calculated in the purchase order details. The data is consistent and will not conflict with each other.

2. Triggers can help us log operations.

Using triggers, you can record exactly what happened at what time. For example, the trigger to record and modify the stored value of members is a good example. This is very helpful for us to better locate the cause of the problem in the specific scenario when the restore operation is executed.

3. Triggers can also be used to check the validity of data before operating data.

For example, when the supermarket purchases goods, the warehouse keeper needs to enter the purchase price. However, human operation is easy to make mistakes. For example, when entering the quantity, the barcode is scanned in; When you enter the amount, you look at a series of lines. The entered price far exceeds the sales price, resulting in huge losses on the book... These can be triggered to check the corresponding data before the actual insertion or update operation, prompt errors in time, and prevent the wrong data from entering the system.

4.2 disadvantages

1. One of the biggest problems with triggers is poor readability.

Because triggers are stored in the database and driven by events, this means that triggers may not be controlled by the application layer. This is very challenging for system maintenance.

For example, create triggers to modify member stored value operations. If there is something wrong with the operation in the trigger, the member stored value update will fail. Let me demonstrate with the following code:

mysql> update demo.membermaster set memberdeposit=20 where memberid = 2;
ERROR 1054 (42S22): Unknown column 'aa' in 'field list'

The result shows that the system prompts an error and the field "aa" does not exist.

This is because there is one more field in the data insertion operation in the trigger, and the system prompts an error. However, if you don't understand the trigger, you may think that there is a problem with the update statement itself or the structure of the member information table. Maybe you will add a field called "aa" to the member information table to try to solve this problem, but it will only be in vain.

2. Changes in relevant data may cause trigger errors.

In particular, the change of data table structure may lead to trigger error, and then affect the normal operation of data operation. These will affect the efficiency of error cause troubleshooting in the application due to the concealment of the trigger itself.

4.3 precautions

Note that if a foreign key constraint is defined in the child table and the foreign key specifies the ON UPDATE/DELETE CASCADE/SET NULL clause, modifying the key value referenced by the parent table or deleting the record row referenced by the parent table will also cause the modification and deletion of the child table. At this time, the triggers defined based on the UPDATE and DELETE statements of the child table will not be activated.

For example, the DELETE statement based on the t_employee table defines the trigger t1, while the department number (did) field of the child table defines the foreign key constraint, references the department number (did) of the primary key column of the t_department table of the parent table, and the foreign key adds the "ON DELETE SET NULL" clause. If the t_department table of the parent table is deleted at this time When there are department records matching records in the sub table employee table (t_employee), the department number (did) of the matching records in the sub table employee table (t_employee) will be modified to NULL, but trigger t1 will not be activated at this time. Trigger t1 is activated only when a DELETE statement is executed directly on the child table (t_employee).

Other new features of MySQL 8

1. Overview of new features of MySQL 8

MySQL jumped directly from version 5.7 to release version 8.0. It can be seen that this is an exciting milestone version. MySQL 8 has made significant improvements and enhancements in functions. Developers have reconstructed the source code of MySQL. The most prominent point is the improvement of multiple MySQL Optimizer optimizers. It not only improves the speed, but also brings users better performance and better experience.

1.1 new features in MySQL 8.0

1. Simpler NoSQL support
NoSQL generally refers to non relational databases and data stores. With the rapid development of Internet platform, the traditional relational database can not meet the demand. Since version 5.6, MySQL has supported simple NoSQL storage. MySQL 8 optimizes this function to realize NoSQL function in a more flexible way and no longer relies on schema.

2. Better indexing
In query, using index correctly can improve the efficiency of query. Hidden index and descending index are added in MySQL 8. Hidden indexes can be used to test the impact of removing indexes on query performance. When there are mixed multi column indexes in the query, the use of descending index can improve the performance of the query.

3. More complete JSON support
MySQL supports the storage of native JSON data from 5.7. MySQL 8 optimizes this function and adds the aggregation function JSON_ARRAYAGG() and JSON_OBJECTAGG(), which aggregates parameters into JSON arrays or objects, adds the inline operator - > >, which is an enhancement of the column path operator - > and improves the JSON sorting and optimizes the JSON update operation.

4. Security and account management
Caching is added in MySQL 8_ sha2_ Password authorization plug-in, role, password history and FIPS mode support. These features improve the security and performance of the database and enable the database administrator to manage the account more flexibly.

5. Changes in InnoDB
InnoDB is the default storage engine of MySQL and the preferred engine for transactional databases. It supports transactional security tables (ACID), row locking and foreign keys. In MySQL 8, InnoDB has made a lot of improvements and optimizations in self increment, indexing, encryption, deadlock, shared lock, etc., and supports atomic data definition language (DDL), which improves data security and provides better support for transactions.

6. Data dictionary
In previous versions of MySQL, dictionary data was stored in metadata files and non transaction tables. Starting from MySQL 8, a new transaction data dictionary is added, in which database object information is stored, and these data dictionaries are stored in the internal transaction table.

7. Atomic data definition statement
MySQL 8 began to support atomic data definition statements (automatic DDL), that is, atomic DDL. Currently, only the InnoDB storage engine supports atomic DDL. Atomic data definition statement (DDL) combines the data dictionary update, storage engine operation and binary log writing related to DDL operation into a single atomic transaction, which makes the transaction commit or rollback even if the server crashes.
The tables created by the storage engine that supports atomic operations support atomic operations when performing DROP TABLE, CREATE TABLE, ALTER TABLE, RENAME TABLE, TRUNCATE TABLE, CREATE TABLESPACE, DROP TABLESPACE and other operations, that is, the transaction is either completely successful or rolled back after failure without partial commit.
For statements copied from MySQL 5.7 to MySQL 8, you can add IF EXISTS or IF NOT EXISTS statements to avoid errors.

8. Resource management
MySQL 8 began to support the creation and management of resource groups, allowing threads running in the server to be assigned to specific groups, so that threads can execute according to the available resources in the group. Group properties can control the resources in the group and enable or limit the consumption of resources in the group. The database administrator can change these properties appropriately according to different workloads.
At present, CPU time is a controllable resource, which is represented by the concept of "virtual CPU". This term includes the number of CPU cores, hyper threads, hardware threads, etc. The server determines the number of virtual CPUs available at startup. Database administrators with corresponding permissions can associate these CPUs with resource groups and allocate threads to resource groups.
The resource group component provides an SQL interface for resource group management in MySQL. The properties of the resource group are used to define the resource group. There are two default groups in mysql, system group and user group. The default group cannot be deleted and its properties cannot be changed. For user-defined groups, all properties can be initialized when a resource group is created. Except for name and type, other properties can be changed after creation.
In some platforms, or when MySQL is configured, the function of resource management will be limited or even unavailable. For example, if a thread pool plug-in is installed or a Mac OS system is used, resource management will be unavailable. In FreeBSD and Solaris systems, the resource thread priority is invalidated. In Linux systems, only cap is configured_ SYS_ Nice attribute, resource management priority can play a role.

9. Character set support
The default character set in MySQL 8 has been changed from latin1 to utf8mb4, and a Japanese specific set, utf8mb4, has been added for the first time_ ja_ 0900_ as_ cs.

10. Optimizer enhancements
MySQL optimizer began to support hidden indexes and descending indexes. The hidden index will not be used by the optimizer. There is no need to delete the index when verifying the necessity of the index. Hide the index first. If the optimizer performance has no impact, the index can be really deleted. Descending indexes allow the optimizer to sort multiple columns and allow inconsistent sorting order.

11. Common table expression
Common Table Expressions are called CTEs for short. MySQL now supports both recursive and non recursive forms of CTEs. CTE names the temporary result set by using the WITH statement before the SELECT statement or other specific statements.

The basic syntax is as follows:

WITH cte_name (col_name1,col_name2 ...) AS (Subquery)
SELECT * FROM cte_name;

Subquery represents a subquery. Before the subquery, use the WITH statement to name the result set cte_name, CTE can be used in subsequent queries_ Name to query.

12. Window function
MySQL 8 starts to support window functions. Most aggregate functions that existed in previous versions can also be used as window functions in MySQL 8.

13. Regular expression support
MySQL after 8.0.4 adopts the international component library supporting Unicode to realize regular expression operation. This method can not only provide full Unicode support, but also multi byte secure coding. Regexp is added to MySQL_ LIKE(),EGEXP_INSTR(),REGEXP_REPLACE() and REGEXP_SUBSTR() and other functions to improve performance. In addition, regexp_stack_limit and regexp_ time_ The limit system variable can control resource consumption through the matching engine.

14. Internal temporary table
TempTable storage engine replaces MEMORY storage engine as the default storage engine for internal temporary tables. TempTable storage engine provides efficient storage for VARCHAR and VARBINARY columns. internal_ tmp_ mem_ storage_ The engine session variable defines the storage engine of the internal temporary table. There are two optional values, TempTable and MEMORY. TempTable is the default storage engine. TempTable_ max_ The ram system configuration item defines the maximum amount of MEMORY that the TempTable storage engine can use.

15. Logging
In MySQL 8, the error log subsystem consists of a series of MySQL components. These components are composed of the system variable log_error_services to filter and write log events.

16. Backup lock
The new backup lock allows data operation statements to be executed during online backup while blocking operations that may cause snapshot inconsistencies. The new backup lock is supported by the LOCK INSTANCE FOR BACKUP and UNLOCK INSTANCE syntax, which requires backup administrator privileges.

17. Enhanced MySQL replication
MySQL 8 replication supports binary log records that partially update JSON documents. The records use a compact binary format, thus saving space for recording complete JSON documents. When using statement based logging, this compact logging is done automatically, and you can use the new binlog_ row_ value_ The options system variable value is set to PARTIAL_JSON to enable.

1.2 old features removed by MySQL 8.0

If the application developed on MySQL version 5.7 uses the features removed by MySQL 8.0, the statement may fail or produce different execution results. In order to avoid these problems, for applications that use removed features, efforts should be made to correct and avoid the use of these features, and alternative methods should be used as far as possible.

1. Query cache
The query cache has been removed. The deleted items are:
**(1) Statements: * * FLUSH QUERY CACHE and RESET QUERY CACHE.
**(2) System variable: * * query_cache_limit,query_cache_min_res_unit,query_cache_size,query_cache_type,query_cache_wlock_invalidate.
**(3) Status variable: * * Qcache_free_blocks,Qcache_free_memory,Qcache_hits,Qcache_inserts,Qcache_lowmem_prunes,Qcache_not_cached,Qcache_queries_in_cache,Qcache_total_blocks.
**(4) Thread status: * * checking privileges on cached query, checking query cache for query, invalidating query cache entries, sending cached result to client, storing result in query cache, waiting for query cache lock.

2. Encryption related
Deleted encryption related contents include: ENCODE(), DECODE(), ENCRYPT(), DES_ENCRYPT() and DES_DECRYPT() function, configuration item des key file, system variable have_ Des of crypt, FLUSH statement_ KEY_ File option, HAVE_CRYPT CMake option.
For the removed ENCRYPT() function, consider using SHA2() instead. For other removed functions, use AES_ENCRYPT() and AES_ Replace with decrypt().

3. Spatial function correlation
In MySQL version 5.7, multiple spatial functions have been marked as obsolete. These obsolete functions have been removed in MySQL 8 and only the corresponding ST has been retained_ And MBR functions.

4.\N and NULL
In SQL statements, the parser no longer treats \ N as NULL, so NULL should be used instead of \ N in SQL statements. This change will not affect the import and export of operation files using LOAD DATA INFILE or SELECT... Into output. In such operations, NULL is still equivalent to \ N.

5. mysql_install_db
In MySQL distribution, MySQL has been removed_ install_ For DB program, the data dictionary initialization needs to call mysqld with the – initialize or – initialize execute option instead of the implementation. In addition, – bootstrap and INSTALL_SCRIPTDIR CMake has also been deleted.

6. General partition handler
The universal partition handler has been removed from the MySQL service. In order to implement a given table partition, the storage engine used by the table needs its own partition handler.
There are two MySQL storage engines that provide local partition support, namely InnoDB and NDB. In MySQL 8, only InnoDB is supported.

7. System and state variable information
In information_ In schema database, system and state variable information is no longer maintained. GLOBAL_VARIABLES,SESSION_VARIABLES,GLOBAL_STATUS,SESSION_ The status table has been deleted. In addition, the system variable show_compatibility_56 has also been deleted. The deleted state variables are Slave_heartbeat_period,Slave_last_heartbeat,Slave_received_heartbeats,Slave_retried_transactions,Slave_running. The above deleted content can be replaced by the corresponding content in the performance mode.

8.mysql_plugin tool
mysql_ The plugin tool is used to configure MySQL server plug-ins. It has been deleted. You can use the – plugin load or – plugin load Add option to load the plug-ins when the server is started, or use the INSTALL PLUGIN statement to load the plug-ins at runtime to replace the tool.

2. New feature 1: window function

2.1 comparison before and after using window function

Suppose I have such a data table, which shows the sales of a shopping website in each city and each district:

CREATE TABLE sales(
id INT PRIMARY KEY AUTO_INCREMENT,
city VARCHAR(15),
county VARCHAR(15),
sales_value DECIMAL

);

INSERT INTO sales(city,county,sales_value)
VALUES
('Beijing','Haidian',10.00),
('Beijing','Sunrise',20.00),
('Shanghai','Huangpu',30.00),
('Shanghai','Changning',10.00);

Query:

mysql> SELECT * FROM sales;
+----+------+--------+-------------+
| id | city | county | sales_value |
+----+------+--------+-------------+
|  1 | Beijing | Haidian    |          10 |
|  2 | Beijing | Sunrise    |          20 |
|  3 | Shanghai | Huangpu    |          30 |
|  4 | Shanghai | Changning    |          10 |
+----+------+--------+-------------+
4 rows in set (0.00 sec)

**Demand: * * now calculate the total sales of the website in each city, the total sales in the whole country, the ratio of the sales of each district to the sales of the city, and the ratio to the total sales.

If you use grouping and aggregation functions, you need to calculate in several steps.

The first step is to calculate the total sales amount and deposit it into temporary table a:

CREATE TEMPORARY TABLE a       -- Create temporary table
SELECT SUM(sales_value) AS sales_value -- Calculate total amount
FROM sales;

Take a look at the temporary table a:

mysql> SELECT * FROM a;
+-------------+
| sales_value |
+-------------+
|          70 |
+-------------+
1 row in set (0.00 sec)

The second step is to calculate the total sales of each city and store it in temporary table b:

CREATE TEMPORARY TABLE b    -- Create temporary table
SELECT city,SUM(sales_value) AS sales_value  -- Calculate total city sales
FROM sales
GROUP BY city;

View temporary table b:

mysql> SELECT * FROM b;
+------+-------------+
| city | sales_value |
+------+-------------+
| Beijing |          30 |
| Shanghai |          40 |
+------+-------------+
2 rows in set (0.00 sec)

The third step is to calculate the proportion of sales in each district to the total amount of the city, and the proportion to the total sales amount. We can obtain the required results through the following connection query:

mysql> SELECT s.city AS city,s.county AS area,s.sales_value AS District Sales,
    -> b.sales_value AS City Sales,s.sales_value/b.sales_value AS Market ratio,
    -> a.sales_value AS Total sales,s.sales_value/a.sales_value AS Total ratio
    -> FROM sales s
    -> JOIN b ON (s.city=b.city) -- Provisional statistical results of linked cities
    -> JOIN a                   -- Provisional statement of connection total amount
    -> ORDER BY s.city,s.county;
+------+------+----------+----------+--------+----------+--------+
| city  | area   | District Sales  | City Sales   | Market ratio  | Total sales  | Total ratio  |
+------+------+----------+----------+--------+----------+--------+
| Shanghai  | Changning |       10 |       40 | 0.2500 |       70 | 0.1429 |
| Shanghai  | Huangpu |       30 |       40 | 0.7500 |       70 | 0.4286 |
| Beijing  | Sunrise |       20 |       30 | 0.6667 |       70 | 0.2857 |
| Beijing  | Haidian |       10 |       30 | 0.3333 |       70 | 0.1429 |
+------+------+----------+----------+--------+----------+--------+
4 rows in set (0.00 sec)

The results show that the market sales amount, market sales proportion, total sales amount and total sales proportion are calculated.

The same query is much simpler if you use the window function. We can implement it with the following code:

mysql> SELECT city AS city,county AS area,sales_value AS District Sales,
    -> SUM(sales_value) OVER(PARTITION BY city) AS City Sales,  -- Calculate City Sales
    -> sales_value/SUM(sales_value) OVER(PARTITION BY city) AS Market ratio,
    -> SUM(sales_value) OVER() AS Total sales,   -- Calculate total sales
    -> sales_value/SUM(sales_value) OVER() AS Total ratio
    -> FROM sales
    -> ORDER BY city,county;
+------+------+----------+----------+--------+----------+--------+
| city  | area   | District Sales  | City Sales   | Market ratio  | Total sales  | Total ratio  |
+------+------+----------+----------+--------+----------+--------+
| Shanghai  | Changning |       10 |       40  | 0.2500 |       70 | 0.1429 |
| Shanghai  | Huangpu |       30 |       40  | 0.7500 |       70 | 0.4286 |
| Beijing  | Sunrise |       20 |       30  | 0.6667 |       70 | 0.2857 |
| Beijing  | Haidian |       10 |       30  | 0.3333 |       70 | 0.1429 |
+------+------+----------+-----------+--------+----------+--------+
4 rows in set (0.00 sec)

The results show that we get the same results as the above query.

Using the window function, the query is completed in one step. Moreover, because the temporary table is not used, the execution efficiency is higher. Obviously, in this scenario where the results of grouping statistics are used to calculate each record, it is better to use the window function.

2.2 window function classification

MySQL supports window functions from version 8.0. The function of window function is similar to grouping data in query. The difference is that grouping operation will aggregate the grouping results into one record, while window function places the results in each data record.

Window functions can be divided into static window functions and dynamic window functions.

  • The window size of the static window function is fixed and will not vary due to different records;
  • The window size of dynamic window function will change with different records.

The URL of the MySQL official website window function is https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_row-number.

Generally, window functions can be divided into sequence number functions, distribution functions, front and back functions, head and tail functions and other functions, as shown in the following table:

2.3 grammatical structure

The syntax structure of the window function is:

function OVER([PARTITION BY Field name ORDER BY Field name ASC|DESC])

Or:

function OVER Window name WINDOW Window name AS ([PARTITION BY Field name ORDER BY Field name ASC|DESC])
  • OVER keyword specifies the scope of the function window.
    • If the contents in the following brackets are omitted, the window will contain all records that meet the WHERE condition, and the window function will calculate based on all records that meet the WHERE condition.
    • If the parentheses after the OVER keyword are not empty, you can use the following syntax to set the window.
  • Window name: set an alias for the window to identify the window.
  • PARTITION BY clause: specifies which fields window functions are grouped by. After grouping, the window function can be executed separately in each grouping.
  • ORDER BY clause: specifies which fields the window function sorts by. Perform the sorting operation to make the window functions number in the order of the sorted data records.
  • FRAME clause: defines rules for a subset in a partition, which can be used as a sliding window.

2.4 classified explanation

Create table:

CREATE TABLE goods(
id INT PRIMARY KEY AUTO_INCREMENT,
category_id INT,
category VARCHAR(15),
NAME VARCHAR(30),
price DECIMAL(10,2),
stock INT,
upper_time DATETIME

);

Add data:

INSERT INTO goods(category_id,category,NAME,price,stock,upper_time)
VALUES
(1, 'Women's wear/Women's Boutique', 'T Shirt', 39.90, 1000, '2020-11-10 00:00:00'),
(1, 'Women's wear/Women's Boutique', 'Dress', 79.90, 2500, '2020-11-10 00:00:00'),
(1, 'Women's wear/Women's Boutique', 'Sweater', 89.90, 1500, '2020-11-10 00:00:00'),
(1, 'Women's wear/Women's Boutique', 'Jeans', 89.90, 3500, '2020-11-10 00:00:00'),
(1, 'Women's wear/Women's Boutique', 'Pleated skirt', 29.90, 500, '2020-11-10 00:00:00'),
(1, 'Women's wear/Women's Boutique', 'Woolen coat', 399.90, 1200, '2020-11-10 00:00:00'),
(2, 'outdoor sport', 'Bicycle', 399.90, 1000, '2020-11-10 00:00:00'),
(2, 'outdoor sport', 'Mountain Bike', 1399.90, 2500, '2020-11-10 00:00:00'),
(2, 'outdoor sport', 'Alpenstocks', 59.90, 1500, '2020-11-10 00:00:00'),
(2, 'outdoor sport', 'Riding equipment', 399.90, 3500, '2020-11-10 00:00:00'),
(2, 'outdoor sport', 'Sport coat', 799.90, 500, '2020-11-10 00:00:00'),
(2, 'outdoor sport', 'Skate', 499.90, 1200, '2020-11-10 00:00:00');

Next, verify the function of each window function against the data in the goods table.

1. Sequence number function

1.ROW_NUMBER() function

ROW_ The number() function can display the sequence numbers in the data in order.

For example: query the information of each commodity in descending order under each commodity category in the goods data table.

mysql> SELECT ROW_NUMBER() OVER(PARTITION BY category_id ORDER BY price DESC) AS row_num,
    -> id, category_id, category, NAME, price, stock
    -> FROM goods;
+---------+----+-------------+---------------+------------+---------+-------+
| row_num | id | category_id | category      | NAME       | price   | stock |
+---------+----+-------------+---------------+------------+---------+-------+
|       1 |  6 |           1 | Women's wear/Women's Boutique   | Woolen coat    |  399.90 |  1200 |
|       2 |  3 |           1 | Women's wear/Women's Boutique   | Sweater        |   89.90 |  1500 |
|       3 |  4 |           1 | Women's wear/Women's Boutique   | Jeans      |   89.90 |  3500 |
|       4 |  2 |           1 | Women's wear/Women's Boutique   | Dress      |   79.90 |  2500 |
|       5 |  1 |           1 | Women's wear/Women's Boutique   | T Shirt        |   39.90 |  1000 |
|       6 |  5 |           1 | Women's wear/Women's Boutique   | Pleated skirt      |   29.90 |   500 |
|       1 |  8 |           2 | outdoor sport       | Mountain Bike   | 1399.90 |  2500 |
|       2 | 11 |           2 | outdoor sport       | Sport coat     |  799.90 |   500 |
|       3 | 12 |           2 | outdoor sport       | Skate        |  499.90 |  1200 |
|       4 |  7 |           2 | outdoor sport       | Bicycle      |  399.90 |  1000 |
|       5 | 10 |           2 | outdoor sport       | Riding equipment    |  399.90 |  3500 |
|       6 |  9 |           2 | outdoor sport       | Alpenstocks      |   59.90 |  1500 |
+---------+----+-------------+---------------+------------+---------+-------+
12 rows in set (0.00 sec)

For example: query the information of the three commodities with the highest price under each commodity category in the goods data table.

mysql> SELECT *
    -> FROM (
    ->  SELECT ROW_NUMBER() OVER(PARTITION BY category_id ORDER BY price DESC) AS row_num,
    ->  id, category_id, category, NAME, price, stock
    ->  FROM goods) t
    -> WHERE row_num <= 3;
+---------+----+-------------+---------------+------------+---------+-------+
| row_num | id | category_id | category      | NAME       | price   | stock |
+---------+----+-------------+---------------+------------+---------+-------+
|       1 |  6 |           1 | Women's wear/Women's Boutique   | Woolen coat    |  399.90  |  1200 |
|       2 |  3 |           1 | Women's wear/Women's Boutique   | Sweater        |   89.90 |  1500 |
|       3 |  4 |           1 | Women's wear/Women's Boutique   | Jeans      |   89.90  |  3500 |
|       1 |  8 |           2 | outdoor sport       | Mountain Bike   | 1399.90  |  2500 |
|       2 | 11 |           2 | outdoor sport       | Sport coat     |  799.90  |   500 |
|       3 | 12 |           2 | outdoor sport       | Skate        |  499.90  |  1200 |
+---------+----+-------------+---------------+------------+----------+-------+
6 rows in set (0.00 sec)

In the commodity category named "women's clothing / women's boutique", the price of two commodities is 89.90 yuan, namely sanitary clothes and jeans. The serial number of both products should be 2, not 2 for one and 3 for the other. At this point, you can use the RANK() function and sense_ RANK() function.

2. RANK() function

Use RANK() function to sort serial numbers in parallel and skip repeated serial numbers, such as 1, 1 and 3.

For example, use the RANK() function to obtain the commodity information sorted from high to low price of each category in the goods data table.

mysql> SELECT RANK() OVER(PARTITION BY category_id ORDER BY price DESC) AS row_num,
    -> id, category_id, category, NAME, price, stock
    -> FROM goods;
+---------+----+-------------+---------------+------------+---------+-------+
| row_num | id | category_id | category      | NAME       | price   | stock |
+---------+----+-------------+---------------+------------+---------+-------+
|       1 |  6 |           1 | Women's wear/Women's Boutique   | Woolen coat    |  399.90 |  1200 |
|       2 |  3 |           1 | Women's wear/Women's Boutique   | Sweater        |   89.90 |  1500 |
|       2 |  4 |           1 | Women's wear/Women's Boutique   | Jeans      |   89.90 |  3500 |
|       4 |  2 |           1 | Women's wear/Women's Boutique   | Dress      |   79.90 |  2500 |
|       5 |  1 |           1 | Women's wear/Women's Boutique   | T Shirt         |   39.90 |  1000 |
|       6 |  5 |           1 | Women's wear/Women's Boutique   | Pleated skirt      |   29.90 |   500 |
|       1 |  8 |           2 | outdoor sport       | Mountain Bike   | 1399.90 |  2500 |
|       2 | 11 |           2 | outdoor sport       | Sport coat     |  799.90 |   500 |
|       3 | 12 |           2 | outdoor sport       | Skate        |  499.90 |  1200 |
|       4 |  7 |           2 | outdoor sport       | Bicycle      |  399.90 |  1000 |
|       4 | 10 |           2 | outdoor sport       | Riding equipment    |  399.90 |  3500 |
|       6 |  9 |           2 | outdoor sport       | Alpenstocks      |   59.90 |  1500 |
+---------+----+-------------+---------------+------------+---------+-------+
12 rows in set (0.00 sec)

For example, use RANK() function to obtain the information of the four products with the highest price in the category of "women's clothing / women's boutique" in the goods data table.

mysql> SELECT *
    -> FROM(
    ->  SELECT RANK() OVER(PARTITION BY category_id ORDER BY price DESC) AS row_num,
    ->  id, category_id, category, NAME, price, stock
    ->  FROM goods) t
    -> WHERE category_id = 1 AND row_num <= 4;
+---------+----+-------------+---------------+----------+--------+-------+
| row_num | id | category_id | category      | NAME     | price  | stock |
+---------+----+-------------+---------------+----------+--------+-------+
|       1 |  6 |           1 | Women's wear/Women's Boutique   | Woolen coat  | 399.90 |  1200 |
|       2 |  3 |           1 | Women's wear/Women's Boutique   | Sweater      |  89.90 |  1500 |
|       2 |  4 |           1 | Women's wear/Women's Boutique   | Jeans    |  89.90 |  3500 |
|       4 |  2 |           1 | Women's wear/Women's Boutique   | Dress    |  79.90 |  2500 |
+---------+----+-------------+---------------+----------+--------+-------+
4 rows in set (0.00 sec)

It can be seen that the serial numbers obtained by using RANK() function are 1, 2, 2 and 4. The commodity serial numbers of the same price are the same, and the subsequent commodity serial numbers are discontinuous. Repeated serial numbers are skipped.

3.DENSE_RANK() function

DENSE_RANK() function sorts serial numbers in parallel and does not skip duplicate serial numbers, such as 1, 1 and 2.

Example: use DENSE_RANK() function obtains the commodity information sorted from high to low price of each category in the goods data table.

mysql> SELECT DENSE_RANK() OVER(PARTITION BY category_id ORDER BY price DESC) AS row_num,
    -> id, category_id, category, NAME, price, stock
    -> FROM goods;
+---------+----+-------------+---------------+------------+---------+-------+
| row_num | id | category_id | category      | NAME       | price   | stock |
+---------+----+-------------+---------------+------------+---------+-------+
|       1 |  6 |           1 | Women's wear/Women's Boutique   | Woolen coat    |  399.90 |  1200 |
|       2 |  3 |           1 | Women's wear/Women's Boutique   | Sweater        |   89.90 |  1500 |
|       2 |  4 |           1 | Women's wear/Women's Boutique   | Jeans      |   89.90 |  3500 |
|       3 |  2 |           1 | Women's wear/Women's Boutique   | Dress      |   79.90 |  2500 |
|       4 |  1 |           1 | Women's wear/Women's Boutique   | T Shirt        |   39.90 |  1000 |
|       5 |  5 |           1 | Women's wear/Women's Boutique   | Pleated skirt      |   29.90 |   500 |
|       1 |  8 |           2 | outdoor sport       | Mountain Bike   | 1399.90 |  2500 |
|       2 | 11 |           2 | outdoor sport       | Sport coat     |  799.90 |   500 |
|       3 | 12 |           2 | outdoor sport       | Skate        |  499.90 |  1200 |
|       4 |  7 |           2 | outdoor sport       | Bicycle       |  399.90 |  1000 |
|       4 | 10 |           2 | outdoor sport       | Riding equipment     |  399.90 |  3500 |
|       5 |  9 |           2 | outdoor sport       | Alpenstocks       |   59.90 |  1500 |
+---------+----+-------------+---------------+------------+---------+-------+
12 rows in set (0.00 sec)

Example: use dense_ The rank() function obtains the information of the four products with the highest price in the category of "women's clothing / women's boutique" in the goods data table.

mysql> SELECT *
    -> FROM(
    ->  SELECT DENSE_RANK() OVER(PARTITION BY category_id ORDER BY price DESC) AS row_num,
    ->  id, category_id, category, NAME, price, stock
    ->  FROM goods) t
    -> WHERE category_id = 1 AND row_num <= 3;
+---------+----+-------------+---------------+----------+--------+-------+
| row_num | id | category_id | category      | NAME     | price  | stock |
+---------+----+-------------+---------------+----------+--------+-------+
|       1 |  6 |           1 | Women's wear/Women's Boutique   | Woolen coat  | 399.90 |  1200 |
|       2 |  3 |           1 | Women's wear/Women's Boutique   | Sweater     |  89.90  |  1500 |
|       2 |  4 |           1 | Women's wear/Women's Boutique   | Jeans    |  89.90 |  3500 |
|       3 |  2 |           1 | Women's wear/Women's Boutique   | Dress    |  79.90 |  2500 |
+---------+----+-------------+---------------+----------+--------+-------+
4 rows in set (0.00 sec)

As you can see, use dense_ The row numbers obtained by rank() function are 1, 2, 2 and 3. The commodity serial numbers of the same price are the same. The subsequent commodity serial numbers are continuous, and the repeated serial numbers are not skipped.

2. Distribution function

1.PERCENT_RANK() function

PERCENT_ The rank() function is a percentage function of the rank value. Calculate as follows.

 (rank - 1) / (rows - 1)

Where, the value of rank is the sequence number generated by RANK() function, and the value of rows is the total number of records in the current window.

For example: calculate the percent of goods under the category named "women's clothing / women's boutique" in the goods data sheet_ Rank value.

#Writing method 1:
SELECT RANK() OVER (PARTITION BY category_id ORDER BY price DESC) AS r,
PERCENT_RANK() OVER (PARTITION BY category_id ORDER BY price DESC) AS pr,
id, category_id, category, NAME, price, stock
FROM goods
WHERE category_id = 1;

#Method 2:
mysql> SELECT RANK() OVER w AS r,
    -> PERCENT_RANK() OVER w AS pr,
    -> id, category_id, category, NAME, price, stock
    -> FROM goods
    -> WHERE category_id = 1 WINDOW w AS (PARTITION BY category_id ORDER BY price DESC);
+---+-----+----+-------------+---------------+----------+--------+-------+
| r | pr  | id | category_id | category      | NAME     | price  | stock |
+---+-----+----+-------------+---------------+----------+--------+-------+
| 1 |   0 |  6 |           1 | Women's wear/Women's Boutique   | Woolen coat  | 399.90 |  1200 |
| 2 | 0.2 |  3 |           1 | Women's wear/Women's Boutique   | Sweater     |  89.90 |  1500 |
| 2 | 0.2 |  4 |           1 | Women's wear/Women's Boutique   | Jeans   |  89.90 |  3500  |
| 4 | 0.6 |  2 |           1 | Women's wear/Women's Boutique   | Dress   |  79.90 |  2500  |
| 5 | 0.8 |  1 |           1 | Women's wear/Women's Boutique   | T Shirt      |  39.90 |  1000 |
| 6 |   1 |  5 |           1 | Women's wear/Women's Boutique   | Pleated skirt   |  29.90  |   500 |
+---+-----+----+-------------+---------------+----------+--------+-------+
6 rows in set (0.00 sec)

2.CUME_DIST() function

CUME_ The dist() function is mainly used to query the proportion less than or equal to a value.

Example: query the proportion in the goods data table that is less than or equal to the current price.

mysql> SELECT CUME_DIST() OVER(PARTITION BY category_id ORDER BY price ASC) AS cd,
    -> id, category, NAME, price
    -> FROM goods;
+---------------------+----+---------------+------------+---------+
| cd                  | id | category      | NAME       | price   |
+---------------------+----+---------------+------------+---------+
| 0.16666666666666666 |  5 | Women's wear/Women's Boutique   | Pleated skirt      |   29.90 |
|  0.3333333333333333 |  1 | Women's wear/Women's Boutique   | T Shirt        |   39.90 |
|                 0.5 |  2 | Women's wear/Women's Boutique   | Dress      |   79.90 |
|  0.8333333333333334 |  3 | Women's wear/Women's Boutique   | Sweater        |   89.90 |
|  0.8333333333333334 |  4 | Women's wear/Women's Boutique   | Jeans      |   89.90 |
|                   1 |  6 | Women's wear/Women's Boutique   | Woolen coat    |  399.90 |
| 0.16666666666666666 |  9 | outdoor sport       | Alpenstocks      |   59.90 |
|                 0.5 |  7 | outdoor sport       | Bicycle      |  399.90 |
|                 0.5 | 10 | outdoor sport       | Riding equipment     |  399.90 |
|  0.6666666666666666 | 12 | outdoor sport       | Skate        |  499.90 |
|  0.8333333333333334 | 11 | outdoor sport       | Sport coat    |  799.90 |
|                   1 |  8 | outdoor sport       | Mountain Bike   | 1399.90 |
+---------------------+----+---------------+------------+---------+
12 rows in set (0.00 sec)

3. Front and back functions

1. LAG(expr,n) function

The LAG(expr,n) function returns the expr value of the first n rows of the current row.

Example: query the difference between the previous commodity price and the current commodity price in the goods data table.

mysql> SELECT id, category, NAME, price, pre_price, price - pre_price AS diff_price
    -> FROM (
    ->  SELECT  id, category, NAME, price,LAG(price,1) OVER w AS pre_price
    ->  FROM goods
    ->  WINDOW w AS (PARTITION BY category_id ORDER BY price)) t;
+----+---------------+------------+---------+-----------+------------+
| id | category      | NAME       | price   | pre_price | diff_price |
+----+---------------+------------+---------+-----------+------------+
|  5 | Women's wear/Women's Boutique   | Pleated skirt      |   29.90 |      NULL |       NULL |
|  1 | Women's wear/Women's Boutique   | T Shirt        |   39.90 |     29.90 |      10.00 |
|  2 | Women's wear/Women's Boutique   | Dress      |   79.90 |     39.90 |      40.00 |
|  3 | Women's wear/Women's Boutique   | Sweater       |   89.90 |     79.90 |      10.00 |
|  4 | Women's wear/Women's Boutique   | Jeans      |   89.90 |     89.90 |       0.00 |
|  6 | Women's wear/Women's Boutique   | Woolen coat    |  399.90 |     89.90 |     310.00 |
|  9 | outdoor sport       | Alpenstocks      |   59.90 |      NULL |       NULL |
|  7 | outdoor sport       | Bicycle      |  399.90 |     59.90 |     340.00 |
| 10 | outdoor sport       | Riding equipment    |  399.90 |    399.90 |       0.00 |
| 12 | outdoor sport       | Skate       |  499.90 |    399.90 |     100.00 |
| 11 | outdoor sport       | Sport coat    |  799.90 |    499.90 |     300.00 |
|  8 | outdoor sport       | Mountain Bike  | 1399.90 |    799.90 |     600.00 |
+----+---------------+------------+---------+-----------+------------+
12 rows in set (0.00 sec)

2. LEAD(expr,n) function

The LEAD(expr,n) function returns the expr value of the next N lines of the current line.

Example: query the difference between the last commodity price and the current commodity price in the goods data table.

mysql> SELECT id, category, NAME, behind_price, price,behind_price - price AS diff_price
    -> FROM(
    ->  SELECT id, category, NAME, price,LEAD(price, 1) OVER w AS behind_price
    ->  FROM goods WINDOW w AS (PARTITION BY category_id ORDER BY price)) t;
+----+---------------+------------+--------------+---------+------------+
| id | category      | NAME       | behind_price | price   | diff_price |
+----+---------------+------------+--------------+---------+------------+
|  5 | Women's wear/Women's Boutique   | Pleated skirt      |        39.90 |   29.90 |      10.00 |
|  1 | Women's wear/Women's Boutique   | T Shirt        |        79.90 |   39.90 |      40.00 |
|  2 | Women's wear/Women's Boutique   | Dress      |        89.90 |   79.90 |      10.00 |
|  3 | Women's wear/Women's Boutique   | Sweater        |        89.90 |   89.90 |       0.00 |
|  4 | Women's wear/Women's Boutique   | Jeans      |       399.90 |   89.90 |     310.00 |
|  6 | Women's wear/Women's Boutique   | Woolen coat     |         NULL |  399.90 |       NULL |
|  9 | outdoor sport       | Alpenstocks       |       399.90 |   59.90 |     340.00 |
|  7 | outdoor sport       | Bicycle       |       399.90 |  399.90 |       0.00 |
| 10 | outdoor sport       | Riding equipment     |       499.90 |  399.90 |     100.00 |
| 12 | outdoor sport       | Skate        |       799.90 |  499.90 |     300.00 |
| 11 | outdoor sport       | Sport coat     |      1399.90 |  799.90 |     600.00 |
|  8 | outdoor sport       | Mountain Bike   |         NULL | 1399.90 |       NULL |
+----+---------------+------------+--------------+---------+------------+
12 rows in set (0.00 sec)

4. Head and tail function

1.FIRST_VALUE(expr) function

FIRST_ The value (expr) function returns the value of the first expr.

Example: sort by price to query the price information of the first commodity.

mysql> SELECT id, category, NAME, price, stock,FIRST_VALUE(price) OVER w AS first_price
    -> FROM goods WINDOW w AS (PARTITION BY category_id ORDER BY price);
+----+---------------+------------+---------+-------+-------------+
| id | category      | NAME       | price   | stock | first_price |
+----+---------------+------------+---------+-------+-------------+
|  5 | Women's wear/Women's Boutique   | Pleated skirt      |   29.90 |   500 |       29.90 |
|  1 | Women's wear/Women's Boutique   | T Shirt        |   39.90 |  1000 |       29.90 |
|  2 | Women's wear/Women's Boutique   | Dress      |   79.90 |  2500 |       29.90 |
|  3 | Women's wear/Women's Boutique   | Sweater       |   89.90 |  1500 |       29.90 |
|  4 | Women's wear/Women's Boutique   | Jeans      |   89.90 |  3500 |       29.90 |
|  6 | Women's wear/Women's Boutique   | Woolen coat    |  399.90 |  1200 |       29.90 |
|  9 | outdoor sport       | Alpenstocks      |   59.90 |  1500 |       59.90 |
|  7 | outdoor sport       | Bicycle      |  399.90 |  1000 |       59.90 |
| 10 | outdoor sport       | Riding equipment    |  399.90 |  3500 |       59.90 |
| 12 | outdoor sport       | Skate       |  499.90 |  1200 |       59.90 |
| 11 | outdoor sport       | Sport coat    |  799.90 |   500 |       59.90 |
|  8 | outdoor sport       | Mountain Bike  | 1399.90 |  2500 |       59.90 |
+----+---------------+------------+---------+-------+-------------+
12 rows in set (0.00 sec)

2.LAST_VALUE(expr) function

LAST_ The value (expr) function returns the value of the last expr.

Example: sort by price to query the price information of the last commodity.

mysql> SELECT id, category, NAME, price, stock,LAST_VALUE(price) OVER w AS last_price
    -> FROM goods WINDOW w AS (PARTITION BY category_id ORDER BY price);
+----+---------------+------------+---------+-------+------------+
| id | category      | NAME       | price   | stock | last_price |
+----+---------------+------------+---------+-------+------------+
|  5 | Women's wear/Women's Boutique   | Pleated skirt     |   29.90 |   500 |      29.90 |
|  1 | Women's wear/Women's Boutique   | T Shirt        |   39.90 |  1000 |      39.90 |
|  2 | Women's wear/Women's Boutique   | Dress     |   79.90 |  2500 |      79.90 |
|  3 | Women's wear/Women's Boutique   | Sweater       |   89.90 |  1500 |      89.90 |
|  4 | Women's wear/Women's Boutique   | Jeans     |   89.90 |  3500 |      89.90 |
|  6 | Women's wear/Women's Boutique   | Woolen coat   |  399.90 |  1200 |     399.90 |
|  9 | outdoor sport       | Alpenstocks     |   59.90 |  1500 |      59.90 |
|  7 | outdoor sport       | Bicycle     |  399.90 |  1000 |     399.90 |
| 10 | outdoor sport       | Riding equipment   |  399.90 |  3500 |     399.90 |
| 12 | outdoor sport       | Skate       |  499.90 |  1200 |     499.90 |
| 11 | outdoor sport       | Sport coat   |  799.90 |   500 |     799.90 |
|  8 | outdoor sport       | Mountain Bike | 1399.90 |  2500 |    1399.90 |
+----+---------------+------------+---------+-------+------------+
12 rows in set (0.00 sec)

5. Other functions

1.NTH_VALUE(expr,n) function

NTH_ The value (expr, n) function returns the value of the nth expr.

For example: query the price information of the second and third in the goods data table.

mysql> SELECT id, category, NAME, price,NTH_VALUE(price,2) OVER w AS second_price,
    -> NTH_VALUE(price,3) OVER w AS third_price
    -> FROM goods WINDOW w AS (PARTITION BY category_id ORDER BY price);
+----+---------------+------------+---------+--------------+-------------+
| id | category      | NAME       | price   | second_price | third_price |
+----+---------------+------------+---------+--------------+-------------+
|  5 | Women's wear/Women's Boutique   | Pleated skirt     |   29.90 |         NULL |        NULL |
|  1 | Women's wear/Women's Boutique   | T Shirt        |   39.90 |        39.90 |        NULL |
|  2 | Women's wear/Women's Boutique   | Dress     |   79.90 |        39.90 |       79.90 |
|  3 | Women's wear/Women's Boutique   | Sweater       |   89.90 |        39.90 |       79.90 |
|  4 | Women's wear/Women's Boutique   | Jeans     |   89.90 |        39.90 |       79.90 |
|  6 | Women's wear/Women's Boutique   | Woolen coat   |  399.90 |        39.90 |       79.90 |
|  9 | outdoor sport       | Alpenstocks     |   59.90 |         NULL |        NULL |
|  7 | outdoor sport       | Bicycle     |  399.90 |       399.90 |      399.90 |
| 10 | outdoor sport       | Riding equipment   |  399.90 |       399.90 |      399.90 |
| 12 | outdoor sport       | Skate       |  499.90 |       399.90 |      399.90 |
| 11 | outdoor sport       | Sport coat   |  799.90 |       399.90 |      399.90 |
|  8 | outdoor sport       | Mountain Bike | 1399.90 |       399.90 |      399.90 |
+----+---------------+------------+---------+--------------+-------------+
12 rows in set (0.00 sec)

2. NTILE(n) function

The NTILE(n) function divides the ordered data in the partition into n buckets and records the bucket number.

For example: divide the goods in the goods table into three groups according to the price.

mysql> SELECT NTILE(3) OVER w AS nt,id, category, NAME, price
    -> FROM goods WINDOW w AS (PARTITION BY category_id ORDER BY price);
+----+----+---------------+------------+---------+
| nt | id | category      | NAME       | price   |
+----+----+---------------+------------+---------+
|  1 |  5 | Women's wear/Women's Boutique | Pleated skirt     |   29.90 |
|  1 |  1 | Women's wear/Women's Boutique | T Shirt        |   39.90 |
|  2 |  2 | Women's wear/Women's Boutique | Dress     |   79.90 |
|  2 |  3 | Women's wear/Women's Boutique | Sweater       |   89.90 |
|  3 |  4 | Women's wear/Women's Boutique | Jeans     |   89.90 |
|  3 |  6 | Women's wear/Women's Boutique | Woolen coat   |  399.90 |
|  1 |  9 | outdoor sport      | Alpenstocks     |   59.90 |
|  1 |  7 | outdoor sport      | Bicycle     |  399.90 |
|  2 | 10 | outdoor sport      | Riding equipment   |  399.90 |
|  2 | 12 | outdoor sport      | Skate       |  499.90 |
|  3 | 11 | outdoor sport      | Sport coat   |  799.90 |
|  3 |  8 | outdoor sport      | Mountain Bike | 1399.90 |
+----+----+---------------+------------+---------+
12 rows in set (0.00 sec)

2.5 summary

The characteristic of window function is that it can be grouped and sorted within the group. In addition, the window function will not reduce the number of rows in the original table because of grouping, which is very useful for us to count and sort based on the data in the original table.

3. New feature 2: common table expression

Common Table Expressions (or Common Table Expressions) are called CTE (Common Table Expressions) for short. CTE is a named temporary result set whose scope is the current statement. CTE can be understood as a reusable subquery. Of course, it is somewhat different from subquery. CTE can refer to other CTEs, but subquery cannot refer to other subqueries. Therefore, you can consider replacing subqueries.

According to the different syntax structure and execution mode, common table expressions are divided into ordinary common table expressions and recursive common table expressions.

3.1 common table expression

The syntax structure of a common table expression is:

WITH CTE name 
AS (Subquery)
SELECT|DELETE|UPDATE sentence;

A common table expression is similar to a subquery, but unlike a subquery, it can be referenced multiple times and can be referenced by other common table expressions.

Example: query the details of the employee's department.

mysql> SELECT * FROM departments
    -> WHERE department_id IN (
    ->                  SELECT DISTINCT department_id
    ->                  FROM employees
    ->                  );
+---------------+------------------+------------+-------------+
| department_id | department_name  | manager_id | location_id |
+---------------+------------------+------------+-------------+
|            10 | Administration   |        200 |        1700 |
|            20 | Marketing        |        201 |        1800 |
|            30 | Purchasing       |        114 |        1700 |
|            40 | Human Resources  |        203 |        2400 |
|            50 | Shipping         |        121 |        1500 |
|            60 | IT               |        103 |        1400 |
|            70 | Public Relations |        204 |        2700 |
|            80 | Sales            |        145 |        2500 |
|            90 | Executive        |        100 |        1700 |
|           100 | Finance          |        108 |        1700 |
|           110 | Accounting       |        205 |        1700 |
+---------------+------------------+------------+-------------+
11 rows in set (0.00 sec)

This query can also be completed by using common table expressions:

mysql> WITH emp_dept_id
    -> AS (SELECT DISTINCT department_id FROM employees)
    -> SELECT *
    -> FROM departments d JOIN emp_dept_id e
    -> ON d.department_id = e.department_id;
+---------------+------------------+------------+-------------+---------------+
| department_id | department_name  | manager_id | location_id | department_id |
+---------------+------------------+------------+-------------+---------------+
|            90 | Executive        |        100 |        1700 |            90 |
|            60 | IT               |        103 |        1400 |            60 |
|           100 | Finance          |        108 |        1700 |           100 |
|            30 | Purchasing       |        114 |        1700 |            30 |
|            50 | Shipping         |        121 |        1500 |            50 |
|            80 | Sales            |        145 |        2500 |            80 |
|            10 | Administration   |        200 |        1700 |            10 |
|            20 | Marketing        |        201 |        1800 |            20 |
|            40 | Human Resources  |        203 |        2400 |            40 |
|            70 | Public Relations |        204 |        2700 |            70 |
|           110 | Accounting       |        205 |        1700 |           110 |
+---------------+------------------+------------+-------------+---------------+
11 rows in set (0.00 sec)

The example shows that the common table expression can play the role of sub query. If you need to use a subquery in the future, you can define a common table expression before the query, and then use it to replace the subquery in the query. Moreover, compared with subqueries, a common table expression has an advantage, that is, the query after defining the common table expression can reference the common table expression multiple times like a table, while the subquery cannot.

3.2 recursive common table expression

Recursive common table expression is also a common table expression. However, in addition to the characteristics of common common table expression, it also has its own characteristics, that is, it can call itself. Its syntax structure is:

WITH RECURSIVE
CTE name AS (Subquery)
SELECT|DELETE|UPDATE sentence;

The recursive common table expression consists of two parts: seed query and recursive query, which are connected through the keyword UNION [ALL]. The seed query here means to obtain the initial value of recursion. This query will be run only once to create the initial data set, and then the recursive query will be executed until no new query data is generated and returned recursively.

**Case: * * for our commonly used Employees table, include employees_ id,last_name and manager_id three fields. If a is the manager of b, then we can call b the subordinate of A. if b is the manager of c, then c is the subordinate of b and the subordinate of A.

Next, we try to list the information of all employees with subordinate identities with query statements.

If we use the knowledge we have learned before to solve it, it will be more complex. It will take at least 4 queries to solve it:

  • The first step is to find out the first generation of managers, that is, those who do not take anyone else as the manager, and store the results in the temporary table;

  • The second step is to find out all the people with the first generation manager as the manager, get a subordinate set, and store the results in the temporary table;

  • The third step is to find out all the following managers, get a subordinate set, and store the results in the temporary table.

  • Step 4: find out all the following subordinates as managers and get a result set.

If the result set of step 4 is empty, the calculation ends, and the result set of step 3 is the lower subordinate set we need. Otherwise, we must continue with step 4 until the result set is empty. For example, the above data table needs to go to step 5 to get an empty result set. And finally, we have to go to step 6: merge the result sets of step 3 and step 4, so as to finally get the result set we need.

If you use recursive common table expressions, it is very simple. Let me introduce the specific ideas.

  • Use the seed query in the recursive common table expression to find the first generation manager. The field n represents the generation, and the initial value is 1, indicating that it is the first generation manager.

  • Use the recursive query in the recursive common table expression to find out the person who takes the person in the recursive common table expression as the manager, and add 1 to the value of the generation. Until no one takes the person in the recursive common table expression as the manager, recursive return.

  • In the final query, select all people whose generations are greater than or equal to 3. They must be subordinates of the third generation and above, that is, subordinates. So we get the result set we need.

It seems that there are three steps here. In fact, it is three parts of a query, which only needs to be executed once. Moreover, there is no need to save the intermediate results with a temporary table, which is much simpler than the method just described.

Code implementation:

WITH RECURSIVE cte 
AS 
(
SELECT employee_id,last_name,manager_id,1 AS n FROM employees WHERE employee_id = 100 -- Seed query to find the first generation of leaders
UNION ALL
SELECT a.employee_id,a.last_name,a.manager_id,n+1 FROM employees AS a JOIN cte
ON (a.manager_id = cte.employee_id) -- Recursive query to find the person led by the person with recursive common table expression
)
SELECT employee_id,last_name FROM cte WHERE n >= 3; 

In short, recursive common table expressions are very useful for querying tree structured data with a common root node. It can easily find out the data of all nodes without hierarchical restrictions. If you use other query methods, it will be more complex.

3.3 summary

Common table expressions can replace subqueries and can be referenced multiple times. The recursive common table expression is very efficient for the tree structure data with a common root node for queries, and can easily handle queries that are difficult to handle by other query methods.

Tags: MySQL SQL

Posted on Tue, 30 Nov 2021 10:22:34 -0500 by matt86