To learn about MySQL partition related concepts, usage scenarios and restrictions, click here
Sample tables for this article
Create table
CREATE TABLE `student_1` ( `ID` int(11) NOT NULL AUTO_INCREMENT, `NAME` varchar(30) NOT NULL, `SEX` char(2) NOT NULL, `AGE` int(11) NOT NULL, `CLASS` varchar(10) NOT NULL, `GRADE` varchar(20) NOT NULL, `HOBBY` varchar(100) DEFAULT NULL, `CREATE_TIME` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`ID`) USING BTREE ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
Insert 20 million data
DROP PROCEDURE IF EXISTS inst_data; CREATE PROCEDURE inst_data() BEGIN DECLARE incr INT DEFAULT 1; WHILE incr <= 20000000 DO INSERT INTO `study`.`student_1` (`NAME`, `SEX`, `AGE`, `CLASS`, `GRADE`, `HOBBY` ,`CREATE_TIME`) VALUES ( CONCAT( 'jsontom', incr), CASE WHEN incr % 2 = 0 THEN 'male' ELSE 'female' END, FLOOR(18 + ( RAND() * 7 )), CONCAT(1 + FLOOR(RAND() * 10 ), 'class' ), CONCAT(1 + FLOOR(RAND() * 10 ), 'grade' ), CONCAT( 'Love to see',1 + FLOOR(RAND() * 9 ),'A film'), NOW() ); SET incr = incr + 1; END WHILE; SELECT CONCAT('Number of inserts ',incr); END; CALL inst_data ( );
range partition management
range partitioning tables for existing data
If you want to use one or some fields as partition fields, they must all be included in the primary key or a unique index.
The direct partition in the above table will report the following errors
1503 - A PRIMARY KEY must include all columns in the table's partitioning function
The following partition fields are CREATE_TIME, the table has no other unique index except the primary key id.
In the first way, we combine id and create_ The time field is used as the federated primary key.
The second way is to create a unique index containing CREATE_TIME field.
In this paper, we adopt the first one, which is selected according to the actual situation in reality.
The partition statement is as follows
ALTER TABLE student_1 PARTITION BY RANGE ( UNIX_TIMESTAMP(CREATE_TIME) ) ( PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 10:04:42') ), PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 12:44:02') ), PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 15:23:22') ), PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 18:02:42') ), PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 20:42:03') ), PARTITION pmax VALUES LESS THAN (UNIX_TIMESTAMP('2021-11-21 00:00:00')) );
New partition
ALTER TABLE student_1 ADD PARTITION (PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-26 20:42:03') ));
Reorganize partition
Note: for the new partition generated after merging / splitting partitions, the amount of data queried for the new partition is zero, but the data is real.
Merge partitions
Merge p1 and p2 partitions into p1_p2 partition
ALTER TABLE student_1 REORGANIZE PARTITION p1,p2 INTO ( PARTITION p1_p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 12:44:02') ) );
Split partition
Put p1_ The p2 partition is split into p1 and p2 partitions
ALTER TABLE student_1 REORGANIZE PARTITION p1_p2 INTO ( PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 10:04:42') ), PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 12:44:02') ) );
View the amount of data in the partition
SELECT PARTITION_NAME, TABLE_ROWS FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME = 'student_1';
delete a partition
Note: when a partition is deleted, all data in the partition is also deleted.
ALTER TABLE student_1 DROP PARTITION p1;
Cancel partition
In this way, the data can be retained and restored to an ordinary table
alter table student_1 remove partitioning;
Efficiency comparison
For the following SQL
select * from student_1 where CREATE_TIME = STR_TO_DATE('2021-11-19 18:02:42','%Y-%m-%d %H:%i:%s')
Queries that do not use indexes and partitions
The full table scanning query took 16.960s
Queries using indexes
Add index
alter table student_1 add index crtm(CREATE_TIME);
Query time: 1.961s
Use partition query
Add partition
ALTER TABLE student_1 PARTITION BY RANGE ( UNIX_TIMESTAMP(CREATE_TIME) ) ( PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 10:04:42') ), PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 12:44:02') ), PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 15:23:22') ), PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 18:02:42') ), PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-11-19 20:42:03') ), PARTITION pmax VALUES LESS THAN (UNIX_TIMESTAMP('2021-11-21 00:00:00')) );
The query took 3.692s
Use partition + index query
Partition the table and create_ Adding an index to the time field takes 0.614s to query
summary
Efficiency from high to low
Use partition + index > use index > use partition > do not use index and partition
Therefore, when the effect of using index is poor or the cost of index maintenance is high, we can consider introducing partition to improve efficiency.
be careful
When accessing partitioned tables, it is very important to bring partitioned columns (not expressions containing columns) in the WHERE condition. Sometimes it seems redundant, so that the optimizer can filter out partitions that do not need to be accessed. Without such a condition, all partitions will be accessed, which is expensive.
EXPLAIN PARTITIONS to see if the SQL statement performs partition filtering