SQL statement optimization instance of MySQL multi table associated query

Today, let's practice how to optimize the SQL statements of two or more tables combined query.

Let's create two tables and prepare some data first

create table teacher(
  tid int(4) primary key,
  cid int(4) not null
);
insert teacher values(1,2);
insert teacher values(2,1);
insert teacher values(3,3);


create table course(
  cid int(4) ,
  cname varchar(20)
);
insert course values(1,'java');
insert course values(2,'javaweb');
insert course values(3,'spring');

Next let's write a join query and see the execution plan.

mysql> explain select * from teacher t left outer join course c on t.cid=c.cid where c.cname='java';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                      |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
|  1 | SIMPLE      | c     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |    33.33 | Using where                                |
|  1 | SIMPLE      | t     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |    33.33 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

From the execution plan result, we can see that the type is all, which means that there is no optimization, and the using join buffer appears. What does it mean? There is an sql optimizer in the mysql logic layer. Because the performance of the sql we wrote is too poor, it automatically adds the connection buffer for us.

So how can we optimize it? Even if the table we should add to, remember the rules below.

  • Small tables drive large tables. When writing sql statements, write small tables first and then large tables
  • Index added to frequently used fields
  • Left connected to the left table and right connected to the right table

Let's go on. According to the above rules, if we want to add indexes, we need to add them to the left connected table. That is, after adding indexes to the cid of the teacher table, we can see the execution plan.

alter table teacher add index index_cid(cid);

mysql> explain select * from teacher t left outer join course c on t.cid=c.cid where c.cname='java';
+----+-------------+-------+------------+------+---------------+-----------+---------+---------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref                 | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+-----------+---------+---------------------+------+----------+-------------+
|  1 | SIMPLE      | c     | NULL       | ALL  | NULL          | NULL      | NULL    | NULL                |    3 |    33.33 | Using where |
|  1 | SIMPLE      | t     | NULL       | ref  | index_cid     | index_cid | 4       | explain_test1.c.cid |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+-----------+---------+---------------------+------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)

From the execution plan result, the teacher table has been optimized to the ref level, which greatly improves the efficiency. Because the retrieval condition is to search by cname, let's try to add an index to cname, and then we will analyze the execution plan.

alter table course add index index_cname(cname);

mysql> explain select * from teacher t left outer join course c on t.cid=c.cid where c.cname='java';
+----+-------------+-------+------------+------+---------------+-------------+---------+---------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref                 | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+-------------+---------+---------------------+------+----------+-------------+
|  1 | SIMPLE      | c     | NULL       | ref  | index_cname   | index_cname | 83      | const               |    1 |   100.00 | Using where |
|  1 | SIMPLE      | t     | NULL       | ref  | index_cid     | index_cid   | 4       | explain_test1.c.cid |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+-------------+---------+---------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

Similarly, the two tables have been upgraded to ref level, and the performance has been greatly increased. Here, our example is the optimization of the combination of two tables, and the principle of the combination of three or four tables is the same. Small tables drive large tables, and indexes are added to frequently used fields.

At the end of the article, we have finished today's article with tips on how to operate index fields.

  • Composite index should not be used across columns or out of order
  • Try to use full index matching
  • Do not perform any operations on the index, such as calculation, function, type conversion (explicit or implicit), etc
  • Index column cannot use not equal to (! =, < >) or greater than (> or less than (< or is null or is not nul
  • Try to reach using index in index optimization
  • Don't start with% when writing like statement
  • Index column cannot use or

Tags: SQL Java MySQL Spring

Posted on Fri, 12 Jun 2020 02:24:32 -0400 by grumpy