"If you still don't get it, come hit me" series: a minimalist guide to the mybatis cache [tracing the SQL execution flow through the source code]

mybatis version: 3.4.6

Digression

As an engineer, whenever I learn a framework or technology, I think there are three questions to consider.

  1. What is it?

    What is this thing? How do I use it? In what scenario does it need to be used?

  2. Why?

    Why is there this thing and what problems can it solve?

  3. How?

    A framework or technology is a tool, and tools sometimes misbehave. What do you do when the tool doesn't work well?

    Fix it!

    But how do you fix something whose principles you don't understand? So we should grasp at least its basic principles and ideas, and how it works, in order to use it with confidence

Yesterday I spent a day reviewing mybatis, both how to use it and the general flow of its source code. I finally understood what my teacher meant: "once source code analysis clicks, you can't stop; you follow it all the way down to the bottom layer".

I went to bed at midnight last night, still thinking about the mybatis second-level cache. Suddenly something felt off, a point I couldn't get past, and I couldn't sleep. So I got out of bed, turned on the computer, and kept at it until 1 a.m., when I was finally satisfied. That night I dreamt I was shuttling between source files in IDEA. My head was spinning.

Below I summarize how mybatis caching works. Treat it as my personal study notes and writing practice. Enough talk, let's get started.

Main course

mybatis has two kinds of cache: the L1 (first-level) cache and the L2 (second-level) cache

Let's describe them along the three aspects above

  1. What is it?

    mybatis cache saves the query results of the database to memory (or hard disk), so that the next time the same query is executed, the results can be directly fetched from memory without going through the database

  2. Why?

    For repeated queries, a cache improves response speed and reduces the load on the database. It suits scenarios that demand fast responses but can tolerate slightly stale data. A cache has to deal with data consistency, that is, the data in the cache may not match the data in the database, so a cache generally sets a refresh interval or is refreshed when certain operations are performed.

  3. How?

    The core cache class in mybatis is PerpetualCache, which stores its data in a HashMap under the hood. The details are broken down below.
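To make the "a cache is basically a HashMap" idea concrete, here is a minimal sketch. It is a simplified stand-in written for this article, not the mybatis source: the class name SimplePerpetualCache and the demo keys are assumptions, and only the method names are borrowed from the mybatis Cache interface.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a HashMap-backed cache. This is an illustration, not the MyBatis source.
class SimplePerpetualCache {
    private final String id; // identifies the cache, e.g. a mapper namespace
    private final Map<Object, Object> cache = new HashMap<>();

    SimplePerpetualCache(String id) {
        this.id = id;
    }

    String getId() { return id; }

    void putObject(Object key, Object value) { cache.put(key, value); }

    Object getObject(Object key) { return cache.get(key); }

    Object removeObject(Object key) { return cache.remove(key); }

    void clear() { cache.clear(); }

    int getSize() { return cache.size(); }
}

class SimplePerpetualCacheDemo {
    public static void main(String[] args) {
        SimplePerpetualCache cache = new SimplePerpetualCache("demo.UserMapper");
        cache.putObject("findUserById:1", "user-1");
        System.out.println(cache.getObject("findUserById:1")); // served from the map
        cache.clear();
        System.out.println(cache.getObject("findUserById:1")); // null after clearing
    }
}
```

Everything else a cache needs (eviction, copying, transactional behaviour) can be layered on top of such a class, which is the approach the rest of this article walks through.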

First, let's summarize the general operation process of mybatis and the classes involved

Operation process

  1. Parse the global configuration file using XPath
  2. Resolve the global settings, such as
    • Whether the L2 cache is enabled
    • Whether lazy loading is enabled
    • Whether plug-ins are used
    • Type aliases
    • Database connection information
  3. Parse the mapper mapping files and encapsulate each CRUD tag into a MappedStatement
  4. Encapsulate all of this information into the Configuration object (a heavyweight object)
  5. Encapsulate the Configuration into the SqlSessionFactory
  6. Open a new SqlSession each time SqlSessionFactory.openSession() is called
  7. Each SqlSession holds a private Executor object and the shared Configuration object
  8. For each operation, a MappedStatement is selected and executed by the Executor, including SQL statement assembly, parameter parsing, result parsing and so on

It is represented by a rough diagram

Key classes

  • Configuration

    Holds all configuration information

  • MappedStatement

    A CRUD tag corresponds to a MappedStatement

  • Executor hierarchy

    The core mybatis executor; all database operations go through an Executor

  • SqlSource hierarchy

    Each CRUD tag is encapsulated as a SqlSource, which is used to assemble the SQL statement

  • SqlNode hierarchy

    Dynamic SQL tags are stored as a tree. A SqlSource has a rootSqlNode. Each SqlNode has an apply method, which is used to splice the SQL statement together when the Executor runs. The design of SqlNode uses the Composite pattern.

  • StatementHandler

    Generate a Statement, set parameters, and execute a query

  • ParameterHandler

    When setting parameters, perform parameter parsing, type conversion, etc

  • ResultSetHandler

    Obtain the result set and encapsulate the result
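The Composite idea behind SqlNode mentioned in the list above can be sketched as follows. This is a toy model under stated assumptions: ToySqlNode, TextNode, IfPresentNode and MixedNode are hypothetical names invented here, and the "if" condition is a plain null check, whereas the real mybatis nodes work against a context object and evaluate OGNL expressions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node types for illustration only; not the MyBatis SqlNode API.
interface ToySqlNode {
    void apply(StringBuilder sql, Map<String, Object> params);
}

// Leaf: a fragment of static SQL text.
class TextNode implements ToySqlNode {
    private final String text;
    TextNode(String text) { this.text = text; }
    @Override
    public void apply(StringBuilder sql, Map<String, Object> params) {
        sql.append(text).append(' ');
    }
}

// Conditional: applies its child only when the named parameter is non-null.
class IfPresentNode implements ToySqlNode {
    private final String paramName;
    private final ToySqlNode child;
    IfPresentNode(String paramName, ToySqlNode child) {
        this.paramName = paramName;
        this.child = child;
    }
    @Override
    public void apply(StringBuilder sql, Map<String, Object> params) {
        if (params.get(paramName) != null) {
            child.apply(sql, params);
        }
    }
}

// Composite: applies its children in order.
class MixedNode implements ToySqlNode {
    private final List<ToySqlNode> children;
    MixedNode(List<ToySqlNode> children) { this.children = children; }
    @Override
    public void apply(StringBuilder sql, Map<String, Object> params) {
        for (ToySqlNode child : children) {
            child.apply(sql, params);
        }
    }
}

class SqlNodeDemo {
    static String build(Map<String, Object> params) {
        ToySqlNode root = new MixedNode(Arrays.<ToySqlNode>asList(
                new TextNode("SELECT * FROM user WHERE 1 = 1"),
                new IfPresentNode("name", new TextNode("AND name = #{name}"))));
        StringBuilder sql = new StringBuilder();
        root.apply(sql, params); // one call on the root walks the whole tree
        return sql.toString().trim();
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        System.out.println(build(params));
        params.put("name", "tom");
        System.out.println(build(params));
    }
}
```

The caller only ever talks to the root node; whether a node is a leaf or a container is invisible to it, which is the point of the pattern.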

L1 cache

  • Scope: SqlSession (default)
  • Holder: BaseExecutor
  • It is enabled by default (in fact it cannot be turned off, although some settings can make it ineffective)
  • Cache clearing:
    • When an insert, delete or update is performed in the same SqlSession (no commit required), the L1 cache is cleared
    • When the SqlSession is committed or closed, the L1 cache is cleared (closing a query-only session also flushes its pending L2 cache entries, as a commit would)
    • Setting the attribute flushCache=true on a CRUD tag in mapper.xml invalidates both the L1 cache and the L2 cache for that tag
    • Setting <setting name="localCacheScope" value="STATEMENT"/> in the global configuration file effectively disables the L1 cache (it is cleared after every statement) and does not affect the L2 cache
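The clearing rules listed above can be modelled with a toy session. This is only a sketch: ToySession, its databaseHits counter and the Function standing in for the database are all assumptions made for illustration, and it models nothing beyond "queries fill the local cache; update, commit and close empty it".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical toy session; it models only the L1 clearing rules, not real MyBatis internals.
class ToySession {
    private final Map<String, Object> localCache = new HashMap<>();
    private final Function<String, Object> database; // stands in for a real SQL query
    int databaseHits = 0;

    ToySession(Function<String, Object> database) {
        this.database = database;
    }

    Object select(String key) {
        if (localCache.containsKey(key)) {
            return localCache.get(key); // L1 hit: no database access
        }
        databaseHits++;
        Object value = database.apply(key);
        localCache.put(key, value);
        return value;
    }

    // Any insert/update/delete clears the local cache, even before commit.
    void update() { localCache.clear(); }

    void commit() { localCache.clear(); }

    void close() { localCache.clear(); }
}

class ToySessionDemo {
    public static void main(String[] args) {
        ToySession session = new ToySession(key -> "row for " + key);
        session.select("findUserById:1");
        session.select("findUserById:1");
        System.out.println(session.databaseHits); // second select was served from the local cache
        session.update();
        session.select("findUserById:1");
        System.out.println(session.databaseHits); // the update forced a fresh database read
    }
}
```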

The L1 cache is scoped to a single SqlSession. It is implemented by the localCache attribute of BaseExecutor. This localCache is an instance of the PerpetualCache class, which simply wraps an ordinary HashMap. Let's walk through a concrete query to see how it works

//Test code
import java.io.IOException;
import java.io.InputStream;

import org.apache.ibatis.io.Resources;
import org.apache.ibatis.session.SqlSession;
import org.apache.ibatis.session.SqlSessionFactory;
import org.apache.ibatis.session.SqlSessionFactoryBuilder;
import org.junit.Before;
import org.junit.Test;

public class SimpleTest {
    private SqlSessionFactory factory;

    @Before
    public void init() throws IOException {
        // Build the SqlSessionFactory from the global configuration file
        String resource = "v1/mybatis-config.xml";
        InputStream inputStream = Resources.getResourceAsStream(resource);
        factory = new SqlSessionFactoryBuilder().build(inputStream);
    }

    @Test
    public void testFindById() {
        SqlSession sqlSession = factory.openSession();
        // Same statement and same parameter, executed twice in the same SqlSession
        User u1 = sqlSession.selectOne("findUserById", 1);
        System.out.println(u1);
        User u2 = sqlSession.selectOne("findUserById", 1);
        System.out.println(u2);
    }
}


First look at factory.openSession()

openSession() creates a new Executor, and the default ExecutorType in Configuration is SIMPLE

//Configuration source code fragment
protected ExecutorType defaultExecutorType = ExecutorType.SIMPLE;

 

By default, cacheEnabled is on

//Configuration source code fragment
protected boolean cacheEnabled = true;

 

So the Executor created by default is a SimpleExecutor wrapped in a CachingExecutor (this is the decorator pattern, though I find "wrapper" a more intuitive description)

Put a Debug diagram

Put another Executor class diagram

BaseExecutor is an abstract class, and its design uses the template method pattern. CachingExecutor exists only as a decorator. Its main job is to maintain the L2 cache (the L2 cache is actually associated with MappedStatement), while the real execution is delegated to SimpleExecutor. The L1 cache is maintained in BaseExecutor.
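The two patterns just described can be sketched in a few lines. This is a structural illustration only, under loud assumptions: all the Mini* names are hypothetical, statements are reduced to plain strings, and the decorator writes to the shared cache immediately, whereas mybatis defers that write until commit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature of the Executor hierarchy; names and signatures are invented for illustration.
interface MiniExecutor {
    String query(String statementId);
}

// Template method: query() fixes the algorithm (check the local cache, then load),
// while doQuery() is the hook that subclasses implement.
abstract class MiniBaseExecutor implements MiniExecutor {
    private final Map<String, String> localCache = new HashMap<>(); // plays the role of the L1 cache

    @Override
    public final String query(String statementId) {
        String cached = localCache.get(statementId);
        if (cached != null) {
            return cached;
        }
        String result = doQuery(statementId);
        localCache.put(statementId, result);
        return result;
    }

    protected abstract String doQuery(String statementId);
}

class MiniSimpleExecutor extends MiniBaseExecutor {
    int databaseHits = 0;

    @Override
    protected String doQuery(String statementId) {
        databaseHits++; // stands in for a real JDBC call
        return "result of " + statementId;
    }
}

// Decorator: puts a shared cache in front of any MiniExecutor and delegates on a miss.
class MiniCachingExecutor implements MiniExecutor {
    private final MiniExecutor delegate;
    private final Map<String, String> sharedCache; // plays the role of the L2 cache

    MiniCachingExecutor(MiniExecutor delegate, Map<String, String> sharedCache) {
        this.delegate = delegate;
        this.sharedCache = sharedCache;
    }

    @Override
    public String query(String statementId) {
        String cached = sharedCache.get(statementId);
        if (cached != null) {
            return cached;
        }
        String result = delegate.query(statementId);
        sharedCache.put(statementId, result); // simplification: written immediately, not on commit
        return result;
    }
}

class MiniExecutorDemo {
    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        MiniSimpleExecutor first = new MiniSimpleExecutor();
        new MiniCachingExecutor(first, shared).query("findUserById");
        MiniSimpleExecutor second = new MiniSimpleExecutor();
        new MiniCachingExecutor(second, shared).query("findUserById");
        System.out.println(first.databaseHits + " " + second.databaseHits);
    }
}
```

The decorator and the base class never know about each other's caches, so each layer can be reasoned about separately.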

Next, execute a query to see what the Executor has done. Our entry is the selectOne method of SqlSession

User u1 = sqlSession.selectOne("findUserById", 1);

 

Finally, we come to the selectList method in DefaultSqlSession

Enter CachingExecutor

The CacheKey here is related to the L2 cache. Set it aside for now and move on

SimpleExecutor inherits the query method from BaseExecutor without overriding it, so execution enters BaseExecutor

Next, enter queryFromDatabase to really query data from the database

Results of the first query:

Next, execute the same query for the second time in the same SqlSession

Directly into BaseExecutor

Look at localCache now: it holds a value

Look at the execution results

The SQL query was executed only once; the second call was served from the L1 cache

Let's take another look at the localCache in BaseExecutor

The L1 cache is really just a PerpetualCache: a HashMap with an id

In fact, the lowest-level implementation of the L2 cache is also PerpetualCache, but the L2 cache wraps it layer by layer using the decorator pattern, and its holder is no longer BaseExecutor but MappedStatement.

Now on to the L2 cache

L2 cache

  • Scope: mapper level (it can span SqlSessions). Each mapper.xml corresponds to one L2 cache, and each L2 cache is uniquely identified by the namespace of its mapper file

  • Holder: MappedStatement

  • It is off by default (strictly speaking, cacheEnabled in the global configuration file is the master switch for the L2 cache and defaults to true, but the <cache/> tag must also be configured in a mapper.xml to enable the L2 cache for that mapper)

  • For the MappedStatement of a SELECT node the L2 cache is used by default; for the other nodes (INSERT/DELETE/UPDATE) it is not (this is controlled by the useCache attribute of MappedStatement)

    Here is a rough figure. Each CRUD tag in a mapper.xml is encapsulated as a MappedStatement. A mapper.xml has only one L2 cache, but that cache is held by its MappedStatements (to be exact, those of the SELECT nodes)

  • By default, objects put into the L2 cache must implement the Serializable interface, because the L2 cache may be stored on disk as well as in memory. If the readOnly attribute of the L2 cache is set to true, the objects do not need to implement Serializable

    • readOnly=true: returns a reference to the cached object. readOnly=true is a promise by the user not to modify data fetched from the cache; it does not make the cache itself read-only (the cache stores object references, so if the fetched object is modified, the cached object changes too, and another user fetching it from the cache will see the modification)
    • readOnly=false: returns a copy of the cached object made through serialization, which is slower but safer. Modifications made after fetching the data do not affect the object in the L2 cache
  • After a query is executed, the session must be committed before the query results are saved to the L2 cache

  • Problems:

    • Each namespace corresponds to one L2 cache (a namespace is a mapper.xml). Suppose mapper A.xml consists entirely of operations on the user table, and another mapper B.xml also contains a few operations on the user table. The L2 caches of the two mappers are independent of each other. If mapper B inserts, deletes or updates rows in the user table and commits, the L2 cache of mapper A is not flushed, so a later query through mapper A may return dirty data. The L2 cache therefore needs to be used with special care, and operations on the same table should be kept in a single mapper.xml as far as possible.
    • Fine-grained control is lacking. To quote a classic example: if an e-commerce website caches product information and users must always see the latest information, the mybatis L2 cache cannot "refresh only the cached entry of the product that changed, without refreshing the other products". Because the L2 cache is mapper level, one refresh clears everything in the cache. A possible solution: give up the L2 cache and use a controllable cache in the business layer.

    In mybatis, multiple mapper.xml files can share the same L2 cache through the <cache-ref/> tag

    You can also plug in a custom cache implementation or a third-party cache (such as redis, ehcache, memcache) through the type attribute of <cache/>
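The first problem above (dirty data across namespaces) is easy to reproduce in a toy model. The sketch below is an assumption-laden illustration: NamespaceCaches is a hypothetical class, the "user table" is a plain map, and commit timing is ignored; it only shows that an update through one namespace clears that namespace's cache and nothing else.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: one cache per namespace in front of a single shared "table".
class NamespaceCaches {
    private final Map<String, Map<String, String>> caches = new HashMap<>();
    private final Map<String, String> userTable = new HashMap<>(); // stands in for the database

    String select(String namespace, String id) {
        Map<String, String> cache = caches.computeIfAbsent(namespace, n -> new HashMap<>());
        // Load from the table only when this namespace has no cached entry
        return cache.computeIfAbsent(id, userTable::get);
    }

    void update(String namespace, String id, String value) {
        userTable.put(id, value);
        Map<String, String> cache = caches.get(namespace);
        if (cache != null) {
            cache.clear(); // only the updating namespace's cache is flushed
        }
    }
}

class NamespaceCachesDemo {
    public static void main(String[] args) {
        NamespaceCaches mybatisLike = new NamespaceCaches();
        mybatisLike.update("mapperB", "1", "old name");
        System.out.println(mybatisLike.select("mapperA", "1")); // mapper A caches the row
        mybatisLike.update("mapperB", "1", "new name");         // mapper B changes the row
        System.out.println(mybatisLike.select("mapperA", "1")); // mapper A still returns its cached copy
        System.out.println(mybatisLike.select("mapperB", "1")); // mapper B sees the new value
    }
}
```

Sharing one cache between the two namespaces, as <cache-ref/> does, would correspond to both namespaces resolving to the same inner map in this model.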

Let's start with how mybatis parses the L2 cache configuration when it reads the configuration files. The entry point is the parse method of XMLConfigBuilder. I won't go through all of it and will jump straight to where mapper.xml is parsed

Go directly to the source code of XMLMapperBuilder

Take a look at the parsing of the <cache/> tag

By default, PerpetualCache is used as the cache implementation and LRU as the eviction policy. readOnly also defaults to false, which is why objects put into the L2 cache must be serializable by default
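Why readOnly=false requires Serializable becomes clear once the copy-through-serialization idea is written out. The sketch below is a minimal illustration using only JDK serialization; CachedUser and CopyOnReadCache are hypothetical names, and storing byte arrays on put and deserializing on get is an assumption about how such a cache can be built, not a quotation of the mybatis implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cached entity; it must be Serializable to be copied this way.
class CachedUser implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    CachedUser(String name) { this.name = name; }
}

// Sketch of a read/write cache: values are stored as bytes and every get returns a fresh copy.
class CopyOnReadCache {
    private final Map<String, byte[]> store = new HashMap<>();

    void put(String key, Serializable value) {
        store.put(key, serialize(value));
    }

    Object get(String key) {
        byte[] bytes = store.get(key);
        return bytes == null ? null : deserialize(bytes);
    }

    private static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
            oos.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Object deserialize(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("cannot restore cached object", e);
        }
    }
}

class CopyOnReadCacheDemo {
    public static void main(String[] args) {
        CopyOnReadCache cache = new CopyOnReadCache();
        cache.put("user:1", new CachedUser("alice"));
        CachedUser first = (CachedUser) cache.get("user:1");
        first.name = "changed by caller";           // modifies only the caller's copy
        CachedUser second = (CachedUser) cache.get("user:1");
        System.out.println(second.name);            // the cached value is untouched
    }
}
```

A readOnly=true cache would instead keep a Map of object references and hand the same instance to every caller, which is faster but lets one caller's modification leak to the next.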

Finally, the MapperBuilderAssistant is called to store the cache (each mapper file has one builderAssistant, which holds the mapper's shared information, such as its parameterMap, resultMap and cache).

Let's take a look at the process of creating a cache, that is, the build method of CacheBuilder

In the previous DEBUG diagram, see what the final Cache looks like
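The layered cache that comes out of this build step is a chain of decorators around the HashMap-backed store. As one example of such a layer, here is a sketch of an LRU decorator. The Tiny* names are hypothetical; tracking key order in an access-ordered LinkedHashMap and evicting from the delegate is one common way to implement LRU, offered here as an assumption rather than as the exact mybatis code.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical minimal cache contract for the sketch.
interface TinyCache {
    void put(Object key, Object value);
    Object get(Object key);
    Object remove(Object key);
    int size();
}

// Innermost layer: a plain HashMap.
class TinyPerpetualCache implements TinyCache {
    private final Map<Object, Object> map = new HashMap<>();
    @Override public void put(Object key, Object value) { map.put(key, value); }
    @Override public Object get(Object key) { return map.get(key); }
    @Override public Object remove(Object key) { return map.remove(key); }
    @Override public int size() { return map.size(); }
}

// Decorator: remembers access order and evicts the least recently used key from the delegate.
class TinyLruCache implements TinyCache {
    private final TinyCache delegate;
    private final Map<Object, Object> keyOrder;
    private Object eldestKey;

    TinyLruCache(TinyCache delegate, final int capacity) {
        this.delegate = delegate;
        // accessOrder = true makes iteration order follow most recent access
        this.keyOrder = new LinkedHashMap<Object, Object>(capacity, 0.75f, true) {
            private static final long serialVersionUID = 1L;
            @Override
            protected boolean removeEldestEntry(Map.Entry<Object, Object> eldest) {
                boolean tooBig = size() > capacity;
                if (tooBig) {
                    eldestKey = eldest.getKey(); // remember which key to drop from the delegate
                }
                return tooBig;
            }
        };
    }

    @Override
    public void put(Object key, Object value) {
        delegate.put(key, value);
        keyOrder.put(key, key);
        if (eldestKey != null) {
            delegate.remove(eldestKey);
            eldestKey = null;
        }
    }

    @Override
    public Object get(Object key) {
        keyOrder.get(key); // touch the key so it counts as recently used
        return delegate.get(key);
    }

    @Override
    public Object remove(Object key) {
        keyOrder.remove(key);
        return delegate.remove(key);
    }

    @Override
    public int size() { return delegate.size(); }
}

class TinyLruCacheDemo {
    public static void main(String[] args) {
        TinyCache cache = new TinyLruCache(new TinyPerpetualCache(), 2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" is now the most recently used key
        cache.put("c", 3); // capacity exceeded, so "b" is evicted
        System.out.println(cache.get("b"));
    }
}
```

Because every layer implements the same interface, further decorators (logging, synchronization, serialization) can be stacked in any order around the innermost store.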

When a query is executed and its data is put into the L2 cache, the cache is additionally wrapped in an outermost TransactionalCache, which temporarily holds the query results and puts them into the PerpetualCache all at once after the session is committed

Another rough picture

Next, run the same query in two SqlSessions to see whether the L2 cache really takes effect

	@Test
	public void testLevel2Cache(){
		SqlSession sqlSession = factory.openSession();
		UserMapper userMapper = sqlSession.getMapper(UserMapper.class);
		User userById = userMapper.findUserById(1);
		System.out.println(userById);
		// Note: the results are saved to the L2 cache only once the session is committed (or closed)
		// Without that, the database would still be queried twice
		sqlSession.close();

		// Second session: same mapper method, same parameter
		SqlSession sqlSession2 = factory.openSession();
		userMapper = sqlSession2.getMapper(UserMapper.class);
		User user2 = userMapper.findUserById(1);
		System.out.println(user2);
	}

Now the first query

Note that this CacheKey is generated from the MappedStatement, the input parameters, the RowBounds and so on. It is used as the key when the query results are put into the HashMap
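A composite key of this kind can be sketched as an ordered list of parts with value-based equality. SimpleCacheKey below is a hypothetical class written for illustration; the particular parts used in the demo (statement id, offset, limit, SQL text, parameter value) follow the description above, and the hash formula is an arbitrary choice, not the one mybatis uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical composite key: two keys are equal when all their parts are equal, in order.
final class SimpleCacheKey {
    private final List<Object> parts = new ArrayList<>();
    private int hash = 17;

    SimpleCacheKey update(Object part) {
        parts.add(part);
        hash = 37 * hash + (part == null ? 0 : part.hashCode());
        return this; // allows chained calls
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SimpleCacheKey)) {
            return false;
        }
        return parts.equals(((SimpleCacheKey) other).parts);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public String toString() {
        return hash + ":" + parts;
    }
}

class SimpleCacheKeyDemo {
    static SimpleCacheKey keyFor(String statementId, int offset, int limit, String sql, Object param) {
        return new SimpleCacheKey()
                .update(statementId)
                .update(offset)
                .update(limit)
                .update(sql)
                .update(param);
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM user WHERE id = ?";
        Map<SimpleCacheKey, String> cache = new HashMap<>();
        cache.put(keyFor("findUserById", 0, Integer.MAX_VALUE, sql, 1), "user-1");
        // An equal key built later finds the same entry; a different parameter does not
        System.out.println(cache.get(keyFor("findUserById", 0, Integer.MAX_VALUE, sql, 1)));
        System.out.println(cache.get(keyFor("findUserById", 0, Integer.MAX_VALUE, sql, 2)));
    }
}
```

This is why "the same query" in this article always means the same statement with the same parameters and paging: change any part and the key no longer matches.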

At this point, the Cache obtained from the MappedStatement is not null but a SynchronizedCache instance. The first query finds no data in the L2 cache, so the work is delegated to SimpleExecutor. After the query, the results are saved to the L2 cache via tcm.putObject(cache, key, list) (although they are not really flushed to the L2 cache yet; that happens only after commit). Let's step into this method

getTransactionalCache simply wraps the cache in one more layer (decorator pattern)

As mentioned before, the bottom layer of the L2 cache is a PerpetualCache, which is internally a HashMap, while TransactionalCache maintains a Map of its own

After the query, the results are temporarily stored in entriesToAddOnCommit

Data fetched from the L2 cache comes from the HashMap inside PerpetualCache. So before commit, the query result is still sitting in entriesToAddOnCommit. Only on commit is it inserted into the HashMap inside PerpetualCache (passed down layer by layer through each delegate's putObject method)

We can take a look at SqlSession.commit

This calls the executor's commit

which in turn calls commit on the TransactionalCacheManager

The TransactionalCacheManager then calls commit on every TransactionalCache it manages

At this time, the temporary data will be flushed to the L2 cache
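The stage-then-flush behaviour described in this walkthrough can be condensed into a small sketch. StagingCache is a hypothetical class; only the field name entriesToAddOnCommit is taken from the text above, and details such as clear-on-commit flags and missed-entry tracking are deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-session wrapper around a shared second-level store.
class StagingCache {
    private final Map<Object, Object> delegate; // shared store visible to all sessions
    private final Map<Object, Object> entriesToAddOnCommit = new HashMap<>(); // private staging area

    StagingCache(Map<Object, Object> delegate) {
        this.delegate = delegate;
    }

    // Reads only ever see committed data in the shared store.
    Object getObject(Object key) {
        return delegate.get(key);
    }

    // Writes are staged, not published.
    void putObject(Object key, Object value) {
        entriesToAddOnCommit.put(key, value);
    }

    // Commit publishes everything staged so far.
    void commit() {
        delegate.putAll(entriesToAddOnCommit);
        entriesToAddOnCommit.clear();
    }

    // Rollback discards staged entries without touching the shared store.
    void rollback() {
        entriesToAddOnCommit.clear();
    }
}

class StagingCacheDemo {
    public static void main(String[] args) {
        Map<Object, Object> secondLevel = new HashMap<>();
        StagingCache session1 = new StagingCache(secondLevel);
        StagingCache session2 = new StagingCache(secondLevel);

        session1.putObject("findUserById:1", "user-1");
        System.out.println(session2.getObject("findUserById:1")); // not visible before commit
        session1.commit();
        System.out.println(session2.getObject("findUserById:1")); // visible after commit
    }
}
```

Staging keeps uncommitted (and possibly rolled-back) results from leaking into a cache that other sessions read from.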

Now run the second query

It can be seen that during the second query, the data is obtained from the L2 cache

Take a look at the execution results

The second query hit the L2 cache, and the two queries executed SQL only once

To sum up:

There isn't actually that much to it. Put simply, it comes down to the Executor hierarchy and the Cache hierarchy of mybatis. Pay attention to the respective shortcomings of the L1 cache and the L2 cache.

The design of Executor uses the template method pattern and the decorator pattern

The design of Cache is a typical decorator pattern


finish

</article>

Tags: Java Design Pattern Back-end crawler

Posted on Tue, 23 Nov 2021 04:09:48 -0500 by sovtek