Analysis of HOT principle

2020-06-09 19:31:01

1, Questions

Some time ago, someone in the QQ group was confused about "I update d the data in this row of table (0,4), looked at the page data of the index, and found that the index always points to (0,4), and I can't find the data by querying the business table with ctid='(0,4)"; then I made the vacuum of the table, reindex even drop/create index, or so on ".

The (heap only tuple) HOT feature is implemented in PostgreSQL8.3. The purpose of its existence is to eliminate the impact of table non index column updates on indexes. But how does it work?

2, Analysis

Let's simulate the environment

postgres=# create table tbl_hot(id int primary key, info text);
CREATE TABLE
postgres=# insert into tbl_hot select generate_series(1, 4), 'lottu';
INSERT 0 4
postgres=# select ctid ,t.* from tbl_hot t;
 ctid  | id | info  
-------+----+-------
 (0,1) |  1 | lottu
 (0,2) |  2 | lottu
 (0,3) |  3 | lottu
 (0,4) |  4 | lottu
(4 rows)
postgres=# \d tbl_hot
              Table "public.tbl_hot"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 id     | integer |           | not null | 
 info   | text    |           |          | 
Indexes:
    "tbl_hot_pkey" PRIMARY KEY, btree (id)

We create TABLE tbl_hot; and insert 4 records. This is the record we updated (0,4). as follows

postgres=# update tbl_hot set info = 'rax' where id = 4;
UPDATE 1
postgres=# select ctid ,t.* from tbl_hot t;
 ctid  | id | info  
-------+----+-------
 (0,1) |  1 | lottu
 (0,2) |  2 | lottu
 (0,3) |  3 | lottu
 (0,5) |  4 | rax
(4 rows)

Do we see the index changes after the update?

postgres=# select * from bt_page_items('tbl_hot_pkey', 1);
 itemoffset | ctid  | itemlen | nulls | vars |          data           
------------+-------+---------+-------+------+-------------------------
          1 | (0,1) |      16 | f     | f    | 01 00 00 00 00 00 00 00
          2 | (0,2) |      16 | f     | f    | 02 00 00 00 00 00 00 00
          3 | (0,3) |      16 | f     | f    | 03 00 00 00 00 00 00 00
          4 | (0,4) |      16 | f     | f    | 04 00 00 00 00 00 00 00
(4 rows)
Bt_ page_ The items function is used to return detailed information about all items on the b-tree index page. In the B-tree leaf page, the ctid points to a heap tuple. In the internal page, the block number portion of the ctid points to another page in the index itself.

We can see that the index has not changed. The index store is the ctid + index value of the table data. Using index can quickly find the corresponding record's ctid. Now, the ctid (0,4) of the record id=4 index is not consistent with the corresponding ctid(0,5) of the table. Is that an index failure. Let's test it

postgres=# explain select id from tbl_hot where id = 4;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Only Scan using tbl_hot_pkey on tbl_hot  (cost=0.15..8.17 rows=1 width=4)
   Index Cond: (id = 4)
(2 rows)

The index is not invalid; how to find the corresponding record? Let's first look at the page stored in the following table

get_raw_page: returns the page content in the current table file according to the parameter indication, data file type (main, fsm, vm) and page location. Another function has the same name and only two parameters. It omits the second parameter and uses' main 'directly.
heap_page_items: parameter is function get_ raw_ The return value of page. The return value is the details of the item pointer (ItemIddata) and the HeapTupleHeaderData in the page.
Among them, understand the meaning of the following fields
 lp: This is the column defined by the plug-in itself, which is not in the source code. This is the order of item pointers.
lp_off: the position of the tuple in the page
 lp_flags: the meaning is as follows
#define LP_UNUSED       0       /* unused (should always have lp_len=0) */
#define LP_NORMAL       1       /* used (should always have lp_len>0) */
#define LP_REDIRECT     2       /* HOT redirect (should have lp_len=0) */
#define LP_DEAD         3       /* dead, may or may not have storage */
t_ctid: This refers to the physical ID
 t_infomask2: the number of table fields and some flags; the meaning of flag
#define HEAP_NATTS_MASK         0x07FF
             /* 11 bits for number of attributes *//* bits 0x1800 are available */
#define HEAP_KEYS_UPDATED       0x2000 
          /* tuple was updated and key cols* modified, or tuple deleted */
#define HEAP_HOT_UPDATED        0x4000  /* tuple was HOT-updated */
#define HEAP_ONLY_TUPLE         0x8000  /* this is heap-only tuple */
#define HEAP2_XACT_MASK         0xE000  /* visibility-related bits */
postgres=# select * from heap_page_items(get_raw_page('tbl_hot', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |         t_data         
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+------------------------
  1 |   8152 |        1 |     34 |    554 |      0 |        0 | (0,1)  |           2 |       2306 |     24 |        |       | \x010000000d6c6f747475
  2 |   8112 |        1 |     34 |    554 |      0 |        0 | (0,2)  |           2 |       2306 |     24 |        |       | \x020000000d6c6f747475
  3 |   8072 |        1 |     34 |    554 |      0 |        0 | (0,3)  |           2 |       2306 |     24 |        |       | \x030000000d6c6f747475
  4 |   8032 |        1 |     34 |    554 |    555 |        0 | (0,5)  |       16386 |       1282 |     24 |        |       | \x040000000d6c6f747475
  5 |   8000 |        1 |     32 |    555 |      0 |        0 | (0,5)  |       32770 |      10498 |     24 |        |       | \x0400000009726178
(5 rows)

Let's figure out: how to find the corresponding record through the condition id=4

  1. Find the index tuple(0,4) pointing to the target data tuple
  2. According to the position (0,4) where the index tuple is obtained, find the position where the row pointer lp is 4. That is, the corresponding ctid is (0,5)
  3. According to the ctid of (0,5); we can find two tuples. Judge which tuple is visible according to the MVCC mechanism of PG
  4. You can find the corresponding tuple

The principle of updating many times is similar.

At this time, you will have a question "execute vacuum; clear the table tuple(0,4); without step 2, the above process will not work.". Let's analyze:

postgres=# vacuum tbl_hot;
VACUUM
postgres=# select lp, lp_off, lp_flags, t_ctid, t_infomask2 from heap_page_items(get_raw_page('tbl_hot', 0));
 lp | lp_off | lp_flags | t_ctid | t_infomask2 
----+--------+----------+--------+-------------
  1 |   8152 |        1 | (0,1)  |           2
  2 |   8112 |        1 | (0,2)  |           2
  3 |   8072 |        1 | (0,3)  |           2
  4 |      5 |        2 |        |            
  5 |   8040 |        1 | (0,5)  |       32770
(5 rows)

At this time, in order to solve this problem, postgresql will redirect the row pointer at the right time, which is called trimming. Now let's deal with this situation: how to find the corresponding record through the condition id=4

  1. Find the index tuple(0,4) pointing to the target data tuple
  2. According to the position (0,4) of the index tuple, find the position where the row pointer lp is 4; this is lp_ If the flag is 2, the pointer redirection lp is 5; that is, the corresponding position of the line pointer is 8040
  3. The corresponding tuple can be found through the pointer.

This is a tuple(0,4); since vacuum; means it can be reused; but this is the tag LP_REDIRECT; indicates that the tuple is not dead tuple; it is not recycled; it cannot be reused. At this point, you may have a question: "when can it be recycled?" ; the answer is that this tuple(0,4) does not mark dead tuple. But execute vacuum; this page is recyclable space; this is the content of the MVCC processing mechanism of PG - vacuum; you can go to the next page. Here we can briefly demonstrate the following:

postgres=# update tbl_hot set info = 'postgres' where id = 4;
UPDATE 1
postgres=# select lp, lp_off, lp_flags, t_ctid, t_infomask2 from heap_page_items(get_raw_page('tbl_hot', 0));
 lp | lp_off | lp_flags | t_ctid | t_infomask2 
----+--------+----------+--------+-------------
  1 |   8152 |        1 | (0,1)  |           2
  2 |   8112 |        1 | (0,2)  |           2
  3 |   8072 |        1 | (0,3)  |           2
  4 |      5 |        2 |        |            
  5 |   8040 |        1 | (0,6)  |       49154
  6 |   8000 |        1 | (0,6)  |       32770
(6 rows)
postgres=# vacuum tbl_hot;
VACUUM
postgres=# select lp, lp_off, lp_flags, t_ctid, t_infomask2 from heap_page_items(get_raw_page('tbl_hot', 0));
 lp | lp_off | lp_flags | t_ctid | t_infomask2 
----+--------+----------+--------+-------------
  1 |   8152 |        1 | (0,1)  |           2
  2 |   8112 |        1 | (0,2)  |           2
  3 |   8072 |        1 | (0,3)  |           2
  4 |      6 |        2 |        |            
  5 |      0 |        0 |        |            
  6 |   8032 |        1 | (0,6)  |       32770
(6 rows)
postgres=# select ctid,t.* from tbl_hot t;
 ctid  | id |   info   
-------+----+----------
 (0,1) |  1 | lottu
 (0,2) |  2 | lottu
 (0,3) |  3 | lottu
 (0,5) |  5 | lottu
 (0,6) |  4 | postgres
(5 rows)

Finally, when the updated ancestor is on another page, the index will also be updated; this can be understood as row migration. This is also the case in Oracle. But it is more frequent than oracle; of course, you can set to lower the fillfactor; reduce this situation.

3, Reference

https://blog.csdn.net/xiaohai928ww/article/details/98603707

https://www.postgresql.org/docs/12/pageinspect.html

Tags: PostgreSQL Oracle

Posted on Tue, 09 Jun 2020 23:11:28 -0400 by niki