postgresql unordered uuid performance test

Influence of unordered uuid on Database

Due to the recent performance test of large tables, it is found that unordered UUIDs as primary keys have a certain impact on the table insertion performance. Combined with the actual situation, it is found that the larger the amount of data in the table, the greater the impact on the table insertion performance.

testing environment

PostgreSQL creates an insert script to test tps in various cases.

Database version: PostgreSQL 10.4 (artrybase 5.0.0, Thunisoft)

Operating system configuration: CentOS Linux release 7, 32GB memory, 8 cpu

Test parameters: pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb

Various primary key tests for empty tables, 1000w data, 5000w data and 100 million data.

Test unordered UUIDs, ordered UUIDs, sequences, common btree, unique index and no primary key

test

1. Create table

--Disordered uuid
pgbenchdb=# create table test_uuid_v4(id char(32) primary key);
CREATE TABLE
--Orderly uuid
pgbenchdb=# create table test_time_nextval(id char(32) primary key);
CREATE TABLE
--Incremental sequence
pgbenchdb=# create table test_seq_bigint(id int8 primary key);
CREATE TABLE
--Create sequence
 create sequence test_seq start with 1 ;

2. Test script

--Test disorder uuid script
vi pgbench_uuid_v4.sql
insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
--Test order uuid script
vi pgbench_time_nextval.sql
insert into test_time_nextval (id) values (replace(uuid_time_nextval()::text,'-',''));
--Test sequence script
vi pgbench_seq_bigint.sql
insert into test_seq_bigint (id) values (nextval('test_seq'::regclass));

Unordered uuid, no data

Disk usage
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.76    0.00    0.38    4.67    0.00   94.19

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00   96.00     0.00  2048.00    42.67     1.02   10.67    0.00   10.67  10.33  99.20
dm-0              0.00     0.00    0.00   96.00     0.00  2048.00    42.67     1.02   10.66    0.00   10.66  10.32  99.10
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

tps: 
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 53494
latency average = 8.974 ms
tps = 891.495404 (including connections establishing)
tps = 891.588967 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         9.006  insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));

tps without data

       category     |  for the first time  | The second time  | third time | average value(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 disorder uuid		  | 919  	| 907     |  891  |   906     | 99.2% | 10.66   
 Orderly uuid    	  | 985  	| 882     |  932  |   933     | 98.7% | 4.4
 sequence    	      | 1311     | 1277    |  1280 |  1289     | 97.5% | 3.4 

Initialize 100w data to the table

pgbenchdb=# insert into test_uuid_v4 (id) select  replace(uuid_generate_v4()::text,'-','') from generate_series(1,1000000);
INSERT 0 1000000
Time: 43389.817 ms (00:43.390)
pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,'-','') from generate_series(1,1000000);
INSERT 0 1000000
Time: 30585.134 ms (00:30.585)
pgbenchdb=#  insert into test_seq_bigint select generate_series (1,1000000);
INSERT 0 1000000
Time: 9818.639 ms (00:09.819)
disorder uuid Insert 100 w Need 43 s,Order requires 30 s,The sequence requires 10 s. 

tps after inserting one million data

       category     |  for the first time  | The second time  | third time | average value(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 disorder uuid		  | 355  	| 440     |  302  |   365     | 98.8% | 13   
 Orderly uuid    	  | 948  	| 964     |  870  |   927     | 97.2% | 4.0
 sequence    	      | 1159     | 1234    |  1115 |  1169     | 96.6% | 3.5 

tps after inserting 10 million data

       category     |  for the first time  | The second time  | third time | average value(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 disorder uuid		  | 260  	| 292     |  227  |   260     | 99.2% | 16.8   
 Orderly uuid    	  | 817  	| 960     |  883  |   870     | 97.7% | 3.9
 sequence       	   | 1305     | 1261    |  1270 |  1278     | 96.8% | 3.0 

After inserting 50 million data

Insert 5 into the table kw Data and add a primary key
pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,'-','') from generate_series(1,50000000);
INSERT 0 50000000
Time: 453985.318 ms (07:33.985)
pgbenchdb=# insert into test_seq_bigint select generate_series (1,50000000);
INSERT 0 50000000
Time: 352206.160 ms (05:52.206)
pgbenchdb=# insert into test_uuid_v4 (id) select  replace(uuid_generate_v4()::text,'-','') from generate_series(1,50000000);
INSERT 0 50000000
Time: 1159689.338 ms (00:19:19.689)

When there is no primary key, insert 50 million data in order uuid It takes 7 minutes, the sequence takes 6 minutes, and it is out of order uuid It took nearly 20 minutes.

pgbenchdb=# alter table test_uuid_v4 add primary key ("id");
ALTER TABLE
Time: 845199.296 ms (14:05.199)
pgbenchdb=# alter table test_time_nextval add primary key ("id");
ALTER TABLE
Time: 932151.103 ms (15:32.151)
pgbenchdb=# alter table test_seq_bigint add primary key ("id");
ALTER TABLE
Time: 148138.871 ms (02:28.139)

pgbenchdb=# select pg_size_pretty(pg_total_relation_size('test_uuid_v4'));
 pg_size_pretty 
----------------
 6072 MB
(1 row)

Time: 0.861 ms
pgbenchdb=#  select pg_size_pretty(pg_total_relation_size('test_time_nextval'));
 pg_size_pretty 
----------------
 6072 MB
(1 row)

Time: 0.942 ms
pgbenchdb=#  select pg_size_pretty(pg_total_relation_size('test_seq_bigint'));
 pg_size_pretty 
----------------
 2800 MB
(1 row)

Time: 0.699 ms

After 5kw insertion

       category     |  for the first time  | The second time  | third time | average value(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 disorder uuid		  | 162  	| 163     |  163  |   163     | 99.6% | 18.4   
 Orderly uuid    	  | 738  	| 933     |  979  |   883     | 97.7% | 3.9
 sequence         	 | 1132     | 1264    |  1265 |  1220     | 96.8% | 3.5 

After inserting 100 million pieces of data

       category     |  for the first time  | The second time  | third time | average value(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 disorder uuid		  | 121  	| 131     |  143  |   131     | 99.6% | 28.2   
 Orderly uuid    	  | 819  	| 795     |  888  |   834     | 99.2% | 28.7
 sequence      	    | 1193     | 1115    |  1109 |  1139     | 96.8% | 11.3

Ordinary btree index

The unordered uuid is measured above. In 1kw case, the tps with primary key is 260 and the tps without primary key is 1234. Try to test the normal index and the unique index tps

--Create normal index
pgbenchdb=# create index i_test_uuid_v4_id on test_uuid_v4(id);
CREATE INDEX
Time: 316367.010 ms (05:16.367)
--After creating a normal index
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 13308
latency average = 36.080 ms
tps = 221.727391 (including connections establishing)
tps = 221.749660 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
        38.512  insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
--Create unique index
pgbenchdb=# drop index i_test_uuid_v4_id;
DROP INDEX
Time: 267.451 ms
pgbenchdb=# create unique index i_test_uuid_v4_id on test_uuid_v4(id);
CREATE INDEX
Time: 153372.622 ms (02:33.373)
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb 
^[[3~transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 13847
latency average = 34.693 ms
tps = 230.593988 (including connections establishing)
tps = 230.620469 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
        36.410  insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
   

Both ordinary btree indexes and unique indexes will affect the efficiency of insertion.

Delete all primary key indexes

--Delete all primary keys
alter table test_uuid_v4 drop constraint "test_uuid_v4_pkey";
alter table test_time_nextval drop constraint "test_time_nextval_pkey" ;
alter table test_seq_bigint drop constraint "test_seq_bigint_pkey";

1,--disorder uuid: test pgbench_uuid_v4.sql
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 74109
latency average = 6.479 ms
tps = 1234.842229 (including connections establishing)
tps = 1235.042674 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         6.112  insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));

2,--Orderly uuid,test pgbench_time_nextval.sql
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_time_nextval.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_time_nextval.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 74027
latency average = 6.486 ms
tps = 1233.364360 (including connections establishing)
tps = 1233.482292 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         6.186  insert into test_time_nextval (id) values (replace(uuid_time_nextval()::text,'-',''));
3,--Sequence, test pgbench_seq_bigint.sql
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_seq_bigint.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_seq_bigint.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 76312
latency average = 6.290 ms
tps = 1271.832907 (including connections establishing)
tps = 1272.124397 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         5.916  insert into test_seq_bigint (id) values (nextval('test_seq'::regclass));

After deleting the primary key constraint, tps is very close in the three cases, reaching 1200 +.

Btree index, average tps comparison of insertion operations

 category/average tps    |  No data  | ten million  | Fifty million | one hundred million 		|
---------------+---------+---------+---------+---------+
 disorder uuid		  | 960  	| 260     |  163  |   131     |
 Orderly uuid    	  | 933  	| 870     |  883  |   834     |
 sequence        	  | 1289     | 1278    |  1220 |  1139     |

According to the test data, it can be seen that the tps of the inserted data decreases sharply after the unordered uuid reaches 1kw, while the ordered uuid and the increasing sequence decrease less. The tps ordered uuid of 100 million data is 6 times that of disorder, and the sequence is 9 times that of disorder uuid.

Create a separate tablespace to store index information

If you have a fast disk, you can store indexes and data separately to speed up writing.

Create a separate index space:

create tablespace indx_test owner sa location '/home/tablespace/index_test';

Specify index storage directory:

create index i_test_uuid_v4_id on test_uuid_v4 using btree(id) tablespace indx_test;

About ordered uuid

Test the sequential UUIDs plug-in used to generate ordered UUIDs.

The structure of ordered uuid is (block ID; random data). In fact, it is to divide the data into two parts, one is self increasing and the other is random.

sequential-uuids

sequential-uuids-git

Two algorithms are provided:

1.uuid_sequence_nextval(sequence regclass, block_size int default 65536, block_count int default 65536)

The prefix is a self incrementing sequence. If the BLOCK ID is stored in 2 bytes, 256 records can be stored in an index BLOCK (assuming an 8K BLOCK, one record includes uuid VALUE (16 bytes) and ctid (6 bytes), so an index page stores about 363 records (8000 / (16 + 6))

2.uuid_time_nextval(interval_length int default 60, interval_count int default 65536) RETURNS uuid

By default, the prefix of the data every 60 seconds is the same. The prefix is incremented by 1 and circulates after 65535.

use uuid_time_nextval Generated order uuid
pgbenchdb=# select id from test_time_nextval;
                id                
----------------------------------
 a18b7dd0ca92b0b5c1844a402f9c6999
 a18b540b8bbe0ddb2b6d0189b2e393c6
 a18b83eb7320b0a90e625185421e065e
 a18bade4ff15e05dab81ecd3f4c2dee4
 a18b79e41c3bc8d2d4ba4b70447e6b29
 a18bdad18d9e0d2fa1d9d675bc7129f0
 a18b13723ec7be9a2f1a3aec5345a88b
 a18bd9d866047aec69a064d30e9493d2
 a18bd76e8c787c7464479502f381e6d7
 a18ba5c0c966f81cfdbeff866618da8d
......

The first four bits of ordered uuid are ordered, and the following ones are randomly generated.

epilogue

1. For ordered UUIDs, the first four bits are ordered and the rest are randomly generated.

2. In this environment, it is found that with the increasing amount of data, tps declines sharply.

3. Due to the existence of btree index, unordered UUIDs will lead to a large number of discrete io. Resulting in high disk utilization. This affects the insertion efficiency. It becomes more obvious with the increase of table data.

4. The test is conducted on ordinary disks, not on SSDs.

5. If you want to use ordered UUIDs, there are many ways to implement them. You also need to consider generating global ordered UUIDs under distributed conditions.

Tags: Database PostgreSQL

Posted on Thu, 28 Oct 2021 22:20:40 -0400 by chyan