# postgresql-无序uuid tps测试
## 无序uuid对数据库的影响
由于最近在做超大表的性能测试,在该过程中发现了无序uuid做主键对表插入性能有一定影响。结合实际情况发现当表的数据量越大,对表插入性能的影响也就越大。
### 测试环境
PostgreSQL创建插入脚本,测试各种情况的tps。
数据库版本:PostgreSQL 10.4 (ArteryBase 5.0.0,Thunisoft)
操作系统配置:CentOS Linux release 7 ,32GB内存,8 cpu
测试参数:pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
空表,1000w数据,5000w数据,一亿数据的各种主键测试。
测试无序的uuid,有序的uuid,序列,有普通btree,有唯一索引和没有主键的情况
### 测试
1.创建表
```sql
--无序的uuid
pgbenchdb=# create table test_uuid_v4(id char(32) primary key);
CREATE TABLE
--有序的uuid
pgbenchdb=# create table test_time_nextval(id char(32) primary key);
CREATE TABLE
--递增序列
pgbenchdb=# create table test_seq_bigint(id int8 primary key);
CREATE TABLE
--创建序列
?create sequence test_seq start with 1 ;
```
2.测试脚本
```sql
--测试无序uuid脚本
vi pgbench_uuid_v4.sql
insert into test_uuid_v4 (id) values (replace(‘-‘,‘‘,uuid_generate_v4()::text));
--测试有序uuid脚本
vi pgbench_time_nextval.sql
insert into test_time_nextval (id) values (replace(‘-‘,uuid_time_nextval()::text));
--测试序列脚本
vi pgbench_seq_bigint.sql
insert into test_seq_bigint (id) values (nextval(‘test_seq‘::regclass));
```
无序uuid,无数据情况
```sql
磁盘使用情况
avg-cpu:? %user?? %nice %system %iowait? %steal?? %idle
?????????? 0.76??? 0.00??? 0.38??? 4.67??? 0.00?? 94.19
Device:???????? rrqm/s?? wrqm/s???? r/s???? w/s??? rkB/s??? wkB/s avgrq-sz avgqu-sz?? await r_await w_await? svctm? %util
sdb?????????????? 0.00???? 0.00??? 0.00??? 0.00???? 0.00???? 0.00???? 0.00???? 0.00??? 0.00??? 0.00??? 0.00?? 0.00?? 0.00
sda?????????????? 0.00???? 0.00??? 0.00?? 96.00???? 0.00? 2048.00??? 42.67???? 1.02?? 10.67??? 0.00?? 10.67? 10.33? 99.20
dm-0????????????? 0.00???? 0.00??? 0.00?? 96.00???? 0.00? 2048.00??? 42.67???? 1.02?? 10.66??? 0.00?? 10.66? 10.32? 99.10
dm-1????????????? 0.00???? 0.00??? 0.00??? 0.00???? 0.00???? 0.00???? 0.00???? 0.00??? 0.00??? 0.00??? 0.00?? 0.00?? 0.00
dm-2????????????? 0.00???? 0.00??? 0.00??? 0.00???? 0.00???? 0.00???? 0.00???? 0.00??? 0.00??? 0.00??? 0.00?? 0.00?? 0.00
tps:
[[email?protected] thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 53494
latency average = 8.974 ms
tps = 891.495404 (including connections establishing)
tps = 891.588967 (excluding connections establishing)
script statistics:
?- statement latencies in milliseconds:
???????? 9.006? insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,‘-‘,‘‘));
```
无数据情况下,tps
```sql
?????? 类别???? |? 第一次? | 第二次? | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
?无序uuid?? ??? ?? | 919 ??? ?| 907???? |? 891? |?? 906???? | 99.2% | 10.66? ?
?有序uuid?? ??? ?? | 985 ??? ?| 882???? |? 932? |?? 933???? | 98.7% | 4.4
?序列?????????? | 1311??? | 1277??? |? 1280 |? 1289???? | 97.5% | 3.4
```
向表里面初始化100w数据
```sql
pgbenchdb=# insert into test_uuid_v4 (id) select? replace(uuid_generate_v4()::text,‘‘) from generate_series(1,1000000);
INSERT 0 1000000
Time: 43389.817 ms (00:43.390)
pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,1000000);
INSERT 0 1000000
Time: 30585.134 ms (00:30.585)
pgbenchdb=#? insert into test_seq_bigint select generate_series (1,1000000);
INSERT 0 1000000
Time: 9818.639 ms (00:09.819)
无序uuid插入100w需要43s,有序需要30s,序列需要10s。
```
插入一百万数据后的tps
```sql
?????? 类别???? |? 第一次? | 第二次? | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
?无序uuid?? ??? ?? | 355 ??? ?| 440???? |? 302? |?? 365???? | 98.8% | 13? ?
?有序uuid?? ??? ?? | 948 ??? ?| 964???? |? 870? |?? 927???? | 97.2% | 4.0
?序列?????????? | 1159??? | 1234??? |? 1115 |? 1169???? | 96.6% | 3.5
```
插入一千万数据后的tps
```sql
?????? 类别???? |? 第一次? | 第二次? | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
?无序uuid?? ??? ?? | 260 ??? ?| 292???? |? 227? |?? 260???? | 99.2% | 16.8? ?
?有序uuid?? ??? ?? | 817 ??? ?| 960???? |? 883? |?? 870???? | 97.7% | 3.9
?序列?????????? | 1305??? | 1261??? |? 1270 |? 1278???? | 96.8% | 3.0
```
插入五千万数据后
```sql
向表中插入5kw数据,并且添加主键
pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,50000000);
INSERT 0 50000000
Time: 453985.318 ms (07:33.985)
pgbenchdb=# insert into test_seq_bigint select generate_series (1,50000000);
INSERT 0 50000000
Time: 352206.160 ms (05:52.206)
pgbenchdb=# insert into test_uuid_v4 (id) select? replace(uuid_generate_v4()::text,50000000);
INSERT 0 50000000
Time: 1159689.338 ms (00:19:19.689)
在无主键情况下,插入五千万数据,有序uuid耗时7分钟,序列耗时6分钟,而无序uuid耗时接近20分钟。
pgbenchdb=# alter table test_uuid_v4 add primary key ("id");
ALTER TABLE
Time: 845199.296 ms (14:05.199)
pgbenchdb=# alter table test_time_nextval add primary key ("id");
ALTER TABLE
Time: 932151.103 ms (15:32.151)
pgbenchdb=# alter table test_seq_bigint add primary key ("id");
ALTER TABLE
Time: 148138.871 ms (02:28.139)
pgbenchdb=# select pg_size_pretty(pg_total_relation_size(‘test_uuid_v4‘));
?pg_size_pretty
----------------
?6072 MB
(1 row)
Time: 0.861 ms
pgbenchdb=#? select pg_size_pretty(pg_total_relation_size(‘test_time_nextval‘));
?pg_size_pretty
----------------
?6072 MB
(1 row)
Time: 0.942 ms
pgbenchdb=#? select pg_size_pretty(pg_total_relation_size(‘test_seq_bigint‘));
?pg_size_pretty
----------------
?2800 MB
(1 row)
Time: 0.699 ms
```
插入5kw后
```sql
?????? 类别???? |? 第一次? | 第二次? | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
?无序uuid?? ??? ?? | 162 ??? ?| 163???? |? 163? |?? 163???? | 99.6% | 18.4? ?
?有序uuid?? ??? ?? | 738 ??? ?| 933???? |? 979? |?? 883???? | 97.7% | 3.9
?序列?????????? | 1132??? | 1264??? |? 1265 |? 1220???? | 96.8% | 3.5
```
插入1亿条数据后
```sql
?????? 类别???? |? 第一次? | 第二次? | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
?无序uuid?? ??? ?? | 121 ??? ?| 131???? |? 143? |?? 131???? | 99.6% | 28.2? ?
?有序uuid?? ??? ?? | 819 ??? ?| 795???? |? 888? |?? 834???? | 99.2% | 28.7
?序列?????????? | 1193??? | 1115??? |? 1109 |? 1139???? | 96.8% | 11.3
```
### 普通btree索引
上面测了无序uuid,1kw情况下,有主键的tps是260,无主键的tps是1234。尝试测试普通的索引,和唯一索引tps
```sql
--创建普通索引
pgbenchdb=# create index i_test_uuid_v4_id on test_uuid_v4(id);
CREATE INDEX
Time: 316367.010 ms (05:16.367)
--创建普通索引后
[[email?protected] thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 13308
latency average = 36.080 ms
tps = 221.727391 (including connections establishing)
tps = 221.749660 (excluding connections establishing)
script statistics:
?- statement latencies in milliseconds:
??????? 38.512? insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,‘‘));
--创建唯一索引
pgbenchdb=# drop index i_test_uuid_v4_id;
DROP INDEX
Time: 267.451 ms
pgbenchdb=# create unique index i_test_uuid_v4_id on test_uuid_v4(id);
CREATE INDEX
Time: 153372.622 ms (02:33.373)
[[email?protected] thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
^[[3~transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 13847
latency average = 34.693 ms
tps = 230.593988 (including connections establishing)
tps = 230.620469 (excluding connections establishing)
script statistics:
?- statement latencies in milliseconds:
??????? 36.410? insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,‘‘));
? ?
```
无论是普通btree索引和唯一索引,都会影响插入的效率。
### 删除所有的主键索引
```sql
--删除所有主键
alter table test_uuid_v4 drop constraint "test_uuid_v4_pkey";
alter table test_time_nextval drop constraint "test_time_nextval_pkey" ;
alter table test_seq_bigint drop constraint "test_seq_bigint_pkey";
1,--无序uuid:测试pgbench_uuid_v4.sql
[[email?protected] thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 74109
latency average = 6.479 ms
tps = 1234.842229 (including connections establishing)
tps = 1235.042674 (excluding connections establishing)
script statistics:
?- statement latencies in milliseconds:
???????? 6.112? insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,‘‘));
2、--有序uuid,测试pgbench_time_nextval.sql
[[email?protected] thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_time_nextval.sql -U sa pgbenchdb
transaction type: /opt/thunisoft/pgbench_time_nextval.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 74027
latency average = 6.486 ms
tps = 1233.364360 (including connections establishing)
tps = 1233.482292 (excluding connections establishing)
script statistics:
?- statement latencies in milliseconds:
???????? 6.186? insert into test_time_nextval (id) values (replace(uuid_time_nextval()::text,‘‘));
3、--序列,测试pgbench_seq_bigint.sql
[[email?protected] thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_seq_bigint.sql -U sa pgbenchdb transaction type: /opt/thunisoft/pgbench_seq_bigint.sqlscaling factor: 1query mode: preparednumber of clients: 8number of threads: 8duration: 60 snumber of transactions actually processed: 76312latency average = 6.290 mstps = 1271.832907 (including connections establishing)tps = 1272.124397 (excluding connections establishing)script statistics:?- statement latencies in milliseconds:???????? 5.916? insert into test_seq_bigint (id) values (nextval(‘test_seq‘::regclass));```删除主键约束后,三种情况下tps非常接近,都达到了1200+。### 不同基数下插入表的平均tps对比```sql?类别/平均tps??? |? 无数据? | 一千万? | 五千万 | 一亿 ?? ??? ?|---------------+---------+---------+---------+---------+?无序uuid?? ??? ?? | 960 ??? ?| 260???? |? 163? |?? 131???? |?有序uuid?? ??? ?? | 933 ??? ?| 870???? |? 883? |?? 834???? |?序列?????????? | 1289??? | 1278??? |? 1220 |? 1139???? |```根据测试数据可以看出无序的uuid在数据到达1kw后插入数据的tps下降的非常厉害,而有序的uuid和递增序列下降的比较少。到一亿数据的tps有序uuid是无序的6倍,序列是无序uuid的9倍。## 创建单独的表空间用来存储索引信息如果有多快磁盘那么可以将索引和数据分开存储,以此来加快写入的速度。创建单独的索引空间:create tablespace indx_test owner sa location ‘/home/tablespace/index_test‘;指定索引存储目录:create index i_test_uuid_v4_id on test_uuid_v4 using btree(id) tablespace indx_test;## 关于有序uuid测试使用的sequential-uuids插件,生成的有序uuid。有序uuid的结构为(block ID; random data),实际上就是把数据拆成两部分,一部分自增,一部分随机。[sequential-uuids](https://blog.2ndquadrant.com/sequential-uuid-generators/)[git](https://github.com/tvondra/sequential-uuids)提供了两种算法:1.uuid_sequence_nextval(sequence regclass,block_size int default 65536,block_count int default 65536)前缀为自增序列,如果块ID使用2字节存储,一个索引BLOCK里面可以存储256条记录(假设8K的BLOCK,一条记录包括uuid VALUE(16字节)以及ctid(6字节),所以一个索引页约存储363条记录(8000 /(16 + 6)))2.uuid_time_nextval(interval_length int default 60,interval_count int default 65536) RETURNS uuid ?默认每60秒内的数据的前缀是一样的,前缀递增1,到65535后循环。```sql使用uuid_time_nextval生成的有序uuidpgbenchdb=# select id from test_time_nextval;??????????????? id?????????????? ?----------------------------------?a18b7dd0ca92b0b5c1844a402f9c6999?a18b540b8bbe0ddb2b6d0189b2e393c6?a18b83eb7320b0a90e625185421e065e?a18bade4ff15e05dab81ecd3f4c2dee4?a18b79e41c3bc8d2d4ba4b70447e6b29?a18bdad18d9e0d2fa1d9d675bc7129f0?a18b13723ec7be9a2f1a3aec5345a88b?a18bd9d866047aec69a064d30e9493d2?a18bd76e8c787c7464479502f381e6d7?a18ba5c0c966f81cfdbeff866618da8d......```有序uuid前四位有序,后面的随机生成。## 结语1.关于有序的uuid,前4位是有序的,后面都是随机生成的。2.在该环境中发现,无序uuid随着数据量的不断增大,tps下滑比较厉害。3.由于btree索引的存在,无序的uuid会导致大量的离散io。导致磁盘使用率高。进而影响插入效率。随着表数据量的增大更加明显。4.该测试是在普通的磁盘上面测试,并未在ssd上面测试。5.如果要使用有序uuid,有多种实现方式,还需要考虑分布式情况下生成全局有序uuid。