ASH结合AWR实战解决oracle高负载
线上oracle数据库负载比较高,所以需要优化,一般进去看负载高的进程,如果是oracle进程,那么就是oracle运行不健康出异常了,我们通常可以用ash来分析问题,如果ash还无法解决,我们就需要扩大分析范围,需要采用awr分析日志来进行分析。
1、ASH分析日志分析Events
Top SQL with Top Events
|
SQL ID
Planhash
Sampled # of Executions
% Activity
Event
% Event
Top Row Source
% RwSrc
SQL Text
2xg4n5sj1sx5h
2047223008
5066
11.18
latch: cache buffers chains
5.88
TABLE ACCESS - FULL
2.34
select * from (select a.bis_pr...
CPU + Wait for CPU
5.24
TABLE ACCESS - FULL
2.35
19gkjf4r3zkpw
2939541073
1013
2.23
latch: cache buffers chains
1.27
TABLE ACCESS - FULL
1.00
select round(sum(t.current_fac...
6qhrc4ax6zr1a
2939541073
960
2.12
latch: cache buffers chains
1.21
TABLE ACCESS - FULL
0.96
select round(sum(t.current_fac...
6nmdjjxfv7aqp
2939541073
847
1.87
latch: cache buffers chains
1.03
TABLE ACCESS - FULL
0.78
select round(sum(t.current_fac...
1qq2h1qkxkaha
2939541073
776
1.71
latch: cache buffers chains
1.02
TABLE ACCESS - FULL
0.83
select round(sum(t.current_fac...
Top SQL with Top Row Sources
SQL ID
PlanHash
Sampled # of Executions
% Activity
Row Source
% RwSrc
Top Event
% Event
SQL Text
2xg4n5sj1sx5h
2047223008
5066
11.18
TABLE ACCESS - FULL
4.70
CPU + Wait for CPU
2.35
select * from (select a.bis_pr...
TABLE ACCESS - FULL
3.90
latch: cache buffers chains
2.30
TABLE ACCESS - FULL
1.13
latch: cache buffers chains
0.66
19gkjf4r3zkpw
2939541073
1013
2.23
TABLE ACCESS - FULL
1.73
latch: cache buffers chains
1.00
select round(sum(t.current_fac...
6qhrc4ax6zr1a
2939541073
960
2.12
TABLE ACCESS - FULL
1.59
latch: cache buffers chains
0.96
select round(sum(t.current_fac...
6nmdjjxfv7aqp
2939541073
847
1.87
TABLE ACCESS - FULL
1.39
latch: cache buffers chains
0.78
select round(sum(t.current_fac...
1qq2h1qkxkaha
2939541073
776
1.71
TABLE ACCESS - FULL
1.34
latch: cache buffers chains
0.83
select round(sum(t.current_fac...
从这里分析,可以看到2xg4n5sj1sx5h的SQL引发了性能瓶颈,因为短短的20分钟内执行了5066次,而且诱发了latch: cache buffers chains以及TABLE ACCESS – FULL全表扫描,所以针对这个sql,分析后,建立优化索引:
createindex IDX_PROJECT_ID_BRRQon BIS_REPORT_RENT_QUICK(BIS_PROJECT_ID,SEQUNCE_NO,YEAR,MONTH);
之后继续观察cpu负载,已经降低了一半,但是还不够,还没有达到正常值,而看ash报告已经不能提供帮助了。接下来我们需要更加全面的awr报告。
2、Awr看以下个主要方向
(1)SQL ordered by Sharable Memory
(2)SQL ordered by CPU Tim
3、先看SQL ordered by Sharable Memory
- Only Statements with Sharable Memory greater than 1048576 are displayed
Sharable Mem (b)
Executions
% Total
SQL Id
SQL Module
SQL Text
5,890,119
19
0.04
8jfndnfr9hct8
select * from ( select * from ...
4,980,367
3
0.04
3gnf4zamgyb7x
select * from ( select * from ...
3,816,036
8
0.03
277rt4gknr76h
update POWERDESK.cont_ledger s...
2,966,255
21
0.02
f2b3a59mdr7wb
select * from ( select * from ...
2,694,091
12
0.02
a8y6k4gt09xxn
select * from ( select * from ...
2,316,487
21
0.02
amyymj7a06gum
select * from ( select * from ...
2,003,767
18
0.01
7z4hxbs26uz8d
select * from ( select * from ...
2,743
6
0.01
96264xfwmxcm7
select * from ( select * from ...
1,979,443
12
0.01
a8y6k4gt09xxn
select * from ( select * from ...
1,626,167
18
0.01
7z4hxbs26uz8d
select * from ( select * from ...
1,354,011
9
0.01
g371ncnjwg9zx
select * from ( select * from ...
1,325,857
1,827
0.01
fm2hxa2nb9gzx
select count(*) from ( select ...
1,303,125
192
0.01
dq8xug6mzj7dj
select count(*) from (select r...
1,277,136
11
0.01
1jubuhx6cdxw8
select sum(rent_square) from( ...
1,147,400
4
0.01
6gj7zqb64m29q
select count(*) from ( select ...
1,135,232
4
0.01
6gj7zqb64m29q
select count(*) from ( select ...
1,126,672
1
0.01
242g0k9qdh0q0
select sum(rent_square) from( ...
1,064,607
3,723
0.01
7sum7dh4hcmbh
select * from ( select jbpmtas...
看到有fm2hxa2nb9gzx这条sql记录,执行次数比较多,有优化的空间:
SELECT
COUNT(*)
FROM
(SELECT
res.*,
1 rec_status
FROM
Res_Approve_Info res
WHEREEXISTS
(SELECT
1
FROM
Res_Approve_User u
WHERE u.res_Approve_Info_Id = res.res_Approve_Info_Id
AND u.user_Cd = :1)
UNION
SELECT
res.*,
9 rec_status
FROM
Res_Approve_Info res
WHEREEXISTS
(SELECT
1
FROM
Res_Approve_User u,
Res_Accredit_Info a
WHERE u.res_Approve_Info_Id = res.res_Approve_Info_Id
AND U.USER_CD = A.USER_CD
AND A.ACC_USER_CD = :2)) res
WHERE1 = 1
AND res.status_cd = '1'
解决办法,添加索引:
createindex IXU_RES_USER_CD_2 on RES_APPROVE_USER(USER_CD);
4、再看SQL ordered by CPU Time
- Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
- %Total - CPU Time as a percentage of Total DB CPU
- %CPU - CPU Time as a percentage of Elapsed Time
- %IO - User I/O Time as a percentage of Elapsed Time
- Captured SQL account for 23.1% of Total CPU Time (s): 11,858
- Captured PL/SQL account for 0.0% of Total CPU Time (s): 11,858
CPU Time (s)
Executions
CPU per Exec (s)
%Total
Elapsed Time (s)
%CPU
%IO
SQL Id
SQL Module
SQL Text
146.67
11,055
0.01
1.24
192.16
76.33
0.00
4198t22zum3dr
select t1.module_cd,count(t1....
140.39
65
2.16
1.18
300.45
46.73
0.00
a8s9kjk7mk6yc
select nvl(sum(nvl(sales_money...
133.99
63
2.13
1.13
365.39
36.67
0.00
15uv0hbq9dmm7
select nvl(sum(nvl(sales_money...
130.97
65
2.01
1.10
286.23
45.75
0.00
bhnjxfa4av3kw
select nvl(sum(nvl(sales_money...
127.93
4,215
0.03
1.08
157.88
81.03
0.00
aajc7r9y7z1qs
select * from ( select this_.d...
124.72
63
1.98
1.05
317.81
39.24
0.00
1na6hypkn1q2f
select nvl(sum(nvl(sales_money...
112.56
347
0.32
0.95
444.87
25.30
0.00
cpw6nx6gdv937
select sum(rent_square) from( ...
110.81
19,013
0.01
0.93
156.71
70.71
0.00
ckkvz3r38xa4k
select * from ( select kmpush0...
103.00
19,013
0.01
0.87
136.82
75.28
0.00
8n23w37kwdn3p
select count(*) as col_0_0_ fr...
97.71
7,400
0.01
0.82
130.32
74.98
0.00
f1z0k5gx90tqz
select (case when t1.module_cd...
依据以上awr统计分析cpu time(s)得出的结论,点击进去,找出sql记录,并给出优化方案如下:
属于sql写法不规范的优化:
4198t22zum3dr
select t1.module_cd,count(t1.jbpm_task_id) num from jbpm_task t1,jbpm_task_candidate t2 where t1.jbpm_task_id=t2.jbpm_task_id and (t2.user_cd=:1 or t2.user_cd like :2 ) group by t1.module_cd order by decode(t1.module_cd,'mesMeetingInfo','0','resApprove','1','planTarget','2') asc
优化建议:
将order by后面的decode(t1.module_cd,'2') asc移到from前面,在外层做order by操作
bhnjxfa4av3kw
select nvl(sum(nvl(sales_money,0)),0) from bis_sales_day where to_char(sales_date,'yyyy-MM')='2016-06' and bis_cont_id in (select bis_cont_id from bis_cont bc where cont_type_cd in ('1','2') and ((bc.status_cd = '2' and '2016-06' >= to_char(bc.cont_start_date,'yyyy-MM') and to_char(bc.cont_to_fail_date,'yyyy-MM')>='2016-06') or (bc.status_cd in ('1','3') and '2016-06' >= to_char(bc.cont_start_date,'yyyy-MM') and to_char(bc.cont_end_date,'yyyy-MM')>='2016-06')) and bc.bis_project_id in (select bis_project_id from bis_project where is_business_project ='1' ) and bc.store_type='1' )
优化建议:
将where后面 in的子查询判断字句变成 exists判断。
15uv0hbq9dmm7
select nvl(sum(nvl(sales_money,'yyyy-MM')='2016-07' and bis_cont_id in (select bis_cont_id from bis_cont bc where cont_type_cd in ('1','2') and ((bc.status_cd = '2' and '2016-07' >= to_char(bc.cont_start_date,'yyyy-MM')>='2016-07') or (bc.status_cd in ('1','3') and '2016-07' >= to_char(bc.cont_start_date,'yyyy-MM')>='2016-07')) and bc.bis_project_id in (select bis_project_id from bis_project where is_business_project ='1' ) and bc.store_type in ('1','2') )
优化建议:
将where后面 in的子查询判断字句变成 exists判断。
bhnjxfa4av3kw
select nvl(sum(nvl(sales_money,'yyyy-MM')>='2016-06')) and bc.bis_project_id in (select bis_project_id from bis_project where is_business_project ='1' ) and bc.store_type='1' )
优化建议:
将where后面 in的子查询判断字句变成 exists判断。
属于sql中添加索引的优化:
aajc7r9y7z1qs
select * from ( select this_.dly_note_id as dly1_396_0_,this_.created_center_cd as created2_396_0_,this_.created_date as created3_396_0_,this_.created_dept_cd as created4_396_0_,this_.created_position_cd as created5_396_0_,this_.creator as creator396_0_,this_.note_biz_type_cd as note7_396_0_,this_.note_content as note8_396_0_,this_.note_title as note9_396_0_,this_.record_version as record10_396_0_,this_.remark as remark396_0_,this_.send_mail_flg as send12_396_0_,this_.seq as seq396_0_,this_.tip_end_date as tip14_396_0_,this_.tip_flg as tip15_396_0_,this_.tip_period_cd as tip16_396_0_,this_.tip_start_date as tip17_396_0_,this_.tip_type_cd as tip18_396_0_,this_.updated_center_cd as updated19_396_0_,this_.updated_date as updated20_396_0_,this_.updated_dept_cd as updated21_396_0_,this_.updated_position_cd as updated22_396_0_,this_.updator as updator396_0_ from POWERDESK.dly_note this_ where this_.creator=:1 order by this_.seq asc ) where rownum <= :2
优化方案,添加索引:createindex IDX_CREATOR on DLY_NOTE (creator);
在添加好IDX_CREATOR、IXU_RES_USER_CD_2 这2个索引后,cpu负载已经再次降低了一半,虽然没有到正常值,但是已经接近最低级别的告警线,暂时数据库脱离危险了,就等其它的几个需要修改sql的优化继续完成后,cpu负载基本就可以恢复到正常值了。
所以,一般oracle服务器负载飙升,90%都是由不规范的sql、性能差的sql造成的,而ash和awr分析报告是我们解决的最有效的途径。
(编辑:李大同)
【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容!