
How to troubleshoot a Nagios check_oracle_rman_backup_problems alert

Published: 2020-12-12 14:03:06 | Category: Encyclopedia | Source: compiled from the web

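I am not an Oracle DBA and I don't know Oracle. But when the alert fired, the ops team refused to touch it and said it was DBA work. In their eyes MySQL, Oracle, Sybase, Redis and MongoDB are all the DBA's job and none of their business...
1. Open nrpe.cfg, find the check_oracle_rman_backup_problems check, and run the command it defines.
cat /usr/local/nagios/etc/nrpe.cfg
![](http://i2.51cto.com/images/blog/201803/09/6ac77908871d3a4587a289d7f718f8a4.png?x-oss-process=image/watermark,size_16,text_QDUxQ1RP5Y2a5a6i,color_FFFFFF,t_100,g_se,x_10,y_10,shadow_90,type_ZmFuZ3poZW5naGVpdGk=)
2. The check is performed by the check_oracle_health script (written in Perl), so open it and see how it obtains the value it monitors.
Searching for rman-backup-problems leads to an entry in the @mode array.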
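The two lookups above can be sketched as small shell helpers. This is only an illustration: the function names `nrpe_command` and `plugin_mode_lines` are made up here, and the location of the check_oracle_health script on your host is an assumption you would need to fill in yourself.

```shell
#!/bin/sh
# Hypothetical helpers for steps 1 and 2; names are illustrative only.

# Print the active (non-commented) nrpe.cfg definition of one check.
nrpe_command() {
    cfg=$1
    name=$2
    # Drop comment lines first, then match the literal "command[<name>]=" prefix.
    # -F makes grep treat the brackets as plain characters.
    grep -v '^[[:space:]]*#' "$cfg" | grep -F "command[$name]="
}

# Print the lines of the plugin script that mention a mode name,
# prefixed with their line numbers.
plugin_mode_lines() {
    plugin=$1
    mode=$2
    grep -n -F -e "$mode" "$plugin"
}
```

For example, `nrpe_command /usr/local/nagios/etc/nrpe.cfg check_oracle_rman_backup_problems` prints the command line that NRPE runs for this check, and `plugin_mode_lines <path to check_oracle_health> rman-backup-problems` shows where the mode is defined.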

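There you will also find the code below. The SQL in it is what we are ultimately looking for: it is the query that monitors the RMAN backup status.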
# Fragment 1: count the backups of the last 3 days that neither completed nor are still running
elsif ($params{mode} =~ /server::instance::rman::backup::problems/) {
    $self->{rman_backup_problems} = $self->{handle}->fetchrow_array(q{
        SELECT COUNT(*) FROM v$rman_status
        WHERE operation = 'BACKUP'
          AND status != 'COMPLETED'
          AND status != 'RUNNING'
          AND start_time > sysdate-3
    });
}
# Fragment 2: report that count to Nagios (same mode test as above, so presumably
# taken from a separate elsif chain elsewhere in the script)
elsif ($params{mode} =~ /server::instance::rman::backup::problems/) {
    $self->add_nagios(
        # 1 and 2 are the default warning and critical thresholds
        $self->check_thresholds($self->{rman_backup_problems}, 1, 2),
        sprintf "rman had %d problems during the last 3 days",
        $self->{rman_backup_problems});
    $self->add_perfdata(sprintf "rman_backup_problems=%d;%d;%d",
        $self->{rman_backup_problems},
        $self->{warningrange}, $self->{criticalrange});
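To make the threshold call easier to follow, here is a rough sketch of how the problem count might map to a Nagios state. It rests on an assumption not confirmed by the excerpt above: that a bare numeric threshold N raises its level only when the count is greater than N, and that nrpe.cfg does not override the defaults 1 and 2. The function name `rman_state` is hypothetical.

```shell
#!/bin/sh
# Map a problem count to a Nagios state name.
# ASSUMPTION: a plain numeric threshold N triggers when count > N.
rman_state() {
    count=$1
    warn=${2:-1}   # default warning threshold from the check_thresholds call
    crit=${3:-2}   # default critical threshold from the check_thresholds call
    if [ "$count" -gt "$crit" ]; then
        echo CRITICAL
    elif [ "$count" -gt "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}
```

Under that assumption a single failed backup in the last 3 days would still report OK, two would report WARNING, and three or more CRITICAL. If your nrpe.cfg passes its own warning or critical values, those take the place of the defaults.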
So the alert is caused by the RMAN backup. Running that SQL and checking the backup log turned up the following error:
Deleting the following obsolete backups and copies:
Type Key Completion Time Filename/Handle

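Control File Copy 69 2017-12-20 11:22:41 /data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of delete command on ORA_DISK_1 channel at 03/06/2018 01:15:28
ORA-19606: Cannot copy or restore to snapshot control file

Once you know the error, the fix is easy. A web search boils down to the following RMAN commands. As I understand them, they point the snapshot control file at a temporary name so that the stale control file copy record can be cross-checked and deleted, then restore the setting:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f_bak';
crosscheck controlfilecopy '/data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f';
delete expired controlfilecopy '/data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f';
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f';
CONFIGURE SNAPSHOT CONTROLFILE NAME clear;

Summary: you need to be able to read object-oriented Perl here. A package xxx declaration plays the role of a class declaration, and the new function is what is usually called the constructor. Not knowing something is nothing to be afraid of, because you can go and learn it. I picked up a bit of Perl along the way, so it was worth the effort.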
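For the log-checking step, a small helper like the one below can pull the distinct error codes out of an RMAN backup log so you know what to search for. It is a sketch only: the function name `rman_error_codes` is made up, the log path is whatever your backup job writes to, and it relies on `grep -o`, which GNU and BSD grep provide.

```shell
#!/bin/sh
# List the distinct RMAN-/ORA- error codes that appear in a backup log.
rman_error_codes() {
    log=$1
    # Extract every code of the form RMAN-nnnnn or ORA-nnnnn,
    # skip the two banner codes that only frame the error stack,
    # and print each remaining code once.
    grep -oE '(RMAN|ORA)-[0-9]{5}' "$log" \
        | grep -vE '^RMAN-005(69|71)$' \
        | sort -u
}
```

Run against the log excerpt above, this leaves just the two codes that matter, RMAN-03009 and ORA-19606.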

(Editor: 李大同)
