linux – Elastic Beanstalk log rotation causes Apache to restart
I have already gone through this related question:
AWS Elastic Beanstalk – Apache is restarting constantly
Our Elastic Beanstalk instances are reporting the following messages in error_log:

    [Mon Jun 26 22:01:01.878892 2017] [mpm_prefork:notice] [pid 8595] AH00173: SIGHUP received. Attempting to restart
    *** Error in (wsgi:wsgi) ': double free or corruption (out): 0x00007f564cced560 ***

Sometimes the error sequence looks more like this:

    [Tue Jun 27 00:01:01.215260 2017] [:error] [pid 6429] [remote XX.XXX.XX.195:29773] mod_wsgi (pid=6429): Exception occurred processing WSGI script '/opt/python/current/app/site/settings/wsgi/__init__.py'.
    [Tue Jun 27 00:01:01.215320 2017] [:error] [pid 6429] [remote XX.XXX.XX.195:29773] OSError: failed to write data
    [Tue Jun 27 00:01:01.222407 2017] [:error] [pid 6430] [remote XX.XXX.XX.60:53313] mod_wsgi (pid=6430): Exception occurred processing WSGI script '/opt/python/current/app/site/settings/wsgi/__init__.py'.
    [Tue Jun 27 00:01:01.222460 2017] [:error] [pid 6430] [remote XX.XXX.XX.60:53313] OSError: failed to write data
    [Tue Jun 27 00:01:04.554810 2017] [core:warn] [pid 8595] AH00045: child process 7614 still did not exit, sending a SIGTERM
    [Tue Jun 27 00:01:04.554850 2017] [core:warn] [pid 8595] AH00045: child process 7615 still did not exit, sending a SIGTERM
    [Tue Jun 27 00:01:05.555958 2017] [mpm_prefork:notice] [pid 8595] AH00173: SIGHUP received. Attempting to restart
    *** Error in (wsgi:wsgi) ': double free or corruption (out): 0x00007f5640cae900 ***
    *** Error in (wsgi:wsgi) ': double free or corruption (out): 0x00007f78649b7970 ***

This happens almost every hour. The common message is:

    [Mon Jun 26 22:01:01.878892 2017] [mpm_prefork:notice] [pid 8595] AH00173: SIGHUP received. Attempting to restart

I looked for an mpm_prefork module conf block, and there isn't one, so all the defaults are in use.

I looked up the logrotate configuration that Elastic Beanstalk runs:

    /var/log/httpd/* {
        size 10M
        missingok
        notifempty
        rotate 5
        sharedscripts
        compress
        dateext
        dateformat -%s
        create
        postrotate
            /sbin/service httpd reload > /dev/null 2>/dev/null || true
        endscript
        olddir /var/log/httpd/rotated
    }

Pretty standard stuff. My understanding of reload is that it attempts a graceful restart.

I can trigger the error message manually by running sudo apachectl -k restart, though I cannot find where that would be run during log rotation.

We have downstream services that seem to throw exceptions when this server hangs up all of its connections.
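To check whether the restarts really cluster on one minute of the hour, the AH00173 notices can be pulled out of error_log and grouped. This is only a minimal sketch, assuming the Apache 2.4 log line format shown above; the function names `sighup_times` and `minute_histogram` are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Matches Apache 2.4 error_log lines of the form shown above, e.g.
# [Mon Jun 26 22:01:01.878892 2017] [mpm_prefork:notice] [pid 8595] AH00173: SIGHUP received. ...
SIGHUP_RE = re.compile(
    r"^\[(\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2})\.\d+ (\d{4})\] .*AH00173: SIGHUP received"
)


def sighup_times(lines):
    """Return a datetime for every AH00173 (SIGHUP received) notice in lines."""
    times = []
    for line in lines:
        match = SIGHUP_RE.match(line)
        if match:
            # Drop the microseconds and re-join the year so strptime can parse it.
            stamp = f"{match.group(1)} {match.group(2)}"
            times.append(datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"))
    return times


def minute_histogram(times):
    """Count restarts by minute-of-hour (hypothetical helper)."""
    return Counter(t.minute for t in times)
```

Feeding it the lines of the instance's error_log and printing `minute_histogram(sighup_times(lines))` would show a single dominant minute if an hourly job is the trigger, which is what the 22:01:01 and 00:01:0x timestamps above suggest.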
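So my question is: what else could be causing the SIGHUP in mpm_prefork during logrotate? As far as I can tell, this should not happen outside of an error condition.

Apache/2.4.18 (Amazon) mod_wsgi/3.5 Python/3.4.3

Solution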
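In short, it looks like the current Elastic Beanstalk log rotation configuration is broken and causes service downtime in the form of 504 Gateway Timeout responses. Let's take a look.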
Reproduction

Let's create the simplest possible Python WSGI application.

application.py

    import time


    def application(environ, start_response):
        # somewhat realistic response duration
        time.sleep(0.5)

        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        start_response(status, response_headers)
        return [b'Hello world!\n']

Zip it into application.zip. Then create an Elastic Beanstalk Python application and environment, and upload the archive. Make sure you use a key pair that you own. Leave the other settings at their defaults. Wait until it finishes (a few minutes).

SSH into the underlying EC2 instance (see the instance identifier in the EB logs). Run the following (it is httpd's logrotate post-action, see below):

    sudo /sbin/service httpd reload

Then, on your own machine, run:

    siege -v -b -c 10 -t 10S http://your-test-eb.you-aws-region.elasticbeanstalk.com/

While it is running, repeat the reload command a few times.

You will then see something like this:

    ** SIEGE 3.0.8
    ** Preparing 10 concurrent users for battle.
    The server is now under siege...
    HTTP/1.1 200   0.63 secs:      13 bytes ==> GET  /
    HTTP/1.1 200   0.65 secs:      13 bytes ==> GET  /
    HTTP/1.1 200   0.64 secs:      13 bytes ==> GET  /
    HTTP/1.1 200   0.60 secs:      13 bytes ==> GET  /
    ...

This is what happens on reload:

    HTTP/1.1 504   0.06 secs:       0 bytes ==> GET  /
    HTTP/1.1 504   0.07 secs:       0 bytes ==> GET  /
    HTTP/1.1 504   0.08 secs:       0 bytes ==> GET  /
    HTTP/1.1 504   0.10 secs:       0 bytes ==> GET  /
    HTTP/1.1 504   0.11 secs:       0 bytes ==> GET  /
    HTTP/1.1 504   0.66 secs:       0 bytes ==> GET  /
    HTTP/1.1 504   0.19 secs:       0 bytes ==> GET  /
    HTTP/1.1 504   0.20 secs:       0 bytes ==> GET  /
    HTTP/1.1 504   0.09 secs:       0 bytes ==> GET  /

Then it recovers:

    HTTP/1.1 200   1.25 secs:      13 bytes ==> GET  /
    HTTP/1.1 200   1.24 secs:      13 bytes ==> GET  /
    HTTP/1.1 200   1.26 secs:      13 bytes ==> GET  /
    ...
    Lifting the server siege..      done.

    Transactions:                  75 hits
    Availability:               81.52 %
    Elapsed time:                9.40 secs
    Data transferred:            0.00 MB
    Response time:               1.21 secs
    Transaction rate:            7.98 trans/sec
    Throughput:                  0.00 MB/sec
    Concurrency:                 9.68
    Successful transactions:       75
    Failed transactions:           17
    Longest transaction:         4.27
    Shortest transaction:        0.06

Note that the ELB does not seem to have any effect on the issue. The same result can be reproduced with two SSH sessions to the underlying EC2 instance, using ab instead (the Amazon AMI does not ship siege):

    ab -v 4 -c 10 -t 10 http://your-test-eb.you-aws-region.elasticbeanstalk.com/

Cause

/etc/cron.hourly/cron.logrotate.elasticbeanstalk.httpd.conf

    #!/bin/sh
    test -x /usr/sbin/logrotate || exit 0
    /usr/sbin/logrotate /etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.httpd.conf

/etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.httpd.conf

    /var/log/httpd/* {
        size 10M
        missingok
        notifempty
        rotate 5
        sharedscripts
        compress
        dateext
        dateformat -%s
        create
        postrotate
            /sbin/service httpd reload > /dev/null 2>/dev/null || true
        endscript
        olddir /var/log/httpd/rotated
    }

Note the postrotate section. /sbin/service is just a System V wrapper for the scripts in /etc/init.d/, as its man page describes.
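Note that reload is not a standard Apache maintenance command. It is a downstream addition by the distribution. Let's look at the init script /etc/init.d/httpd. The relevant part is:

    reload() {
        echo -n $"Reloading $prog: "
        check13 || exit 1
        killproc -p ${pidfile} $httpd -HUP
        RETVAL=$?
        echo
    }

As you can see, it sends the HUP signal to Apache, which Apache interprets as Restart Now: the parent kills off its children as it would on TERM, but does not exit itself. It re-reads its configuration, re-opens its log files, and spawns a new set of children.

Killing the children as with TERM explains the 504s well. What reload probably should do instead is a Graceful Restart, which also re-opens the logs but does not terminate requests that are being served: the children are advised to exit only after finishing their current request.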
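The difference between the two restarts comes down to which signal reaches the Apache parent process. Here is a minimal Python sketch of that mapping, assuming a Unix host; the helper names `signal_for`, `interrupts_requests` and `send_to_apache` are hypothetical, and the pidfile location is left to the caller because it depends on the local configuration.

```python
import os
import signal

# Signals that Apache httpd documents for each `apachectl -k` action.
ACTION_SIGNALS = {
    "stop": signal.SIGTERM,            # children are killed at once, parent exits
    "restart": signal.SIGHUP,          # children are killed at once, parent reloads
    "graceful": signal.SIGUSR1,        # children finish their current request first
    "graceful-stop": signal.SIGWINCH,  # like graceful, but the parent then exits
}


def signal_for(action):
    """Return the signal for an Apache control action, or raise ValueError."""
    try:
        return ACTION_SIGNALS[action]
    except KeyError:
        raise ValueError(f"unknown action: {action!r}") from None


def interrupts_requests(action):
    """True if the action terminates requests that are still being served."""
    return signal_for(action) in (signal.SIGTERM, signal.SIGHUP)


def send_to_apache(action, pidfile):
    """Send the action's signal to the parent PID stored in pidfile."""
    with open(pidfile) as handle:
        pid = int(handle.read().strip())
    os.kill(pid, signal_for(action))
```

In these terms the init script's reload() is equivalent to `send_to_apache("restart", pidfile)`, while a rotation that must not drop in-flight requests would want `send_to_apache("graceful", pidfile)`, with `pidfile` set to the same path the init script uses as `${pidfile}`.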
Workaround

You can override the log rotation configuration with an Elastic Beanstalk configuration file (the files: key below is .ebextensions syntax):

    files:
      "/etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.httpd.conf":
        mode: "000644"
        owner: root
        group: root
        content: |
          /var/log/httpd/* {
            size 10M
            missingok
            notifempty
            rotate 5
            sharedscripts
            compress
            dateext
            dateformat -%s
            create
            postrotate
              /sbin/service httpd graceful > /dev/null 2>/dev/null || true
            endscript
            olddir /var/log/httpd/rotated
          }

Then redeploy your Elastic Beanstalk environment. Note, however, that with graceful restarts issued less than a second apart I was able to (sporadically) produce 503 Service Unavailable. This is not a concern for log rotation, because evenly spaced graceful restarts produced no errors.
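Before and after deploying an override, it can be useful to confirm which httpd action a logrotate configuration actually runs. The following is only a rough sketch: the function names are hypothetical, and the regular expression covers just the `/sbin/service httpd <action>` form that appears in the configuration above.

```python
import re

# Captures the command prefix and the action word that follows it.
SERVICE_RE = re.compile(r"(/sbin/service\s+httpd\s+)(\w+)")


def postrotate_actions(conf_text):
    """Return the httpd service actions found inside postrotate ... endscript."""
    actions = []
    in_block = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line == "postrotate":
            in_block = True
        elif line == "endscript":
            in_block = False
        elif in_block:
            actions.extend(m.group(2) for m in SERVICE_RE.finditer(line))
    return actions


def use_graceful(conf_text):
    """Rewrite every 'service httpd reload' in the text to 'service httpd graceful'."""
    def swap(match):
        action = "graceful" if match.group(2) == "reload" else match.group(2)
        return match.group(1) + action

    return SERVICE_RE.sub(swap, conf_text)
```

Calling `postrotate_actions()` on the contents of the hourly httpd logrotate file should return `["reload"]` for the stock configuration and `["graceful"]` once the override has been deployed.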