linux – High number of TIME_WAIT connections in HAProxy

Published: 2020-12-14 01:19:03 Category: Linux Source: compiled from the web


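We have haproxy 1.3.26 installed on a CentOS 5.9 machine with a 2.13 GHz Intel Xeon processor. It acts as an HTTP and TCP load balancer for numerous services and serves a peak throughput of ~2000 requests/second. It has been running for 2 years, but both the traffic and the number of services have gradually increased.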

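We observed that even after a reload, the old haproxy process still exists. On further investigation, we found that the old process has numerous connections in the TIME_WAIT state. We also saw that netstat and lsof were taking a long time to run. After reading http://agiletesting.blogspot.in/2013/07/the-mystery-of-stale-haproxy-processes.html we introduced option forceclose, but it interfered with various monitoring services, so we reverted it. On digging further, we realized that /proc/net/sockstat reports close to 200K sockets in the tw (TIME_WAIT) state, which is surprising because in /etc/haproxy/haproxy.cfg maxconn is set to 31000 and ulimit-n to 64000. We had timeout server and timeout client set to 300s and changed them to 30s, but that did not help much.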

Our questions now are:

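>Is such a high number of TIME_WAIT connections acceptable? If so, at what number should we start to worry? Judging by "What is the cost of many TIME_WAIT on the server side?" and "Setting TIME_WAIT TCP", it seems there should be no problem.
>How can we reduce the number of TIME_WAIT connections?
>Are there alternatives to netstat and lsof that perform well even when the number of TIME_WAIT connections is very high?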

Solution

Note: the quotes in this answer are all taken from a mail by Willy Tarreau (the main author of HAProxy) to the HAProxy mailing list.

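Connections in the TIME_WAIT state are harmless and consume almost no resources. The kernel on the server keeps them around for a while to handle the rare case in which it still receives packets for a connection after it has been closed. The time a closed connection stays in that state is usually cited as 120 seconds (2 times the maximum segment lifetime); Linux uses a fixed 60 seconds.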

TIME_WAIT are harmless on the server side. You can easily reach millions
without any issues.

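If you still want to reduce that number so that connections are released sooner, you can instruct the kernel to do so. For example, to set it to 30 seconds, run:

echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout

Note that the Linux kernel documentation describes tcp_fin_timeout as the timeout for orphaned connections in FIN_WAIT_2, so it may have little effect on the TIME_WAIT count itself.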

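If you have a lot of connections (whether or not they are in TIME_WAIT), netstat, lsof and ipcs perform very poorly and can actually slow down the whole system. To quote Willy again: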

There are two commands that you must absolutely never use in a monitoring
system :

netstat -a

ipcs -a

Both of them will saturate the system and considerably slow it down when
something starts to go wrong. For the sockets you should use what’s in
/proc/net/sockstat. You have all the numbers you want. If you need more
details, use ss -a instead of netstat -a, it uses the netlink interface
and is several orders of magnitude faster.
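
As a minimal sketch of the /proc/net/sockstat approach from the quote, the function below pulls out the tw counter. The function name tw_count and the optional file argument are illustrative assumptions, not part of any tool; the parsing assumes the usual "TCP: inuse N orphan N tw N ..." key/value layout of that file.

```shell
# Print the TIME_WAIT ("tw") counter from a sockstat-format file.
# The file defaults to /proc/net/sockstat; tw_count is a hypothetical helper name.
tw_count() {
    # On the "TCP:" line, fields come in key/value pairs after the label,
    # so walk the keys (fields 2, 4, 6, ...) and print the value after "tw".
    awk '$1 == "TCP:" {
        for (i = 2; i < NF; i += 2)
            if ($i == "tw") print $(i + 1)
    }' "${1:-/proc/net/sockstat}"
}
```

Calling tw_count with no argument reads the live kernel counters. Since this only reads one small file, it should stay cheap even with hundreds of thousands of sockets, which makes it a reasonable fit for a monitoring check.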

On Debian and Ubuntu systems, ss is available in the iproute or iproute2 package (depending on the version of your distribution).
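
When more detail than the sockstat totals is needed, ss output can be summarised per TCP state. The following is a rough sketch, assuming the output of ss -tan has a header line followed by rows whose first column is the state; count_by_state is a hypothetical helper name.

```shell
# Count sockets per TCP state from `ss -tan` output read on stdin.
# Assumes line 1 is a header and column 1 of each row is the state.
count_by_state() {
    awk 'NR > 1 { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}
```

A typical invocation would be ss -tan | count_by_state. If only the totals are needed, ss -s prints a summary that includes a timewait figure without listing every socket.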

(Editor: 李大同)
