Broken pipe error on Rados gateway log

Nghia Viet Tran <Nghia.Viet.Tran@xxxxxxxxxx> · Wed, 4 Aug 2021 13:45:11 +0000

Hi everyone,

Recently I noticed a large number of Broken pipe errors in the logs on all RGW nodes.

Log:
2021-08-04T06:25:05.997+0000 7f4f15f7b700  1 ====== starting new request req=0x7f4fac3d7670 =====
2021-08-04T06:25:05.997+0000 7f4f15f7b700  0 ERROR: client_io->complete_request() returned Broken pipe
2021-08-04T06:25:05.997+0000 7f4f15f7b700  1 ====== req done req=0x7f4fac3d7670 op status=0 http_status=200 latency=0s ======
2021-08-04T06:25:05.997+0000 7f4f15f7b700  1 beast: 0x7f4fac3d7670: 10.0.244.246 - - [2021-08-04T06:25:05.997988+0000] "HEAD / HTTP/1.0" 200 0 - - -
2021-08-04T06:25:06.337+0000 7f4ebe6cc700  1 ====== starting new request req=0x7f4fac3d7670 =====
2021-08-04T06:25:06.337+0000 7f4f13776700  0 ERROR: client_io->complete_request() returned Broken pipe
2021-08-04T06:25:06.337+0000 7f4f13776700  1 ====== req done req=0x7f4fac3d7670 op status=0 http_status=200 latency=0s ======
2021-08-04T06:25:06.337+0000 7f4f13776700  1 beast: 0x7f4fac3d7670: 10.0.244.244 - - [2021-08-04T06:25:06.337994+0000] "HEAD / HTTP/1.0" 200 0 - - -
2021-08-04T06:25:07.994+0000 7f4eaa6a4700  1 ====== starting new request req=0x7f4fac3d7670 =====
2021-08-04T06:25:07.994+0000 7f4eaa6a4700  1 ====== req done req=0x7f4fac3d7670 op status=0 http_status=200 latency=0s ======
2021-08-04T06:25:07.994+0000 7f4eaa6a4700  1 beast: 0x7f4fac3d7670: 10.0.244.245 - - [2021-08-04T06:25:07.994022+0000] "HEAD / HTTP/1.0" 200 5 - - -
2021-08-04T06:25:08.002+0000 7f4ee1f13700  1 ====== starting new request req=0x7f4fac3d7670 =====
2021-08-04T06:25:08.002+0000 7f4ee1f13700  0 ERROR: client_io->complete_request() returned Broken pipe
2021-08-04T06:25:08.002+0000 7f4ee1f13700  1 ====== req done req=0x7f4fac3d7670 op status=0 http_status=200 latency=0s ======
2021-08-04T06:25:08.002+0000 7f4ee1f13700  1 beast: 0x7f4fac3d7670: 10.0.244.246 - - [2021-08-04T06:25:08.002023+0000] "HEAD / HTTP/1.0" 200 0 - - -
….
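
To see which clients the errors belong to, the log can be tallied per client address. The sketch below is only an illustration based on the lines quoted above: it assumes that a "Broken pipe" line and the following "beast:" access line from the same thread id belong to the same request. The function name count_broken_pipes is made up for this example.

```python
import re
from collections import Counter

# Generic shape of the RGW log lines quoted above:
# timestamp, thread id, log level, message.
LINE_RE = re.compile(r"^(\S+)\s+([0-9a-f]+)\s+(\d+)\s+(.*)$")
# Client address at the start of a beast access-log message.
BEAST_RE = re.compile(r"^beast: 0x[0-9a-f]+: (\S+) - ")


def count_broken_pipes(lines):
    """Return (errors_per_client, requests_per_client) as Counters.

    Assumption: an error line and the next beast line logged by the
    same thread id describe the same request.
    """
    pending = set()      # threads that logged Broken pipe, awaiting an access line
    errors = Counter()
    requests = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue     # skip anything that is not a log line
        _, thread, _, msg = m.groups()
        if "complete_request() returned Broken pipe" in msg:
            pending.add(thread)
            continue
        b = BEAST_RE.match(msg)
        if b:
            client = b.group(1)
            requests[client] += 1
            if thread in pending:
                errors[client] += 1
                pending.discard(thread)
    return errors, requests
```

Calling it as `errors, requests = count_broken_pipes(open(path))`, where `path` is the RGW log file, gives the number of requests and of Broken pipe errors per client address, which helps confirm whether only the health-check sources are affected.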

We set up the cluster using ceph-ansible. The cluster currently runs Octopus (15.2.13). After checking the configuration on the RGW nodes, I see that HAProxy is configured to send a health-check request to the RGW instances every 2s.
The errors stop after disabling the check, but I don't think this is a good way to fix the problem…
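
One way to narrow this down would be to send the same "HEAD / HTTP/1.0" request by hand and read the whole response before closing the socket. If RGW logs no Broken pipe for such a request, that would point at how the checker closes its connection rather than at the request itself. This is an unconfirmed guess, not a diagnosis. The sketch below is a plain socket client, not a reproduction of HAProxy's check; head_probe and its host and port arguments are placeholders.

```python
import socket


def head_probe(host, port, timeout=2.0):
    """Send 'HEAD / HTTP/1.0' and read until the server closes the connection.

    Returns the HTTP status line of the response as a string.
    """
    request = b"HEAD / HTTP/1.0\r\n\r\n"
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:         # empty read: server closed its side
                break
            chunks.append(data)
    response = b"".join(chunks)
    # The first line of the response is the status line.
    return response.split(b"\r\n", 1)[0].decode("latin-1")
```

Running `head_probe("<rgw-host>", <rgw-port>)` against one RGW instance and then checking its log for a matching beast line, with or without the error, would show whether a client that waits for the full response triggers the message.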

Does anyone have experience with this problem?

References
- Ansible script commit for adding health check: https://github.com/ceph/ceph-ansible/commit/a951c1a3f0a34e086964f52b0bbf7a8d89481aad#diff-1ea21f2851186f2a01ff25e715ed670b9d96629c6b7bc385aefd9e4154204bde

Many thanks,
Nghia.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx