Re: rgw realm reloader keeps triggering after upgrade to Nautilus

On 9/25/19 1:14 AM, Eric Choi wrote:
I asked this on ceph-users but didn't get any response, so I'm trying it here again after a slight edit:
---

Hello,

We recently upgraded from Luminous to Nautilus. Since the upgrade, we have been seeing
sporadic "lock-up" behavior on the RGW side.

What I noticed from the log is that the lock-ups seem to coincide with the rgw realm reloader. The
realm reloader pauses the frontends, and for that period RGW is completely locked up, unable
to take new requests. I believe the pause itself is expected behavior.

But why does the rgw realm reloader keep triggering? Is there a way to disable it or reduce its frequency? We are
not using the multi-site feature (although we have a default realm) and we don't change our realm configuration at all. I captured the log below while grepping for anything with 'watch':

Anyone?
--

2019-09-19 18:03:23.245 7f0bd5f5f700  1 rgw realm reloader: Resuming frontends with new realm configuration.
2019-09-19 18:03:23.245 7f2bd8f5d700  1 ====== starting new request req=0x7f2bd8f56950 =====
2019-09-19 18:03:23.245 7f2bd2750700  1 ====== starting new request req=0x7f2bd2749950 =====
2019-09-19 18:03:23.245 7f2bcaf41700  1 ====== starting new request req=0x7f2bcaf3a950 =====
2019-09-19 18:03:23.245 7f2bd074c700  1 ====== starting new request req=0x7f2bd0745950 =====
2019-09-19 18:03:23.245 7f2bc6f39700  1 ====== starting new request req=0x7f2bc6f32950 =====
2019-09-19 18:03:23.245 7f2bd5756700  1 ====== starting new request req=0x7f2bd574f950 =====
2019-09-19 18:03:23.245 7f2bc4f35700  1 ====== starting new request req=0x7f2bc4f2e950 =====
--
2019-09-19 18:05:41.588 7f2bd2750700  2 req 121303 0.001s s3:get_obj verifying op params
2019-09-19 18:05:41.588 7f2bd2750700  2 req 121303 0.001s s3:get_obj pre-executing
2019-09-19 18:05:41.588 7f2bd2750700  2 req 121303 0.001s s3:get_obj executing
2019-09-19 18:05:41.588 7f2bd2750700  2 req 121303 0.001s s3:get_obj completing
2019-09-19 18:05:41.588 7f2bd2750700  2 req 121303 0.001s s3:get_obj op status=0
2019-09-19 18:05:41.588 7f2bd2750700  2 req 121303 0.001s s3:get_obj http status=200
2019-09-19 18:05:41.588 7f2bd2750700  1 ====== req done req=0x7f2bd2749950 op status=0 http_status=200 latency=0.001s ======
2019-09-19 18:05:41.588 7f2bd2750700  1 civetweb: 0x7f2c36d6f3a8: 168.245.88.23 - - [19/Sep/2019:18:05:39 +0000] "GET /kamta-incoming/filter0190p3mdw1-28304-5D83C374-LIfsCi4hQiGpa_pXeRjZ-A HTTP/1.1" 200 17578 - Minio (linux; amd64) minio-go/v6.0.17
2019-09-19 18:05:41.589 7f0bd1f57700  4 rgw period pusher: No zones to update
2019-09-19 18:05:41.589 7f0bd1f57700  4 rgw realm reloader: Notification on realm, reconfiguration scheduled

The realm reloader uses librados watch/notify to get notifications about changes to the period configuration. The only thing that should be generating these notifications is the 'radosgw-admin period commit' command. How frequently are you seeing these?
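One way to answer the frequency question is to count the reloader notifications per hour in the rgw log. A sketch (the log path is an assumption; substitute your rgw log file):

```shell
# Count realm reloader notifications per hour. The log path below is a
# guess at a typical deployment; adjust it for yours.
grep 'rgw realm reloader: Notification on realm' /var/log/ceph/ceph-client.rgw.*.log \
  | awk '{ print substr($1 " " $2, 1, 13) }' \
  | sort | uniq -c
```

The awk step truncates each timestamp to date-plus-hour, so uniq -c prints one count per hour bucket.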

I'm not aware of any changes to this stuff between luminous and nautilus. Can you share the 'epoch' and 'realm_epoch' values from 'radosgw-admin period get'? The epoch should tell you approximately how many times the configuration has changed.

If some rogue radosgw-admin process is committing periods, you may be able to track down its IP address from debug_ms=1 logging on the osds. For example:

$ cat osd.0.log | grep notify | grep reply | grep control
2019-09-25T10:15:20.583-0400 7f0144ff9700  1 -- [v2:10.17.151.111:6800/21950,v1:10.17.151.111:6801/21950] --> 10.17.151.111:0/2370632151 -- osd_op_reply(265 realms.9b307ff8-8ac5-4ca9-8398-abb8af02e5b2.control [notify cookie 140285980045424] v0'0 uv1 ondisk = 0) v8 -- 0x7f011807e110 con 0x7f020801a0a0

This shows the osd replying to a client address '10.17.151.111:0/2370632151'.
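To pull just the client addresses out of those reply lines, you can extend the grep pipeline above with a sed capture (a sketch over the log format shown; 'osd.0.log' stands in for your osd log file):

```shell
# Extract the unique client addresses (ip:port/nonce) that received
# notify replies on the realm control object.
grep notify osd.0.log | grep reply | grep control \
  | sed -n 's/.*--> \([0-9.]*:[0-9]*\/[0-9]*\).*/\1/p' \
  | sort -u
```

Each distinct address here is a client that was watching the realm control object, which should narrow down where the commits are coming from.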
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx
