Luminous rgw hangs after sighup

I noticed this morning that all four of our rados gateways (luminous 12.2.2) hung at logrotate time overnight. The last message logged was:

2017-12-08 03:21:01.897363 7fac46176700  0 ERROR: failed to clone shard, completion_mgr.get_next() returned ret=-125

One of the 3 nodes recorded more detail:

2017-12-08 06:51:04.452108 7f80fbfdf700  1 rgw realm reloader: Pausing frontends for realm update...
2017-12-08 06:51:04.452126 7f80fbfdf700  1 rgw realm reloader: Frontends paused
2017-12-08 06:51:04.452891 7f8202436700  0 ERROR: failed to clone shard, completion_mgr.get_next() returned ret=-125

I remember seeing this happen on our test cluster a while back with Kraken. I can't find the tracker issue I originally found for it, but this also sounds like it could be a regression of bug #20339 or #20686?
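
For reference, the ret=-125 in the log lines above looks like a negated errno value; on Linux, errno 125 is ECANCELED ("Operation canceled"), which would fit pending operations being cancelled while the frontends were paused, though that reading is an assumption. A minimal Python sketch for decoding such return codes follows; decode_ret is a hypothetical helper name, not part of Ceph, and the numeric mapping is platform-dependent:

```python
import errno
import os


def decode_ret(ret):
    """Map a negated errno-style return code to (symbolic name, message).

    Hypothetical helper for reading log lines such as "ret=-125";
    the number-to-name mapping depends on the platform it runs on.
    """
    code = abs(ret)
    # errno.errorcode maps errno numbers to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message for the errno number.
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = decode_ret(-125)
    print(name, "-", message)
```

Running this on the same Linux host as the gateway gives the symbolic name to search for in the tracker alongside the bug numbers above.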

I recorded some strace output from one of the radosgw instances before restarting, if it's useful to open an issue.

--
Graham Allan
Minnesota Supercomputing Institute - gta@xxxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


