Re: [nautilus] ceph tell hanging

Follow-up on the hanging tell: iterating over all OSDs and trying to
raise osd_max_backfills leaves hanging ceph tell processes like this:

root     1007846 15.3  1.2 918388 50972 pts/5    Sl   00:03   0:48 /usr/bin/python3 /usr/bin/ceph tell osd.4 injectargs --osd-max-backfill
root     1007890  0.4  0.9 850664 37596 pts/5    Sl   00:03   0:01 /usr/bin/python3 /usr/bin/ceph tell osd.7 injectargs --osd-max-backfill
root     1007930  0.3  0.9 842472 37484 pts/5    Sl   00:03   0:01 /usr/bin/python3 /usr/bin/ceph tell osd.11 injectargs --osd-max-backfil
root     1007987  0.3  0.9 850668 37540 pts/5    Sl   00:03   0:01 /usr/bin/python3 /usr/bin/ceph tell osd.18 injectargs --osd-max-backfil
root     1008054  0.4  0.9 850664 37600 pts/5    Sl   00:03   0:01 /usr/bin/python3 /usr/bin/ceph tell osd.29 injectargs --osd-max-backfil
root     1008147 14.7  1.2 910192 50648 pts/5    Sl   00:03   0:42 /usr/bin/python3 /usr/bin/ceph tell osd.33 injectargs --osd-max-backfil
root     1008205  0.3  0.9 842468 37524 pts/5    Sl   00:03   0:01 /usr/bin/python3 /usr/bin/ceph tell osd.45 injectargs --osd-max-backfil
root     1008246  0.3  0.9 850664 37828 pts/5    Sl   00:04   0:01 /usr/bin/python3 /usr/bin/ceph tell osd.48 injectargs --osd-max-backfil
...
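The loop is essentially of this shape; a minimal sketch, not the exact
script, and the value 4 is illustrative:

  # sketch: iterate over all OSD ids and raise osd_max_backfills on
  # each; backgrounded so the whole cluster is adjusted in parallel
  for osd in $(ceph osd ls); do
      ceph tell "osd.$osd" injectargs '--osd-max-backfills 4' &
  done
  wait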

Additionally, many of the tell processes get stuck in an infinite loop
and print this error over and over again:

2020-09-23 00:09:48.766 7f07e5f99700  0 --1- [2a0a:e5c0:2:1:20d:b9ff:fe48:3bd4]:0/2338294673 >> v1:[2a0a:e5c0:2:1:21b:21ff:febc:5060]:6858/12824 conn(0x7f07c8055680 0x7f07c8053740 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2020-09-23 00:09:48.774 7f07e5f99700  0 --1- [2a0a:e5c0:2:1:20d:b9ff:fe48:3bd4]:0/2338294673 >> v1:[2a0a:e5c0:2:1:21b:21ff:febc:5060]:6858/12824 conn(0x7f07c804f590 0x7f07c80505c0 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2020-09-23 00:09:48.786 7f07e5f99700  0 --1- [2a0a:e5c0:2:1:20d:b9ff:fe48:3bd4]:0/2338294673 >> v1:[2a0a:e5c0:2:1:21b:21ff:febc:5060]:6858/12824 conn(0x7f07c8055680 0x7f07c8053740 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2020-09-23 00:09:48.790 7f07e5f99700  0 --1- [2a0a:e5c0:2:1:20d:b9ff:fe48:3bd4]:0/2338294673 >> v1:[2a0a:e5c0:2:1:21b:21ff:febc:5060]:6858/12824 conn(0x7f07c804f590 0x7f07c80505c0 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
2020-09-23 00:09:48.798 7f07e5f99700  0 --1- [2a0a:e5c0:2:1:20d:b9ff:fe48:3bd4]:0/2338294673 >> v1:[2a0a:e5c0:2:1:21b:21ff:febc:5060]:6858/12824 conn(0x7f07c8055680 0x7f07c8053740 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 connect got BADAUTHORIZER
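As a stopgap, bounding each tell with coreutils timeout keeps such a
loop from stalling on a stuck OSD; a rough sketch (the 30 second limit
is arbitrary):

  # workaround sketch: give each tell 30 seconds (arbitrary), so a
  # stuck connection fails and the loop moves on instead of hanging
  for osd in $(ceph osd ls); do
      timeout 30 ceph tell "osd.$osd" injectargs '--osd-max-backfills 4' \
          || echo "osd.$osd did not answer" >&2
  done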




Nico Schottelius <nico.schottelius@xxxxxxxxxxx> writes:

> So the same problem happens with pgs which are in "unknown" state,
>
> [19:31:08] black2.place6:~# ceph pg 2.5b2 query | tee query_2.5b2
>
> hangs until the pg actually becomes active again. I assume that this
> should not be the case, should it?
>
>
> Nico Schottelius <nico.schottelius@xxxxxxxxxxx> writes:
>
>> Update to the update: currently debugging why pgs are stuck in the
>> peering state:
>>
>> [18:57:49] black2.place6:~# ceph pg dump all | grep 2.7d1
>> dumped all
>> 2.7d1     16666                  0        0         0       0 69698617344           0          0 3002     3002                                                            peering 2020-09-22 18:49:28.587859   80407'8126117   80915:35142541    [22,84]         22    [22,84]             22   80407'8126117 2020-09-22 17:23:11.860334   79594'8122364 2020-09-21 13:27:16.376009             0
>>
>> The problem is that
>>
>> ceph pg 2.7d1 query
>>
>> hangs and does not output any information. Does anyone know what
>> could be the cause of this?


--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


