Hello,
I have run into this case before; I resolved it by applying the parameter (osd_find_best_info_ignore_history_les) to all the OSDs that had reported blocked queries.
-- Regards, CEO FEELB | Corentin BONNETON
Hi Ceph folks,
I have a cluster running Jewel 10.2.5 using a mix of EC and replicated pools.
After rebooting a host last night, one PG refuses to complete peering:
pg 1.323 is stuck inactive for 73352.498493, current state peering, last acting [595,1391,240,127,937,362,267,320,7,634,716]
Restarting OSDs or hosts does nothing to help, or sometimes results in things like this:
pg 1.323 is remapped+peering, acting [2147483647,1391,240,127,937,362,267,320,7,634,716]
(2147483647 is CRUSH's "none" placeholder, i.e. no OSD is currently mapped to that slot.)
The host that was rebooted is home to osd.7 (rank 8 in the acting set). If I go onto it and look at the logs for osd.7, this is what I see:
$ tail -f /var/log/ceph/ceph-osd.7.log
2017-02-08 15:41:00.445247 7f5fcc2bd700 0 -- XXX.XXX.XXX.172:6905/20510 >> XXX.XXX.XXX.192:6921/55371 pipe(0x7f6074a0b400 sd=34 :42828 s=2 pgs=319 cs=471 l=0 c=0x7f6070086700).fault, initiating reconnect
I'm assuming that in IP1:port1/PID1 >> IP2:port2/PID2 the >> indicates the direction of communication. I've traced these to osd.7 (rank 8 in the stuck PG) reaching out to osd.595 (the primary in the stuck PG).
Meanwhile, looking at the logs of osd.595 I see this:
$ tail -f /var/log/ceph/ceph-osd.595.log
2017-02-08 15:41:15.760708 7f1765673700 0 -- XXX.XXX.XXX.192:6921/55371 >> XXX.XXX.XXX.172:6905/20510 pipe(0x7f17b2911400 sd=101 :6921 s=0 pgs=0 cs=0 l=0 c=0x7f17b7beaf00).accept connect_seq 478 vs existing 477 state standby
2017-02-08 15:41:20.768844 7f1765673700 0 bad crc in front 1941070384 != exp 3786596716
which again shows osd.595 reaching out to osd.7; from what I can gather, the CRC problem is in the messaging layer.
Google searching has yielded nothing particularly useful on how to get this unstuck.
ceph pg 1.323 query seems to hang forever, but it completed once last night and I noticed this:
"peering_blocked_by_detail": [
    { "detail": "peering_blocked_by_history_les_bound" }
]
We have seen this before, and it was cleared by setting osd_find_best_info_ignore_history_les to true for the first two OSDs on the stuck PGs (that was on a 3-replica pool). That hasn't worked in this case, and I suspect the option needs to be set either on a majority of the OSDs, or on at least k of the OSDs, so that their data can be used while the history is ignored.
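For what it's worth, a sketch of applying the option to the whole acting set at runtime might look like the following (the OSD ids are taken from the pg report above; note that a runtime injection may not take effect until the PG re-peers, hence marking the primary down afterwards to force a new peering attempt):

```shell
# Set the option on every OSD in the stuck PG's acting set:
for osd in 595 1391 240 127 937 362 267 320 7 634 716; do
    ceph tell osd.$osd injectargs '--osd_find_best_info_ignore_history_les=true'
done

# Briefly mark the primary down to trigger a fresh peering attempt:
ceph osd down 595

# Once the PG goes active+clean, unset the option again, since it
# weakens a peering safety check and should not stay on:
for osd in 595 1391 240 127 937 362 267 320 7 634 716; do
    ceph tell osd.$osd injectargs '--osd_find_best_info_ignore_history_les=false'
done
```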
We would really appreciate any guidance and/or help the community can offer!