Hi Manuel,
could you please elaborate a bit on the reproduction steps in 16.2.6:
1) Do you just put an object with this name into a replicated pool with the
rados tool, and subsequent deep scrubs then report the error? Or are there
other steps involved? (See the sketch after this list.)
2) Do you have an all-BlueStore setup on that Pacific cluster, or is there
a mixture of BlueStore and FileStore OSDs?
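For clarity, this is roughly the sequence I have in mind for 1); the pool
name and payload file are placeholders, the PG id is taken from your logs
below:

  # put the problematic object into a replicated test pool, then force a deep scrub
  rados -p testpool put c76c7ac2014adb9f0f0837ac1e85fd1e241af225908b6a0c3d3a44d6b866e732_00400000 ./payload.bin
  ceph pg deep-scrub 1.7fff
  # and for 2), the backend of a given OSD can be checked with
  ceph osd metadata <osd-id> | grep osd_objectstore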
Thanks,
Igor
On 2/10/2022 12:06 PM, Manuel Lausch wrote:
Okay, the issue is triggered by a specific object name:
->
c76c7ac2014adb9f0f0837ac1e85fd1e241af225908b6a0c3d3a44d6b866e732_00400000
With this name I could trigger at least the scrub issues
on Ceph Pacific 16.2.6 as well.
I opened a bug ticket for this issue:
https://tracker.ceph.com/issues/54226
On Tue, 8 Feb 2022 14:35:58 +0100
Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:
Okay, I definitely need some help here.
The crashing OSD moved with the PG, so the PG itself seems to have the issue.
I moved (via upmaps) all 4 replicas to FileStore OSDs. After this the
error seemed to be resolved; no OSD crashed after that.
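For reference, a sketch of the kind of command involved in the moves,
assuming pg-upmap-items was used; the OSD ids are placeholders:

  # pin a replica of PG 1.7fff from one OSD to another (repeated for each of the 4 replicas)
  ceph osd pg-upmap-items 1.7fff <from-osd> <to-osd>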
A deep-scrub of the PG didn't throw any errors, so I moved the first
shard back to a BlueStore OSD. This worked flawlessly as well.
A deep scrub after this showed one object missing: the same one that was
evidently the cause of the prior crashes.
A repair seemed to fix the object, but a further deep-scrub brought back
the same error.
Even putting the object again with rados put didn't help. Now I have
two "missing" objects (the head and the snapshot from the overwrite).
Here are the scrub error and the repair from the OSD log:
2022-02-08 14:04:43.751 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff shard 3 1:ffffffff:::c76c7ac2014adb9f0f0837ac1e85fd1e241af225908b6a0c3d3a44d6b866e732_00400000:head : missing
2022-02-08 14:04:43.751 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff deep-scrub 1 missing, 0 inconsistent objects
2022-02-08 14:04:43.751 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff deep-scrub 1 errors
2022-02-08 13:52:09.111 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff shard 3 1:ffffffff:::c76c7ac2014adb9f0f0837ac1e85fd1e241af225908b6a0c3d3a44d6b866e732_00400000:head : missing
2022-02-08 13:52:09.111 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff repair 1 missing, 0 inconsistent objects
2022-02-08 13:52:09.111 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff repair 1 errors, 1 fixed
And here is the new scrub error with the two missing objects:
2022-02-08 14:19:10.990 7f600dfec700 0 log_channel(cluster) log [DBG] : 1.7fff deep-scrub starts
2022-02-08 14:25:17.749 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff shard 3 1:ffffffff:::c76c7ac2014adb9f0f0837ac1e85fd1e241af225908b6a0c3d3a44d6b866e732_00400000:974 : missing
2022-02-08 14:25:17.749 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff shard 3 1:ffffffff:::c76c7ac2014adb9f0f0837ac1e85fd1e241af225908b6a0c3d3a44d6b866e732_00400000:head : missing
2022-02-08 14:25:17.750 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff deep-scrub 2 missing, 0 inconsistent objects
2022-02-08 14:25:17.750 7f600dfec700 -1 log_channel(cluster) log [ERR] : 1.7fff deep-scrub 2 errors
Can someone help me here? I don't have any clue.
Regards
Manuel
--
Igor Fedotov
Ceph Lead Developer
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx