Hi!

Just try to Google data_digest_mismatch_oi. The old mailing list archives have a couple of threads about the same problem.

k

Sent from my iPhone

> On 29 Jun 2022, at 13:54, Lennart van Gijtenbeek | Routz <lennart.vangijtenbeek@xxxxxxxx> wrote:
>
> Hello Ceph community,
>
> I hope you can help me with an issue we are experiencing on our backup cluster.
>
> The Ceph version we are running here is 10.2.10 (Jewel), and we are using Filestore.
> The PG is part of a replicated pool with size=2.
>
> We are getting the following error:
>
> ```
> root@cephmon0:~# ceph health detail
> HEALTH_ERR 1 pgs inconsistent; 2 scrub errors
> pg 37.189 is active+clean+inconsistent, acting [144,170]
> 2 scrub errors
> ```
>
> ```
> root@cephmon0:~# grep 37.189 /var/log/ceph/ceph.log
> 2022-06-29 11:11:27.782920 osd.144 10.129.160.22:6800/2810 7598 : cluster [INF] osd.144 pg 37.189 Deep scrub errors, upgrading scrub to deep-scrub
> 2022-06-29 11:11:27.884628 osd.144 10.129.160.22:6800/2810 7599 : cluster [INF] 37.189 deep-scrub starts
> 2022-06-29 11:13:07.124841 osd.144 10.129.160.22:6800/2810 7600 : cluster [ERR] 37.189 shard 144: soid 37:9193d307:::isqPpJMKYY4.000000000000001e:head data_digest 0x50007bd9 != data_digest 0x885fabcc from auth oi 37:9193d307:::isqPpJMKYY4.000000000000001e:head(7211'173457 osd.71.0:397191 dirty|data_digest|omap_digest s 4194304 uv 39699 dd 885fabcc od ffffffff alloc_hint [0 0])
> 2022-06-29 11:13:07.124849 osd.144 10.129.160.22:6800/2810 7601 : cluster [ERR] 37.189 shard 170: soid 37:9193d307:::isqPpJMKYY4.000000000000001e:head data_digest 0x50007bd9 != data_digest 0x885fabcc from auth oi 37:9193d307:::isqPpJMKYY4.000000000000001e:head(7211'173457 osd.71.0:397191 dirty|data_digest|omap_digest s 4194304 uv 39699 dd 885fabcc od ffffffff alloc_hint [0 0])
> 2022-06-29 11:13:07.124853 osd.144 10.129.160.22:6800/2810 7602 : cluster [ERR] 37.189 soid 37:9193d307:::isqPpJMKYY4.000000000000001e:head: failed to pick suitable auth object
> 2022-06-29 11:20:46.459906 osd.144 10.129.160.22:6800/2810 7603 : cluster [ERR] 37.189 deep-scrub 2 errors
> ```
>
> The PG has already been moved away from 2 other OSDs. That is, the same error occurred when the PG was stored on two different OSDs, so it does not seem to be a disk issue. There appears to be something wrong with the object "isqPpJMKYY4.000000000000001e".
> However, the md5sum of the object is the same on both OSDs:
>
> ```
> root@ceph12:/var/lib/ceph/osd/ceph-144/current/37.189_head/DIR_9/DIR_8/DIR_9/DIR_C# ls -l isqPpJMKYY4.000000000000001e__head_E0CBC989__25
> -rw-r--r-- 1 ceph ceph 4194304 Jun  3 09:56 isqPpJMKYY4.000000000000001e__head_E0CBC989__25
>
> root@ceph12:/var/lib/ceph/osd/ceph-144/current/37.189_head/DIR_9/DIR_8/DIR_9/DIR_C# md5sum isqPpJMKYY4.000000000000001e__head_E0CBC989__25
> 96d702072cd441f2d0af60783e8db248  isqPpJMKYY4.000000000000001e__head_E0CBC989__25
> ```
>
> ```
> root@ceph15:/var/lib/ceph/osd/ceph-170/current/37.189_head/DIR_9/DIR_8/DIR_9/DIR_C# ls -l isqPpJMKYY4.000000000000001e__head_E0CBC989__25
> -rw-r--r-- 1 ceph ceph 4194304 Jun 23 16:41 isqPpJMKYY4.000000000000001e__head_E0CBC989__25
>
> root@ceph15:/var/lib/ceph/osd/ceph-170/current/37.189_head/DIR_9/DIR_8/DIR_9/DIR_C# md5sum isqPpJMKYY4.000000000000001e__head_E0CBC989__25
> 96d702072cd441f2d0af60783e8db248  isqPpJMKYY4.000000000000001e__head_E0CBC989__25
> ```
>
> ```
> root@cephmon0:~# rados list-inconsistent-obj 37.189 --format=json-pretty
> {
>     "epoch": 167653,
>     "inconsistents": [
>         {
>             "object": {
>                 "name": "isqPpJMKYY4.000000000000001e",
>                 "nspace": "",
>                 "locator": "",
>                 "snap": "head",
>                 "version": 39699
>             },
>             "errors": [],
>             "union_shard_errors": [
>                 "data_digest_mismatch_oi"
>             ],
>             "selected_object_info": "37:9193d307:::isqPpJMKYY4.000000000000001e:head(7211'173457 osd.71.0:397191 dirty|data_digest|omap_digest s 4194304 uv 39699 dd 885fabcc od ffffffff alloc_hint [0 0])",
>             "shards": [
>                 {
>                     "osd": 144,
>                     "errors": [
>                         "data_digest_mismatch_oi"
>                     ],
>                     "size": 4194304,
>                     "omap_digest": "0xffffffff",
>                     "data_digest": "0x50007bd9"
>                 },
>                 {
>                     "osd": 170,
>                     "errors": [
>                         "data_digest_mismatch_oi"
>                     ],
>                     "size": 4194304,
>                     "omap_digest": "0xffffffff",
>                     "data_digest": "0x50007bd9"
>                 }
>             ]
>         }
>     ]
> }
> ```
>
> I don't understand why there is a "data_digest_mismatch_oi" error, since the checksums seem to match.
>
> Does anyone have an idea how to fix this?
> Your input would be very much appreciated. Please let me know if you need additional info.
>
> Thank you.
>
> Best regards,
> Lennart van Gijtenbeek
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
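A note for the archive on why matching md5sums do not rule this error out: "data_digest_mismatch_oi" means each shard's on-disk data digest disagrees with the digest recorded in the object info ("oi"), not that the shards disagree with each other. The sketch below illustrates the distinction, using Python's `zlib.crc32` as a stand-in checksum (Ceph actually uses crc32c); the byte strings are made up, and the recorded value is the stale "dd 885fabcc" from the scrub log, used here purely for illustration.

```python
import zlib

def digest(data: bytes) -> int:
    # Stand-in checksum for illustration; real Ceph computes crc32c.
    return zlib.crc32(data)

# Two replicas that are byte-for-byte identical, like the OP's shards
# (same md5sum on ceph12 and ceph15).
replica_a = b"same bytes on both OSDs"
replica_b = b"same bytes on both OSDs"

# The digest recorded in the object info can still be stale.
recorded_oi_digest = 0x885FABCC

# The md5sum-style replica-vs-replica comparison passes...
assert digest(replica_a) == digest(replica_b)

# ...but scrub compares each shard against the recorded oi digest,
# so BOTH shards are flagged with data_digest_mismatch_oi.
assert digest(replica_a) != recorded_oi_digest
assert digest(replica_b) != recorded_oi_digest
```

This matches the output above: both shards report the same on-disk digest (0x50007bd9) and the same error, while the "selected_object_info" carries the stale expected digest (dd 885fabcc).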
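On the fix itself: in older threads on this list, when every replica agrees (as here) but the recorded oi digest is stale, the commonly reported workaround is to rewrite the object through RADOS, which records a fresh object-info digest. This is a hedged sketch, not a tested recipe: the pool name is a placeholder you must look up yourself, and you should keep the downloaded copy as a backup and verify its checksum before putting it back. Note that `ceph pg repair` may not help in this situation, since the scrub log above already shows "failed to pick suitable auth object" when both shards disagree with the oi.

```shell
# Sketch of the get/put workaround for data_digest_mismatch_oi.
# POOL is a placeholder: substitute the name of pool id 37
# (look it up with: ceph osd lspools).
POOL=<pool-name-for-id-37>
OBJ=isqPpJMKYY4.000000000000001e

rados -p "$POOL" get "$OBJ" /tmp/"$OBJ".bak   # read the object out of RADOS
md5sum /tmp/"$OBJ".bak                        # should match the on-disk md5sums above
rados -p "$POOL" put "$OBJ" /tmp/"$OBJ".bak   # rewrite; records a fresh oi data digest
ceph pg deep-scrub 37.189                     # re-scrub, then check "ceph health detail"
```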