Hi Ivan,

Just to get a better overview: can you provide more details about the pool with ID 2, and also the output of "ceph pg 2.c90 query"?

Joachim

joachim.kraftmayer@xxxxxxxxx
www.clyso.com
Hohenzollernstr. 27, 80801 Munich
Utting | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE275430677

On Fri, 29 Nov 2024 at 12:14, Ivan Clayson <ivan@xxxxxxxxxxxxxxxxx> wrote:

> Hello,
>
> We have an Alma8.9 (version 4 kernel) quincy (17.2.7) CephFS cluster
> with spinners for our bulk data and SSDs for the metadata, where we have
> a single unfound object in the bulk pool:
>
> [root@ceph-n30 ~]# ceph -s
>   cluster:
>     id:     fa7cf62b-e261-49cd-b00e-383c36b79ef3
>     health: HEALTH_ERR
>             1/849660811 objects unfound (0.000%)
>             Possible data damage: 1 pg recovery_unfound
>             Degraded data redundancy: 9/8468903874 objects degraded (0.000%), 1 pg degraded
>
>   services:
>     mon: 3 daemons, quorum ceph-s2,ceph-s3,ceph-s1 (age 44h)
>     mgr: ceph-s2(active, since 45h), standbys: ceph-s3, ceph-s1
>     mds: 1/1 daemons up, 3 standby
>     osd: 439 osds: 439 up (since 43h), 439 in (since 43h); 176 remapped pgs
>
>   data:
>     volumes: 1/1 healthy
>     pools:   9 pools, 4321 pgs
>     objects: 849.66M objects, 2.3 PiB
>     usage:   3.0 PiB used, 1.7 PiB / 4.6 PiB avail
>     pgs:     9/8468903874 objects degraded (0.000%)
>              36630744/8468903874 objects misplaced (0.433%)
>              1/849660811 objects unfound (0.000%)
>              4122 active+clean
>              174  active+remapped+backfill_wait
>              22   active+clean+scrubbing+deep
>              2    active+remapped+backfilling
>              1    active+recovery_unfound+degraded
>
>   io:
>     client:   669 MiB/s rd, 87 MiB/s wr, 302 op/s rd, 77 op/s wr
>     recovery: 175 MiB/s, 59 objects/s
>
> [root@ceph-n30 ~]# ceph health detail | grep unfound
> HEALTH_ERR 1/849661114 objects unfound (0.000%); Possible data damage: 1 pg recovery_unfound; Degraded data redundancy: 9/8468906904 objects degraded (0.000%), 1 pg degraded
> [WRN] OBJECT_UNFOUND: 1/849661114 objects unfound (0.000%)
>     pg 2.c90 has 1 unfound objects
> [ERR] PG_DAMAGED: Possible data damage: 1 pg recovery_unfound
>     pg 2.c90 is active+recovery_unfound+degraded, acting [259,210,390,209,43,66,322,297,25,374], 1 unfound
>     pg 2.c90 is active+recovery_unfound+degraded, acting [259,210,390,209,43,66,322,297,25,374], 1 unfound
>
> We've tried deep-scrubbing and repairing the PG, as well as rebooting the
> entire cluster, but unfortunately this has not resolved our issue.
>
> The primary OSD (259) log reports that our 1009e1df26d.000000c9 object
> is missing, and any rados command on the object simply hangs:
>
> [root@ceph-n30 ~]# grep 2.c90 /var/log/ceph/ceph-osd.259.log
> ...
> 2024-11-25T11:38:33.860+0000 7fd409870700 1 osd.259 pg_epoch: 512353 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512348/512349 n=211842 ec=1175/1168 lis/c=512348/472766 les/c/f=512349/472770/232522 sis=512353 pruub=11.010143280s) [259,210,390,209,43,66,322,297,NONE,374]p259(0) r=0 lpr=512353 pi=[472766,512353)/11 crt=512310'8145216 mlcod 0'0 unknown pruub 205.739364624s@ m=1 mbc={}] state<Start>: transitioning to Primary
> 2024-11-25T11:38:54.926+0000 7fd409870700 1 osd.259 pg_epoch: 512356 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512353/512354 n=211842 ec=1175/1168 lis/c=512353/472766 les/c/f=512354/472770/232522 sis=512356 pruub=11.945847511s) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512356 pi=[472766,512356)/10 crt=512310'8145216 mlcod 0'0 active pruub 227.741577148s@ m=1 mbc={0={(0+0)=1},1={(1+0)=1},2={(1+0)=1},3={(1+0)=1},4={(1+0)=1},5={(1+0)=1},6={(1+0)=1},7={(1+0)=1},8={(0+0)=1},9={(1+0)=1}}] start_peering_interval up [259,210,390,209,43,66,322,297,2147483647,374] -> [259,210,390,209,43,66,322,297,25,374], acting [259,210,390,209,43,66,322,297,2147483647,374] -> [259,210,390,209,43,66,322,297,25,374], acting_primary 259(0) -> 259, up_primary 259(0) -> 259, role 0 -> 0, features acting 4540138320759226367 upacting 4540138320759226367
> 2024-11-25T11:38:54.926+0000 7fd409870700 1 osd.259 pg_epoch: 512356 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512353/512354 n=211842 ec=1175/1168 lis/c=512353/472766 les/c/f=512354/472770/232522 sis=512356 pruub=11.945847511s) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512356 pi=[472766,512356)/10 crt=512310'8145216 mlcod 0'0 unknown pruub 227.741577148s@ m=1 mbc={}] state<Start>: transitioning to Primary
> 2024-11-25T11:38:59.910+0000 7fd409870700 0 osd.259 pg_epoch: 512359 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512356/512357 n=211842 ec=1175/1168 lis/c=512356/472766 les/c/f=512357/472770/232522 sis=512356) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512356 pi=[472766,512356)/10 crt=512310'8145216 mlcod 0'0 active+recovering+degraded rops=1 m=1 mbc={0={(0+0)=1},1={(1+0)=1},2={(1+0)=1},3={(1+0)=1},4={(1+0)=1},5={(1+0)=1},6={(1+0)=1},7={(1+0)=1},8={(1+0)=1},9={(1+0)=1}} trimq=[13f6e~134]] get_remaining_shards not enough shards left to try for 2:0930c16c:::1009e1df26d.000000c9:head read result was read_result_t(r=0, errors={25(8)=-2,43(4)=-2,66(5)=-2,209(3)=-2,210(1)=-2,297(7)=-2,322(6)=-2,390(2)=-2}, noattrs, returned=(0, 8388608, []))
> 2024-11-25T11:38:59.911+0000 7fd409870700 0 osd.259 pg_epoch: 512359 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512356/512357 n=211842 ec=1175/1168 lis/c=512356/472766 les/c/f=512357/472770/232522 sis=512356) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512356 pi=[472766,512356)/10 crt=512310'8145216 mlcod 0'0 active+recovering+degraded rops=1 m=1 u=1 mbc={0={(0+0)=1},1={(0+0)=1},2={(0+0)=1},3={(0+0)=1},4={(0+0)=1},5={(0+0)=1},6={(0+0)=1},7={(0+0)=1},8={(0+0)=1},9={(1+0)=1}} trimq=[13f6e~134]] on_failed_pull 2:0930c16c:::1009e1df26d.000000c9:head from shard 25(8),43(4),66(5),209(3),210(1),297(7),322(6),390(2), reps on 374(9) unfound? 1
> 2024-11-25T11:39:01.435+0000 7fd409870700 -1 log_channel(cluster) log [ERR] : 2.c90 has 1 objects unfound and apparently lost
> 2024-11-25T11:39:02.193+0000 7fd409870700 -1 log_channel(cluster) log [ERR] : 2.c90 has 1 objects unfound and apparently lost
> 2024-11-25T11:39:03.590+0000 7fd409870700 1 osd.259 pg_epoch: 512362 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512356/512357 n=211842 ec=1175/1168 lis/c=512356/472766 les/c/f=512357/472770/232522 sis=512362 pruub=8.352689743s) [259,210,390,209,43,66,322,297,25,NONE]p259(0) r=0 lpr=512362 pi=[472766,512362)/11 crt=512310'8145216 mlcod 0'0 active pruub 232.812744141s@ m=1 u=1 mbc={0={(0+0)=1},1={(0+0)=1},2={(0+0)=1},3={(0+0)=1},4={(0+0)=1},5={(0+0)=1},6={(0+0)=1},7={(0+0)=1},8={(0+0)=1},9={(1+0)=1}}] start_peering_interval up [259,210,390,209,43,66,322,297,25,374] -> [259,210,390,209,43,66,322,297,25,2147483647], acting [259,210,390,209,43,66,322,297,25,374] -> [259,210,390,209,43,66,322,297,25,2147483647], acting_primary 259(0) -> 259, up_primary 259(0) -> 259, role 0 -> 0, features acting 4540138320759226367 upacting 4540138320759226367
> 2024-11-25T11:39:03.591+0000 7fd409870700 1 osd.259 pg_epoch: 512362 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512356/512357 n=211842 ec=1175/1168 lis/c=512356/472766 les/c/f=512357/472770/232522 sis=512362 pruub=8.352689743s) [259,210,390,209,43,66,322,297,25,NONE]p259(0) r=0 lpr=512362 pi=[472766,512362)/11 crt=512310'8145216 mlcod 0'0 unknown pruub 232.812744141s@ m=1 mbc={}] state<Start>: transitioning to Primary
> 2024-11-25T11:39:24.954+0000 7fd409870700 1 osd.259 pg_epoch: 512365 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512362/512363 n=211842 ec=1175/1168 lis/c=512362/472766 les/c/f=512363/472770/232522 sis=512365 pruub=11.731550217s) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512365 pi=[472766,512365)/10 crt=512310'8145216 mlcod 0'0 active pruub 257.554870605s@ m=1 mbc={0={(0+0)=1},1={(1+0)=1},2={(1+0)=1},3={(1+0)=1},4={(1+0)=1},5={(1+0)=1},6={(1+0)=1},7={(1+0)=1},8={(1+0)=1},9={(0+0)=1}}] start_peering_interval up [259,210,390,209,43,66,322,297,25,2147483647] -> [259,210,390,209,43,66,322,297,25,374], acting [259,210,390,209,43,66,322,297,25,2147483647] -> [259,210,390,209,43,66,322,297,25,374], acting_primary 259(0) -> 259, up_primary 259(0) -> 259, role 0 -> 0, features acting 4540138320759226367 upacting 4540138320759226367
> 2024-11-25T11:39:24.954+0000 7fd409870700 1 osd.259 pg_epoch: 512365 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512362/512363 n=211842 ec=1175/1168 lis/c=512362/472766 les/c/f=512363/472770/232522 sis=512365 pruub=11.731550217s) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512365 pi=[472766,512365)/10 crt=512310'8145216 mlcod 0'0 unknown pruub 257.554870605s@ m=1 mbc={}] state<Start>: transitioning to Primary
> 2024-11-25T11:39:30.679+0000 7fd409870700 0 osd.259 pg_epoch: 512368 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512365/512366 n=211842 ec=1175/1168 lis/c=512365/472766 les/c/f=512366/472770/232522 sis=512365) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512365 pi=[472766,512365)/10 crt=512310'8145216 mlcod 0'0 active+recovering+degraded rops=1 m=1 mbc={0={(0+0)=1},1={(1+0)=1},2={(1+0)=1},3={(1+0)=1},4={(1+0)=1},5={(1+0)=1},6={(1+0)=1},7={(1+0)=1},8={(1+0)=1},9={(1+0)=1}} trimq=[13f6e~134]] get_remaining_shards not enough shards left to try for 2:0930c16c:::1009e1df26d.000000c9:head read result was read_result_t(r=0, errors={25(8)=-2,43(4)=-2,66(5)=-2,209(3)=-2,210(1)=-2,297(7)=-2,322(6)=-2,390(2)=-2}, noattrs, returned=(0, 8388608, []))
> 2024-11-25T11:39:30.679+0000 7fd409870700 0 osd.259 pg_epoch: 512368 pg[2.c90s0( v 512310'8145216 lc 0'0 (511405'8142151,512310'8145216] local-lis/les=512365/512366 n=211842 ec=1175/1168 lis/c=512365/472766 les/c/f=512366/472770/232522 sis=512365) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512365 pi=[472766,512365)/10 crt=512310'8145216 mlcod 0'0 active+recovering+degraded rops=1 m=1 u=1 mbc={0={(0+0)=1},1={(0+0)=1},2={(0+0)=1},3={(0+0)=1},4={(0+0)=1},5={(0+0)=1},6={(0+0)=1},7={(0+0)=1},8={(0+0)=1},9={(1+0)=1}} trimq=[13f6e~134]] on_failed_pull 2:0930c16c:::1009e1df26d.000000c9:head from shard 25(8),43(4),66(5),209(3),210(1),297(7),322(6),390(2), reps on 374(9) unfound? 1
> 2024-11-25T11:40:11.652+0000 7fd409870700 -1 log_channel(cluster) log [ERR] : 2.c90 has 1 objects unfound and apparently lost
>
> [root@ceph-n30 ~]# rados -p ec82pool stat 1009e1df26d.000000c9
> ... # hanging indefinitely
>
> Similarly, if we run any objectstore commands, we get crashes and
> segfaults as well:
>
> [root@ceph-n25 ~]# ceph pg 2.c90 list_unfound
> {
>     "num_missing": 1,
>     "num_unfound": 1,
>     "objects": [
>         {
>             "oid": {
>                 "oid": "1009e1df26d.000000c9",
>                 "key": "",
>                 "snapid": -2,
>                 "hash": 914558096,
>                 "max": 0,
>                 "pool": 2,
>                 "namespace": ""
>             },
>             "need": "510502'8140206",
>             "have": "0'0",
>             "flags": "none",
>             "clean_regions": "clean_offsets: [], clean_omap: 0, new_object: 1",
>             "locations": [
>                 "374(9)"
>             ]
>         }
>     ],
>     "state": "NotRecovering",
>     "available_might_have_unfound": true,
>     "might_have_unfound": [],
>     "more": false
> }
> [root@ceph-n11 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-374 --debug --pgid 2.c90 1009e1df26d.000000c9 dump
> ...
>     -6> 2024-11-25T15:54:57.387+0000 7f8b6a0a7340 1 bluestore(/var/lib/ceph/osd/ceph-374) _upgrade_super from 4, latest 4
>     -5> 2024-11-25T15:54:57.387+0000 7f8b6a0a7340 1 bluestore(/var/lib/ceph/osd/ceph-374) _upgrade_super done
>     -4> 2024-11-25T15:54:57.439+0000 7f8b5d925700 5 prioritycache tune_memory target: 4294967296 mapped: 132694016 unmapped: 618364928 heap: 751058944 old mem: 134217728 new mem: 2761652735
>     -3> 2024-11-25T15:54:57.439+0000 7f8b5d925700 5 rocksdb: commit_cache_size High Pri Pool Ratio set to 0.0540541
>     -2> 2024-11-25T15:54:57.439+0000 7f8b5d925700 5 prioritycache tune_memory target: 4294967296 mapped: 132939776 unmapped: 618119168 heap: 751058944 old mem: 2761652735 new mem: 2842823159
>     -1> 2024-11-25T15:54:57.439+0000 7f8b5d925700 5 bluestore.MempoolThread(0x55cd3a807b40) _resize_shards cache_size: 2842823159 kv_alloc: 1241513984 kv_used: 2567024 kv_onode_alloc: 42949672 kv_onode_used: -22 meta_alloc: 1174405120 meta_used: 13360 data_alloc: 218103808 data_used: 0
>      0> 2024-11-25T15:54:57.445+0000 7f8b6a0a7340 -1 *** Caught signal (Segmentation fault) **
>  in thread 7f8b6a0a7340 thread_name:ceph-objectstor
>
>  ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
>  1: /lib64/libpthread.so.0(+0x12cf0) [0x7f8b67941cf0]
>  2: (BlueStore::collection_list(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, ghobject_t const&, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x4c) [0x55cd37e34f5c]
>  3: (_action_on_all_objects_in_pg(ObjectStore*, coll_t, action_on_object_t&, bool)+0x13b4) [0x55cd377fdf64]
>  4: (action_on_all_objects_in_exact_pg(ObjectStore*, coll_t, action_on_object_t&, bool)+0x64) [0x55cd377fe274]
>  5: main()
>  6: __libc_start_main()
>  7: _start()
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> We don't have a previous version of this object, and trying fix-lost
> with the objectstore tool also segfaults:
>
> [root@ceph-n11 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-374 --pgid 2.c90 --op fix-lost --dry-run
> *** Caught signal (Segmentation fault) **
>  in thread 7f45d9890340 thread_name:ceph-objectstor
>  ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
>  1: /lib64/libpthread.so.0(+0x12cf0) [0x7f45d712acf0]
>  2: (BlueStore::collection_list(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&, ghobject_t const&, int, std::vector<ghobject_t, std::allocator<ghobject_t> >*, ghobject_t*)+0x4c) [0x5556f3c36f5c]
>  3: (_action_on_all_objects_in_pg(ObjectStore*, coll_t, action_on_object_t&, bool)+0x13b4) [0x5556f35fff64]
>  4: (action_on_all_objects_in_exact_pg(ObjectStore*, coll_t, action_on_object_t&, bool)+0x64) [0x5556f3600274]
>  5: main()
>  6: __libc_start_main()
>  7: _start()
> Segmentation fault (core dumped)
>
> The recommended solution seems to be to use "mark_unfound_lost revert",
> but for our object there is no previous version, so I think this command
> will simply discard and delete the object. The lost object is on our live
> filesystem, and there seems to be no easy way to find the backup version,
> as I can't access the path and/or filename associated with the object in
> order to recover it from the backup. Is there any way for us to recover
> this object without discarding it? Or should we just accept our losses
> and delete it?
>
> Kindest regards,
>
> Ivan Clayson
>
> --
> Ivan Clayson
> -----------------
> Scientific Computing Officer
> Room 2N269
> Structural Studies
> MRC Laboratory of Molecular Biology
> Francis Crick Ave, Cambridge
> CB2 0QH
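For anyone gathering the details Joachim asks for at the top of this thread, here is a minimal sketch of the relevant commands, assuming the bulk data pool with ID 2 is the EC pool referred to as "ec82pool" in the rados commands above (adjust the pool and profile names to your environment):

    ceph osd pool ls detail                        # confirms which pool has ID 2 and names its erasure-code profile
    ceph osd pool get ec82pool all                 # full option set for the pool (assumes the bulk pool is ec82pool)
    ceph osd erasure-code-profile get <profile>    # k, m and failure domain for the profile reported above
    ceph pg 2.c90 query > pg-2.c90-query.json      # peering/recovery state, including the might_have_unfound list

The "query" output is probably the most useful piece here, since its recovery_state section records which OSDs the PG still intends to probe for the missing object.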
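On the question of tying the unfound object back to a file: a CephFS data object name is the file's inode number in hex followed by the object index, so 1009e1df26d.000000c9 should be chunk 0xc9 of inode 0x1009e1df26d. A rough sketch of two ways to turn that into a path, assuming the data pool is ec82pool and the filesystem is mounted at /mnt/cephfs (both are assumptions; neither command touches the unfound chunk itself):

    # Option 1: ask the mounted filesystem which path owns that inode
    # (a full tree walk, so potentially slow on a large filesystem)
    ino=$(printf '%d' 0x1009e1df26d)
    find /mnt/cephfs -inum "$ino"

    # Option 2: read the backtrace xattr stored on the file's first object and decode it.
    # This reads the .00000000 object rather than the unfound .000000c9 chunk, so it
    # should not hang the way "rados stat" on the unfound object did (untested assumption).
    rados -p ec82pool getxattr 1009e1df26d.00000000 parent > parent.bin
    ceph-dencoder type inode_backtrace_t import parent.bin decode dump_json

If either route yields a path, the affected file could be restored from backup once the unfound object itself has been dealt with; whether that is done via revert or delete remains the operator's call.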