Hello,
We have a Quincy (17.2.7) CephFS cluster on AlmaLinux 8.9 (4.x kernel),
with spinners for our bulk data and SSDs for the metadata, and we have
a single unfound object in the bulk pool:
[root@ceph-n30 ~]# ceph -s
cluster:
id: fa7cf62b-e261-49cd-b00e-383c36b79ef3
health: HEALTH_ERR
1/849660811 objects unfound (0.000%)
Possible data damage: 1 pg recovery_unfound
Degraded data redundancy: 9/8468903874 objects degraded
(0.000%), 1 pg degraded
services:
mon: 3 daemons, quorum ceph-s2,ceph-s3,ceph-s1 (age 44h)
mgr: ceph-s2(active, since 45h), standbys: ceph-s3, ceph-s1
mds: 1/1 daemons up, 3 standby
osd: 439 osds: 439 up (since 43h), 439 in (since 43h); 176
remapped pgs
data:
volumes: 1/1 healthy
pools: 9 pools, 4321 pgs
objects: 849.66M objects, 2.3 PiB
usage: 3.0 PiB used, 1.7 PiB / 4.6 PiB avail
pgs: 9/8468903874 objects degraded (0.000%)
36630744/8468903874 objects misplaced (0.433%)
1/849660811 objects unfound (0.000%)
4122 active+clean
174 active+remapped+backfill_wait
22 active+clean+scrubbing+deep
2 active+remapped+backfilling
1 active+recovery_unfound+degraded
io:
client: 669 MiB/s rd, 87 MiB/s wr, 302 op/s rd, 77 op/s wr
recovery: 175 MiB/s, 59 objects/s
[root@ceph-n30 ~]# ceph health detail | grep unfound
HEALTH_ERR 1/849661114 objects unfound (0.000%); Possible data
damage: 1 pg recovery_unfound; Degraded data redundancy:
9/8468906904 objects degraded (0.000%), 1 pg degraded
[WRN] OBJECT_UNFOUND: 1/849661114 objects unfound (0.000%)
pg 2.c90 has 1 unfound objects
[ERR] PG_DAMAGED: Possible data damage: 1 pg recovery_unfound
pg 2.c90 is active+recovery_unfound+degraded, acting
[259,210,390,209,43,66,322,297,25,374], 1 unfound
pg 2.c90 is active+recovery_unfound+degraded, acting
[259,210,390,209,43,66,322,297,25,374], 1 unfound
We've tried deep-scrubbing and repairing the PG, as well as rebooting the
entire cluster, but unfortunately this has not resolved our issue.
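For context, the scrub and repair attempts were roughly of this form
(exact invocations from memory):

    # ask the primary to deep-scrub and then repair PG 2.c90
    ceph pg deep-scrub 2.c90
    ceph pg repair 2.c90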
The primary OSD's (osd.259) log reports that our 1009e1df26d.000000c9
object is missing, and any rados command on the object simply hangs:
[root@ceph-n30 ~]# grep 2.c90 /var/log/ceph/ceph-osd.259.log
...
2024-11-25T11:38:33.860+0000 7fd409870700 1 osd.259 pg_epoch:
512353 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512348/512349 n=211842
ec=1175/1168 lis/c=512348/472766 les/c/f=512349/472770/232522
sis=512353 pruub=11.010143280s)
[259,210,390,209,43,66,322,297,NONE,374]p259(0) r=0 lpr=512353
pi=[472766,512353)/11 crt=512310'8145216 mlcod 0'0 unknown pruub
205.739364624s@ m=1 mbc={}] state<Start>: transitioning to Primary
2024-11-25T11:38:54.926+0000 7fd409870700 1 osd.259 pg_epoch:
512356 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512353/512354 n=211842
ec=1175/1168 lis/c=512353/472766 les/c/f=512354/472770/232522
sis=512356 pruub=11.945847511s)
[259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512356
pi=[472766,512356)/10 crt=512310'8145216 mlcod 0'0 active pruub
227.741577148s@ m=1
mbc={0={(0+0)=1},1={(1+0)=1},2={(1+0)=1},3={(1+0)=1},4={(1+0)=1},5={(1+0)=1},6={(1+0)=1},7={(1+0)=1},8={(0+0)=1},9={(1+0)=1}}]
start_peering_interval up
[259,210,390,209,43,66,322,297,2147483647,374] ->
[259,210,390,209,43,66,322,297,25,374], acting
[259,210,390,209,43,66,322,297,2147483647,374] ->
[259,210,390,209,43,66,322,297,25,374], acting_primary 259(0) ->
259, up_primary 259(0) -> 259, role 0 -> 0, features acting
4540138320759226367 upacting 4540138320759226367
2024-11-25T11:38:54.926+0000 7fd409870700 1 osd.259 pg_epoch:
512356 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512353/512354 n=211842
ec=1175/1168 lis/c=512353/472766 les/c/f=512354/472770/232522
sis=512356 pruub=11.945847511s)
[259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512356
pi=[472766,512356)/10 crt=512310'8145216 mlcod 0'0 unknown pruub
227.741577148s@ m=1 mbc={}] state<Start>: transitioning to Primary
2024-11-25T11:38:59.910+0000 7fd409870700 0 osd.259 pg_epoch:
512359 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512356/512357 n=211842
ec=1175/1168 lis/c=512356/472766 les/c/f=512357/472770/232522
sis=512356) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0
lpr=512356 pi=[472766,512356)/10 crt=512310'8145216 mlcod 0'0
active+recovering+degraded rops=1 m=1
mbc={0={(0+0)=1},1={(1+0)=1},2={(1+0)=1},3={(1+0)=1},4={(1+0)=1},5={(1+0)=1},6={(1+0)=1},7={(1+0)=1},8={(1+0)=1},9={(1+0)=1}}
trimq=[13f6e~134]] get_remaining_shards not enough shards left to
try for 2:0930c16c:::1009e1df26d.000000c9:head read result was
read_result_t(r=0,
errors={25(8)=-2,43(4)=-2,66(5)=-2,209(3)=-2,210(1)=-2,297(7)=-2,322(6)=-2,390(2)=-2},
noattrs, returned=(0, 8388608, []))
2024-11-25T11:38:59.911+0000 7fd409870700 0 osd.259 pg_epoch:
512359 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512356/512357 n=211842
ec=1175/1168 lis/c=512356/472766 les/c/f=512357/472770/232522
sis=512356) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0
lpr=512356 pi=[472766,512356)/10 crt=512310'8145216 mlcod 0'0
active+recovering+degraded rops=1 m=1 u=1
mbc={0={(0+0)=1},1={(0+0)=1},2={(0+0)=1},3={(0+0)=1},4={(0+0)=1},5={(0+0)=1},6={(0+0)=1},7={(0+0)=1},8={(0+0)=1},9={(1+0)=1}}
trimq=[13f6e~134]] on_failed_pull
2:0930c16c:::1009e1df26d.000000c9:head from shard
25(8),43(4),66(5),209(3),210(1),297(7),322(6),390(2), reps on 374(9)
unfound? 1
2024-11-25T11:39:01.435+0000 7fd409870700 -1 log_channel(cluster)
log [ERR] : 2.c90 has 1 objects unfound and apparently lost
2024-11-25T11:39:02.193+0000 7fd409870700 -1 log_channel(cluster)
log [ERR] : 2.c90 has 1 objects unfound and apparently lost
2024-11-25T11:39:03.590+0000 7fd409870700 1 osd.259 pg_epoch:
512362 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512356/512357 n=211842
ec=1175/1168 lis/c=512356/472766 les/c/f=512357/472770/232522
sis=512362 pruub=8.352689743s)
[259,210,390,209,43,66,322,297,25,NONE]p259(0) r=0 lpr=512362
pi=[472766,512362)/11 crt=512310'8145216 mlcod 0'0 active pruub
232.812744141s@ m=1 u=1
mbc={0={(0+0)=1},1={(0+0)=1},2={(0+0)=1},3={(0+0)=1},4={(0+0)=1},5={(0+0)=1},6={(0+0)=1},7={(0+0)=1},8={(0+0)=1},9={(1+0)=1}}]
start_peering_interval up [259,210,390,209,43,66,322,297,25,374] ->
[259,210,390,209,43,66,322,297,25,2147483647], acting
[259,210,390,209,43,66,322,297,25,374] ->
[259,210,390,209,43,66,322,297,25,2147483647], acting_primary 259(0)
-> 259, up_primary 259(0) -> 259, role 0 -> 0, features acting
4540138320759226367 upacting 4540138320759226367
2024-11-25T11:39:03.591+0000 7fd409870700 1 osd.259 pg_epoch:
512362 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512356/512357 n=211842
ec=1175/1168 lis/c=512356/472766 les/c/f=512357/472770/232522
sis=512362 pruub=8.352689743s)
[259,210,390,209,43,66,322,297,25,NONE]p259(0) r=0 lpr=512362
pi=[472766,512362)/11 crt=512310'8145216 mlcod 0'0 unknown pruub
232.812744141s@ m=1 mbc={}] state<Start>: transitioning to Primary
2024-11-25T11:39:24.954+0000 7fd409870700 1 osd.259 pg_epoch:
512365 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512362/512363 n=211842
ec=1175/1168 lis/c=512362/472766 les/c/f=512363/472770/232522
sis=512365 pruub=11.731550217s)
[259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512365
pi=[472766,512365)/10 crt=512310'8145216 mlcod 0'0 active pruub
257.554870605s@ m=1
mbc={0={(0+0)=1},1={(1+0)=1},2={(1+0)=1},3={(1+0)=1},4={(1+0)=1},5={(1+0)=1},6={(1+0)=1},7={(1+0)=1},8={(1+0)=1},9={(0+0)=1}}]
start_peering_interval up
[259,210,390,209,43,66,322,297,25,2147483647] ->
[259,210,390,209,43,66,322,297,25,374], acting
[259,210,390,209,43,66,322,297,25,2147483647] ->
[259,210,390,209,43,66,322,297,25,374], acting_primary 259(0) ->
259, up_primary 259(0) -> 259, role 0 -> 0, features acting
4540138320759226367 upacting 4540138320759226367
2024-11-25T11:39:24.954+0000 7fd409870700 1 osd.259 pg_epoch:
512365 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512362/512363 n=211842
ec=1175/1168 lis/c=512362/472766 les/c/f=512363/472770/232522
sis=512365 pruub=11.731550217s)
[259,210,390,209,43,66,322,297,25,374]p259(0) r=0 lpr=512365
pi=[472766,512365)/10 crt=512310'8145216 mlcod 0'0 unknown pruub
257.554870605s@ m=1 mbc={}] state<Start>: transitioning to Primary
2024-11-25T11:39:30.679+0000 7fd409870700 0 osd.259 pg_epoch:
512368 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512365/512366 n=211842
ec=1175/1168 lis/c=512365/472766 les/c/f=512366/472770/232522
sis=512365) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0
lpr=512365 pi=[472766,512365)/10 crt=512310'8145216 mlcod 0'0
active+recovering+degraded rops=1 m=1
mbc={0={(0+0)=1},1={(1+0)=1},2={(1+0)=1},3={(1+0)=1},4={(1+0)=1},5={(1+0)=1},6={(1+0)=1},7={(1+0)=1},8={(1+0)=1},9={(1+0)=1}}
trimq=[13f6e~134]] get_remaining_shards not enough shards left to
try for 2:0930c16c:::1009e1df26d.000000c9:head read result was
read_result_t(r=0,
errors={25(8)=-2,43(4)=-2,66(5)=-2,209(3)=-2,210(1)=-2,297(7)=-2,322(6)=-2,390(2)=-2},
noattrs, returned=(0, 8388608, []))
2024-11-25T11:39:30.679+0000 7fd409870700 0 osd.259 pg_epoch:
512368 pg[2.c90s0( v 512310'8145216 lc 0'0
(511405'8142151,512310'8145216] local-lis/les=512365/512366 n=211842
ec=1175/1168 lis/c=512365/472766 les/c/f=512366/472770/232522
sis=512365) [259,210,390,209,43,66,322,297,25,374]p259(0) r=0
lpr=512365 pi=[472766,512365)/10 crt=512310'8145216 mlcod 0'0
active+recovering+degraded rops=1 m=1 u=1
mbc={0={(0+0)=1},1={(0+0)=1},2={(0+0)=1},3={(0+0)=1},4={(0+0)=1},5={(0+0)=1},6={(0+0)=1},7={(0+0)=1},8={(0+0)=1},9={(1+0)=1}}
trimq=[13f6e~134]] on_failed_pull
2:0930c16c:::1009e1df26d.000000c9:head from shard
25(8),43(4),66(5),209(3),210(1),297(7),322(6),390(2), reps on 374(9)
unfound? 1
2024-11-25T11:40:11.652+0000 7fd409870700 -1 log_channel(cluster)
log [ERR] : 2.c90 has 1 objects unfound and apparently lost
[root@ceph-n30 ~]# rados -p ec82pool stat 1009e1df26d.000000c9
... # hanging indefinitely
Similarly, listing the unfound object shows a copy should exist on shard
374(9), but running ceph-objectstore-tool commands against that OSD
crashes with segmentation faults:
[root@ceph-n25 ~]# ceph pg 2.c90 list_unfound
{
"num_missing": 1,
"num_unfound": 1,
"objects": [
{
"oid": {
"oid": "1009e1df26d.000000c9",
"key": "",
"snapid": -2,
"hash": 914558096,
"max": 0,
"pool": 2,
"namespace": ""
},
"need": "510502'8140206",
"have": "0'0",
"flags": "none",
"clean_regions": "clean_offsets: [], clean_omap: 0,
new_object: 1",
"locations": [
"374(9)"
]
}
],
"state": "NotRecovering",
"available_might_have_unfound": true,
"might_have_unfound": [],
"more": false
}
[root@ceph-n11 ~]# ceph-objectstore-tool --data-path
/var/lib/ceph/osd/ceph-374 --debug --pgid 2.c90 1009e1df26d.000000c9
dump
...
-6> 2024-11-25T15:54:57.387+0000 7f8b6a0a7340 1
bluestore(/var/lib/ceph/osd/ceph-374) _upgrade_super from 4, latest 4
-5> 2024-11-25T15:54:57.387+0000 7f8b6a0a7340 1
bluestore(/var/lib/ceph/osd/ceph-374) _upgrade_super done
-4> 2024-11-25T15:54:57.439+0000 7f8b5d925700 5 prioritycache
tune_memory target: 4294967296 mapped: 132694016 unmapped: 618364928
heap: 751058944 old mem: 134217728 new mem: 2761652735
-3> 2024-11-25T15:54:57.439+0000 7f8b5d925700 5 rocksdb:
commit_cache_size High Pri Pool Ratio set to 0.0540541
-2> 2024-11-25T15:54:57.439+0000 7f8b5d925700 5 prioritycache
tune_memory target: 4294967296 mapped: 132939776 unmapped: 618119168
heap: 751058944 old mem: 2761652735 new mem: 2842823159
-1> 2024-11-25T15:54:57.439+0000 7f8b5d925700 5
bluestore.MempoolThread(0x55cd3a807b40) _resize_shards cache_size:
2842823159 kv_alloc: 1241513984 kv_used: 2567024 kv_onode_alloc:
42949672 kv_onode_used: -22 meta_alloc: 1174405120 meta_used: 13360
data_alloc: 218103808 data_u
sed: 0
0> 2024-11-25T15:54:57.445+0000 7f8b6a0a7340 -1 *** Caught
signal (Segmentation fault) **
in thread 7f8b6a0a7340 thread_name:ceph-objectstor
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2)
quincy (stable)
1: /lib64/libpthread.so.0(+0x12cf0) [0x7f8b67941cf0]
2:
(BlueStore::collection_list(boost::intrusive_ptr<ObjectStore::CollectionImpl>&,
ghobject_t const&, ghobject_t const&, int, std::vector<ghobject_t,
std::allocator<ghobject_t> >*, ghobject_t*)+0x4c) [0x55cd37e34f5c]
3: (_action_on_all_objects_in_pg(ObjectStore*, coll_t,
action_on_object_t&, bool)+0x13b4) [0x55cd377fdf64]
4: (action_on_all_objects_in_exact_pg(ObjectStore*, coll_t,
action_on_object_t&, bool)+0x64) [0x55cd377fe274]
5: main()
6: __libc_start_main()
7: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
We don't have a previous version of this object, and trying fix-lost
with ceph-objectstore-tool also seg faults:
[root@ceph-n11 ~]# ceph-objectstore-tool --data-path
/var/lib/ceph/osd/ceph-374 --pgid 2.c90 --op fix-lost --dry-run
*** Caught signal (Segmentation fault) **
in thread 7f45d9890340 thread_name:ceph-objectstor
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2)
quincy (stable)
1: /lib64/libpthread.so.0(+0x12cf0) [0x7f45d712acf0]
2:
(BlueStore::collection_list(boost::intrusive_ptr<ObjectStore::CollectionImpl>&,
ghobject_t const&, ghobject_t const&, int, std::vector<ghobject_t,
std::allocator<ghobject_t> >*, ghobject_t*)+0x4c) [0x5556f3c36f5c]
3: (_action_on_all_objects_in_pg(ObjectStore*, coll_t,
action_on_object_t&, bool)+0x13b4) [0x5556f35fff64]
4: (action_on_all_objects_in_exact_pg(ObjectStore*, coll_t,
action_on_object_t&, bool)+0x64) [0x5556f3600274]
5: main()
6: __libc_start_main()
7: _start()
Segmentation fault (core dumped)
The recommended solution seems to be "mark_unfound_lost revert", but
since there is no previous version of our object, I think this command
will simply discard and delete it. The lost object is on our live
filesystem, and there seems to be no easy way to find the backup copy,
as I can't determine the path or filename associated with the object in
order to recover it from the backup. Is there any way for us to recover
this object without discarding it, or should we just accept our losses
and delete it?
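For reference, the command in question would be one of the following; as
I understand the documentation, revert rolls back to a previous version
if one exists (and otherwise forgets the object entirely), while delete
forgets it outright:

    # roll back to a previous version, or forget the object if none exists
    ceph pg 2.c90 mark_unfound_lost revert

    # forget (discard) the object entirely
    ceph pg 2.c90 mark_unfound_lost delete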
Kindest regards,
Ivan Clayson
--
Ivan Clayson
-----------------
Scientific Computing Officer
Room 2N269
Structural Studies
MRC Laboratory of Molecular Biology
Francis Crick Ave, Cambridge
CB2 0QH
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx