Hi Eugen,
Many thanks for your reply.
The other two OSDs are up and running, and are being used by other pgs
with no problem; for some reason this pg refuses to use them.
The other two OSDs that are missing from this pg crashed at different
times last month; each OSD crashed when we tried to fix a pg with
recovery_unfound by running a command like:
# ceph pg 5.3fa mark_unfound_lost delete
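(For anyone following the thread, the usual checks before running that
command are something like:
# ceph health detail
# ceph pg 5.3fa list_unfound
the first shows which pgs have unfound objects, the second lists the
unfound objects in the example pg.)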
The OSD crash is shown in the OSD log file here:
<https://www.mrc-lmb.cam.ac.uk/scicomp/ceph-osd.443.log.gz>
"mark_unfound_lost delete" occurs at line 3708
This caused the primary OSD to crash with:
PrimaryLogPG.cc: 11550: FAILED ceph_assert(head_obc)
When the OSD tries to restart, we see lots of log entries similar to:
-3> 2020-02-10 12:25:58.795 7f5935dfe700 1 get compressor lz4 =
0x55cd193d34a0
and...
-1274> 2020-02-10 12:23:24.661 7f5936e00700 5
bluestore(/var/lib/ceph/osd/ceph-443) _do_alloc_write 0x20000 bytes
compressed using lz4 failed with errcode = -1, leaving uncompressed
The OSD then repeatedly crashes with "PrimaryLogPG.cc: 11550: FAILED
ceph_assert(head_obc)", but with no further "compressor lz4" entries.
The only fix we found was to destroy & recreate the OSD and then allow
ceph to recover.
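(For reference, a destroy & recreate of that kind is roughly the
following; the device path below is illustrative:
# ceph osd destroy 443 --yes-i-really-mean-it
# ceph-volume lvm zap /dev/sdX --destroy
# ceph-volume lvm create --osd-id 443 --data /dev/sdX
i.e. destroy the OSD id, wipe the backing device, re-create the OSD with
the same id, then let backfill run.)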
We thought that we could fix the small number of recovery_unfound pgs by
allowing their primary OSD to crash, and then recreating it.
Unfortunately, while I was waiting for the pg to heal, we seem to have
been caught by another bug, as another OSD in this pg was hit by
"OSD: FAILED ceph_assert(clone_size.count(clone))". That log is here:
<https://www.mrc-lmb.cam.ac.uk/scicomp/ceph-osd.287.log.gz>
the full "ceph pg dump" for this failed pg is
[root@ceph1 ~]# ceph pg dump | grep ^5.750
dumped all
5.750   190408 objects   0 missing_on_primary   0 degraded   0 misplaced   0 unfound
        569643615603 bytes   0 omap_bytes   0 omap_keys   log 3090   disk_log 3090
        state: down+remapped   (since 2020-03-25 11:17:47.228805)
        version 35398'3381328   reported 35968:3266057
        up:     [234,354,304,388,125,25,427,226,77,154]   up_primary 234
        acting: [NONE,NONE,NONE,388,125,25,427,226,77,154]   acting_primary 388
        last_scrub 24471'3200829   (2020-01-28 15:48:35.574934)
        last_deep_scrub 24471'3200829   (2020-01-28 15:48:35.574934)
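(The per-pg detail, including the recovery_state section, can also be
seen with:
# ceph pg 5.750 query
which may show why the three NONE shards are not being backfilled.)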
I did notice this other LZ4 corruption bug:
https://tracker.ceph.com/issues/39525 - not sure if there is any relation.
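(If it is related, the obvious mitigation would be to stop using lz4
until this is understood; as a sketch, depending on whether compression
is enabled at the OSD or the pool level here, something like:
# ceph config set osd bluestore_compression_algorithm snappy
# ceph osd pool set <data pool> compression_mode none
where <data pool> is a placeholder for the EC data pool name.)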
best regards,
Jake
On 25/03/2020 14:22, Eugen Block wrote:
Hi,
Is there any chance to recover the other failing OSDs that seem to
have one chunk of this PG? Do the other OSDs fail with the same error?
Quoting Jake Grimmett <jog@xxxxxxxxxxxxxxxxx>:
Dear All,
We are "in a bit of a pickle"...
No reply to my message (23/03/2020), subject "OSD: FAILED
ceph_assert(clone_size.count(clone))", so I'm presuming it's not
possible to recover the crashed OSD.
This is bad news, as one pg may be lost (we are using EC 8+2; pg
dump shows [NONE,NONE,NONE,388,125,25,427,226,77,154]).
Without this pg we have 1.8PB of broken cephfs.
I could rebuild the cluster from scratch, but this means no user
backups for a couple of weeks.
The cluster has 10 nodes, uses an EC 8+2 pool for cephfs data
(with a replicated NVMe metadata pool), and is running Nautilus 14.2.8.
Clearly, it would be nicer if we could fix the OSD, but if this isn't
possible, can someone confirm that the right procedure to recover
from a corrupt pg is:
1) Stop all client access
2) Find all files that store data on the bad pg, with:
# cephfs-data-scan pg_files /backup 5.750 2> /dev/null > /root/bad_files
3) Delete all of these bad files - presumably using truncate? Or is
"rm" fine? (see the sketch after this list)
4) Destroy the bad pg:
# ceph osd force-create-pg 5.750
5) Copy the missing files back with rsync or similar...
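For step 3, assuming the pg_files output in /root/bad_files is one path
per line as seen on the cephfs mount, the deletion could be as simple as:
# while read -r f; do rm -f -- "$f"; done < /root/bad_files
(this sketch uses rm; whether truncate is needed instead is exactly the
question in step 3).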
a better "recipe" or other advice gratefully received,
best regards,
Jake
Note: I am working from home until further notice.
For help, contact unixadmin@xxxxxxxxxxxxxxxxx
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
Phone 01223 267019
Mobile 0776 9886539
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx