Re: ceph-objectstore-tool core dump

Hello,

I have also recently dealt with a couple of inconsistent PGs, in an EC 8+2
pool on 12 TB HDDs.

In one case, 'ceph pg repair' was able to clear the issue.  In a second
case it would not do so without other intervention.

As documented, I used 'ceph health detail' to locate the problem PG, and
then 'rados list-inconsistent-obj {pg-num} --format=json-pretty | less' to
examine the errors reported for each inconsistent object.
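For anyone following along, the JSON from 'rados list-inconsistent-obj' can
be boiled down to just the object names and their shard errors with jq.  The
sample below only mimics the output shape (abridged; the object name is the
one from this thread, the rest is illustrative):

```shell
# Abridged mock of 'rados list-inconsistent-obj <pgid> --format=json' output;
# the object name comes from this thread, everything else is illustrative.
cat > /tmp/inc.json <<'EOF'
{
  "epoch": 12345,
  "inconsistents": [
    {
      "object": {"name": "10000179969.00000168", "nspace": "", "snap": "head"},
      "errors": [],
      "union_shard_errors": ["read_error"],
      "shards": [
        {"osd": 143, "shard": 1, "errors": ["read_error"]}
      ]
    }
  ]
}
EOF

# One line per damaged object: name plus the union of its shard errors.
jq -r '.inconsistents[] | "\(.object.name): \(.union_shard_errors | join(","))"' /tmp/inc.json
# -> 10000179969.00000168: read_error
```

On a real cluster you would pipe the actual 'rados list-inconsistent-obj'
output into the same jq filter instead of the mock file.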

For the second case, the above rados command indicated that one shard had
experienced a 'Read Error'.  Since this was an 8+2 EC pool with two coding
shards, the object was still intact, but 'pg repair' wouldn't clear the
issue.  In the end, we copied the object out, deleted it, and then copied
it back in.

Note that after doing this, the PG still showed 'inconsistent', though
'rados list-inconsistent-obj' no longer reported any errors.  To finally
clear the 'inconsistent' status we had to run another 'pg repair' after the
object repair.
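In case it helps others, the copy-out/delete/copy-back dance looks roughly
like this.  The pool name 'cephfs_data' is a placeholder, and the object
name is the one from Mike's log below; substitute whatever 'rados
list-inconsistent-obj' reports in your cluster:

```shell
# Sketch of the manual object repair -- pool name and object name are
# placeholders.  Run only against a live cluster, on the affected pool.
rados -p cephfs_data get 10000179969.00000168 /tmp/object.bak   # copy out
rados -p cephfs_data rm  10000179969.00000168                   # delete the damaged copy
rados -p cephfs_data put 10000179969.00000168 /tmp/object.bak   # write it back
ceph pg repair 23.1fa   # a second repair was needed to clear 'inconsistent'
```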

Since then all is good.

-Dave

--
Dave Hall
Binghamton University
kdhall@xxxxxxxxxxxxxx

On Sun, Oct 3, 2021 at 1:09 PM 胡 玮文 <huww98@xxxxxxxxxxx> wrote:

>
> > On Oct 4, 2021, at 00:53, Michael Thomas <wart@xxxxxxxxxxx> wrote:
> >
> > I recently started getting inconsistent PGs in my Octopus (15.2.14)
> ceph cluster.  I was able to determine that they are all coming from the
> same OSD: osd.143.  This host recently suffered an unplanned power loss,
> so I'm not surprised that there may be some corruption.  The affected PG
> is part of an EC 8+2 pool.
> >
> > The OSD logs from the PG's primary OSD show this and similar errors from
> the PG's most recent deep scrub:
> >
> > 2021-10-03T03:25:25.969-0500 7f6e6801f700 -1 log_channel(cluster) log
> [ERR] : 23.1fa shard 143(1) soid 23:5f8c3d4e:::10000179969.00000168:head :
> candidate had a read error
> >
> > In attempting to fix it, I first ran 'ceph pg repair 23.1fa' on the PG.
> This accomplished nothing.  Next I ran a shallow fsck on the OSD:
>
> I expect this 'ceph pg repair' command should handle this kind of error.
> After issuing it, the PG should enter a state like
> 'active+clean+scrubbing+deep+inconsistent+repair'; once the repair
> finishes (which can take hours), the PG should recover from the
> inconsistent state.  What do you mean by 'This accomplished nothing'?
>
> > # ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-143
> > fsck success
> >
> > I estimated that a deep fsck will take ~24 hours to run on this mostly
> full 16TB HDD.  Before doing that, I wanted to see if I could simply remove
> the offending object and let ceph recover itself.  Unfortunately,
> ceph-objectstore-tool core dumps when I try to remove this object:
> >
> > # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-143 --pgid
> 23.1fa
> '{"oid":"10000179969.00000168","key":"","snapid":-2,"hash":1924936186,"max":0,"pool":23,"namespace":"","shard_id":1,"max":0}'
> remove
> > *** Caught signal (Segmentation fault) **
> > in thread 7fdc491a88c0 thread_name:ceph-objectstor
> > ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus
> (stable)
> > 1: (()+0xf630) [0x7fdc3e62a630]
> > 2: (__pthread_rwlock_rdlock()+0xb) [0x7fdc3e62614b]
> > 3:
> (BlueStore::collection_bits(boost::intrusive_ptr<ObjectStore::CollectionImpl>&)+0x148)
> [0x5583c8fa7878]
> > 4: (main()+0x4b50) [0x5583c8a85270]
> > 5: (__libc_start_main()+0xf5) [0x7fdc3cfe7555]
> > 6: (()+0x39d3a0) [0x5583c8ab03a0]
> > Segmentation fault (core dumped)
> >
> > As a last resort, I know that I can map this OID back to the cephfs file
> and simply remove/restore the offending file to fix the object.  But before
> I do that, I'm running a deep fsck to see if that can fix this and the
> other inconsistent objects.  In the meantime, I wondered if there was
> anything else I could do to clean up this inconsistent PG?
> >
> > --Mike
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
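On Mike's last point, mapping the object back to a CephFS file: the leading
component of a CephFS data-object name is the owning file's inode number in
hex, and the suffix is the object index within the file.  Assuming a CephFS
mount at /mnt/cephfs (that path is a placeholder), the file can be located
like this:

```shell
# CephFS data-object names look like "<inode-hex>.<object-index>";
# "10000179969" is the hex inode from the log above.
ino=$(printf '%d' 0x10000179969)
echo "$ino"    # -> 1099513174377
# Locate the owning file under the CephFS mount (mount point is an assumption):
#   find /mnt/cephfs -inum "$ino"
```

Once found, the file can be copied aside, deleted, and restored, which
rewrites all of its objects including the damaged one.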



