Re: PG stuck incomplete

On Friday, 21 September 2018 at 19:45 +0200, Paul Emmerich wrote:
> The cache tiering has nothing to do with the PG of the underlying
> pool being incomplete. You are just seeing these requests as stuck
> because it's the only thing trying to write to the underlying pool.

I agree. It was just to be sure that the problems on OSDs 32, 68 and
69 all come down to a single "real" problem.


> What you need to fix is the PG showing incomplete. I assume you
> already tried reducing the min_size to 4 as suggested? Or did you by
> chance always run with min_size 4 on the EC pool? That is a common
> cause of problems like this.

Yes, it has always run with min_size 4.
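
(For reference, "ceph osd pool get bkp-foo-raid6 min_size" reports
"min_size: 4" here, matching the pool details below.)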

We run Luminous 12.2.8 here, but some OSDs (~40%) still run Luminous
12.2.7. I was hoping to "fix" this problem before continuing the
upgrade.
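
(The version mix per daemon is visible with "ceph versions".)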

Pool details (from "ceph osd pool ls detail"):

pool 37 'bkp-foo-raid6' erasure size 6 min_size 4 crush_rule 20
object_hash rjenkins pg_num 256 pgp_num 256 last_change 585715 lfor
585714/585714 flags hashpspool,backfillfull stripe_width 4096 fast_read
1 application rbd
	removed_snaps [1~3]




> Can you share the output of "ceph osd pool ls detail"?
> Also, which version of Ceph are you running?
> Paul
> 
> On Fri, 21 Sep 2018 at 19:28, Olivier Bonvalet <ceph.list@xxxxxxxxx> wrote:
> > 
> > So I've totally disabled cache-tiering and the overlay. Now OSDs
> > 68 & 69 are fine, no longer blocked.
> > 
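> > (Roughly, something like "ceph osd tier cache-mode cache-bkp-foo
> > none --yes-i-really-mean-it" followed by "ceph osd tier
> > remove-overlay bkp-foo-raid6".)
> > 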
> > But OSD 32 is still blocked, and PG 37.9c is still marked
> > incomplete with:
> > 
> >     "recovery_state": [
> >         {
> >             "name": "Started/Primary/Peering/Incomplete",
> >             "enter_time": "2018-09-21 18:56:01.222970",
> >             "comment": "not enough complete instances of this PG"
> >         },
> > 
> > But I don't see blocked requests in the OSD 32 logs; should I
> > increase one of the "debug_xx" flags?
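> > 
> > (e.g. something like "ceph tell osd.32 injectargs '--debug_osd 10
> > --debug_ms 1'"?)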
> > 
> > 
> > On Friday, 21 September 2018 at 16:51 +0200, Maks Kowalik wrote:
> > > According to the query output you pasted, shards 1 and 2 are
> > > broken. But, on the other hand, an EC profile of 4+2 (k=4 data
> > > chunks, m=2 coding chunks) should make it possible to recover
> > > from 2 shards lost simultaneously...
> > > 
> > > On Fri, 21 Sep 2018 at 16:29, Olivier Bonvalet <ceph.list@xxxxxxxxx> wrote:
> > > > Well, on disk I can find those shards:
> > > > 
> > > > - cs0 on OSDs 29 and 30
> > > > - cs1 on OSDs 18 and 19
> > > > - cs2 on OSD 13
> > > > - cs3 on OSD 66
> > > > - cs4 on OSD 0
> > > > - cs5 on OSD 75
> > > > 
> > > > And I can read those files too.
> > > > 
> > > > And all those OSDs are up and in.
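> > > > 
> > > > (Found with something like "find /var/lib/ceph/osd/ceph-*/current
> > > > -name '*<object name>*'" on each host; this assumes FileStore,
> > > > and <object name> is a placeholder.)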
> > > > 
> > > > 
> > > > On Friday, 21 September 2018 at 13:10 +0000, Eugen Block wrote:
> > > > > > > I tried to flush the cache with "rados -p cache-bkp-foo
> > > > > > > cache-flush-evict-all", but it blocks on the object
> > > > > > > "rbd_data.f66c92ae8944a.00000000000f2596".
> > > > > 
> > > > > This is the object that's stuck in the cache tier (according
> > > > > to your output in https://pastebin.com/zrwu5X0w). Can you
> > > > > verify whether that block device is in use and healthy, or
> > > > > corrupt?
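> > > > > 
> > > > > (One way, for instance: identify the RBD image whose "rbd info"
> > > > > reports block_name_prefix rbd_data.f66c92ae8944a, then try a
> > > > > full read with something like "rbd export <pool>/<image> -" to
> > > > > /dev/null; <pool>/<image> being placeholders.)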
> > > > > 
> > > > > 
> > > > > Quoting Maks Kowalik <maks_kowalik@xxxxxxxxx>:
> > > > > 
> > > > > > Could you please paste the output of "ceph pg 37.9c query"?
> > > > > > 
> > > > > > On Fri, 21 Sep 2018 at 14:39, Olivier Bonvalet
> > > > > > <ceph.list@xxxxxxxxx> wrote:
> > > > > > 
> > > > > > > In fact, one object (only one) seems to be blocked on the
> > > > > > > cache tier (writeback).
> > > > > > > 
> > > > > > > I tried to flush the cache with "rados -p cache-bkp-foo
> > > > > > > cache-flush-evict-all", but it blocks on the object
> > > > > > > "rbd_data.f66c92ae8944a.00000000000f2596".
> > > > > > > 
> > > > > > > So I reduced (a lot) the cache tier to 200MB; "rados -p
> > > > > > > cache-bkp-foo ls" now shows only 3 objects:
> > > > > > > 
> > > > > > >     rbd_directory
> > > > > > >     rbd_data.f66c92ae8944a.00000000000f2596
> > > > > > >     rbd_header.f66c92ae8944a
> > > > > > > 
> > > > > > > And "cache-flush-evict-all" still hangs.
> > > > > > > 
> > > > > > > I also switched the cache tier to "readproxy" to avoid
> > > > > > > using this cache. But it's still blocked.
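> > > > > > > 
> > > > > > > (Roughly: "ceph osd pool set cache-bkp-foo target_max_bytes
> > > > > > > 209715200" for the shrink, and "ceph osd tier cache-mode
> > > > > > > cache-bkp-foo readproxy" for the mode switch.)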
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > On Friday, 21 September 2018 at 02:14 +0200, Olivier
> > > > > > > Bonvalet wrote:
> > > > > > > > Hello,
> > > > > > > > 
> > > > > > > > On a Luminous cluster, I have an incomplete PG and I
> > > > > > > > can't find how to fix it.
> > > > > > > > 
> > > > > > > > It's an EC pool (4+2):
> > > > > > > > 
> > > > > > > >     pg 37.9c is incomplete, acting [32,50,59,1,0,75]
> > > > > > > >     (reducing pool bkp-sb-raid6 min_size from 4 may
> > > > > > > >     help; search ceph.com/docs for 'incomplete')
> > > > > > > > 
> > > > > > > > Of course, we can't reduce min_size below 4: with k=4
> > > > > > > > data chunks, at least 4 shards are needed anyway.
> > > > > > > > 
> > > > > > > > And the full state : https://pastebin.com/zrwu5X0w
> > > > > > > > 
> > > > > > > > So IO is blocked, and we can't access the damaged data.
> > > > > > > > OSDs block too:
> > > > > > > > 
> > > > > > > >     osds 32,68,69 have stuck requests > 4194.3 sec
> > > > > > > > 
> > > > > > > > OSD 32 is the primary of this PG, and OSDs 68 and 69
> > > > > > > > are for cache tiering.
> > > > > > > > 
> > > > > > > > Any idea how I can fix that?
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > 
> > > > > > > > Olivier

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



