Re: jewel - rgw blocked on deep-scrub of bucket index pg

Wido den Hollander <wido@xxxxxxxx> · Sat, 6 May 2017 09:25:15 +0200 (CEST)

> On 5 May 2017 at 10:33, Sam Wouters <sam@xxxxxxxxx> wrote:
> 
> 
> Hi,
> 
> we have a small cluster running on jewel 10.2.7; NL-SAS disks only, osd
> data and journal co-located on the disks; main purpose is an rgw
> secondary zone.
> 
> Since the upgrade to jewel, whenever a deep scrub starts on one of the
> rgw index pool PGs, slow requests start piling up and rgw requests are
> blocked after some hours.
> The deep-scrub doesn't seem to finish (still running after 11+ hours)
> and the only escape I have found so far is a restart of the primary osd
> holding the pg.
> 
> Maybe important to know: we have some rgw buckets with a large number
> of objects (3+ million) and an index sharding of only 8.
> 
> scrub related settings:
> osd scrub sleep = 0.1

Try removing this line; it can block threads under Jewel.

See how that works out.
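
For illustration, a minimal sketch of what the scrub settings could look like with that line taken out. It assumes the options live in the [osd] section of ceph.conf (they may equally sit under [global]); with the line gone, osd scrub sleep falls back to its default of 0.

```ini
[osd]
# "osd scrub sleep = 0.1" removed; the option falls back to its default of 0
osd scrub during recovery = false
osd scrub priority = 1
osd deep scrub stride = 1048576
osd scrub chunk min = 1
osd scrub chunk max = 1
```

The OSDs only pick this up after a restart. It may also be possible to apply the value at runtime with something like `ceph tell osd.* injectargs '--osd_scrub_sleep 0'`, but verify on a single OSD first.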

Wido

> osd scrub during recovery = False
> osd scrub priority = 1
> osd deep scrub stride = 1048576
> osd scrub chunk min = 1
> osd scrub chunk max = 1
> 
> Any help on debugging / resolving would be very much appreciated...
> 
> regards,
> Sam
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com