Try 'ceph osd down 1 2 3 4 5'.  I'm guessing you're triggering a hard-to-hit
race in the messenger that's preventing the OSDs from talking.

s

On Mon, 25 Mar 2019, Rafał Wądołowski wrote:
> root@ceph-ceph1:/var/log/ceph# ceph osd blocked-by
> osd  num_blocked
>  32            1
>  12            1
>  14            1
>  37            1
>  23            1
>  51            2
>  46            1
>  19            1
>   9            1
>  33            1
>  24            2
>  44            2
>  53            7
>  50            2
>  35            1
>  48            4
>  45            2
>  49            2
>   5         2803
>   3         2781
>   2         2785
>   1         2694
>   4         2753
>
> That's interesting... I checked the logs of osd.5, but they look normal.
>
> Best Regards,
>
> Rafał Wądołowski
>
> On 25.03.2019 11:11, Sage Weil wrote:
> > What does 'ceph osd blocked-by' show?
> >
> > On Mon, 25 Mar 2019, Rafał Wądołowski wrote:
> >
> >> This issue happened a week ago, so I don't have the output from pg query.
> >>
> >> Now, on the test cluster, I am observing similar problems. The output
> >> from the query is in the attachment.
> >>
> >>   data:
> >>     pools:   5 pools, 32800 pgs
> >>     objects: 11.53E objects, 62.2GiB
> >>     usage:   176GiB used, 1.05TiB / 1.23TiB avail
> >>     pgs:     30.899% pgs not active
> >>              20193 active+clean
> >>              7573  activating+degraded
> >>              2525  activating
> >>              2460  active+recovery_wait+degraded
> >>              14    remapped+peering
> >>              11    down
> >>              5     activating+degraded+remapped
> >>              4     activating+remapped
> >>              3     active+recovery_wait+degraded+remapped
> >>              2     stale+active+clean
> >>              2     peering
> >>              2     active+clean+remapped
> >>              1     active+undersized+degraded
> >>              1     activating+undersized+degraded
> >>              1     active+recovery_wait+undersized+degraded
> >>              1     active+recovery_wait
> >>              1     active+recovering
> >>              1     active+recovering+degraded
> >>
> >> pool 5 'test' erasure size 6 min_size 4 crush_rule 1 object_hash
> >> rjenkins pg_num 32768 pgp_num 32768 last_change 153 lfor 0/150 flags
> >> hashpspool stripe_width 16384 application rbd
> >>
> >> It looks like the cluster is blocked by something... This cluster is on
> >> 12.2.11.
> >>
> >> Best Regards,
> >>
> >> Rafał Wądołowski
> >>
> >> On 25.03.2019 10:56, Sage Weil wrote:
> >>> On Mon, 25 Mar 2019, Rafał Wądołowski wrote:
> >>>> Hi,
> >>>>
> >>>> On one of our clusters (3400 OSDs, ~25 PB, 12.2.4), we increased
> >>>> pg_num & pgp_num on one pool (EC 4+2) from 32k to 64k. After that,
> >>>> the cluster was unstable for one hour; PGs were inactive (some
> >>>> activating, some peering).
> >>>>
> >>>> Any idea what bottleneck we hit? Any ideas on what I should change
> >>>> in the Ceph/OS configuration?
> >>> Could be lots of things.
> >>>
> >>> What does 'ceph tell <pgid> query' show for one of the activating or
> >>> peering pgs?
> >>>
> >>> Note that you're moving ~half of the data around in your cluster with
> >>> that change, so you will see each of those PGs cycle through backfill ->
> >>> peering -> activating -> active in the course of it moving.
> >>>
> >>> sage
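
A minimal sketch of automating Sage's suggestion, assuming the plain-text
'ceph osd blocked-by' layout shown in the thread (a header row followed by
osd / num_blocked pairs); the 100-PG threshold and the awk/xargs pipeline are
illustrative assumptions, not something proposed in the thread:

    # re-check which OSDs are currently blocking peering
    ceph osd blocked-by

    # mark every OSD blocking more than 100 PGs down so it re-establishes
    # its messenger sessions; 100 is an arbitrary example threshold
    ceph osd blocked-by | awk 'NR > 1 && $2 > 100 { print $1 }' | xargs -r ceph osd down

    # watch the peering/activating counts drain
    watch -n 5 'ceph -s'

Marking an OSD down (not out) only forces it to reassert itself and re-peer
its PGs; as long as it comes back up before the down-out interval expires,
no data movement is triggered.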
_______________________________________________ Ceph-large mailing list Ceph-large@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-large-ceph.com