Try 'ceph osd down 1 2 3 4 5'.  I'm guessing you're triggering a hard-to-hit
race in the messenger that's preventing the OSDs from talking.

s

On Mon, 25 Mar 2019, Rafał Wądołowski wrote:
> root@ceph-ceph1:/var/log/ceph# ceph osd blocked-by
> osd  num_blocked
>  32            1
>  12            1
>  14            1
>  37            1
>  23            1
>  51            2
>  46            1
>  19            1
>   9            1
>  33            1
>  24            2
>  44            2
>  53            7
>  50            2
>  35            1
>  48            4
>  45            2
>  49            2
>   5         2803
>   3         2781
>   2         2785
>   1         2694
>   4         2753
>
> That's interesting... I checked the logs of osd.5, but they look normal.
>
> Best Regards,
>
> Rafał Wądołowski
>
> On 25.03.2019 11:11, Sage Weil wrote:
> > What does 'ceph osd blocked-by' show?
> >
> > On Mon, 25 Mar 2019, Rafał Wądołowski wrote:
> >
> >> This issue happened a week ago, so I don't have the output from pg query.
> >>
> >> Now, on the test cluster, I am observing similar problems. The output
> >> from the query is in the attachment.
> >>
> >>   data:
> >>     pools:   5 pools, 32800 pgs
> >>     objects: 11.53E objects, 62.2GiB
> >>     usage:   176GiB used, 1.05TiB / 1.23TiB avail
> >>     pgs:     30.899% pgs not active
> >>              20193 active+clean
> >>              7573  activating+degraded
> >>              2525  activating
> >>              2460  active+recovery_wait+degraded
> >>              14    remapped+peering
> >>              11    down
> >>              5     activating+degraded+remapped
> >>              4     activating+remapped
> >>              3     active+recovery_wait+degraded+remapped
> >>              2     stale+active+clean
> >>              2     peering
> >>              2     active+clean+remapped
> >>              1     active+undersized+degraded
> >>              1     activating+undersized+degraded
> >>              1     active+recovery_wait+undersized+degraded
> >>              1     active+recovery_wait
> >>              1     active+recovering
> >>              1     active+recovering+degraded
> >>
> >> pool 5 'test' erasure size 6 min_size 4 crush_rule 1 object_hash
> >> rjenkins pg_num 32768 pgp_num 32768 last_change 153 lfor 0/150 flags
> >> hashpspool stripe_width 16384 application rbd
> >>
> >> It looks like the cluster is blocked by something... This cluster is on
> >> 12.2.11.
> >>
> >> Best Regards,
> >>
> >> Rafał Wądołowski
> >>
> >> On 25.03.2019 10:56, Sage Weil wrote:
> >>> On Mon, 25 Mar 2019, Rafał Wądołowski wrote:
> >>>> Hi,
> >>>>
> >>>> On one of our clusters (3400 OSDs, ~25 PB, 12.2.4), we increased
> >>>> pg_num & pgp_num on one pool (EC 4+2) from 32k to 64k. After that,
> >>>> the cluster was unstable for one hour; PGs were inactive (some
> >>>> activating, some peering).
> >>>>
> >>>> Any idea what bottleneck we hit? Any ideas on what I should change
> >>>> in the Ceph/OS configuration?
> >>> Could be lots of things.
> >>>
> >>> What does 'ceph tell <pgid> query' show for one of the activating or
> >>> peering pgs?
> >>>
> >>> Note that you're moving ~half of the data around in your cluster with
> >>> that change, so you will see each of those PGs cycle through backfill ->
> >>> peering -> activating -> active in the course of it moving.
> >>>
> >>> sage
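
A minimal sketch of automating Sage's suggestion, assuming the plain-text
'ceph osd blocked-by' layout shown in the thread (a header row followed by
osd / num_blocked pairs); the 100-PG threshold and the awk/xargs pipeline are
illustrative assumptions, not something proposed in the thread:

    # re-check which OSDs are currently blocking peering
    ceph osd blocked-by

    # mark every OSD blocking more than 100 PGs down so it re-establishes
    # its messenger sessions; 100 is an arbitrary example threshold
    ceph osd blocked-by | awk 'NR > 1 && $2 > 100 { print $1 }' | xargs -r ceph osd down

    # watch the peering/activating counts drain
    watch -n 5 'ceph -s'

Marking an OSD down (not out) only forces it to reassert itself and re-peer
its PGs; as long as it comes back up before the down-out interval expires,
no data movement is triggered.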
_______________________________________________ Ceph-large mailing list Ceph-large@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-large-ceph.com