This turned out to be because another OSD (no. 90) had gone missing, and the PG wanted to query that missing OSD for unfound objects even though it was not part of the PG.
I found this out by running `ceph pg query 41.3db`.
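For anyone hitting the same thing, the place to look is the recovery_state section of the query output. Roughly (field names from memory, not a verbatim paste of my output):

    ceph pg query 41.3db
    # in the JSON output, look under:
    #   recovery_state -> might_have_unfound
    # in my case it listed osd 90 with a "querying" status,
    # even though 90 was not in the PG's up or acting set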
Marking OSD 90 as lost resolved the issue and all PGs became active. A tiny bit of the most recently uploaded data was lost, but that was a non-issue in this case.
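For reference, the command to mark an OSD as lost is:

    # WARNING: this tells Ceph to give up on any data that only
    # existed on this OSD, so only run it if you accept that loss
    ceph osd lost 90 --yes-i-really-mean-it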
On Sun, Nov 3, 2019 at 7:34 PM Martin Verges <martin.verges@xxxxxxxx> wrote:
We had this with older Ceph versions, maybe just try to restart all OSDs of the affected PGs.
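For example, something like this for each affected PG (assuming systemd-managed OSDs; the PG ID and OSD ID below are placeholders):

    # list the up/acting OSDs for the PG
    ceph pg map 41.3db
    # then restart each of those OSDs on its host
    systemctl restart ceph-osd@32

--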
Martin Verges
Managing director
Mobile: +49 174 9335695
E-Mail: martin.verges@xxxxxxxx
Chat: https://t.me/MartinVerges
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

On Sun, Nov 3, 2019 at 8:13 PM Kári Bertilsson <karibertils@xxxxxxxxx> wrote:

    pgs:     14.377% pgs not active
             3749681/537818808 objects misplaced (0.697%)
             810 active+clean
             156 down
             124 active+remapped+backfilling
             1   active+remapped+backfill_toofull
             1   down+inconsistent

When looking at the down PGs, all disks are online:

    41.3db  53775  0  0  0  401643186092  0  0  3044  down  6m  161222'303144  162913:4630171  [32,96,128,115,86,129,113,124,57,109]p32  [32,96,128,115,86,129,113,124,57,109]p32  2019-11-03

Any way to see why the PG is down?
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx