On Thu, Oct 12, 2017 at 7:56 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
>
> On Thu, Oct 12, 2017 at 10:52 AM Florian Haas <florian@xxxxxxxxxxx> wrote:
>>
>> On Thu, Oct 12, 2017 at 7:22 PM, Gregory Farnum <gfarnum@xxxxxxxxxx>
>> wrote:
>> >
>> >
>> > On Thu, Oct 12, 2017 at 3:50 AM Florian Haas <florian@xxxxxxxxxxx>
>> > wrote:
>> >>
>> >> On Mon, Sep 11, 2017 at 8:13 PM, Andreas Herrmann <andreas@xxxxxxxx>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > how could this happen:
>> >> >
>> >> >   pgs: 197528/1524 objects degraded (12961.155%)
>> >> >
>> >> > I did some heavy failover tests, but a value higher than 100% looks
>> >> > strange (ceph version 12.2.0). Recovery is quite slow.
>> >> >
>> >> >   cluster:
>> >> >     health: HEALTH_WARN
>> >> >             3/1524 objects misplaced (0.197%)
>> >> >             Degraded data redundancy: 197528/1524 objects degraded
>> >> >             (12961.155%), 1057 pgs unclean, 1055 pgs degraded,
>> >> >             3 pgs undersized
>> >> >
>> >> >   data:
>> >> >     pools:   1 pools, 2048 pgs
>> >> >     objects: 508 objects, 1467 MB
>> >> >     usage:   127 GB used, 35639 GB / 35766 GB avail
>> >> >     pgs:     197528/1524 objects degraded (12961.155%)
>> >> >              3/1524 objects misplaced (0.197%)
>> >> >              1042 active+recovery_wait+degraded
>> >> >              991  active+clean
>> >> >              8    active+recovering+degraded
>> >> >              3    active+undersized+degraded+remapped+backfill_wait
>> >> >              2    active+recovery_wait+degraded+remapped
>> >> >              2    active+remapped+backfill_wait
>> >> >
>> >> >   io:
>> >> >     recovery: 340 kB/s, 80 objects/s
>> >>
>> >> Did you ever get to the bottom of this? I'm seeing something very
>> >> similar on a 12.2.1 reference system:
>> >>
>> >> https://gist.github.com/fghaas/f547243b0f7ebb78ce2b8e80b936e42c
>> >>
>> >> I'm also seeing an unusual MISSING_ON_PRIMARY count in "rados df":
>> >> https://gist.github.com/fghaas/59cd2c234d529db236c14fb7d46dfc85
>> >>
>> >> The odd thing in there is that the "bench" pool was empty when the
>> >> recovery started (that pool had been wiped with "rados cleanup"), so
>> >> the number of objects deemed to be missing from the primary really
>> >> ought to be zero.
>> >>
>> >> It seems like it's considering these deleted objects to still require
>> >> replication, but that sounds rather far-fetched, to be honest.
>> >
>> >
>> > Actually, that makes some sense. This cluster had an OSD down while
>> > (some of) the deletes were happening?
>>
>> I thought of exactly that too, but no, it didn't. That's the problem.
>
>
> Okay, in that case I've no idea. What was the timeline for the recovery
> versus the rados bench and cleanup versus the degraded object counts,
> then?

1. Jewel deployment with filestore.
2. Upgrade to Luminous (including mgr deployment and "ceph osd
   require-osd-release luminous"), still on filestore.
3. rados bench with subsequent cleanup.
4. All OSDs up, all PGs active+clean.
5. Stop one OSD. Remove from CRUSH, auth list, OSD map.
6. Reinitialize OSD with bluestore.
7. Start OSD, commencing backfill.
8. Degraded objects above 100%.

Please let me know if that information is useful. Thank you!

Cheers,
Florian
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
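
For anyone trying to reproduce the timeline above, steps 3 and 5-7 map onto
roughly the following commands. This is only a sketch: the OSD id (osd.3),
the device (/dev/sdb), the pool name (bench), and the use of ceph-disk
rather than ceph-volume are illustrative assumptions, not details taken
from this thread.

  # Step 3: write benchmark objects, then delete them again
  rados bench -p bench 60 write --no-cleanup
  rados -p bench cleanup

  # Step 5: stop the OSD and remove it from CRUSH, the auth list and the
  # OSD map
  systemctl stop ceph-osd@3
  ceph osd crush remove osd.3
  ceph auth del osd.3
  ceph osd rm 3

  # Step 6: wipe the device and re-create the OSD with a bluestore backend
  ceph-disk zap /dev/sdb
  ceph-disk prepare --bluestore /dev/sdb

  # Step 7: activate the new OSD (often triggered automatically by udev);
  # backfill starts once it comes up
  ceph-disk activate /dev/sdb1

Comparing "ceph -s" and "rados df" immediately before and after step 7
shows whether the degraded and MISSING_ON_PRIMARY counters already start
out non-zero for the emptied pool.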