And unless you *need* a given ailing OSD to be up because it's the only copy of data, you may get better recovery/backfill results by stopping the service for that OSD entirely, so that the recovery reads all go to healthier OSDs.
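For concreteness, a rough sketch of both approaches (the "keep it up but out" route Josh describes below, and stopping the daemon outright), assuming the ailing disk is osd.123 (a placeholder ID) and a systemd-managed, non-cephadm deployment; adjust to your environment:

    # Option 1 (Josh, below): leave the daemon up but mark it out, so backfill
    # drains it while it can still serve recovery reads if it holds the only copy
    ceph osd out 123

    # Option 2: stop the daemon so recovery/backfill reads come only from
    # healthier copies; with the noout flag already set (see the status below)
    # it will not be auto-marked out
    systemctl stop ceph-osd@123

    # If you really want it marked down while the daemon is still running,
    # 'ceph osd down 123' only sticks if noup is set first (a cluster-wide
    # flag, so remember to 'ceph osd unset noup' afterwards)
    ceph osd set noup
    ceph osd down 123

Either way, noout only prevents the automatic out-marking; it is the out step (or a CRUSH/reweight change) that actually moves data.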
> On Oct 3, 2023, at 12:21, Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:
>
> Hi Simon,
>
> If the OSD is actually up, using 'ceph osd down' will cause it to flap
> but come back immediately. To prevent this, you would want to 'ceph
> osd set noup'. However, I don't think this is what you actually want:
>
>> I'm thinking (but perhaps incorrectly?) that it would be good to keep the OSD down+in, to try to read from it as long as possible
>
> In this case, you actually want it up+out ('ceph osd out XXX'), though
> if it's replicated then marking it out will switch primaries around so
> that it's not actually read from anymore. It doesn't look like you
> have that much recovery backfill left, so hopefully you'll be in a
> clean state soon, though you'll have to deal with those 'inconsistent'
> and 'recovery_unfound' PGs.
>
> Josh
>
> On Tue, Oct 3, 2023 at 10:14 AM Simon Oosthoek <s.oosthoek@xxxxxxxxxxxxx> wrote:
>>
>> Hi
>>
>> I'm trying to mark one OSD as down, so we can clean it out and replace
>> it. It keeps getting medium read errors, so it's bound to fail sooner
>> rather than later. When I tell Ceph from the mon to mark the OSD down,
>> it doesn't actually do it. When the service on the OSD stops, it is
>> also marked out, and I'm thinking (but perhaps incorrectly?) that it
>> would be good to keep the OSD down+in, to try to read from it as long
>> as possible. Why doesn't it get marked down and stay that way when I
>> command it?
>>
>> Context: our cluster is in a less than optimal state (see below); this
>> is after one of our OSD nodes failed and took a week to get back up
>> (long story). Due to seriously unbalanced filling of our OSDs we kept
>> having to reweight OSDs to keep below the 85% threshold. Several disks
>> are starting to fail now (they're 4+ years old and failures are
>> expected to occur more frequently).
>>
>> I'm open to suggestions to help get us back to HEALTH_OK more quickly,
>> but I think we'll get there eventually anyway...
>>
>> Cheers
>>
>> /Simon
>>
>> ----
>>
>> # ceph -s
>>   cluster:
>>     health: HEALTH_ERR
>>             1 clients failing to respond to cache pressure
>>             1/843763422 objects unfound (0.000%)
>>             noout flag(s) set
>>             14 scrub errors
>>             Possible data damage: 1 pg recovery_unfound, 1 pg inconsistent
>>             Degraded data redundancy: 13795525/7095598195 objects degraded (0.194%), 13 pgs degraded, 12 pgs undersized
>>             70 pgs not deep-scrubbed in time
>>             65 pgs not scrubbed in time
>>
>>   services:
>>     mon: 3 daemons, quorum cephmon3,cephmon1,cephmon2 (age 11h)
>>     mgr: cephmon3(active, since 35h), standbys: cephmon1
>>     mds: 1/1 daemons up, 1 standby
>>     osd: 264 osds: 264 up (since 2m), 264 in (since 75m); 227 remapped pgs
>>          flags noout
>>     rgw: 8 daemons active (4 hosts, 1 zones)
>>
>>   data:
>>     volumes: 1/1 healthy
>>     pools:   15 pools, 3681 pgs
>>     objects: 843.76M objects, 1.2 PiB
>>     usage:   2.0 PiB used, 847 TiB / 2.8 PiB avail
>>     pgs:     13795525/7095598195 objects degraded (0.194%)
>>              54839263/7095598195 objects misplaced (0.773%)
>>              1/843763422 objects unfound (0.000%)
>>              3374 active+clean
>>              195  active+remapped+backfill_wait
>>              65   active+clean+scrubbing+deep
>>              20   active+remapped+backfilling
>>              11   active+clean+snaptrim
>>              10   active+undersized+degraded+remapped+backfill_wait
>>              2    active+undersized+degraded+remapped+backfilling
>>              2    active+clean+scrubbing
>>              1    active+recovery_unfound+degraded
>>              1    active+clean+inconsistent
>>
>>   progress:
>>     Global Recovery Event (8h)
>>       [==========================..] (remaining: 2h)
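As Josh notes above, the 'inconsistent' and 'recovery_unfound' PGs will still need attention once backfill settles. Roughly, the usual starting points (the PG IDs are placeholders, taken from 'ceph health detail'):

    # identify the affected PGs and the OSDs involved
    ceph health detail

    # for the PG flagged inconsistent (likely behind the 14 scrub errors),
    # a repair can be attempted once its OSDs are healthy
    ceph pg repair <pgid>

    # for the PG with the unfound object, check which OSDs might still hold
    # a copy before considering anything destructive
    ceph pg <pgid> list_unfound
    ceph pg <pgid> query

    # only as a last resort, after every candidate OSD has been brought up
    # and queried, the unfound object can be rolled back or written off:
    # ceph pg <pgid> mark_unfound_lost revert|delete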