What are size and min_size for pool '7'... and why?

On Fri, Apr 7, 2017 at 4:20 AM, David Welch <dwelch@xxxxxxxxxxxx> wrote:
> Hi,
> We had a disk on the cluster that was not responding properly and causing
> 'slow requests'. The osd on the disk was stopped and the osd was marked down
> and then out. Rebalancing succeeded but (some?) pgs from that osd are now
> stuck in stale+active+clean state, which is not being resolved (see below
> for query results).
>
> My question: is it better to mark this osd as "lost" (i.e. 'ceph osd lost
> 14') or to remove the osd as detailed here:
> https://www.sebastien-han.fr/blog/2015/12/11/ceph-properly-remove-an-osd/
>
> Thanks,
> David
>
>
> $ ceph health detail
> HEALTH_ERR 17 pgs are stuck inactive for more than 300 seconds; 17 pgs stale; 17 pgs stuck stale
> pg 7.f3 is stuck stale for 6138.330316, current state stale+active+clean, last acting [14]
> pg 7.bd is stuck stale for 6138.330365, current state stale+active+clean, last acting [14]
> pg 7.b6 is stuck stale for 6138.330374, current state stale+active+clean, last acting [14]
> pg 7.c5 is stuck stale for 6138.330363, current state stale+active+clean, last acting [14]
> pg 7.ac is stuck stale for 6138.330385, current state stale+active+clean, last acting [14]
> pg 7.5b is stuck stale for 6138.330678, current state stale+active+clean, last acting [14]
> pg 7.1b4 is stuck stale for 6138.330409, current state stale+active+clean, last acting [14]
> pg 7.182 is stuck stale for 6138.330445, current state stale+active+clean, last acting [14]
> pg 7.1f8 is stuck stale for 6138.330720, current state stale+active+clean, last acting [14]
> pg 7.53 is stuck stale for 6138.330697, current state stale+active+clean, last acting [14]
> pg 7.1d2 is stuck stale for 6138.330663, current state stale+active+clean, last acting [14]
> pg 7.70 is stuck stale for 6138.330742, current state stale+active+clean, last acting [14]
> pg 7.14f is stuck stale for 6138.330585, current state stale+active+clean, last acting [14]
> pg 7.23 is stuck stale for 6138.330610, current state stale+active+clean, last acting [14]
> pg 7.153 is stuck stale for 6138.330600, current state stale+active+clean, last acting [14]
> pg 7.cc is stuck stale for 6138.330409, current state stale+active+clean, last acting [14]
> pg 7.16b is stuck stale for 6138.330509, current state stale+active+clean, last acting [14]
>
> $ ceph pg dump_stuck stale
> ok
> pg_stat  state               up    up_primary  acting  acting_primary
> 7.f3     stale+active+clean  [14]  14          [14]    14
> 7.bd     stale+active+clean  [14]  14          [14]    14
> 7.b6     stale+active+clean  [14]  14          [14]    14
> 7.c5     stale+active+clean  [14]  14          [14]    14
> 7.ac     stale+active+clean  [14]  14          [14]    14
> 7.5b     stale+active+clean  [14]  14          [14]    14
> 7.1b4    stale+active+clean  [14]  14          [14]    14
> 7.182    stale+active+clean  [14]  14          [14]    14
> 7.1f8    stale+active+clean  [14]  14          [14]    14
> 7.53     stale+active+clean  [14]  14          [14]    14
> 7.1d2    stale+active+clean  [14]  14          [14]    14
> 7.70     stale+active+clean  [14]  14          [14]    14
> 7.14f    stale+active+clean  [14]  14          [14]    14
> 7.23     stale+active+clean  [14]  14          [14]    14
> 7.153    stale+active+clean  [14]  14          [14]    14
> 7.cc     stale+active+clean  [14]  14          [14]    14
> 7.16b    stale+active+clean  [14]  14          [14]    14

--
Cheers,
Brad

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
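
[Editor's note: for readers following the thread, a minimal sketch of the commands being discussed. It shows how pool 7's size/min_size could be read back (what Brad is asking about) and the two paths David mentions for osd.14. The <pool-name> placeholder is not from the thread, and this is not advice for this particular cluster; if pool 7's pgs had replicas only on osd.14, either path abandons that data.]

# Read back size and min_size for pool id 7 (pool name is a placeholder):
$ ceph osd dump | grep "^pool 7 "
$ ceph osd pool get <pool-name> size
$ ceph osd pool get <pool-name> min_size

# Option 1: declare osd.14 permanently lost
$ ceph osd lost 14 --yes-i-really-mean-it

# Option 2: remove osd.14 outright, as in the linked blog post
$ ceph osd crush remove osd.14
$ ceph auth del osd.14
$ ceph osd rm 14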