On Wed, 20 Jan 2016, Xiaoxi Chen wrote:
> Hi,
>
> In many cases we need to tag some OSDs with the NODOWN/NOOUT/NOUP/NOIN
> flags, but we don't want them cluster-wide, as these flags may stop
> other OSDs from self-healing. As an example, when a recovered OSD needs
> to catch up with the OSDMap, we set NODOWN/NOOUT/NOUP to prevent
> flapping, but if another OSD fails with a disk error, the failure
> will be hidden and we are at risk of losing data.
>
> Is it reasonable to have these flags work at OSD granularity,
> say ceph osd nodown osd.xxx?
> From a quick look at the code, NODOWN/NOUP seems easier, as we could
> have new status bits in the OSDMap:
> /* status bits */
> #define CEPH_OSD_EXISTS  (1<<0)
> #define CEPH_OSD_UP      (1<<1)
> #define CEPH_OSD_AUTOOUT (1<<2)  /* osd was automatically marked out */
> #define CEPH_OSD_NEW     (1<<3)  /* osd is new, never marked in */
>
> #define CEPH_OSD_NOUP    (1<<4)  /* osd cannot be marked up */
> #define CEPH_OSD_NODOWN  (1<<5)  /* osd cannot be marked down */
>
> NOIN/NOOUT seems a bit harder, though, as IN/OUT depends on the
> weight. Any suggestions?

This looks reasonable if we can sort out a good interface and suitable
health warnings.  For example, ceph health and ceph -s should say "N osds
have noin set", and 'ceph health detail' should tell you which ones.

Maybe something like

 ceph osd set-osd osd.123 noin

?  I don't particularly like that but we can't do 'ceph osd set ...' since
that does global osdmap flags.

sage
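
For illustration, the per-OSD NOUP/NODOWN idea from the quoted proposal
could be sketched roughly as below. This is not Ceph code, only a minimal
sketch under assumptions: the bit values are copied from the quoted
message (CEPH_OSD_NOUP and CEPH_OSD_NODOWN being the proposed additions),
and osd_try_mark_down(), osd_try_mark_up() and count_flagged() are
hypothetical helper names made up here. The last helper shows how a
health summary such as "N osds have nodown set" might get its count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* status bits, as listed in the quoted message */
#define CEPH_OSD_EXISTS  (1<<0)
#define CEPH_OSD_UP      (1<<1)
#define CEPH_OSD_AUTOOUT (1<<2)  /* osd was automatically marked out */
#define CEPH_OSD_NEW     (1<<3)  /* osd is new, never marked in */
/* proposed per-OSD bits (values taken from the proposal) */
#define CEPH_OSD_NOUP    (1<<4)  /* osd cannot be marked up */
#define CEPH_OSD_NODOWN  (1<<5)  /* osd cannot be marked down */

/* Hypothetical helper: mark one OSD down unless its own NODOWN bit is set.
 * Returns true only if the state actually changed. */
bool osd_try_mark_down(uint32_t *state)
{
    assert(state != NULL);
    if (!(*state & CEPH_OSD_UP))
        return false;                  /* already down */
    if (*state & CEPH_OSD_NODOWN)
        return false;                  /* per-OSD flag blocks the change */
    *state &= ~(uint32_t)CEPH_OSD_UP;
    return true;
}

/* Hypothetical helper: mark one OSD up unless its own NOUP bit is set. */
bool osd_try_mark_up(uint32_t *state)
{
    assert(state != NULL);
    if (!(*state & CEPH_OSD_EXISTS) || (*state & CEPH_OSD_UP))
        return false;                  /* nonexistent, or already up */
    if (*state & CEPH_OSD_NOUP)
        return false;                  /* per-OSD flag blocks the change */
    *state |= CEPH_OSD_UP;
    return true;
}

/* Hypothetical helper: count OSDs carrying a given flag, e.g. for a
 * health line like "N osds have nodown set". */
size_t count_flagged(const uint32_t *states, size_t n, uint32_t flag)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (states[i] & flag)
            count++;
    return count;
}
```

The point of keeping the check per OSD is that an unflagged OSD with a
failed disk can still be marked down while a flagged, recovering OSD
stays up, which is the case the global flags hide.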