On Wed, 9 Dec 2015, Wei-Chung Cheng wrote:
> Hi Loic,
>
> I tried to reproduce this problem on my CentOS 7 box, but I could not
> hit the same issue.  This is my version:
> ceph version 10.0.0-928-g8eb0ed1 (8eb0ed1dcda9ee6180a06ee6a4415b112090c534)
> Could you describe it in more detail?
>
>
> Hi David, Sage,
>
> Most of the time, by the time we notice an OSD failure, the OSD is
> already in the `out` state.  We cannot avoid the redundant data
> movement unless we set the noout flag when the failure happens.  Is
> that right?  (That is, once the OSD goes into the `out` state, it
> triggers some redundant data movement.)
>
> Could we try the traditional hot-spare behavior?  (Keep some disks as
> spares and automatically replace the broken device.)
>
> That would let us replace the failed OSD before it goes into the `out`
> state.  Or should we always set noout?

I don't think there is a problem with 'out' if the osd id is reused and
the crush position remains the same.  And I expect the OSD will usually
be replaced by a disk with a similar size.  If the replacement is
smaller (or 0--removed entirely) then you get double movement, but if
it's the same size or larger I think it's fine.

The sequence would be something like

 up + in
 down + in
 5-10 minutes go by
 down + out (marked out by monitor)
 new replicas uniformly distributed across cluster
 days go by
 disk removed
 new disk inserted
 ceph-disk recreate ...
 recreates osd dir w/ the same id, new uuid
 on startup, osd adjusts crush weight (maybe.. usually by a smallish
   amount)
 up + in
 replicas migrate back to new device

sage
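For reference, the noout approach Wei-Chung mentions maps onto the
existing CLI roughly as sketched below; the osd/mon ids are placeholders,
and the "ceph-disk recreate" step is the proposed piece, so it only
appears as a comment:

  # stop the monitors from marking down osds out automatically, so a
  # failed osd stays down + in instead of triggering re-replication
  ceph osd set noout

  # watch the failed osd: with the flag set it should sit at down + in
  ceph osd tree

  # the "5-10 minutes go by" step above is governed by the monitor's
  # mon_osd_down_out_interval option (in seconds); check the running
  # value on a monitor host with, e.g.,
  ceph daemon mon.a config get mon_osd_down_out_interval

  # ... swap the disk and recreate the osd with the same id (the
  #     proposed "ceph-disk recreate ..." step), then start it ...

  # once the replacement osd is back up + in, clear the flag
  ceph osd unset noout

The trade-off is that while noout is set the affected PGs stay degraded
(no re-replication happens), so this only makes sense if the failed disk
will actually be replaced promptly.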