On Mon, May 6, 2019 at 2:48 PM Guilherme G. Piccoli
<gpiccoli@xxxxxxxxxxxxx> wrote:
>
> On 06/05/2019 13:50, Song Liu wrote:
> > [...]
> > IIUC, we need this for all raid types. Is it possible to fix that in md.c so
> > all types get the fix?
> >
> > Thanks,
> > Song
> >
>
> Hi Song, thanks again for reviewing my code and providing input, much
> appreciated!
>
> I understand this could in theory affect all the RAID levels, but in
> practice I don't think it'll happen. RAID0 is the only "blind" mode of
> RAID, in the sense that it's the only one that doesn't care at all about
> failures. In fact, this was the origin of my other thread [0], regarding
> the change of raid0's behavior in error cases, because it currently does
> not care about members being removed and relies only on filesystem failures
> (after submitting many BIOs to the removed device).
>
> That said, in this change I've only taken care of raid0, since in my
> understanding the other levels won't submit BIOs to dead devices; we can
> test that to see if it's true.

Could you please run a quick test with raid5? I am wondering whether
some race condition could get us into a similar crash. If we cannot easily
trigger the bug, we can proceed with this version.

Thanks,
Song

>
> But I'd be happy to change all the other levels as well if you think it's
> appropriate (or to make a simple generic change to md.c if that is enough).
> Do you think we could go ahead with this change, and further improve on it
> (to cover all raid cases if necessary)?
>
> Cheers,
>
>
> Guilherme
>
>
>
> [0] https://marc.info/?l=linux-raid&m=155562509905735