Re: [PATCH] md: make suspend range wait timed out

Shaohua Li <shli@xxxxxxxxxx> · Wed, 21 Jun 2017 09:07:04 -0700



On Wed, Jun 21, 2017 at 10:09:08AM -0400, Mikulas Patocka wrote:
> 
> 
> On Mon, 19 Jun 2017, Shaohua Li wrote:
> 
> > > Write errors only get back to the application if it calls fsync(), and
> > > many don't do that.  Write errors can easily cause a filesystem to go
> > > read-only, and require an fsck.  I think we should be very cautious
> > > about triggering write errors.
> > > 
> > > NFS will hang indefinitely rather than return an error if the server is
> > > not available.  That can certainly be annoying, but the alternative has
> > > been tried, and it leads to random data corruption.
> > > The two cases are only comparable at a very high level, but I think
> > > this result should encourage substantial caution.
> > 
> > It's hard to say if an IO error or an infinite wait is better, but since there
> > is a better option in this case, I don't want to argue. I'll repost a patch to
> > reset suspend range after a timeout, assuming this is your suggestion.
> > 
> > Thanks,
> > Shaohua
> 
> Automatically resetting the suspend range could result in data corruption, 
> so it is even worse than a deadlock.

That depends on how you look at it. A deadlock means you will eventually hard
reset the system, and that will result in data corruption.