On Wed, Jun 21, 2017 at 10:09:08AM -0400, Mikulas Patocka wrote:
>
> On Mon, 19 Jun 2017, Shaohua Li wrote:
>
> > > Write errors only get back to the application if it calls fsync(), and
> > > many don't do that. Write errors can easily cause a filesystem to go
> > > read-only, and require an fsck. I think we should be very cautious
> > > about triggering write errors.
> > >
> > > NFS will hang indefinitely rather than return an error if the server is
> > > not available. That can certainly be annoying, but the alternative has
> > > been tried, and it leads to random data corruption.
> > > The two cases are only comparable at a very high level, but I think
> > > this result should encourage substantial caution.
> >
> > It's hard to say if an IO error or an infinite wait is better, but since there
> > is a better option in this case, I don't want to argue. I'll repost a patch to
> > reset the suspend range after a timeout, assuming this is your suggestion.
> >
> > Thanks,
> > Shaohua
>
> Automatically resetting the suspend range could result in data corruption,
> so it is even worse than a deadlock.

That depends on how you look at it. A deadlock means you will eventually hard
reset the system, and that will result in data corruption.
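
To make the fsync() point quoted above concrete, here is a minimal sketch of
an application that does check for write errors. It is only an illustration
and is not code from this thread or from the md driver: the helper name
write_durably() and its return convention are hypothetical, and it assumes a
POSIX system. A successful write() generally only means the data was accepted
into the page cache, so a later writeback failure is reported by fsync(),
which is why applications that skip it may never see the error.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write len bytes to path and
 * force them to stable storage. Returns 0 on success, -1 on failure with
 * errno set by the failing call.
 */
int write_durably(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;                     /* handle short writes */
        len -= (size_t)n;
    }

    /*
     * write() succeeding does not mean the data reached the device.
     * A writeback error, if any, is reported here.
     */
    if (fsync(fd) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    return close(fd);
}
```

A caller would treat a -1 return as "the data may not be on disk" and report
or retry accordingly, rather than assuming the earlier write() calls were
enough.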