On Wed, Jun 21 2017, Shaohua Li wrote:

> On Wed, Jun 21, 2017 at 10:09:08AM -0400, Mikulas Patocka wrote:
>>
>>
>> On Mon, 19 Jun 2017, Shaohua Li wrote:
>>
>> > > Write errors only get back to the application if it calls fsync(), and
>> > > many don't do that. Write errors can easily cause a filesystem to go
>> > > read-only, and require an fsck. I think we should be very cautious
>> > > about triggering write errors.
>> > >
>> > > NFS will hang indefinitely rather then return an error if the server is
>> > > not available. That can certainly be annoying, but the alternative has
>> > > been tried, and it leads to random data corruption.
>> > > The two cases are only comparable at a very high level, but I think
>> > > this result should encourage substantial caution.
>> >
>> > It's hard to say if an IO error or an infinite wait is better, but since there
>> > is better option in this case, I don't want to argue. I'll repost a patch to
>> > reset suspend range after a timeout, assume this is your suggestion.
>> >
>> > Thanks,
>> > Shaohua
>>
>> Automatically resetting the suspend range could result in data corruption,
>> so it is even worse than a deadlock.
>
> depending on how you look at this. a deadlock means you will eventually hard
> reset the system, and that will result in data corruption.

But in that circumstance (purely hypothetical at this stage) the sysadmin
knows that something has gone wrong. In the other, they might not.
Invisible data corruption is much worse than visible.

The suspend functionality is used by user-space programs. If you think
the current interface is not ideal, maybe it would be better to design
an interface that a user-space program can use which is safe to use, but
also fails safe.
Maybe it could give the kernel a PID with the meaning "if you have to
invalidate my suspend request, please kill the pid first". That would
have risks of its own of course.

The suspend interface is currently used:
 - to enable backup of a region which is being reshaped in-place. As
   time goes on, this will be used less and less as the
   change-data-offset approach to reshape doesn't need this.
 - to stabilise a region while raid6check performs parity calculations
   and possibly "corrects" one block.

You at least need to analyse the failure modes of these before you can
justify making any change to the interface they use.

But again, I really don't think there is a problem that needs fixing.

NeilBrown