> The only way I can think is that a DIO write should check early enough
> that the write(N) will complete with N bytes without an error. Is it
> possible to completely guarantee that? Probably not.
>
> Leaving it as it is incorrect as quoted in the artificial test case. You
> should not be changing the file and yet conveying to the user an error
> for the same write() call. It should either be an error and the file
> contents are unchanged, or it should be change in contents and the write
> size returned.

There are already lots of syscall cases that don't completely undo
their changes when an error occurs. Fixing that would likely require a
transaction system at a higher level in the kernel, or lots of
complicated code everywhere, which is unlikely to happen. Also, the
complicated code would be difficult to test and would likely bit-rot
over time, because it would only be an error handling path that is
infrequently exercised.

So yes, it's not nice, but the alternative would be worse.

I think we're best off leaving it as it is now. Adding some
comments/documentation to explain this would be good though. Perhaps
you could submit a patch to the manpage?

-Andi