On 1/22/18 12:10 PM, Andreas Dilger wrote: > >> On Jan 20, 2018, at 8:07 PM, Jens Axboe <axboe@xxxxxxxxx> wrote: >> >> On 1/20/18 7:23 PM, Goldwyn Rodrigues wrote: >>> >>> >>> On 01/20/2018 08:11 PM, Andi Kleen wrote: >>>> >>>> It's likely there's a lot of code in user space that does >>>> >>>> if (write(..., N) < 0) handle error >>>> >>>> With your change it would need to be >>>> >>>> if (write(..., N) != N) handle error >>>> >>>> How much code is actually doing that? >>>> >>>> I can understand it fixes your artifical test suite, but it seems to me your >>>> change has a high potential to break a lot of existing user code >>>> in subtle ways. So it seems to be a bad idea. >>> >>> >>> Quoting 'man 2 write': >>> >>> RETURN VALUE >>> On success, the number of bytes written is returned (zero indicates >>> nothing was written). It is not an error if this number is smaller >>> than the number of bytes requested; this may happen for example because >>> the disk device was filled. See also NOTES. >> >> You can quote as much man page as you want - Andi is well aware of how >> read/write system call works, as I'm sure all of us are, that is not the >> issue. The issue is that there are potentially LOTS of applications out >> there that do not check for short writes, they do exactly what Andi >> speculated above. If you break it with this change, it doesn't matter >> what's in the man page. What matters is previous behavior, and that >> you are breaking user space. At that point nobody cares what's in the >> man page. > > Applications that don't handle partial writes are already broken with > buffered I/O, this change doesn't really make them more broken. Not disagreeing that they are broken, of course they are. But if you've been doing direct IO and not seeing short writes, then this could break your application. And if that's the case, it doesn't really help to say that their application was "already broken". I'd hate for a kernel upgrade to break them. I do wish we could make the change, and maybe we can. But it probably needs some safe guard proc entry to toggle the behavior, something we can drop in a few years when we're confident it won't break real applications. -- Jens Axboe