Re: [PATCH v5 1/2] Return bytes transferred for partial direct I/O

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 01/20/2018 09:07 PM, Jens Axboe wrote:
> On 1/20/18 7:23 PM, Goldwyn Rodrigues wrote:
>>
>>
>> On 01/20/2018 08:11 PM, Andi Kleen wrote:
>>> Goldwyn Rodrigues <rgoldwyn@xxxxxxx> writes:
>>>
>>>> From: Goldwyn Rodrigues <rgoldwyn@xxxxxxxx>
>>>>
>>>> In case direct I/O encounters an error midway, it returns the error.
>>>> Instead it should be returning the number of bytes transferred so far.
>>>
>>> It's likely there's a lot of code in user space that does
>>>
>>>      if (write(..., N) < 0) handle error
>>>
>>> With your change it would need to be
>>>
>>>      if (write(..., N) != N) handle error
>>>
>>> How much code is actually doing that?
>>>
>>> I can understand it fixes your artifical test suite, but it seems to me your
>>> change has a high potential to break a lot of existing user code
>>> in subtle ways. So it seems to be a bad idea.
>>>
>>> -Andi
>>>
>>
>>
>> Quoting 'man 2 write':
>>
>> RETURN VALUE
>>  On  success,  the  number  of bytes written is returned (zero indicates
>>  nothing was written).  It is not an error if  this  number  is  smaller
>>  than the number of bytes requested; this may happen for example because
>>  the disk device was filled.  See also NOTES.
> 
> You can quote as much man page as you want - Andi is well aware of how
> read/write system call works, as I'm sure all of us are, that is not the
> issue. The issue is that there are potentially LOTS of applications out
> there that do not check for short writes, they do exactly what Andi
> speculated above. If you break it with this change, it doesn't matter
> what's in the man page. What matters is previous behavior, and that
> you are breaking user space. At that point nobody cares what's in the
> man page.
> 

Agree. So how do you think we should fix this to accommodate userspace
application who did not cater to the fact that write can return short
write, and still be consistent?

The only way I can think is that a DIO write should check early enough
that the write(N) will complete with N bytes without an error. Is it
possible to completely guarantee that?

Leaving it as it is incorrect as quoted in the artificial test case. You
should not be changing the file and yet conveying to the user an error
for the same write() call. It should either be an error and the file
contents are unchanged, or it should be change in contents and the write
size returned.


-- 
Goldwyn



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux