Re: pnfs LD partial sector write

On Thu, Jul 26, 2012 at 3:47 PM, Boaz Harrosh <bharrosh@xxxxxxxxxxx> wrote:
> On 07/26/2012 05:43 AM, Peng Tao wrote:
>
>> Another thing is, this further complicates direct writes, where I
>> cannot use pagecache to ensure proper locking for concurrent writers
>> in the same BLOCK, and sector-aligned partial BLOCK DIO writes need to
>> be serialized internally. IOW, the same code cannot be reused by DIO
>> writes. sigh...
>>
>
>
> One last thing. Applications who use direct IO know to allocate
> and issue sector aligned requests both at offset and length.
> That's a Kernel requirement. It is not for NFS, but even so.
>
> Just refuse sector unaligned DIO and revert to MDS.
>
> With sector aligned IO you directly DIO to DIO pages,
> problem solved.
>
> If you need the COW of partial blocks, you still use
> page-cache pages, which is fine because they do not
> intersect any of the DIO.
>
I certainly thought about it, but it doesn't work for the AIO DIO case.
Assume the BLOCK size is 8K: process A writes bytes 0~4095 of file foo
with AIO DIO while process B writes bytes 4096~8191 with AIO DIO at the
same time. If the kernel ever tries to rely on the page cache to cope
with an invalid extent, it ends up with data corruption.
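
To make the scenario above concrete, here is a minimal user-space sketch
of one interleaving that loses data. It is only an illustration under
assumptions: it models each writer as doing an unserialized
read-modify-write of the whole 8K BLOCK, and every name in it (rmw_buf,
rmw_read, rmw_modify, rmw_write, replay_lost_update) is hypothetical and
not taken from the pNFS block layout code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 8192u	/* BLOCK size assumed in the example above */
#define HALF       4096u	/* each writer covers half of the BLOCK */

/* One writer's private copy of the BLOCK (hypothetical model). */
struct rmw_buf {
	unsigned char data[BLOCK_SIZE];
};

/* Step 1: read the whole BLOCK from "disk" into a private buffer. */
static void rmw_read(struct rmw_buf *buf, const unsigned char *disk)
{
	memcpy(buf->data, disk, BLOCK_SIZE);
}

/* Step 2: overlay the writer's partial-BLOCK payload. */
static void rmw_modify(struct rmw_buf *buf, size_t off, size_t len,
		       unsigned char fill)
{
	assert(off + len <= BLOCK_SIZE);
	memset(buf->data + off, fill, len);
}

/* Step 3: write the whole BLOCK back. */
static void rmw_write(const struct rmw_buf *buf, unsigned char *disk)
{
	memcpy(disk, buf->data, BLOCK_SIZE);
}

/*
 * Replay the bad interleaving: both writers read the (zeroed) BLOCK
 * before either writes it back. Returns 1 if A's bytes were lost.
 */
static int replay_lost_update(void)
{
	static unsigned char disk[BLOCK_SIZE];
	static struct rmw_buf a, b;

	memset(disk, 0, sizeof(disk));		/* invalid extent reads as zeros */
	rmw_read(&a, disk);
	rmw_read(&b, disk);
	rmw_modify(&a, 0, HALF, 'A');		/* process A: bytes 0~4095 */
	rmw_modify(&b, HALF, HALF, 'B');	/* process B: bytes 4096~8191 */
	rmw_write(&a, disk);
	rmw_write(&b, disk);			/* B's stale zeros overwrite A's half */

	return disk[0] != 'A' && disk[HALF] == 'B';
}
```

In this model B's write-back carries the zeros it read before A finished,
so A's half is silently dropped. Avoiding that requires serializing the
two writers on the BLOCK, which is the locking cost discussed below.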

This is a common problem for any extent-based file system that has to
deal with partial BLOCK (_NOT SECTOR_) AIODIO writes. If you wonder why,
take a look at ext4_unaligned_aio() and all the ext4 AIODIO locking
mechanisms... That's the reason I bailed out of non-block-aligned AIO
in the previous DIO alignment patches. I think I should just keep the
AIODIO bailout logic, since adding a locking method slows down
writers when they could go locklessly through the MDS. I will revive the
bailout patches after fixing the buffered IO things.
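
The bailout decision described above can be sketched roughly as follows.
This is not the code from the actual patches; the helper names
(is_aligned, aiodio_should_bail_to_mds) are hypothetical, and the sketch
assumes the BLOCK size is a power of two and covers only the AIO DIO
case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v is a multiple of unit; unit must be a power of two. */
static bool is_aligned(uint64_t v, uint32_t unit)
{
	assert(unit != 0 && (unit & (unit - 1)) == 0);
	return (v & ((uint64_t)unit - 1)) == 0;
}

/*
 * Hypothetical helper: an AIO DIO write whose offset or length is not
 * BLOCK aligned is sent through the MDS instead of taking locks on
 * the layout path.
 */
static bool aiodio_should_bail_to_mds(uint64_t off, uint64_t len,
				      uint32_t block_size)
{
	return !is_aligned(off, block_size) || !is_aligned(len, block_size);
}
```

With an 8K BLOCK size, both writes from the example (offset 0 length 4096,
and offset 4096 length 4096) would bail out to the MDS, while a write at
offset 8192 of length 16384 would stay on the layout path.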

Cheers,
Tao

