Re: [PATCHSET v3 0/5] Support for RWF_UNCACHED

On 12/12/19 3:44 AM, Martin Steigerwald wrote:
> Hi Jens.
> 
> Jens Axboe - 11.12.19, 16:29:38 CET:
>> Recently someone asked me how io_uring buffered IO compares to mmaped
>> IO in terms of performance. So I ran some tests with buffered IO, and
>> found the experience to be somewhat painful. The test case is pretty
>> basic, random reads over a dataset that's 10x the size of RAM.
>> Performance starts out fine, and then the page cache fills up and we
>> hit a throughput cliff. CPU usage of the IO threads goes up, and we have
>> kswapd spending 100% of a core trying to keep up. Seeing that, I was
>> reminded of the many complaints I hear about buffered IO, and the
>> fact that most of the folks complaining will ultimately bite the
>> bullet and move to O_DIRECT to just get the kernel out of the way.
>>
>> But I don't think it needs to be like that. Switching to O_DIRECT
>> isn't always easily doable. The buffers have different lifetimes,
>> size and alignment constraints, etc. On top of that, mixing buffered
>> and O_DIRECT can be painful.
>>
>> Seems to me that we have an opportunity to provide something that sits
>> somewhere in between buffered and O_DIRECT, and this is where
>> RWF_UNCACHED enters the picture. If this flag is set on IO, we get
>> the following behavior:
>>
>> - If the data is in cache, it remains in cache and the copy (in or
>> out) is served to/from that.
>>
>> - If the data is NOT in cache, we add it while performing the IO. When
>> the IO is done, we remove it again.
>>
>> With this, I can do 100% smooth buffered reads or writes without
>> pushing the kernel to the state where kswapd is sweating bullets. In
>> fact it doesn't even register.
> 
> A question from a user or Linux Performance trainer perspective:
> 
> How does this compare with posix_fadvise() with POSIX_FADV_DONTNEED, which
> for example the nocache¹ command uses? Excerpt from the manpage
> posix_fadvise(2):
> 
>        POSIX_FADV_DONTNEED
>               The specified data will not be accessed  in  the  near
>               future.
> 
>               POSIX_FADV_DONTNEED  attempts to free cached pages as‐
>               sociated with the specified region.  This  is  useful,
>               for  example,  while streaming large files.  A program
>               may periodically request the  kernel  to  free  cached
>               data  that  has already been used, so that more useful
>               cached pages are not discarded instead.
> 
> [1] packaged in Debian as nocache or available here:
> https://github.com/Feh/nocache
> 
> In any case, it would be nice to have some option for this in rsync… I
> still have not changed my backup script to call rsync via nocache.

I don't know the nocache tool, but I'm guessing it just does the writes
(or reads) and then uses FADV_DONTNEED to drop behind those pages?
That's fine for slower use cases, but it won't work very well for fast IO.
The write side currently works pretty much like that internally, whereas
the read side doesn't use the page cache at all.
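
For illustration only, the drop-behind pattern guessed at above (do the
read, then call posix_fadvise() with POSIX_FADV_DONTNEED on the range
just consumed) might look roughly like the sketch below. This is not
taken from the nocache sources or from the patchset; the helper name
read_drop_behind() and the chunked loop are assumptions made up for the
example. Only open(), read() and posix_fadvise() are real interfaces.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from nocache or the patchset): read a file
 * in chunk-sized pieces through the page cache, and after each piece
 * ask the kernel to drop the range that was just read.
 * Returns the total number of bytes read, or -1 on error.
 */
static long long read_drop_behind(const char *path, size_t chunk)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char *buf = malloc(chunk);
    if (!buf) {
        close(fd);
        return -1;
    }

    off_t done = 0;
    ssize_t n;
    while ((n = read(fd, buf, chunk)) > 0) {
        /* ... consume buf[0..n) here ... */

        /*
         * Advisory only: the kernel attempts to free the cached pages
         * for this range, but pages that are dirty or otherwise busy
         * may stay. The return value is ignored in this sketch.
         */
        posix_fadvise(fd, done, n, POSIX_FADV_DONTNEED);
        done += n;
    }

    free(buf);
    close(fd);
    return n < 0 ? -1 : (long long)done;
}
```

Something like read_drop_behind("/path/to/bigfile", 1 << 20) would stream
the file in 1 MiB chunks. Note that the pages still get added to the page
cache and are only dropped afterwards by a separate system call per chunk.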

-- 
Jens Axboe



