Re: [PATCHSET v3 0/5] Support for RWF_UNCACHED

Jens Axboe - 12.12.19, 16:16:31 CET:
> On 12/12/19 3:44 AM, Martin Steigerwald wrote:
> > Jens Axboe - 11.12.19, 16:29:38 CET:
> >> Recently someone asked me how io_uring buffered IO compares to mmaped
> >> IO in terms of performance. So I ran some tests with buffered IO, and
> >> found the experience to be somewhat painful. The test case is pretty
> >> basic, random reads over a dataset that's 10x the size of RAM.
> >> Performance starts out fine, and then the page cache fills up and we
> >> hit a throughput cliff. CPU usage of the IO threads goes up, and we
> >> have kswapd spending 100% of a core trying to keep up. Seeing that, I
> >> was reminded of the many complaints I hear about buffered IO, and the
> >> fact that most of the folks complaining will ultimately bite the
> >> bullet and move to O_DIRECT to just get the kernel out of the way.
> >> 
> >> But I don't think it needs to be like that. Switching to O_DIRECT
> >> isn't always easily doable. The buffers have different lifetimes,
> >> size and alignment constraints, etc. On top of that, mixing buffered
> >> and O_DIRECT can be painful.
> >> 
> >> Seems to me that we have an opportunity to provide something that
> >> sits somewhere in between buffered and O_DIRECT, and this is where
> >> RWF_UNCACHED enters the picture. If this flag is set on IO, we get
> >> the following behavior:
> >> 
> >> - If the data is in cache, it remains in cache and the copy (in or
> >> out) is served to/from that.
> >> 
> >> - If the data is NOT in cache, we add it while performing the IO.
> >> When the IO is done, we remove it again.
> >> 
> >> With this, I can do 100% smooth buffered reads or writes without
> >> pushing the kernel to the state where kswapd is sweating bullets. In
> >> fact it doesn't even register.
> > 
> > A question from the perspective of a user and Linux performance trainer:
> > 
> > How does this compare with posix_fadvise() with POSIX_FADV_DONTNEED,
> > which for example the nocache [1] command uses? Excerpt from the
> > manpage posix_fadvise(2):
> >        POSIX_FADV_DONTNEED
> >
> >               The specified data will not be accessed in the near
> >               future.
> >
> >               POSIX_FADV_DONTNEED attempts to free cached pages
> >               associated with the specified region.  This is useful,
> >               for example, while streaming large files.  A program
> >               may periodically request the kernel to free cached
> >               data that has already been used, so that more useful
> >               cached pages are not discarded instead.
> > 
> > [1] Packaged in Debian as nocache, or available here:
> > https://github.com/Feh/nocache
> > 
> > In any case, it would be nice to have some such option in rsync… I
> > still have not changed my backup script to call rsync via nocache.
> 
> I don't know the nocache tool, but I'm guessing it just does the
> writes (or reads) and then uses FADV_DONTNEED to drop behind those
> pages? That's fine for slower use cases, it won't work very well for
> fast IO. The write side currently works pretty much like that
> internally, whereas the read side doesn't use the page cache at all.

Yes, it does that. And yeah, I saw you changed the read side to bypass
the cache entirely.
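
For reference, the drop-behind pattern discussed above looks roughly like
the sketch below. It is not the actual nocache code; the helper name
read_drop_behind() and the 64 KiB chunk size are assumptions made up for
illustration. It reads a file sequentially and calls posix_fadvise() with
POSIX_FADV_DONTNEED on each chunk after consuming it.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes posix_fadvise() */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from nocache or from the patchset):
 * read a file sequentially and ask the kernel to drop each chunk from
 * the page cache once it has been consumed.
 * Returns the number of bytes read, or -1 on error.
 */
static long long read_drop_behind(const char *path)
{
    char buf[64 * 1024];          /* chunk size is an arbitrary choice */
    off_t done = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        /*
         * Advisory only: the kernel attempts to free the cached pages
         * for [done, done + n), but is not obliged to. The return
         * value is ignored here for the same reason.
         */
        (void)posix_fadvise(fd, done, n, POSIX_FADV_DONTNEED);
        done += n;
    }

    close(fd);
    return n < 0 ? -1 : (long long)done;
}
```

Calling read_drop_behind() on a large file returns the number of bytes
read, or -1 on error. Note that this costs one extra syscall per chunk,
and the pages are still inserted into the page cache before being dropped.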

Also, as I understand it, this is primarily for asynchronous IO using
io_uring?

-- 
Martin




