On 12/17/19 8:16 AM, Guoqing Jiang wrote:
>
> On 12/17/19 3:39 PM, Jens Axboe wrote:
>> If RWF_UNCACHED is set for io_uring (or preadv2(2)), we'll use private
>> pages for the buffered reads. These pages will never be inserted into
>> the page cache, and they are simply droped when we have done the copy at
>> the end of IO.
>>
>> If pages in the read range are already in the page cache, then use those
>> for just copying the data instead of starting IO on private pages.
>>
>> A previous solution used the page cache even for non-cached ranges, but
>> the cost of doing so was too high. Removing nodes at the end is
>> expensive, even with LRU bypass. On top of that, repeatedly
>> instantiating new xarray nodes is very costly, as it needs to memset 576
>> bytes of data, and freeing said nodes involve an RCU call per node as
>> well. All that adds up, making uncached somewhat slower than O_DIRECT.
>>
>> With the current*solition*, we're basically at O_DIRECT levels of
>
> Maybe it is 'solution' here.

Indeed, fixed up, thanks.

-- 
Jens Axboe