Re: [PATCH v7 19/21] NFS: Convert readdir page cache to use a cookie based index

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 24 Feb 2022, at 23:25, Trond Myklebust wrote:

> On Fri, 2022-02-25 at 14:17 +1100, NeilBrown wrote:

>> I haven't looked at the code recently so this might not be 100% accurate,
>> but XArray generally assumes that pages are often adjacent.  They don't
>> have to be, but there is a cost.  It uses a multi-level array with 9 bits
>> per level.  At each level there are a whole number of pages for indexes
>> to the next level.
>>
>> If there are two entries, that are 2^45 separate, that is 5 levels of
>> indexing that cannot be shared.  So the path to one entry is 5 pages,
>> each of which contains a single pointer.  The path to the other entry is
>> a separate set of 5 pages.
>>
>> So worst case, the index would be about 64/9 or 7 times the size of the
>> data.  As the number of data pages increases, this would shrink slightly,
>> but I suspect you wouldn't get below a factor of 3 before you fill up all
>> of your memory.

Yikes!

> If the problem is just the range, then that is trivial to fix: we can
> just use xxhash32(), and take the hit of more collisions. However if
> the problem is the access pattern, then I have serious questions about
> the choice of implementation for the page cache. If the cache can't
> support file random access, then we're barking up the wrong tree on the
> wrong continent.

I'm guessing the issue might be "get next", which for an "array" is probably
the operation tested for "perform well".  We're not doing any of that, we're
directly addressing pages with our hashed index.

> Either way, I see avoiding linear searches for cookies as a benefit
> that is worth pursuing.

Me too.  What about just kicking the seekdir users up into the second half
of the index, to use xxhash32() up there.  Everyone else can hang out in the
bottom half with dense indexes and help each other out.

The vast  majority of readdir() use is going to be short listings traversed
in order.  The memory inflation created by a process that needs to walk a
tree and for every two pages of readdir data require 10 pages of indexes
seems pretty extreme.

Ben




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux