On Tue, Jan 14, 2025 at 07:55:30AM +1100, Dave Chinner wrote:
> The idea behind the initial cacheline layout is that it should stay
> read-only as much as possible so that cache lookups can walk the
> buffer without causing shared/exclusive cacheline contention with
> existing buffer users.
>
> This was really important back in the days when the cache used a
> rb-tree (i.e. the rbnode pointers dominated lookup profiles), and
> it's still likely important with the rhashtable on large caches.
>
> i.e. Putting a spinlock in that first cache line will result in
> lookups and shrinker walks having cacheline contention as the
> shrinker needs exclusive access for the spin lock, whilst the lookup
> walk needs shared access for the b_rhash_head, b_rhash_key and
> b_length fields in _xfs_buf_obj_cmp() for lockless lookup
> concurrency.

Hmm, this contradicts the comment on top of xfs_buf, which explicitly
wants the lock and count in the semaphore to stay in the first cache
line.  Both of these are modified for the uncontended lookup there: the
count, which already sits in that cache line, and the newly moved lock
(which would still leave the semaphore with a split layout).

Note that since the comment was written, b_sema actually moved entirely
into the first cache line, and this patch keeps it there, nicely
aligning b_lru_ref on my x86_64 no-debug config.

Now I'm usually pretty bad at these cacheline micro-optimizations, and
I'm talking to the person who wrote that comment here, so that
rationale might not make sense, but then the comment doesn't either.
I'm kinda tempted to just stick to the rationale there for now and then
let someone smarter than me optimize the layout for the new world
order.