Re: [RFC PATCH] vm: align vma allocation and move the lock back into the struct

On Mon, Aug 12, 2024 at 5:27 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Sun, Aug 11, 2024 at 9:29 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
> > That aside as I mentioned earlier the dedicated vma lock cache results
> > in false sharing between separate vmas, except this particular
> > benchmark does not test for it (which in your setup should be visible
> > even if the cache grows the SLAB_HWCACHE_ALIGN flag).
>
> When implementing VMA locks I did experiment with SLAB_HWCACHE_ALIGN
> for vm_lock cache using different benchmarks and didn't see
> improvements above noise level. Do you know of some specific benchmark
> that would possibly show improvement?
>

I don't know of anything specific; I'm saying basic multicore hygiene
says these locks need to land in dedicated cachelines.

Consider the following: struct rw_semaphore is 40 bytes and the word
modified in the lock/unlock fast path is at offset 0.

I don't know how much waste there is in the allocator, but if it is
anything less than 24 bytes per object (which obviously will be the
case) there will be massive false sharing. With 64-byte cachelines, up
to 24 bytes of the *second* lock land in the same cacheline as the
first one, including the word which is modified in the fast path. IOW
the locks allocated this way are guaranteed to keep bouncing.
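
To make the arithmetic concrete, here is a minimal userspace sketch (not
kernel code). It assumes 64-byte cachelines and a slab whose first
object starts on a cacheline boundary; the 40-byte size and the offset-0
hot word are the figures quoted above, while `stride`, `hot_line()` and
`next_lock_shares_line()` are names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHELINE 64u /* assumed cacheline size */
#define LOCK_SIZE 40u /* size of struct rw_semaphore quoted above */

/* Byte offset of lock i when each object takes `stride` bytes
 * (stride = LOCK_SIZE + whatever padding the allocator adds). */
static size_t lock_offset(size_t i, size_t stride)
{
	assert(stride >= LOCK_SIZE);
	return i * stride;
}

/* Cacheline index holding the fast-path word (offset 0) of lock i. */
static size_t hot_line(size_t i, size_t stride)
{
	return lock_offset(i, stride) / CACHELINE;
}

/* True if the fast-path word of lock i+1 sits in a cacheline that
 * also holds bytes of lock i, i.e. the two locks false-share. */
static bool next_lock_shares_line(size_t i, size_t stride)
{
	size_t last_byte = lock_offset(i, stride) + LOCK_SIZE - 1;

	return hot_line(i + 1, stride) <= last_byte / CACHELINE;
}
```

Under these assumptions, any stride from 40 to 63 puts the second lock's
hot word in the same line as the first lock; only a stride of 64, which
is what cacheline alignment would give, separates them.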

I don't believe any effort is warranted to find a real scenario with
this problem or to write a synthetic one.

> > If there are still problems and the lock needs to remain separate, the
> > bare minimum damage-controlling measure would be to hwalign the vma
> > lock cache -- it won't affect the pts benchmark, but it should help
> > others.
>
> Sure but I'll need to measure the improvement and for that I need a
> benchmark or a workload. Any suggestions?
>

I believe I addressed this above.

If there is an individual who in your opinion is going to protest such
a patch on the grounds that no benchmark is being provided, I can give
them a talking to.

Even then, this bit may not end up being applicable anyway, so...

> >
> > Should the decision be to bring the lock back into the struct, I'll
> > note my patch is merely slapped together to a state where it can be
> > benchmarked and I have no interest in beating it into a committable
> > shape. You stated you already had an equivalent (modulo keeping
> > something in a space previously occupied by the pointer to the vma
> > lock), so as far as I'm concerned you can submit that with your
> > authorship.
>
> Thanks! If we end up doing that I'll keep you as Suggested-by and will
> add a link to this thread.

sgtm

-- 
Mateusz Guzik <mjguzik gmail.com>




