Re: [PATCH] mm: cache largest vma

On Mon, 2013-11-11 at 12:47 -0800, Davidlohr Bueso wrote:
> On Mon, 2013-11-11 at 13:04 +0100, Ingo Molnar wrote:
> > * Michel Lespinasse <walken@xxxxxxxxxx> wrote:
> > 
> > > On Sun, Nov 10, 2013 at 8:12 PM, Davidlohr Bueso <davidlohr@xxxxxx> wrote:
> > > > 2) Oracle Data mining (4K pages)
> > > > +------------------------+----------+------------------+---------+
> > > > |    mmap_cache type     | hit-rate | cycles (billion) | stddev  |
> > > > +------------------------+----------+------------------+---------+
> > > > | no mmap_cache          | -        | 63.35            | 0.20207 |
> > > > | current mmap_cache     | 65.66%   | 19.55            | 0.35019 |
> > > > | mmap_cache+largest VMA | 71.53%   | 15.84            | 0.26764 |
> > > > | 4 element hash table   | 70.75%   | 15.90            | 0.25586 |
> > > > | per-thread mmap_cache  | 86.42%   | 11.57            | 0.29462 |
> > > > +------------------------+----------+------------------+---------+
> > > >
> > > > This workload sure makes the point of how much we can benefit from 
> > > > caching the vma; otherwise find_vma() can cost more than 220% extra 
> > > > cycles. We clearly win here by having a per-thread cache instead of 
> > > > a per-address-space one. I also tried the same workload with 2MB 
> > > > hugepages and the results are much closer to the kernel build, but 
> > > > with the per-thread vma cache still winning over the rest of the 
> > > > alternatives.
> > > >
> > > > All in all I think that we should probably have a per-thread vma 
> > > > cache. Please let me know if there is some other workload you'd like 
> > > > me to try out. If folks agree then I can cleanup the patch and send it 
> > > > out.
> > > 
> > > Per-thread cache sounds interesting - with per-mm caches there is a real 
> > > risk that some modern threaded apps pay the cost of cache updates 
> > > without seeing much of the benefit. However, how do you cheaply handle 
> > > invalidations for the per-thread cache?
> > 
> > The cheapest way to handle that would be to have a generation counter for 
> > the mm and to couple cache validity to a specific value of that. 
> > 'Invalidation' is then the free side effect of bumping the generation 
> > counter when a vma is removed/moved.

Wouldn't this approach make us invalidate every thread's cached vma even
when we only want to invalidate one? I mean, we have no way of associating
a single vma with an mm->mmap_seqnum, or am I missing something?
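
Just to make sure I'm reading the suggestion right, I take it to be
something along these lines (only a sketch, all new field names made up:
mm->mmap_seqnum, current->mmap_cache, current->mmap_cache_seqnum):

struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
{
        struct vm_area_struct *vma = NULL;

        /*
         * Only trust the per-thread cache if no vma in this mm has been
         * removed or moved since we stored it.
         */
        if (current->mmap_cache_seqnum == mm->mmap_seqnum) {
                vma = current->mmap_cache;
                if (vma && vma->vm_end > addr && vma->vm_start <= addr)
                        return vma;
        }

        /* ... otherwise the usual rbtree walk fills in vma ... */

        if (vma) {
                current->mmap_cache = vma;
                current->mmap_cache_seqnum = mm->mmap_seqnum;
        }
        return vma;
}

and in the unmap/move paths simply:

        mm->mmap_seqnum++;      /* drops every thread's cached vma at once */

So a single bump throws away all the cached entries, including ones that
still point to perfectly valid vmas.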

> 
> I was basing the invalidations on the freeing of the vma back to
> vm_area_cachep, so I mark current->mmap_cache = NULL whenever we call
> kmem_cache_free(vm_area_cachep, ...). But I can see this being a problem
> if more than one task's mmap_cache points to the same vma, as we end up
> invalidating only one of them. I'd really like to use similar logic and
> base everything around the existence of the vma instead of adding a
> counting infrastructure. Sure, we'd end up doing more reads when we do
> the lookup in find_vma(), but the cost of maintaining it comes for free.
> I just ran into a similar idea from 2 years ago:
> http://lkml.indiana.edu/hypermail/linux/kernel/1112.1/01352.html
> 
> While there are several things in it that aren't needed, it does use
> is_kmem_cache() to verify that the vma is still a valid slab object.

Doing invalidations this way is definitely not the way to go. While the
hit rate does match my previous attempt, checking the slab ends up costing
~25% more cycles than what we currently have.
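
Roughly, the extra work in the lookup path looks like this (again just a
sketch; vma_obj_is_live() is a made-up stand-in for the is_kmem_cache()
style walk the old patch did):

        /*
         * Walking vm_area_cachep to prove that the cached pointer is
         * still a live vma is what eats the extra ~25% of cycles.
         */
        vma = current->mmap_cache;
        if (vma && vma_obj_is_live(vm_area_cachep, vma) &&
            vma->vm_mm == mm &&
            vma->vm_end > addr && vma->vm_start <= addr)
                return vma;

And even with that check the object could have been freed and reused for
another mm in the meantime, hence also comparing vma->vm_mm.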

Thanks,
Davidlohr
