Re: [rfc][patch] x86-64 new smp_call_function design

Nick Piggin <npiggin@xxxxxxx> · Wed, 27 Feb 2008 14:07:16 +0100

On Wed, Feb 27, 2008 at 02:04:18PM +0100, Andi Kleen wrote:
> 
> > On a 2 socket, 8 core system, I see anywhere up to nearly 16x better
> > performance on a stress test. The common cases of call-all, and wait
> > are improved the least, however I think that if call-single and nowait
> > are turned into a high performance API, then new usages will pop up
> > (eg. I started this because I wanted to do "call single, nowait" calls
> > for migrating block IO completions back to submitting CPU; however I
> > am also interested in improving the "call all, wait" case for example
> > to improve vmalloc tlb flushing).
> 
> TLB flushing at least on x86-64 should be already well optimized on its
> own. I would be surprised if you could do much better.

*vmalloc* TLB flushing. 

void flush_tlb_all(void)
{
        on_each_cpu(do_flush_tlb_all, NULL, 1, 1);
}

Of course we could use a new vector for it and speed it up a lot more,
but after my vmalloc improvements I think that would be a waste of a
vector at this point.

> > As far as I understand, calling a subset of online CPUs that is not all or
> > one, is used quite infrequently, so this might be OK.
> 
> With cpusets and isolation etc. it is the normal case.

Oh really? Coming from what callers?
-
To unsubscribe from this list: send the line "unsubscribe linux-arch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html