On Tue, Dec 02, 2014 at 07:14:22AM -0500, Jeff Layton wrote:
> On Tue, 2 Dec 2014 06:57:50 -0500
> Jeff Layton <jeff.layton@xxxxxxxxxxxxxxx> wrote:
> 
> > On Mon, 1 Dec 2014 19:38:19 -0500
> > Trond Myklebust <trondmy@xxxxxxxxx> wrote:
> > 
> > > On Mon, Dec 1, 2014 at 6:47 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > > > I find it hard to think about how we expect this to affect performance.
> > > > So it comes down to the observed results, I guess, but just trying to
> > > > get an idea:
> > > > 
> > > > 	- this eliminates sp_lock.  I think the original idea here was
> > > > 	  that if interrupts could be routed correctly then there
> > > > 	  shouldn't normally be cross-cpu contention on this lock.  Do
> > > > 	  we understand why that didn't pan out?  Is hardware capable of
> > > > 	  doing this really rare, or is it just too hard to configure it
> > > > 	  correctly?
> > > 
> > > One problem is that a 1MB incoming write will generate a lot of
> > > interrupts. While that is not so noticeable on a 1GigE network, it is
> > > on a 40GigE network. The other thing you should note is that this
> > > workload was generated with ~100 clients pounding on that server, so
> > > there are a fair number of TCP connections to service in parallel.
> > > Playing with the interrupt routing doesn't necessarily help you so
> > > much when all those connections are hot.
> 
> In principle though, the percpu pool_mode should have alleviated the
> contention on the sp_lock. When an interrupt comes in, the xprt gets
> queued to its pool. If there is a pool for each cpu then there should
> be no sp_lock contention. The pernode pool mode might also have
> alleviated the lock contention to a lesser degree in a NUMA
> configuration.
> 
> Do we understand why that didn't help?

Yes, the lots-of-interrupts-per-rpc problem strikes me as a separate if
not entirely orthogonal problem.

(And I thought it should be addressable separately; Trond and I talked
about this in Westford.  I think it currently wakes a thread to handle
each individual tcp segment--but shouldn't it be able to do all the
data copying in the interrupt and wait to wake up a thread until it's
got the entire rpc?)

> In any case, I think that doing this with RCU is still preferable.
> We're walking a very short list, so doing it lockless is still a
> good idea to improve performance without needing to use the percpu
> pool_mode.

I find that entirely plausible.

Maybe it would help to ask SGI people.  Cc'ing Ben Myers in hopes he
could point us to the right person.

It'd be interesting to know:

	- are they using the svc_pool stuff?
	- if not, why not?
	- if so:
		- can they explain how they configure systems to take
		  advantage of it?
		- do they have any recent results showing how it helps?
		- could they test Jeff's patches for performance
		  regressions?

Anyway, I'm off for now, back to work Thursday.

--b.
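
For reference, a minimal sketch of the percpu pool_mode idea Jeff
describes above. The struct and function names here are simplified
stand-ins, not the actual sunrpc code; the point is only that when each
CPU owns a pool, the enqueue path takes the lock belonging to the local
CPU, so sp_lock should not normally bounce between caches:

/*
 * Sketch of percpu pool selection (simplified names, not sunrpc code).
 * The interrupt that signals "this transport has data" enqueues the
 * xprt on the pool of the CPU it is running on.
 */
#include <linux/spinlock.h>
#include <linux/list.h>
#include <linux/smp.h>

struct sketch_pool {
	spinlock_t		sp_lock;	/* protects sp_sockets */
	struct list_head	sp_sockets;	/* transports with work */
};

static struct sketch_pool *sketch_pools;	/* one per possible CPU */

static void sketch_enqueue_xprt(struct list_head *xprt_ready)
{
	/* percpu mode: the pool index is just the current CPU */
	struct sketch_pool *pool = &sketch_pools[smp_processor_id()];

	/*
	 * Uncontended in the common case; it only bounces if a thread
	 * on another CPU steals work from this pool's queue.
	 */
	spin_lock_bh(&pool->sp_lock);
	list_add_tail(xprt_ready, &pool->sp_sockets);
	spin_unlock_bh(&pool->sp_lock);
}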
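
And a hedged sketch of the lockless alternative Jeff argues for:
instead of serializing on sp_lock to find an idle thread, walk a short
RCU-protected thread list and claim a thread with an atomic bit
operation. The names (rq_all, RQ_BUSY) follow the general direction of
Jeff's series, but this is an illustration, not the patch itself:

/*
 * Sketch of a lockless idle-thread search under rcu_read_lock().
 * Writers would add/remove threads with list_add_rcu()/list_del_rcu()
 * under a lock and wait for a grace period before freeing an entry.
 */
#include <linux/rcupdate.h>
#include <linux/list.h>
#include <linux/bitops.h>
#include <linux/sched.h>

#define RQ_BUSY	0			/* thread already has work */

struct sketch_rqst {
	struct list_head	rq_all;	/* link in pool's thread list */
	unsigned long		rq_flags;
	struct task_struct	*rq_task;
};

static bool sketch_wake_idle_thread(struct list_head *all_threads)
{
	struct sketch_rqst *rqstp;
	bool woken = false;

	rcu_read_lock();
	list_for_each_entry_rcu(rqstp, all_threads, rq_all) {
		/* the atomic bit op is the lockless "claim" step */
		if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
			continue;	/* busy; try the next thread */
		wake_up_process(rqstp->rq_task);
		woken = true;
		break;
	}
	rcu_read_unlock();

	return woken;	/* false: no idle thread, leave work queued */
}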
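
Bruce's parenthetical suggestion (defer the wakeup until a whole RPC
record has arrived, rather than waking a thread per TCP segment) might
look roughly like the following. This is purely hypothetical; it only
illustrates accounting against the RPC record marker in the socket's
data-ready path:

/*
 * Hypothetical sketch: the data_ready callback accumulates segment
 * lengths and only reports "enqueue and wake a thread" once the full
 * record, per the RPC-over-TCP record marker, is available.
 */
#include <linux/types.h>

struct sketch_tcp_xprt {
	u32	sk_reclen;	/* length from the RPC record marker */
	u32	sk_tcplen;	/* bytes of this record seen so far */
};

/* Called from the socket's data_ready callback for each segment. */
static bool sketch_record_complete(struct sketch_tcp_xprt *xprt,
				   u32 segment_len)
{
	xprt->sk_tcplen += segment_len;

	/* Wake a thread only when the whole record is available... */
	if (xprt->sk_reclen && xprt->sk_tcplen >= xprt->sk_reclen)
		return true;	/* caller enqueues xprt, wakes thread */

	return false;		/* ...otherwise, no wakeup at all */
}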