On Mon, Jul 10, 2023 at 01:11:28PM -0500, Bob Pearson wrote:

> New news on this. After some testing it turns out that replacing
> rcu_read_lock() by xa_lock_irqsave() in rxe_pool_get_index() with a
> large number of QPs has very bad performance. ib_send_bw -q 32
> spends about 40% of its time trying to get the spinlock on a 24
> thread CPU with local loopback. With rcu_read_lock performance is
> what I expect. So, since we don't actually see this race we are
> reverting that change. Without it the irqsave locks aren't required
> either. So for now please ignore this patch.

Well, no, you need to fix this now that you found it - you need to
make the core code RCU free things that rxe thinks are rcu protected.

And maybe we should just have the core code do it always, we've wanted
RCU objects for other reasons anyhow.

Jason
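
For reference, the lookup pattern under discussion can be sketched in
userspace C roughly as below. This is only an illustration under stated
assumptions, not the rxe or core code: struct elem, ref_get_unless_zero(),
ref_put(), lookup() and run_demo() are hypothetical names, C11 atomics
stand in for the kernel's kref, and there is no real RCU here. The point
it tries to show is that a lockless lookup may still reach an object
whose count has already dropped to zero, so the memory has to stay valid
until readers are done, which is why the final free would need to be
RCU-deferred (for example with kfree_rcu()).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pool element; not the rxe structure. */
struct elem {
	atomic_int ref_cnt;	/* 0 means teardown has already started */
	int payload;
};

/* Take a reference only if the count is still non-zero. */
bool ref_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
bool ref_put(atomic_int *ref)
{
	return atomic_fetch_sub(ref, 1) == 1;
}

/*
 * Lookup step. In the kernel this would sit between rcu_read_lock() and
 * rcu_read_unlock(). Reading e->ref_cnt is only safe there if the memory
 * behind 'e' is not freed until a grace period has passed.
 */
struct elem *lookup(struct elem *e)
{
	if (e && ref_get_unless_zero(&e->ref_cnt))
		return e;
	return NULL;
}

/* Small self-check of the behaviour described above; returns 0. */
int run_demo(void)
{
	struct elem e;

	atomic_init(&e.ref_cnt, 1);
	e.payload = 0;

	assert(lookup(&e) == &e);	/* live object: reference taken */
	assert(!ref_put(&e.ref_cnt));	/* drop the lookup's reference */
	assert(ref_put(&e.ref_cnt));	/* drop the last reference */
	assert(lookup(&e) == NULL);	/* dying object: lookup refuses */
	return 0;
}
```

In this sketch the last lookup() still reads e.ref_cnt after the count
has hit zero. That is harmless here because e lives on the stack, but
with a heap object freed immediately it would be a use-after-free, which
is the race being discussed.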