On Thu, Jul 30, 2020 at 11:23:58AM -0700, Sagi Grimberg wrote:
> > > > > I think it will be a significant improvement to have a single code path.
> > > > > The code will be more robust and we won't need to face issues that are
> > > > > specific for blocking.
> > > > >
> > > > > If the cost is negligible, I think the upside is worth it.
> > > > >
> > > >
> > > > rcu_read_lock and rcu_read_unlock has been proved as efficient enough,
> > > > and I don't think percpu_refcount is better than it, so I'd suggest to
> > > > not switch non-blocking into this way.
> > >
> > > It's not a matter of which is better, its a matter of making the code
> > > more robust because it has a single code-path. If moving to percpu_ref
> > > is negligible, I would suggest to move both, I don't want to have two
> > > completely different mechanism for blocking vs. non-blocking.
> >
> > FWIW, I proposed an hctx percpu_ref over a year ago (but for a
> > completely different reason), and it was measured as too costly.
> >
> >   https://lore.kernel.org/linux-block/d4a4b6c0-3ea8-f748-85b0-6b39c5023a6f@xxxxxxxxx/
>
> If this is the case, we shouldn't consider this as an alternative at all,
> and move forward with either the original proposal or what
> ming proposed to move a counter to the tagset.

Well, the point I was trying to make is that we shouldn't bother making
blocking and non-blocking dispatchers use the same synchronization since
non-blocking has a very cheap solution that blocking can't use.