On Wed, Nov 25, 2020 at 10:15:38PM +0800, Baolin Wang wrote:
>
> > Hello,
> >
> > On Tue, Nov 24, 2020 at 11:33:33AM +0800, Baolin Wang wrote:
> > > @@ -1445,7 +1447,8 @@ static void iocg_kick_waitq(struct ioc_gq *iocg, bool pay_debt,
> > >  		 * after the above debt payment.
> > >  		 */
> > >  		ctx.vbudget = vbudget;
> > > -		current_hweight(iocg, NULL, &ctx.hw_inuse);
> > > +		if (need_update_hwi)
> > > +			current_hweight(iocg, NULL, &ctx.hw_inuse);
> >
> > So, if you look at the implementation of current_hweight(), it's
> >
> > 1. If nothing has changed, read out the cached values.
> > 2. If something has changed, recalculate.
>
> Yes, correct.
>
> >
> > and the "something changed" test is single memory read (most likely L1 hot
> > at this point) and testing for equality. IOW, the change you're suggesting
> > isn't much of an optimization. Maybe the compiler can do a somewhat better
> > job of arranging the code and it's a register load than memory load but
> > given that it's already a relatively cold wait path, this is unlikely to
> > make any actual difference. And that's how current_hweight() is meant to be
> > used.
>
> What I want to avoid is the 'atomic_read(&ioc->hweight_gen)' in
> current_hweight(), cause this is not a register load and is always a memory
> load. But introducing a flag can be cached and more light than a memory
> load.
>
> But after thinking more, I think we can just move the "current_hweight(iocg,
> NULL, &ctx.hw_inuse);" to the correct place without introducing new flag to
> optimize the code. How do you think the below code?

I don't find this discussion very meaningful. We're talking about a
theoretical optimization of one memory load in a path which likely isn't hot
enough for it to make any difference. If you can show that this matters,
please do. Otherwise, what are we doing?

Thanks.

-- 
tejun