On 11/21/24 8:07 AM, Pavel Begunkov wrote:
>> At least for now, there's a real issue reported and we should fix it. I
>> think the current patches are fine in that regard. That doesn't mean we
>> can't potentially make it better, we should certainly investigate that.
>> But I don't see the current patches as being suboptimal really, they are
>> definitely good enough as-is for solving the issue.
>
> That's fair enough, but I still would love to know how frequent
> it is. There is no purpose in optimising it as hot/slow path if
> it triggers every fifth run or such. David, how easy it is to
> get some stats? We can hack up some bpftrace script

As mentioned, I don't think it's a frequent occurrence in the sense that
it happens all the time, but it's one of those "if it hits, we're
screwed" cases, as now we're way over budget. And we have zero limiting
in place to prevent it from happening. As such, it doesn't really matter
how frequent it is; all that matters is that it cannot happen.

In terms of a different approach, e.g. "we can tolerate a bit more
overhead for the overrun case happening", then yeah, that'd be
interesting as it may guide improvements. But the frequency of it
occurring for one case doesn't mean that it won't be a "hits 50% of the
time" for some other case. Living with a separate retry list (which is
lockless, after all) seems to me to be the least of all evils, as the
overhead is both limited and constant in all cases.

I'd rather entertain NOT using llists for this in the first place, as
that gets rid of the reversing, which is the main cost here. That won't
necessarily change the need for a retry list, as I think we'd still be
better off with a lockless retry list. But at least it'd get rid of the
reversing. Let me see if I can dig out that patch... Totally orthogonal
to this topic, obviously.

-- 
Jens Axboe
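
For anyone following the llist part of the discussion, here is a minimal userspace sketch of why a lockless push list implies a reversal, and where a retry list would pick up leftover work. It assumes C11 atomics and is not the kernel's llist implementation or the patches under discussion. Every name in it (struct lifo_head, lifo_push, lifo_take_all, list_reverse, run_with_budget) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the kernel's llist API. */
struct node {
	struct node *next;
	int seq;		/* order in which the entry was queued */
};

struct lifo_head {
	_Atomic(struct node *) first;
};

/* Lockless push: new entries always go to the front, so the list is LIFO. */
static void lifo_push(struct lifo_head *head, struct node *n)
{
	struct node *old = atomic_load_explicit(&head->first,
						memory_order_relaxed);

	assert(n != NULL);
	do {
		n->next = old;	/* 'old' is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak_explicit(&head->first, &old, n,
							memory_order_release,
							memory_order_relaxed));
}

/* Detach the whole list in one atomic exchange; newest entry comes first. */
static struct node *lifo_take_all(struct lifo_head *head)
{
	return atomic_exchange_explicit(&head->first, (struct node *)NULL,
					memory_order_acquire);
}

/* Restore queueing order. This O(n) walk is the reversing cost. */
static struct node *list_reverse(struct node *n)
{
	struct node *prev = NULL;

	while (n) {
		struct node *next = n->next;

		n->next = prev;
		prev = n;
		n = next;
	}
	return prev;
}

/*
 * Run at most 'budget' entries in queueing order. Whatever is left over is
 * returned so the caller can park it on a separate retry list instead of
 * running past the budget.
 */
static struct node *run_with_budget(struct node *fifo, int budget, int *ran)
{
	*ran = 0;
	while (fifo && *ran < budget) {
		fifo = fifo->next;	/* stand-in for processing the entry */
		(*ran)++;
	}
	return fifo;
}
```

The reversal touches every detached entry even when the budget only allows a few of them to run, which is one way to read "the main cost here". A list that kept queueing order to begin with would not need that walk, though the leftover entries would still have to go somewhere.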