On 10/1/21 15:46, Jason Gunthorpe wrote:
> On Fri, Oct 01, 2021 at 04:22:28PM -0600, Logan Gunthorpe wrote:
>>> It would close this issue, however synchronize_rcu() is very slow
>>> (think > 1second) in some cases and thus cannot be inserted here.
>>
>> It shouldn't be *that* slow, at least not the vast majority of the
>> time... it seems a bit unreasonable that a CPU wouldn't schedule for
>> more than a second.
>
> I've seen bug reports on exactly this, it is well known. Loaded
> big multi-cpu systems have high delays here, for whatever reason.

So have I. One reason is that synchronize_rcu() doesn't merely wait
for a context switch on each CPU; it also waits for callbacks (such as
those set up by call_rcu(), if I understand correctly) to run.

These can really add up to something quite substantial. In fact, I don't
think there is an upper limit on callback running times, anywhere.
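
To make the "add up" point concrete, here is a toy userspace model (not
kernel code; every name in it is made up for illustration). It assumes
the waiter's wakeup callback sits behind n other callbacks on the same
queue, and that those run back to back once the grace period ends:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one queued RCU callback. */
struct toy_cb {
	uint64_t run_time_us;	/* how long this callback takes to run */
};

/*
 * Delay until the waiter's own wakeup callback gets to run, under the
 * assumption that the grace period takes gp_us and the n callbacks
 * queued ahead of it are invoked first, one after another.
 */
uint64_t toy_wakeup_delay_us(uint64_t gp_us, const struct toy_cb *ahead,
			     size_t n)
{
	uint64_t total = gp_us;
	size_t i;

	for (i = 0; i < n; i++)
		total += ahead[i].run_time_us;

	return total;
}
```

Nothing in this model bounds run_time_us or n, which is the concern: a
long queue of slow callbacks pushes the wakeup out arbitrarily far.
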
thanks,
--
John Hubbard
NVIDIA