On Thu, 2019-06-27 at 17:52 -0700, Paul E. McKenney wrote: > On Thu, Jun 27, 2019 at 05:46:27PM -0500, Scott Wood wrote: > > On Thu, 2019-06-27 at 13:50 -0700, Paul E. McKenney wrote: > > > If by IPI-to-self you mean the IRQ work trick, that isn't implemented > > > across all architectures yet, is it? > > > > Right... smp_send_reschedule() has wider coverage, but even then there's > > some hardware that just can't do it reasonably (e.g. pre-APIC x86). > > Except that smp_send_reschedule() won't do anything unless the scheduler > things something needs to be done, as it its wake list is non-empty. > Which might explain why Peter Zijlstra didn't suggest it. The wake list stuff is separate from the original purpose of the IPI, which is to hit the need_resched check on IRQ exit. When that happens, the scheduler will call into RCU, even if it doesn't change threads. > > So I guess the options are: > > > > 1. Accept that such hardware might experience delayed grace period > > completion in certain configurations, > > 2. Have such hardware check for need_resched in local_irq_enable() (not > > nice > > if sharing a kernel build with hardware that doesn't need it), or > > 3. Forbid the sequence (enforced by debug checks). Again, this would > > only > > prohibit rcu_read_lock()/local_irq_disable()/rcu_read_unlock()/ > > local_irq_enable() *without* preempt disabling around the IRQ-disabled > > region. > > 4. If further testing continues to show it to be reliable, continue > using the scheme in -rcu. If the testing isn't done on machines that can't do the IPI then it's basically option #1. FWIW I don't think option #1 is unreasonable given that we're talking about very old and/or specialized hardware, and we're only talking about delays, not a crash (maybe limit the ability to use nohz_full on such hardware?). Of course if it turns out people are actually trying to run (modern versions of) RT on such hardware, that might be different. :-) > 5. Use a short-duration hrtimer to get a clean environment in short > order. Yes, the timer might fire while preemption and/or softirqs > are disabled, but then the code can rely on the following > preempt_enable(), local_bh_enable(), or whatever. This condition > should be sufficiently rare to avoid issues with hrtimer overhead. Yeah, I considered that but was hesitant due to overhead -- at least in the case of the example I gave (pre-APIC x86), arming a oneshot timer is pretty slow. Plus, some hardware might entirely lack one-shot timer capability. > 6. Use smp_call_function_single() to IPI some other poor slob of a > CPU, which then does the same back. Non-waiting version in both > cases, of course. I was assuming any hardware that can't do smp_send_reschedule() is not SMP. > > Probably others as well. > > > > Why not simply make rcutorture cyheck whether it is running in a > > > PREEMPT_RT_FULL environment and avoid the PREEMPT_RT_FULL-unfriendly > > > testing only in that case? > > > > > > And should we later get to a place where the PREEMPT_RT_FULL- > > > unfriendly > > > scenarios are prohibited across all kernel configurations, then the > > > module > > > parameter can be removed. Again, until we know (as opposed to > > > suspect) > > > that these scenarios really don't happen, mainline rcutorture must > > > continue testing them. > > > > Yes, I already acknowledged that debug checks detecting the sequences > > should > > come before the test removal > > OK, good to hear. As you may have noticed, I was getting the impression > that you might have changed your mind on this point. ;-) > > > (including this patch as an RFC at this > > point > > was mainly meant as a demonstration of what's needed to get rcutorture > > to > > pass), but it'd be nice to have some idea of whether there would be > > opposition to the concept before coding up the checks. I'd rather not > > continue the state of "these sequences can blow up on RT and we don't > > know > > if they exist or not" any longer than necessary. Plus, only one of the > > sequences is exclusively an RT issue (though it's the one with the worst > > consequences). > > Steve Rostedt's point about enlisting the aid of lockdep seems worth > looking into. Sure. I was just concerned by the "Linus was against enforcing this in the past" comment and was hoping for more details. -Scott