On Tue, Oct 27, 2015 at 04:15:59PM -0700, Paul E. McKenney wrote: > On Tue, Oct 27, 2015 at 08:27:53AM -0700, Eric Dumazet wrote: > > On Tue, 2015-10-27 at 12:02 -0300, Arnaldo Carvalho de Melo wrote: [..] > > > The first suggestion, with it disabled by default seems to be the most > > > flexible tho, i.e, Paul's original message plus the boot parameter line: > > > > > > Alternatively, a boot-time option could be used: > > > > > > int some_rt_boot_parameter = CONFIG_SYNC_NET_DEFAULT; > > > > > > if (rtnl_is_locked() && !some_rt_boot_parameter) > > > synchronize_rcu_expedited(); > > > else > > > synchronize_rcu(); > > This could be OK, but why not start with something very simple and automatic? > We can always add more knobs when and if they actually prove necessary. I suppose the question is if, for acme's usecases the answer to "when it's proven necessary" is "now". > In contrast, unnecessary knobs can cause confusion and might at the same time > get locked into some misbegotten userspace application, which would make the > unnecessary knob really hard to get rid of. I think I would make a stronger statement; the CONFIG_SYNC_NET_DEFAULT proposed option would be a boot/compile time parameter which says "I require networking (and network configuration) in my critical path", why don't we have these flags for other I/O subsystems? What's special about networking? We don't because applications can make use of thread priorities to express exactly which tasks should be more important than others. So perhaps the failure here is that RCU (and networking, by implication) doesn't (can't?) take into consideration the calling thread's priority? (And, there may be a cascade of other problems as well, like deferred work pushed to a waitqueue, and thus losing the callers priority, etc) (I will admit that RCU is a black box to me, so it is entirely possible it's already capable of this, or it's fundamentally impossible, or somewhere in between :) > > > Then RT oriented kernel .config files would have CONFIG_SYNC_NET_DEFAULT > > > set to 1, while upstream would have this default to 0. > > > > > > RT oriented kernel users could try using this in some scenarios where > > > networking is not the critical path. > > > > Well, if synchronize_rcu_expedited() is such a problem on RT, then maybe > > a generic solution would make synchronize_rcu_expedited() to fallback > > synchronize_rcu() after boot time on RT. > > > > Not sure why networking use of synchronize_rcu_expedited() would be > > problematic, and not the others. > > From what I can see, their testing just happened to run into this one. > Perhaps further testing will run into others, or perhaps the others are > off in code paths that should not be exercised while running RT apps. I accidentally ran into this issue when I was doing testing with an ethernet cable w/ a broken RJ-45 connector (without the tab, that I was just too lazy to replace), and I kept accidentally knocking it out. :) Regardless, industrial automation environments aren't known for having the most stable network environments; there may be deployed systems doing high priority motion control tasks, we'd want to ensure that the poor network technician sent in to repair a defective network switch wouldn't end up being mangled. > > scripts/checkpatch.pl has this comment about this : Also, Documentation/RCU/checklist.txt mentions: Use of the expedited primitives should be restricted to rare configuration-change operations that would not normally be undertaken while a real-time.. I think it could have been argued at the time, that operations under rtnl_lock() were "configuration-change" operations. However, for our use cases, it's not, as link changes are external events beyond control. Thanks again, Josh -- To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html