Hello, Andrew.

On Thu, Mar 08, 2012 at 03:30:48PM -0800, Andrew Morton wrote:
> > The behavior change was primarily to allow long running work items
> > to use regular workqueues without worrying about inducing delay
> > across cpu hotplug operations, which is important as it's also used
> > on suspend / hibernation, especially on mobile platforms.
>
> Well.. why did we want to support these long-running work items?
> They're abusive, aren't they?  Where are they?

The rationale was two-fold.  One was that using kthreads directly is
inefficient and difficult.  We end up with a lot of mostly idle
kthreads lying around, and with the increasing number of cores,
creating them per-cpu becomes problematic.  On certain setups, we were
reaching the task limit during boot, so having an easy-to-use worker
pool mechanism is necessary.  We already had workqueue, so it was
logical to extend wq to support that.

Also, on auditing kthread users, a lot of them were (and still are)
racy around kthread_should_stop() handling.  kthread_stop() just sets
the should-stop flag and wakes up the kthread once, so the kthread
side needs careful synchronization to avoid missing the event.  Many
users simply forget to consider the synchronization requirements.

The other side was that "long-running" isn't obvious at all.  Many
work items are used because they require a sleepable context for
synchronization, and while they usually don't consume a large amount
of time, there are occasions where certain locking takes way longer
through a chain of dependencies.  This was mostly visible through the
system workqueue getting stalled.

> > Another approach would be requiring all workqueues to be drained on
> > cpu offlining and requiring any work item which may stall to use
> > unbound wq.  IMHO, picking out the ones which may stall would be
> > much less obvious than the ones which require cpu pinning.
>
> I'd be surprised if it's *that* hard to find and fix the long-running
> work items.  Hopefully most of them are already using
> create_freezable_workqueue() or create_singlethread_workqueue().
>
> I wonder if there's some debug code we can put in workqueue.c to
> detect when a pinned work item takes "too long".

Yes, we can go either way, but I think it would be easier to weed out
the ones with pinned assumptions.  They usually are much less common,
more obvious, and probably easier to detect automatically (i.e.
trigger a warning from debug_smp_processor_id() if running as an
un-pinned work item).

ISTR there was something already broken about depending on a specific
CPU with workqueue even before cmwq, when using queue_work_on(),
unless the caller explicitly synchronized via a cpu hotplug callback.
Hmmm... what was it...  I think there was no protection against
queueing on the workqueue of a dead CPU: the workqueue was flushed
only once during cpu shutdown, so queue_work_on() or requeueing work
items could still end up queued on a dead CPU's workqueue.

Thanks.

--
tejun
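
To make the kthread_stop() synchronization point above concrete, here
is a minimal sketch of the racy pattern and its fix.  The wait queue,
flag and thread functions are made up for illustration; only
kthread_should_stop(), kthread_stop() semantics and
wait_event_interruptible() are real kernel interfaces, and the key
point is that kthread_stop() wakes the task only once.

#include <linux/kthread.h>
#include <linux/wait.h>
#include <linux/sched.h>
#include <linux/types.h>

/* Hypothetical driver state, for illustration only. */
static DECLARE_WAIT_QUEUE_HEAD(demo_waitq);
static bool demo_work_pending;

/*
 * Racy: kthread_stop() only sets the should-stop flag and wakes the
 * task once.  That wake-up does not make demo_work_pending true, so
 * the thread goes straight back to sleep and kthread_stop() blocks
 * forever.
 */
static int demo_thread_racy(void *data)
{
	while (!kthread_should_stop()) {
		wait_event_interruptible(demo_waitq, demo_work_pending);
		demo_work_pending = false;
		/* ... process work ... */
	}
	return 0;
}

/*
 * Correct: the stop condition is part of the wait condition, so the
 * single wake-up from kthread_stop() cannot be missed.
 */
static int demo_thread(void *data)
{
	while (!kthread_should_stop()) {
		wait_event_interruptible(demo_waitq,
					 demo_work_pending ||
					 kthread_should_stop());
		if (kthread_should_stop())
			break;
		demo_work_pending = false;
		/* ... process work ... */
	}
	return 0;
}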
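
And a rough sketch of the kind of debug check suggested above.  None
of this exists in workqueue.c today; current_is_unbound_work() is a
hypothetical helper the workqueue code would have to provide (e.g. by
checking whether current is a worker executing a work item that was
not queued to a specific CPU), and debug_smp_processor_id() would call
the hook before returning the CPU number.

#include <linux/bug.h>
#include <linux/types.h>

/* Hypothetical helper; would have to be implemented in workqueue.c. */
extern bool current_is_unbound_work(void);

/*
 * Called from debug_smp_processor_id(): a work item that is not
 * pinned to a CPU may run on, or migrate to, any CPU, so relying on
 * the raw CPU number from it is almost certainly a bug.
 */
static inline void warn_on_unpinned_cpu_use(void)
{
	WARN_ONCE(current_is_unbound_work(),
		  "raw CPU number used from an un-pinned work item\n");
}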