On Wed, Dec 24, 2014 at 9:56 PM, Kent Overstreet <kmo@xxxxxxxxxxxxx> wrote:
> On Mon, Dec 22, 2014 at 07:16:25PM -0500, Chris Mason wrote:
>> The 3.19 merge window brought in a great new warning to catch someone
>> calling might_sleep with their state != TASK_RUNNING. The idea was to
>> find buggy code locking mutexes after calling prepare_to_wait(), kind
>> of like this:
>
> Ben just told me about this issue.
>
> IMO, the way the code is structured now is correct, I would argue the
> problem is with the way wait_event() works - the way they have to mess
> with the global-ish task state when adding a wait_queue_t to a
> wait_queue_head (who came up with these names?)

Grin, probably related to the guy who made closure_wait() not actually
wait.
The advantage of the wait_queue_head_t setup is that it's a very well
understood mechanism for sleeping on something without missing wakeups.
The locking overhead for the waitqueues can be a problem for lots of
waiters on the same queue, but otherwise the overhead is low.
I think closures are too big a hammer for this problem, unless
benchmarks show we need the lockless lists (I really like that part).
I do hesitate to make big changes here because debugging AIO hangs is
horrible. The code is only tested by a few workloads, and we can go a
long time before problems are noticed. When people do hit bugs, we
only notice the ones where applications pile up in getevents.
Otherwise it's just strange performance changes that we can't explain
because they are hidden in the app's AIO state machine.
When I first looked at the warning, I didn't realize that might_sleep
and friends were setting a preempted flag to make sure the task wasn't
removed from the runqueue. So I thought we'd potentially sleep forever
(thanks Peter for details++). The real risk here is burning CPU in the
running state, potentially a lot of it if the mutex is highly
contended. We've probably been hitting this for a while, but since we
test AIO performance with fast storage, the burning just made us look
faster.
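
To make the CPU burning concrete, here is a toy userspace model in C. It
is not kernel code: every toy_* name is made up for illustration, and it
assumes that a contended mutex_lock() always returns with the task back
in TASK_RUNNING, so the state set by prepare_to_wait() is lost and the
following schedule() comes straight back instead of sleeping.

```c
#include <stdbool.h>

/* Toy task states; named after the kernel's, but not the kernel's definitions. */
enum toy_state { TOY_TASK_RUNNING, TOY_TASK_INTERRUPTIBLE };

struct toy_task {
	enum toy_state state;
	unsigned long slept;	/* passes where toy_schedule() really blocked */
	unsigned long spun;	/* passes where toy_schedule() returned at once */
};

/* Stand-in for prepare_to_wait(): mark the task as about to sleep. */
static void toy_prepare_to_wait(struct toy_task *t)
{
	t->state = TOY_TASK_INTERRUPTIBLE;
}

/*
 * Stand-in for mutex_lock()/mutex_unlock() inside the wait condition.
 * Assumption: a contended lock sleeps and comes back in the running
 * state; an uncontended one leaves the task state alone.
 */
static void toy_mutex_lock_unlock(struct toy_task *t, bool contended)
{
	if (contended)
		t->state = TOY_TASK_RUNNING;
}

/*
 * Stand-in for schedule(): only a task that is not in the running state
 * is taken off the runqueue. A later wakeup is modeled by flipping the
 * state back to running.
 */
static void toy_schedule(struct toy_task *t)
{
	if (t->state == TOY_TASK_RUNNING) {
		t->spun++;
	} else {
		t->slept++;
		t->state = TOY_TASK_RUNNING;
	}
}

/* Run the wait loop for 'passes' iterations while the condition stays false. */
static void toy_wait_loop(struct toy_task *t, unsigned int passes, bool contended)
{
	unsigned int i;

	for (i = 0; i < passes; i++) {
		toy_prepare_to_wait(t);
		toy_mutex_lock_unlock(t, contended);	/* condition check takes a mutex */
		toy_schedule(t);			/* condition was false, try to sleep */
	}
}
```

In this model, calling toy_wait_loop() with contended set to true leaves
slept at zero and counts every pass in spun, which is the busy loop
described above. With contended set to false every pass sleeps.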
-chris