On 10/14, Linus Torvalds wrote:
>
> Yeah. Basically, we want to make sure that it has been called *since it
> was scheduled*. In case it has already been called and is no longer
> pending at all, not calling it again is fine.
>
> It's just that we didn't have any way to do that "force the pending
> delayed work to be scheduled", so instead we ran the scheduled function by
> hand synchronously. Which then seems to have triggered other problems.

Yes. But I am not sure these problems are new. I don't understand this
code well, but from a quick grep it looks to me like flush_to_ldisc()
could already race with itself even without tty_flush_to_ldisc(), which
calls work->func() by hand.

> So I don't think it really matters in practice, but I do think that we
> have that nasty hole in workqueues in general with overlapping work. I
> wish I could think of a way to fix it.

I don't entirely agree this is a hole; everything works "as expected".
But yes, I agree, users often do not realize that multithreaded
workqueues imply overlapping works (unless the caller takes care), and
in that case work->func() has to handle the races itself.

Perhaps it makes sense to introduce something like

	/* same as queue_work(), but ensures work->func() can't race with itself */
	int queue_work_xxx(struct workqueue_struct *wq, struct work_struct *work)
	{
		int ret = 0;

		if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) {
			struct cpu_workqueue_struct *cwq = get_wq_data(work);
			int cpu = get_cpu();

			/*
			 * The "cwq->current_work != work" check is not strictly
			 * needed, but we don't want to pin this work to a
			 * single CPU.
			 */
			if (!cwq || cwq->current_work != work)
				cwq = wq_per_cpu(wq, cpu);

			__queue_work(cwq, work);
			put_cpu();
			ret = 1;
		}

		return ret;
	}

This way we can never have multiple instances of the same work running
on different CPUs: if the previous instance is still running, the new
one is queued behind it on the same CPU and thus serialized. Assuming,
of course, the caller never mixes queue_work_xxx() with queue_work().
The logic for queue_delayed_work_xxx() is similar.

But this can race with cpu_down(). I think this is solvable but needs
more locking. I mean, the caller of queue_work_xxx() must not use the
old get_wq_data(work) if that CPU is already dead, and a simple
cpu_online() check is not enough: we can race with
workqueue_cpu_callback(CPU_POST_DEAD) flushing this cwq, in which case
we should carefully insert this work into the almost-dead queue.

Or, perhaps better, instead of a new helper we can probably use the
free bit in work_struct->data to mark this work/dwork as a
"single-instance work". In this case __queue_work() and
queue_delayed_work_on() should check this bit.

Do you think this makes sense and can close the hole? If yes, I'll try
to do this over the weekend.

Oleg.
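
P.S. To make the "overlapping works" point concrete: with a
multithreaded workqueue nothing stops two CPUs from running the same
work->func() at the same time today, so a non-reentrant handler has to
serialize itself. A minimal illustration of what I mean by "work->func()
has to handle the races itself" (my_ctx and my_work_fn are made-up
names, purely for illustration):

	#include <linux/mutex.h>
	#include <linux/workqueue.h>

	struct my_ctx {
		struct mutex	lock;
		struct work_struct work;
	};

	static void my_work_fn(struct work_struct *work)
	{
		struct my_ctx *ctx = container_of(work, struct my_ctx, work);

		/*
		 * A second instance of this handler can run concurrently
		 * on another CPU, so take our own lock to serialize.
		 */
		mutex_lock(&ctx->lock);
		/* ... the actual, non-reentrant processing ... */
		mutex_unlock(&ctx->lock);
	}

With queue_work_xxx() above, the mutex would become unnecessary.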
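
P.P.S. And a very rough, untested sketch of the "free bit" variant, to
show what I have in mind. WORK_STRUCT_SINGLE is a made-up name for the
currently unused flag bit (WORK_STRUCT_FLAG_MASK reserves two bits but
only WORK_STRUCT_PENDING is used, and get_wq_data() masks both bits off,
so the extra bit does not corrupt the saved cwq pointer). This would
live inside kernel/workqueue.c, which already has cpu_workqueue_struct,
get_wq_data(), wq_per_cpu() and __queue_work():

	#define WORK_STRUCT_SINGLE	1	/* made-up: the unused flag bit */

	/* the caller sets this once, before the first queue_work() */
	static inline void work_set_single(struct work_struct *work)
	{
		set_bit(WORK_STRUCT_SINGLE, work_data_bits(work));
	}

	/* then queue_work() could do something like: */
	int queue_work(struct workqueue_struct *wq, struct work_struct *work)
	{
		int ret = 0;

		if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) {
			struct cpu_workqueue_struct *cwq = NULL;
			int cpu = get_cpu();

			if (test_bit(WORK_STRUCT_SINGLE, work_data_bits(work))) {
				struct cpu_workqueue_struct *old = get_wq_data(work);

				/* previous instance still running? queue behind it */
				if (old && old->current_work == work)
					cwq = old;
			}
			if (!cwq)
				cwq = wq_per_cpu(wq, cpu);

			__queue_work(cwq, work);
			put_cpu();
			ret = 1;
		}
		return ret;
	}

Modulo the cpu_down() race discussed above, of course.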