Re: [Bug #14255] WARNING: at drivers/char/tty_io.c:1267

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Thu, 1 Oct 2009, Rafael J. Wysocki wrote:
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=14255
> Subject		: WARNING: at drivers/char/tty_io.c:1267
> Submitter	: Heinz Diehl <htd@xxxxxxxxxxxxxxxxx>
> Date		: 2009-09-20 11:37 (12 days old)
> References	: http://marc.info/?l=linux-kernel&m=125344629506309&w=4
> 		  http://lkml.org/lkml/2009/9/8/393
> Handled-By	: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

So the real problem here is that horrible workqueue deadlock, but it turns 
out that I think that we should be able to safely do a

	cancel_delayed_work_sync(&tty->buf.work);

in tty_ldisc_halt(), because cancel_delayed_work_sync() should never wait 
for any other work than the exact work in question. And the buf.work thing 
is flush_to_ldisc(), so waiting for _that_ is safe - the problematic thing 
was always waiting for the (unrelated) tty->hangup_work, which can (and 
does) take the semaphore for do_tty_hangup.

So doing that synchronous version of the delayed work cancel means that we 
can now rest easy after tty_ldisc_halt(), and we don't need to worry about 
buf.work still being pending. We _do_ in general need to worry about 
hangup_work, which will call do_tty_hangup, which will call 
tty_ldisc_hangup, but that's actually the routine we are in right now, so 
for the _very_ special case of tty_ldisc_hangup that is a non-issue too.

Did that sound subtle to you? 

It should.

It's subtle as hell, and I don't like it, but I think that the two subtle 
rules above means that the following two-liner patch is safe - it can't 
cause any new deadlocks, and getting rid of a the flush_scheduled_work is 
safe in this one case.

So please give it a whirl. I'm not happy about the subtlety, but I also 
hope that we'll get rid of that in the long run, so as a short-term hack 
this looks acceptable.

To recap:

 - tty_ldisc_halt() _can_ be called under the ldisc_mutex, because while 
   it waits for the work, it never waits for _other_ work, and buf.work 
   itself doesn't need the ldisc_mutex. So no deadlock.

 - The flush_scheduled_work() after tty_ldisc_halt() is normally needed to 
   not just flush the buf.work (which is now done by tty_ldisc_halt() 
   itself), but to also make sure that there isn't any hangup work
   pending.

   So we can't remove that in general, and the other cases will still need 
   to flush all scheduled work (and worry about deadlocks with 
   ldisc_mutex). HOWEVER, in the special case of tty_ldisc_hangup() we 
   know that we are inside the hangup work, and thus don't need to wait 
   for ourselves, so we can just get rid of it there - just nowhere else.

 - The other cases of dropping the ldisc_mutex in the middle are 
   long-standing, and have that TTY_LDISC_CHANGING vs TTY_HUPPED hackery 
   to take care of the races that it opens. I'd love to get rid of that 
   too, but they all seem to work. And they have never apparently 
   triggered the WARN_ON in this bugzilla.

I'm not proud of this patch, and I'm not signing off on it until the 
people who have seen this warning have tried it and report that it seems 
to work..

		Linus

---
 drivers/char/tty_ldisc.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/char/tty_ldisc.c b/drivers/char/tty_ldisc.c
index aafdbae..feb5507 100644
--- a/drivers/char/tty_ldisc.c
+++ b/drivers/char/tty_ldisc.c
@@ -518,7 +518,7 @@ static void tty_ldisc_restore(struct tty_struct *tty, struct tty_ldisc *old)
 static int tty_ldisc_halt(struct tty_struct *tty)
 {
 	clear_bit(TTY_LDISC, &tty->flags);
-	return cancel_delayed_work(&tty->buf.work);
+	return cancel_delayed_work_sync(&tty->buf.work);
 }
 
 /**
@@ -756,12 +756,9 @@ void tty_ldisc_hangup(struct tty_struct *tty)
 	 * N_TTY.
 	 */
 	if (tty->driver->flags & TTY_DRIVER_RESET_TERMIOS) {
-		/* Make sure the old ldisc is quiescent */
-		tty_ldisc_halt(tty);
-		flush_scheduled_work();
-
 		/* Avoid racing set_ldisc or tty_ldisc_release */
 		mutex_lock(&tty->ldisc_mutex);
+		tty_ldisc_halt(tty);
 		if (tty->ldisc) {	/* Not yet closed */
 			/* Switch back to N_TTY */
 			tty_ldisc_reinit(tty);
--
To unsubscribe from this list: send the line "unsubscribe kernel-testers" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux