Re: [PATCH] Bluetooth: Fix hidp disconnect deadlock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thursday, June 30, 2011 10:41:51 AM Peter Hurley did opine:

> On Wed, 2011-06-29 at 16:52 -0400, Gustavo F. Padovan wrote:
> > * Gustavo F. Padovan <padovan@xxxxxxxxxxxxxx> [2011-06-29 17:24:56 
-0300]:
> > > * Ilia, Kolominsky <iliak@xxxxxx> [2011-06-26 09:16:58 +0200]:
> > > > Hi!
> > > > IMHO the fix isnt good due to possible race condition which
> > > > will destroy session/task objects - either by a call to
> > > > kthread_stop from the timer func or reentry to
> > > > hidp_del_connection() on smp platforms.
> 
> ....
> 
> > This should fix the timer issue. Please test.
> > 
> > 	Gustavo
> 
> Hi Ilia & Gustavo,
> 
> After Ilia pointed out the problem with the timer function, I went back
> and reviewed *all* the synchronization code relevant to the hid session
> thread.
> 
> A number of problems were introduced with commit aabf6f89 - when the
> session thread was converted from a kernel_thread to a kthread. Although
> a kthread is a better choice for representing the session thread, the
> naive conversion of atomic/wakeup to kthread_stop() was inappropriate.
> 
> kthread_stop() has usage semantics that different significantly from
> atomic/wakeup. As we already know, because kthread_stop() blocks on
> thread completion, it can introduce deadlocks in code that already uses
> exclusion mechanisms. Even with Ilia's new patch, consider the following
> sequence:
> 
> Thread 0                 Thread 1                 Thread 2
> in hidp_del_connection                            in hidp session
>   claim r/w sem                                     .
>                          timer triggers             .
>                            kthread_stop() --------->.
>                            *blocks on thread 2*     exits loop
>                                                     *blocks for r/w sem*
>   in hidp_del_timer
>     del_timer_sync()
>       *blocks on thread 1*
> 
> Deadlock occurs because:
> + thread 0 holds reader lock but is waiting for the timer function on
> thread 1 to finish
> + thread 1 has stopped kthread and is waiting for thread completion
> + thread 2 (aka kthread) is waiting to claim writer lock held by thread
> 0
> 
> In addition to the deadlocks and races, kthread_stop() is being called
> by hidp_process_hid_control() which is *in the session thread context* -
> kthread_stop() cannot be called on itself!
> 
> I've been testing patch v2 since Monday which continues to use kthread
> but reverts back to the old behavior of atomic/wakeup. It also fixes the
> potential for a lost wakeup. Of course, "testing" means running it and
> looking at it carefully - as someone on IRC pointed out, kthread really
> needs to get instrumented with lockdep.
> 
> Regards,
> Peter Hurley

Peter;  You may have hit on a problem I'm seeing here also, to wit:
I can reboot, start bluetoothd by hand, and using blueman-manager make a 
connection that works well despite the signal being reported as a bit less 
than perfect since the two devices are a couple of walls and 20 feet apart.  
It will work for 12 to 18 hours, then die 'deadlocked' and cannot be 
restored without a reboot.

Seems to me we at least have similar ducks, they both waddle.  ;-)

Cheers, gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
When God endowed human beings with brains, He did not intend to guarantee 
them.
--
To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Bluez Devel]     [Linux Wireless Networking]     [Linux Wireless Personal Area Networking]     [Linux ATH6KL]     [Linux USB Devel]     [Linux Media Drivers]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux