Hi Marcel, On Tue, Jun 25, 2013 at 6:33 AM, Marcel Holtmann <marcel@xxxxxxxxxxxx> wrote: >> I enabled dynamic debug on this kernel. Again, I'm running 3.8.11. I >> added a debug line to check HCI_QUIRK_RESET_ON_CLOSE and >> "test_bit(HCI_QUIRK_RESET_ON_CLOSE, &hdev->quirks)" evaluates to 0; so >> the reset is sent during device up. Nevertheless, the bug is because a >> combination of both: >> >> The timelime for the bug is the following. >> 5. hci_dev_do_close clears the cmd, rx and tx queues in the kernel, >> and sets the cmd_cnt to 1, even if the current command >> (HCI_Write_Class_of_Device in this case) did not finish. From this >> point we have an inconsistent state, since cmd_cnt is 1, but we are >> not allowed to send a command to the adapter. > > I think this forcefully setting cmd_cnt to 1 is actually a problem when we have pending commands. Yes, but the problem is that if the driver powers down the controller on close or the command complete event is in the rx_work queue (that we just flushed) then we need to set this cmd_cnt to 1. > You can trigger this via ioctl and also rfkill since both go via hci_dev_do_close to shutdown the handling of the device. > > We need to wait for the command timeout of the pending command. While we could pretend that the ioctl is deprecated and should not be used, but rfkill clearly isn't. And honestly hciconfig hci0 up/down is way to useful for low-level testing that it might not go away any time soon. > >> I agree with Marcel that the right fix should be to wait until we >> receive the completion event from the adapter. I'm certainly not very >> familiar with this code, but I'll try to hack it =) Suggestions are >> welcomed! > > Check if you have the patch that Johan mentioned in your tree. That should at least fix the problem when using mgmt. It is a bit funky, but at least mgmt power off will be blocked until you mgmt power on finished and thus should make this race a lot harder trigger. I'm in 3.8.11 and I don't have that patch... and as I can see from the log in mgmt.c it is not going to apply... there are already several changes from this kernel to that point. But anyway... that won't fix the problem. The scenario is "send a command, btmgmt power off, btmgmt power on". The fact that the command was sent because a previous power on is not relevant, and bluetoothd normally sends more commands when it sees an interface going up that aren't covered by this lock (like Write EIR, Write Class of Device, Write Local Name)... right? I'll try to test that on a linux desktop with a recent compat-drivers and an usb adapter. Alex. -- To unsubscribe from this list: send the line "unsubscribe linux-bluetooth" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html