Re: [PATCH v2] Bluetooth: qca: Fix BT enable failure again for QCA6390 after warm reboot

Hi,

On Tue, May 21, 2024 at 10:52 AM Lk Sii <lk_sii@xxxxxxx> wrote:
>
>
>
> On 2024/5/16 23:55, Luiz Augusto von Dentz wrote:
> > Hi,
> >
> > On Thu, May 16, 2024 at 10:57 AM Lk Sii <lk_sii@xxxxxxx> wrote:
> >>
> >>
> >>
> >> On 2024/5/16 21:31, Zijun Hu wrote:
> >>> Commit 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed
> >>> serdev") will cause below regression issue:
> >>>
> >>> BT can't be enabled after below steps:
> >>> cold boot -> enable BT -> disable BT -> warm reboot -> BT enable failure
> >>> if property enable-gpios is not configured within DT|ACPI for QCA6390.
> >>>
> >>> The commit fixes a use-after-free issue within qca_serdev_shutdown()
> >>> by adding a condition that prevents the serdev from being flushed or
> >>> written after it is closed, but it also introduces this regression for
> >>> the above steps, since the VSC is not sent to reset the controller
> >>> during warm reboot.
> >>>
> >>> Fixed by sending the VSC to reset the controller within
> >>> qca_serdev_shutdown() once BT has ever been enabled. The use-after-free
> >>> issue is also fixed by this change, since the serdev is still open
> >>> when it is flushed or written.
> >>>
> >>> Verified by the reported machine Dell XPS 13 9310 laptop over below two
> >>> kernel commits:
> >>> commit e00fc2700a3f ("Bluetooth: btusb: Fix triggering coredump
> >>> implementation for QCA") of bluetooth-next tree.
> >>> commit b23d98d46d28 ("Bluetooth: btusb: Fix triggering coredump
> >>> implementation for QCA") of linus mainline tree.
> >>>
> >>> Fixes: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
> >>> Cc: stable@xxxxxxxxxxxxxxx
> >>> Reported-by: Wren Turkal <wt@xxxxxxxxxxxxxxxx>
> >>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218726
> >>> Signed-off-by: Zijun Hu <quic_zijuhu@xxxxxxxxxxx>
> >>> Tested-by: Wren Turkal <wt@xxxxxxxxxxxxxxxx>
> >>> ---
> >>> V1 -> V2: Add comments and more commit messages
> >>>
> >>> V1 discussion link:
> >>> https://lore.kernel.org/linux-bluetooth/d553edef-c1a4-4d52-a892-715549d31ebe@xxxxxxx/T/#t
> >>>
> >>>  drivers/bluetooth/hci_qca.c | 18 +++++++++++++++---
> >>>  1 file changed, 15 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> >>> index 0c9c9ee56592..9a0bc86f9aac 100644
> >>> --- a/drivers/bluetooth/hci_qca.c
> >>> +++ b/drivers/bluetooth/hci_qca.c
> >>> @@ -2450,15 +2450,27 @@ static void qca_serdev_shutdown(struct device *dev)
> >>>       struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
> >>>       struct hci_uart *hu = &qcadev->serdev_hu;
> >>>       struct hci_dev *hdev = hu->hdev;
> >>> -     struct qca_data *qca = hu->priv;
> >>>       const u8 ibs_wake_cmd[] = { 0xFD };
> >>>       const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 };
> >>>
> >>>       if (qcadev->btsoc_type == QCA_QCA6390) {
> >>> -             if (test_bit(QCA_BT_OFF, &qca->flags) ||
> >>> -                 !test_bit(HCI_RUNNING, &hdev->flags))
> >>> +             /* The purpose of sending the VSC is to reset the SoC into an
> >>> +              * initial state that ensures the next hdev->setup() succeeds.
> >>> +              * If HCI_QUIRK_NON_PERSISTENT_SETUP is set, hdev->setup() can
> >>> +              * do its job regardless of SoC state, so there is no need to
> >>> +              * send the VSC.
> >>> +              * If HCI_SETUP is set, hdev->setup() was never invoked and the
> >>> +              * SoC is already in the initial state, so there is no need to
> >>> +              * send the VSC either.
> >>> +              */
> >>> +             if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks) ||
> >>> +                 hci_dev_test_flag(hdev, HCI_SETUP))
> >>>                       return;
> The main purpose of the above check is NOT to make sure the serdev is in
> the open state, as its comments explain.
> >>>
> >>> +             /* The serdev must be in the open state when control logic
> >>> +              * arrives here, which also fixes the use-after-free issue
> >>> +              * caused by flushing or writing the serdev after it is closed.
> >>> +              */
> >>>               serdev_device_write_flush(serdev);
> >>>               ret = serdev_device_write_buf(serdev, ibs_wake_cmd,
> >>>                                             sizeof(ibs_wake_cmd));
> >> I believe Zijun's change fixes both issues below and doesn't
> >> introduce a new issue.
> >>
> >> regression issue A:  BT enable failure after warm reboot.
> >> issue B:  use-after-free issue, namely, kernel crash.
> >>
> >>
> >> For issue B, I have more findings related to the commits below, ordered by time.
> >>
> >> Commit A: 7e7bbddd029b ("Bluetooth: hci_qca: Fix qca6390 enable failure
> >> after warm reboot")
> >>
> >> Commit B: de8892df72be ("Bluetooth: hci_serdev: Close UART port if
> >> NON_PERSISTENT_SETUP is set")
> >> This commit introduces issue B. It is also not suitable to associate
> >> protocol state with the state of a lower-level transport type such as
> >> serdev or uart. In my opinion, protocol state should be independent of
> >> transport type state. The flag HCI_UART_PROTO_READY is for protocol
> >> state: it indicates whether protocol hu->proto is initialized and
> >> whether we can invoke its interfaces. It is common to the various kinds
> >> of transport types. Perhaps this is the reason why Zijun's change
> >> doesn't use flag HCI_UART_PROTO_READY.
> >
> > Don't really follow you here, if HCI_UART_PROTO_READY indicates the
> > protocol state then it is even _more_ important to use before invoking
> > serdev APIs, so checking for the quirk sounds like a problem because:
> >
> > [1] hci_uart_close
> >      /* When QUIRK HCI_QUIRK_NON_PERSISTENT_SETUP is set by driver,
> >      * BT SOC is completely powered OFF during BT OFF, holding port
> >      * open may drain the battery.
> >      */
> >     if (test_bit(HCI_QUIRK_NON_PERSISTENT_SETUP, &hdev->quirks)) {
> >         clear_bit(HCI_UART_PROTO_READY, &hu->flags);
> >         serdev_device_close(hu->serdev);
> >     }
> >
> > [2] hci_uart_unregister_device
> >     if (test_bit(HCI_UART_PROTO_READY, &hu->flags)) {
> >         clear_bit(HCI_UART_PROTO_READY, &hu->flags);
> >         serdev_device_close(hu->serdev);
> >     }
> >
> Both case 1 and case 2 were introduced by Commit B in question, which
> uses protocol state flag HCI_UART_PROTO_READY to track lower-level
> transport type state; I don't think it is perfect.
>
> For the common files hci_serdev.c and hci_ldisc.c, as you saw, the
> purpose of checking HCI_UART_PROTO_READY is to call protocol-relevant
> interfaces; moreover, these protocol-relevant interfaces do not deal
> with lower-level transport state. You may even notice the existing
> function below, within which the lower-level serdev is flushed before
> HCI_UART_PROTO_READY is checked:
>
> static int hci_uart_flush(struct hci_dev *hdev)
> {
> ......
>         /* Flush any pending characters in the driver and discipline. */
>         serdev_device_write_flush(hu->serdev);
>
>         if (test_bit(HCI_UART_PROTO_READY, &hu->flags))
>                 hu->proto->flush(hu);
>
>         return 0;
> }
>
> in my opinion, that is why qca_serdev_shutdown() does not check
> HCI_UART_PROTO_READY for later lower level serdev operations.
> > So only in case 1 checking the quirk is equivalent to
> > HCI_UART_PROTO_READY on case 2 it does actually check the quirk and
> > will proceed to call serdev_device_close, now perhaps the code is
> > assuming that shutdown won't be called after that, but it looks it
> > does since:
> >
> qca_serdev_shutdown() will never be called after case 2, as explained
> at the end.
> > static void serdev_drv_remove(struct device *dev)
> > {
> >     const struct serdev_device_driver *sdrv =
> > to_serdev_device_driver(dev->driver);
> >     if (sdrv->remove)
> >         sdrv->remove(to_serdev_device(dev));
> >
> >     dev_pm_domain_detach(dev, true);
> > }
> >
> > dev_pm_domain_detach says it will power off so I assume that means
> > that shutdown will be called _after_ remove, so I'm not really
> > convinced that we can avoid using HCI_UART_PROTO_READY, in fact the
> > following sequence might always be triggering:
> >
> dev_pm_domain_detach() should be unrelated to qca_serdev_shutdown() and
> should not trigger a call of qca_serdev_shutdown(), as explained at the end.
> > serdev_drv_remove -> qca_serdev_remove -> hci_uart_unregister_device
> > -> serdev_device_close -> qca_close -> kfree(qca)
> > dev_pm_domain_detach -> ??? -> qca_serdev_shutdown
> >
> > If this sequence is correct then qca_serdev_shutdown accessing
> > qca_data will always result in a UAF problem.
> >
> The above sequence should not be correct, as explained below.
>
> serdev and its driver should also follow the generic device and driver
> design below.
>
> 1)
> driver->shutdown() is called at shut-down time; at that point
> driver->remove() should not have been called.
>
> 2)
> driver->shutdown() cannot be called once driver->remove() has been
> called.
>
> 3) For serdev, driver->remove() does not trigger a call of
> driver->shutdown(), since PM-related poweroff is unrelated to
> driver->shutdown(), and I also don't find any PM-related interface that
> calls driver->shutdown().
>
> I would like to explain issue B based on the comments Zijun posted
> publicly, as below:
>
> Issue B actually happens during reboot. Consider these steps:
> boot -> enable BT -> disable BT -> reboot.
>
> 1) Step boot calls driver->probe() to register hdev, and the serdev
> is open after boot.
>
> 2) Step enable calls hdev->open(), and the serdev is still in the open
> state.
>
> 3) Step disable calls hdev->close(), and the serdev is closed after
> hdev->close() on a machine whose config results in
> HCI_QUIRK_NON_PERSISTENT_SETUP being set.
>
> 4) Step reboot calls qca_serdev_shutdown(), which flushes and writes
> the serdev that was closed by step disable, causing the UAF issue,
> namely the kernel crash.
>
> So this issue is caused by Commit B, which closes the serdev during
> hdev->close().
>
> driver->remove() is not even triggered during the above steps.
> >> Commit C: 272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on
> >> closed serdev")
> >> This commit is meant to fix issue B, which is actually caused by
> >> Commit B, but it has a Fixes tag for Commit A. It also introduces the
> >> regression issue A.
> >>
> >
> >
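
For readers following the quoted boot -> enable BT -> disable BT -> reboot
walkthrough, here is a minimal userspace sketch of the serdev open/closed
state across those steps. It is only a model of the description in the quoted
reply, not kernel code: struct model_dev and every model_* function are
hypothetical names introduced for illustration, and the only behavior encoded
is what the thread states (probe leaves the serdev open, hdev->close() closes
it only when HCI_QUIRK_NON_PERSISTENT_SETUP is set).

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	bool non_persistent_setup; /* stands in for HCI_QUIRK_NON_PERSISTENT_SETUP */
	bool serdev_open;          /* is the lower-level transport open? */
};

/* boot: driver->probe() registers hdev and leaves the serdev open */
void model_probe(struct model_dev *d, bool quirk)
{
	d->non_persistent_setup = quirk;
	d->serdev_open = true;
}

/* enable BT: hdev->open(); the serdev state is unchanged in this model */
void model_enable(struct model_dev *d)
{
	(void)d;
}

/* disable BT: hdev->close(); the serdev is closed only when the quirk is set */
void model_disable(struct model_dev *d)
{
	if (d->non_persistent_setup)
		d->serdev_open = false;
}

/* reboot: would an unguarded shutdown flush/write a closed serdev? */
bool model_shutdown_touches_closed_serdev(const struct model_dev *d)
{
	return !d->serdev_open;
}
```

Running probe, enable, disable in that order with the quirk set leaves the
modeled serdev closed, which is the state in which an unguarded
qca_serdev_shutdown() is described as hitting issue B; without the quirk the
modeled serdev stays open.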

Reading again the commit message for the UAF fix it sounds like a
different problem:

    The driver shutdown callback (which sends EDL_SOC_RESET to the device
    over serdev) should not be invoked when HCI device is not open (e.g. if
    hci_dev_open_sync() failed), because the serdev and its TTY are not open
    either.  Also skip this step if device is powered off
    (qca_power_shutdown()).

So if hci_dev_open_sync has failed it says serdev and its TTY will not
be open either, so I guess that's why HCI_SETUP was added as a
condition to bail out? So it seems correct to do that although I'd
change the comments.
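
To make the two bail-out conditions easier to compare, here is a small
userspace sketch of the check the patch removes and the check it adds. The
struct and function names are hypothetical and exist only for illustration;
the booleans stand in for the QCA_BT_OFF, HCI_RUNNING,
HCI_QUIRK_NON_PERSISTENT_SETUP and HCI_SETUP tests visible in the diff. It
assumes, as the regression report implies, that HCI_RUNNING is clear once BT
has been disabled.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state that qca_serdev_shutdown() inspects. */
struct shutdown_state {
	bool bt_off;               /* QCA_BT_OFF in qca->flags */
	bool hci_running;          /* HCI_RUNNING in hdev->flags */
	bool non_persistent_setup; /* HCI_QUIRK_NON_PERSISTENT_SETUP in hdev->quirks */
	bool hci_setup;            /* HCI_SETUP dev flag: hdev->setup() never ran */
};

/* Check removed by the patch: bail out if BT is off or hdev is not running. */
bool old_sends_vsc(const struct shutdown_state *s)
{
	return !(s->bt_off || !s->hci_running);
}

/* Check added by the v2 patch: bail out if the quirk or HCI_SETUP is set. */
bool v2_sends_vsc(const struct shutdown_state *s)
{
	return !(s->non_persistent_setup || s->hci_setup);
}
```

In the reported scenario (BT enabled once, then disabled, no quirk), the old
predicate skips the reset VSC because hdev is no longer running, while the v2
predicate sends it; with the quirk set, or with HCI_SETUP still set, the v2
predicate skips the VSC.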

@Krzysztof Kozlowski do you still have a test setup for 272970be3dab
("Bluetooth: hci_qca: Fix driver shutdown on closed serdev"), can you
try with these changes?

-- 
Luiz Augusto von Dentz




