Hi Sonny, On Fri, Aug 21, 2020 at 11:36 AM Sonny Sasaka <sonnysasaka@xxxxxxxxxxxx> wrote: > > When device_attach_att is invoked, it may be that the old att is still > connected (the disconnection hasn't been detected). > > This patch calls the disconnection callback before creating the new att > since the disconnection callback will never be invoked after the new att > is created. The disconnection callback also cleans up the att of the > device object before assigning a new one. This way the old att will not > at later time fire disconnect event which would operate on the already > freed device pointer. > > When there is a HCI LE Disconnection Complete event followed by HCI LE > Connection Complete event and they are very close together, this > sequence could happen because the kernel doesn't guarantee that the > closing of the l2cap socket (due to LE Disconnection Complete event) > always happens earlier than the creation of the new l2cap socket (due to > LE Connection Complete event). This actually sounds like a bug in the kernel, either it should cleanup the existing sockets or not disconnect them at all if there is a new connection in place, otherwise it quite possible that there will be many more races like this. Have you tried doing something similar with BR/EDR, we actually depend on the socket being disconnected to notify all the profiles so then can cleanup properly. > Tested by repeatedly connecting/disconnecting to a device until the > situation happens and checking that bluetoothd doesn't crash. > > Reviewed-by: Shyh-In Hwang <josephsih@xxxxxxxxxxxx> > > --- > src/device.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/src/device.c b/src/device.c > index 7b7808405..bed8f38b5 100644 > --- a/src/device.c > +++ b/src/device.c > @@ -5304,6 +5304,15 @@ bool device_attach_att(struct btd_device *dev, GIOChannel *io) > return false; > } > > + /* This may be reached when the device already has att attached to it. > + * In this case disconnect the att first before assigning the new one, > + * otherwise the old att may fire a disconnect event at later time > + * and will invoke operations on the already freed device pointer. > + * The error code (ECONNRESET) is chosen arbitrarily since the > + * disconnection event (with an error code) is not yet detected. > + */ > + if (dev->attrib || dev->att) > + att_disconnected_cb(ECONNRESET, dev); Another way to resolve this is to cleanup earlier at device_remove_connection since we know at that point ATT will be disconnected. > dev->attrib = attrib; > dev->att = g_attrib_get_att(attrib); > > -- > 2.26.2 > -- Luiz Augusto von Dentz