Re: Multiple peers with bluetooth_6lowpan

Hi Josua,

On Thu, Jan 10, 2019 at 1:49 PM Josua Mayer <josua.mayer97@xxxxxxxxx> wrote:
>
> Good day once more,
>
> I have now identified a chain of calls leading up to the described
> issue, along with a work-around:
>
> The first step was discovering fishy debug messages in dmesg:
> [  235.505517] Connecting to first module while both are powered
> [  237.394085] ifindex 5 peer bdaddr 00:b1:fc:8c:6e:47 type 1 my addr a4:d5:78:11:cf:6f type 1
> [  241.872089] dest IP fe80::b1:fcff:fe8c:6e47
> [  241.872104] peers 1 addr fe80::b1:fcff:fe8c:6e47 rt   (null)
> [  241.872124] xmit bt0 to 00:b1:fc:8c:6e:47 type 1 IP 6800::600 chan (ptrval)
> [  242.356387] Connecting to second module
> [  244.073592] dest IP fe80::b1:fcff:fe8c:6e47
> [  244.073603] peers 2 addr fe80::b1:fcff:fe8c:6e47 rt   (null)
> [  244.073607] no such peer
>
> We can see here how the first module is connected, and packets to its
> link-local address are being transmitted.
> Then the second module is connected and the number of peers updates to 2.
> Now we have another packet for the first module's link-local address. We
> know there are 2 peers, but for some reason we get a message saying "no
> such peer"!
>
> Luckily this message was easy to trace:
> See net/bluetooth/6lowpan.c:setup_header
> This message is a direct result of a previous call to peer_lookup_dst
> returning null.
> Now while reviewing peer_lookup_dst, keep in mind that we are looking
> for a difference in behaviour between the case of one peer and the case
> of at least two peers.
> Let me just quote here:
> if (count == 1) {
>         peer = list_first_or_null_rcu(&dev->peers, struct lowpan_peer, list);
>         return peer;
> }
>
> If there is only one peer, no checks are performed at all; it is simply
> assumed that this peer must be the one to receive packets for the given
> address.
> This is the single-peer (one module connected) case, which works
> just fine.
>
> Then follows a curious case that I do not fully understand:
> if no route is known and no gateway was specified in the packet data, the
> function does not even search for the right peer and simply returns NULL:
> if (!rt) {
>         nexthop = &lowpan_cb(skb)->gw;
>
>         if (ipv6_addr_any(nexthop))
>                 return NULL;
> }
> ^^ I believe this decision is wrong.
> There might be neither route nor gateway, if the destination is a peer.
>
> I have come up with the following work-around:
> -               nexthop = &lowpan_cb(skb)->gw;
> -
> -               if (ipv6_addr_any(nexthop))
> -                       return NULL;
> +               if (ipv6_addr_any(&lowpan_cb(skb)->gw)) {
> +                       /* There is neither route nor gateway,
> +                        * probably the destination is a direct peer.
> +                        */
> +                       nexthop = daddr;
> +               } else {
> +                       /* There is a known gateway
> +                        */
> +                       nexthop = &lowpan_cb(skb)->gw;
> +               }
> I am submitting this patch separately as:
> [RFC] bluetooth_6lowpan: search for destination address in all peers
> It is by no means finished; it is meant to illustrate the core issue and
> to allow for a discussion around the control logic and purpose of
> peer_lookup_dst.
> Please comment on whether I have understood the purpose of the
> peer_lookup_dst function.
> I might even suggest removing the special handling of one peer ... .

I like this version better, but the patch you sent apparently matches
only part of the address; I am not sure why you refactored that. If I
recall correctly, peer_lookup_dst exists because we need to resolve the
channel on which to send the packets.
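
To make the point about matching the whole address concrete, here is a
rough user-space sketch of the lookup logic discussed above. It is not
the kernel code: struct in6_addr_model, struct peer_model, addr_is_any
and lookup_peer are hypothetical stand-ins, and chan_id merely
represents the channel that the real function has to resolve. The
nexthop falls back to the destination address when no gateway is set,
and peers are compared on all 16 bytes of the address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-ins for the kernel types. */
struct in6_addr_model {
	unsigned char s6_addr[16];
};

struct peer_model {
	struct in6_addr_model peer_addr; /* link-local address of the peer */
	int chan_id;                     /* stands in for the channel */
};

/* Returns non-zero if the address is all zeroes (the unspecified address). */
static int addr_is_any(const struct in6_addr_model *a)
{
	static const struct in6_addr_model any; /* zero-initialised */

	return memcmp(a, &any, sizeof(any)) == 0;
}

/*
 * Pick the nexthop (gateway if one is set, otherwise the destination
 * itself) and return the peer whose full 128-bit address matches it,
 * or NULL if there is no such peer.
 */
static const struct peer_model *
lookup_peer(const struct peer_model *peers, size_t count,
	    const struct in6_addr_model *daddr,
	    const struct in6_addr_model *gw)
{
	const struct in6_addr_model *nexthop;
	size_t i;

	assert(daddr && gw);

	nexthop = addr_is_any(gw) ? daddr : gw;

	for (i = 0; i < count; i++) {
		/* Compare all 16 bytes, not just a prefix or suffix. */
		if (memcmp(&peers[i].peer_addr, nexthop,
			   sizeof(*nexthop)) == 0)
			return &peers[i];
	}

	return NULL;
}
```

With two peers whose addresses differ only in the last bytes, a lookup
with an empty gateway returns the entry whose address equals the
destination, and a destination that matches neither peer yields NULL.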

> Yours sincerely
> Josua Mayer
>
>
> Am 08.01.19 um 19:57 schrieb Josua Mayer:
> > Greetings everybody,
> >
> > I want to present to you an issue I am having with the 6LoWPAN over BLE
> > facility in the kernel.
> > I have reached the point where I don't know where, what and how to debug
> > the situation and am hoping for some advice here:
> >
> > First an overview of the setup:
> > 1. an SBC with BLE capable Bluetooth chip
> > 2. multiple Nordic nRF52840 modules
> >
> > This is the problem I have observed:
> > 1. One Nordic module is powered - SBC connects to it
> > --> ping6 works flawlessly till the module restarts
> > --> communication with the remote server works as expected
> >
> > 2. Two Nordic modules are powered - SBC connects only to one at a time
> > --> ping6 works flawlessly till the module restarts
> > --> communication with the remote server works as expected
> >
> > 3. Two Nordic modules are powered - SBC connects to both
> > --> ping6 receives no more replies as soon as the second module is connected
> > --> communication to the remote server stops as soon as the second
> > module is connected
> >
> > Test Case:
> > rfkill unblock 0
> > modprobe bluetooth_6lowpan
> > echo -n 'module bluetooth_6lowpan +p' > /sys/kernel/debug/dynamic_debug/control
> >
> > while true; do ping6 -c 1 -I bt0 fe80::b1:fcff:fe8c:6e47 || true; sleep 1; done
> >
> > echo "Connecting to first module while both are powered" > /dev/kmsg
> > echo "connect 00:B1:FC:8C:6E:47 1" > /sys/kernel/debug/bluetooth/6lowpan_control
> > # sit back and watch pings till module restarts
> > echo "Connecting to first module while both are powered" > /dev/kmsg
> > echo "connect 00:B1:FC:8C:6E:47 1" > /sys/kernel/debug/bluetooth/6lowpan_control
> > # wait till first ping goes through
> > echo "Connecting to second module" > /dev/kmsg
> > echo "connect 00:39:D3:29:92:1C 1" > /sys/kernel/debug/bluetooth/6lowpan_control
> > # Expected: ping6 continues to receive replies
> > # Actual result: ping6 times out
> >
> > Please see attached dmesg.log from this test case, with dynamic
> > debugging enabled for module bluetooth_6lowpan.
> >
> > The Nordic modules are programmed to advertise themselves for
> > establishing a connection; then they start communicating with a server
> > on the internet over IPv6. Finally they are rebooted by a watchdog.
> > While a module is connected, it can be pinged at its link-local address,
> > which is derived from its MAC address and is therefore known.
> >
> > As you may have noticed I just wrote "SBC" above.
> > That is because I have done this experiment with 3 different SBCs:
> > 1. SolidRun HummingBoard with i.MX6 uSOM Revision 1.5
> > features Ti WL18MODGB combined WiFi and Bluetooth module
> > - linux-image-4.20.0-trunk-armmp_4.20-1~exp2_armhf.deb
> > (+BT_HCIUART=m, +BT_HCIUART_LL=y, +DYNAMIC_DEBUG=y)
> > ^^ This system was used to produce the attached dmesg.log
> >
> > 2. RaspberryPi 3B
> > 3. RaspberryPi 3B+
> > - rpi-4.15.y (from their github)
> > - rpi-4.16.y (from their github)
> > - rpi-4.17.y (from their github)
> > - rpi-4.18.y (from their github)
> > - rpi-4.19.y (from their github)
> > - rpi-4.20.y (from their github)
> > bcm2709_defconfig
> > zImage modules dtbs -j12
> > gcc-linaro-7.3.1-2018.05-x86_64_arm-linux-gnueabihf
> >
> > rpi-4.14.y suffers from a busy kworker (Workqueue: hci0 hci_rx_work
> > [bluetooth]) making tests difficult.
> >
> > Supposedly back in 4.4.8-v7 on the Raspberry Pi this issue with multiple
> > peers did not exist, while the busy kworker did pop up after some time,
> > requiring a reboot.
> > I have not yet verified or tested with that rather old version.
> > Would it be a good idea to start from that 4.4.8 rpi fork working up to
> > 4.15 to find the place where it broke? I feel like this kind of work is
> > difficult when forks are involved.
> >
> > Are there any components of the kernel in particular that could be verified
> > in order to figure out what is going wrong?
> >
> >
> > Yours sincerely
> > Josua Mayer
> >
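
For reference, the relation between the peer address and the pinged
link-local address in the log (bdaddr 00:b1:fc:8c:6e:47 and
fe80::b1:fcff:fe8c:6e47) can be sketched as below. The helper name
bdaddr_to_link_local is hypothetical, and the mapping is inferred only
from that one address pair in the log: fe80::/64 prefix, then the six
address bytes with ff:fe inserted in the middle and no bit flipped. It
should not be read as a statement of what the kernel does in every
case.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: build a link-local IPv6 address from a 48-bit
 * device address given in display order (most significant byte first).
 * Mapping inferred from the log only: fe80::/64 + b0 b1 b2 ff fe b3 b4 b5.
 */
static void bdaddr_to_link_local(const unsigned char bdaddr[6],
				 unsigned char out[16])
{
	assert(bdaddr && out);

	memset(out, 0, 16);

	/* fe80::/64 link-local prefix */
	out[0] = 0xfe;
	out[1] = 0x80;

	/* Interface identifier: upper three bytes, ff:fe, lower three bytes */
	out[8] = bdaddr[0];
	out[9] = bdaddr[1];
	out[10] = bdaddr[2];
	out[11] = 0xff;
	out[12] = 0xfe;
	out[13] = bdaddr[3];
	out[14] = bdaddr[4];
	out[15] = bdaddr[5];
}
```

Feeding in 00:b1:fc:8c:6e:47 gives the bytes of fe80::b1:fcff:fe8c:6e47,
which is the address used in the ping6 loop of the test case.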



-- 
Luiz Augusto von Dentz


