Re: Multiple peers with bluetooth_6lowpan

Hi Luiz,

On 10.01.19 at 19:01, Luiz Augusto von Dentz wrote:
> Hi Josua,
> 
> On Thu, Jan 10, 2019 at 1:49 PM Josua Mayer <josua.mayer97@xxxxxxxxx> wrote:
>>
>> Good day once more,
>>
>> I have now identified a chain of calls leading up to the described
>> issue, along with a work-around:
>>
>> The first step was discovering fishy debug messages in dmesg:
>> [  235.505517] Connecting to first module while both are powered
>> [  237.394085] ifindex 5 peer bdaddr 00:b1:fc:8c:6e:47 type 1 my addr a4:d5:78:11:cf:6f type 1
>> [  241.872089] dest IP fe80::b1:fcff:fe8c:6e47
>> [  241.872104] peers 1 addr fe80::b1:fcff:fe8c:6e47 rt   (null)
>> [  241.872124] xmit bt0 to 00:b1:fc:8c:6e:47 type 1 IP 6800::600 chan (ptrval)
>> [  242.356387] Connecting to second module
>> [  244.073592] dest IP fe80::b1:fcff:fe8c:6e47
>> [  244.073603] peers 2 addr fe80::b1:fcff:fe8c:6e47 rt   (null)
>> [  244.073607] no such peer
>>
>> We can see here how the first module is connected, and packets to its
>> link-local address are being transmitted.
>> Then the second module is connected and the number of peers updates to 2.
>> Now we have another packet for the first module's link-local address. We
>> know there are 2 peers, but for some reason we get a message saying "no
>> such peer"!
>>
>> Luckily this message was easy to trace:
>> See net/bluetooth/6lowpan.c:setup_header
>> This message is a direct result of a previous call to peer_lookup_dst
>> returning null.
>> Now while reviewing peer_lookup_dst, keep in mind that we are looking
>> for a difference in behaviour between the case of exactly one peer and
>> the case of at least two peers.
>> Let me just quote here:
>>         if (count == 1) {
>>                 peer = list_first_or_null_rcu(&dev->peers, struct lowpan_peer, list);
>>                 return peer;
>>         }
>>
>> If there is only one peer, no checks are performed at all; it is simply
>> assumed that this peer must be the one to receive packets for the given
>> address.
>> So this is the one-peer (one module connected) case, which works
>> just fine.
>>
>> Then follows a curious case that I do not fully understand:
>> If no route is known, and no gateway was specified in the packet data,
>> the function does not even search for the right peer and simply returns NULL:
>>         if (!rt) {
>>                 nexthop = &lowpan_cb(skb)->gw;
>>
>>                 if (ipv6_addr_any(nexthop))
>>                         return NULL;
>>         }
>> ^^ I believe this decision is wrong.
>> There might be neither route nor gateway, if the destination is a peer.
>>
>> I have come up with the following work-around:
>> -               nexthop = &lowpan_cb(skb)->gw;
>> -
>> -               if (ipv6_addr_any(nexthop))
>> -                       return NULL;
>> +               if (ipv6_addr_any(&lowpan_cb(skb)->gw)) {
>> +                       /* There is neither route nor gateway,
>> +                        * probably the destination is a direct peer.
>> +                        */
>> +                       nexthop = daddr;
>> +               } else {
>> +                       /* There is a known gateway
>> +                        */
>> +                       nexthop = &lowpan_cb(skb)->gw;
>> +               }
>> I am submitting this patch separately as:
>> [RFC] bluetooth_6lowpan: search for destination address in all peers
>> It is by no means finished. It is meant to illustrate the core issue and
>> to allow for a discussion around the control logic and purpose of
>> peer_lookup_dst.
>> Please comment on whether I have understood the purpose of the
>> peer_lookup_dst function.
>> I might even suggest removing the special handling of one peer ... .
> 
> I like this version better but apparently the patch you have sent is
> only matching part of the address, not sure why you had refactored
> that. If I recall the reason why peer_lookup_dst exists is that we
> need to resolve the channel where to send the packets.
Actually I didn't refactor. The two patches handle two different
issues. The first one deals with what I described in this thread.
The second patch is very hackish and is meant to illustrate the next step,
where we want to talk to dynamically assigned addresses.
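
To make the first patch easier to discuss, here is a user-space sketch of
the nexthop choice it aims for. This is not the kernel code: struct ip6,
ip6_is_any and pick_nexthop are names invented for illustration, and rt_gw
merely stands in for whatever next hop a route lookup would yield.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct in6_addr: 16 raw bytes. */
struct ip6 {
	unsigned char b[16];
};

/* Same meaning as the kernel's ipv6_addr_any(): true for "::". */
static bool ip6_is_any(const struct ip6 *a)
{
	static const struct ip6 any; /* statics are zero-initialised */

	return memcmp(a, &any, sizeof(any)) == 0;
}

/*
 * Pick the address that the peer list should be searched for.
 * rt_gw: next hop taken from a route, or NULL when there is no route
 * cb_gw: gateway stored alongside the packet ("::" when unset)
 * daddr: destination address of the packet
 */
static const struct ip6 *pick_nexthop(const struct ip6 *rt_gw,
				      const struct ip6 *cb_gw,
				      const struct ip6 *daddr)
{
	if (rt_gw)
		return rt_gw;
	if (!ip6_is_any(cb_gw))
		return cb_gw;
	/* Neither route nor gateway: assume the destination is a direct peer. */
	return daddr;
}
```

The point of the last branch is that the function never gives up before the
peer list has been searched at all.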

I believe the cache of known neighbours should be checked for the MAC
address, which in turn could be the search criterion for peers:
ip -6 neighb
2004::39:d3ff:fe29:921c dev bt0 lladdr 00:39:d3:29:92:1c REACHABLE
2004::b1:fcff:fe8c:6e47 dev bt0 lladdr 00:b1:fc:8c:6e:47 REACHABLE
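
As a rough user-space model of that idea (again not kernel code; struct
neigh_entry, struct peer, neigh_lookup and peer_by_lladdr are all made up
for illustration, and addresses are compared as plain strings to keep the
sketch short), the lookup would become two steps:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one line of "ip -6 neigh" output. */
struct neigh_entry {
	const char *ip6;          /* IPv6 address in textual form */
	unsigned char lladdr[6];  /* link-layer (Bluetooth device) address */
};

/* Hypothetical model of a connected peer. */
struct peer {
	unsigned char bdaddr[6];  /* device address of the connected module */
	int chan_id;              /* stand-in for the channel to transmit on */
};

/* Step 1: resolve the destination IP to a link-layer address. */
static const unsigned char *neigh_lookup(const struct neigh_entry *tbl,
					 size_t n, const char *ip6)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(tbl[i].ip6, ip6) == 0)
			return tbl[i].lladdr;
	return NULL;
}

/* Step 2: find the peer that owns that link-layer address. */
static const struct peer *peer_by_lladdr(const struct peer *peers, size_t n,
					 const unsigned char *lladdr)
{
	if (!lladdr)
		return NULL;
	for (size_t i = 0; i < n; i++)
		if (memcmp(peers[i].bdaddr, lladdr, 6) == 0)
			return &peers[i];
	return NULL;
}
```

With the two neighbour entries above, a packet for 2004::b1:fcff:fe8c:6e47
would resolve to 00:b1:fc:8c:6e:47 and from there to the matching peer,
without relying on the IP address being derived from the MAC address.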

> 
>> Yours sincerely
>> Josua Mayer
>>
>>
>> On 08.01.19 at 19:57, Josua Mayer wrote:
>>> Greetings everybody,
>>>
>>> I want to present to you an issue I am having with the 6LoWPAN over BLE
>>> facility in the kernel.
>>> I have reached the point where I don't know where, what and how to debug
>>> the situation, and I am hoping for some advice here:
>>>
>>> First an overview of the setup:
>>> 1. an SBC with BLE capable Bluetooth chip
>>> 2. multiple Nordic nRF52840 modules
>>>
>>> This is the problem I have observed:
>>> 1. One Nordic module is powered - SBC connects to it
>>> --> ping6 works flawlessly till the module restarts
>>> --> communication with the remote server works as expected
>>>
>>> 2. Two Nordic modules are powered - SBC connects only to one at a time
>>> --> ping6 works flawlessly till the module restarts
>>> --> communication with the remote server works as expected
>>>
>>> 3. Two Nordic modules are powered - SBC connects to both
>>> --> ping6 receives no more replies as soon as the second module is connected
>>> --> communication to the remote server stops as soon as the second
>>> module is connected
>>>
>>> Test Case:
>>> rfkill unblock 0
>>> modprobe bluetooth_6lowpan
>>> echo -n 'module bluetooth_6lowpan +p' > /sys/kernel/debug/dynamic_debug/control
>>>
>>> while true; do ping6 -c 1 -I bt0 fe80::b1:fcff:fe8c:6e47 || true; sleep 1; done
>>>
>>> echo "Connecting to first module while both are powered" > /dev/kmsg
>>> echo "connect 00:B1:FC:8C:6E:47 1" > /sys/kernel/debug/bluetooth/6lowpan_control
>>> # sit back and watch pings till module restarts
>>> echo "Connecting to first module while both are powered" > /dev/kmsg
>>> echo "connect 00:B1:FC:8C:6E:47 1" > /sys/kernel/debug/bluetooth/6lowpan_control
>>> # wait till first ping goes through
>>> echo "Connecting to second module" > /dev/kmsg
>>> echo "connect 00:39:D3:29:92:1C 1" > /sys/kernel/debug/bluetooth/6lowpan_control
>>> # Expected: ping6 continues to receive replies
>>> # Actual result: ping6 times out
>>>
>>> Please see attached dmesg.log from this test case, with dynamic
>>> debugging enabled for module bluetooth_6lowpan.
>>>
>>> The Nordic modules are programmed to advertise themselves for
>>> establishing a connection; then they start communicating with a server
>>> on the internet over IPv6. Finally they are rebooted by a watchdog.
>>> While a module is connected, it can be pinged at its link-local address,
>>> which is derived from its MAC address and is therefore known.
>>>
>>> As you may have noticed I just wrote "SBC" above.
>>> That is because I have done this experiment with 3 different SBCs:
>>> 1. SolidRun HummingBoard with i.MX6 uSOM Revision 1.5
>>> features Ti WL18MODGB combined WiFi and Bluetooth module
>>> - linux-image-4.20.0-trunk-armmp_4.20-1~exp2_armhf.deb
>>> (+BT_HCIUART=m, +BT_HCIUART_LL=y, +DYNAMIC_DEBUG=y)
>>> ^^ This system was used to produce the attached dmesg.log
>>>
>>> 2. RaspberryPi 3B
>>> 3. RaspberryPi 3B+
>>> - rpi-4.15.y (from their github)
>>> - rpi-4.16.y (from their github)
>>> - rpi-4.17.y (from their github)
>>> - rpi-4.18.y (from their github)
>>> - rpi-4.19.y (from their github)
>>> - rpi-4.20.y (from their github)
>>> bcm2709_defconfig
>>> zImage modules dtbs -j12
>>> gcc-linaro-7.3.1-2018.05-x86_64_arm-linux-gnueabihf
>>>
>>> rpi-4.14.y suffers from a busy kworker (Workqueue: hci0 hci_rx_work
>>> [bluetooth]) making tests difficult.
>>>
>>> Supposedly back in 4.4.8-v7 on the Raspberry Pi this issue with multiple
>>> peers did not exist, while the busy kworker did pop up after some time,
>>> requiring a reboot.
>>> I have not yet verified or tested this with that rather old version.
>>> Would it be a good idea to start from that 4.4.8 rpi fork working up to
>>> 4.15 to find the place where it broke? I feel like this kind of work is
>>> difficult
>>> when forks are involved.
>>>
>>> Are there any components of the kernel in particular that could be verified
>>> in order to figure out what is going wrong?
>>>
>>>
>>> Yours sincerely
>>> Josua Mayer
>>>
> 
> 
> 


