On 23.04.22 08:21, Kalle Valo wrote:
Alexander Wetzel <alexander@xxxxxxxxxxxxxx> writes:
Using non-existent queues can panic the kernel with rtl8180/rtl8185
cards. Ignore the skb priority for those cards; they only have one
tx queue.
Cc: stable@xxxxxxxxxxxxxxx
Reported-by: pa@xxxxxxxxx
Tested-by: pa@xxxxxxxxx
Signed-off-by: Alexander Wetzel <alexander@xxxxxxxxxxxxxx>
---
Pierre Asselin (pa@xxxxxxxxx) reported a kernel crash in the Gentoo forum:
https://forums.gentoo.org/viewtopic-t-1147832-postdays-0-postorder-asc-start-25.html
He also confirmed that this patch fixes the issue.
In summary this happened:
After updating wpa_supplicant from 2.9 to 2.10 the kernel crashed with a
"divide error: 0000" when connecting to an AP.
wpa_supplicant starts to use control port tx in 2.10, and control port
tx now tries to use IEEE80211_AC_VO for the priority.
Since only the rtl8187se part of the driver supports QoS, the priority
of the skb is set to IEEE80211_AC_BE (2) by mac80211 for rtl8180/rtl8185
cards.
The rtl8180 driver then unconditionally reads the priority and, without
this patch, finally crashes at
drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c line 544:
idx = (ring->idx + skb_queue_len(&ring->queue)) % ring->entries
"ring->entries" is zero for rtl8180/rtl8185 cards because tx_ring[2]
never got initialized.
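The failure mode described above can be sketched in plain user-space C.
This is only an illustration, not the driver code or the actual patch:
the sketch_* names and the simplified ring struct are hypothetical, and
it assumes that ring 0 is the single ring that single-queue cards
initialize.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NUM_TX_RINGS 4

/* Hypothetical, simplified stand-in for the driver's tx ring. */
struct sketch_tx_ring {
	unsigned int idx;     /* index of the oldest in-flight descriptor */
	unsigned int queued;  /* stand-in for skb_queue_len(&ring->queue) */
	unsigned int entries; /* ring size; 0 if the ring was never set up */
};

/*
 * Same arithmetic as the crashing line, but it reports an uninitialized
 * ring instead of executing a modulo by zero ("divide error").
 */
bool sketch_next_slot(const struct sketch_tx_ring *ring, unsigned int *slot)
{
	if (ring->entries == 0)
		return false;
	*slot = (ring->idx + ring->queued) % ring->entries;
	return true;
}

/*
 * Pick the tx ring. Cards without QoS rings ignore the queue mapping of
 * the skb and always use ring 0 (assumed to be the initialized one).
 */
unsigned int sketch_pick_ring(bool has_qos_rings, unsigned int queue_mapping)
{
	if (!has_qos_rings)
		return 0;
	return queue_mapping < SKETCH_NUM_TX_RINGS ? queue_mapping : 0;
}
```

With a queue mapping of 2 (IEEE80211_AC_BE) and only ring 0 set up, the
unguarded modulo would divide by zero; ignoring the mapping for cards
without QoS rings selects ring 0 and avoids that.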
All of this after the "---" line is very useful information, but the
actual commit log is just two sentences. I would copy it all to the
commit log. We don't need to limit the size of the commit log; on the
contrary, we should include all the information in it.
I see what you mean, fine for me.
If you prefer I can also make an update, but feel free to handle that at
your convenience. If you e.g. see a better way to do that, drop the patch
and simply submit your version.
While I spent some time figuring out how QoS is intended to work and I'm
pretty sure I finally got the outline of it, I'm still wondering why we
never set the priority for skbs on the normal transmit path.
Obviously the idea is to keep the queue from whoever set it prior to us
and to overwrite it only with good reason.
I plan to look a bit more into that, especially since Pierre's system
was working when wpa_supplicant was not using control port. Thus
skb_get_queue_mapping() must return zero - or at most one - on that path.
That only makes sense when the network subsystem knows that QoS is not
supported and is not bothering to set the queue. (Or if we mapped
zero to IEEE80211_AC_BE, but we are not handling it that way.)
It basically boils down to the fact that we only call
_ieee80211_select_queue() on the normal tx path for drivers supporting
wake_tx_queue. I would have expected that call to be done for all
drivers. (Or at least all drivers supporting QoS.)
So there is either a strange bug or - so far more likely - some serious
gap in my still evolving understanding of QoS.