On 12/25/23 16:40, Johannes Berg wrote:
On Mon, 2023-12-25 at 14:00 +0300, Dmitry Antipov wrote:
I'm trying to investigate the following WARN_ONCE() observed at
https://syzkaller.appspot.com/bug?extid=fdc5123366fb9c3fdc6d:
------------[ cut here ]------------
no supported rates for sta (null) (0xffffffff, band 1) in rate_mask 0x0 with flags 0x0
WARNING: CPU: 1 PID: 2875 at net/mac80211/rate.c:379 __rate_control_send_low+0x6d9/0x800 net/mac80211/rate.c:379
...
There is a (weird and completely unreadable) reproducer, the most recent one
https://syzkaller.appspot.com/text?tag=ReproC&x=10437de6e80000 matches 6.7.0-rc6.
IIUC it creates a kind of a virtual subnet of 'mac80211_hwsim' instances and then
enforces an attempt to setup an incorrect set of rates. Since I assume that
this WARN_ONCE() shouldn't happen, there might be some missing check for the
contents of rate-related fields of 'struct ieee80211_sub_if_data'. I've found
that this WARN_ONCE() may be avoided one step later by silently dropping the
(presumably invalid?) rate control packet in 'ieee80211_tx_h_rate_ctrl()',
i. e.:
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index ed4fdf655343..3ca1db6bb0fd 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -703,6 +703,9 @@ ieee80211_tx_h_rate_ctrl(struct ieee80211_tx_data *tx)
txrc.reported_rate.idx = -1;
txrc.rate_idx_mask = tx->sdata->rc_rateidx_mask[info->band];
+ if (unlikely(txrc.rate_idx_mask == 0))
+ return TX_DROP;
+
if (tx->sdata->rc_has_mcs_mask[info->band])
txrc.rate_idx_mcs_mask =
tx->sdata->rc_rateidx_mcs_mask[info->band];
but most likely this is wrong and should be handled in some another
way somewhere else.
Yeah. I'd say rate_mask 0 is the problem, but ... oh, right. I think the
problem is that we apply the rate mask while scanning, that doesn't
really make sense ...
If you're making a connection on 2.4 GHz (band 0) then you need not have
a rate mask set up for 5 GHz (band 1), and so it's probably 0 by
default, but scanning will go on that band anyway ...
I've found that the default mask for band 1 is 0xff (as set in ieee80211_if_add()):
2139 for (i = 0; i < NUM_NL80211_BANDS; i++) {
2140 struct ieee80211_supported_band *sband;
2141 sband = local->hw.wiphy->bands[i];
2142 sdata->rc_rateidx_mask[i] =
2143 sband ? (1 << sband->n_bitrates) - 1 : 0; /* here */
but it is changed to 0 in ieee80211_set_bitrate_mask() via the 'legacy' mask:
3346 for (i = 0; i < NUM_NL80211_BANDS; i++) {
3347 struct ieee80211_supported_band *sband = wiphy->bands[i];
3348 int j;
3349
3350 sdata->rc_rateidx_mask[i] = mask->control[i].legacy; /* here */
3351 memcpy(sdata->rc_rateidx_mcs_mask[i], mask->control[i].ht_mcs,
3352 sizeof(mask->control[i].ht_mcs));
3353 memcpy(sdata->rc_rateidx_vht_mcs_mask[i],
3354 mask->control[i].vht_mcs,
3355 sizeof(mask->control[i].vht_mcs));
(IIUC this happens while handing the message comes from userspace via the
netlink socket, and this was one of the reasons to design the reproducer).
But is it acceptable at all? Shouldn't resetting the rate mask to zero disable
the corresponding band completely (including scanning)? Or should it be
prohibited to zero the default non-zero rate mask?
Dmitry