On Mon, Nov 11, 2019 at 01:51:33PM +0000, Jules Irenge wrote: > > > NAK. force-cast (and it's not a gcc extension, BTW - it's sparse) is basically > > "I know better; the code is right, so STFU already". *IF* counters.count_... > > is really little-endian 32bit, then why isn't it declared that way? And if > > it's host-endian, you've just papered over a real bug here. > > > > As a general rule "fix" doesn't mean "tell it to shut up"... > > > > Thanks for the comments, I have updated but I have a mixed mind on the > __le32. I have to read more about it. > > I would appreciate if you can comment again on the update. >From the look at the driver, it seems that all these counters are a part of structure that is read from the hardware and only used as little-endian. So just declare all fields in struct hif_mib_extended_count_table as __le32; easy enough. Looking a bit further, the same goes for struct hif_mib_bcn_filter_table ->num_of_info_elmts struct hif_mib_keep_alive_period ->keep_alive_period (__le16) struct hif_mib_template_frame ->frame_length (__le16) struct hif_mib_set_association_mode ->basic_rate_set (__le32) struct hif_req_update_ie ->num_i_es (__le16) struct hif_req_write_mib ->mib_id, ->length (__le16 both) struct hif_req_read_mib ->mib_id (__le16) struct hif_req_configuration ->length (__le16) With those annotated we are left with drivers/staging/wfx/bh.c:73:35: warning: incorrect type in argument 1 (different base types) drivers/staging/wfx/bh.c:73:35: expected restricted __le16 const [usertype] *p drivers/staging/wfx/bh.c:73:35: got unsigned short [usertype] * drivers/staging/wfx/bh.c:106:41: warning: cast to restricted __le32 drivers/staging/wfx/bh.c:175:22: warning: cast to restricted __le16 drivers/staging/wfx/hwio.c:199:16: warning: cast from restricted __le32 drivers/staging/wfx/hwio.c:199:14: warning: incorrect type in assignment (different base types) drivers/staging/wfx/hwio.c:199:14: expected unsigned int [usertype] drivers/staging/wfx/hwio.c:199:14: got restricted __le32 [usertype] drivers/staging/wfx/hif_tx.c:36:18: warning: incorrect type in assignment (different base types) drivers/staging/wfx/hif_tx.c:36:18: expected unsigned short [usertype] len drivers/staging/wfx/hif_tx.c:36:18: got restricted __le16 [usertype] drivers/staging/wfx/data_tx.c:646:22: warning: incorrect type in assignment (different base types) drivers/staging/wfx/data_tx.c:646:22: expected unsigned short [usertype] len drivers/staging/wfx/data_tx.c:646:22: got restricted __le16 [usertype] drivers/staging/wfx/hif_tx_mib.h:111:27: warning: incorrect type in initializer (different base types) drivers/staging/wfx/hif_tx_mib.h:111:27: expected unsigned int [usertype] enable drivers/staging/wfx/hif_tx_mib.h:111:27: got restricted __le32 [usertype] drivers/staging/wfx/hif_tx_mib.h:112:30: warning: incorrect type in initializer (different base types) drivers/staging/wfx/hif_tx_mib.h:112:30: expected unsigned int [usertype] bcn_count drivers/staging/wfx/hif_tx_mib.h:112:30: got restricted __le32 [usertype] drivers/staging/wfx/hif_tx_mib.h:34:38: warning: incorrect type in initializer (different base types) drivers/staging/wfx/hif_tx_mib.h:34:38: expected unsigned char [usertype] wakeup_period_max drivers/staging/wfx/hif_tx_mib.h:34:38: got restricted __le16 [usertype] drivers/staging/wfx/main.c:108:39: warning: incorrect type in initializer (different base types) drivers/staging/wfx/main.c:108:39: expected restricted __le16 [usertype] rx_highest drivers/staging/wfx/main.c:108:39: got int drivers/staging/wfx/./traces.h:155:1: warning: incorrect type in argument 1 (different base types) drivers/staging/wfx/./traces.h:155:1: expected restricted __le16 const [usertype] *p drivers/staging/wfx/./traces.h:155:1: got unsigned short [usertype] * drivers/staging/wfx/./traces.h:155:1: warning: incorrect type in argument 1 (different base types) drivers/staging/wfx/./traces.h:155:1: expected restricted __le16 const [usertype] *p drivers/staging/wfx/./traces.h:155:1: got unsigned short [usertype] * and that's where the real bugs start to show up; leaving the misbegotten forest of macros in misbegotten tracing shite aside, we have this: static const struct ieee80211_supported_band wfx_band_2ghz = { .channels = wfx_2ghz_chantable, .n_channels = ARRAY_SIZE(wfx_2ghz_chantable), .bitrates = wfx_rates, .n_bitrates = ARRAY_SIZE(wfx_rates), .ht_cap = { // Receive caps .cap = IEEE80211_HT_CAP_GRN_FLD | IEEE80211_HT_CAP_SGI_20 | IEEE80211_HT_CAP_MAX_AMSDU | (1 << IEEE80211_HT_CAP_RX_STBC_SHIFT), .ht_supported = 1, .ampdu_factor = IEEE80211_HT_MAX_AMPDU_16K, .ampdu_density = IEEE80211_HT_MPDU_DENSITY_NONE, .mcs = { .rx_mask = { 0xFF }, // MCS0 to MCS7 .rx_highest = 65, drivers/staging/wfx/main.c:108:39: refering to this initializer. Sparse say that it expects rx_highest to be __le16. And that's not a driver-specific structure; it's generic ieee80211 one. Which says struct ieee80211_mcs_info { u8 rx_mask[IEEE80211_HT_MCS_MASK_LEN]; __le16 rx_highest; u8 tx_params; u8 reserved[3]; } __packed; and grepping for rx_highest through the tree shows that everything else is treating it as little-endian 16bit. Almost certainly a bug on big-endian hosts; should be .rx_highest = cpu_to_le16(65), instead. Looking for more low-hanging fruits, we have static int indirect_read32_locked(struct wfx_dev *wdev, int reg, u32 addr, u32 *val) { int ret; __le32 *tmp = kmalloc(sizeof(u32), GFP_KERNEL); if (!tmp) return -ENOMEM; wdev->hwbus_ops->lock(wdev->hwbus_priv); ret = indirect_read(wdev, reg, addr, tmp, sizeof(u32)); *val = cpu_to_le32(*tmp); _trace_io_ind_read32(reg, addr, *val); wdev->hwbus_ops->unlock(wdev->hwbus_priv); kfree(tmp); return ret; } with warnings about val = cpu_to_le32(*tmp); fair enough, since *val is host-endian (u32) and *tmp - little-endian. Trivial misannotation - it should've been le32_to_cpu(), not cpu_to_le32(). Same mapping on all CPUs we are ever likely to support, so it's just a misannotation, not a bug per se. drivers/staging/wfx/hif_tx_mib.h:34:38: warning: incorrect type in initializer (different base types) drivers/staging/wfx/hif_tx_mib.h:34:38: expected unsigned char [usertype] wakeup_period_max drivers/staging/wfx/hif_tx_mib.h:34:38: got restricted __le16 [usertype] is about static inline int hif_set_beacon_wakeup_period(struct wfx_vif *wvif, unsigned int dtim_interval, unsigned int listen_interval) { struct hif_mib_beacon_wake_up_period val = { .wakeup_period_min = dtim_interval, .receive_dtim = 0, .wakeup_period_max = cpu_to_le16(listen_interval), }; and struct hif_mib_beacon_wake_up_period has wakeup_period_max declared as uint8_t. We are shoving a le16 value into it. Almost certain bug - that will result in the listen_interval % 256 on litte-endian host and listen_interval / 256 on big-endian one. Looking at the callers to see what's actually passed as listen_interval shows only hif_set_beacon_wakeup_period(wvif, wvif->dtim_period, wvif->dtim_period); and dtim_period in *wvif (struct wfx_vif) can be assigned 0, 1 or values coming from struct ieee80211_tim_ie ->dtim_period or struct ieee80211_bss_conf ->dtim_period, 8bit in either structure. In other words, the value stored in val.wakeup_period_max will be always zero on big-endian hosts. Definitely bogus, should just store that (8bit) value as-is; cpu_to_le16() is wrong here. Next piece of fun: static inline int hif_beacon_filter_control(struct wfx_vif *wvif, int enable, int beacon_count) { struct hif_mib_bcn_filter_enable arg = { .enable = cpu_to_le32(enable), .bcn_count = cpu_to_le32(beacon_count), }; return hif_write_mib(wvif->wdev, wvif->id, HIF_MIB_ID_BEACON_FILTER_ENABLE, &arg, sizeof(arg)); } Sounds like ->enable and ->bcn_count should both be __le32, which makes sense since the structs passed to hardware appear to be fixed-endian on that thing. However, annotating them as such adds warnigns: drivers/staging/wfx/sta.c:246:35: warning: incorrect type in assignment (different base types) drivers/staging/wfx/sta.c:246:35: expected restricted __le32 [assigned] [usertype] bcn_count drivers/staging/wfx/sta.c:246:35: got int drivers/staging/wfx/sta.c:249:32: warning: incorrect type in assignment (different base types) drivers/staging/wfx/sta.c:249:32: expected restricted __le32 [assigned] [usertype] enable drivers/staging/wfx/sta.c:249:32: got int drivers/staging/wfx/sta.c:253:32: warning: incorrect type in assignment (different base types) drivers/staging/wfx/sta.c:253:32: expected restricted __le32 [assigned] [usertype] enable drivers/staging/wfx/sta.c:253:32: got int drivers/staging/wfx/sta.c:262:62: warning: incorrect type in argument 2 (different base types) drivers/staging/wfx/sta.c:262:62: expected int enable drivers/staging/wfx/sta.c:262:62: got restricted __le32 [assigned] [usertype] enable drivers/staging/wfx/sta.c:262:78: warning: incorrect type in argument 3 (different base types) drivers/staging/wfx/sta.c:262:78: expected int beacon_count drivers/staging/wfx/sta.c:262:78: got restricted __le32 [assigned] [usertype] bcn_count All in the same function (wfx_update_filtering()) and we really do store host-endian values in those (first 3 places). In the last one we pass them to hif_beacon_filter_control(), which does expect host-endian. And that's the only thing we do to the instance of hif_mib_bcn_filter_enable in there... Possible solutions: 1) store them little-endian there, pass to hif_beacon_filter_control() already l-e, get rid of cpu_to_le32() in the latter. 2) store them little-endian, pass the entire pointer to struct instead of forming it again in hif_beacon_filter_control() 3) don't pretend that the objects in hif_beacon_filter_control() and in wfx_update_filtering() are of the same type (different layouts on big-endian) and replace the one in the caller with two local variables. My preference would be (3), as in diff --git a/drivers/staging/wfx/hif_api_mib.h b/drivers/staging/wfx/hif_api_mib.h index 5ed7561c7443..3eaf4affaa5d 100644 --- a/drivers/staging/wfx/hif_api_mib.h +++ b/drivers/staging/wfx/hif_api_mib.h @@ -273,8 +273,8 @@ enum hif_beacon_filter { }; struct hif_mib_bcn_filter_enable { - uint32_t enable; - uint32_t bcn_count; + __le32 enable; + __le32 bcn_count; } __packed; struct hif_mib_group_seq_counter { diff --git a/drivers/staging/wfx/sta.c b/drivers/staging/wfx/sta.c index 93f3739b5f3a..f623ff5b7a02 100644 --- a/drivers/staging/wfx/sta.c +++ b/drivers/staging/wfx/sta.c @@ -216,7 +216,6 @@ void wfx_update_filtering(struct wfx_vif *wvif) bool is_sta = wvif->vif && NL80211_IFTYPE_STATION == wvif->vif->type; bool filter_bssid = wvif->filter_bssid; bool fwd_probe_req = wvif->fwd_probe_req; - struct hif_mib_bcn_filter_enable bf_ctrl; struct hif_ie_table_entry filter_ies[] = { { .ie_id = WLAN_EID_VENDOR_SPECIFIC, @@ -237,21 +236,22 @@ void wfx_update_filtering(struct wfx_vif *wvif) } }; int n_filter_ies; + unsigned enable, bcn_count; if (wvif->state == WFX_STATE_PASSIVE) return; if (wvif->disable_beacon_filter) { - bf_ctrl.enable = 0; - bf_ctrl.bcn_count = 1; + enable = 0; + bcn_count = 1; n_filter_ies = 0; } else if (!is_sta) { - bf_ctrl.enable = HIF_BEACON_FILTER_ENABLE | HIF_BEACON_FILTER_AUTO_ERP; - bf_ctrl.bcn_count = 0; + enable = HIF_BEACON_FILTER_ENABLE | HIF_BEACON_FILTER_AUTO_ERP; + bcn_count = 0; n_filter_ies = 2; } else { - bf_ctrl.enable = HIF_BEACON_FILTER_ENABLE; - bf_ctrl.bcn_count = 0; + enable = HIF_BEACON_FILTER_ENABLE; + bcn_count = 0; n_filter_ies = 3; } @@ -259,7 +259,7 @@ void wfx_update_filtering(struct wfx_vif *wvif) if (!ret) ret = hif_set_beacon_filter_table(wvif, n_filter_ies, filter_ies); if (!ret) - ret = hif_beacon_filter_control(wvif, bf_ctrl.enable, bf_ctrl.bcn_count); + ret = hif_beacon_filter_control(wvif, enable, bcn_count); if (!ret) ret = wfx_set_mcast_filter(wvif, &wvif->mcast_filter); if (ret) but that's a matter of taste. Next is bx.c warning about __le32; that's about num_tx_count being fed to cpu_to_le32(). grepping for that thing results in drivers/staging/wfx/bh.c:106: release_count = le32_to_cpu(((struct hif_cnf_multi_transmit *)hif->body)->num_tx_confs); drivers/staging/wfx/hif_api_cmd.h:316: uint32_t num_tx_confs; drivers/staging/wfx/hif_rx.c:78: int count = body->num_tx_confs; which is troubling - the first line (in rx_helper()) expects to find a little-endian value in that field, while the last (in hif_multi_tx_confirm()) - a host-endian, with nothing in sight that might account for conversion from one to another. Let's look at the call chains: hif_multi_tx_confirm() is called only as hif_handlers[...]->handler(), which happens in in wfx_handle_rx(). The call is hif_handlers[i].handler(wdev, hif, hif->body); and hif has come from struct hif_msg *hif = (struct hif_msg *) skb->data; wfx_handle_rx() is called by the same rx_helper()... skb is created by rx_helper() and apparently filled by the call if (wfx_data_read(wdev, skb->data, alloc_len)) goto err; right next to the allocation... and prior to the release_count = le32_to_cpu(((struct hif_cnf_multi_transmit *)hif->body)->num_tx_confs); where we expect little-endian, with nothing to modify the skb contents between that and the call of wfx_handle_rx(). hif in rx_helper() points to the same place - skb->data. OK, we almost certainly have a bug here. That thing allocates a packet and fills it with incoming data. Then it parses the damn thing, apparently treating the same field of the incoming as little-endian in one place and host-endian in another. In principle it's possible that the rest of the packet determines which one it is, but by the look of that code both places are hit if and only if hif->id is equal to HIF_CNF_ID_MULTI_TRANSMIT. It *can't* be correct on big-endian. Not even theoretically. And since it's over-the-wire data, I would expect it to be fixed-endian. That needs to be confirmed with the driver's authors and/or direct experiment on big-endian host, but I strongly suspect that the right fix is to have int count = le32_to_cpu(body->num_tx_confs); in hif_multi_tx_confirm() (and num_tx_confs annotated as __le32). HOWEVER, that opens another nasty can of worms. We have struct hif_cnf_tx *buf_loc = (struct hif_cnf_tx *) &body->tx_conf_payload; ... for (i = 0; i < count; ++i) { wfx_tx_confirm_cb(wvif, buf_loc); buf_loc++; } with count derived from the packet and body pointing into the packet. And no visible checks that would make sure the loop won't run out of the data we'd actually got. The check in rx_helper() verifies that hif->len matches the amount we'd received; the check for ->num_tx_confs in there doesn't look like what we'd needed (that would be offset of body.tx_conf_payload in packet + num_tx_confs * sizeof(struct hif_cnf_multi_transmit) compared to actual size). So it smells like a remote buffer overrun, little-endian or not. And at that point I would start looking for driver original authors with some rather pointed questions about the validation of data and lack thereof. BTW, if incoming packets are fixed-endian, I would expect more bugs on big-endian hosts - wfx_tx_confirm_cb() does things like tx_info->status.tx_time = arg->media_delay - arg->tx_queue_delay; with media_delay and tx_queue_delay both being 32bit fields in the incoming packet. So another question to the authors (or documentation, or direct experiments) is what endianness do various fields in the incoming data have. We can try and guess, but... _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel