Hi Alexander, thanks for the reply

On 23/05/2019 02:58, Alexander Wetzel wrote:
>> I'm working on software for a DMG implementation (802.11ad/ay) using RSN, GCMP,
>> 802.1X auth, and for a while we've been working around this issue: the MLME
>> protocol/NL80211 doesn't provide any ordering between key setup and data
>> transmission. During PTK rekeying, on receipt of message 3/4 the supplicant
>> derives the new key, sends 4/4, then installs the new key to the
>> MAC. However there's no guarantee that 4/4 is actually sent before the new key
>> is installed, so it can get encrypted with the new key, which the authenticator
>> has not yet installed. We see this issue for real on our systems when message
>> 4/4 shares a stream with other in-flight data. I believe this is a well-known
>> issue - is that correct?
>
> First, that can only be an issue when you rekey connections. It's not a
> problem for the initial connect. Rekeying per se is something I would label as
> "mostly broken" at the moment, independent of the platform. There are some
> driver/cards which are ok but even when it seems to work it could be another
> bug preventing the worst side effects...
>
> Now encrypting EAPOL#4 with the wrong key would be something I would see as a
> driver bug. Especially when mac80211 is providing the PN and the card is
> encrypting the packet without making sure the PN and the key belong
> together. This will freeze the connection till you reconnect or rekey again...
> If you want to dig deeper into that you probably should have a look at
> https://patchwork.kernel.org/project/linux-wireless/list/?series=13743&state=*
>
> These patches should allow drivers to rekey a PTK key but without driver
> support it may well not work. As long as the driver is using mac80211 and
> implements flush() it should also solve your problem.

Ah thanks, since my original posting I ended up finding those patches - the
device I'm working on doesn't use mac80211 (MAC is in HW/FW) but they helped me
understand the problem/solutions much better. I may be mistaken but I still
don't think a driver like mine has the necessary information to reliably
overwrite PTK0 with current kernel+userspace - more below...

>> It seems to me that the proper solution is described in 802.11-2016 12.7.6.4.4 -
>> "4-way handshake message 3". If both peers report Extended Key ID for
>> Individually Addressed Frames (WPA_CAPABILITY_EXT_KEY_ID_FOR_UNICAST) in their
>> RSN capabilities, then we alternate between key IDs 0 and 1 for each rekeying
>> handshake. The authenticator installs the new key for RX as soon as it hears
>> 3/4. That means if the supplicant ends up encrypting 4/4 with the new key, no
>> problem - the authenticator can still decapsulate it. Then once it's got 4/4 it
>> sets up the new key for TX too (and should delete the old one, I guess).
>>
>> I originally intended this mail to include a hacky patch to implement this, but
>> it turns out Linux rejects pairwise GCMP keys with nonzero index - the
>> comment[1] says:
>>
>> /* Disallow pairwise keys with non-zero index unless it's WEP
>>  * or a vendor specific cipher (because current deployments use
>>  * pairwise WEP keys with non-zero indices and for vendor
>>  * specific ciphers this should be validated in the driver or
>>  * hardware level - but 802.11i clearly specifies to use zero)
>>  */
>>
>> [1] https://elixir.bootlin.com/linux/v5.1.3/source/net/wireless/util.c#L240
>
> Extended Key ID was added to the standard in 2012 after some guys at Intel
> looked at rekey and found out that this is simply broken in the
> standard. Extended Key ID is optional and there are basically zero
> implementations of it in the wild.
> The good news here is that this is
> changing: I've just finished generic support for that and it will be in Linux
> 5.2 and hopefully also in hostapd soon.
>
> Patches for hostapd/wpa_supplicant are available (I submitted them for
> review/merge roughly a month ago; here is the first patch of the series:
> http://lists.infradead.org/pipermail/hostap/2019-April/039998.html
> Still waiting for feedback or merge.)
>
> I've been using Extended Key ID for quite some time now. So far I've only got
> it fully operational for SW-crypto-only devices and all iwlwifi cards. (A-MPDU
> handling / iwlwifi needs other patches not in mainline, yet.)
>
> Which brings us to the bad news: When using HW crypto the card/firmware must
> also support Extended Key ID. At least some cards (ath9k, ath10k) are assuming
> unicast packets are always using keyID 0 and don't verify the keyID in the
> MPDUs. They only accept one unicast key per station.
>
> So that's only a viable solution when you can be sure all STAs will support
> it and can convince the card(s) to actually handle two unicast keys per STA in
> the HW correctly.
>
> On the plus side this is the only solution allowing you to rekey the PTK
> without losing MPDUs. All other solutions will drop at least some MPDUs under
> load with all the downsides that has.

Ah, this is great, hopefully I can take a look at the patches and provide some
feedback.

>> I don't understand that rationale or where in 802.11 we are told to use zero -
>> could anyone clarify?
>
> IEEE 802.11-2012 was the first version allowing non-zero keyIDs.
> Prior to that the standard did not provide any way to have two unicast keys
> active for one station.
> You can see what has been changed in the standard here:
> https://mentor.ieee.org/802.11/dcn/10/11-10-0314-00-000m-rekeying-protocol-fix-text.doc
>
> In the new versions the content has moved around a bit but did not really
> change.

OK I see, thanks for the context.

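To make the Extended Key ID argument quoted above concrete, here is a toy
Python model of the race. It is only a sketch under simplifying assumptions:
the Peer class and msg4_survives_race() are hypothetical names invented for
illustration, "encryption" is just tagging a frame with the key, and nothing
here corresponds to real hostapd, wpa_supplicant or kernel code.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Toy model of one end of the link: a key table indexed by key ID."""
    keys: dict = field(default_factory=dict)  # key_id -> key accepted for RX
    tx_key_id: int = 0                        # key ID currently used for TX

    def install_rx(self, key_id, key):
        self.keys[key_id] = key

    def set_tx(self, key_id):
        self.tx_key_id = key_id

    def encrypt(self, payload):
        # "Encrypt" by tagging the frame with the key ID and key in use.
        return (self.tx_key_id, self.keys[self.tx_key_id], payload)

    def decrypt(self, frame):
        key_id, key, payload = frame
        # Decapsulation only works if we hold the same key under that ID.
        return payload if self.keys.get(key_id) == key else None


def msg4_survives_race(extended_key_id):
    """Worst case: the supplicant's new key is live before 4/4 is encrypted."""
    auth, supp = Peer(), Peer()
    for peer in (auth, supp):
        peer.install_rx(0, "PTK-old")
        peer.set_tx(0)

    # With Extended Key ID the new PTK goes to the other key ID (here 1);
    # without it the new PTK has to overwrite key ID 0.
    new_id = 1 if extended_key_id else 0
    if extended_key_id:
        # Authenticator accepts the new key for RX at message 3/4 time,
        # while still holding the old key under ID 0.
        auth.install_rx(new_id, "PTK-new")

    supp.install_rx(new_id, "PTK-new")
    supp.set_tx(new_id)
    frame = supp.encrypt("EAPOL 4/4")
    return auth.decrypt(frame) == "EAPOL 4/4"
```

In this model msg4_survives_race(True) holds because the authenticator has both
keys available for RX, while msg4_survives_race(False) fails because key ID 0
on the authenticator still holds the old PTK.
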
>> Supposing I was able to change Linux to allow nonzero key indexes, would the
>> mechanism I've described be viable or does hostap have its own reasons for
>> insisting on zero key indexes?
>
> hostapd does not (yet) support Extended Key ID and all code has been written
> with the assumption that PTK keys must use KeyID 0.
> Using a Linux kernel, hostapd and a driver supporting Extended Key ID will of
> course work. The question is whether you can get that working properly with
> your card and whether it's a valid solution for your scenario.
>
>> If we implement this in hostap, what is the best way to get the information on
>> whether the kernel driver supports multiple live keys; add a new NL80211
>> extended feature? If we know the driver supports it then IIUC hostapd and
>> wpa_supplicant can assert WPA_CAPABILITY_EXT_KEY_ID_FOR_UNICAST in the nl IE
>> attrs that go alongside NL80211_CMD_START_AP and NL80211_CMD_CONNECT
>> respectively.
>
> Linux 5.2 signals support for drivers supporting Extended Key ID with
> NL80211_EXT_FEATURE_EXT_KEY_ID.
>
>> Also, when I talked earlier about installing keys "for RX" and "for TX", I am
>> assuming that, if the driver supports it, you can do a NL80211_CMD_NEW_KEY to
>> install the key without yet starting to encapsulate with it - then subsequently
>> NL80211_CMD_SET_KEY(NL80211_ATTR_KEY_DEFAULT) to start actually using it for
>> TX. Let me know if I've misunderstood there..
>
> Extended Key ID needs a way to tell the driver to not use a PTK key
> immediately for Tx. Linux 5.2 adds the key flags NL80211_KEY_NO_TX and
> NL80211_KEY_SET_TX for that.
>
>> PS, another solution I considered is just finding a way to provide an ordering
>> guarantee between key installation and the data stream - it sort of looks as
>> though NL80211_CMD_CONTROL_PORT_FRAME is designed to do this sort of thing, but
>> I think to solve this problem you'd need the kernel driver's
>> cfg80211_ops.tx_control_port to block until the frame has actually been sent.
>> mac80211 (only provider in the mainline kernel of tx_control_port)
>> doesn't seem to do that.
>
> That also would solve the problem, at least as long as your card generates the
> PN in HW. But my stand here is, that a driver which encrypts a packet with the
> wrong key is broken and should be fixed.

Yep, in my case the PNs and all 802.11 headers are generated in HW.

As I said above I don't really understand how it's possible for a driver not to
have this bug with the information currently available to it - I may still be
missing something or maybe the situation is different for mac80211 drivers,
which I am not really familiar with. Our driver gets a stream of MSDUs (inc.
EAPoL) and a stream of cfg80211 calls, with no ordering between the two. Aside
from teaching the driver to understand EAPoL and deduce where in the MSDU
stream it is supposed to switch keys, I don't see a way out.

Since writing my original mail and looking at the patches you linked above, I
realised that I do not actually need my .tx_control_port to block until the
frame is sent; instead I can just flush the TX queue before overwriting PTK0.
IIUC this is an incomplete solution with current userspace, because 4/4 may not
have reached the HW ring before the flush happens. However if 4/4 is sent using
tx_control_port then the driver knows that it is already in the ring when it
gets the key installation, so a flush is guaranteed to solve the problem.
(Like in mac80211 - ieee80211_hw_key_replace says that ieee80211_flush_queues
"*may* help prevent the clear text leaks and freezes."; if I'm not mistaken
then if tx_control_port is in use and doesn't do any intermediate queueing, it
will *certainly* prevent the freezes).

Realising that I can solve the issue without having my cfg80211 hooks behave so
differently from mac80211 made me less dismissive of EAPoL-over-NL80211 as a
solution.

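The flush argument can be sketched as a toy Python model. This is purely
illustrative and makes strong assumptions: ToyMac and key_used_for_msg4() are
made-up names, the "hardware" simply encrypts each frame with whatever key is
installed when the frame leaves the ring, and none of this is real cfg80211 or
driver code.

```python
from collections import deque


class ToyMac:
    """Toy HW MAC: frames are encrypted with the key installed at dequeue time."""

    def __init__(self, key):
        self.key = key
        self.ring = deque()   # frames waiting in the (pretend) HW TX ring
        self.sent = []        # (frame, key used) in transmit order

    def enqueue(self, frame):
        self.ring.append(frame)

    def flush(self):
        # Drain the ring, encrypting everything with the current key.
        while self.ring:
            self.sent.append((self.ring.popleft(), self.key))

    def replace_key(self, new_key, flush_first):
        if flush_first:
            self.flush()
        self.key = new_key


def key_used_for_msg4(flush_first, msg4_in_ring_before_install):
    """Return which key ends up protecting EAPOL 4/4 in this toy model."""
    mac = ToyMac("PTK-old")
    mac.enqueue("data")
    if msg4_in_ring_before_install:
        # e.g. 4/4 was handed over directly, with no intermediate queueing
        mac.enqueue("EAPOL 4/4")
    mac.replace_key("PTK-new", flush_first)
    if not msg4_in_ring_before_install:
        # 4/4 was still sitting in some queue above the driver
        mac.enqueue("EAPOL 4/4")
    mac.flush()
    return dict(mac.sent)["EAPOL 4/4"]
```

In this model only the combination "4/4 already in the ring" plus "flush before
overwriting the key" gets 4/4 out under the old PTK; a flush alone does not
help if 4/4 reaches the ring after the key installation.
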
So the future is bright - hopefully your Extended Key ID support will
eventually provide a "real" solution which even fixes data loss, and in the
meantime hopefully EAPoL-over-NL80211 can at least stop the association getting
broken (This is also attractive because it will work back to v4.17 kernels).

Thanks again!
Brendan