Hi,

> From: Johannes Berg johannes@xxxxxxxxxxxxxxxx
> Sent: Thursday, September 23, 2021 12:20
>
>> Our target is to give the guest VM a similar level of control over
>> WiFi as other applications on the host. The host OS keeps control of
>> the NIC. Requests from the guest are executed through calls to public
>> host wlan APIs and the result is returned to the guest driver.
>
> That makes some sense. I say some intentionally though, because consider the differences - a typical application on the host will definitely not care (even the browser, skype, etc. will not), unless they specifically want to do something with wifi such as for IOT onboarding or whatnot.
>
> A typical guest VM on the other hand will run a pretty typical operating system, and that *will* "care", in the sense that it always wants to use and control a wifi device (if present).
>
> This might just mean that it's continuously scanning for networks it knows about and can connect to, or it might mean that it's actually connecting to the networks that it knows about. The host, on the other hand, might have its own ideas about which networks you should be connected to? I fear that having both of this might conflict, so I was curious how you'd be solving that.

You are right: if network managers run on both the host and the guest, they may conflict. In our scenario, since we focus on highly integrated VMs (such as Windows Subsystem for Linux), we avoid the issue by disabling network management in the guest. This leaves us with only connection requests coming from guest programs that target specific networks.

I guess other policies could also be implemented by the host component, depending on the degree of control one wants the guest to have: either disabling the host network manager, or defining some priority and dropping unwanted requests...
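
To make these policy options a bit more concrete, here is a minimal sketch of how a host component might arbitrate guest connection requests. Everything in it (the `Policy` names, `should_forward`, the SSID-based rule) is hypothetical and only for illustration; it does not describe an existing implementation or any host wlan API.

```python
from enum import Enum, auto
from typing import Optional


class Policy(Enum):
    """Hypothetical host-side policies for guest Wi-Fi requests."""
    GUEST_CONTROLS = auto()  # host network manager disabled, guest decides
    HOST_PRIORITY = auto()   # host wins, conflicting guest requests are dropped


def should_forward(policy: Policy, guest_ssid: str,
                   host_ssid: Optional[str]) -> bool:
    """Decide whether a guest connection request is passed on to the host.

    guest_ssid: network the guest asks to connect to.
    host_ssid:  network the host is currently connected to, or None.
    """
    if policy is Policy.GUEST_CONTROLS:
        # Nothing on the host competes for the NIC, so forward everything.
        return True
    # HOST_PRIORITY: only forward requests that would not displace
    # the connection the host already chose.
    return host_ssid is None or host_ssid == guest_ssid


if __name__ == "__main__":
    print(should_forward(Policy.HOST_PRIORITY, "cafe", "home"))
    print(should_forward(Policy.GUEST_CONTROLS, "cafe", "home"))
```

A dropped request would simply be reported to the guest driver as a failed connection attempt, which matches how the host can already reject a command from any other application.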
>> Since the host keeps control of the NIC, it handles multiple things
>> trying to use WiFi the same way it handles different host applications
>> trying to use Wi-Fi.
>> This means the host OS can reject a command from the guest, or that
>> the guest VM could get disconnected if another program on the host
>> initiates a connection to a different Wi-Fi network.
>
> Right.
>
> I *think* that to some extent I'm actually thinking of "OS" vs. "applications" in too strict a separation, and on Windows it might actually be that the part of the OS that implements the wifi network selection is "just" an application? A la Intel ProSet (not that I know anything about it)?

Your initial understanding was correct: network selection in Windows is mostly implemented in an OS service (wlansvc), not an application. However, the proxy host component is just an application that calls this service's public API. So, from the point of view of wlansvc and the host OS, the guest is "just" another application as far as the Wi-Fi control path is concerned.

>> We also considered forwarding nl80211 messages directly, since it
>> could avoid the need for a specialized guest driver. However, we
>> wondered about compatibility issues (what if the host and guest
>> versions of nl80211 don’t match?), and it seemed much more complex to
>> implement, with significant changes to cfg80211 and likely other parts
>> of the wireless subsystem. Overall, the nl80211 forwarding option
>> appears architecturally sound, but given the much larger scope and
>> impact, we focused on a more targeted solution in which the guest
>> driver doesn’t own the host NIC. We feel this solution provides a
>> middle ground where the host can decide which parts of its wireless
>> stack to proxy to the guest.
>
> Ah, that's interesting.
> I had only considered this for the *guest*, and assumed that the host would handle the (nl80211) messages in a special device implementation software, not forward those directly to the host (Linux) kernel.
>
> It sounds like you considered the case of basically letting the guest applications direction talk nl80211 to the host kernel, which is far beyond what I considered!
>
> I completely agree here though - you definitely want some proxy on the host side.
>
> But like I said, I was just considering that as the guest side implementation. We don't have machinery for this right now in netlink, but I could see perhaps some way of allowing "pre_doit" to return say "1" to say "we abort here but please don't send a response to userspace". Then, the pre_doit() could call a driver method passing the nl80211 message down instead of calling the real operation, and the application using nl80211 would end up directly talking to the nl80211 implementation of the device.
>
> I don't think the device _could_ even implement it by talking to the host kernel (even if it is Linux) because the netdev IDs and whatnot would be different, but it might be feasible for the guest implementation.
>
> The only place where this might run into trouble is with things that nl80211 supports (enum nl80211_protocol_features), which we handle directly, and so an updated guest kernel might support more than would actually end up working. But the truth is that we added _exactly_ one such feature (NL80211_PROTOCOL_FEATURE_SPLIT_WIPHY_DUMP), and wiphy discovery is of course something that would anyway have to be handled by the guest. So not sure this is such a big deal.
>
> Anyway, not saying it should be done one way or the other, was just considering this as one possible way of simply pushing _all_ the APIs though to the device, and then the nl80211 implementation in the device can decide what it supports and whatnot, just like on older kernels we don't support certain things.
> The *driver* would then be fairly simple and basically would never have to be extended, but the device implementation (in the hypervisor or wherever) might be more difficult.

Thanks for explaining in more detail; I think I understand your idea better now. Forwarding all nl80211 messages to the host would indeed add a lot of complexity to the host device implementation, since much of the processing done by cfg80211 would likely have to be re-implemented there. I imagine it would also give the host device implementation more freedom. This could be an interesting alternative if a full-MAC based implementation ends up being too restrictive.

Based on the discussion, your recommendations concerning our initial questions seem to be:
- we should create a new driver, rather than modifying virt_wifi
- netlink could be used as a protocol to communicate with the host

Is that correct?

Thanks,
Guillaume