ath11k and ath12k suffer from a long-standing issue that is partly caused by the QRTR implementation, which only supports one device per node/port combination and partly caused by the fact that the QMI instance ID of the devices are statically set to 1. P Praneesh <quic_ppranees@xxxxxxxxxxx> submitted a patch[0] that fixes this issue generating a unique QMI instance ID based on the PCI bus data for the device and passing that information to the QMI subsystem and the device's firmware via a special PCI register. However, it quickly turned out that this approach works for the hardware he tested, but fails for other ath11k-based devices, including the popular QCA6930, since its firmware just ignores the special register being used. Since we need QMI (and, for matter, QRTR) to work for the initial firmware upload, this approach will not work generically. Fortunately, Denis Kenzior <denkenz@xxxxxxxxx> cooked up a patch set for QRTR[1] that introduces the concept of endpoint IDs, which are dynamically allocated and can be used to distinguish different devices even though they use the same node/port combination. Using this patch set, endpoint IDs can be reported as part of auxiliary data in the QRTR socket and bound to for client sockets, which will automatically filter messages from other endpoints and also make sure that client messages are routed to the correct endpoint. This looked promising, and with that functionality, the only challenge is to find out the correct endpoint IDs and bind to them in drivers to finally support multiple devices in a generic way. This patch set implements exactly that, and it WORKSFORME, but unfortunately it turns out that "the only challenge" is very difficult to overcome due to the socket-based architecture. ath1{1,2}k and QRTR are at opposite sides of the socket, with QRTR assigning endpoint IDs and ath1{1,2}k needing a way to fetch and operate on them. The endpoint reporting feature in QRTR is not helpful in this case, because drivers do not generally know which endpoint belongs to the device they currently handle (i.e., there is no central registry) and even if we were to snoop on the socket and take the first endpoint ID we are unaware of, this might not be the correct one, because it might be in use by a different driver for instance, or correspond to a different device. The first iteration of this patch set[2] extended struct mhi_device with a qrtr_endpoint_id field that was initialized to zero and populated by the QRTR MHI driver as soon as it was loaded. Drivers could then query this field through an mhi_device->mhi_cntrl->mhi_device chain (if they also use the MHI bus, of course). This, however, was an incredibly ugly hack because QRTR data should not be part of MHI device structures in the first place, timing is critical (drivers querying the endpoint ID must do so after the QRTR MHI module initialized, which is typically only the case after QRTR socket was created) and it was not possible to query or pre-assign an endpoint ID before creating a socket and directly bind to it (which might lead to races such as seeing messages over the socket that are not meant for the endpoint ID drivers are actually interested in). Since that was not elegant at all, and due to the other mentioned issues, this iteration uses a different approach: endpoint IDs can now be associated with (private) backend endpoint-specific data, which allows us to identify which endpoint ID is being used with what backend, and additionally new API is introduced so that other parts in the kernel can either get an endpoint ID for given endpoint-specific data or even attach endpoint-specific data to a new endpoint ID generated by the QRTR driver. The QRTR system will try to use endpoint-specific data if possible, but falls back to generating endpoint IDs without endpoint-specific data (as in, NULL pointer) if that did not work. Crucially, the endpoint-specific data pointer is used as an opaque void pointer and at most compared with data stored in the endpoints XArray. In the QRTR MHI backend, we use the MHI controller's structure pointer as endpoint-specific data, and since the MHI controller is also the bus master and responsible for the physical link, clients (drivers) can pre-register an endpoint ID for their MHI controllers and directly tell QMI to bind to the endpoint ID at socket creation time. The QRTR SMD backend uses its rpmsg_device pointer and the TUN backend uses the inode pointer as their respective endpoint-specific data pointers. This approach is cleaner and works better, because it is not prone to races (although it requires coordination between QRTR backends and clients/drivers because both must use the same endpoint-specific data for the scheme to work). There are, however, also issues with this approach: - Any kernel part can generate an unlimited number of endpoint IDs with arbitrary pointers. The amount of endpoint IDs that can be tracked is limited, though, so there is potential for exhaustion of ID space. - Since previously endpoint IDs were only generated by the QRTR subsystem, there was no need to use any kind of life cycle management for the endpoint IDs: they were created at node creation time and also deleted at node deletion time. Since other subsystems can now create endpoint IDs, it would probably be good to have a way to reclaim created but unused endpoint IDs. No such implementation is provided here. - Multiple PCI/MHI devices will work, but no AHB + PCI/MHI interaction has been tested. Since PCI/MHI devices bind to their endpoint ID, these will likely work, but the AHB devices might still see messages for all endpoints and fail to work correctly. AHB devices seem to be using QMI and thus also QRTR, but without a specific QRTR backend driver (going through REMOTEPROC instead?), so this approach might not be viable for AHB devices. I am much more comfortable with this patch set, even if it has some rough edges and might not fix the situation for AHB devices. [0] https://patch.msgid.link/20230111170033.32454-1-kvalo@xxxxxxxxxx [1] https://patch.msgid.link/20241018181842.1368394-1-denkenz@xxxxxxxxx [2] https://msgid.link/cover.1730790058.git.ionic@xxxxxxxx v2: code and metadata cleanup (checkpatch.pl), no functional changes Mihai Moldovan (13): net: qrtr: support registering endpoint-specific data net: qrtr: mhi: register mhi_controller as endpoint-specific data net: qrtr: smd: register rpmsg_device as endpoint-specific data net: qrtr: tun: register inode as endpoint-specific data soc: qcom: qmi_helpers: add QRTR endpoint ID to qmi_handle soc: qcom: qmi_helpers: optionally bind to QRTR endpoint ID in qmi_sock_create wifi: ath11k: add QRTR endpoint ID hif feature wifi: ath11k: stub QRTR endpoint ID fetching for AHB wifi: ath11k: implement QRTR endpoint ID fetching for PCI wifi: ath11k: bind to QRTR endpoint ID in ath11k_qmi_init_service wifi: ath12k: add QRTR endpoint ID hif feature wifi: ath12k: implement QRTR endpoint ID fetching for PCI wifi: ath12k: bind to QRTR endpoint ID in ath12k_qmi_init_service MAINTAINERS | 1 + drivers/net/wireless/ath/ath11k/ahb.c | 7 ++ drivers/net/wireless/ath/ath11k/hif.h | 9 ++ drivers/net/wireless/ath/ath11k/mhi.c | 22 +++++ drivers/net/wireless/ath/ath11k/mhi.h | 1 + drivers/net/wireless/ath/ath11k/pci.c | 1 + drivers/net/wireless/ath/ath11k/qmi.c | 8 ++ drivers/net/wireless/ath/ath12k/hif.h | 10 +++ drivers/net/wireless/ath/ath12k/mhi.c | 22 +++++ drivers/net/wireless/ath/ath12k/mhi.h | 2 + drivers/net/wireless/ath/ath12k/pci.c | 1 + drivers/net/wireless/ath/ath12k/qmi.c | 9 ++ drivers/soc/qcom/qmi_interface.c | 28 ++++++ include/linux/soc/qcom/qmi.h | 3 + include/net/qrtr.h | 11 +++ net/qrtr/af_qrtr.c | 124 +++++++++++++++++++++++++- net/qrtr/mhi.c | 1 + net/qrtr/qrtr.h | 5 ++ net/qrtr/smd.c | 1 + net/qrtr/tun.c | 1 + 20 files changed, 264 insertions(+), 3 deletions(-) create mode 100644 include/net/qrtr.h -- 2.45.2