On Fri, 2025-01-17 at 16:02 +0100, Andrew Lunn wrote:
> > One important point I see is that there is a bit of a misnomer in the
> > existing ISM name in that our ISM device does in fact *not* share
> > memory in the common sense of the "shared memory" wording.
> 
> Maybe this is the trap i fell into. So are you saying it is not a dual
> port memory mapped into two CPUs physical address space?
> 

Conceptually kind of, but the existing s390-specific ISM device is a bit
special. But let me start with some background. On s390, aka mainframes,
OSs including Linux run in so-called logical partitions (LPARs), which
are machine hypervisor VMs that use partitioned, non-paging memory. The
fact that memory is partitioned is important because it means LPARs
cannot share physical memory by mapping it.

Now at a high level an ISM device allows communication between two such
Linux LPARs on the same machine. The device is discovered as a PCI
device and allows Linux to take a buffer called a DMB, map it in the
IOMMU and generate a token specific to another LPAR which also sees an
ISM device sharing the same virtual channel identifier (VCHID). This
token can then be transferred out of band (e.g. as part of an extended
TCP handshake in SMC-D) to that other system. With the token the other
system can use its ISM device to securely (authenticated by the token,
LPAR identity and the IOMMU mapping) write into the original system's
DMB at throughput and latency similar to doing a memcpy() via a syscall.

On the implementation level the ISM device is actually a piece of
firmware and the write to a remote DMB is a special case of our PCI
Store Block instruction (no real MMIO on s390, instead there are special
instructions). Sadly there are a few more quirks, but in principle you
can think of it as redirecting writes to a part of the ISM PCI device's
BAR to the DMB in the peer system, if that makes sense.
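
To make the flow above a bit more concrete, here is a toy Python model
of it. This is only a sketch of the concept, not how the firmware or the
kernel driver works: the class and method names (ToyFabric,
ToyIsmDevice, register_dmb, write) are made up for illustration, the
token is just a random string, and the IOMMU mapping and the PCI Store
Block instruction are not modelled at all.

```python
import secrets


class ToyFabric:
    """Hypothetical stand-in for the firmware side shared by all LPARs."""

    def __init__(self):
        # token -> (peer LPAR allowed to write, VCHID, the DMB itself)
        self.dmbs = {}


class ToyIsmDevice:
    """Hypothetical model of the ISM device as seen by one LPAR."""

    def __init__(self, fabric, lpar, vchid):
        self.fabric = fabric
        self.lpar = lpar
        self.vchid = vchid

    def register_dmb(self, size, peer_lpar):
        """Set up a DMB and return (token, dmb); the token is peer specific."""
        dmb = bytearray(size)
        token = secrets.token_hex(8)
        self.fabric.dmbs[token] = (peer_lpar, self.vchid, dmb)
        return token, dmb

    def write(self, token, offset, data):
        """Write data into the remote DMB identified by token."""
        entry = self.fabric.dmbs.get(token)
        if entry is None:
            raise PermissionError("unknown token")
        peer_lpar, vchid, dmb = entry
        # Only the LPAR the token was generated for, on the same VCHID,
        # may write.
        if peer_lpar != self.lpar or vchid != self.vchid:
            raise PermissionError("token not valid for this LPAR/VCHID")
        if offset < 0 or offset + len(data) > len(dmb):
            raise ValueError("write outside of the DMB")
        # Conceptually a memcpy() into the owner's buffer.
        dmb[offset:offset + len(data)] = data


if __name__ == "__main__":
    fabric = ToyFabric()
    lpar_a = ToyIsmDevice(fabric, "LPAR_A", vchid=0x1)
    lpar_b = ToyIsmDevice(fabric, "LPAR_B", vchid=0x1)
    token, dmb = lpar_a.register_dmb(32, peer_lpar="LPAR_B")
    # The token would travel out of band, e.g. in the SMC-D handshake.
    lpar_b.write(token, 0, b"hello")
    print(bytes(dmb[:5]))
```

The point of the model is that only the owner ever holds the buffer; the
peer holds nothing but a token and can only push data into it.
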
There's of course also a mechanism to cause an interrupt on the receiver
as the write completes.

> In another
> email there was reference to shm. That would be a VMM equivalent, a
> bunch of pages mapped into two processes address space.

Yes, as on a hypervisor which backs VMs with pages one can simply map
the DMB in both guests (one mapping potentially being write only), with
writes being literally a memcpy().

> 
> This comes back to the lack of top level architecture documentation.
> Outside reviewers such as i will have difficultly making useful
> contributions, and seeing potential overlap and reuse with other
> systems, without having a basic understanding of what you are talking
> about.
> 
> Andrew

I understand your frustration. I do think we're making progress here
though. For one, loopback ISM makes SMC-D usable/testable even on bare
metal, and then Alibaba is working on a virtio-based ISM device and the
draft is public[0]. There is also some information on SMC's use of ISM
in a whitepaper[1].

[0] https://lore.kernel.org/all/Y1IqX2uVpcD7cvRF@TonyMac-Alibaba/T/
[1] https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Shared%20Memory%20Communications%20Version%202.1%20Emulated-ISM_0.pdf