On Tue, 6 Jun 2017, Ahmad Omary wrote:

> In the above use case device, we can have only 64 processes per node,
> which is critical for HPC.

You can have 64 pages that are mapped by any number of processes. You can
map a single page into multiple processes, which may be a requirement if
you want to implement your own synchronization primitives, as the mention
of semaphores suggests.

> Vendor driver still allocates and maps at 4KB page granularity. But in
> case the HW device supports less than 4KB, then the HW must provide the
> required protection.

The OS needs to provide the protection if that is the case, and then it
probably is not an HPC device. We are talking about high-end RDMA devices
here. The design here is for performance and low latency. I don't know of
any devices in use in HPC or in HFT that have these tiny memory sizes.
Mostly these devices are already engineered for mmapping. Is this for some
kind of embedded device?

> Note that the device memory does not necessarily have to be mapped to
> the CPU, i.e. it is not necessarily accessible by PCI, and can only be
> accessed by RDMA. This is why we can't use mmap for all cases and
> dedicated allocation and copy functions are needed.

Can we come up with some sort of ioctl API then to write to the device's
inaccessible memory? There must be other drivers outside of the RDMA tree
that have similar requirements and that may already have implemented some
version of it.

This seems to be a very specialized application that may be device
specific. ioctls are usually used then.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html