On 17/02/2016 10:44, Christoph Hellwig wrote:
> That doesn't change how they are managed. We've always supported mapping
> BARs to userspace in various drivers, and the only real news with things
> like the pmem driver with DAX or some of the things people want to do
> with the NVMe controller memory buffer is that there are much bigger
> quantities of it, and:
>
> a) people want to be able to have cachable mappings of various kinds
> instead of the old uncachable default.

What if we do want an uncachable mapping for our device's BAR? Can we
still expose it under ZONE_DEVICE?

> b) we want to be able to DMA (including RDMA) to the regions in the
> BARs.
>
> a) is something that needs smaller amounts of work in all kinds of areas
> to be done properly, but in principle GPU drivers have been doing this
> forever using all kinds of hacks.
>
> b) is the real issue. The Linux DMA support code doesn't really operate
> on just physical addresses, but on page structures, and we don't
> allocate those for BARs. We investigated two ways to address this:
> 1) allow DMA operations without struct page and 2) create struct page
> structures for BARs that we want to be able to use DMA operations on.
> For various reasons version 2) was favored and this is how we ended up
> with ZONE_DEVICE. Read the linux-mm and linux-nvdimm lists for the
> lengthy discussions of how we ended up here.

I was wondering what your thoughts are regarding the other questions we
raised about ZONE_DEVICE. How can we overcome the section-alignment
requirement in the current code? Our HCA's BARs are usually smaller than
128MB.

Sagi also asked how a peer device that got a ZONE_DEVICE page should
know that it has to stop using it (the CMB example).

Regards,
Haggai
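
P.S. To make the alignment question concrete, below is a rough sketch of
what we would like to be able to do for one of our HCA BARs. This is
illustration only: it assumes the 4.5-era
devm_memremap_pages(dev, res, ref, altmap) signature, the helper name and
the percpu_ref plumbing are made up, and the explicit check just spells
out the 128MB section constraint we keep running into.

#include <linux/pci.h>
#include <linux/memremap.h>
#include <linux/mmzone.h>
#include <linux/err.h>

/*
 * Hypothetical helper: hand a PCI BAR to ZONE_DEVICE so that struct
 * pages exist for it and it can be used as a DMA/RDMA target.
 */
static void *hca_map_bar_pages(struct pci_dev *pdev, int bar,
			       struct percpu_ref *ref)
{
	struct resource *res = &pdev->resource[bar];

	/*
	 * The sparse-memory hotplug path wants section-aligned ranges
	 * (128MB sections on x86_64), which our BARs typically are not.
	 */
	if (!IS_ALIGNED(res->start, 1UL << PA_SECTION_SHIFT) ||
	    !IS_ALIGNED(resource_size(res), 1UL << PA_SECTION_SHIFT))
		return ERR_PTR(-EINVAL);

	/* Creates struct pages covering the BAR and returns a kernel mapping. */
	return devm_memremap_pages(&pdev->dev, res, ref, NULL);
}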