On Tue, Jun 22, 2021 at 06:24:28PM +0300, Oded Gabbay wrote:
> On Tue, Jun 22, 2021 at 6:11 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >
> > On Tue, Jun 22, 2021 at 04:12:26PM +0300, Oded Gabbay wrote:
> >
> > > > 1) Setting sg_page to NULL
> > > > 2) 'mapping' pages for P2P DMA without going through the iommu
> > > > 3) Allowing P2P DMA without using the p2p dma API to validate that it
> > > >    can work at all in the first place.
> > > >
> > > > All of these result in functional bugs in certain system
> > > > configurations.
> > > >
> > > > Jason
> > >
> > > Hi Jason,
> > > Thanks for the feedback.
> > > Regarding point 1, why is that a problem if we disable the option to
> > > mmap the dma-buf from user-space ?
> >
> > Userspace has nothing to do with needing struct pages or not
> >
> > Point 1 and 2 mostly go together, you supporting the iommu is not nice
> > if you dont have struct pages.
> >
> > You should study Logan's patches I pointed you at as they are solving
> > exactly this problem.
> Yes, I do need to study them. I agree with you here. It appears I
> have a hole in my understanding. I'm missing the connection between
> iommu support (which I must have of course) and struct pages.

Christian explained what the AMD driver is doing by calling
dma_map_resource(), which is a hacky and slow way of achieving what
Logan's series is doing.

> > No, the design of the dmabuf requires the exporter to do the dma maps
> > and so it is only the exporter that is wrong to omit all the iommu and
> > p2p logic.
> >
> > RDMA is OK today only because nobody has implemented dma buf support
> > in rxe/si - mainly because the only implementations of exporters don't
>
> Can you please educate me, what is rxe/si ?

Sorry, rxe/siw - these are the all-software implementations of RDMA,
and they require the struct page to do a SW memory copy. They can't
implement dmabuf without it.

> ok...
> so how come that patch-set was merged into 5.12 if it's buggy ?

We only implemented true dma devices for RDMA DMABUF support, so it
isn't buggy right now.

> Yes, that's what I expect to see. But I want to see it with my own
> eyes and then figure out how to solve this.

It might be tricky to test because you have to ensure the iommu is
turned on and has a non-identity page table. Basically, if it doesn't
trigger an IOMMU failure then the IOMMU isn't set up properly.

Jason