Hi Dominique,

> -----Original Message-----
> From: Dominique Martinet [mailto:dominique.martinet@xxxxxx]
> Sent: Thursday, April 02, 2015 8:44 PM
> To: Shachar Raindel
> Subject: Re: [oss-security] RE: CVE-2014-8159 kernel: infiniband:
> uverbs: unprotected physical memory access
>
> Hi,
>
> Shachar Raindel wrote on Thu, Apr 02, 2015:
> > From: Yann Droneaud [mailto:ydroneaud@xxxxxxxxxx]
> > > Another related question: as the large memory range could be registered
> > > by user space with ibv_reg_mr(pd, base, size, IB_ACCESS_ON_DEMAND),
> > > what's prevent the kernel to map a file as the result of mmap(0, ...)
> > > in this region, making it available remotely through IBV_WR_RDMA_READ /
> > > IBV_WR_RDMA_WRITE ?
> > >
> >
> > This is not a bug. This is a feature.
> >
> > Exposing a file through RDMA, using ODP, can be done exactly like this.
> > Given that the application explicitly requested this behavior, I don't
> > see why it is a problem. Actually, some of our tests use such flows.
> > The mmu notifiers mechanism allow us to do this safely. When the page is
> > written back to disk, it is removed from the ODP mapping. When it is
> > accessed by the HCA, it is brought back to RAM.
>
> Forgive the private reply, but I've actually tried that a while ago and

No problem, better than e-mailing the entire world on RDMA specific topics.
As this discussion is relevant to the Linux RDMA users, I'm adding back
the linux-rdma mailing list.

> couldn't get it to work - ibv_reg_mr would return EINVAL on an address
> obtained by mmap.

Were you mmaping a normal disk file, or was the mmap targeting an MMIO of
another hardware device? mmap of a normal disk file should work also with
normal memory registration, assuming you are providing the proper length.

mmap of the MMIO area of another hardware device (i.e. interfacing an FPGA,
NVRAM, or similar things) requires some code changes on both sides.
The current kernel code on the infiniband side uses get_user_pages,
which does not support MMIO pages. The proposed PeerDirect patches [1]
allow a peer device to declare ownership of virtual address ranges, and
enable such registration. However, these patches have not yet been
merged upstream.

>
> Conceptually as well I'm not sure how it's supposed to work, mmap should
> only actually issue reads when memory access issues page faults (give or
> take suitable readahead logic), but I don't think direct memory access
> from the IB/RDMA adapter would issue such page faults ?

You are correct. RDMA adapters without ODP support do not issue page faults.
Instead, during memory registration, the ib_umem code calls get_user_pages,
which ensures all relevant pages are in memory, and pins them as needed.

> Likewise on writes, would need the kernel to notice memory has been
> written and pages are dirty/needs flushing.
>

Similarly, when deregistering a writable memory region, the kernel driver
marks all pages as dirty before unpinning them. You can see the code doing
this in [2].

> So, what am I missing? I'd wager it's that "ODP" you're talking about,
> do you have any documentation I could skim through ?
>

Liran Liss gave a presentation about ODP at OFA [3]. The technology is
available for ConnectIB devices using the most recent firmware and kernel
versions above 3.19.

Thanks,
--Shachar

[1] http://www.spinics.net/lists/linux-rdma/msg21770.html
[2] http://lxr.free-electrons.com/source/drivers/infiniband/core/umem.c#L62
[3] https://www.openfabrics.org/images/Workshops_2014/DevWorkshop/presos/Tuesday/pdf/09.30_2014_OFA_Workshop_ODP_update_final.pdf
    and https://www.youtube.com/watch?v=KbrlsXQbHCw
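
P.S. For the regular-file case above, here is a rough user-space sketch of
what "providing the proper length" could look like. It is an illustration
only, not code from our tests: map_file() and struct file_map are
hypothetical names, and the sketch stops short of touching any verbs object
so that it builds without libibverbs.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type (not part of libibverbs): the address and the
 * exact byte length of a file mapping. */
struct file_map {
    void  *addr;  /* start of the mapping; mmap returns it page aligned */
    size_t len;   /* file size in bytes, i.e. the length to register */
};

/* Map a whole regular file MAP_SHARED for read/write.
 * Returns 0 on success, -1 on failure (missing or empty file, mmap error). */
static int map_file(const char *path, struct file_map *out)
{
    struct stat st;
    void *p;
    int fd;

    assert(path != NULL && out != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* A zero-length mapping is invalid, so reject empty files up front. */
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
             MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    out->addr = p;
    out->len  = (size_t)st.st_size;
    return 0;
}
```

With a protection domain pd already allocated, the result would then be
handed to ibv_reg_mr(pd, m.addr, m.len, access), where access is something
like IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ, plus
IBV_ACCESS_ON_DEMAND for an ODP registration. Whether that call succeeds
still depends on the device, firmware and kernel in use.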