On Fri, May 18, 2018 at 8:24 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> On Fri, May 18, 2018 at 07:33:41PM -0700, John Hubbard wrote:
>> On 05/18/2018 01:23 PM, Dan Williams wrote:
>> > On Fri, May 18, 2018 at 10:36 AM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>> >> On Fri, May 18, 2018 at 04:47:48PM +0000, Christopher Lameter wrote:
>> >>> On Fri, 18 May 2018, Jason Gunthorpe wrote:
>> >>>
>> >>>
>> >>> The newcomer here is RDMA. The FS side is the mainstream use case and has
>> >>> been there since Unix learned to do paging.
>> >>
>> >> Well, it has been this way for 12 years, so it isn't that new.
>> >>
>> >> Honestly it sounds like get_user_pages is just a broken Linux
>> >> API??
>> >>
>> >> Nothing can use it to write to pages because the FS could explode -
>> >> RDMA makes it particularly easy to trigger this due to the longer time
>> >> windows, but presumably any get_user_pages could generate a race and
>> >> hit this? Is that right?
>>
>> +1, and I am now super-interested in this conversation, because
>> after tracking down a kernel BUG to this classic mistaken pattern:
>>
>>     get_user_pages (on file-backed memory from ext4)
>>     ...do some DMA
>>     set_pages_dirty
>>     put_page(s)
>
> Ummm, RDMA has done essentially that since 2005, since when did it
> become wrong? Do you have some references? Is there some alternative?
>
> See __ib_umem_release
>
>> ...there is (rarely!) a backtrace from ext4, that disavows ownership of
>> any such pages.
>
> Yes, I've seen that oops with RDMA, apparently isn't actually that
> rare if you tweak things just right.
>
> I thought it was an obscure ext4 bug :(
>
>> Because the obvious "fix" in device driver land is to use a dedicated
>> buffer for DMA, and copy to the filesystem buffer, and of course I will
>> get *killed* if I propose such a performance-killing approach. But a
>> core kernel fix really is starting to sound attractive.
>
> Yeah, killed is right. That idea totally cripples RDMA.
>
> What is the point of get_user_pages FOLL_WRITE if you can't write to
> and dirty the pages!?!
>

You're oversimplifying the problem, here are the details:

https://www.spinics.net/lists/linux-mm/msg142700.html