Re: [LSFMM] RDMA data corruption potential during FS writeback

On Fri, May 18, 2018 at 07:33:41PM -0700, John Hubbard wrote:
> On 05/18/2018 01:23 PM, Dan Williams wrote:
> > On Fri, May 18, 2018 at 10:36 AM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >> On Fri, May 18, 2018 at 04:47:48PM +0000, Christopher Lameter wrote:
> >>> On Fri, 18 May 2018, Jason Gunthorpe wrote:
> >>>
> >>>
> >>> The newcomer here is RDMA. The FS side is the mainstream use case and has
> >>> been there since Unix learned to do paging.
> >>
> >> Well, it has been this way for 12 years, so it isn't that new.
> >>
> >> Honestly it sounds like get_user_pages is just a broken Linux
> >> API??
> >>
> >> Nothing can use it to write to pages because the FS could explode -
> >> RDMA makes it particularly easy to trigger this due to the longer time
> >> windows, but presumably any get_user_pages could generate a race and
> >> hit this? Is that right?
> 
> +1, and I am now super-interested in this conversation, because
> after tracking down a kernel BUG to this classic mistaken pattern:
> 
>     get_user_pages (on file-backed memory from ext4)
>     ...do some DMA
>     set_pages_dirty
>     put_page(s)

Ummm, RDMA has done essentially that since 2005, since when did it
become wrong? Do you have some references? Is there some alternative?

See __ib_umem_release

> ...there is (rarely!) a backtrace from ext4, that disavows ownership of
> any such pages.

Yes, I've seen that oops with RDMA; apparently it isn't actually that
rare if you tweak things just right.

I thought it was an obscure ext4 bug :(

> Because the obvious "fix" in device driver land is to use a dedicated
> buffer for DMA, and copy to the filesystem buffer, and of course I will
> get *killed* if I propose such a performance-killing approach. But a
> core kernel fix really is starting to sound attractive.

Yeah, killed is right. That idea totally cripples RDMA.

What is the point of get_user_pages FOLL_WRITE if you can't write to
and dirty the pages!?!

Jason
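
The ordering problem in the quoted get_user_pages / DMA / dirty / put_page
pattern can be sketched as a small userspace C model. This is only a toy
under stated assumptions, not kernel code: struct toy_page, its fields, and
every toy_* function are hypothetical stand-ins, and the model assumes that
a filesystem may clean a page and drop its per-page state while the page is
still pinned, which is how the thread describes the ext4 backtrace.

```c
#include <stdbool.h>

/* Toy stand-in for struct page; these are not real kernel fields. */
struct toy_page {
    int  refcount;     /* references held, e.g. by a get_user_pages-style pin */
    bool dirty;        /* models the page's dirty flag */
    bool has_buffers;  /* models the filesystem's per-page bookkeeping */
};

/* Models get_user_pages(): the pin only keeps the page from being freed. */
static void toy_pin(struct toy_page *p)
{
    p->refcount++;
}

/* Models put_page(). */
static void toy_unpin(struct toy_page *p)
{
    p->refcount--;
}

/*
 * Models writeback followed by the filesystem dropping its per-page state.
 * Assumption: the extra reference taken by toy_pin() does not prevent this.
 */
static void toy_writeback_and_release(struct toy_page *p)
{
    p->dirty = false;
    p->has_buffers = false;
}

/*
 * Models dirtying the page after DMA has finished. Returns false when the
 * filesystem no longer has state for the page, which is the case where the
 * model's "ext4" would complain.
 */
static bool toy_set_dirty(struct toy_page *p)
{
    p->dirty = true;
    return p->has_buffers;
}

/* Runs the quoted pattern; returns true if the late dirtying was safe. */
static bool toy_dma_write(bool writeback_during_dma)
{
    struct toy_page page = { .refcount = 1, .dirty = true, .has_buffers = true };
    bool ok;

    toy_pin(&page);                        /* get_user_pages            */
    if (writeback_during_dma)
        toy_writeback_and_release(&page);  /* FS acts during a long DMA */
    ok = toy_set_dirty(&page);             /* dirty the page afterwards */
    toy_unpin(&page);                      /* put_page                  */
    return ok;
}
```

In this model toy_dma_write(false) reports a safe dirtying and
toy_dma_write(true) does not. The only difference is whether writeback ran
inside the pin window, which is why longer-lived RDMA registrations make
the race easier to hit.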



