Re: [LSFMM] RDMA data corruption potential during FS writeback

On Fri, May 18, 2018 at 04:47:48PM +0000, Christopher Lameter wrote:
> On Fri, 18 May 2018, Jason Gunthorpe wrote:
> 
> > > The solution that was proposed at the meeting was that mmu notifiers can
> > > remedy that situation by allowing callbacks to the RDMA device to ensure
> > > that the RDMA device and the filesystem do not do concurrent writeback.
> >
> > This keeps coming up, and I understand why it seems appealing from the
> > MM side, but the reality is that very little RDMA hardware supports
> > this, and it carries with it a fairly big performance penalty so many
> > users don't like using it.
> 
> Ok so we have a latent data corruption issue that is not being addressed.
> 
> > > But could we do more to prevent issues here? I think what may be useful is
> > > to not allow the memory registrations of file-backed writable mappings
> > > unless the device driver provides mmu callbacks or something like that.
> >
> > Why does every proposed solution to this involve crippling RDMA? Are
> > there really no ideas to allow the FS side to accommodate
> > this use case??
> 
> The newcomer here is RDMA. The FS side is the mainstream use case and has
> been there since Unix learned to do paging.

Well, it has been this way for 12 years, so it isn't that new.

Honestly it sounds like get_user_pages is just a broken Linux
API??

Nothing can use it to write to pages because the FS could explode -
RDMA makes it particularly easy to trigger this due to the longer time
windows, but presumably any get_user_pages could generate a race and
hit this? Is that right?

I am left with the impression that solving it in the FS is too
costly for performance, so the FS doesn't want that overhead? Was that
also the conclusion?

Could we take another crack at this during Linux Plumbers? Will the MM
parties be there too? I'm sorry I wasn't able to attend LSFMM this
year!

> > > There may even be more issues if DAX is being used but the FS writeback
> > > has the potential of biting anyone at this point it seems.
> >
> > I think Dan already 'solved' this via get_user_pages_longterm which
> > just fails for DAX backed pages.
> 
> That is indeed crippling and would be killing the ideas that we had around
> here for using DAX. There needs to be an ability to shove large amounts of
> data into memory via RDMA and from there onto a disk without too much of a
> fuss and without copying. In the case of DAX this trivially should avoid
> the copying to disk since it's already in memory. If this does not work
> then the whole thing is really not that performant anymore since it
> requires a copy operation.

AFAIK, if you enable ODP on your MR then DAX will work as you want,
but you take lower network performance to get it. You might be the
first person to test this though ;)

Jason



