Re: [RFC] Heterogeneous memory management (mirror process address space on a device mmu).

On Tue, May 06, 2014 at 09:32:16AM -0700, Linus Torvalds wrote:
> On Tue, May 6, 2014 at 9:18 AM, Jerome Glisse <j.glisse@xxxxxxxxx> wrote:
> >
> > I do understand that. I was pointing out that if I move to the TLB
> > approach, which I am fine with, I will still need to sleep there.
> 
> No can do. The TLB flushing itself is called with a spinlock held, and
> we need to continue to do that.
> 
> Why do you really need to sleep? Because that sounds bogus.
> 
> What you *should* do is send the flush message, and not wait for any
> reply. You can then possibly wait for the result later on: we already
> have this multi-stage TLB flush model (using the "mmu_gather"
> structure) that has three phases:
> 
>  - create mmu_gather (allocate space for batching etc). This can sleep.
>  - do the actual flushing (possibly multiple times). This is the
> "synchronous with the VM" part and cannot sleep.
>  - tear down the mmu_gather data structures and actually free the
> pages we batched. This can sleep.
> 
> and what I think a GPU flush has to do is to do the actual flushes
> when asked to (because that's what it will need to do to work with a
> real TLB eventually), but if there's some crazy asynchronous
> acknowledge thing from hardware, it's possible to perhaps wait for
> that in the final phase (*before* we free the pages we gathered).

The plan I had in mind was to atomically add an item from the mmu notifier
to schedule the work on the GPU, and then have the TLB code wait for the
GPU to acknowledge that it has updated its page table and is done using
those pages. This would happen in tlb_flush_mmu.
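
Roughly, something like the following sketch (my_mirror and its helpers
are made-up names, just to illustrate the split between the atomic
scheduling in the notifier and the later wait for the GPU acknowledge):

/*
 * Sketch only: "my_mirror" and its helpers are hypothetical, not part
 * of this patchset.  The point is the split: the notifier only queues
 * the GPU invalidation (atomic), the wait happens later from the TLB
 * flush side once sleeping is allowed.
 */
#include <linux/kernel.h>
#include <linux/mmu_notifier.h>
#include <linux/spinlock.h>
#include <linux/list.h>
#include <linux/wait.h>

struct my_mirror {
	struct mmu_notifier	notifier;
	spinlock_t		lock;		/* protects pending */
	struct list_head	pending;	/* invalidations queued to the GPU */
	wait_queue_head_t	wait;		/* woken when the GPU acknowledges */
};

/* Hypothetical helper, implemented elsewhere: adds one item to the
 * pending list and kicks the GPU; safe to call from atomic context. */
void my_mirror_queue_invalidate(struct my_mirror *mirror,
				unsigned long start, unsigned long end);

/* mmu notifier callback: must not sleep, only schedule the GPU work. */
static void my_mirror_invalidate_range_start(struct mmu_notifier *mn,
					     struct mm_struct *mm,
					     unsigned long start,
					     unsigned long end)
{
	struct my_mirror *mirror = container_of(mn, struct my_mirror, notifier);

	my_mirror_queue_invalidate(mirror, start, end);
}

/* Called from the TLB flush side once sleeping is allowed: wait for the
 * GPU to acknowledge that it updated its page table and is done with
 * the pages, before those pages are freed. */
static void my_mirror_wait_acknowledge(struct my_mirror *mirror)
{
	wait_event(mirror->wait, list_empty(&mirror->pending));
}

static const struct mmu_notifier_ops my_mirror_ops = {
	.invalidate_range_start	= my_mirror_invalidate_range_start,
};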

> 
> Now, such an asynchronous model had better not mark page tables dirty
> after we flushed (we'd lose that information), but quite frankly,
> anything that is remote enough to need some async flush thing cannot
> sanely be close enough to be closely tied to the actual real page
> tables, so I don't think we need to care.

That is the issue: as soon as I schedule the work (read: as early as I
can), the GPU can report any of the pages as dirty, and it may only be
able to do so once we wait for it in tlb_flush_mmu.
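
To illustrate the constraint (sketch only, my_mirror_test_and_clear_dirty()
is a made-up helper): by the time the GPU acknowledges, the CPU page table
entries are already cleared, so any dirty state the GPU reports would have
to be applied to struct page directly, before the gathered pages are freed:

#include <linux/mm.h>

/* Hypothetical helper: did the GPU mark this page dirty? */
bool my_mirror_test_and_clear_dirty(struct my_mirror *mirror,
				    struct page *page);

static void my_mirror_transfer_dirty(struct my_mirror *mirror,
				     struct page **pages,
				     unsigned long npages)
{
	unsigned long i;

	for (i = 0; i < npages; i++) {
		/* PTEs are gone, so dirty state goes to struct page. */
		if (my_mirror_test_and_clear_dirty(mirror, pages[i]))
			set_page_dirty(pages[i]);
	}
}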

> 
> Anyway, I really think that the existing mmu_gather model *should*
> work fine for this all. It may be a bit inconvenient for crazy
> hardware, but the important part is that it definitely should work for
> any future hardware that actually gets this right.
> 

I stress again that GPUs with dedicated memory are not going away; on the
contrary, you might see more dedicated memory that is not accessible from
the CPU.

> It does likely involve adding some kind of device list to "struct
> mm_struct", and I'm sure there's some extension to "struct mmu_gather"
> too, but _conceptually_ it should all be reasonably non-invasive.
> 
> Knock wood.
> 
>             Linus

I will port this over to piggyback on mmu_gather and the other TLB flush
paths. I will post as soon as I have something that works with the
features this patchset has.
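
For reference, the shape I have in mind when piggybacking on mmu_gather
(sketch; the my_mirror_* calls are commented-out placeholders, just to
show where they would sit in the existing three phase flow):

#include <linux/mm.h>
#include <asm/tlb.h>

static void example_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
			  unsigned long start, unsigned long end)
{
	struct mmu_gather tlb;

	/* Phase 1: set up batching; this phase can sleep. */
	tlb_gather_mmu(&tlb, mm, start, end);

	/*
	 * Phase 2: clear page table entries and flush; synchronous with
	 * the VM, cannot sleep.  The device invalidation would only be
	 * kicked off asynchronously from here.
	 */
	unmap_vmas(&tlb, vma, start, end);
	/* my_mirror_queue_invalidate(...); */

	/*
	 * Phase 3: tear down the mmu_gather and free the batched pages;
	 * this can sleep, so the wait for the device acknowledge (and any
	 * dirty state transfer) would go before the gathered pages are
	 * actually freed.
	 */
	/* my_mirror_wait_acknowledge(...); */
	/* my_mirror_transfer_dirty(...); */
	tlb_finish_mmu(&tlb, start, end);
}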

Cheers,
Jérôme

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@xxxxxxxxx
