On 2018-02-13 01:45 PM, Christian König wrote:
> On 13.02.2018 at 17:56, Felix Kuehling wrote:
>> [SNIP]
>> Each process gets a whole page of the doorbell aperture assigned to it.
>> The assumption is that amdgpu only uses the first page of the doorbell
>> aperture, so KFD uses all the rest. On GFX8 and before, the queue ID is
>> used as the offset into the doorbell page. On GFX9 the hardware does
>> some engine-specific doorbell routing, so we added another layer of
>> doorbell management that's decoupled from the queue ID.
>>
>> Either way, an entire doorbell page gets mapped into user mode and user
>> mode knows the offset of the doorbells for specific queues. The mapping
>> is currently handled by kfd_mmap in kfd_chardev.c.
>
> Ok, wait a second. Taking a look at kfd_doorbell_mmap() it almost
> looks like you map different doorbells with the same offset depending
> on which process is calling this.
>
> Is that correct? If yes then that would be illegal and a problem if
> I'm not completely mistaken.

Why is that a problem? Each process has its own file descriptor. The
mapping is done using io_remap_pfn_range in kfd_doorbell_mmap. This is
nothing new. It's been done like this forever, even on Kaveri and Carrizo.

>
>>> Do you simply assume that after evicting a process it always needs to
>>> be restarted without checking if it actually does something? Or how
>>> does that work?
>> Exactly.
>
> Ok, understood. Well that limits the usefulness of the whole eviction
> drastically.
>
>> With later addition of GPU self-dispatch a page-fault based
>> mechanism wouldn't work any more. We have to restart the queues blindly
>> with a timer. See evict_process_worker, which schedules the restore with
>> a delayed worker.
>>
>> The user mode queue ABI specifies that user mode update both the
>> doorbell and a WPTR in memory. When we restart queues we (or the CP
>> firmware) use the WPTR to make sure we catch up with any work that was
>> submitted while the queues were unmapped.
>
> Putting cross process work dispatch aside for a moment, GPU
> self-dispatch works only when there is work running on the GPU.
>
> So you can still check if there is some work pending after you
> unmapped everything and only restart the queues when there is new work
> based on the page fault.
>
> In other words either there is work pending and it doesn't matter if
> it was sent by the GPU or by the CPU, or there is no work pending and
> we can delay restarting everything until there is.

That sounds like a useful optimization.

Regards,
  Felix

>
> Regards,
> Christian.
>
>>
>> Regards,
>>   Felix
>>
>>> Regards,
>>> Christian.
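
For illustration only, here is a minimal sketch of the optimization discussed above: after eviction, check whether any queue still has work pending, and only restart immediately if so. This is not KFD code. The struct, its rptr field, and both function names are hypothetical, and it assumes that "work pending" can be approximated by comparing the in-memory WPTR against a read pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a user mode queue (not the KFD struct). */
struct sketch_queue {
	uint64_t wptr; /* write pointer that user mode updates in memory */
	uint64_t rptr; /* assumed read pointer: how far the GPU has consumed */
};

/* A queue has pending work when the write pointer differs from the read pointer. */
static bool sketch_queue_has_work(const struct sketch_queue *q)
{
	assert(q != NULL);
	return q->wptr != q->rptr;
}

/*
 * After unmapping all queues of an evicted process, decide whether the
 * queues must be restarted right away (true) or whether the restart could
 * be deferred until new work arrives, e.g. via a doorbell page fault (false).
 */
static bool sketch_needs_restore(const struct sketch_queue *queues, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (sketch_queue_has_work(&queues[i]))
			return true;
	}
	return false;
}
```

If sketch_needs_restore() returns false, nothing was submitted by either the CPU or the GPU, so in this sketch the restore could wait instead of being scheduled blindly with a timer.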