On 2019-04-03 1:24 p.m., Koenig, Christian wrote:
> On 01.04.19 at 20:58, Kuehling, Felix wrote:
>> On 2019-04-01 2:03 p.m., Christian König wrote:
>>> On 01.04.19 at 19:59, Kuehling, Felix wrote:
>>>> On 2019-04-01 7:23 a.m., Christian König wrote:
>>>>> On 30.03.19 at 01:41, Kuehling, Felix wrote:
>>>>>> Patches 1-3 are Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
>>>>> Thanks.
>>>>>
>>>>>> About the direct mode, that removes a bunch of synchronization, so it
>>>>>> must make some assumptions about the state of the page tables. What
>>>>>> makes that safe?
>>>>> Direct mode is only supposed to be used during page fault handling.
>>>>>
>>>>> E.g. we know that the page tables are in the correct place in this
>>>>> situation because the hardware is hammering on a PTE and waiting for
>>>>> it to become valid.
>>>> A fence could also indicate a concurrent modification of the page table.
>>>> For example a PTB may be allocated and initialized concurrently, not in
>>>> direct mode. Would direct mode need to wait for a fence that indicates
>>>> completion of the PTB initialization? Or do we have some way to ensure
>>>> such concurrent allocation and initialization of a PTB cannot happen?
>>> Yeah, that is a very good question I haven't solved yet either.
>>>
>>> My currently best idea is to separate the address space, e.g. use the
>>> lower address space for on-demand paging and the higher one with classic
>>> pre-filled page tables for the MM and display engines.
>> That may work for graphics, but doesn't work for KFD. I need the ability
>> to mix pre-filled page tables with HMM in the same SVM address space.
> Even after thinking about it for multiple days I can't offhand find a
> way to make this work.
>
>> That's why I was thinking that all page table updates for a given VM
>> would need to use the same method.
> Well what exactly do you mean with that? Essentially there are two methods:
>
> 1. Pre-fill the page tables before accessing them with the hardware.
>
> 2. Fill on demand with page faults.
>
> I don't think we can mix those two methods together in the same address
> range.

That's what I was hoping to do. For example, an application could use "old"
BO-based memory management APIs that pre-fill page tables together with
"new" HMM-based memory management APIs that rely on page faults. Those may
be different libraries written in different languages running in the same
application. E.g. a GPU BLAS implementation that's optimized and uses
old-style memory allocations, linked into an OpenMP application that relies
on HMM.

If that's not possible, I'd need to emulate all the old memory APIs on top
of HMM. I was hoping to avoid that.

Even when page faults are enabled, we want to be able to pre-fault stuff to
avoid the performance hit on the first access. Are you saying that won't be
possible?

Regards,
  Felix

> E.g. we can say to use pre-fill for MM engines in the upper range and
> on-demand filling in the lower range, but we can't mix them.
>
> Regards,
> Christian.
>
>> Regards,
>> Felix
>>
>>> Christian.
>>>
>>>> Regards,
>>>> Felix
>>>>
>>>>
>>>>> Christian.
>>>>>
>>>>>> Is it safe to use direct mode on a
>>>>>> per-page-table-update basis? Or do all page table updates have to go
>>>>>> through direct mode to avoid hazards? If yes, then maybe this should be
>>>>>> a property of the VM rather than a parameter that gets passed to a bunch
>>>>>> of function calls.
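To make the two alternatives in that question concrete, here is a minimal
sketch, assuming hypothetical names; none of the identifiers below are taken
from the actual amdgpu code:

    #include <stdbool.h>
    #include <stdint.h>

    struct example_device;
    struct example_bo;

    /* Option A: every page table update call carries the flag explicitly. */
    int example_vm_update(struct example_device *adev, struct example_bo *bo,
                          uint64_t addr, unsigned int count, bool direct);

    /* Option B: the flag is recorded once as a property of the VM when page
     * fault handling is enabled; the update paths read it instead of taking
     * a parameter, so it cannot differ between individual updates on that VM.
     */
    struct example_vm {
        bool direct_updates;  /* true: submit updates without syncing to fences */
    };

With option B, a VM that handles retry faults would use direct updates
consistently, which sidesteps the per-call hazard of mixing synced and
unsynced updates on the same page tables.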
>>>>>>
>>>>>> Regards,
>>>>>>   Felix
>>>>>>
>>>>>> On 2019-03-29 6:45 a.m., Christian König wrote:
>>>>>>> Otherwise we don't correctly use translate further.
>>>>>>>
>>>>>>> Signed-off-by: Christian König <christian.koenig@xxxxxxx>
>>>>>>> ---
>>>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 ++++++++-----
>>>>>>>   1 file changed, 8 insertions(+), 5 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> index 3d221f044183..059d9802e713 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> @@ -767,14 +767,17 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
>>>>>>>      addr = 0;
>>>>>>>      if (ats_entries) {
>>>>>>> -        uint64_t ats_value;
>>>>>>> +        uint64_t value = 0, flags;
>>>>>>>
>>>>>>> -        ats_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>> -        if (level != AMDGPU_VM_PTB)
>>>>>>> -            ats_value |= AMDGPU_PDE_PTE;
>>>>>>> +        flags = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>> +        if (level != AMDGPU_VM_PTB) {
>>>>>>> +            /* Handle leaf PDEs as PTEs */
>>>>>>> +            flags |= AMDGPU_PDE_PTE;
>>>>>>> +            amdgpu_gmc_get_vm_pde(adev, level, &value, &flags);
>>>>>>> +        }
>>>>>>>
>>>>>>>      r = vm->update_funcs->update(&params, bo, addr, 0, ats_entries,
>>>>>>> -                     0, ats_value);
>>>>>>> +                     value, flags);
>>>>>>>      if (r)
>>>>>>>          return r;

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
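For reference, a rough reconstruction of the if (ats_entries) block in
amdgpu_vm_clear_bo() after this hunk is applied, pieced together from the
diff above (surrounding context abbreviated, not verified against the tree):

    addr = 0;
    if (ats_entries) {
        uint64_t value = 0, flags;

        flags = AMDGPU_PTE_DEFAULT_ATC;
        if (level != AMDGPU_VM_PTB) {
            /* Handle leaf PDEs as PTEs */
            flags |= AMDGPU_PDE_PTE;
            amdgpu_gmc_get_vm_pde(adev, level, &value, &flags);
        }

        r = vm->update_funcs->update(&params, bo, addr, 0, ats_entries,
                                     value, flags);
        if (r)
            return r;
        /* ... */
    }

The effective change is that for page directory levels the entry is now run
through amdgpu_gmc_get_vm_pde(), so the ASIC-specific PDE encoding (including
the "translate further" handling mentioned in the commit message) is applied
instead of writing the raw AMDGPU_PTE_DEFAULT_ATC | AMDGPU_PDE_PTE value.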