On 2018-02-14 01:33 PM, Christian König wrote:
> On 14.02.2018 at 19:24, Felix Kuehling wrote:
>> On 2018-02-14 01:15 PM, Christian König wrote:
>>> On 14.02.2018 at 17:35, Felix Kuehling wrote:
>>>> [SNIP]
>>>>>>>>> I don't see how you can have separate TTM objects referring to
>>>>>>>>> the same memory.
>>>>>>>> Well, that is trivial, we do this all the time with prime and
>>>>>>>> I+A laptops.
>>>>>>> As I understand it, you use DMABuf to export/import buffers on
>>>>>>> multiple devices. I believe all devices share a single amdgpu_bo,
>>>>>>> which contains the ttm_buffer_object.
>>>>> That's incorrect as well. Normally multiple devices have multiple
>>>>> ttm_buffer_objects, one for each device.
>>>>> Looking at it from a higher level, that actually makes sense,
>>>>> because the status of each BO is different for each device. E.g.
>>>>> one device could have the BO in use while it is idle on another
>>>>> device.
>>>> Can you point me to where this is done? I'm looking at
>>>> amdgpu_gem_prime_foreign_bo. It is used if an AMDGPU BO is imported
>>>> into a different AMDGPU device. It creates a new GEM object with a
>>>> reference to the same amdgpu BO (gobj->bo = amdgpu_bo_ref(bo)). To
>>>> me this looks very much like the same amdgpu_bo, and consequently
>>>> the same TTM BO, being shared by two GEM objects and two devices.
>>> As Michel pointed out as well, that stuff isn't upstream, and judging
>>> from the recent requirements it will never go upstream.
>>>
>>>>> If this is enabled by any changes that break existing buffer
>>>>> sharing for A+A or A+I systems, please point it out to me. I'm not
>>>>> aware that this patch series does anything to that effect.
>>>>> As I said, it completely breaks scanout with A+I systems.
>>>> Please tell me what "it" is. What in the changes I have posted is
>>>> breaking A+I systems? I don't see it.
>>> Using the same amdgpu_bo structure with multiple devices is what
>>> "it" means here.
>> That statement seems to contradict this previous statement by you:
>>>>>   What we do is map the same
>>>>> BO in VMs on other devices. It has no effect on GART mappings.
>>> Correct, VM mapping is unaffected here.
>> Can you clarify that contradiction? Is it OK for us to map the same
>> BO in multiple VMs or not?
>
> By the current requirements I have, I think the answer is no.
>
>>> As I said, that concept is incompatible with the requirements on A+A
>>> systems, so we need to find another solution to provide the
>>> functionality.
>> Do you mean you need to find another solution for A+A buffer sharing
>> specifically? Or is this a more general statement that includes the
>> mapping of BOs to multiple VMs on different devices?
>
> A more general statement. We need to find a solution which works for
> everybody, not something that works like this in KFD but breaks A+A
> buffer sharing and so needs to be disabled there.

Well, KFD sharing system memory BOs between GPUs doesn't break A+A.
Implementing a solution for A+A that involves DMABufs will not affect
KFD. And KFD isn't actually broken as far as I know.

Once you have a solution for A+A, maybe it will help me understand the
problem and I can evaluate whether the solution is applicable to KFD
and worth adopting. But for now I have neither a good understanding of
the problem, nor evidence that there is a problem affecting KFD, nor a
way towards a solution.

>
>>
>>> What's on my TODO list anyway is to extend DMA-buf to not require
>>> pinning and to be able to deal with P2P.
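
Let me also try to be concrete about what I mean by mapping the same BO
into multiple VMs, since I think we're partly talking past each other.
What the KFD patches do is conceptually along the lines of the sketch
below (locking and error handling omitted; the wrapper name and the
flags are made up for this mail, only amdgpu_vm_bo_add/_map are real):

#include "amdgpu.h"

/* Map one shared system-memory amdgpu_bo into the VM of one GPU.
 * Called once per (device, VM) pair that needs access to the BO. */
static int kfd_map_bo_to_gpu(struct amdgpu_device *adev,
                             struct amdgpu_vm *vm,
                             struct amdgpu_bo *bo,
                             uint64_t va, uint64_t size)
{
        struct amdgpu_bo_va *bo_va;

        /* Per-device, per-VM bookkeeping for the shared BO */
        bo_va = amdgpu_vm_bo_add(adev, vm, bo);
        if (!bo_va)
                return -ENOMEM;

        /* Map the whole BO read/write at the requested GPU VA */
        return amdgpu_vm_bo_map(adev, bo_va, va, 0, size,
                                AMDGPU_PTE_READABLE |
                                AMDGPU_PTE_WRITEABLE);
}

The amdgpu_bo (and its ttm_buffer_object) stays a single GTT BO; only
the amdgpu_bo_va structures are per device/VM, and GART mappings are
not touched at all.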
>> Sounds good. That said, KFD is not using DMABufs here.
>>
>>> The former is actually rather easy and already mostly done by
>>> sharing the reservation object between exporter and importer.
>>>
>>> The latter is a bit more tricky because I need to create the
>>> necessary P2P infrastructure, but even that is doable in the mid
>>> term.
>> The sooner you can share your plans, the better. Right now I'm in a
>> bit of limbo. I feel you're blocking KFD upstreaming based on AMDGPU
>> plans and changes that no one has seen yet.
>
> Well, as far as I understand it, that is not blocking the current
> upstreaming, because you didn't plan to upstream this use case anyway,
> did you?

Which use case? The current patch series enables multi-GPU buffer
sharing of system memory BOs. If it is actually broken, I can reduce
the scope to single-GPU support. But I have no evidence that multi-GPU
is actually broken.

Regards,
  Felix

>
> Regards,
> Christian.
>
>>
>> Thanks,
>>   Felix
>>
>>> Regards,
>>> Christian.
>
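
P.S.: So that I'm not guessing until your patches show up, my reading
of the "share the reservation object between exporter and importer"
idea is roughly the following (purely illustrative, not a patch; the
struct and helper names are made up, only the reservation_object API is
real):

#include <linux/reservation.h>
#include <linux/sched.h>        /* MAX_SCHEDULE_TIMEOUT */

/* Stand-in for the importer's buffer object, made up for this mail. */
struct importer_bo {
        struct reservation_object *resv;
};

/* Instead of giving the imported BO its own reservation object, reuse
 * the exporter's, so both devices serialize on the same fence list. */
static void importer_adopt_exporter_resv(struct importer_bo *ibo,
                                         struct reservation_object *resv)
{
        ibo->resv = resv;
}

/* Before the importer touches the memory, wait on the shared resv;
 * any fence the exporter added (e.g. a pending move) is seen here. */
static long importer_sync_to_exporter(struct importer_bo *ibo)
{
        return reservation_object_wait_timeout_rcu(ibo->resv,
                                                   true,  /* wait_all */
                                                   false, /* !intr */
                                                   MAX_SCHEDULE_TIMEOUT);
}

If that's roughly the direction, let me know and I can look at how it
would interact with the KFD patches once your changes are posted.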