On 1/24/19 5:02 PM, Julien Grall wrote:
>
>
> On 24/01/2019 14:34, Oleksandr Andrushchenko wrote:
>> Hello, Julien!
>
> Hi,
>
>> On 1/22/19 1:44 PM, Julien Grall wrote:
>>>
>>>
>>> On 1/22/19 10:28 AM, Oleksandr Andrushchenko wrote:
>>>> Hello, Julien!
>>>
>>> Hi,
>>>
>>>> On 1/21/19 7:09 PM, Julien Grall wrote:
>>>> Well, I didn't get the attributes of the pages at the backend side,
>>>> but IMO those do not matter in my use-case (for simplicity I am not
>>>> using zero-copying at the backend side):
>>>
>>> They are actually important no matter what your use case is. If you
>>> access the same physical page with different attributes, then you are
>>> asking for trouble.
>> So, we have:
>>
>> DomU: frontend side
>> ====================
>> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
>> PTE_ATTRINDX(MT_NORMAL)
>
> I still don't understand how you came up with MT_NORMAL when you seem
> to confirm...
>
>>
>> DomD: backend side
>> ====================
>> PTE_USER + !PTE_RDONLY + PTE_PXN + PTE_NG + PTE_CONT + PTE_TABLE_BIT +
>> PTE_UXN + PTE_ATTRINDX(MT_NORMAL)
>>
>> From the above it seems that I don't violate the cached/non-cached
>> agreement here
>>>
>>> This is why Xen imposes rules on the memory attributes of all shared
>>> pages. Actually, speaking with Mark R., we may want to tighten the
>>> attributes a bit more.
>>>
>>>>
>>>> 1. The frontend device allocates display buffer pages which come
>>>> from shmem and have these attributes:
>>>> !PTE_RDONLY + PTE_PXN + PTE_SHARED + PTE_AF + PTE_UXN +
>>>> PTE_ATTRINDX(MT_NORMAL)
>>>
>>> My knowledge of Xen DRM is nonexistent. However, looking at the code
>>> in 5.0-rc2, I don't seem to find the same attributes. For instance,
>>> xen_drm_front_gem_prime_vmap and gem_mmap_obj are using
>>> pgprot_writecombine. So it looks like the mapping will be
>>> non-cacheable on Arm64.
>>>
>>> Can you explain how you came up with these attributes?
>> pgprot_writecombine is PTE_ATTRINDX(MT_NORMAL_NC), so it seems to be
>> applicable here? [1]
>
> ... that MT_NORMAL_NC is used for the frontend pages.
>
> MT_NORMAL_NC is different from MT_NORMAL. The use of the former will
> result in non-cacheable memory, while the latter will result in
> cacheable memory.
>
> To me, this looks like the exact reason why you see artifacts on the
> display buffer. As the author of this code, can you explain why you
> decided to use pgprot_writecombine here instead of relying on the
> default VMA prot?
>
> [...]

Sorry for the late reply, it took me quite some time to re-check all the
use-cases that we have with PV DRM.

First of all, I found a bug in my debug code which reported the page
attributes, and now I can confirm that all the use-cases use MT_NORMAL,
both backend and frontend.

You are right: there is no reason to have pgprot_writecombine here, and
indeed I should rely on the (almost) default VMA prot.
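For completeness, here is why pgprot_writecombine was wrong for this
use-case, as far as I understand it. On arm64 the definition (quoting
arch/arm64/include/asm/pgtable.h from my tree, so please double-check
against yours) overrides whatever attribute index the VMA prot carried:

/* the attribute index is forced to Normal Non-cacheable, PXN/UXN set */
#define pgprot_writecombine(prot) \
	__pgprot_modify(prot, PTE_ATTRINDX_MASK, \
			PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)

So the frontend's CPU mapping bypassed the caches while the backend
mapped the very same pages as Normal (cacheable), which is exactly the
cached/non-cached disagreement you pointed out.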
I came up with the following (used by omap drm, for example):

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index ae28ad4b4254..867800a2ed42 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -238,8 +238,7 @@ static int gem_mmap_obj(struct xen_gem_object *xen_obj,
 	vma->vm_flags &= ~VM_PFNMAP;
 	vma->vm_flags |= VM_MIXEDMAP;
 	vma->vm_pgoff = 0;
-	vma->vm_page_prot =
-			pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 {
 	int cnt = xen_obj->num_pages > 5 ? 5 : xen_obj->num_pages;
@@ -295,7 +294,7 @@ void *xen_drm_front_gem_prime_vmap(struct drm_gem_object *gem_obj)
 		return NULL;
 
 	return vmap(xen_obj->pages, xen_obj->num_pages,
-		    VM_MAP, pgprot_writecombine(PAGE_KERNEL));
+		    VM_MAP, PAGE_KERNEL);
 }

With the above, all the artifacts are gone, as the page attributes are
now the same across all domains. So I consider this the fix and will
either send it as v3 or, better, drop this patch and send a new one.

THANK YOU for helping figure this out!

>
>>> We actually never required cache flushes in any other PV protocol, so
>>> I still don't understand why PV DRM should be different here.
>> Well, you are right. But at the same time not flushing the buffer
>> causes trouble, so this is why I am trying to figure out what is wrong
>> here.
>
> The cache flush is likely hiding the real problem rather than solving it.
>
It does hide the real issue. And I have confirmed this.
>>>
>>> To me, it looks like you are either missing some barriers
>> Barriers for the buffer? Not sure what you mean here.
>
> If you share information between two entities, you may need some
> ordering so the information is seen consistently by the consumer
> side. This can be achieved by using barriers.
>
>> Even more, we have a use case
>> where the buffer is not touched by the CPU in DomD and is solely owned
>> by the HW.
>
> Memory ordering issues are subtle. The fact that one of your use-cases
> works does not imply that barriers are not necessary. I am not saying
> there are missing barriers; I am only pointing out potential reasons.
>
> Anyway, I don't think your problem is missing barriers here. It is
> more likely because of mismatched memory attributes (see above).
>
Yes, please see above.

> Cheers,
>
Thank you all for helping with the correct fix,
Oleksandr

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel