Hi Robin,

>> I use Xen PV display. In my case, PV display backend(Dom0) allocates
>> contiguous buffer via DMA-API to
>> to implement zero-copy between Dom0 and DomU.
>>
> Well, something's gone badly wrong there - if you have to shadow the
> entire thing in a bounce buffer to import it then it's hardly zero-copy,
> is it? If you want to do buffer sharing the buffer really needs to be
> allocated appropriately to begin with, such that all relevant devices
> can access it directly. That might be something which needs fixing in Xen.
>

Right, if we want a zero-copy approach, we need to avoid the swiotlb
bounce buffer for every device that may use this buffer.
The root of the problem is that this buffer is mapped to foreign pages.
When I tried to retrieve the dma_addr for this buffer, I got a foreign
MFN that is beyond the 32-bit range, so swiotlb tried to use a bounce
buffer.
I understand that I need to find a way to avoid swiotlb in this case.
At the moment, it's unclear how to do this properly. But that is
another story...

I guess there can be situations where a device such as rcar-du needs a
buffer larger than 256 KB (128 (CURRENT_IO_TLB_SEGMENT) * 2048), and
this parameter then needs to be adjustable at boot time rather than at
compile time.
This patch was created to address that.

Thanks,
Roman

On Fri, Sep 17, 2021 at 12:44, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> On 2021-09-17 10:36, Roman Skakun wrote:
> > Hi, Christoph
> >
> > I use Xen PV display. In my case, PV display backend(Dom0) allocates
> > contiguous buffer via DMA-API to
> > to implement zero-copy between Dom0 and DomU.
>
> Well, something's gone badly wrong there - if you have to shadow the
> entire thing in a bounce buffer to import it then it's hardly zero-copy,
> is it? If you want to do buffer sharing the buffer really needs to be
> allocated appropriately to begin with, such that all relevant devices
> can access it directly.
> That might be something which needs fixing in Xen.
>
> Robin.
>
> > When I start Weston under DomU, I got the next log in Dom0:
> > ```
> > [ 112.554471] CPU: 0 PID: 367 Comm: weston Tainted: G O
> > 5.10.0-yocto-standard+ #312
> > [ 112.575149] Call trace:
> > [ 112.577666] dump_backtrace+0x0/0x1b0
> > [ 112.581373] show_stack+0x18/0x70
> > [ 112.584746] dump_stack+0xd0/0x12c
> > [ 112.588200] swiotlb_tbl_map_single+0x234/0x360
> > [ 112.592781] xen_swiotlb_map_page+0xe4/0x4c0
> > [ 112.597095] xen_swiotlb_map_sg+0x84/0x12c
> > [ 112.601249] dma_map_sg_attrs+0x54/0x60
> > [ 112.605138] vsp1_du_map_sg+0x30/0x60
> > [ 112.608851] rcar_du_vsp_map_fb+0x134/0x170
> > [ 112.613082] rcar_du_vsp_plane_prepare_fb+0x44/0x64
> > [ 112.618007] drm_atomic_helper_prepare_planes+0xac/0x160
> > [ 112.623362] drm_atomic_helper_commit+0x88/0x390
> > [ 112.628029] drm_atomic_nonblocking_commit+0x4c/0x60
> > [ 112.633043] drm_mode_atomic_ioctl+0x9a8/0xb0c
> > [ 112.637532] drm_ioctl_kernel+0xc4/0x11c
> > [ 112.641506] drm_ioctl+0x21c/0x460
> > [ 112.644967] __arm64_sys_ioctl+0xa8/0xf0
> > [ 112.648939] el0_svc_common.constprop.0+0x78/0x1a0
> > [ 112.653775] do_el0_svc+0x24/0x90
> > [ 112.657148] el0_svc+0x14/0x20
> > [ 112.660254] el0_sync_handler+0x1a4/0x1b0
> > [ 112.664315] el0_sync+0x174/0x180
> > [ 112.668145] rcar-fcp fea2f000.fcp: swiotlb buffer is full (sz:
> > 3686400 bytes), total 65536 (slots), used 112 (slots)
> > ```
> > The problem is happened here:
> > https://elixir.bootlin.com/linux/v5.14.4/source/drivers/gpu/drm/rcar-du/rcar_du_vsp.c#L202
> >
> > Sgt was created in dma_get_sgtable() by dma_common_get_sgtable() and
> > includes a single page chunk
> > as shown here:
> > https://elixir.bootlin.com/linux/v5.14.5/source/kernel/dma/ops_helpers.c#L18
> >
> > After creating a new sgt, we tried to map this sgt through vsp1_du_map_sg().
> > Internally, vsp1_du_map_sg() using ops->map_sg (e.g
> > xen_swiotlb_map_sg) to perform
> > mapping.
> >
> > I realized that required segment is too big to be fitted to default
> > swiotlb segment and condition
> > https://elixir.bootlin.com/linux/latest/source/kernel/dma/swiotlb.c#L474
> > is always false.
> >
> > I know that I use a large buffer, but why can't I map this buffer in one chunk?
> >
> > Thanks!
> >
> > On Wed, Sep 15, 2021 at 16:53, Christoph Hellwig <hch@xxxxxx> wrote:
> >>
> >> On Wed, Sep 15, 2021 at 03:49:52PM +0200, Jan Beulich wrote:
> >>> But the question remains: Why does the framebuffer need to be mapped
> >>> in a single giant chunk?
> >>
> >> More importantly: if you use dynamic dma mappings for your framebuffer
> >> you're doing something wrong.
> >
> >
> >
--
Best Regards,
Roman.
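
A note on the numbers in the log quoted in this thread: the "swiotlb
buffer is full" message appears even though only 112 of 65536 slots are
in use. The sketch below works through that arithmetic. It assumes the
constants IO_TLB_SHIFT = 11 and IO_TLB_SEGSIZE = 128 from
include/linux/swiotlb.h in the kernel versions discussed here; the
helper names slots_needed() and fits_in_one_segment() are hypothetical
and exist only for this illustration.

```python
# Assumed constants (include/linux/swiotlb.h, kernels of this era).
IO_TLB_SHIFT = 11        # one slot is 1 << 11 bytes
IO_TLB_SEGSIZE = 128     # max contiguous slots available to one mapping

SLOT_SIZE = 1 << IO_TLB_SHIFT              # bytes per slot
MAX_MAPPING = IO_TLB_SEGSIZE * SLOT_SIZE   # largest single bounce mapping


def slots_needed(size_bytes):
    """Hypothetical helper: slots required for a mapping, rounded up."""
    return (size_bytes + SLOT_SIZE - 1) // SLOT_SIZE


def fits_in_one_segment(size_bytes):
    """Hypothetical helper: can one segment hold this mapping?"""
    return slots_needed(size_bytes) <= IO_TLB_SEGSIZE


# Values taken from the log line in the quoted message.
request_bytes = 3686400
total_slots = 65536
used_slots = 112

free_bytes = (total_slots - used_slots) * SLOT_SIZE

print("max single mapping (bytes):", MAX_MAPPING)
print("slots needed for request:  ", slots_needed(request_bytes))
print("free pool space (bytes):   ", free_bytes)
print("fits in one segment:       ", fits_in_one_segment(request_bytes))
```

Under these assumptions the request needs far more contiguous slots
than one segment provides, so the mapping fails regardless of how much
of the pool is free. That is consistent with the 256 KB limit mentioned
in the reply above.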