On Tue, Nov 7, 2017 at 12:39 PM, Mauro Carvalho Chehab <mchehab@xxxxxxxxxxxxxxxx> wrote: > Em Tue, 7 Nov 2017 09:43:41 -0800 > Dan Williams <dan.j.williams@xxxxxxxxx> escreveu: > >> On Tue, Nov 7, 2017 at 12:33 AM, Mauro Carvalho Chehab >> <mchehab@xxxxxxxxxxxxxxxx> wrote: >> > Em Mon, 06 Nov 2017 16:57:28 -0800 >> > Dan Williams <dan.j.williams@xxxxxxxxx> escreveu: >> > >> >> V4L2 memory registrations are incompatible with filesystem-dax that >> >> needs the ability to revoke dma access to a mapping at will, or >> >> otherwise allow the kernel to wait for completion of DMA. The >> >> filesystem-dax implementation breaks the traditional solution of >> >> truncate of active file backed mappings since there is no page-cache >> >> page we can orphan to sustain ongoing DMA. >> >> >> >> If v4l2 wants to support long lived DMA mappings it needs to arrange to >> >> hold a file lease or use some other mechanism so that the kernel can >> >> coordinate revoking DMA access when the filesystem needs to truncate >> >> mappings. >> > >> > >> > Not sure if I understand this your comment here... what happens >> > if FS_DAX is enabled? The new err = get_user_pages_longterm() >> > would cause DMA allocation to fail? >> >> Correct, any attempt to specify a filesystem-dax mapping range to >> get_user_pages_longterm will fail with EOPNOTSUPP. In the future we >> want to add something like a 'struct file_lock *' argument to >> get_user_pages_longterm so that the kernel has a handle to revoke >> access to the returned pages. Once we have a safe way for the kernel >> to undo elevated page counts we can stop failing the longterm vs >> filesystem-dax case. > > Argh! Perhaps we should make it depend on BROKEN while not fixed :-/ Small consolation, but we do warn that filesystem-dax is still considered experimental when mounting a filesystem with "-o dax" >> Here is more background on why _longterm gup is a problem for filesystem-dax: >> >> https://lwn.net/Articles/737273/ >> >> > If so, that doesn't sound >> > right. Instead, mm should somehow mark this mapping to be out >> > of FS_DAX control range. >> >> DAX is currently global setting for the entire backing device of the >> filesystem, so any mapping of any file when the "-o dax" mount option >> is set is in the "FS_DAX control range". In other words there's >> currently no way to prevent FS_DAX mappings from being exposed to V4L2 >> outside of CONFIG_FS_DAX=n. > > Grrr... > >> > Also, it is not only videobuf-dma-sg.c that does long lived >> > DMA mappings. VB2 also does that (and videobuf-vmalloc). >> >> Without finding the code videobuf-vmalloc sounds like it should be ok >> if the kernel is allocating memory separate from a file-backed DAX >> mapping. > > videobuf-vmalloc do DMA mapping for pages allocated via vmalloc(), > via vmalloc_user()/remap_vmalloc_range(). Ok, that's completely safe since filesystem-dax mappings are not involved in a vmalloc backed virtual address range. > There aren't much drivers using VB1 anymore, but a change at VB2 > will likely break support for almost all webcams if fs DAX is > in usage. Yes, unless / until we can switch userspace to using a new memory registration api that includes a way for the kernel to revoke access to a dax mapping. Another mitigation is following through on support for moving dax support from a global mount flag to a per-inode flag to at least prevent dax from leaking to use cases that need explicit coordination. >> Where is the VB2 get_user_pages call? > > Before changeset 3336c24f25ec, the logic for get_user_pages() were > at drivers/media/v4l2-core/videobuf2-dma-sg.c. Now, the logic > it uses is inside mm/frame_vector.c. Ok, I'll take a look.