Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:

> On Mon, Jan 31, 2022 at 10:03:31AM -0600, Eric W. Biederman wrote:
>> "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx> writes:
>>
>> > I'm not sure if the VMA list can change under us, but dump_vma_snapshot()
>> > is very careful to take the mmap_lock in write mode. We only need to
>> > take it in read mode here as we do not care if the size of the stack
>> > VMA changes underneath us.
>> >
>> > If it can be changed underneath us, this is a potential use-after-free
>> > for a multithreaded process which is dumping core.
>>
>> The problem is not multi-threaded process so much as processes that
>> share their mm.
>
> I don't understand the difference. I appreciate that another process can
> get read access to an mm through, eg, /proc, but how can another process
> (that isn't a thread of this process) modify the VMAs?

There are a couple of ways.

A classic way is that a multi-threaded process can call vfork, and the
mm_struct is shared with the child until exec is called.

A process can do this more deliberately by forking a child using
clone(CLONE_VM) and not including CLONE_THREAD. Supporting this case
is a holdover from before CLONE_THREAD was supported in the kernel, when
such processes were used to simulate threads.

The practical difference between a CLONE_THREAD thread and a
non-CLONE_THREAD process is that the signal handling is not shared.
Without sharing the signal handlers it does not make sense for a fatal
signal to kill the other process.

From the perspective of coredump generation, the kernel stops the
execution of all CLONE_THREAD threads that are going to be part of the
coredump and allows anyone else who shares the mm_struct to keep running.

It also happens that there are subsystems in the kernel that do things
like kthread_use_mm, which can also modify the mm during a coredump.

That is why we have dump_vma_snapshot. Preventing the mm_struct and
the vmas from being modified during a coredump is not really practical.
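
To make the clone(CLONE_VM) case concrete, here is a minimal userspace
sketch, assuming Linux and the glibc clone() wrapper. The names
shared_mm_demo, child_fn and shared_value are made up for illustration
and are not taken from the kernel tree. The child is a separate process
(no CLONE_THREAD), yet its store is visible to the parent because both
run on the same mm:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (256 * 1024)

/* Keep the top of the malloc'd stack 16-byte aligned. */
static_assert(CHILD_STACK_SIZE % 16 == 0, "child stack size must keep alignment");

/* Written by the child, read by the parent; lives in the shared mm. */
static volatile int shared_value;

/* Runs in the child process, on the stack handed to clone(). */
static int child_fn(void *arg)
{
	shared_value = *(int *)arg;
	return 0;	/* becomes the child's exit status */
}

/*
 * Hypothetical helper: create a child with CLONE_VM but without
 * CLONE_THREAD, wait for it, and return the value it stored.
 * Returns -1 on any failure.
 */
int shared_mm_demo(int value)
{
	char *stack = malloc(CHILD_STACK_SIZE);
	int status;
	pid_t pid;

	if (!stack)
		return -1;

	shared_value = 0;

	/* The stack grows down on common architectures, so pass its top. */
	pid = clone(child_fn, stack + CHILD_STACK_SIZE,
		    CLONE_VM | SIGCHLD, &value);
	if (pid == -1) {
		free(stack);
		return -1;
	}

	/* The child is an ordinary process, so it is reaped with waitpid(). */
	if (waitpid(pid, &status, 0) == -1) {
		free(stack);
		return -1;
	}
	free(stack);

	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;

	return shared_value;
}
```

Calling shared_mm_demo(42) from main() should hand back the value the
child wrote. With a plain fork() the child would write to its own copy
and the parent would still see 0. A child like this could just as well
call mmap() or munmap() and change the VMAs the parent sees.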

>> I think rather than take a lock we should be using the snapshot captured
>> with dump_vma_snapshot. Otherwise we have the very real chance that the
>> two get out of sync. Which would result in a non-sense core file.
>>
>> Probably that means that dump_vma_snapshot needs to call get_file on
>> vma->vm_file store it in core_vma_metadata.
>>
>> Do you think you can fix it something like that?
>
> Uhh .. that seems like it needs a lot more understanding of binfmt_elf
> than I currently possess. I'd rather spend my time working on folios
> than learning much more about binfmt_elf. I was just trying to fix an
> assertion failure with the maple tree patches (we now assert that you're
> holding a lock when walking the list of VMAs).

Fair enough. I will put it on my list of things to address.

Eric