On Tue, 20 Dec 2022 10:01:21 -0500 Steven Sistare <steven.sistare@xxxxxxxxxx> wrote: > On 12/19/2022 2:48 AM, Tian, Kevin wrote: > >> From: Steve Sistare <steven.sistare@xxxxxxxxxx> > >> Sent: Saturday, December 17, 2022 2:51 AM > >> @@ -1664,15 +1666,7 @@ static int vfio_dma_do_map(struct vfio_iommu > >> *iommu, > >> * against the locked memory limit and we need to be able to do both > >> * outside of this call path as pinning can be asynchronous via the > >> * external interfaces for mdev devices. RLIMIT_MEMLOCK requires > >> a > >> - * task_struct and VM locked pages requires an mm_struct, however > >> - * holding an indefinite mm reference is not recommended, > >> therefore we > >> - * only hold a reference to a task. We could hold a reference to > >> - * current, however QEMU uses this call path through vCPU threads, > >> - * which can be killed resulting in a NULL mm and failure in the > >> unmap > >> - * path when called via a different thread. Avoid this problem by > >> - * using the group_leader as threads within the same group require > >> - * both CLONE_THREAD and CLONE_VM and will therefore use the > >> same > >> - * mm_struct. > >> + * task_struct and VM locked pages requires an mm_struct. > > > > IMHO the rationale why choosing group_leader still applies... > > I don't see why it still applies. With the new code, we may save a reference > to current or current->group_leader, without error. "NULL mm and failure in the > unmap path" will not happen with mmgrab. task->signal->rlimit is shared, so it > does not matter which task we use, or whether the task is dead, as long as > one of the tasks lives, which is guaranteed by the mmget_not_zero() guard. Am > I missing something? > > I kept current->group_leader for ease of debugging, so that all dma's are owned > by the same task. That much at least would be a good comment to add since the above removes all justification for why we're storing group_leader as the task rather than current. Ex: QEMU typically calls this path through vCPU threads, which can terminate due to vCPU hotplug, therefore historically we've used group_leader for task tracking. With the addition of grabbing the mm for the life of the DMA tracking structure, this is not so much a concern, but we continue to use group_leader for debug'ability, ie. all DMA tracking is owned by the same task. Given the upcoming holidays and reviewers starting to disappear, I suggest we take this up as a fixes series for v6.2 after the new year. The remainder of the vfio next branch for v6.2 has already been merged. Thanks, Alex