On 2019/3/6 16:12, Peter Xu wrote: > On Wed, Mar 06, 2019 at 03:41:06PM +0800, zhong jiang wrote: >> On 2019/3/6 14:26, Mike Rapoport wrote: >>> Hi, >>> >>> On Wed, Mar 06, 2019 at 01:53:12PM +0800, zhong jiang wrote: >>>> On 2019/3/6 10:05, Andrea Arcangeli wrote: >>>>> Hello everyone, >>>>> >>>>> [ CC'ed Mike and Peter ] >>>>> >>>>> On Tue, Mar 05, 2019 at 02:42:00PM +0800, zhong jiang wrote: >>>>>> On 2019/3/5 14:26, Dmitry Vyukov wrote: >>>>>>> On Mon, Mar 4, 2019 at 4:32 PM zhong jiang <zhongjiang@xxxxxxxxxx> wrote: >>>>>>>> On 2019/3/4 22:11, Dmitry Vyukov wrote: >>>>>>>>> On Mon, Mar 4, 2019 at 3:00 PM zhong jiang <zhongjiang@xxxxxxxxxx> wrote: >>>>>>>>>> On 2019/3/4 15:40, Dmitry Vyukov wrote: >>>>>>>>>>> On Sun, Mar 3, 2019 at 5:19 PM zhong jiang <zhongjiang@xxxxxxxxxx> wrote: >>>>>>>>>>>> Hi, guys >>>>>>>>>>>> >>>>>>>>>>>> I also hit the following issue. but it fails to reproduce the issue by the log. >>>>>>>>>>>> >>>>>>>>>>>> it seems to the case that we access the mm->owner and deference it will result in the UAF. >>>>>>>>>>>> But it should not be possible that we specify the incomplete process to be the mm->owner. >>>>>>>>>>>> >>>>>>>>>>>> Any thoughts? >>>>>>>>>>> FWIW syzbot was able to reproduce this with this reproducer. >>>>>>>>>>> This looks like a very subtle race (threaded reproducer that runs >>>>>>>>>>> repeatedly in multiple processes), so most likely we are looking for >>>>>>>>>>> something like few instructions inconsistency window. >>>>>>>>>>> >>>>>>>>>> I has a little doubtful about the instrustions inconsistency window. >>>>>>>>>> >>>>>>>>>> I guess that you mean some smb barriers should be taken into account.:-) >>>>>>>>>> >>>>>>>>>> Because IMO, It should not be the lock case to result in the issue. >>>>>>>>> Since the crash was triggered on x86 _most likley_ this is not a >>>>>>>>> missed barrier. What I meant is that one thread needs to executed some >>>>>>>>> code, while another thread is stopped within few instructions. >>>>>>>>> >>>>>>>>> >>>>>>>> It is weird and I can not find any relationship you had said with the issue.:-( >>>>>>>> >>>>>>>> Because It is the cause that mm->owner has been freed, whereas we still deference it. >>>>>>>> >>>>>>>> From the lastest freed task call trace, It fails to create process. >>>>>>>> >>>>>>>> Am I miss something or I misunderstand your meaning. Please correct me. >>>>>>> Your analysis looks correct. I am just saying that the root cause of >>>>>>> this use-after-free seems to be a race condition. >>>>>>> >>>>>>> >>>>>>> >>>>>> Yep, Indeed, I can not figure out how the race works. I will dig up further. >>>>> Yes it's a race condition. >>>>> >>>>> We were aware about the non-cooperative fork userfaultfd feature >>>>> creating userfaultfd file descriptor that gets reported to the parent >>>>> uffd, despite they belong to mm created by failed forks. >>>>> >>>>> https://www.spinics.net/lists/linux-mm/msg136357.html >>>>> >>>> Hi, Andrea >>>> >>>> I still not clear why uffd ioctl can use the incomplete process as the mm->owner. >>>> and how to produce the race. >>> There is a C reproducer in the syzcaller report: >>> >>> https://syzkaller.appspot.com/x/repro.c?x=172fa5a3400000 >>> >>>> From your above explainations, My underdtanding is that the process handling do_exexve >>>> will have a temporary mm, which will be used by the UUFD ioctl. >>> The race is between userfaultfd operation and fork() failure: >>> >>> forking thread | userfaultfd monitor thread >>> --------------------------------+------------------------------- >>> fork() | >>> dup_mmap() | >>> dup_userfaultfd() | >>> dup_userfaultfd_complete() | >>> | read(UFFD_EVENT_FORK) >>> | uffdio_copy() >>> | mmget_not_zero() >>> goto bad_fork_something | >>> ... | >>> bad_fork_free: | >>> free_task() | >>> | mem_cgroup_from_task() >>> | /* access stale mm->owner */ >>> >> Hi, Mike > Hi, Zhong, > >> forking thread fails to create the process ,and then free the allocated task struct. >> Other userfaultfd monitor thread should not access the stale mm->owner. >> >> The parent process and child process do not share the mm struct. Userfaultfd monitor thread's >> mm->owner should not point to the freed child task_struct. > IIUC the problem is that above mm (of the mm->owner) is the child > process's mm rather than the uffd monitor's. When > dup_userfaultfd_complete() is called there will be a new userfaultfd > context sent to the uffd monitor thread which linked to the chlid > process's mm, and if the monitor thread do UFFDIO_COPY upon the newly > received userfaultfd it'll operate on that new mm too. Thank Mike and Peter for further explanation. I get it. Yes, The race indeed will result in the issue. but as for the patch Andrea has posted. I still has a little worry. The patch use call_rcu to delay free the task_struct, but It is possible to free the task_struct ahead of get_mem_cgroup_from_mm. is it right? Thanks, zhong jiang >> and due to the existence of tasklist_lock, we can not specify the mm->owner to freed task_struct. >> >> I miss something,=-O >> >> Thanks, >> zhong jiang >>>> Thanks, >>>> zhong jiang >> > Regards, >