On Tue, Apr 10, 2018 at 11:28:13AM -0700, Yang Shi wrote:
> >
> > At the first glance, it looks feasible to me. Will look into deeper
> > later.
>
> A further look told me this might be *not* feasible.
>
> It looks the new lock will not break check_data_rlimit since in my patch
> both start_brk and brk is protected by mmap_sem. The code flow might look
> like below:
>
> CPU A                         CPU B
> --------                      --------
> prctl                         sys_brk
>                               down_write
> check_data_rlimit             check_data_rlimit (need mm->start_brk)
>                               set brk
> down_write                    up_write
> set start_brk
> set brk
> up_write
>
> If CPU A gets the mmap_sem first, it will set start_brk and brk, then CPU B
> will check with the new start_brk. And, prctl doesn't care if sys_brk is run
> before it since it gets the new start_brk and brk from parameter.
>
> If we protect start_brk and brk with the new lock, sys_brk might get old
> start_brk, then sys_brk might break rlimit check silently, is that right?
>
> So, it looks using new lock in prctl and keeping mmap_sem in brk path has
> race condition.

I fear so. check_data_rlimit implies that all the elements involved in
the validation (brk, start_brk, start_data, end_data) do not change
unpredictably until they are written back into mm. In turn, if we guard
only start_brk and brk (as is done in the patch), check_data_rlimit
may pass on wrong data, I think. And as you mentioned, the race above is
exactly an example of such a situation. I think for the prctl case we can
simply leave the use of mmap_sem as it was before the patch; after all,
this syscall is in a cold path all the time.

	Cyrill
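
To make the concern above concrete, here is a minimal userspace sketch, not kernel code. The struct mm_sketch, the *_sketch helpers and the pthread mutex standing in for mmap_sem are all hypothetical names made up for illustration. The limit check follows the same arithmetic as check_data_rlimit, with the RLIM_INFINITY case left out for brevity. Both paths read, validate and write back under the same lock, so the brk path can never validate against a stale start_brk.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the mm fields involved in the check. */
struct mm_sketch {
	pthread_mutex_t lock;	/* plays the role of mmap_sem */
	unsigned long start_data, end_data;
	unsigned long start_brk, brk;
};

/* Heap size plus data size must fit in rlim (RLIM_INFINITY case omitted). */
static int check_data_rlimit_sketch(unsigned long rlim,
				    unsigned long new_brk,
				    unsigned long start_brk,
				    unsigned long end_data,
				    unsigned long start_data)
{
	if ((new_brk - start_brk) + (end_data - start_data) > rlim)
		return -ENOSPC;
	return 0;
}

/* brk path: validate against the current start_brk and commit under the lock. */
static int brk_sketch(struct mm_sketch *mm, unsigned long rlim,
		      unsigned long new_brk)
{
	int ret;

	assert(mm != NULL);
	pthread_mutex_lock(&mm->lock);
	ret = check_data_rlimit_sketch(rlim, new_brk, mm->start_brk,
				       mm->end_data, mm->start_data);
	if (ret == 0)
		mm->brk = new_brk;
	pthread_mutex_unlock(&mm->lock);
	return ret;
}

/* prctl path: takes the same lock, so the brk path cannot see a half update. */
static int prctl_set_brk_sketch(struct mm_sketch *mm, unsigned long rlim,
				unsigned long new_start_brk,
				unsigned long new_brk)
{
	int ret;

	assert(mm != NULL);
	pthread_mutex_lock(&mm->lock);
	ret = check_data_rlimit_sketch(rlim, new_brk, new_start_brk,
				       mm->end_data, mm->start_data);
	if (ret == 0) {
		mm->start_brk = new_start_brk;
		mm->brk = new_brk;
	}
	pthread_mutex_unlock(&mm->lock);
	return ret;
}
```

If the prctl path used a different lock than the brk path, the check in brk_sketch could run with the old start_brk while the prctl path is about to lower it, and the combined result could exceed the limit without either check noticing.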