On Fri, Dec 02, 2022 at 09:30:56AM -0500, Waiman Long wrote:
> On 12/2/22 05:18, Will Deacon wrote:
> > On Thu, Dec 01, 2022 at 12:03:39PM -0500, Waiman Long wrote:
> > > On 12/1/22 08:44, Will Deacon wrote:
> > > > On Sun, Nov 27, 2022 at 08:44:41PM -0500, Waiman Long wrote:
> > > > > Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be
> > > > > restricted on asymmetric systems"), the setting and clearing of
> > > > > user_cpus_ptr are done under pi_lock for arm64 architecture. However,
> > > > > dup_user_cpus_ptr() accesses user_cpus_ptr without any lock
> > > > > protection. When racing with the clearing of user_cpus_ptr in
> > > > > __set_cpus_allowed_ptr_locked(), it can lead to use-after-free and
> > > > > double-free in arm64 kernel.
> > > > >
> > > > > Commit 8f9ea86fdf99 ("sched: Always preserve the user requested
> > > > > cpumask") fixes this problem as user_cpus_ptr, once set, will never
> > > > > be cleared in a task's lifetime. However, this bug was re-introduced
> > > > > in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in
> > > > > do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in
> > > > > do_set_cpus_allowed(). This time, it will affect all arches.
> > > > >
> > > > > Fix this bug by always clearing the user_cpus_ptr of the newly
> > > > > cloned/forked task before the copying process starts and check the
> > > > > user_cpus_ptr state of the source task under pi_lock.
> > > > >
> > > > > Note to stable, this patch won't be applicable to stable releases.
> > > > > Just copy the new dup_user_cpus_ptr() function over.
> > > > >
> > > > > Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
> > > > > Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> > > > > CC: stable@xxxxxxxxxxxxxxx
> > > > > Reported-by: David Wang 王标 <wangbiao3@xxxxxxxxxx>
> > > > > Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
> > > > > ---
> > > > >  kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
> > > > >  1 file changed, 28 insertions(+), 4 deletions(-)
> > > > As per my comments on the previous version of this patch:
> > > >
> > > > https://lore.kernel.org/lkml/20221201133602.GB28489@willie-the-truck/T/#t
> > > >
> > > > I think there are other issues to fix when racing affinity changes with
> > > > fork() too.
> > > It is certainly possible that there are other bugs hiding somewhere:-)
> > Right, but I actually took the time to hit the same race for the other
> > affinity mask field so it seems a bit narrow-minded for us just to fix the
> > one issue.
>
> I focused on this particular one because of a double-free bug report from
> David. What other fields have you found to be subjected to data race?

See my other report linked above where we race on 'task_struct::cpus_mask'.

Will