Re: [PATCH] mm: use READ/WRITE_ONCE() for vma->vm_flags on migrate, mprotect

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Fri, 7 Feb 2025 18:50:14 -0800

On Fri,  7 Feb 2025 17:24:42 +0000 Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> wrote:

> According to the syzbot report referenced here, it is possible to encounter
> a race between mprotect() writing to the vma->vm_flags field and migration
> checking whether the VMA is locked.
> 
> There is no real problem with timing here per se, only that torn
> reads/writes may occur. Therefore, as a proximate fix, ensure both
> operations use READ_ONCE() and WRITE_ONCE() to avoid this.
> 
> This race is possible because VMAs can be looked up via the rmap, as
> migration does in this case. That lookup takes no mmap or VMA lock and
> therefore does not preclude an operation that modifies a VMA.
> 
> When the final update of VMA flags is performed by mprotect, this will
> cause the rmap lock to be taken while the VMA is inserted on split/merge.
> 
> However the means by which we perform splits/merges in the kernel is that
> we perform the split/merge operation on the VMA, acquiring/releasing locks
> as needed, and only then, after having done so, modify fields.
> 
> We should carefully examine and determine whether we can combine the two
> operations so as to avoid such races, and whether it might be possible to
> otherwise annotate these rmap field accesses.

Thanks.

If some poor person reads this code and wonders "why is it using
READ_ONCE", what's our answer?  I guess it's "poke around with
git-blame".

And I guess we can live with that - it doesn't seem practical to paste
changelog text into every READ_ONCE() site.

Probably most people won't bother, and READ_ONCE()s of ->vm_flags will
get pasted into other places where they are not needed.

I do wonder if we can do better.
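
One possible direction, purely as a sketch: wrap the annotated accesses
in named helpers so the "why" lives in a single comment rather than at
every call site. Everything below is a hypothetical userspace stand-in
and not existing kernel API: struct vma_sketch, VM_LOCKED_SKETCH,
vma_flags_read_lockless() and vma_flags_write_racy() are made-up names,
and the READ_ONCE()/WRITE_ONCE() macros are simplified volatile-cast
versions (relying on the GCC/Clang __typeof__ extension) rather than
the kernel's definitions.

```c
/* Simplified userspace stand-ins for the kernel's READ_ONCE()/WRITE_ONCE(). */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

typedef unsigned long vm_flags_t;

/* Hypothetical stand-in for struct vm_area_struct; only the field we need. */
struct vma_sketch {
	vm_flags_t vm_flags;
};

/* Illustrative flag bit for the sketch, not taken from kernel headers. */
#define VM_LOCKED_SKETCH 0x2000UL

/*
 * Read the flags from a path that found the VMA via the rmap and so holds
 * neither the mmap lock nor the VMA lock.  A concurrent mprotect() may be
 * updating the flags, so the load is marked to rule out a torn read.
 * Callers that hold the appropriate lock should not need this helper.
 */
static inline vm_flags_t vma_flags_read_lockless(const struct vma_sketch *vma)
{
	return READ_ONCE(vma->vm_flags);
}

/*
 * Update the flags where a lockless rmap reader may be looking at them
 * concurrently.  Pairs with vma_flags_read_lockless().
 */
static inline void vma_flags_write_racy(struct vma_sketch *vma, vm_flags_t flags)
{
	WRITE_ONCE(vma->vm_flags, flags);
}
```

With something like this, a reader who wonders "why READ_ONCE" lands on
the helper's comment instead of git-blame, and open-coded READ_ONCE()s
of ->vm_flags elsewhere become easier to question in review.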