On Tue, Oct 29, 2024 at 04:22:42PM +0000, Catalin Marinas wrote:
> On Tue, Oct 29, 2024 at 03:16:00PM +0000, Lorenzo Stoakes wrote:
> > On Tue, Oct 29, 2024 at 03:04:41PM +0000, Catalin Marinas wrote:
> > > On Mon, Oct 28, 2024 at 10:14:50PM +0000, Lorenzo Stoakes wrote:
> > > > So continue to check VM_MTE_ALLOWED which arch_calc_vm_flag_bits() sets if
> > > > MAP_ANON.
> > > [...]
> > > > diff --git a/mm/shmem.c b/mm/shmem.c
> > > > index 4ba1d00fabda..e87f5d6799a7 100644
> > > > --- a/mm/shmem.c
> > > > +++ b/mm/shmem.c
> > > > @@ -2733,9 +2733,6 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
> > > >  	if (ret)
> > > >  		return ret;
> > > >
> > > > -	/* arm64 - allow memory tagging on RAM-based files */
> > > > -	vm_flags_set(vma, VM_MTE_ALLOWED);
> > >
> > > This breaks arm64 KVM if the VMM uses shared mappings for the memory
> > > slots (which is possible). We have kvm_vma_mte_allowed() that checks for
> > > the VM_MTE_ALLOWED flag as the VMM may not use PROT_MTE/VM_MTE directly.
> >
> > Ugh yup missed that thanks.
> >
> > > I need to read this thread properly but why not pass the file argument
> > > to arch_calc_vm_flag_bits() and set VM_MTE_ALLOWED in there?
> >
> > Can't really do that as it is entangled in a bunch of other stuff,
> > e.g. calc_vm_prot_bits() would have to pass file and that's used in a bunch
> > of places including arch code and... etc. etc.
>
> Not calc_vm_prot_bits() but calc_vm_flag_bits().
> arch_calc_vm_flag_bits() is only implemented by two architectures -
> arm64 and parisc and calc_vm_flag_bits() is only called from do_mmap().
>
> Basically we want to set VM_MTE_ALLOWED early during the mmap() call
> and, at the time, my thinking was to do it in calc_vm_flag_bits(). The
> calc_vm_prot_bits() OTOH is also called on the mprotect() path and is
> responsible for translating PROT_MTE into a VM_MTE flag without any
> checks. arch_validate_flags() would check if VM_MTE comes together with
> VM_MTE_ALLOWED.
> But, as in the KVM case, that's not the only function
> checking VM_MTE_ALLOWED.
>
> Since calc_vm_flag_bits() did not take a file argument, the lazy
> approach was to add the flag explicitly for shmem (and hugetlbfs in
> -next). But I think it would be easier to just add the file argument to
> calc_vm_flag_bits() and do the check in the arch code to return
> VM_MTE_ALLOWED. AFAICT, this is called before mmap_region() and
> arch_validate_flags() (unless I missed something in the recent
> reworking).

I mean I totally get why you're suggesting it - it's the right _place_
but...

It would require changes to a ton of code, which is no good for a backport,
and we don't _need_ to do it.

I'd rather do the smallest delta at this point, as I am not a huge fan of
sticking it in here (I mean your point is wholly valid - it's at a better
place to do so and we can change flags here, it's just - it's not where you
expect to do this obviously).

I mean for instance in arch/x86/kernel/cpu/sgx/encl.c (a file I'd _really_
like us not to touch here by the way) we'd have to, what, pass NULL?

I mean passing file to arch_validate_flags() is icky, but it makes some
sense since we _always_ have that available and meaningful at the point of
invocation; if we added it to arch_calc_vm_flag_bits(), there are now places
where it's not available.

And then we're assuming we can just pass NULL... and it becomes a confusing
mess really I think.

I also worry we might somehow break something somewhere this way, we're
already exposed to subtle issues here.
Alternatively, we can change my series by 2 lines (as I've already asked
Andrew to do), everything still works, the fix applies, the VM_MTE_ALLOWED
flag still works in an obvious way (it behaves exactly as it did before)
and all is well with the world and we can frolic in the fields freely and
joyously :)

>
> > I suggest instead we instead don't drop the yucky shmem thing, which will
> > set VM_MTE_ALLOWED for shmem, with arch_calc_vm_flag_bits() still setting
> > it for MAP_ANON, but the other changes will mean the arch_validate_flags()
> > will be fixed too.
> >
> > So this just means not dropping the mm/shmem.c bit basically and everything
> > should 'just work'?
>
> If we can't get the calc_vm_flag_bits() approach to work, I'm fine with
> this as a fix and we'll look to do it properly from 6.13.

I think overwhelmingly, since I'm going to be backporting this and as a
hotfix, it's better to just leave the shmem stuff in and leave the rest the
same.

I really would like us to figure out a better way overall from >=6.13
though and replace all this with something saner :>)

Am happy to help and collaborate on that!

>
> --
> Catalin

Cheers, and sorry to fiddle with arm64 stuff here, sadly happens to just be
where this issue becomes a thing with this hotfix!
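
For illustration only, the two options discussed in this thread can be
modelled in plain userspace C. This is a rough sketch, not kernel code:
every name prefixed sk_ or SK_ is hypothetical, the bit values are made up,
and struct sk_file is a stand-in that only records whether a file is
shmem-backed. It assumes nothing beyond what the thread states: MAP_ANON
yields VM_MTE_ALLOWED at flag-calculation time, the shmem mmap hook adds
the flag itself, and validation rejects VM_MTE without VM_MTE_ALLOWED.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; NOT the kernel's real bit definitions. */
#define SK_MAP_ANON        0x1UL
#define SK_VM_MTE          0x1UL
#define SK_VM_MTE_ALLOWED  0x2UL

/* Hypothetical stand-in for struct file. */
struct sk_file {
	bool is_shmem;
};

/* Behaviour described in the thread: only MAP_ANON yields the flag here. */
static unsigned long sk_calc_vm_flag_bits(unsigned long map_flags)
{
	return (map_flags & SK_MAP_ANON) ? SK_VM_MTE_ALLOWED : 0;
}

/*
 * Suggested variant: also take the file, so RAM-backed files qualify at
 * the same point. Callers without a file would have to pass NULL, which
 * is the awkward part raised in the reply.
 */
static unsigned long sk_calc_vm_flag_bits_file(const struct sk_file *file,
					       unsigned long map_flags)
{
	if (map_flags & SK_MAP_ANON)
		return SK_VM_MTE_ALLOWED;
	if (file && file->is_shmem)
		return SK_VM_MTE_ALLOWED;
	return 0;
}

/* Model of the shmem mmap hook the hotfix keeps: it sets the flag itself. */
static unsigned long sk_shmem_mmap(unsigned long vm_flags)
{
	return vm_flags | SK_VM_MTE_ALLOWED;
}

/* Model of validation: VM_MTE is only valid together with VM_MTE_ALLOWED. */
static bool sk_validate_flags(unsigned long vm_flags)
{
	if (!(vm_flags & SK_VM_MTE))
		return true;
	return (vm_flags & SK_VM_MTE_ALLOWED) != 0;
}
```

In this model both routes end with a shmem mapping carrying the flag; the
difference is only where it gets set, and whether every caller of the
flag-calculation helper has a meaningful file to hand over.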