On 01/10/2024 16:10, Peter Xu wrote:
> On Tue, Oct 01, 2024 at 03:27:48PM +0100, Ryan Roberts wrote:
>> Hi Peter,
>>
>> On 08/08/2024 12:25, Ryan Roberts wrote:
>>> On 07/08/2024 19:59, Peter Xu wrote:
>>>> On Wed, Aug 07, 2024 at 12:18:18PM +0200, David Hildenbrand wrote:
>>>>> On 07.08.24 10:58, David Hildenbrand wrote:
>>>>>> On 06.08.24 22:29, Peter Xu wrote:
>>>>>>> On Tue, Aug 06, 2024 at 06:37:55PM +0200, David Hildenbrand wrote:
>>>>>>>> On 06.08.24 17:15, Ryan Roberts wrote:
>>>>>>>>> Hi Peter, David,
>>>>>>>
>>>>>>> Hi, Ryan,
>>>>>>>
>>>>>>>>>
>>>>>>>>> syzkaller has found an issue (at least on arm64, but I suspect it will be
>>>>>>>>> visible on x86_64 too) that triggers the following warning:
>>>>>>>
>>>>>>> This is true. I can easily reproduce..
>>>>>>>
>>
>> [...]
>>
>>>> When I'm looking at this specific issue again, it's more than ptes that
>>>> should need to remove the uffd-wp bit. We have:
>>>>
>>>> - pmd/pud/hugetlb in other paths that will need similar care..
>>>>
>>>> - move_page_tables() smartness on HAVE_MOVE_PUD.. where we may need to
>>>> walk the pmd page removing the bits when necessary..
>>>>
>>>> - more importantly, mremap_userfaultfd_prep() might be too late if it's
>>>> after moving pgtables..
>>>>
>>>> - [not yet started looking] the mlock issue Ryan mentioned..
>>>>
>>>> Looks like we'll need more things to fix and test..
>>>>
>>>> I wished if I can simply disable UFFD_WP + EVENT_REMAP, but I think even
>>>> with that, by default when mremap() we should still logically tear down all
>>>> those uffd-wp bits which is the same as !EVENT_REMAP now..
>>>>
>>>> Let me know if anyone would like to beat me to it on fixing the whole
>>>> thing, I'd be more than happy..
>>>
>>> Afraid I won't be able to sign up to doing that work.
>>>
>>> Otherwise, I'll probably need to postpone
>>>> the fix of this issue for 1-2 weeks but finish some other things first..
>>
>> I'm not sure if there was any progress on this? We are still seeing the problem
>> on v6.12-rc1.
>
> Hi, Ryan,
>
> I haven't yet got free time to look at this, sorry. I confess I didn't
> prioritize this as high, as I doubt anyone would make real use of it, or
> hit this issue in real workloads, and it'll even slow down generic
> workloads even if slightly.

No problem. I'm really just acting as the middle man here: now that -rc1 is
out, Mark has been running his usual fuzzing and noted that the issue still
exists, so I thought I'd enquire to see whether you had been able to make any
progress.

I agree it's not high priority. That said, for a panic_on_warn=1 kernel (which
I understand some use in deployment), this means that user space can panic the
system, so I guess it needs to be addressed eventually.

>
> Do you want to have a look? It'll be great if so. Or I can try to find
> some time this month.

I won't personally get time to look at this, since I'm busy with some other
commitments. But I might be able to find someone to look into it. Leave it with
me for now.

Thanks,
Ryan

>
> Thanks,
>