On Fri, Aug 27, 2021 at 09:57:10PM +0000, Al Viro wrote:
> On Fri, Aug 27, 2021 at 09:48:55PM +0000, Al Viro wrote:
>
> > [btrfs]search_ioctl()
> > Broken with memory poisoning, for either variant of semantics. Same for
> > arm64 sub-page permission differences, I think.
> >
> > So we have 3 callers where we want all-or-nothing semantics - two in
> > arch/x86/kernel/fpu/signal.c and one in btrfs. HWPOISON will be a problem
> > for all 3, AFAICS...
> >
> > IOW, it looks like we have two different things mixed here - one that wants
> > to try and fault stuff in, with callers caring only about having _something_
> > faulted in (most of the users) and one that wants to make sure we *can* do
> > stores or loads on each byte in the affected area.
> >
> > Just accessing a byte in each page really won't suffice for the second kind.
> > Neither will g-u-p use, unless we teach it about HWPOISON and other fun
> > beasts... Looks like we want that thing to be a separate primitive; for
> > btrfs I'd probably replace fault_in_pages_writeable() with clear_user()
> > as a quick fix for now...
> >
> > Comments?
>
> Wait a sec... Wasn't HWPOISON a per-page thing? arm64 definitely does have
> smaller-than-page areas with different permissions, so btrfs search_ioctl()
> has a problem there, but arch/x86/kernel/fpu/signal.c doesn't have to deal
> with that...
>
> Sigh... I really need more coffee...

On Intel, poison is tracked at cache line granularity. Linux inflates
that to per-page (because it can only take a whole page away). For faults
triggered in ring3 this is pretty much the same thing because
mm/memory-failure.c unmaps the page ... so while you see a #MC on first
access, you get #PF when you retry. The x86 fault handler sees a magic
signature in the page table and sends a SIGBUS.

But it's all different if the #MC is triggered from ring0. The machine
check handler can't unmap the page. It just schedules task_work to do the
unmap when next returning to the user.
But if your kernel code loops and tries again without a return to user,
then you get another #MC.

-Tony
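
For illustration only, the ring0 case described above can be modelled as a
small userspace toy. This is a sketch under stated assumptions, not kernel
code: every name in it (toy_page, toy_task, toy_kernel_access(),
toy_return_to_user(), toy_retry_in_kernel()) is hypothetical, and the
"#MC" is just a counter. It only shows why a retry loop that never
returns to user keeps hitting the poison, while the deferred unmap turns
the next access into an ordinary fault.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these names are real kernel APIs. */
struct toy_page {
	bool poisoned;		/* hardware flagged data in this page as bad */
	bool mapped;		/* page is still mapped into the task */
};

struct toy_task {
	bool unmap_pending;	/* stands in for the queued task_work */
	int mc_count;		/* number of simulated #MC events */
};

/* Simulated ring0 access: true on success, false on any fault. */
static bool toy_kernel_access(struct toy_task *t, struct toy_page *p)
{
	if (!p->mapped)
		return false;		/* would be an ordinary #PF */
	if (p->poisoned) {
		t->mc_count++;		/* simulated #MC */
		t->unmap_pending = true; /* handler can only defer the unmap */
		return false;
	}
	return true;
}

/* Simulated return to user mode: the deferred unmap runs here. */
static void toy_return_to_user(struct toy_task *t, struct toy_page *p)
{
	if (t->unmap_pending) {
		p->mapped = false;
		t->unmap_pending = false;
	}
}

/* Retry loop that never returns to user: each retry hits poison again. */
static int toy_retry_in_kernel(struct toy_task *t, struct toy_page *p,
			       int max_tries)
{
	assert(max_tries > 0);
	for (int i = 0; i < max_tries; i++)
		if (toy_kernel_access(t, p))
			return 0;
	return -1;
}
```

In this model, calling toy_retry_in_kernel() with three tries on a
poisoned, mapped page leaves mc_count at 3. Once toy_return_to_user() has
run, a further toy_kernel_access() fails as a plain fault and mc_count
stays at 3.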