On Mon, Feb 24, 2025 at 08:11:47PM +1300, Barry Song wrote:
> Please send a V2 and update your changelog to accurately describe the real
> issue. Additionally, clarify how frequently this occurs and why resolving
> the root cause is challenging. Gaoxu reported a similar case on the Android
> kernel 6.6, while you're reporting it on 5.10. He observed an occurrence
> rate of 1 in 500,000 over a week on customer devices but was unable to
> reproduce it in the lab.
>
> BTW, your patch is incorrect, as normally we could have a case where
> _swap_info_get() returns NULL:
>
> thread 1                        thread 2
>
> 1. page fault happens
>    with entry points to
>    swapfile;
>                                 swapoff()
> 2. do_swap_page()
>
> In this scenario, _swap_info_get() may return NULL, which is expected,
> and we should not return -ERRNO; the subsequent page fault will
> detect that the PTE has changed. Since you have never enabled any
> swap, the appropriate action is to do the following:
>
>         /* Prevent swapoff from happening to us. */
>         si = get_swap_device(entry);
> -       if (unlikely(!si))
> +       if (unlikely(!si)) {
> +               /*
> +                * Return VM_FAULT_SIGBUS if the swap entry points to
> +                * a never-enabled swap file, caused by either hardware
> +                * issues or a kernel bug. Return an error code to prevent
> +                * an infinite page fault (#PF) loop.
> +                */
> +               if (WARN_ON_ONCE(!swp_swap_info(entry)))
> +                       ret = VM_FAULT_SIGBUS;
>                 goto out;
> +       }

This is overly specific to the case that you're tracking down. So it's
entirely appropriate to apply to _your_ kernel while you work on
tracking it down, but completely inappropriate to upstream.