On Wed, Feb 26, 2025 at 03:29:06PM -0800, Nhat Pham wrote: > On Tue, Feb 25, 2025 at 7:12 PM Yosry Ahmed <yosry.ahmed@xxxxxxxxx> wrote: > > > > On Tue, Feb 25, 2025 at 01:32:00PM -0800, Nhat Pham wrote: > > > Currently, we crash the kernel when a decompression failure occurs in > > > zswap (either because of memory corruption, or a bug in the compression > > > algorithm). This is overkill. We should only SIGBUS the unfortunate > > > process asking for the zswap entry on zswap load, and skip the corrupted > > > entry in zswap writeback. > > > > Some relevant observations/questions, but not really actionable for this > > patch, perhaps some future work, or more likely some incoherent > > illogical thoughts : > > > > (1) It seems like not making the folio uptodate will cause shmem faults > > to mark the swap entry as hwpoisoned, but I don't see similar handling > > for do_swap_page(). So it seems like even if we SIGBUS the process, > > other processes mapping the same page could follow in the same > > footsteps. > > poisoned, I think? It's the weird SWP_PTE_MARKER thing. Not sure what you mean here, I am referring to the inconsistency between shmem_swapin_folio() and do_swap_page(). > > [...] > > > > > > > (3) If we run into a decompression failure, should we free the > > underlying memory from zsmalloc? I don't know. On one hand if we free it > > zsmalloc may start using it for more compressed objects. OTOH, I don't > > think proper hwpoison handling will kick in until the page is freed. > > Maybe we should tell zsmalloc to drop this page entirely and mark > > objects within it as invalid? Probably not worth the hassle but > > something to think about. > > This might be a fun follow up :) I guess my question is - is there a > chance that we might recover in the future? > > For example, can memory (hardware) failure somehow recover, or the > decompression algorithm somehow fix itself? I suppose not? > > If that is the case, one thing we can do is just free the zsmalloc > slot, then mark the zswap entry as corrupted somehow. We can even > invalidate the zswap entry altogether, and install a (shared) > ZSWAP_CORRUPT_ENTRY. Future readers can check for this and exit if > they encounter a corrupted entry? > > It's not common enough (lol hopefully) for me to optimize right away, > but I can get on with it if there are actual data of this happening > IRL/in product :)ion I am not aware of this being a common problem, but something to keep in mind, perhaps.