Re: zswap_writeback_entry crashes in 6.9.5

Pedro Falcato <pedro.falcato@xxxxxxxxx> · Fri, 19 Jul 2024 19:57:21 +0100

On Mon, Jul 1, 2024 at 2:07 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
>
> On Sun, Jun 30, 2024 at 10:58 AM Pedro Falcato <pedro.falcato@xxxxxxxxx> wrote:
> >
> > Hi everyone,
>
> Hi Pedro,
>

Hi Nhat,

Sorry for necroing, but I've been really busy and this issue totally
left my mind. Following up so I don't leave you hanging:

> Thanks for the bug report! Taking a look now - some preliminary
> questions to narrow down the suspects and aid the debugging process:
>
> a) Do you observe this bug in 6.8? 6.10?

Yes.
>
> b) Have you run the faddr2line script to verify that the line that
> triggers the crash is count_objcg_event(entry->objcg, ZSWPWB);?

I did not (the distro kernel does not seem to ship debug symbols), but
it's pretty clear from the disassembly.
>
> c) Do you have a full dmesg log? Or maybe some other reproduction instructions?
>
> If entry->objcg is garbage, then this smells like a lifetime/reference
> counting issue. Either:
>
> a) The zswap entry itself is garbage. Not impossible, but seems
> unlikely. In 6.9, we effectively isolate the entry first through the
> swap cache, then check and remove it from the zswap tree (under the
> tree's lock). The former locks out concurrent accessors, and the
> latter should have taken care of invalidated entries (and prevents
> future invalidation attempts). Furthermore, after this, if the entry
> is somehow garbage (i.e freed and recycled), it should also be
> possible to blow up in the decompression step first, by feeding a
> garbage handle to zsmalloc and crashing the kernel at that point. IOW,
> we should also see zsmalloc crashes in addition to this particular
> crash, no? I cannot think of any protection mechanism that applies to
> the decompression step and not to count_objcg_event().
>
> b) entry->objcg has been freed/recycled under us. This is much
> trickier, as the culprit could be any holder of the objcg reference
> who accidentally double-released the reference it held. That said, if
> it only happened on zswap shrinker path, then maybe there is something
> to this...
>

I have a separate theory. I also run the NVIDIA proprietary drivers.
slabinfo -a shows us:
[...]
:0000080     <- zswap_entry Acpi-Parse kernfs_iattrs_cache
uvm_tools_replay_data_t Acpi-State audit_tree_mark
[...]

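See the uvm_tools_replay_data_t there? It shares the merged :0000080
slab cache with zswap_entry, so it's entirely possible some random
nvidia.ko bug (say, a use-after-free or an overflow on one of its
objects) has been corrupting zswap_entry from time to time (which
would explain why e.g. the big server people have not seen this).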
I'm not sure if Yuxuan is running the same driver, but their kernel is
also proprietary-tainted.
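
For anyone who wants to check their own machine for the same slab
sharing, here is a rough sketch that parses `slabinfo -a` style output
and lists the caches merged with zswap_entry. The helper names
(parse_slab_aliases, neighbours) are made up for illustration, and the
sample text is just the excerpt quoted above; I'm assuming the
"<id> <- name name ..." layout holds and that a line without "<-" is a
wrapped continuation of the previous one.

```python
def parse_slab_aliases(text):
    """Map each merged-slab id to the list of cache names sharing it."""
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "[...]":
            continue  # skip blanks and the elision markers from the mail
        if "<-" in line:
            slab_id, names = line.split("<-", 1)
            current = slab_id.strip()
            groups.setdefault(current, []).extend(names.split())
        elif current is not None:
            # Assumed to be a wrapped continuation of the previous entry.
            groups[current].extend(line.split())
    return groups


def neighbours(groups, cache):
    """Return (slab_id, other caches) for the slab that holds `cache`."""
    for slab_id, names in groups.items():
        if cache in names:
            return slab_id, [n for n in names if n != cache]
    return None, []


# Excerpt from the mail, not a full slabinfo dump.
SAMPLE = """\
[...]
:0000080     <- zswap_entry Acpi-Parse kernfs_iattrs_cache
uvm_tools_replay_data_t Acpi-State audit_tree_mark
[...]
"""

groups = parse_slab_aliases(SAMPLE)
slab_id, others = neighbours(groups, "zswap_entry")
print(slab_id, others)
```

Any out-of-tree cache that shows up in the returned list is a candidate
for this kind of cross-object corruption, though sharing a slab by
itself proves nothing.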

As such I'll refrain from posting more about this or similar bugs
until I can get a guarantee it happens with a non-tainted kernel
(fwiw, I have not seen crashes for 2 weeks or so, hopefully this issue
is fixed).
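
A minimal sketch of the taint check, for reference. It decodes the
bitmask in /proc/sys/kernel/tainted and only covers the three
module-related bits documented in the kernel's tainted-kernels
documentation (bit 0 'P', bit 12 'O', bit 13 'E'); the function names
are my own, and other taint bits are deliberately ignored.

```python
# Module-related taint bits only; other bits are ignored here.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    12: ("O", "externally-built (out-of-tree) module was loaded"),
    13: ("E", "unsigned module was loaded"),
}


def decode_taint(value):
    """Return the taint letters set in `value`, lowest bit first."""
    return [
        letter
        for bit, (letter, _desc) in sorted(TAINT_FLAGS.items())
        if value & (1 << bit)
    ]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint bitmask (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


# Usage on a live system:
#   print(decode_taint(read_taint()))
```

An empty list from decode_taint() only means none of these three bits
are set, not that the kernel is entirely untainted.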

Again, sorry for not checking the taint before posting this, and thank
you for your time :)

-- 
Pedro