Re: Known and unfixed active data loss bug in MM + XFS with large folios since Dec 2021 (any kernel from 6.1 upwards)

On Thu, 19 Sept 2024 at 08:35, Christian Theune <ct@xxxxxxxxxxxxxxx> wrote:
>
> Happy to! I see there’s still some back and forth on the specific
> patches. Let me know which kernel version and which patches I should
> start trying out. I’m losing track while following the discussion.

Yeah, right now Jens is still going to run some more testing, but I
think the plan is to just backport

  a4864671ca0b ("lib/xarray: introduce a new helper xas_get_order")
  6758c1128ceb ("mm/filemap: optimize filemap folio adding")

and I think we're at the point where you might as well start testing
that if you have the cycles for it. Jens is mostly trying to confirm
the root cause, but even without that, I think you running your load
with those two changes back-ported is worth it.

(Or even just try running it on plain 6.10 or 6.11, both of which
already have those commits)
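
A rough sketch of the version check implied here, for deciding whether a
given machine needs the two commits back-ported at all. It assumes the
affected range starts at 6.1 (from the subject line) and that mainline
6.10 and later already carry both commits (from the note above). The
names parse_release and needs_backport are made up for illustration, and
the check knows nothing about fixes a stable or distro kernel may already
have picked up.

```python
import re

# Assumptions taken from this thread, not from any authoritative table:
# affected kernels start at 6.1, and mainline 6.10+ already has both commits.
FIRST_AFFECTED = (6, 1)
FIRST_FIXED = (6, 10)


def parse_release(release):
    """Extract (major, minor) from a `uname -r` style string, e.g. '6.6.52-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def needs_backport(release):
    """Return True if an unpatched mainline kernel of this release lacks the two commits."""
    version = parse_release(release)
    # Tuple comparison handles 6.9 < 6.10 correctly, unlike string comparison.
    return FIRST_AFFECTED <= version < FIRST_FIXED
```

On a running system the input could come from platform.release(), for
example needs_backport(platform.release()).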

> In preparation: I’m wondering whether the known reproducer gives
> insight how I might force my load to trigger it more easily? Would
> running the reproducer above and combining that with a running
> PostgreSQL benchmark make sense?
>
> Otherwise we’d likely only be getting insight after weeks of not
> seeing crashes …

So considering how well the reproducer works for Jens and Chris, my
main worry is whether your load might have some _additional_ issue.

Unlikely, but still .. The two commits fix the reproducer, so I think
the important thing is to make sure that they really fix the original
issue too.

And yeah, I'd be surprised if it doesn't, but at the same time I would
_not_ suggest you try to make your load look more like the case we
already know gets fixed.

So yes, it will be "weeks of not seeing crashes" until we'd be
_really_ confident it's all the same thing, but I'd rather still have
you test that, than test something other than what caused issues
originally, if you see what I mean.

         Linus




