On Tue, Jun 11, 2024 at 11:01 PM Oliver Sang <oliver.sang@xxxxxxxxx> wrote:
>
> hi, Yang Shi,
>
> On Wed, Jun 05, 2024 at 08:44:37PM -0700, Yang Shi wrote:
> > Oliver Sang <oliver.sang@xxxxxxxxx> wrote on Wed, Jun 5, 2024 at 19:16:
> >
> > > hi, Yang Shi,
> > >
> > > On Tue, Jun 04, 2024 at 04:53:56PM -0700, Yang Shi wrote:
> > > > On Mon, Jun 3, 2024 at 9:54 AM Yang Shi <shy828301@xxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, Jun 3, 2024 at 7:02 AM Oliver Sang <oliver.sang@xxxxxxxxx> wrote:
> > > > > >
> > > > > > hi, Yang Shi,
> > > > > >
> > > > > > On Fri, May 31, 2024 at 01:57:06PM -0700, Yang Shi wrote:
> > > > > > > Hi Oliver,
> > > > > > >
> > > > > > > I just came up with a quick patch (just build test) per the discussion
> > > > > > > and attached, can you please to give it a try? Once it is verified, I
> > > > > > > will refine the patch and submit for review.
> > > > > >
> > > > > > what's the base of this patch? I tried to apply it upon efa7df3e3b or
> > > > > > v6.10-rc2. both failed.
> > > > >
> > > > > Its base is mm-unstable. The head commit is 8e06d6b9274d ("mm: add
> > > > > swappiness= arg to memory.reclaim"). Sorry for the confusion, I should
> > > > > have mentioned this.
> > > >
> > > > I just figured out a bug in the patch. Anyway, we are going to take a
> > > > different approach to fix the issue per the discussion. I already sent
> > > > the series to the mailing list. Please refer to
> > > >
> > > > https://lore.kernel.org/linux-mm/20240604234858.948986-1-yang@xxxxxxxxxxxxxxxxxxxxxx/
> > >
> > > got it. seems you will submit v2? should we wait v2 to do the tests?
> >
> >
> > The real fix is patch #1, that doesn’t need v2. So you just need to test
> > that.
>
> we've finished tests and confirmed patch #1 fixed the issue.
> we also tested upon patch #2, still clean.

Thanks for testing. Sorry for the late reply, I just came back from
vacation. It seems Andrew hasn't taken the fix yet. I will resend the
patch with your Tested-by tag.
And I will drop patch #2 since it is just a cleanup and I didn't
receive any review comments. In addition, the ongoing hugepd cleanup
may make this cleanup easier, so I will put it on the back burner for
now.

>
> our bot applied your patch upon 306dde9ce5c951 as below
>
> 5d45cc9b1beb57 mm: gup: do not call try_grab_folio() in slow path
> fd3fc964468925 mm: page_ref: remove folio_try_get_rcu()
> 306dde9ce5c951 foo
>
> on 306dde9ce5c951, we still observed the issue we reported. clean on both patch
> #1 and #2
>
> 306dde9ce5c9516d fd3fc96446892528af48d6271a3 5d45cc9b1beb57386992c005669
> ---------------- --------------------------- ---------------------------
>        fail:runs  %reproduction    fail:runs  %reproduction    fail:runs
>            |             |             |             |             |
>          47:50          -94%            :50          -94%            :50  dmesg.Kernel_panic-not_syncing:Fatal_exception
>          47:50          -94%            :50          -94%            :50  dmesg.Oops:invalid_opcode:#[##]KASAN
>          47:50          -94%            :50          -94%            :50  dmesg.RIP:try_get_folio
>          47:50          -94%            :50          -94%            :50  dmesg.kernel_BUG_at_include/linux/page_ref.h
>
> >
> > For patch #2, I haven’t received any comment yet and I’m going to travel so
> > I’m not going to submit v2 soon.
> >
> > And I heard if hugepd is going to be gone soon, so I may wait for that then
> > rebase on top of it. Anyway it is just a clean up.
> >
> > >
> > > sorry that due to resource constraint, we cannot respond test request very
> > > quickly now.
> > >
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Yang
> > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Cheers,
> > > > > > > > >
> > > > > > > > > David / dhildenb
> > > > > > > > >