Re: [PATCH mm-unstable v1 5/5] mm/swap: remove boilerplate

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Aug 03, 2024 at 11:55:51PM -0700, Hugh Dickins wrote:
> On Wed, 10 Jul 2024, Yu Zhao wrote:
> 
> > Remove boilerplate by using a macro to choose the corresponding lock
> > and handler for each folio_batch in cpu_fbatches.
> > 
> > Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> 
> Andrew, please revert this "remove boilerplate" patch (and of course its
> followup fix) from mm-unstable. From the title I presume it was intended
> to make no functional change, but that's far from so.
> 
> Under tmpfs swapping load, on different runs I see various badnesses:
> "Bad page" in __free_one_page(), Oops in __run_timer_base(),
> WARNING at kernel/workqueue.c:790 in set_work_data(), PageBuddy BUG
> at page-flags.h:1009 from __del_page_from_freelist(), something (I'd
> given up taking better notes by this time) in __queue_work(), others.
> 
> All those were including the fix to Barry's report: without that fix,
> the boot is drowned in warnings scrolling past too fast to be read.
> 
> (All the above were on the HP workstation, swapping to SSD; whereas on
> this ThinkPad, swapping to NVMe, no problem seen at all - I mention the
> swapping medium, but have no idea whether that's a relevant difference.
> In each case, MGLRU compiled in but not enabled. THPs and 64kTHPs active.)
> 
> Sorry, but I've put no effort whatsoever into debugging this: "remove
> boilerplate" didn't seem worth the effort, and my personal preference
> is for readable boilerplate over hiding in a macro.  If you prefer the
> macro, I expect Yu can soon come up with a fix (which I could test here):
> but for now please revert "remove boilerplate", since its issues get in
> the way of further mm testing.

Sorry for getting in your way, Hugh.

Apparently I didn't expect local_lock_t to be zero length, i.e., when
CONFIG_DEBUG_LOCK_ALLOC is not set. So that might explain why you only
had problems with one of the two machines, where it failed to disable
IRQ when rotating clean pages after writeback.

The following should fix it, in case you want to verify the above:

diff --git a/mm/swap.c b/mm/swap.c
index 4bc08352ad87..67a246772811 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -254,7 +254,7 @@ static void __folio_batch_add_and_move(struct folio_batch __percpu *fbatch,
 		folio,										\
 		op,										\
 		on_lru,										\
-		offsetof(struct cpu_fbatches, op) > offsetof(struct cpu_fbatches, lock_irq)	\
+		offsetof(struct cpu_fbatches, op) >= offsetof(struct cpu_fbatches, lock_irq)	\
 	)
 
 static void lru_move_tail(struct lruvec *lruvec, struct folio *folio)




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux