On Thu, 2023-08-03 at 23:04 +0800, Kefeng Wang wrote:
>
> On 2023/8/3 21:41, Aaron Lu wrote:
> > On Thu, Aug 03, 2023 at 02:06:46PM +0800, Aaron Lu wrote:
> > > On Wed, Aug 02, 2023 at 07:54:38PM +0700, Bagas Sanjaya wrote:
> > > > Hi,
> > > >
> > > > I notice a bug report on Bugzilla [1]. Quoting from it:
> > > >
> > > > > How to reproduce:
> > > > >
> > > > > Had 24 CPU Alderlake 16GB debian12 system running with default kernel (from makecondig) on 6.5-rc4, exercised with no swap to start with.
> > > > >
> > > > > using stress-ng tip commit 0f2ef02e9bc5abb3419c44be056d5fa3c97e0137
> > > > > (see https://github.com/ColinIanKing/stress-ng )
> > > > >
> > > > > build and run stress-ng for say 60 minutes:
> > > > >
> > > > > ./stress-ng --cpu-online 50 --brk 50 --swap 50 --vmstat 1 -t 60m
> > > > >
> > > > > Will hang in mm/swapfile.c:718 add_to_avail_list+0x93/0xa0
> > > > >
> > > > > See attached file for an image of the console on the hang (I'm trying to get the full stack dump).
> > > >
> > > > See Bugzilla for the full thread and attached console image.
> > > >
> > > > FWIW, I have to forward this bug report to the mailing lists because
> > > > Thorsten noted that many developers don't take a look on Bugzilla
> > > > (see the BZ thread).
> > >
> > > Thanks.
> > >
> > > I can reproduce this issue using below cmdline:
> > > $ sudo ./stress-ng --brk 50 --swap 5 --vmstat 1 -t 60m
> > >
> > > I'll investigate what is happening.
> >
> > Hi Colin,
> >
> > Can you try the below diff on top of v6.5-rc4? It works for me here
> > although I got the warn in a different place in get_swap_pages():
> >
> >         WARN(!si->highest_bit,
> >              "swap_info %d in list but !highest_bit\n",
> >              si->type);
> >
> > I think the warn you got in add_to_avail_list() due to the swap device
> > is already in the list is similar, see below explanation.
> >
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 8e6dde68b389..cb7e93ec1933 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -2330,7 +2330,8 @@ static void _enable_swap_info(struct swap_info_struct *p)
> >           * swap_info_struct.
> >           */
> >          plist_add(&p->list, &swap_active_head);
> > -        add_to_avail_list(p);
> > +        if (p->highest_bit)
> > +                add_to_avail_list(p);
> >   }
>
> There is a patch in next,
>
> commit bdfc7028681ddbce5ab08f4888d157a981060544
> Author: Ma Wupeng <mawupeng1@xxxxxxxxxx>
> Date:   Tue Jun 27 20:08:33 2023 +0800
>
>     swap: stop add to avail list if swap is full
>

Ah, I should have tried mm-unstable first. I took a look at that commit:
it is exactly the same issue and the same fix, so with that fix we are
good now.

> >
> >   static void enable_swap_info(struct swap_info_struct *p, int prio,
> >
> > The finding is, if a swap device failed to be swapoff, then it will be
> > reinsert_swap_info() -> _enable_swap_info() -> add_to_avail_list(). The
> > problem is, this swap device may run out of space with its highest_bit
> > being 0 and shouldn't be added to avail list. In your case, once its
> > highest_bit becomes non-zero, it will go through add_to_avail_list()
> > and since it's already in the list, thus the warn.
> >
> > If it works for you, I'll prepare a patch. Thanks.
> >
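
For anyone following along, the sequence in the quoted explanation can be
modelled in a few lines of userspace C. This is only a rough sketch, not
kernel code: every toy_* name is hypothetical, a bool stands in for
membership in the avail plist, and the "slot freed on a full device" step
is my simplified reading of how highest_bit becomes non-zero again. The
check_full flag switches the highest_bit test from the diff on and off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the swap_info_struct fields discussed above. */
struct toy_swap_dev {
	unsigned int highest_bit;	/* 0 models "device is full" */
	bool on_avail_list;		/* models membership in the avail list */
};

/* Counts how often the "already in the list" condition is hit. */
static int toy_warnings;

/* Models add_to_avail_list(): adding a device twice is the warned case. */
static void toy_add_to_avail_list(struct toy_swap_dev *p)
{
	if (p->on_avail_list) {
		toy_warnings++;
		return;
	}
	p->on_avail_list = true;
}

/*
 * Models the re-enable path taken after a failed swapoff.
 * With check_full set, a full device is not put on the avail list,
 * which is the behaviour the diff above introduces.
 */
static void toy_enable_swap_info(struct toy_swap_dev *p, bool check_full)
{
	if (!check_full || p->highest_bit)
		toy_add_to_avail_list(p);
}

/*
 * Models a slot being freed (assumption: simplified). A device that was
 * full becomes usable again and is added to the avail list.
 */
static void toy_free_slot(struct toy_swap_dev *p, unsigned int offset)
{
	bool was_full = (p->highest_bit == 0);

	assert(offset != 0);	/* 0 is reserved to mean "full" in this model */
	p->highest_bit = offset;
	if (was_full)
		toy_add_to_avail_list(p);
}

/* Replays the scenario once and returns the number of warnings seen. */
static int toy_run(bool check_full)
{
	struct toy_swap_dev dev = { .highest_bit = 0, .on_avail_list = false };

	toy_warnings = 0;
	toy_enable_swap_info(&dev, check_full);	/* failed swapoff, device full */
	toy_free_slot(&dev, 42);		/* space becomes available later */
	return toy_warnings;
}
```

In this model toy_run(false) returns 1 (the full device is listed on
re-enable and then added a second time when a slot is freed), while
toy_run(true) returns 0.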