On Fri, 2 Apr 2021 15:03:37 +0800 Stillinux <stillinux@xxxxxxxxx> wrote:

> In the case of high system memory and load pressure, we ran ltp test
> and found that the system was stuck, the direct memory reclaim was
> all stuck in io_schedule, the waiting request was stuck in the blk_plug
> flow of one process, and this process fell into an infinite loop.
> not do the action of brushing out the request.
>
> The call flow of this process is swap_cluster_readahead.
> Use blk_start/finish_plug for blk_plug operation,
> flow swap_cluster_readahead->__read_swap_cache_async->swapcache_prepare.
> When swapcache_prepare return -EEXIST, it will fall into an infinite loop,
> even if cond_resched is called, but according to the schedule,
> sched_submit_work will be based on tsk->state, and will not flash out
> the blk_plug request, so will hang io, causing the overall system hang.
>
> For the first time involving the swap part, there is no good way to fix
> the problem from the fundamental problem. In order to solve the
> engineering situation, we chose to make swap_cluster_readahead aware of
> the memory pressure situation as soon as possible, and do io_schedule to
> flush out the blk_plug request, thereby changing the allocation flag in
> swap_readpage to GFP_NOIO , No longer do the memory reclaim of flush io.
> Although system operating normally, but not the most fundamental way.
>

Thanks.

I'm not understanding why swapcache_prepare() repeatedly returns
-EEXIST in this situation?

And how does the switch to GFP_NOIO fix this?  Simply by avoiding
direct reclaim altogether?

> ---
>  mm/page_io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_io.c b/mm/page_io.c
> index c493ce9ebcf5..87392ffabb12 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -403,7 +403,7 @@ int swap_readpage(struct page *page, bool synchronous)
>  	}
>
>  	ret = 0;
> -	bio = bio_alloc(GFP_KERNEL, 1);
> +	bio = bio_alloc(GFP_NOIO, 1);
>  	bio_set_dev(bio, sis->bdev);
>  	bio->bi_opf = REQ_OP_READ;
>  	bio->bi_iter.bi_sector = swap_page_sector(page);
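
To make the reported hang easier to reason about, here is a minimal
userspace model of the plug behaviour described in the report. It is
not kernel code: struct model_task and every model_* function are
hypothetical names made up for this sketch. The only behaviour it
assumes is what the report states (a task that stays runnable does not
get its plugged requests flushed when it yields in the retry loop) plus
the fact that io_schedule() flushes the current task's plug before
sleeping.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task holding a block plug. */
struct model_task {
	int plugged;	/* requests queued on the per-task plug */
	int submitted;	/* requests actually handed to the device */
};

/* Model of flushing the plug: everything queued is submitted. */
static void model_flush_plug(struct model_task *t)
{
	assert(t->plugged >= 0);
	t->submitted += t->plugged;
	t->plugged = 0;
}

/*
 * Model of cond_resched() as described in the report: the task is
 * still runnable, so yielding the CPU leaves the plug untouched.
 */
static void model_cond_resched(struct model_task *t)
{
	(void)t;
}

/* Model of io_schedule(): the plug is flushed before sleeping. */
static void model_io_schedule(struct model_task *t)
{
	model_flush_plug(t);
}

/*
 * Bounded model of the -EEXIST retry loop. The real loop has no
 * bound; "retries" only keeps the sketch finite. Returns the number
 * of requests submitted once the loop has spun "retries" times.
 */
static int model_retry_loop(struct model_task *t, int retries,
			    bool use_io_schedule)
{
	for (int i = 0; i < retries; i++) {
		if (use_io_schedule)
			model_io_schedule(t);
		else
			model_cond_resched(t);
	}
	return t->submitted;
}
```

With three requests plugged, model_retry_loop(&t, 1000, false) leaves
all three stuck on the plug no matter how long it spins, while a single
pass with use_io_schedule set submits them. That is only the symptom
side of the report; the model says nothing about why swapcache_prepare()
keeps returning -EEXIST, which is the open question above.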