On Wed, 16 Mar 2022 11:41:37 +0800 Dong Aisheng <dongas86@xxxxxxxxx> wrote: > On Wed, Mar 16, 2022 at 6:58 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > > > > On Tue, 15 Mar 2022 22:45:20 +0800 Dong Aisheng <aisheng.dong@xxxxxxx> wrote: > > > > > --- a/mm/cma.c > > > +++ b/mm/cma.c > > > > > > ... > > > > > > @@ -457,6 +458,16 @@ struct page *cma_alloc(struct cma *cma, unsigned long count, > > > offset); > > > if (bitmap_no >= bitmap_maxno) { > > > spin_unlock_irq(&cma->lock); > > > + pr_debug("%s(): alloc fail, retry loop %d\n", __func__, loop++); > > > + /* > > > + * rescan as others may finish the memory migration > > > + * and quit if no available CMA memory found finally > > > + */ > > > + if (start) { > > > + schedule(); > > > + start = 0; > > > + continue; > > > + } > > > break; > > > > The schedule() is problematic. For a start, we'd normally use > > cond_resched() here, so we avoid calling the more expensive schedule() > > if we know it won't perform any action. > > > > But cond_resched() is problematic if this thread has realtime > > scheduling policy and the process we're waiting on does not. One way > > to address that is to use an unconditional msleep(1), but that's still > > just a hack. > > > > I think we can simply drop schedule() here during the second round of retry > as the estimated delay may not be really needed. That will simply cause a tight loop, so I'm obviously not understanding the proposal.