Re: [PATCH] mm: be more verbose for alloc_contig_range failures

On Thu, Feb 18, 2021 at 05:26:08PM +0100, David Hildenbrand wrote:
> On 18.02.21 17:19, Minchan Kim wrote:
> > On Thu, Feb 18, 2021 at 10:43:21AM +0100, David Hildenbrand wrote:
> > > On 18.02.21 10:35, Michal Hocko wrote:
> > > > On Thu 18-02-21 10:02:43, David Hildenbrand wrote:
> > > > > On 18.02.21 09:56, Michal Hocko wrote:
> > > > > > On Wed 17-02-21 08:36:03, Minchan Kim wrote:
> > > > > > > > alloc_contig_range is usually used on a CMA area or the movable zone.
> > > > > > > > A page migration failure in those areas is critical, so dump more
> > > > > > > > debugging messages, as memory_hotplug does, unless the user
> > > > > > > > specifies __GFP_NOWARN.
> > > > > > 
> > > > > > I agree with David that this has a potential to generate a lot of output
> > > > > > and it is not really clear whether it is worth it. Page isolation code
> > > > > > > already has REPORT_FAILURE mode which is currently used only for the memory
> > > > > > hotplug because this was just too noisy from the CMA path - d381c54760dc
> > > > > > ("mm: only report isolation failures when offlining memory").
> > > > > > 
> > > > > > > Maybe migration is less likely to fail, but still.
> > > > > 
> > > > > > Side note: I really dislike the uncontrolled error reporting on the memory
> > > > > > offlining path that we have enabled by default. Yeah, it might be useful for
> > > > > > ZONE_MOVABLE in some cases, but otherwise it's just noise.
> > > > > 
> > > > > Just do a "sudo stress-ng --memhotplug 1" and see the log getting flooded
> > > > 
> > > > Anyway we can discuss this in a separate thread but I think this is not
> > > > a representative workload.
> > > 
> > > Sure, but the essence is "this is noise", and we'll have more noise on
> > > alloc_contig_range() as we see these calls more frequently. There should be
> > > an explicit way to enable such *debug* messages.
> > 
> > alloc_contig_range already has gfp_mask and it respects __GFP_NOWARN.
> 
> I am not 100% sure it does.

Oh, it should. Otherwise, let's fix either the caller or
alloc_contig_range, since we already have a user that passes it:

```
    ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
            GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
```

> 
> > Why shouldn't people use it if they don't care about the failure?
> 
> Because flooding the log with noise maybe a handful of people on this planet
> care about is absolutely useless. With the warnings in warn_alloc() people
> can at least conclude something reasonable.
> 
> > Semantically, it makes sense to me.
> > 
> > About the message flooding, shouldn't we go with ratelimiting?
> 
> At least that (see warn_alloc()). But I'd even want to see some other
> trigger to enable this explicitly on demand.

No objection.

How about adding a verbose knob under CONFIG_CMA_DEBUGFS, with
alloc_contig_range(..., bool verbose), similar to start_isolate_page_range?

If the admin turns on verbose mode under CONFIG_CMA_DEBUGFS,
cma_alloc will call alloc_contig_range(..., true).
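
To make the combination concrete, here is a rough userspace mock-up of the
gating I have in mind: report only when the verbose knob is on, the caller
did not ask for no-warn, and a rate limit allows it. This is not kernel code.
Every name in it (MOCK_GFP_NOWARN, struct mock_ratelimit,
report_migration_failure) is made up for illustration, and the limiter is
only a simplified stand-in for the kind of interval/burst ratelimiting
mentioned above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for __GFP_NOWARN; not the kernel's value. */
#define MOCK_GFP_NOWARN 0x1u

/* Simplified interval/burst limiter (assumption: time is passed in as ticks). */
struct mock_ratelimit {
	long interval;	/* window length in ticks */
	int burst;	/* reports allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* reports emitted in the current window */
};

/* Returns true if one more report fits into the current window. */
static bool mock_ratelimit_ok(struct mock_ratelimit *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* Window expired: start a new one. */
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/*
 * Decide whether to report a migration failure at @pfn.
 * @verbose models the proposed debugfs knob; @gfp_mask models the
 * caller's flags. Returns true if a report was emitted.
 */
static bool report_migration_failure(struct mock_ratelimit *rs,
				     unsigned int gfp_mask, bool verbose,
				     unsigned long pfn, long now)
{
	/* Silent unless explicitly enabled and not suppressed by the caller. */
	if (!verbose || (gfp_mask & MOCK_GFP_NOWARN))
		return false;
	/* Suppressed reports do not consume the rate-limit budget. */
	if (!mock_ratelimit_ok(rs, now))
		return false;
	fprintf(stderr, "migration failed at pfn %#lx\n", pfn);
	return true;
}
```

With interval = 10 and burst = 2, for example, a third failure inside the
same window is dropped and reporting resumes once the window rolls over.
Whether the no-warn flag or the knob should win when both are set is an open
question; the sketch lets the flag win.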
