Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy

"Robin H. Johnson" <robbat2@xxxxxxxxxx> · Thu, 1 Dec 2016 06:21:42 +0000



On Wed, Nov 30, 2016 at 10:24:59PM +0100, Vlastimil Babka wrote:
> [add more CC's]
> 
> On 11/30/2016 09:19 PM, Robin H. Johnson wrote:
> > Somewhere in the Radeon/DRM codebase, CMA page allocation has either
> > regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is
> > doing something different with pages.
> 
> Could be that it didn't use dma_generic_alloc_coherent() before, or you didn't 
> have the generic CMA pool configured.
v4.9-rc7-23-gded6e842cf49:
[    0.000000] cma: Reserved 16 MiB at 0x000000083e400000
[    0.000000] Memory: 32883108K/33519432K available (6752K kernel code, 1244K rwdata, 4716K rodata, 1772K init, 2720K bss, 619940K reserved, 16384K cma-reserved)
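
As a sanity check that the busy PFNs really do fall inside this reserved region, here is a rough sketch of the arithmetic. It assumes 4 KiB pages (the usual x86-64 page size, not stated anywhere in this thread), and the helper name pfn_in_region is made up for illustration; it is not a kernel function.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: return 1 if pfn lies inside the physical region
 * [base, base + size_bytes), expressed in page frame numbers.
 */
static int pfn_in_region(uint64_t pfn, uint64_t base, uint64_t size_bytes)
{
    uint64_t start_pfn = base >> SKETCH_PAGE_SHIFT;
    uint64_t end_pfn = (base + size_bytes) >> SKETCH_PAGE_SHIFT;

    return pfn >= start_pfn && pfn < end_pfn;
}
```

With the base 0x83e400000 and size 16 MiB from the boot log, the region covers PFNs [83e400, 83f400) under that assumption, so a reported range such as [83e4d9, 83e4da) sits inside the CMA area.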

> What's the output of "grep CMA" on your 
> .config?

# grep CMA .config |grep -v -e SECMARK= -e CONFIG_BCMA -e CONFIG_USB_HCD_BCMA -e INPUT_CMA3000 -e CRYPTO_CMAC
CONFIG_CMA=y
# CONFIG_CMA_DEBUG is not set
# CONFIG_CMA_DEBUGFS is not set
CONFIG_CMA_AREAS=7
CONFIG_DMA_CMA=y
CONFIG_CMA_SIZE_MBYTES=16
CONFIG_CMA_SIZE_SEL_MBYTES=y
# CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set
# CONFIG_CMA_SIZE_SEL_MIN is not set
# CONFIG_CMA_SIZE_SEL_MAX is not set
CONFIG_CMA_ALIGNMENT=8

> Or any kernel boot options with cma in name? 
None.


> By default config this should not be used on x86.
What do you mean by that statement?
Should enabling CONFIG_CMA be disallowed? Should Radeon and CMA be
mutually exclusive?

> > Given that I haven't seen ANY other reports of this, I'm inclined to
> > believe the problem is drm/radeon specific (if I don't start X, I can't
> > reproduce the problem).
> 
> It's rather CMA specific, the allocation attempts just can't be 100% reliable due 
> to how CMA works. The question is if it should be spewing in the log in the 
> context of dma-cma, which has a fallback allocation option. It even uses 
> __GFP_NOWARN, perhaps the CMA path should respect that?
Yes, I'd say if there's a fallback without much penalty, nowarn makes
sense. If the fallback just tries multiple addresses until success, then
the warning should only be issued when too many attempts have been made.
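
To make the second idea concrete, here is a minimal userspace sketch of "warn only after too many attempts". It is not the kernel's code: the struct, the function name and the threshold of 8 are all hypothetical, and the nowarn flag simply stands in for a caller having passed __GFP_NOWARN.

```c
#include <stdbool.h>

/* Hypothetical threshold: failed attempts tolerated before warning. */
#define MAX_QUIET_ATTEMPTS 8

/* Hypothetical per-request bookkeeping. */
struct alloc_stats {
    unsigned int failed_attempts;
};

/*
 * Record one failed attempt for a request. Returns true exactly once,
 * on the first failure past the threshold, and never when the caller
 * asked for no warnings.
 */
static bool should_warn_busy(struct alloc_stats *st, bool nowarn)
{
    st->failed_attempts++;

    if (nowarn)
        return false;

    return st->failed_attempts == MAX_QUIET_ATTEMPTS + 1;
}
```

The point of returning true only at the crossing is that one stubborn request produces a single log line rather than one per retry.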

> 
> > The rate of the problem starts slow, and also is relatively low on an idle
> > system (my screens blank at night, no xscreensaver running), but it still ramps
> > up over time (to the point of generating 2.5GB/hour of "(timestamp)
> > alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses (~100
> > unique ranges for a day).
> >
> > My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ 9
> > virtual desktops per monitor).
> So IIUC, except the messages, everything actually works fine?
There's high kernel CPU usage that seems to roughly correlate with the
messages, but I can't yet tell if that's due to the syslog itself, or
repeated alloc_contig_range requests.
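
One way to get a handle on this would be to count the messages and the distinct ranges over a fixed interval and compare that against the CPU usage. A rough sketch of the parsing step follows; the function name is made up, and the line layout is assumed to match the sample quoted above, with an optional timestamp prefix.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the PFN range from a log line such as
 *   "[12345.678] alloc_contig_range: [83e4d9, 83e4da) PFNs busy"
 * Returns 1 and fills *start and *end on success, 0 otherwise.
 * Text after the second number is not validated.
 */
static int parse_busy_range(const char *line,
                            unsigned long long *start,
                            unsigned long long *end)
{
    const char *p = strstr(line, "alloc_contig_range: [");

    if (p == NULL)
        return 0;

    if (sscanf(p, "alloc_contig_range: [%llx, %llx", start, end) != 2)
        return 0;

    return 1;
}
```

Feeding each syslog line through this and tallying the (start, end) pairs would give a message rate and a unique-range count to compare against the CPU usage.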

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail   : robbat2@xxxxxxxxxx
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel