On Wed, Nov 30, 2016 at 10:24:59PM +0100, Vlastimil Babka wrote: > [add more CC's] > > On 11/30/2016 09:19 PM, Robin H. Johnson wrote: > > Somewhere in the Radeon/DRM codebase, CMA page allocation has either > > regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is > > doing something different with pages. > > Could be that it didn't use dma_generic_alloc_coherent() before, or you didn't > have the generic CMA pool configured. v4.9-rc7-23-gded6e842cf49: [ 0.000000] cma: Reserved 16 MiB at 0x000000083e400000 [ 0.000000] Memory: 32883108K/33519432K available (6752K kernel code, 1244K rwdata, 4716K rodata, 1772K init, 2720K bss, 619940K reserved, 16384K cma-reserved) > What's the output of "grep CMA" on your > .config? # grep CMA .config |grep -v -e SECMARK= -e CONFIG_BCMA -e CONFIG_USB_HCD_BCMA -e INPUT_CMA3000 -e CRYPTO_CMAC CONFIG_CMA=y # CONFIG_CMA_DEBUG is not set # CONFIG_CMA_DEBUGFS is not set CONFIG_CMA_AREAS=7 CONFIG_DMA_CMA=y CONFIG_CMA_SIZE_MBYTES=16 CONFIG_CMA_SIZE_SEL_MBYTES=y # CONFIG_CMA_SIZE_SEL_PERCENTAGE is not set # CONFIG_CMA_SIZE_SEL_MIN is not set # CONFIG_CMA_SIZE_SEL_MAX is not set CONFIG_CMA_ALIGNMENT=8 > Or any kernel boot options with cma in name? None. > By default config this should not be used on x86. What do you mean by that statement? It should be disallowed to enable CONFIG_CMA? Radeon and CMA should be mutually exclusive? > > Given that I haven't seen ANY other reports of this, I'm inclined to > > believe the problem is drm/radeon specific (if I don't start X, I can't > > reproduce the problem). > > It's rather CMA specific, the allocation attemps just can't be 100% reliable due > to how CMA works. The question is if it should be spewing in the log in the > context of dma-cma, which has a fallback allocation option. It even uses > __GFP_NOWARN, perhaps the CMA path should respect that? Yes, I'd say if there's a fallback without much penalty, nowarn makes sense. If the fallback just tries multiple addresses until success, then the warning should only be issued when too many attempts have been made. > > > The rate of the problem starts slow, and also is relatively low on an idle > > system (my screens blank at night, no xscreensaver running), but it still ramps > > up over time (to the point of generating 2.5GB/hour of "(timestamp) > > alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses (~100 > > unique ranges for a day). > > > > My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ 9 > > virtual desktops per monitor). > So IIUC, except the messages, everything actually works fine? There's high kernel CPU usage that seems to roughly correlate with the messages, but I can't yet tell if that's due to the syslog itself, or repeated alloc_contig_range requests. -- Robin Hugh Johnson Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer E-Mail : robbat2@xxxxxxxxxx GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
Attachment:
signature.asc
Description: Digital signature