Hi Thomas,

On Thu, Aug 01, 2019 at 11:59:28AM +0200, Thomas Zimmermann wrote:
> Hi
>
> On 01.08.19 at 10:37, Feng Tang wrote:
> > On Thu, Aug 01, 2019 at 02:19:53PM +0800, Rong Chen wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> commit: 90f479ae51afa45efab97afdde9b94b9660dd3e4 ("drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation")
> >>>>>>>>>>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git master
> >>>>>>>>>> Daniel, Noralf, we may have to revert this patch.
> >>>>>>>>>>
> >>>>>>>>>> I expected some change in display performance, but not in VM. Since it's
> >>>>>>>>>> a server chipset, probably no one cares much about display performance.
> >>>>>>>>>> So that seemed like a good trade-off for re-using shared code.
> >>>>>>>>>>
> >>>>>>>>>> Part of the patch set is that the generic fb emulation now maps and
> >>>>>>>>>> unmaps the fbdev BO when updating the screen. I guess that's the cause
> >>>>>>>>>> of the performance regression. And it should be visible with other
> >>>>>>>>>> drivers as well if they use a shadow FB for fbdev emulation.
> >>>>>>>>> For fbcon we shouldn't need to do any maps/unmaps at all, this is for the
> >>>>>>>>> fbdev mmap support only. If the testcase mentioned here tests fbdev
> >>>>>>>>> mmap handling it's pretty badly misnamed :-) And as long as you don't
> >>>>>>>>> have an fbdev mmap there shouldn't be any impact at all.
> >>>>>>>> The ast and mgag200 have only a few MiB of VRAM, so we have to get the
> >>>>>>>> fbdev BO out if it's not being displayed. If not being mapped, it can be
> >>>>>>>> evicted and make room for X, etc.
> >>>>>>>>
> >>>>>>>> To make this work, the BO's memory is mapped and unmapped in
> >>>>>>>> drm_fb_helper_dirty_work() before being updated from the shadow FB. [1]
> >>>>>>>> That fbdev mapping is established on each screen update, more or less.
> >>>>>>>> From my (yet unverified) understanding, this causes the performance
> >>>>>>>> regression in the VM code.
> >>>>>>>>
> >>>>>>>> The original code in mgag200 used to kmap the fbdev BO while it's being
> >>>>>>>> displayed; [2] and the drawing code only mapped it when necessary (i.e.,
> >>>>>>>> not being displayed). [3]
> >>>>>>> Hm yeah, this vmap/vunmap is going to be pretty bad. We indeed should
> >>>>>>> cache this.
> >>>>>>>
> >>>>>>>> I think this could be added for VRAM helpers as well, but it's still a
> >>>>>>>> workaround and non-VRAM drivers might also run into such a performance
> >>>>>>>> regression if they use the fbdev's shadow fb.
> >>>>>>> Yeah agreed, fbdev emulation should try to cache the vmap.
> >>>>>>>
> >>>>>>>> Noralf mentioned that there are plans for other DRM clients besides the
> >>>>>>>> console. They would as well run into similar problems.
> >>>>>>>>
> >>>>>>>>>> The thing is that we'd need another generic fbdev emulation for ast and
> >>>>>>>>>> mgag200 that handles this issue properly.
> >>>>>>>>> Yeah I don't think we want to jump the gun here. If you can try to
> >>>>>>>>> repro locally and profile where we're wasting cpu time I hope that
> >>>>>>>>> should shed a light on what's going wrong here.
> >>>>>>>> I don't have much time ATM and I'm not even officially at work until
> >>>>>>>> late Aug. I'd send you the revert and investigate later. I agree that
> >>>>>>>> using generic fbdev emulation would be preferable.
> >>>>>>> Still not sure that's the right thing to do really. Yes it's a
> >>>>>>> regression, but vm testcases shouldn't run a single line of fbcon or drm
> >>>>>>> code. So why this is impacted so heavily by a silly drm change is very
> >>>>>>> confusing to me. We might be papering over a deeper and much more
> >>>>>>> serious issue ...
> >>>>>> It's a regression, the right thing is to revert first and then work
> >>>>>> out the right thing to do.
> >>>>> Sure, but I have no idea whether the testcase is doing something
> >>>>> reasonable.
> >>>>> If it's accidentally testing vm scalability of fbdev and
> >>>>> there's no one else doing something this pointless, then it's not a
> >>>>> real bug. Plus I think we're shooting the messenger here.
> >>>>>
> >>>>>> It's likely the test runs on the console and printfs stuff out while running.
> >>>>> But why did we not regress the world if a few prints on the console
> >>>>> have such a huge impact? We didn't get an entire stream of mails about
> >>>>> breaking stuff ...
> >>>> The regression seems not related to the commit. But we have retested
> >>>> and confirmed the regression. Hard to understand what happens.
> >>> Does the regressed test cause any output on console while it's
> >>> measuring? If so, it's probably accidentally measuring fbcon/DRM code in
> >>> addition to the workload it's trying to measure.
> >>>
> >>
> >> Sorry, I'm not familiar with DRM, we enabled the console to output logs, and
> >> attached please find the log file.
> >>
> >> "Command line: ... console=tty0 earlyprintk=ttyS0,115200
> >> console=ttyS0,115200 vga=normal rw"
> >
> > We did more checking, and found this test machine does use the
> > mgag200 driver.
> >
> > And we are suspecting the regression is caused by
> >
> > commit cf1ca9aeb930df074bb5bbcde55f935fec04e529
> > Author: Thomas Zimmermann <tzimmermann@xxxxxxx>
> > Date: Wed Jul 3 09:58:24 2019 +0200
>
> Yes, that's the commit. Unfortunately reverting it would require
> reverting a handful of other patches as well.
>
> I have a potential fix for the problem. Could you run and verify that it
> resolves the problem?

Sure, please send it to us. Rong and I will try it.

Thanks,
Feng

> Best regards
> Thomas
>
> >
> >     drm/fb-helper: Map DRM client buffer only when required
> >
> >     This patch changes DRM clients to not map the buffer by default. The
> >     buffer, like any buffer object, should be mapped and unmapped when
> >     needed.
> >
> >     An unmapped buffer object can be evicted to system memory and does
> >     not consume video ram until displayed. This allows to use generic fbdev
> >     emulation with drivers for low-memory devices, such as ast and mgag200.
> >
> >     This change affects the generic framebuffer console. HW-based consoles
> >     map their console buffer once and keep it mapped. Userspace can mmap this
> >     buffer into its address space. The shadow-buffered framebuffer console
> >     only needs the buffer object to be mapped during updates. While not being
> >     updated from the shadow buffer, the buffer object can remain unmapped.
> >     Userspace will always mmap the shadow buffer.
> >
> > which may add more load when fbcon is busy printing out messages.
> >
> > We are doing more testing inside 0day to confirm.
> >
> > Thanks,
> > Feng
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
> HRB 21284 (AG Nürnberg)