On 2022/2/9 16:43, Maxime Ripard wrote:
On Fri, Feb 04, 2022 at 12:29:39AM +0800, Sui Jingfeng wrote:
+static int lsdc_modeset = 1;
+MODULE_PARM_DESC(modeset, "Enable/disable CMA-based KMS(1 = enabled(default), 0 = disabled)");
+module_param_named(modeset, lsdc_modeset, int, 0644);
+
+static int lsdc_cached_coherent = 1;
+MODULE_PARM_DESC(cached_coherent, "uss cached coherent mapping(1 = enabled(default), 0 = disabled)");
+module_param_named(cached_coherent, lsdc_cached_coherent, int, 0644);
+
+static int lsdc_dirty_update = -1;
+MODULE_PARM_DESC(dirty_update, "enable dirty update(1 = enabled, 0 = disabled(default))");
+module_param_named(dirty_update, lsdc_dirty_update, int, 0644);
+
+static int lsdc_use_vram_helper = -1;
+MODULE_PARM_DESC(use_vram_helper, "use vram helper based solution(1 = enabled, 0 = disabled(default))");
+module_param_named(use_vram_helper, lsdc_use_vram_helper, int, 0644);
+
+static int lsdc_verbose = -1;
+MODULE_PARM_DESC(verbose, "Enable/disable print some key information");
+module_param_named(verbose, lsdc_verbose, int, 0644);
It's not really clear to me why you need any of those parameters. Why
would a user want to use a non coherent mapping for example?
Because we are Mips architecture. Paul Cercueil already explained it
in his mmap GEM buffers cachedpatch <https://lkml.kernel.org/lkml/20200822164233.71583-1-paul@xxxxxxxxxxxxxxx/T/>. I drag part of it to here for
convenient to reading:
/Traditionally, GEM buffers are mapped write-combine. Writes to the buffer
are accelerated, and reads are slow. Application doing lots////of
alpha-blending paint inside shadow buffers, which is then memcpy'd////into
the final GEM buffer.///
"non coherent mapping" is actually cached and it is for CMA helpers
base driver, not for VRAM helper based driver. For Loongson CPU/SoCs.
The cache coherency is maintained by hardware, therefore there no
need to worry about coherency problems. This is true at least for
ls3a3000, ls3a4000 and ls3a5000.
"non coherent" or "coherent" is not important here, the key point is
that the backing memory of the framebuffer is cached with non coherent
mapping, you don't need a shadow buffer layer when using X server's
modesetting driver.
Read and write to the framebuffer in system memory is much faster than
read and write to the framebuffer in the VRAM.
Why CMA helper based solution is faster than the VRAM based solution on Mips platform?
Partly because of the CPU have L1, L2 and L3 cache, especially L3 cache
is as large as 8MB, read and write from the cache is fast.
Another reason is as Paul Cercueil said, read from VRAM with write-combine
cache mode is slow. it is just uncache read.
Please note that we don't have a GPU here, we are just a display controller.
For the VRAM helper based driver case, the backing memory of the framebuffer
is located at VRAM, When using X server's modesetting driver, we have to enable
the ShadowFB option, Uncache acceleration support(at the kernel size) should
also be enabled. Otherwise the performance of graphic application is just slow.
Beside write-combine cache mode have bugs on our platform, a kernel side
developer have disabled it. Write-combine cache mode just boil down to uncached
now. See [1] and [2]
[1]https://lkml.org/lkml/2020/8/10/255
[2]https://lkml.kernel.org/lkml/1617701112-14007-1-git-send-email-yangtiezhu@xxxxxxxxxxx/T/
This is the reason why we prefer CMA helper base solution with non coherent mapping,
simply because it is fast.
As far as I know, Loongson's CPU does not has the concept of write-combine,
it only support three caching mode: uncached, cached and uncache acceleration.
write-combine is implemented with uncache acceleration on Mips.
My point wasn't just about the VRAM vs CMA stuff, it was about why do
you need all those switches in the first place?
Take the verbose parameter for example: it's entirely redundant with the
already existing, documented, DRM logging infrastructure.
Yes, verbose is redundant, we will use drm_dbg() instead of verbose.
thanks.
I am correcting.
Then, you have "modeset", and I'm not sure why it's supposed to be
there, at all. This is a modesetting driver, why would I want to disable
modesetting entirely?
Something you want fbdev driver, for example simple-framebuffer driver which
using the firmware passed fb (screeninfo).
besides, text mode support.
More fundamentally (and this extends to the CMA, caching and VRAM stuff
you explained above), why can't the driver pick the right decision all
the time and why would that be under the user control?
The right decision for ls7a1000 is to use VRAM based helper, But sometimes
we need CMA helper based solution. Because: The PRIME support is lost, use
lsdc with etnaviv is not possible any more.
Buffer sharing with etnaviv is no longer possible, loongson display controllers
are simple which require scanout buffers to be physically contiguous.
We still need to develop userspace driver(say xf86-video-loongson)
on ls3a4000 + ls7a1000 platform then deploy those driver to ls2k1000 platform.
ls3a4000 and ls3a5000 is fast enough which can build the linux kernel directly.
Build mesa, libdrm, xf86-video-loongson just a piece of cake.
Is is so fast on ls3a5000+ls7a1000, developing driver on ls2k1000 is cumbersome,
embarrassing slow.
I means it(ls3a4000/ls3a5000 + ls7a1000) is not just target for end user, but
it is a platform which can be used to develop software for other platform.
The author of linux device driver told us that device driver is providing mechanism, not policy.
We are able to make the decision, but we want to give the user more choice.
"pick the right decision all the time" is also true, i am considering correct this,
thanks.
You were mentioning that you need to work-around MIPS memory management.
Then fine, just do that on MIPS, and don't it on the other architectures
that don't need it. There's no need for a knob.
Maxime