Hi

On 17.01.22 at 19:47, Sven Schnelle wrote:
Hi Thomas,

Thomas Zimmermann <tzimmermann@xxxxxxx> writes:

Hi

On 14.01.22 at 19:11, Helge Deller wrote:

The fbdev layer is orphaned, but seems to need some care. So I'd like to step up as new maintainer.

Signed-off-by: Helge Deller <deller@xxxxxx>

First of all, thank you for stepping up to maintain the fbdev codebase. It really needs someone actively looking after it.

And now comes the BUT.

I want to second everything said by Daniel and Javier. In addition to purely organizational topics (trees, PRs, etc), there are a number of inherent problems with fbdev.

* It's 90s technology. It fits neither today's userspace nor today's hardware. If you have more than just the most trivial of graphical output, fbdev isn't for you.

* There's no new development in fbdev and there are no new drivers. Everyone works on DRM, which is better in most regards. The consequence is that userspace is slowly losing the ability to use fbdev.

That might be caused by the fact that no new drivers are accepted for
And that is caused by the fact that fbdev is nowhere near up to today's requirements.
fbdev. I wrote a driver for the HP Visualize FX5/10 cards end of last year which was rejected for inclusion into fbdev[1].
Yep, I was hoping for a reply.
Based on your recommendation i re-wrote the whole thing in DRM. This works but has several drawbacks:

- no modesetting. With fbdev, i can nicely switch resolutions with fbset. That doesn't work, and i've been told that this is not supported[2].
I didn't say that we're not going to support it at all. It's just not supported at the moment. vmwgfx has modesetting code that can serve as a starting point.
- It is *much* slower than fbdev with hardware blitting. I would have to dig out the numbers, but it's in the ratio of 1:15. The nice thing with fbdev blitting is that i get an array of pixels and the foreground/background colors all of these pixels should have. With the help of the hardware blitting, i can write 32 pixels at once with every 32-bit transfer.
For comparison, how fast is fbdev with plain memcpy() and memset()?
With DRM, the closest i could find was DRM_FORMAT_C8, which means one byte per pixel. So i can put 4 pixels into one 32-bit transfer.
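To make the packing arithmetic above concrete, here is a minimal sketch in plain userspace C. It only models the two figures quoted in the mail (32 pixels per 32-bit transfer for a 1-bit bitmap, 4 pixels per transfer for DRM_FORMAT_C8). The function names `transfers_needed` and `expand_mono_to_c8` are hypothetical, and the MSB-first bit order is an assumption, not something stated about the HP hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of 32-bit bus writes needed to send `pixels` pixels when each
 * pixel occupies `bits_per_pixel` bits: 1 for a mono bitmap that the
 * blitter expands, 8 for DRM_FORMAT_C8.
 */
size_t transfers_needed(size_t pixels, unsigned bits_per_pixel)
{
    assert(bits_per_pixel > 0 && bits_per_pixel <= 32);
    size_t bits = pixels * bits_per_pixel;
    return (bits + 31) / 32;    /* round up to whole 32-bit words */
}

/*
 * Software model of foreground/background colour expansion: each bit of
 * the mono word selects the fg or bg palette index for one C8 pixel.
 * Assumption: the most significant bit is the leftmost pixel.
 */
void expand_mono_to_c8(uint32_t mono, uint8_t fg, uint8_t bg,
                       uint8_t out[32])
{
    for (int i = 0; i < 32; i++)
        out[i] = (mono & (UINT32_C(1) << (31 - i))) ? fg : bg;
}
```

For an 8x16 glyph (128 pixels) this gives 4 transfers at 1 bit per pixel against 32 transfers in C8. That factor of 8 covers only the pixel packing; it does not by itself account for the overall 1:15 ratio mentioned above.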
IIRC the hardware only supported 8-bit palette colors, so C8 was the correct choice. Otherwise, you can add new formats and add them to the console.
fbdev also clears the lines with hardware blitting, which is much faster than clearing it with memcpy. Based on your recommendation i also verified that PCI coalescing is enabled.

These numbers are with DRM's unnatural scrolling behaviour - it seems to scroll several (text)lines at once if it takes too much time. I guess if DRM would scroll line by line it would be even slower.

If DRM would add those things - hardware clearing of memory regions, hw blitting for text with a FG/BG color and modesetting - i wouldn't care about fbdev at all. But right now, it's working way faster for me.
I admit that your hardware is at the edge of what DRM currently supports. But I've used some of the DRM stuff on Athlon XPs with PCI graphics. While the performance wasn't good, it was far from unusable.
I guess you used GEM SHMEM for memory buffers. fbdev and mmap with shmem pages use some of the same bits in struct page, so shmem cannot mmap its pages directly. We have to use an additional shadow buffer. Any display update goes from the shadow buffer into the shmem buffer and into the videoram. That's two memcpys. This can be reduced to one memcpy, but we never had the requirement to do so.
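The two-copy path described above can be modelled in a few lines of userspace C. This is only an illustration of the data flow, not kernel code: `struct fb_chain`, `flush_two_copies` and `flush_one_copy` are hypothetical names, and the buffers are ordinary arrays standing in for the shadow buffer, the shmem pages and the video RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the three buffers named in the mail. */
struct fb_chain {
    unsigned char *shadow;  /* buffer the fbdev client draws into */
    unsigned char *shmem;   /* GEM SHMEM backing pages */
    unsigned char *vram;    /* video memory on the device */
    size_t size;            /* size of each buffer in bytes */
};

/* Path as described: shadow -> shmem -> vram. Returns bytes moved. */
size_t flush_two_copies(struct fb_chain *c, size_t off, size_t len)
{
    assert(off <= c->size && len <= c->size - off);
    memcpy(c->shmem + off, c->shadow + off, len);
    memcpy(c->vram + off, c->shmem + off, len);
    return 2 * len;
}

/* Possible shortcut: shadow -> vram directly. Returns bytes moved. */
size_t flush_one_copy(struct fb_chain *c, size_t off, size_t len)
{
    assert(off <= c->size && len <= c->size - off);
    memcpy(c->vram + off, c->shadow + off, len);
    return len;
}
```

Both functions leave the same bytes in `vram`; the second one moves half as much data per update. In a real driver the shmem copy may serve other purposes that this sketch ignores.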
There's also potential for reducing the amount of page mappings/unmappings with gem shmem.
And DRM supports shadow buffers, virtual screen sizes and damage handling. A sophisticated driver might be able to use shadow buffering, damage handling and hardware panning to reduce the amount of screen updates to a minimum.
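As a rough illustration of what damage handling buys, the sketch below copies only a damaged rectangle from a shadow buffer to the destination instead of the whole frame. It is plain userspace C under stated assumptions: `struct rect` is loosely modelled on the x1/y1/x2/y2 layout of DRM's clip rectangles, `flush_damage` is a hypothetical helper, and `pitch` and `cpp` (bytes per line and bytes per pixel) are supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Damaged area in pixels; x2 and y2 are exclusive. */
struct rect {
    unsigned x1, y1, x2, y2;
};

/*
 * Copy only the damaged rectangle from src to dst. Both buffers share
 * the same layout: `pitch` bytes per scanline, `cpp` bytes per pixel.
 * Returns the number of bytes copied.
 */
size_t flush_damage(unsigned char *dst, const unsigned char *src,
                    size_t pitch, unsigned cpp, const struct rect *clip)
{
    assert(clip->x1 <= clip->x2 && clip->y1 <= clip->y2);
    size_t line = (size_t)(clip->x2 - clip->x1) * cpp;
    size_t copied = 0;

    for (unsigned y = clip->y1; y < clip->y2; y++) {
        size_t off = (size_t)y * pitch + (size_t)clip->x1 * cpp;
        memcpy(dst + off, src + off, line);
        copied += line;
    }
    return copied;
}
```

With a text console, the damaged area for a single changed character is one glyph cell, so the copy shrinks from a full frame to a few dozen bytes per scanline of the cell.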
Until these things are fixed, adding hardware blitting doesn't really make sense IMHO.
As with other things, we didn't have a requirement for all these optimizations so far. A generally good approach to improve the situation is to get a basic driver merged and then address the problems one by one.
Best regards Thomas
I also tested the speed on my Thinkpad X1 with Intel graphics, and there a dmesg with 919 lines on the text console took about 2s to display. In X11, i measure 22ms. This might be unfair because encoding might be different, but i cannot confirm the 'memcpy is faster than hardware blitting' point. I think if that would be the case, no-one would care about 2D acceleration.

Don't get me wrong, i'm not saying there's no reason for DRM. I fully understand why it exists and think it's a good way to go. But for systems where a (fast) local console is required without X11, fbdev might be the better choice at the moment.

Regards
Sven

[1] https://lore.kernel.org/all/87ee7qvcc7.fsf@xxxxxxxxxxxxxxxxx/T/#m57cdea83608fc78bfc6c2e76eb037bf82017b302
[2] https://lore.kernel.org/all/87ee7qvcc7.fsf@xxxxxxxxxxxxxxxxx/T/#m46a52815036a958f6a11d2f3f62e1340a09bd981
-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev