Re: fbdev: Garbage collect fbdev scrolling acceleration

On Mon, Jan 24, 2022 at 7:27 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
>
> Hi Daniel et al,
>
> On Wed, Jan 19, 2022 at 4:39 PM Daniel Vetter <daniel@xxxxxxxx> wrote:
> > On Thu, Jan 13, 2022 at 10:46:03PM +0100, Sven Schnelle wrote:
> > > Helge Deller <deller@xxxxxx> writes:
> > > > I may have missed some discussions, but I'm objecting against this patch:
> > > >
> > > >     b3ec8cdf457e5 ("fbdev: Garbage collect fbdev scrolling acceleration, part 1 (from TODO list)")
> > > >
> > > > Can we please (partly) revert it and restore the scrolling behaviour,
> > > > where fbcon uses fb_copyarea() to copy the screen contents instead of
> > > > redrawing the whole screen?
> > > >
> > > > I'm fine with dropping the ypan-functionality.
> > > >
> > > > Maybe on fast new x86 boxes the performance difference isn't huge,
> > > > but for all old systems, or when emulated in qemu, this makes
> > > > a big difference.
> > > >
> > > > Helge
> > >
> > > I second that. For most people, the framebuffer isn't important as
> > > they're mostly interested in getting to X11/wayland as fast as possible.
> > > But for systems like servers without X11 it's nice to have a fast
> > > console.
> >
> > Fast console howto:
> > - shadow buffer in cached memory
> > - timer based upload of changed areas to the real framebuffer
> >
> > This one is actually fast, instead of trying to use hw bltcopy and having
> > the most terrible fallback path if that's gone. Yes, the drm fbdev helpers
> > have this (but not enabled on most drivers because very, very few people care).
>
> That depends on the hardware, and the balance between CPU-to-RAM,
> CPU-to-VRAM, and GPU-to-VRAM bandwidths, and CPU and GPU performance.
>
> When scrolling, the fastest copy is the copy that doesn't need to copy
> much.  So that's why fbcon supports (or supported :-( ) many strategies:
> scrolling by wrapping, panning, copying (either by CPU or by (simple)
> GPU), re-rendering (useful for a GPU with bitmap expansion).  So forcing
> everybody to render into a fully cached shadow buffer and upload changed
> areas is not the silver bullet.
>
> Whether text output is rendered immediately or not is completely
> orthogonal to this.  While timer-based updates would speed up printing
> of large hunks of text (where no one actually reads what was printed at
> the top), that would have almost no impact on actual interactive console
> work: it may still take 0.5s to scroll the screen if you press "enter"
> when your cursor is positioned on the last line.
> BTW, implementing timer-based updates would make measuring real-world
> performance more difficult, as we would have to use a different
> benchmark than "time dmesg" ;-)
>
> Both Daniel and Thomas said: fbdev is not suitable for modern hardware.
> Fine, we do not debate that, and do not want to prevent you from using
> DRM for modern hardware.  Then please accept us saying that DRM (in its
> current form) is not suitable for other types of graphics hardware.
> Still, even modern (embedded) hardware may have small low-color
> displays.
>
> For the last 5+ years, we've been pointed to the tinydrm drivers, to
> serve as examples for converting existing fbdev drivers to drm drivers.
> All but one of them are drivers for hi-color or better hardware, thus
> surpassing the capabilities of lots of hardware driven by fbdev drivers.
> The other one is an e-ink driver that exposes an XRGB8888 shadow frame
> buffer, and converts that in a two-step process, first to 8-bit
> grayscale, second to 1-bit monochrome.  If that is considered a good
> example, should I be impressed?
> Compare that to other subsystems boasting about zero-copy...

The tiny drivers are the state of the art for small, neat drivers. As you
pointed out multiple times now, there's no Rx or Cx support for x < 8
in drm or fbdev yet, so that would need to be added, if someone cares
enough for that. Some of the fbtft drivers have shrunk substantially
when ported to tiny, which is really the claim we've put down. Not that
you'll find the perfect C4 pixel format example in there; at most
you'll find C8 support in some of the really old drivers like
i915/radeon/nouveau for old platforms. But that's very well buried.

I guess in practice (as you point out below) the repaper display is so
glacially slow anyway, and connected to machines with enough RAM, that
generally the only thing that mattered was convenience, and hence
supporting what every drm userspace can minimally cope with, which is
XRGB8888. So yeah, don't look at a driver which updates at roughly
0.5 fps for efficient upload code :-) The wasted space is a bit more
important, and fixing it should be trivial if someone cares enough to
do that.
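
For reference, a rough sketch of the two-step conversion and the space
cost being discussed. This is not the repaper driver's actual code: the
function names, the luma weights, and the fixed 50% threshold are all
assumptions made for illustration, and the 1404x1872 panel size used in
the note below is assumed for the device in [1], not taken from this
thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 1: XRGB8888 -> 8-bit grayscale (hypothetical helper).
 * Integer luma weights 77 + 150 + 29 = 256, so the result fits 0..255.
 * The X byte is ignored. */
static uint8_t xrgb8888_to_gray8(uint32_t px)
{
	uint32_t r = (px >> 16) & 0xff;
	uint32_t g = (px >> 8) & 0xff;
	uint32_t b = px & 0xff;

	return (uint8_t)((77 * r + 150 * g + 29 * b) >> 8);
}

/* Step 2: threshold one line of gray8 into 1 bpp, MSB first
 * (hypothetical helper; a real driver might dither instead). */
static void gray8_to_mono_line(const uint8_t *gray, uint8_t *mono,
			       size_t width)
{
	assert(gray && mono);

	for (size_t x = 0; x < width; x++) {
		if ((x & 7) == 0)
			mono[x / 8] = 0;	/* start a fresh output byte */
		if (gray[x] >= 128)
			mono[x / 8] |= (uint8_t)(0x80 >> (x & 7));
	}
}

/* Bytes needed for a width x height buffer at bpp bits per pixel,
 * with each line padded to a whole number of bytes. */
static size_t buffer_size(size_t width, size_t height, unsigned int bpp)
{
	return ((width * bpp + 7) / 8) * height;
}
```

With the assumed 1404x1872 panel, buffer_size(1404, 1872, 32) is
10513152 bytes (about 10 MiB) for the XRGB8888 shadow buffer, while
buffer_size(1404, 1872, 1) is 329472 bytes (about 322 KiB) for the
native 1-bit buffer.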
-Daniel

> Furthermore, for a contemporary e-ink device like[1], the shadow buffer
> would consume 10 MiB.  Of course this device has 4 GiB of RAM, and quad
> Cortex-A55 CPU cores, but not all systems have 10 MiB to spare...
>
> [1] https://linuxgizmos.com/rk3566-based-pinenote-e-ink-tablet-ships-at-399/
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


