CairoSDPR status

Hi all,

With his permission, I append Armin's update on some exciting work he's been doing with us to improve Cairo rendering performance :-)
Bit of a thread here that should have happened in public:

	FYI,

		Michael.

Hi everyone,

I wanted to take the opportunity to give a short update on the state of development of the Cairo-based System-Dependent Primitive Renderer (in short: CairoSDPR). I spent quite some time in June (you will see :-) getting it going, with the aim of having it work in a product-quality manner. I started with the version provided by Caolan (thanks!), which he based on copying and adapting the Direct2D/Windows prototype that I did.

I vastly extended it (basically leaving no stone where it was). It is already a 'complete' renderer in the sense that all primitives that *have* to be supported are supported. It now goes well beyond those minimum requirements by directly supporting a number of primitives, which makes rendering much faster and more efficient. As planned in the design, a lot of stuff can be mapped very directly to Cairo, and I did that. It is not yet at product quality, mainly because direct text rendering is missing - one of the big next Primitives to support directly. More on that later.

For the basic/necessary primitives: It supports buffering of Bitmap data (plus MipMapping, this time 'diagonal' with equal factors in X/Y to not exceed mem limits) and path data using the stable system-dependent buffering mechanism available for quite some time. Bitmaps are directly supporting RGBA, but also RGB. I also experimented using RGB16, that works well but does not really show huge speedups AFAICS - can be checked by a local define if wanted (look for TEST_RGB16). That may need some more love e.g. when creating masks - special processing of that 16bit rgb coding (R:5,G:6,B:5).
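A minimal sketch of the 16-bit coding mentioned above (R:5,G:6,B:5), to show why masks and similar steps need special handling: each channel loses its low bits when packed and has to be expanded again on the way back. The names packRGB565, unpackRGB565 and RGB8 are hypothetical and not taken from the CairoSDPR code; the bit layout assumed here is red in the upper 5 bits, green in the middle 6, blue in the lower 5.

```cpp
#include <cstdint>

struct RGB8 { std::uint8_t r, g, b; };

// Pack 8-bit-per-channel RGB into one 16-bit R:5,G:6,B:5 value by
// dropping the low 3 (red, blue) or 2 (green) bits of each channel.
inline std::uint16_t packRGB565(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Expand back to 8 bits per channel. The high bits are replicated into
// the low bits so that full intensity maps back to 255 and not 248/252.
inline RGB8 unpackRGB565(std::uint16_t v)
{
    const unsigned r5 = (v >> 11) & 0x1F;
    const unsigned g6 = (v >> 5) & 0x3F;
    const unsigned b5 = v & 0x1F;
    return RGB8{ static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2)),
                 static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4)),
                 static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2)) };
}
```

The round trip is lossy for most colors, which is one reason any per-pixel step on such data (e.g. mask creation) cannot simply reuse the 8-bit-per-channel code paths.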

For in-between results the renderer can directly use RGBA buffers, thus avoiding many adaptations/pixel calculations that the VCLRenderer has to make. All in all I know of not a single place where it is, or theoretically can be, slower than the VCLRenderer using the backend, including the 8-10 layers that VCLRenderer and backend have to process. On the contrary: much stuff can use Cairo directly now, and it shows. Some examples:

UnifiedTransparencePrimitive2D: Most transparency cases are of this type, which avoids having to decompose and use the much more expensive general TransparencePrimitive2D. Can be rendered directly in Cairo.

PolygonStrokePrimitive2D: Lines with widths and pattern(s) are processed directly, no decompose/own tessellation needed. Thankfully what Cairo does is nearly 100% compatible with what our definition requires.

FillGraphicPrimitive2D: Used for repeated graphics/Pattern fill (including vector data). Uses size-dependent fallback for prepared bitmap/direct vector data dependent on target display size (as the Direct2D version does), also buffered, so no 'jitters' will appear when zooming into vector graphics. Also seamless and fast for VERY MANY repeats now. Much faster and looking better than the decompose.

FillGradientPrimitive2D: That was the most work, but *all six* variations we internally have are now directly mapped/supported despite the strange old stuff that happens/is defined. I made one exception: for the Elliptical I use standard-circular instead of that insane step-in-two-pixels-and-draw-an-ellipse stuff, this is not really visible in smooth gradients. Also gradients are much smoother that way. Rectangle gradients were hard but solvable: I have to stitch them together as four filled polygons, but it works and is also much faster (meshes do not work, they have no colorsteps).
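One possible reading of "stitch together as four filled polygons" for rectangle gradients, sketched under a strong assumption: the gradient collapses to a single center point, so the rectangle splits into four triangles that each have one rectangle edge as base and the center as apex. Each triangle could then be filled with its own linear gradient running from the edge to the center. The actual CairoSDPR code may use different polygons (e.g. trapezoids) and offsets; Point, Triangle, splitRectIntoTriangles and triangleArea are hypothetical names.

```cpp
#include <array>

struct Point { double x, y; };
using Triangle = std::array<Point, 3>;

// Split the axis-aligned rectangle [x0,x1] x [y0,y1] into four triangles
// sharing the center point, in the order top, right, bottom, left.
inline std::array<Triangle, 4> splitRectIntoTriangles(double x0, double y0, double x1, double y1)
{
    const Point c{ (x0 + x1) / 2.0, (y0 + y1) / 2.0 };
    const Point tl{ x0, y0 }, tr{ x1, y0 }, br{ x1, y1 }, bl{ x0, y1 };
    return std::array<Triangle, 4>{ Triangle{ tl, tr, c }, Triangle{ tr, br, c },
                                    Triangle{ br, bl, c }, Triangle{ bl, tl, c } };
}

// Unsigned triangle area (shoelace formula), handy to check that the
// four pieces cover the rectangle exactly once.
inline double triangleArea(const Triangle& t)
{
    const double a = (t[1].x - t[0].x) * (t[2].y - t[0].y)
                   - (t[2].x - t[0].x) * (t[1].y - t[0].y);
    return (a < 0.0 ? -a : a) / 2.0;
}
```

Because the four pieces meet only along the diagonals, the color at each shared border is the same from both sides as long as both neighbors use the same color stops.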

For some stuff I found multiple solutions which were similar in speed. In those cases - since Cairo backends may be different - I kept both, selecting the preferred one with a static bool. It might be worth doing a 'test-run' at 1st startup to measure that stuff and define some switches for the renderer, just a possibility.

I added many comments to make it easier to read and change/correct in the future.

Todos:

The TransparencePrimitive2D (which supports a general alpha definition independent from content) should be optimized: I found no simple way (yet) to render Cairo RGB(A) to a Cairo mask using the standard-LuminanceToAlpha calculation (Direct2D has one...sigh). Worth experimenting...

Of course: TextRendering. Does not look bad - the fallback decompose gets the outlines and draws them AAed - but it is not professional quality.

SVG Gradients: These use 'own' Primitives which still get decomposed. To solve that, either support them directly or (better, all usages would profit) change to standard gradient Primitives now that we have MultiColorGradients.

PatternFill: It sometimes feels slow, but it's NOT the rendering, it's the HitTest using Primitives while the mouse is hovering: it needs to directly support FillGraphicPrimitive2D instead of using the decompose.

ColorModifierStack: Add buffering of Bitmaps that have to be ColorModified -> will speed up stuff not only for this renderer, but for all Primitive renderers, maybe even VCL backends.

XOR: Not yet supported. Two possibilities: Get rid of it (would be good but requires some work, some already done, not too many cases remaining) or support it (some work on how to do that already exists in the Cairo backend).

General stabilization: Still may have errors...

General optimization: Many more possibilities for speedups...
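For the TransparencePrimitive2D todo above, a minimal CPU-side sketch of a luminance-to-alpha conversion, assuming the weights of SVG's luminanceToAlpha color matrix (0.2125, 0.7154, 0.0721) and a tightly packed 32-bit ARGB buffer with alpha in the top byte. The function names are hypothetical, real image surfaces also have a row stride that is ignored here, and the weights the renderer needs may differ.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Convert one 32-bit ARGB pixel (A in bits 24-31, then R, G, B) to an
// 8-bit mask value using the SVG luminanceToAlpha weights. The alpha
// byte itself is not read; with premultiplied color channels it is
// already folded into R, G and B.
inline std::uint8_t luminanceToAlpha(std::uint32_t argb)
{
    const double r = (argb >> 16) & 0xFF;
    const double g = (argb >> 8) & 0xFF;
    const double b = argb & 0xFF;
    return static_cast<std::uint8_t>(std::lround(0.2125 * r + 0.7154 * g + 0.0721 * b));
}

// Convert a whole (tightly packed) pixel buffer into an 8-bit mask.
inline std::vector<std::uint8_t> makeLuminanceMask(const std::vector<std::uint32_t>& pixels)
{
    std::vector<std::uint8_t> mask;
    mask.reserve(pixels.size());
    for (const std::uint32_t p : pixels)
        mask.push_back(luminanceToAlpha(p));
    return mask;
}
```

A per-pixel loop like this is exactly the extra pass that a built-in effect would avoid, so it is a fallback sketch, not the optimization the todo asks for.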


Notes on how to use the CairoSDPR: For now it is on gerrit (https://gerrit.libreoffice.org/c/core/+/168911), is green, and can go to master soon. It is in a good in-between state to do so. It can be used/tested by having the TEST_SYSTEM_PRIMITIVE_RENDERER env var set. I already checked/compared with pro versions. Maybe you can take a look at this in your use cases and give feedback.
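A sketch of what "having the env var set" could mean in code, assuming that only the presence of TEST_SYSTEM_PRIMITIVE_RENDERER matters and not its value. The helper name is hypothetical; the actual check in the patch may look different or may cache the result.

```cpp
#include <cstdlib>

// Hypothetical helper: true when TEST_SYSTEM_PRIMITIVE_RENDERER is
// present in the environment, regardless of its value.
inline bool isSystemPrimitiveRendererRequested()
{
    return std::getenv("TEST_SYSTEM_PRIMITIVE_RENDERER") != nullptr;
}
```

Under that assumption, exporting the variable with any value before starting the application would be enough to opt in for a test session.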

In the later product: The CairoSDPR *completely* replaces the VCLPixelPrimitiveRenderer, so all Primitives will be rendered to VDevs/Wins using Cairo directly and NO vcl at all. To repeat: This is true for Draw/Impress completely, for Calc/Writer it's the inserted DrawObjects, Graphics and overlay things (Markers, Grid, TextSelection, ...). For more, more needs to be adapted to Primitives...


That's the state of things. I plan to just continue, but let me know about suggestions/thoughts from your side.


Regards,

Armin

My reply:

On 02/07/2024 16:41, Michael Meeks wrote:
> Hi Armin,
>
>      Let me start from the end:
>
> On 02/07/2024 15:51, Armin Le Grand wrote:
>  > That's the state of things. I plan to just continue, but let me know
>  > about suggestions/thoughts from your side.
>
>      TLDR; it sounds awesome :-) exciting times.
>
>      In more detail; it'd be worth sharing this on the public dev list
> if I can bother you to do that & then having the discussion there;
> please feel free to fwd my response too =)
>
>      We can of course, start to back-port to 24.04 and enable
> conditionally for some time on some demo servers to measure performance
> there.
>
>      And then the questions:
>
>      * does this bin the "render everything twice" problem;
>        we draw to alpha transparent surfaces, and will increasingly
>        be rendering to alpha layers and compositing them on the
>        client: do we still have to do that twice ? or can that be
>        avoided for LOK users ?
>
>      * winding / de-composing polygons horror: I've spent my life
>        seeing this take an extremely long time in profiles - and of
>        course cairo will do this as it rasterizes: is there a
>        flag / short-cut we can trigger to avoid the polygon winding /
>        de-composition as we render ? - ie. not caching the result,
>        but avoiding that altogether ?
>
>      And a few comments in-line:
>
>> buffering mechanism available for quite some time. Bitmaps are
>> directly supporting RGBA, but also RGB. I also experimented using
>> RGB16, that works well but does not really show huge speedups AFAICS
>
>      I'm reliably informed by our graphics team, and my AVX2 experience
> that from a CPU perspective RGBX and RGBA are the only things we want;
> everything else is far slower to process.
>
>> For some stuff I found multiple solutions which were similar in speed.
>> In those cases - since Cairo backends may be different - I kept both
>> preferring one using a static bool. It might be worth to do a
>> 'test-run' at 1st startup and measure that stuff and define some
>> switches for the renderer, just a possibility.
>
>      Fun - of course, it's a CPU workload; we don't have super-reliable
> benchmarks; we could get a load of demo users to do the operations twice
> and time them each way for a week of real-world work I guess ;-) and
> come up with a "right answer".
>
>> Of course: TextRendering. Does not look bad - the fallback decompose
>> gets the outlines and draws the AAed, but not professional quality.
>
>      Ah; of course we should not try rendering paths ourselves outside
> fontconfig, that'd not be a good idea =)
>
>> XOR: Not yet supported. Two possibilities: Get rid of (would be good
>> but requires some work, some already done, not too many cases
>> remaining) or support it (also some work already exists in Cairo
>> backend how to do that).
>
>      I guess this is in meta-files, both WMF/EMF and SVP - so - not sure
> how to avoid that really; cairo can support.
>
>> In the later product: The CairoSDPR *completely* replaces the
>> VCLPixelPrimitiveRenderer, so all Primitives will be rendered to
>> VDevs/Wins using Cairo directly and NO vcl at all. To repeat: This is
>> true for Draw/Impress completely, for Calc/Writer it's the inserted
>> DrawObjects, Graphics and overlay things (Markers, Grid,
>> TextSelection, ...). For more, more needs to be adapted to Primitives...
>
>      I guess the double-render / alpha query is there still =)
>
>> That's the state of things. I plan to just continue, but let me know
>> about suggestions/thoughts from your side.
>
>      Great stuff; sounds like really good work,
>
>      It's not hyper-urgent for us; but as/when it's done - we could
> include this as an option in a CODE release if the impact on the rest of
> the code is small.
>
>      Thanks !
>
>          Michael.
>

And more - but best to have the discussion on the list ...




