Re: Unix Device Memory Allocation project

On 04.01.2017 at 17:16, Rob Clark wrote:
On Wed, Jan 4, 2017 at 11:02 AM, Christian König
<deathsimple@xxxxxxxxxxx> wrote:
On 04.01.2017 at 16:47, Rob Clark wrote:
On Wed, Jan 4, 2017 at 9:54 AM, Daniel Vetter <daniel@xxxxxxxx> wrote:
On Wed, Jan 04, 2017 at 08:06:24AM -0500, Rob Clark wrote:
On Wed, Jan 4, 2017 at 7:03 AM, Daniel Stone <daniel@xxxxxxxxxxxxx>
wrote:
Speaking of compression for display, especially the separate
compression buffer: That should be fully contained in the main DMABUF
and described by the per-BO metadata. Some other drivers want to use a
separate DMABUF for the compression buffer - while that may sound good
in theory, it's not economical for the reason described above.
'Some other drivers want to use a separate DMABUF', or 'some other
hardware demands the data be separate'. Same with luma/chroma plane
separation. Anyway, it doesn't really matter unless you're sharing
render-compression formats across vendors, and AFBC is the only case
of that I know of currently.

jfwiw, UBWC on newer snapdragons too.. seems like we can share these
not just between gpu (render to and sample from) and display, but also
v4l2 decoder/encoder (and maybe camera?)

I *think* we probably can treat the metadata buffers as a separate
plane.. at least we can for render target and blit src/dst, but not
100% sure about sampling from a UBWC buffer.. that might force us to
have them in a single buffer.
Conceptually treating them as two planes and requiring everywhere that
they're allocated from the same BO are orthogonal things. At least that's
our plan for Intel render compression, as of the last time I checked the
current state ;-)
If the positions of the different parts of the buffer are required to be
a function of w/h/bpp/etc., then I'm not sure there is a strong advantage
to treating them as separate BOs.. although I suppose it doesn't preclude
it either.  As far as plumbing it through mesa/st goes, it seems
convenient to have a single buffer.  (We have kind of a hack to deal with
multi-planar YUV, but I'd rather not propagate that.. though I've not
thought through those details so much yet.)
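
To illustrate the orthogonality, here is a minimal sketch in C of a
two-plane "color + compression metadata" description expressed both
ways; the function name and the 0x200000 metadata offset are invented
for the example:

#include <stdint.h>

/* Fill a two-entry (handle, offset) description for a color plane plus
 * its compression-metadata plane.  Whether both planes live in one BO
 * or in two separate BOs only changes which handles and offsets go in;
 * the two-plane description has the same shape either way. */
static void describe_two_planes(uint32_t color_bo, uint32_t meta_bo,
                                int shared_bo,
                                uint32_t handles[2], uint32_t offsets[2])
{
    if (shared_bo) {
        /* Variant A: one BO, metadata at a fixed offset inside it
         * (0x200000 is a made-up example value). */
        handles[0] = color_bo; offsets[0] = 0;
        handles[1] = color_bo; offsets[1] = 0x200000;
    } else {
        /* Variant B: separate BOs, both planes start at offset 0. */
        handles[0] = color_bo; offsets[0] = 0;
        handles[1] = meta_bo;  offsets[1] = 0;
    }
}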

Well I don't want to ruin your day, but there are different requirements
from different hardware.

For example, the UVD engine found in all AMD graphics cards since r600
must have both planes in a single BO, because the memory controller can
only handle a rather small offset between the planes.
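
As a concrete illustration of the single-BO case, a minimal sketch of an
NV12 layout where the UV plane directly follows Y in the same BO, keeping
the inter-plane offset small and predictable; the 256-byte pitch
alignment is an invented example value, not an actual UVD requirement:

#include <stdint.h>

/* Single-BO NV12: Y plane at the start, UV plane immediately after,
 * so the offset between the planes is exactly one Y plane's size. */
struct nv12_layout {
    uint32_t pitch;      /* bytes per row, shared by Y and UV */
    uint32_t uv_offset;  /* where the UV plane starts in the BO */
    uint32_t total_size; /* BO size to allocate */
};

static struct nv12_layout nv12_single_bo(uint32_t width, uint32_t height)
{
    struct nv12_layout l;
    l.pitch = (width + 255) & ~255u;   /* example 256-byte alignment */
    l.uv_offset = l.pitch * height;    /* UV follows the full-res Y */
    /* UV is subsampled 2x vertically: same pitch, half the rows. */
    l.total_size = l.uv_offset + l.pitch * (height / 2);
    return l;
}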

On the other hand I know of embedded MPEG2/H264 decoders where the different
planes must be on different memory channels. In this case I can imagine that
you want one BO for each plane, because otherwise the device must stitch
together one buffer object from two different memory regions (of course
possible, but rather ugly).
true, but for a vendor-specific compression/metadata plane, I think I
can ignore oddball set-top box SoC constraints and care more about just
other devices that support the same compression.

So if we want to cover everything, we essentially need to support all
variants: one plane per BO as well as all planes in one BO with DMA-Buf.
A bit tricky, isn't it?
Just to make sure we are on the same page, I was only really talking
about whether to have color+meta in the same BO or treat it similar to
two-plane YUV (i.e. a pair of fd+offset tuples).  Not generic/vanilla
(untiled, uncompressed, etc.) multi-planar YUV.

Oops, sorry. I didn't realize that.

Nah, putting the metadata into the BO is probably only a good idea if the metadata is evaluated by the device and not by the CPU as well.


It probably isn't even important that the various vendors' compression
schemes are handled the same way.  Maybe on Intel it is easier to treat
it as two planes everywhere, but on qcom easier to treat it as one.  The
application just sees it as one or more fd+offset tuples (when it queries
the EGL image) and passes those blindly through to addfb2.
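
A minimal sketch of that pass-through path using libdrm; error handling
is trimmed, the function name is made up, and the fd/pitch/offset arrays
are assumed to come from the EGL image query:

#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

/* Import each dmabuf fd and hand the resulting GEM handles, pitches
 * and offsets straight to drmModeAddFB2, without interpreting them. */
static int fb_from_dmabuf_planes(int drm_fd, uint32_t width, uint32_t height,
                                 uint32_t fourcc, int n_planes,
                                 const int fds[4], const uint32_t pitches[4],
                                 const uint32_t offsets[4], uint32_t *fb_id)
{
    uint32_t handles[4] = { 0 };

    for (int i = 0; i < n_planes; i++) {
        /* Several planes may share one fd (same BO, different offsets);
         * drmPrimeFDToHandle then simply returns the same handle again. */
        if (drmPrimeFDToHandle(drm_fd, fds[i], &handles[i]))
            return -1;
    }

    return drmModeAddFB2(drm_fd, width, height, fourcc,
                         handles, pitches, offsets, fb_id, 0);
}

This works unchanged for both layouts discussed above: one BO with
several offsets, or one BO per plane.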

Yeah, I mean that's the real core of the problem.

On the one hand we want devices from different vendors to understand each other, and there are certain cases where even completely different devices can work with the same data.

On the other hand each vendor has extremely specialized data formats for certain use cases and it is unlikely that somebody else can handle those.

Oh, and for some extra fun, I think the video decoder can hand me
compressed NV12 where both Y and UV have their own meta buffer.  So if
we treat them as separate planes, that becomes four planes.  (Hopefully
no compressed I420, or that becomes 6 planes! :-P)
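
To make the four-plane case concrete, a hypothetical plane table for
compressed NV12 with per-plane metadata; every offset and pitch below is
invented.  Note that drm_mode_fb_cmd2 only carries four planes, which is
exactly why compressed I420 at six planes would not fit:

#include <stdint.h>

struct plane_desc {
    const char *what;
    uint32_t offset;  /* bytes into the BO (invented values) */
    uint32_t pitch;   /* bytes per row (invented values) */
};

/* Compressed NV12, one BO: color planes plus one metadata plane each
 * for Y and UV -- four planes total, the addfb2 maximum. */
static const struct plane_desc compressed_nv12[4] = {
    { "Y color",     0x000000, 4096 },
    { "UV color",    0x200000, 4096 },
    { "Y metadata",  0x300000,  256 },
    { "UV metadata", 0x310000,  256 },
};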

Well, talking about extra fun: we additionally have this neat interlaced NV12 format that both NVidia and AMD use for their video decoding.

E.g. one Y plane for the top field, one UV plane for the top field, one Y plane for the bottom field and one UV plane for the bottom field.

That makes 4 planes, where planes 1 & 3 and planes 2 & 4 must have the same stride but are otherwise unrelated to each other and can have separate metadata.
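
Expressed as a quick sketch with the plane ordering from above (the
validation helper is hypothetical):

#include <assert.h>
#include <stdint.h>

/* Planes: 0 = Y top field, 1 = UV top field,
 *         2 = Y bottom field, 3 = UV bottom field.
 * Each plane may sit in its own BO with its own metadata, but the
 * field pairs must agree on stride. */
static void validate_interlaced_nv12(const uint32_t pitches[4])
{
    assert(pitches[0] == pitches[2]); /* Y top/bottom share a stride  */
    assert(pitches[1] == pitches[3]); /* UV top/bottom share a stride */
}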

Regards,
Christian.




BR,
-R

Regards,
Christian.

BR,
-R

-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch



_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel



