Hi Jonas, On Thu, Nov 7, 2019 at 7:34 AM Jonas Karlman <jonas@xxxxxxxxx> wrote: > > A decoded 8-bit 4:2:0 frame need memory for up to 448 bytes per > macroblock with additional 32 bytes on multi-core variants. > > Memory layout is as follow: > > +---------------------------+ > | Y-plane 256 bytes x MBs | > +---------------------------+ > | UV-plane 128 bytes x MBs | > +---------------------------+ > | MV buffer 64 bytes x MBs | > +---------------------------+ > | MC sync 32 bytes | > +---------------------------+ > > Reduce the extra space allocated now that motion vector buffer offset no > longer is based on the extra space. > > Only allocate extra space for 64 bytes x MBs of motion vector buffer > and 32 bytes for multi-core sync. > > Fixes: a9471e25629b ("media: hantro: Add core bits to support H264 decoding") > Signed-off-by: Jonas Karlman <jonas@xxxxxxxxx> > Reviewed-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx> > --- > Changes in v3: > - add memory layout to code comment (Boris) > Changes in v2: > - updated commit message > --- > drivers/staging/media/hantro/hantro_v4l2.c | 20 ++++++++++++++++++-- > 1 file changed, 18 insertions(+), 2 deletions(-) > Thanks for the patch! What platform did you test it on and how? Was it tested with IOMMU enabled? Best regards, Tomasz > diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c > index 3dae52abb96c..c8c896a06f58 100644 > --- a/drivers/staging/media/hantro/hantro_v4l2.c > +++ b/drivers/staging/media/hantro/hantro_v4l2.c > @@ -240,14 +240,30 @@ static int vidioc_try_fmt(struct file *file, void *priv, struct v4l2_format *f, > v4l2_fill_pixfmt_mp(pix_mp, fmt->fourcc, pix_mp->width, > pix_mp->height); > /* > + * A decoded 8-bit 4:2:0 NV12 frame may need memory for up to > + * 448 bytes per macroblock with additional 32 bytes on > + * multi-core variants. > + * > * The H264 decoder needs extra space on the output buffers > * to store motion vectors. This is needed for reference > * frames. > + * > + * Memory layout is as follow: > + * > + * +---------------------------+ > + * | Y-plane 256 bytes x MBs | > + * +---------------------------+ > + * | UV-plane 128 bytes x MBs | > + * +---------------------------+ > + * | MV buffer 64 bytes x MBs | > + * +---------------------------+ > + * | MC sync 32 bytes | > + * +---------------------------+ > */ > if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE) > pix_mp->plane_fmt[0].sizeimage += > - 128 * DIV_ROUND_UP(pix_mp->width, 16) * > - DIV_ROUND_UP(pix_mp->height, 16); > + 64 * MB_WIDTH(pix_mp->width) * > + MB_WIDTH(pix_mp->height) + 32; > } else if (!pix_mp->plane_fmt[0].sizeimage) { > /* > * For coded formats the application can specify > -- > 2.17.1 >