On Sun, Aug 27, 2023 at 5:44 AM Marek Vasut <marex@xxxxxxx> wrote: > > On 8/25/23 10:52, Chen-Yu Tsai wrote: > > On Fri, Aug 25, 2023 at 4:33 PM Marek Vasut <marex@xxxxxxx> wrote: > >> > >> On 8/25/23 09:09, Chen-Yu Tsai wrote: > >>> On Thu, Aug 24, 2023 at 9:08 PM Marek Vasut <marex@xxxxxxx> wrote: > >>>> > >>>> On 8/24/23 12:39, Adam Ford wrote: > >>>>> On Wed, Aug 23, 2023 at 8:39 PM Marek Vasut <marex@xxxxxxx> wrote: > >>>>>> > >>>>>> The i.MX8MM/N/P does not define the .reset op since reset of the VPU is > >>>>>> done by genpd. Check whether the .reset op is defined before calling it > >>>>>> to avoid NULL pointer dereference. > >>>>>> > >>>>>> Note that the Fixes tag is set to the commit which removed the reset op > >>>>>> from i.MX8M Hantro G2 implementation, this is because before this commit > >>>>>> all the implementations did define the .reset op. > >>>>> > >>>>> I am surprised I didn't have issues when I was testing the 8MQ and > >>>>> 8MM, but this makes sense. > >>>> > >>>> You need to trigger the VPU watchdog to trigger the crash, that means > >>>> you have to get the VPU into some weird state where it fails to decode > >>>> frame. Then it triggers the reset and ... boom. > >>>> > >>>> See this patch, that contains a gstreamer invocation to generate such > >>>> trigger condition input data: > >>>> > >>>> [PATCH] media: verisilicon: Do not enable G2 postproc downscale if > >>>> source is narrower than destination > >>>> > >>>> " > >>>> To generate input test data to trigger this bug, use e.g.: > >>>> $ gst-launch-1.0 videotestsrc ! > >>>> video/x-raw,width=272,height=256,format=I420 ! \ > >>>> vp9enc ! matroskamux ! filesink location=/tmp/test.vp9 > >>>> To trigger the bug upon decoding (note that the NV12 must be forced, as > >>>> that assures the output data would pass the G2 postproc): > >>>> $ gst-launch-1.0 filesrc location=/tmp/test.vp9 ! matroskademux ! > >>>> vp9parse ! \ > >>>> v4l2slvp9dec ! video/x-raw,format=NV12 ! videoconvert > >>>> ! fbdevsink > >>>> " > >>> > >>> Does it completely recover afterwards? In my previous trials the hardware > >>> ended up in some bizzare state: while decoding succeeds, the output's md5sum > >>> didn't match up. > >> > >> Have you got a testcase that triggers this, one I can try ? > >> > >> I am not entirely sure whether this is happening here as well or not, > >> but I can imagine that the power domain went down and back up between > >> tests, so the VPU would be power cycled (and therefore reset) that way. > >> So, I think it is worth testing that. > > > > This was last year while I was writing HEVC decoding code for Chromium. > > IIRC the SAODBLK_A_MainConcept_4 test vector from the official HEVC test > > suite does cause our stack to crash, but Gstreamer seemed to handle it > > OK. It could be that the Chromium decoder stack is passing bad values to > > the decoder. > > That can be easily tested with ftrace enabled. I was just tracking down > an issue with gstreamer and added the following patch to the hantro > driver. Then: > > echo > /sys/kernel/debug/tracing/trace > <run fail test> > cat /sys/kernel/debug/tracing/trace > /tmp/fail.trace > echo > /sys/kernel/debug/tracing/trace > <run pass test> > cat /sys/kernel/debug/tracing/trace > /tmp/pass.trace > # remove time stamps etc. > diff /tmp/{fail,pass}.trace > > You should see whether some register programming differs between > gstreamer and chromium. I ended up using VISL to compare the controls set. I did find a bug. It still hard hangs after a couple frames, so I guess I'd need to use your method, but do printk instead. BTW, I wonder if we shouldn't add a reset op, if only just to stop the hardware? That is, do the same two register writes as in the Hantro G2 interrupt handler. ChenYu > diff --git a/drivers/media/platform/verisilicon/hantro.h > b/drivers/media/platform/verisilicon/hantro.h > index dea35a501ba30..529f1ab478ec8 100644 > --- a/drivers/media/platform/verisilicon/hantro.h > +++ b/drivers/media/platform/verisilicon/hantro.h > @@ -353,8 +353,7 @@ extern int hantro_debug; > > #define vpu_debug(level, fmt, args...) \ > do { \ > - if (hantro_debug & BIT(level)) \ > - pr_info("%s:%d: " fmt, \ > + trace_printk("%s:%d: " fmt, \ > __func__, __LINE__, ##args); \ > } while (0)