On Wed, 17 Jan 2018 12:03:15 -0800 Eric Anholt <eric@xxxxxxxxxx> wrote: > Boris Brezillon <boris.brezillon@xxxxxxxxxxxxxxxxxx> writes: > > > When saving BOs in the hang state we skip one entry of the > > kernel_state->bo[] array, thus leaving it to NULL. This leads to a NULL > > pointer dereference when, later in this function, we iterate over all > > BOs to check their ->madv state. > > > > Fixes: ca26d28bbaa3 ("drm/vc4: improve throughput by pipelining binning and rendering jobs") > > Cc: <stable@xxxxxxxxxxxxxxx> > > Signed-off-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxxxxxxx> > > --- > > drivers/gpu/drm/vc4/vc4_gem.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c > > index 6c32c89a83a9..19ac7fe0e5db 100644 > > --- a/drivers/gpu/drm/vc4/vc4_gem.c > > +++ b/drivers/gpu/drm/vc4/vc4_gem.c > > @@ -208,7 +208,7 @@ vc4_save_hang_state(struct drm_device *dev) > > kernel_state->bo[j + prev_idx] = &bo->base.base; > > j++; > > } > > - prev_idx = j + 1; > > + prev_idx = j; > > Could we replace the whole "[j + prev_idx]" with a "[k++]" and maybe a > WARN_ON_ONCE(k != state->bo_count) at the end? Sure. > > I really need to figure out if I can come up with a way to make IGT > cases for GPU hangs on vc4, despite the validation. I managed to trigger the NULL pointer dereference while debugging the perfmon stuff, but it's fixed now, so I don't have a way to easily force a reset. > I found a bug in > GPU reset due to BCL hangs when doing vc5, but I don't have a testcase. > Maybe some submit flags that overwrite the BCL or RCL to do an infinite > loop?