Hi Tomi, Thank you for the patch. On Fri, Mar 13, 2020 at 10:26:39AM +0200, Tomi Valkeinen wrote: > When the CAL driver stops streaming, it will shut everything down > without waiting for the current frame to finish. This leaves the CAL DMA > in a slightly undefined state, and when CAL DMA is enabled when the > stream is started the next time, the old DMA transfer will continue. > > It is not clear if the old DMA transfer continues with the exact > settings of the original transfer, or is it a mix of old and new > settings, but in any case the end result is memory corruption as the > destination memory address is no longer valid. > > I could not find any way to ensure that any old DMA transfer would be > discarded, except perhaps full CAL reset. But we cannot do a full reset > when one port is getting enabled, as that would reset both ports. > > This patch tries to make sure that the DMA transfer is finished properly > when the stream is being stopped. I say "tries", as, as mentioned above, > I don't see a way to force the DMA transfer to finish. I believe this > fixes the corruptions for normal cases, but if for some reason the DMA > of the final frame would stall a lot, resulting in timeout in the code > waiting for the DMA to finish, we'll again end up with unfinished DMA > transfer. However, I don't know what could cause such a timeout. > > Signed-off-by: Tomi Valkeinen <tomi.valkeinen@xxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > --- > drivers/media/platform/ti-vpe/cal.c | 32 +++++++++++++++++++++++++++++ > 1 file changed, 32 insertions(+) > > diff --git a/drivers/media/platform/ti-vpe/cal.c b/drivers/media/platform/ti-vpe/cal.c > index be54806180a5..b857cab120ad 100644 > --- a/drivers/media/platform/ti-vpe/cal.c > +++ b/drivers/media/platform/ti-vpe/cal.c > @@ -414,6 +414,8 @@ struct cal_ctx { > struct cal_buffer *cur_frm; > /* Pointer pointing to next v4l2_buffer */ > struct cal_buffer *next_frm; > + > + bool dma_act; > }; > > static const struct cal_fmt *find_format_by_pix(struct cal_ctx *ctx, > @@ -944,6 +946,7 @@ static void csi2_lane_config(struct cal_ctx *ctx) > > static void csi2_ppi_enable(struct cal_ctx *ctx) > { > + reg_write(ctx->dev, CAL_CSI2_PPI_CTRL(ctx->csi2_port), BIT(3)); > reg_write_field(ctx->dev, CAL_CSI2_PPI_CTRL(ctx->csi2_port), > CAL_GEN_ENABLE, CAL_CSI2_PPI_CTRL_IF_EN_MASK); > } > @@ -1206,15 +1209,25 @@ static irqreturn_t cal_irq(int irq_cal, void *data) > if (isportirqset(irqst2, 1)) { > ctx = dev->ctx[0]; > > + spin_lock(&ctx->slock); > + ctx->dma_act = false; > + > if (ctx->cur_frm != ctx->next_frm) > cal_process_buffer_complete(ctx); > + > + spin_unlock(&ctx->slock); > } > > if (isportirqset(irqst2, 2)) { > ctx = dev->ctx[1]; > > + spin_lock(&ctx->slock); > + ctx->dma_act = false; > + > if (ctx->cur_frm != ctx->next_frm) > cal_process_buffer_complete(ctx); > + > + spin_unlock(&ctx->slock); > } > } > > @@ -1230,6 +1243,7 @@ static irqreturn_t cal_irq(int irq_cal, void *data) > dma_q = &ctx->vidq; > > spin_lock(&ctx->slock); > + ctx->dma_act = true; > if (!list_empty(&dma_q->active) && > ctx->cur_frm == ctx->next_frm) > cal_schedule_next_buffer(ctx); > @@ -1241,6 +1255,7 @@ static irqreturn_t cal_irq(int irq_cal, void *data) > dma_q = &ctx->vidq; > > spin_lock(&ctx->slock); > + ctx->dma_act = true; > if (!list_empty(&dma_q->active) && > ctx->cur_frm == ctx->next_frm) > cal_schedule_next_buffer(ctx); > @@ -1713,10 +1728,27 @@ static void cal_stop_streaming(struct vb2_queue *vq) > struct cal_ctx *ctx = vb2_get_drv_priv(vq); > struct cal_dmaqueue *dma_q = &ctx->vidq; > struct cal_buffer *buf, *tmp; > + unsigned long timeout; > unsigned long flags; > int ret; > + bool dma_act; > > csi2_ppi_disable(ctx); > + > + /* wait for stream and dma to finish */ > + dma_act = true; > + timeout = jiffies + msecs_to_jiffies(500); > + while (dma_act && time_before(jiffies, timeout)) { > + msleep(50); > + > + spin_lock_irqsave(&ctx->slock, flags); > + dma_act = ctx->dma_act; > + spin_unlock_irqrestore(&ctx->slock, flags); > + } Waiting for the transfer to complete seems to be a good idea, but how about using a wait queue instead of such a loop ? That would allow better usage of CPU time and faster reaction time, and shouldn't be difficult to implement. You may also want to replace dma_act with a state if needed (in case you need to express running/stopping/stopped states), and I would rename it to running if you just need a boolean. > + > + if (dma_act) > + ctx_err(ctx, "failed to disable dma cleanly\n"); > + > disable_irqs(ctx); > csi2_phy_deinit(ctx); > -- Regards, Laurent Pinchart