> -----Original Message----- > From: Nicolas Dufresne [mailto:nicolas@xxxxxxxxxxxx] > Sent: Friday, September 10, 2021 3:54 AM > To: Ming Qian <ming.qian@xxxxxxx>; mchehab@xxxxxxxxxx; > shawnguo@xxxxxxxxxx; robh+dt@xxxxxxxxxx; s.hauer@xxxxxxxxxxxxxx > Cc: hverkuil-cisco@xxxxxxxxx; kernel@xxxxxxxxxxxxxx; festevam@xxxxxxxxx; > dl-linux-imx <linux-imx@xxxxxxx>; Aisheng Dong <aisheng.dong@xxxxxxx>; > linux-media@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; > devicetree@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx > Subject: Re: [EXT] Re: [PATCH v8 04/15] media:Add v4l2 event codec_error and > skip > > Caution: EXT Email > > Le jeudi 09 septembre 2021 à 03:13 +0000, Ming Qian a écrit : > > > -----Original Message----- > > > From: Nicolas Dufresne [mailto:nicolas@xxxxxxxxxxxx] > > > Sent: Wednesday, September 8, 2021 9:33 PM > > > To: Ming Qian <ming.qian@xxxxxxx>; mchehab@xxxxxxxxxx; > > > shawnguo@xxxxxxxxxx; robh+dt@xxxxxxxxxx; s.hauer@xxxxxxxxxxxxxx > > > Cc: hverkuil-cisco@xxxxxxxxx; kernel@xxxxxxxxxxxxxx; > > > festevam@xxxxxxxxx; dl-linux-imx <linux-imx@xxxxxxx>; Aisheng Dong > > > <aisheng.dong@xxxxxxx>; linux-media@xxxxxxxxxxxxxxx; > > > linux-kernel@xxxxxxxxxxxxxxx; devicetree@xxxxxxxxxxxxxxx; > > > linux-arm-kernel@xxxxxxxxxxxxxxxxxxx > > > Subject: [EXT] Re: [PATCH v8 04/15] media:Add v4l2 event codec_error > > > and skip > > > > > > Caution: EXT Email > > > > > > Hi Ming, > > > > > > more API only review. > > > > > > Le mardi 07 septembre 2021 à 17:49 +0800, Ming Qian a écrit : > > > > The codec_error event can tell client that there are some error > > > > occurs in the decoder engine. > > > > > > > > The skip event can tell the client that there are a frame has been > > > > decoded, but it won't be outputed. > > > > > > > > Signed-off-by: Ming Qian <ming.qian@xxxxxxx> > > > > Signed-off-by: Shijie Qin <shijie.qin@xxxxxxx> > > > > Signed-off-by: Zhou Peng <eagle.zhou@xxxxxxx> > > > > --- > > > > .../userspace-api/media/v4l/vidioc-dqevent.rst | 12 > ++++++++++++ > > > > include/uapi/linux/videodev2.h | 2 ++ > > > > 2 files changed, 14 insertions(+) > > > > > > > > diff --git > > > > a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst > > > > b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst > > > > index 6eb40073c906..87d40ad25604 100644 > > > > --- a/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst > > > > +++ b/Documentation/userspace-api/media/v4l/vidioc-dqevent.rst > > > > @@ -182,6 +182,18 @@ call. > > > > the regions changes. This event has a struct > > > > :c:type:`v4l2_event_motion_det` > > > > associated with it. > > > > + * - ``V4L2_EVENT_CODEC_ERROR`` > > > > + - 7 > > > > + - This event is triggered when some error occurs inside the > > > > + codec > > > engine, > > > > + usually it can be replaced by a POLLERR event, but in some > > > > + cases, > > > > the > > > POLLERR > > > > + may cause the application to exit, but this event can allow > > > > + the > > > application to > > > > + handle the codec error without exiting. > > > > > > Events are sent to userspace in a separate queue from the VB2 queue. > > > Which means it's impossible for userspace to know where this error > > > actually took place. > > > Userspace may endup discarding valid frames from the VB queue, as it > > > does not know which one are good, and which one are bad. > > > > > > There is likely a bit of spec work to be done here for non-fatal > > > decode errors, but I think the right approach is to use > > > V4L2_BUF_FLAG_ERROR. What we expect from decoders is that for each > > > frame, a CAPTURE buffer is assigned. > > > If > > > decoding that frame was not possible but the error is recoverable > > > (corrupted bitstream, missing reference, etc.), then the failing > > > frame get marked with FLAG_ERROR and decoding continues as usual. > > > > > > What isn't documented is that you can set bytesused to 0, meaning > > > there is nothing useful in that frame, or a valid bytesused when you > > > know only some blocks are broken (e.g. missing 1 ref). Though, > > > GStreamer might be the only implementation of that, and byteused 0 may > confuse some existing userspace. > > > > > > > Hi Nicolas, > > We don't use this event to tell userspace which frame is broken. > > Actually it tries to tell userspace that the decoder is abnormal and > > there will be no more frames output. The usersapce shouldn't wait, it > > can reset the decoder instance if it wants to continue decoding more > > frames, or it can exit directly. > > Usually there will no capture buffer can be dequeued, so we can't > > set a V4L2_BUF_FLAG_ERROR flag to a capture buffer. > > That is not logical, if userspace asked to decode a buffer, but didn't queue back > any CAPTURE buffer, you are expected to just sit there and wait. > > > In my opinion, setting bytesused to 0 means eos, and as you say, > > it may confuse some existing userspace. > > Byteused 0 only mean EOS for one specific driver, MFC. That behaviour was > kept to avoid breaking existing userspace. In fact, you have to opt in, the > framework will prevent you from using it for that purpose. > > > I think it can be replaced by POLLERR in most of case, but we meet > > some applications who prefer to use this event instead of pollerr > > In general, recoverable errors should be handled without the need for > userspace to reset. This looks more like you have a bug in your error handling > and deffer it to userspace. Most userspace will just abort and report to users, I > doubt this is really what you expect. > > What matters for recoverable errors is that you keep consuming OUTPUT > buffers. > And userspace should be happy with never getting anything from the CAPTURE > till the propblem was recovered by the driver. Of course, userspace should > probably garbage collect the metadata it might be holding, chromium does > that with a leaky queue of 16 metadata buffer notably > > My recommandation would be to drop this for now, and just try to not stall on > errors (or make it a hard failure for now, pollerr, or ioctl errors). > OK, I agree with you, I will drop it > > > > > > > > + * - ``V4L2_EVENT_SKIP`` > > > > + - 8 > > > > + - This event is triggered when one frame is decoded, but it > > > > + won't > > > > be > > > outputed > > > > + to the display. So the application can't get this frame, and > > > > + the > > > > input > > > frame count > > > > + is dismatch with the output frame count. And this evevt is > > > > + telling > > > > the > > > client to > > > > + handle this case. > > > > > > Similar to my previous comment, this event is flawed, since > > > userspace cannot know were the skip is located in the queued > > > buffers. Currently, all decoders are mandated to support > > > V4L2_BUF_FLAG_TIMESTAMP_COPY. The timestamp must NOT be > interpreted > > > by the driver and must be reproduce as-is in the associated CAPTURE > > > buffer. It is possible to "garbage" collect skipped frames with this > > > method, though tedious. > > > > > > An alternative, and I think it would be much nicer then this, would > > > be to use the v4l2_buffer.sequence counter, and just make it skip 1 > > > on skips. Though, the down side is that userspace must also know how > > > to reorder frames (a driver job for stateless codecs) in order to > > > identify which frame was skipped. So this is perhaps not that > > > useful, other then knowing something was skipped in the past. > > > > > > A third option would be to introduce V4L2_BUF_FLAG_SKIPPED. This way > > > the driver could return an empty payload (bytesused = 0) buffer with > > > this flag set, and the proper timestamp properly copied. This would > > > let the driver communicate skipped frames in real-time. Note that > > > this could break with existing userspace, so it would need to be > > > opted-in somehow (a control or some flags). > > > > Hi Nicolas, > > The problem we meet is that userspace doesn't care which frame is > > skipped, it just need to know that there are a frame is skipped, the > > driver should promise the input frame count is equals to the output frame > count. > > Your first method is possible in theory, but we find the timestamp > > may be unreliable, we meet many timestamp issues that userspace may > > enqueue invalid timestamp or repeated timestamp and so on, so we can't > accept this solution. > > The driver should not interpret the provided timestamp, so it should not be > able to say if the timestamp is valid or not, this is not the driver's task. > > The driver task is to match the timestamp to the CAPTURE buffer (if that buffer > was produced), and reproduce it exactly. > > > I think your second option is better. And there are only 1 > > question, we find some application prefer to use the V4L2_EVENT_EOS to > > check the eos, not checking the empty buffer, if we use this method to > > check skipped frame, the > > Checking the empty buffer is a legacy method, only available in Samsung MFC > driver. The spec says that the last buffer should be flagged with _LAST, and any > further attempt to poll should unblock and DQBUF return EPIPE. > > > application should check empty buffer instead of V4L2_EVENT_EOS, > > otherwise if the last frame is skipped, the application will miss it. > > Of course this is not a problem, it just increases the complexity of > > the userspace implementation > > The EPIPE mechanism covers this issue, which we initially had with the LAST > flag. > > > I don't think your third method is feasible, the reasons are as below > > 1. usually the empty payload means eos, and as you say, > > it may introduce confusion. > > 2. The driver may not have the opportunity to return an empty > > payload during decoding, in our driver, driver will pass the capture > > buffer to firmware, and when some frame is skipped, the firmware won't > > return the buffer, driver may not find an available capture buffer to > > return to userspace. > > > > The requirement is that userspace need to match the input frame > > count and output frame count. It doesn't care which frame is skipped, > > so the V4L2_EVENT_SKIP is the easiest way for driver and userspace. > > If you think this event is really inappropriate, I prefer to adopt > > your second option > > Please, drop SKIP from you driver and this patchset and fix your draining > process handling to follow the spec. The Samsung OMX component is > irrelevant to mainline submission, the OMX code should be updated to follow > the spec. > OK, I'll drop it, and follow your second option that use the v4l2_buffer.sequence counter. My previous statement about empty buffer was not appropriate. I know the last buffer should be flagged with _LAST, and in most of case, driver will append a empty buffer with _LAST flag. But in userspace, I find some application will check the bytesused, if it's 0, the application will think it's eos I checked the gstreamer v4l2 decoder, I found the following code: /* Legacy M2M devices return empty buffer when drained */ if (size == 0 && GST_V4L2_IS_M2M (obj->device_caps)) goto eos; and I found the ffmpeg v4l2 decoder does the similar thing. So I don't want to use empty buffer except the last buffer in the driver. > > > > > > > > > * - ``V4L2_EVENT_PRIVATE_START`` > > > > - 0x08000000 > > > > - Base event number for driver-private events. > > > > diff --git a/include/uapi/linux/videodev2.h > > > > b/include/uapi/linux/videodev2.h index 5bb0682b4a23..c56640d42dc5 > > > > 100644 > > > > --- a/include/uapi/linux/videodev2.h > > > > +++ b/include/uapi/linux/videodev2.h > > > > @@ -2369,6 +2369,8 @@ struct v4l2_streamparm { > > > > #define V4L2_EVENT_FRAME_SYNC 4 > > > > #define V4L2_EVENT_SOURCE_CHANGE 5 > > > > #define V4L2_EVENT_MOTION_DET 6 > > > > +#define V4L2_EVENT_CODEC_ERROR 7 > > > > +#define V4L2_EVENT_SKIP 8 > > > > #define V4L2_EVENT_PRIVATE_START 0x08000000 > > > > > > > > /* Payload for V4L2_EVENT_VSYNC */ > > > > > >