Re: Fwd: Return value of write() in BUS-OFF state

Jaroslav Beran <jara.beran@xxxxxxxxx> · Wed, 6 Nov 2019 18:23:07 +0100

On Wed, Nov 6, 2019 at 3:26 PM Oliver Hartkopp <socketcan@xxxxxxxxxxxx> wrote:
>
>
>
> On 06/11/2019 12.23, Kurt Van Dijck wrote:
> >
> >
> > On 6 November 2019 12:12:39 GMT+01:00, Oliver Hartkopp <socketcan@xxxxxxxxxxxx> wrote:
> >> Hello Jaroslav,
> >>
> >> On 05/11/2019 22.46, Jaroslav Beran wrote:
> >>
> >>> So far I've learned this issue is most probably caused by upper (net
> >>> and can) layers (so this is not specific for certain controller
> >>> driver). When a driver calls can_bus_off, it sets carrier-off and
> >>> triggers linkwatch_* actions that deactivate net queues and
> >> substitute
> >>> a struct qdisc with `noop_qdisc`. Upon sending a frame, it's enqueue
> >>> function - noop_enqueue - just returns NET_XMIT_CN, which is
> >>> transformed by net_xmit_errno macro to zero, that's passed by
> >>> net/can/af_can.c:can_send up to a userspace caller of write as
> >>> success.
> >>
> >> Hm.
> >>
> >>> According to description for qdisc return codes in
> >>> include/linux/netdevice.h, NET_XMIT_CN stands for `congestion
> >>> notification` and further
> >>>
> >>> /* NET_XMIT_CN is special. It does not guarantee that this packet is
> >> lost. It
> >>>    * indicates that the device will soon be dropping packets, or
> >> already drops
> >>>    * some packets of the same priority; prompting us to send less
> >> aggressively. */
> >>>
> >>>
> >>> Is this behavior appropriate for a node in BUS-OFF state? I'd rather
> >>> expect such controller would be always dropping all frames (not just
> >>> soon and some) until reset.
> >>
> >> The common use of the net_xmit_errno macro probably really does not fit
> >>
> >> to the CAN specialties ...
> >>
> >>> In current situation a caller of write gets success even if his frame
> >>> is lost for sure. Is there any specific reason for this? Of course he
> >>> can be notified by receiving error frame, but why don't just return
> >>> error in can_send?
> >>
> >> Yes. It makes sense to forward the carrier-off state that is thankfully
> >>
> >> provided by the linkwatch triggers to the user space.
> >>
> >> Looking to man(2) send we should provide -ENOBUFS in the case of
> >> carrier-off state, right?
> > ENOBUFS seems a bad indication. What about ENETDOWN instead?
>
> ENETDOWN shows that the interface is "down" which does not fit the
> current situation.
>
> The interface is "up" but the carrier is "off".
>
> man(2) send says:
>
> ENOBUFS
>        The output queue for a network interface was full.  This  gener‐
>        ally  indicates  that the interface has stopped sending, but may
>        be caused by transient congestion.  (Normally, this does not oc‐
>        cur  in  Linux.  Packets are just silently dropped when a device
>        queue overflows.)
>
> Fits to me !?
>

I don't know, neither option doesn't perfectly fit for the carrier-off
state to me.

What about choosing another code like ENOSPC or EIO to distinguish
bus-off from other recoverable conditions?

Best regards,
Jaroslav

> Best,
> Oliver