On Wed, Nov 6, 2019 at 3:26 PM Oliver Hartkopp <socketcan@xxxxxxxxxxxx> wrote: > > > > On 06/11/2019 12.23, Kurt Van Dijck wrote: > > > > > > On 6 November 2019 12:12:39 GMT+01:00, Oliver Hartkopp <socketcan@xxxxxxxxxxxx> wrote: > >> Hello Jaroslav, > >> > >> On 05/11/2019 22.46, Jaroslav Beran wrote: > >> > >>> So far I've learned this issue is most probably caused by upper (net > >>> and can) layers (so this is not specific for certain controller > >>> driver). When a driver calls can_bus_off, it sets carrier-off and > >>> triggers linkwatch_* actions that deactivate net queues and > >> substitute > >>> a struct qdisc with `noop_qdisc`. Upon sending a frame, it's enqueue > >>> function - noop_enqueue - just returns NET_XMIT_CN, which is > >>> transformed by net_xmit_errno macro to zero, that's passed by > >>> net/can/af_can.c:can_send up to a userspace caller of write as > >>> success. > >> > >> Hm. > >> > >>> According to description for qdisc return codes in > >>> include/linux/netdevice.h, NET_XMIT_CN stands for `congestion > >>> notification` and further > >>> > >>> /* NET_XMIT_CN is special. It does not guarantee that this packet is > >> lost. It > >>> * indicates that the device will soon be dropping packets, or > >> already drops > >>> * some packets of the same priority; prompting us to send less > >> aggressively. */ > >>> > >>> > >>> Is this behavior appropriate for a node in BUS-OFF state? I'd rather > >>> expect such controller would be always dropping all frames (not just > >>> soon and some) until reset. > >> > >> The common use of the net_xmit_errno macro probably really does not fit > >> > >> to the CAN specialties ... > >> > >>> In current situation a caller of write gets success even if his frame > >>> is lost for sure. Is there any specific reason for this? Of course he > >>> can be notified by receiving error frame, but why don't just return > >>> error in can_send? > >> > >> Yes. It makes sense to forward the carrier-off state that is thankfully > >> > >> provided by the linkwatch triggers to the user space. > >> > >> Looking to man(2) send we should provide -ENOBUFS in the case of > >> carrier-off state, right? > > ENOBUFS seems a bad indication. What about ENETDOWN instead? > > ENETDOWN shows that the interface is "down" which does not fit the > current situation. > > The interface is "up" but the carrier is "off". > > man(2) send says: > > ENOBUFS > The output queue for a network interface was full. This gener‐ > ally indicates that the interface has stopped sending, but may > be caused by transient congestion. (Normally, this does not oc‐ > cur in Linux. Packets are just silently dropped when a device > queue overflows.) > > Fits to me !? > I don't know, neither option doesn't perfectly fit for the carrier-off state to me. What about choosing another code like ENOSPC or EIO to distinguish bus-off from other recoverable conditions? Best regards, Jaroslav > Best, > Oliver