---------- Forwarded message --------- From: Jaroslav Beran <jara.beran@xxxxxxxxx> Date: Wed, Nov 6, 2019 at 11:09 PM Subject: Re: Fwd: Return value of write() in BUS-OFF state To: Kurt Van Dijck <dev.kurt@xxxxxxxxxxxxxxxxxxxxxx> On Wed, Nov 6, 2019 at 10:59 PM Kurt Van Dijck <dev.kurt@xxxxxxxxxxxxxxxxxxxxxx> wrote: > > On wo, 06 nov 2019 22:15:50 +0100, Oliver Hartkopp wrote: > > > > > > > > On 06/11/2019 18.23, Jaroslav Beran wrote: > > >On Wed, Nov 6, 2019 at 3:26 PM Oliver Hartkopp <socketcan@xxxxxxxxxxxx> wrote: > > >> > > >> > > >> > > >>On 06/11/2019 12.23, Kurt Van Dijck wrote: > > >>> > > >>> > > >>>On 6 November 2019 12:12:39 GMT+01:00, Oliver Hartkopp <socketcan@xxxxxxxxxxxx> wrote: > > >>>>Hello Jaroslav, > > >>>> > > >>>>On 05/11/2019 22.46, Jaroslav Beran wrote: > > >>>> > > >>>>>So far I've learned this issue is most probably caused by upper (net > > >>>>>and can) layers (so this is not specific for certain controller > > >>>>>driver). When a driver calls can_bus_off, it sets carrier-off and > > >>>>>triggers linkwatch_* actions that deactivate net queues and > > >>>>substitute > > >>>>>a struct qdisc with `noop_qdisc`. Upon sending a frame, it's enqueue > > >>>>>function - noop_enqueue - just returns NET_XMIT_CN, which is > > >>>>>transformed by net_xmit_errno macro to zero, that's passed by > > >>>>>net/can/af_can.c:can_send up to a userspace caller of write as > > >>>>>success. > > >>>> > > >>>>Hm. > > >>>> > > >>>>>According to description for qdisc return codes in > > >>>>>include/linux/netdevice.h, NET_XMIT_CN stands for `congestion > > >>>>>notification` and further > > >>>>> > > >>>>>/* NET_XMIT_CN is special. It does not guarantee that this packet is > > >>>>lost. It > > >>>>> * indicates that the device will soon be dropping packets, or > > >>>>already drops > > >>>>> * some packets of the same priority; prompting us to send less > > >>>>aggressively. */ > > >>>>> > > >>>>> > > >>>>>Is this behavior appropriate for a node in BUS-OFF state? I'd rather > > >>>>>expect such controller would be always dropping all frames (not just > > >>>>>soon and some) until reset. > > >>>> > > >>>>The common use of the net_xmit_errno macro probably really does not fit > > >>>> > > >>>>to the CAN specialties ... > > >>>> > > >>>>>In current situation a caller of write gets success even if his frame > > >>>>>is lost for sure. Is there any specific reason for this? Of course he > > >>>>>can be notified by receiving error frame, but why don't just return > > >>>>>error in can_send? > > >>>> > > >>>>Yes. It makes sense to forward the carrier-off state that is thankfully > > >>>> > > >>>>provided by the linkwatch triggers to the user space. > > >>>> > > >>>>Looking to man(2) send we should provide -ENOBUFS in the case of > > >>>>carrier-off state, right? > > >>>ENOBUFS seems a bad indication. What about ENETDOWN instead? > > >> > > >>ENETDOWN shows that the interface is "down" which does not fit the > > >>current situation. > > ack. ENETDOWN was a bad suggestion. > > > >> > > >>The interface is "up" but the carrier is "off". > > >> > > >>man(2) send says: > > >> > > >>ENOBUFS > > >> The output queue for a network interface was full. This gener‐ > > >> ally indicates that the interface has stopped sending, but may > > >> be caused by transient congestion. (Normally, this does not oc‐ > > >> cur in Linux. Packets are just silently dropped when a device > > >> queue overflows.) > > >> > > >>Fits to me !? > > >> > > > > > >I don't know, neither option doesn't perfectly fit for the carrier-off > > >state to me. > > > > > >What about choosing another code like ENOSPC or EIO to distinguish > > >bus-off from other recoverable conditions? > > > > I must admit that the ability to distinguish the return values would be > > cool. > > ack. busoff is completely different from 'on buffers' > > > > > But when you check the man pages from either send() or write() you'll see > > that ENOSPC or EIO have a description that fit even worse. > > ENETUNREACH? I'm just picking values that suit more. Yeah, I couldn't find anything better than ENETUNREACH. Already sent a patch.