On 06/11/2019 12.23, Kurt Van Dijck wrote:
On 6 November 2019 12:12:39 GMT+01:00, Oliver Hartkopp <socketcan@xxxxxxxxxxxx> wrote:
Hello Jaroslav,
On 05/11/2019 22.46, Jaroslav Beran wrote:
So far I've learned this issue is most probably caused by the upper (net
and can) layers, so it is not specific to a certain controller
driver. When a driver calls can_bus_off, it sets carrier-off and
triggers linkwatch_* actions that deactivate the net queues and
substitute a struct qdisc with `noop_qdisc`. Upon sending a frame, its
enqueue function, noop_enqueue, just returns NET_XMIT_CN, which the
net_xmit_errno macro transforms to zero. net/can/af_can.c:can_send then
passes that zero up to the userspace caller of write as success.
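To make the chain above concrete, here is a minimal userspace sketch of the return-code mapping. The constants and the macro mirror include/linux/netdevice.h as far as I know; `sketch_can_send_result` is a hypothetical stand-in for the tail of can_send, not the kernel function itself.

```c
#include <errno.h>

/* Qdisc return codes, mirroring include/linux/netdevice.h */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01
#define NET_XMIT_CN      0x02

/* Same mapping as the kernel macro: everything except CN becomes -ENOBUFS */
#define net_xmit_errno(e) ((e) != NET_XMIT_CN ? -ENOBUFS : 0)

/*
 * Hypothetical stand-in for the end of can_send(): positive qdisc
 * codes are translated by net_xmit_errno(), other values pass through.
 */
static int sketch_can_send_result(int xmit_ret)
{
    int err = xmit_ret;

    if (err > 0)
        err = net_xmit_errno(err);
    return err;
}
```

With this mapping, a return of NET_XMIT_CN from noop_enqueue ends up as 0, while NET_XMIT_DROP would end up as -ENOBUFS.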
Hm.
According to the description of the qdisc return codes in
include/linux/netdevice.h, NET_XMIT_CN stands for `congestion
notification`, and further:
/* NET_XMIT_CN is special. It does not guarantee that this packet is lost. It
 * indicates that the device will soon be dropping packets, or already drops
 * some packets of the same priority; prompting us to send less aggressively. */
Is this behavior appropriate for a node in BUS-OFF state? I'd rather
expect such a controller to always drop all frames (not just
soon and some) until reset.
The common use of the net_xmit_errno macro probably really does not fit
the CAN specifics ...
In the current situation a caller of write gets success even if his frame
is lost for sure. Is there any specific reason for this? Of course he
can be notified by receiving an error frame, but why not just return
an error in can_send?
Yes. It makes sense to forward the carrier-off state that is thankfully
provided by the linkwatch triggers to the user space.
Looking at man(2) send, we should provide -ENOBUFS in the case of
carrier-off state, right?
ENOBUFS seems a bad indication. What about ENETDOWN instead?
ENETDOWN shows that the interface is "down" which does not fit the
current situation.
The interface is "up" but the carrier is "off".
man(2) send says:
ENOBUFS
The output queue for a network interface was full. This generally
indicates that the interface has stopped sending, but may
be caused by transient congestion. (Normally, this does not occur
in Linux. Packets are just silently dropped when a device
queue overflows.)
Fits to me !?
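For illustration, this is roughly how a userspace sender could tell the two candidate errors apart if write reported them. It is only a sketch: ENOBUFS for carrier-off is the proposal discussed here, not confirmed behavior, and `enum tx_outcome` and `send_frame_checked` are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Outcome classes a sender might distinguish (hypothetical names) */
enum tx_outcome { TX_OK, TX_RETRY_LATER, TX_IFACE_DOWN, TX_FAILED };

/* Write one frame to the socket fd and classify the result by errno */
static enum tx_outcome send_frame_checked(int fd, const void *frame, size_t len)
{
    ssize_t n = write(fd, frame, len);

    if (n == (ssize_t)len)
        return TX_OK;
    if (n < 0 && errno == ENOBUFS)
        return TX_RETRY_LATER;   /* assumed here: carrier off, e.g. bus-off */
    if (n < 0 && errno == ENETDOWN)
        return TX_IFACE_DOWN;    /* interface is administratively down */
    return TX_FAILED;            /* any other error or a short write */
}
```

A caller seeing TX_RETRY_LATER would know the interface is still up and could wait for the controller to recover instead of treating the link as gone.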
Best,
Oliver