Hi All,

<Snip>
> Hi all,
>
> On 3/10/19 6:07 AM, Dave Taht wrote:
> > Toke Høiland-Jørgensen <toke@xxxxxxxxxx> writes:
> >
> >> Appana Durga Kedareswara Rao <appanad@xxxxxxxxxx> writes:
> >>
> >>> Hi Andre,
> >>>
> >>> <Snip>
> >>>>
> >>>> On 3/9/19 3:07 PM, Appana Durga Kedareswara rao wrote:
> >>>>> While stress testing the CAN interface on xilinx axi can in
> >>>>> loopback mode getting message "write: no buffer space available"
> >>>>> Increasing device tx queue length resolved the above mentioned issue.
> >>>>
> >>>> No need to patch the kernel:
> >>>>
> >>>> $ ip link set <dev-name> txqueuelen 500
> >>>>
> >>>> does the same thing.
> >>>
> >>> Thanks for the review...
> >>> Agree but it is not an out of box solution right??
> >>> Do you have any idea for socket can devices why the tx queue length
> >>> is 10 whereas for other network devices (ex: ethernet) it is 1000 ??
> >>
> >> Probably because you don't generally want a long queue adding latency
> >> on a CAN interface? The default 1000 is already way too much even for
> >> an Ethernet device in a lot of cases.
> >>
> >> If you get "out of buffer" errors it means your application is
> >> sending things faster than the receiver (or device) can handle them.
> >> If you solve this by increasing the queue length you are just
> >> papering over the underlying issue, and trading latency for fewer
> >> errors. This tradeoff *may* be appropriate for your particular
> >> application, but I can imagine it would not be appropriate as a
> >> default. Keeping the buffer size small allows errors to propagate up
> >> to the application, which can then back off, or do something smarter,
> >> as appropriate.
> >>
> >> I don't know anything about the actual discussions going on when the
> >> defaults were set, but I can imagine something along the lines of the
> >> above was probably a part of it :)
> >>
> >> -Toke
> >
> > In a related discussion, loud and often difficult, over here on the
> > can bus,
> >
> > https://github.com/systemd/systemd/issues/9194#issuecomment-469403685
> >
> > we found that applying fq_codel as the default via sysctl qdisc a bad
> > idea for systems for at least one model of can device.
> >
> > If you scroll back on the bug, a good description of what the can
> > subsystem expects from the qdisc is therein - it mandates an in-order
> > fifo qdisc or no queue at all. the CAN protocol expects each packet to
> > be transmitted successfully or rejected, and if so, passes the error
> > up to userspace and is supposed to stop for further input.
> >
> > As this was the first serious bug ever reported against using fq_codel
> > as the default in 5+ years of systemd and 7 of openwrt deployment I've
> > been taking it very seriously. It's worse than just systemd - openwrt
> > patches out pfifo_fast entirely. pfifo_fast is the wrong qdisc - the
> > right choices are noqueue and possibly pfifo.
> >
> > However, the vcan device exposes noqueue, and so far it has been only
> > the one device ( a 8Devices socketcan USB2CAN ) that did not do this
> > in their driver that was misbehaving.
> >
> > Which was just corrected with a simple:
> >
> > static int usb_8dev_probe(struct usb_interface *intf,
> >                           const struct usb_device_id *id)
> > {
> >      ...
> >      netdev->netdev_ops = &usb_8dev_netdev_ops;
> >
> >      netdev->flags |= IFF_ECHO; /* we support local echo */
> > +    netdev->priv_flags |= IFF_NO_QUEUE;
> >      ...
> > }
> >
> > and successfully tested on that bug report.
> >
> > So at the moment, my thought is that all can devices should default to
> > noqueue, if they are not already.
> > I think a pfifo_fast and a qlen of any size is the wrong thing, but I
> > still don't know enough about what other can devices do or did to be
> > certain.
> >
>
> Having about 10 elements in a CAN driver tx queue allows to work with
> queueing disciplines
> (http://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf) and also to
> maintain a nearly real-time behaviour with outgoing traffic.
>
> When the CAN interface is not able to cope with the (intended) outgoing
> traffic load, the applications should get an instant feedback about it.
>
> There is a difference between running CAN applications in the real world
> and doing performance tests, where it makes sense to increase the
> tx-queue-len to e.g. 1000 and dump 1000 frames into the driver to check
> the hardware performance.

Thanks Oliver, Martin, Andre, Toke and Dave for your inputs.

So, to conclude: the default txqueuelen of 10 is ideal for real-time CAN
traffic. For stress/performance tests, the user needs to increase
txqueuelen manually based on their requirements.

Please correct me if my understanding is wrong.

Regards,
Kedar.

>
> Best regards,
> Oliver
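
For reference, Toke's point that a short queue lets the "no buffer space available" error (ENOBUFS) reach the application, which can then back off, might look roughly like the sketch below. This is not code from the thread or from any driver: `frame_writer`, `send_with_backoff`, `sim_txq`, `sim_write` and `sim_drain` are hypothetical names, and the simulated queue only stands in for a device tx queue so the logic can be exercised without CAN hardware.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical writer callback with the same contract as write(2):
 * returns the number of bytes written, or -1 with errno set. */
typedef long (*frame_writer)(void *ctx, const void *frame, size_t len);

/* Simulated tx queue (assumption: stands in for the device queue). */
struct sim_txq {
    unsigned int len;   /* frames currently queued */
    unsigned int limit; /* plays the role of txqueuelen */
};

/* Accept a frame if there is room, otherwise fail with ENOBUFS. */
static long sim_write(void *ctx, const void *frame, size_t len)
{
    struct sim_txq *q = ctx;

    (void)frame;
    assert(q != NULL);
    if (q->len >= q->limit) {
        errno = ENOBUFS;
        return -1;
    }
    q->len++;
    return (long)len;
}

/* Pretend the hardware has transmitted one queued frame. */
static void sim_drain(struct sim_txq *q)
{
    if (q->len > 0)
        q->len--;
}

/* Send one frame. On ENOBUFS, sleep backoff_ns (must be < 1e9) and retry,
 * at most max_retries times. Returns 0 on success; returns -1 on any other
 * error, on a short write, or once the retries are used up. */
static int send_with_backoff(frame_writer wr, void *ctx,
                             const void *frame, size_t len,
                             unsigned int max_retries, long backoff_ns)
{
    for (unsigned int attempt = 0; ; attempt++) {
        long n = wr(ctx, frame, len);

        if (n == (long)len)
            return 0;
        if (n >= 0)
            return -1; /* short write: treat as failure */
        if (errno != ENOBUFS || attempt >= max_retries)
            return -1; /* errno still holds the writer's error */

        struct timespec ts = { 0, backoff_ns };
        nanosleep(&ts, NULL);
    }
}
```

With a real interface, the writer would presumably be a thin wrapper around write(2) on a CAN_RAW socket. Whether to retry, drop the frame or slow the sender down once the retries run out is an application decision, which is the tradeoff discussed above.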