Hi Sebastien,

On Mon, Oct 24, 2022 at 04:15:13PM +0000, Sebastien FABRE wrote:
> Hi Oleksij,
>
> > Hi Sebastien,
> >
> > On Thu, Oct 20, 2022 at 01:17:36PM +0000, Sebastien FABRE wrote:
> > > > On 17.10.2022 14:55:58, Sebastien FABRE wrote:
> > > > > Hello,
> > > > >
> > > > > I am working on 5.4 kernel, and I have the same behavior with 5.10
> > > > > kernel version.
> > > > >
> > > > > I reproduce the behavior with a custom application. A j1939 socket is
> > > > > created with SO_BROADCAST and SO_J1939_PROMISC options and is binded.
> > > > > The application sends a claim message then 50 broadcast messages in
> > > > > loop (without waiting) with size greater than 8 bytes (50).
> > > > >
> > > > > Every sendto methods return success directly and sessions are stored
> > > > > in sk_session_queue.
> > > > >
> > > > > If the can is 'on' but nobody acknowledges, after some times, trames
> > > > > are no longer sent (ENOBUFS) but the application does not have this
> > > > > information (sendto returned success).
> > > > >
> > > > > Moreover, txqueuelen does not have impact to this behavior (queue size
> > > > > seems to be infinite).
> > > > >
> > > > > To finish, closing socket will take a long time depending on
> > > > > sk_session_queue size because of J1939_XTP_TX_RETRY_LIMIT: kernel
> > > > > seems to try to send every message 100 times if ENOBUFS is received.
> > > > >
> > > > > Is it the expected behavior? How can the application know that
> > > > > messages are no longer sent?
> > > >
> > > > It's sort of expected....I think we haven't thought of that corner case.
> > > > There is the socket TX timeout option, seems we have to implement this for
> > > > j1939.
> > > >
> > >
> > > I reproduced the same behaviour with updated testj1939 (so no claim
> > > message) to be able to send multiple messages.
> > > The tests have been done with peak can or flexcan.
> > > Should we limit the sk_session_queue size to not be able to have too many
> > > messages in this queue ? In this case, sendto will return an error (and not
> > > success) when it is full.
> >
> > Can you reproduce same issue with j1939cat in broadcast mode?
> >
> > The difference between testj1939 and j1939cat is the last one is
> > designed to get error/completion reports from the kernel J1939 stack.
>
> Indeed, with j1939cat, I have now an error, but it takes a long time to
> close the socket. It seems to be because recvmsg is called after the last
> message (so j1939 queue contains a lot of messages) and not after each
> message send.
> So, is it recommended to use j1939cat mechanism (with errqueue) to send
> j1939 broadcast messages ?

If you need any kind of feedback from the stack, you'll need to use the
error queue.

To deal with a big queue of messages which can't be sent, we need to
extend the CAN framework with the ability to kill pending packets on
request. This is currently not supported, but it would be nice to have.

Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |