Hi Oleksij,

> Hi Sebastien,
>
> On Thu, Oct 20, 2022 at 01:17:36PM +0000, Sebastien FABRE wrote:
> > > On 17.10.2022 14:55:58, Sebastien FABRE wrote:
> > > > Hello,
> > > >
> > > > I am working on 5.4 kernel, and I have the same behavior with 5.10
> > > > kernel version.
> > > >
> > > > I reproduce the behavior with a custom application. A j1939 socket is
> > > > created with SO_BROADCAST and SO_J1939_PROMISC options and is binded.
> > > > The application sends a claim message then 50 broadcast messages in
> > > > loop (without waiting) with size greater than 8 bytes (50).
> > > >
> > > > Every sendto methods return success directly and sessions are stored
> > > > in sk_session_queue.
> > > >
> > > > If the can is 'on' but nobody acknowledges, after some times, trames
> > > > are no longer sent (ENOBUFS) but the application does not have this
> > > > information (sendto returned success).
> > > >
> > > > Moreover, txqueuelen does not have impact to this behavior (queue size
> > > > seems to be infinite).
> > > >
> > > > To finish, closing socket will take a long time depending on
> > > > sk_session_queue size because of J1939_XTP_TX_RETRY_LIMIT: kernel
> > > > seems to try to send every message 100 times if ENOBUFS is received.
> > > >
> > > > Is it the expected behavior? How can the application know that
> > > > messages are no longer sent?
> > >
> > > It's sort of expected....I think we haven't thought of that corner case.
> > > There is the socket TX timeout option, seems we have to implement this for
> > > j1939.
> > >
> >
> > I reproduced the same behaviour with updated testj1939 (so no claim
> > message) to be able to send multiple messages.
> > The tests have been done with peak can or flexcan.
> > Should we limit the sk_session_queue size to not be able to have too many
> > messages in this queue ? In this case, sendto will return an error (and not
> > success) when it is full.
>
> Can you reproduce same issue with j1939cat in broadcast mode?
>
> The difference between testj1939 and j1939cat is the last one is
> designed to get error/completion reports from the kernel J1939 stack.

Indeed, with j1939cat I now get an error, but closing the socket still
takes a long time. This seems to be because recvmsg is called only after
the last message has been sent (so the j1939 queue already contains a lot
of messages), and not after each message is sent.

So, is it recommended to use the j1939cat mechanism (with errqueue) to
send j1939 broadcast messages?

Regards,
Sébastien Fabre