On 13.08.23 13:23, Lukas Magel wrote:
> Hi Maxime, hi Michal,
>
> On 01.07.23 00:35, Michal Sojka wrote:
>> Hi Maxime,
>>
>> On Fri, Jun 30 2023, Maxime Jayat wrote:
>>> Hi,
>>>
>>> There is something not clear happening with the non-blocking behavior
>>> of ISO-TP sockets in the TX path, but more importantly, using epoll now
>>> completely breaks isotp_sendmsg.
>>> I believe it is related to
>>> 79e19fa79c ("can: isotp: isotp_ops: fix poll() to not report false
>>> EPOLLOUT events"),
>>> but actually is probably deeper than that.
>>>
>>> I don't completely understand what is exactly going on, so I am sharing
>>> the problem I face:
>>>
>>> With an ISO-TP socket in non-blocking mode, using epoll seems to make
>>> isotp_sendmsg always return -EAGAIN.
>> That's definitely not expected behavior. I tested the patch only with
>> poll, hoping that epoll would behave the same.
>>
>> [...]
>
> I am writing to report that we have been witnessing a behavior very similar
> to what you describe. ISO-TP send breaks with EAGAIN if a poll (for a read)
> occurs at the same time.
>
> Our Python stack uses two threads to do concurrent, blocking reads & writes
> from and to the ISO-TP socket. The socket has a timeout of 0.1s to facilitate
> shutdown when requested by the application. Notably, the blocking semantics
> are handled by CPython, i.e. the underlying kernel socket is non-blocking.
> CPython polls until the requested operation (read or write) can be executed
> or the timeout occurs.
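
For reference, the pattern boils down to roughly the sketch below
(hypothetical interface name and ISO-TP addresses; the real stack is more
involved and has proper shutdown and error handling):

    # Reader thread: recv() with a 0.1 s timeout makes CPython keep
    # re-polling the non-blocking fd, so a sleeper sits on the socket's
    # wait queue almost all of the time. The response is sent from a
    # second thread on the same socket, which is where the EAGAIN shows
    # up (see the strace below).
    import queue
    import socket
    import threading

    requests = queue.Queue()

    sock = socket.socket(socket.AF_CAN, socket.SOCK_DGRAM, socket.CAN_ISOTP)
    sock.bind(("vcan0", 0x7e0, 0x7e8))  # (interface, rx_addr, tx_addr)
    sock.settimeout(0.1)                # non-blocking fd + internal poll

    def read_loop():
        while True:
            try:
                requests.put(sock.recv(4095))   # e.g. 3e 00 (tester present)
            except socket.timeout:
                continue                        # timed out -> poll again

    def write_loop():
        while True:
            requests.get()
            sock.send(b"\x7e\x00")   # in the failing case: socket.timeout
                                     # after 0.1 s of repeated EAGAIN

    threading.Thread(target=read_loop).start()
    threading.Thread(target=write_loop).start()
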
>
> What happens during execution is that the socket is continuously being
> polled by the read thread, i.e. so->wait is always filled with one task.
> This process repeats until the socket receives a frame from the bus and the
> poll returns successfully. The app reads the data from the socket and
> sends a response. Since the send occurs in a different thread, the reader
> thread will have already returned to its poll loop and to populating
> so->wait. When the send occurs, isotp_sendmsg checks so->wait for sleepers
> and returns EAGAIN because the socket is non-blocking, although there is no
> concurrent send operation. This dance continues until the timeout occurs for
> either the read or the write operation. If the write times out first, a
> timeout error causes the Python app to break. If the read times out first,
> there is a race between the write going through and the reader putting in
> another poll.
>
> This behavior can be seen rather nicely in strace:
> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 1 ([{fd=5, revents=POLLIN}])
> 110580 recvfrom(5, ">\0", 4095, 0, NULL, NULL) = 2
> 110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0) = -1 EAGAIN (Resource temporarily unavailable)
> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0) = -1 EAGAIN (Resource temporarily unavailable)
> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0) = -1 EAGAIN (Resource temporarily unavailable)
> [....]
> 110569 poll([{fd=5, events=POLLOUT}], 1, 6) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0) = -1 EAGAIN (Resource temporarily unavailable)
> 110569 poll([{fd=5, events=POLLOUT}], 1, 6 <unfinished ...>
> 110580 <... poll resumed>) = 0 (Timeout)
> 110569 <... poll resumed>) = 1 ([{fd=5, revents=POLLOUT}])
> 110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
> 110569 sendto(5, "~\0", 2, 0, NULL, 0) = -1 EAGAIN (Resource temporarily unavailable)
> 110569 poll([{fd=5, events=POLLOUT}], 1, 5) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0) = -1 EAGAIN (Resource temporarily unavailable)
>
> I believe this is consistent with the behavior you're witnessing with epoll,
> because epoll also places a sleeper in so->wait that is left in the queue
> until the epoll descriptor is closed.
>
>
>>> By reverting 79e19fa79c, I get better results but still incorrect:
>> [...]
>>
>>> It is then possible to write on the socket but the write is blocking,
>>> which is not the expected behavior for a non-blocking socket.
>> Yes, incorrect behavior was why we made the commit in question, however
>> we saw write() returning -EAGAIN when it shouldn't.
>>
>>> I don't know how to solve the problem. To me, using wq_has_sleeper seems
>>> weird.
>> Agreed. I've never tried to understand how synchronization works here.
>> Hopefully, Oliver knows more.
>>
>>> The implementation of isotp_poll feels weird too (calling both
>>> datagram_poll and
>>> poll_wait?). But I am not sure what would be the correct
>>> implementation.
>> I understand it as follows (which might be wrong - someone, please
>> correct me): isotp_poll() should register the file with all waitqueues
>> it can wait on. so->wait is one and sock->sq.wait (used by
>> datagram_poll) is another. The former is definitely used for TX, the
>> latter is probably used because skb_recv_datagram() is called for RX.
>> But so->wait is also used for RX and there might probably be some
>> inconsistency between those.
>
> AFAIK, it is correct behavior for isotp_poll to register all wait queues
> with the poller.
>
> Before 79e19fa79c, I assume datagram_poll always returned
> EPOLLOUT because its corresponding send buffer is unused and empty. This
> return value can be incorrect if so->tx.state is not IDLE and a send would
> block. With the patch, this behavior is now suppressed. I believe that
> the inconsistency could rather have been introduced with commit
> 0517374 ("can: isotp: fix race between isotp_sendsmg() and isotp_release()").
>
> With this patch, the behavior of isotp_sendmsg was changed to only check
> so->wait for sleepers instead of consulting so->tx.state to see whether the
> socket is busy. Since the wait queue can also have sleepers that are only
> interested in read operations, I believe the return value of wq_has_sleeper
> is not a valid indicator of send readiness. Additionally, in this state, the
> behavior of isotp_sendmsg is inconsistent with isotp_poll. I will try to test
> next week whether reverting this part of the patch fixes the race condition.
>

I submitted a patch to the mailing list and would very much appreciate
your feedback:
https://lore.kernel.org/linux-can/20230818114345.142983-1-lukas.magel@xxxxxxxxxx/

@Maxime, I tried your isotprecv / isotpsend snippet and the issue did not
occur with the supplied patch. The send operation went through right away.

@Oliver, I adjusted the exit path for the case where the initial wait is
interrupted to return immediately instead of jumping to err_event_drop.
Could you please check whether you agree with this change?

Regards,
Lukas
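
P.S.: For completeness, on an affected kernel the epoll variant from the top
of the thread can be triggered with something as small as the sketch below
(again with a hypothetical interface and made-up ISO-TP addresses). Merely
registering the descriptor for reads leaves an entry in so->wait, so a
non-blocking send fails with EAGAIN even though no transmission is in
progress:

    import select
    import socket

    sock = socket.socket(socket.AF_CAN, socket.SOCK_DGRAM, socket.CAN_ISOTP)
    sock.bind(("vcan0", 0x7e8, 0x7e0))  # (interface, rx_addr, tx_addr)
    sock.setblocking(False)

    ep = select.epoll()
    ep.register(sock.fileno(), select.EPOLLIN)  # only interested in reads

    try:
        sock.send(b"\x3e\x00")          # tester present, just as an example
    except BlockingIOError as exc:
        print("send failed:", exc)      # EAGAIN although no send is pending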