On Mon, May 25, 2020 at 09:45:49PM +0200, Daniel Glöckner wrote: > Hello Sascha, > > Am 25.05.20 um 12:33 schrieb Sascha Hauer: > > This removes the may_send mechanism from fastboot over UDP. Without this > > fastboot transfers get stuck when on the command line something like > > "sleep 10" is executed before or during a fastboot session. In this case > > while (fbn->may_send == MAY_NOT_SEND) will never become false and > > barebox is cought in that loop. > > I am not able to reproduce the problem you have without this patch. > If I start "sleep 10" on the serial console and then use fastboot to > execute something else, the host will repeat the UDP packet with the > command until the sleep is over. At that point the board sends an ACK > packet and starts to execute the command. > > Starting a big download and then executing sleep 10 on the console > also works. The download continues while sleep is executing. Only the > "Downloading XXXX bytes finished" info message and the final "OKAY" > packet are delayed by the sleep since that is done from inside the > poller. Same here. It turned out that barebox got stuck because of the idle_slice_acquire/release I had removed from common/poller.c. With these added back it works as expected. Sorry for the noise. > > What were the exact commands you used to trigger the problem? > What kind of network connection do you use? > > I know that Barebox will get stuck when the host stops talking to us > before we had the opportunity to send the final OKAY, but that is not > what you describe. If you use a sleep 100, the host will time out > before the sleep is done. After the sleep the command is still in > fbn->command, so it will be acked and executed. Since the host no > longer talks to us, it will not send us an empty packet with the next > sequence number, so the loop in fastboot_write_net will never terminate. > > The only way to get out of this situation with the current code > (omitting patch 20) is to start a new fastboot session. The INIT packet > of the new session will set reinit to true, which discards all messages > the old session wants to send. If this is not acceptable, I suggest > adding a timeout >= 60 seconds to that loop in fastboot_write_net and > setting reinit when it expires. We can make the timeout a kconfig option > which defaults to 60 if you prefer. Yes, we should add a timeout to this loop. Being stuck in an endless loop never is a good user experience. How about this? | diff --git a/net/fastboot.c b/net/fastboot.c | index 42c432c678..f579666335 100644 | --- a/net/fastboot.c | +++ b/net/fastboot.c | @@ -144,12 +144,29 @@ static int fastboot_write_net(struct fastboot *fb, const char *buf, | struct fastboot_header response_header; | uchar *packet; | uchar *packet_base; | + uint64_t start; | + int selapsed = 5; | | if (fbn->reinit) | return 0; | | + start = get_time_ns(); | + | while (fbn->may_send == MAY_NOT_SEND) { | poller_call(); | + | + if (is_timeout(start, selapsed * SECOND)) { | + pr_err("No fastboot packet from host for %d seconds\n", | + selapsed); | + | + if (selapsed > 60) { | + pr_err("Giving up\n"); | + fbn->reinit = true; | + } | + | + selapsed += 5; | + } | + | if (fbn->reinit) | return 0; | } Sascha -- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 | _______________________________________________ barebox mailing list barebox@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/barebox