On Mon, Feb 19, 2024 at 09:58:20PM +1000, Nicholas Piggin wrote:
> On Mon Feb 19, 2024 at 4:59 PM AEST, Thomas Huth wrote:
> > On 17/02/2024 11.43, Nicholas Piggin wrote:
> > > On Sat Feb 17, 2024 at 12:02 AM AEST, Thomas Huth wrote:
> > >> getchar() can currently only be called once on arm since the implementation
> > >> is a little bit too naïve: After the first character has arrived, the
> > >> data register never gets set to zero again. To properly check whether a
> > >> byte is available, we need to check the "RX fifo empty" on the pl011 UART
> > >> or the "RX data ready" bit on the ns16550a UART instead.
> > >>
> > >> With this proper check in place, we can finally also get rid of the
> > >> ugly assert(count < 16) statement here.
> > >>
> > >> Signed-off-by: Thomas Huth <thuth@xxxxxxxxxx>
> > >
> > > Nice, thanks for fixing this up.
> > >
> > > I see what you mean about multi-migration not waiting. It seems
> > > to be an arm issue, ppc works properly.
> >
> > Yes, it's an arm issue. s390x also works fine.
> >
> > > This patch changed things
> > > so it works a bit better (or at least differently) now, but
> > > still has some bugs. Maybe buggy uart migration?
> >
> > I'm also seeing hangs when running the arm migration-test multiple times,
> > but also without my UART patch here - so I assume the problem is not really
> > related to the UART?
>
> Yeah, I ended up figuring it out. A 11 year old TCG migration memory
> corruption bug!
>
> https://lists.gnu.org/archive/html/qemu-devel/2024-02/msg03486.html

Nice! And thanks for bringing this multi-migration test support to
kvm-unit-tests!

drew

>
> All the weirdness was just symptoms of that. The hang that arm usually
> got was target machine trying to lock the uart spinlock that is already
> locked (because the unlock store got lost in migration).
>
> powerpc and s390x were just luckier in avoiding the race, maybe the way
> their translation blocks around getchar code were constructed made the
> problem not show up easily or at all. I did end up causing problems
> for them by rearranging the code (test case is linked in that msg).
>
> Thanks,
> Nick
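
For readers following the getchar() discussion quoted above, here is a
minimal sketch of the status-bit check the patch description refers to. It
is not the actual kvm-unit-tests code: the function names, macro names, and
the idea of passing a register window as a pointer are assumptions made for
illustration. The register offsets and bit positions are the standard ones
for each device (PL011 flag register at 0x18 with RXFE in bit 4, 16550 line
status register at offset 5 with "data ready" in bit 0), and the ns16550a
version assumes byte-wide registers with no register shift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PL011: UARTFR bit 4 (RXFE) is set while the receive FIFO is empty. */
#define PL011_UARTDR    0x00
#define PL011_UARTFR    0x18
#define PL011_FR_RXFE   (1u << 4)

/* ns16550a: LSR bit 0 (DR) is set while a received byte is waiting. */
#define NS16550A_RBR    0
#define NS16550A_LSR    5
#define NS16550A_LSR_DR (1u << 0)

/*
 * Hypothetical helper: return the next received byte, or -1 if the
 * receive FIFO is empty. The data register is only read after the
 * status bit says a byte is there, so a stale value in UARTDR is
 * never mistaken for new input.
 */
int pl011_getchar(const volatile uint32_t *regs)
{
    assert(regs != NULL);
    if (regs[PL011_UARTFR / 4] & PL011_FR_RXFE)
        return -1;
    return (int)(regs[PL011_UARTDR / 4] & 0xff);
}

/* Hypothetical helper: same idea for the ns16550a, using "data ready". */
int ns16550a_getchar(const volatile uint8_t *regs)
{
    assert(regs != NULL);
    if (!(regs[NS16550A_LSR] & NS16550A_LSR_DR))
        return -1;
    return regs[NS16550A_RBR];
}
```

Because each helper takes the register window as a pointer, it can be
exercised against a plain array that stands in for the device, including
the case where the data register still holds an old byte but the status bit
reports nothing pending.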