On Fri, Mar 25, 2022 at 2:20 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> With the updates to change the size being passed in the splice from
> page_size to pipe_size, this never finished (it would copy around a meg or
> so). And stopped. When I killed the agent-fifo task on the guest, the guest
> hung hard.

Without knowing (or really caring) at all how virtqueue works, this sounds
very much like the classic pipe deadlock where two processes communicate
over a pair of pipes, sending each other commands, and replying to each
other with status updates.

And you absolutely cannot do that if one side can possibly want to fill up
the whole pipe.

Deadlock:

 - process A is trying to send data to process B (on 'pipe_A'), and blocks
   because the pipe is full

 - process B reads the data and everything is fine, and A gets to continue

 - but then process B sends some status update the other way (on 'pipe_B' -
   you can't use the same pipe bidirectionally, which is why you use a pair
   of pipes or a socketpair) and waits for the result.

 - now A and B are both waiting for each other - A is waiting for B to
   empty the big bunch of data it's sending, and B is waiting for the
   result of the (small) command it sent.

and neither makes any progress.

You can find several mentions of these kinds of problems by just googling
for "bidirectional pipe deadlock" or similar.

The solution is invariably to either

 (a) make sure that nobody writes even remotely close to enough data to
     fill a pipe before reading the other pipe (you can still fill up a
     pipe, but at least somebody is always going to succeed, make progress,
     and do the read that lets the other side make progress).

 (b) make sure everybody who writes to a pipe will use nonblocking IO (and
     is willing to do reads in between to satisfy the other end).

That first case is basically what one of the PIPE_BUF guarantees is all
about (the other one is the atomicity it guarantees, ie you can write a
"command packet" and be guaranteed that readers will see it without data
mixed in from other writes).

I have no idea what your client/agent does and how it interacts with the
virtio pipes, but it really _sounds_ like a very similar issue, where it
used to work (because PIPE_BUF) and now no longer does (because pipe
filled).

And that virtio_console __send_control_msg() pattern very much sounds like
a "send data and wait for ACK" behavior of "process B".

              Linus
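
As a minimal sketch of the deadlock pattern described above, using plain
userspace pipes with made-up roles for "A" and "B" (this is an illustration,
not the actual trace-cmd agent or virtio_console code), the following program
hangs forever once the data pipe fills up:

/*
 * "A" (the parent) streams a large buffer down the data pipe.  "B" (the
 * child) sends a small control request the other way and waits for the
 * ack before it starts draining the data.  Once the data pipe is full,
 * A is stuck in write() and never answers the control request, so both
 * sides wait on each other forever.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

#define BIG (1 << 20)	/* 1 MiB: far more than the default 64 KiB pipe buffer */

int main(void)
{
	int data[2], ctrl_req[2], ctrl_ack[2];	/* A->B data, B->A request, A->B ack */
	char *big = malloc(BIG);
	char byte = 'C', buf[4096];

	if (!big || pipe(data) || pipe(ctrl_req) || pipe(ctrl_ack)) {
		perror("setup");
		return 1;
	}
	memset(big, 'x', BIG);

	if (fork() == 0) {
		/* Process B: send a small command and wait for the result
		 * *before* reading any bulk data.  This is the fatal ordering. */
		write(ctrl_req[1], &byte, 1);
		read(ctrl_ack[0], &byte, 1);			/* blocks forever */

		while (read(data[0], buf, sizeof(buf)) > 0)	/* never reached */
			;
		_exit(0);
	}

	/* Process A: push all the bulk data first, answer control requests
	 * afterwards.  The write() blocks once the pipe is full, so the
	 * control request is never serviced. */
	write(data[1], big, BIG);
	read(ctrl_req[0], &byte, 1);
	write(ctrl_ack[1], &byte, 1);

	wait(NULL);
	return 0;
}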
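
And a rough sketch of workaround (b), again with ordinary pipes and a
hypothetical write_all_nonblock() helper rather than anything from the real
agent: the writer switches its outgoing pipe to nonblocking mode and drains
the incoming pipe whenever its own write would block, so neither side can
stall the other:

#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/* Write 'len' bytes to 'out_fd' without ever sleeping on a full pipe;
 * service 'in_fd' (the other direction) whenever we cannot write. */
static int write_all_nonblock(int out_fd, int in_fd, const char *buf, size_t len)
{
	fcntl(out_fd, F_SETFL, fcntl(out_fd, F_GETFL) | O_NONBLOCK);

	while (len) {
		ssize_t n = write(out_fd, buf, len);

		if (n > 0) {
			buf += n;
			len -= n;
			continue;
		}
		if (n < 0 && errno != EAGAIN)
			return -1;

		/* Pipe full: read whatever the other side sent us so it can
		 * make progress, then wait until we can write again. */
		char tmp[4096];
		struct pollfd pfd[2] = {
			{ .fd = in_fd,  .events = POLLIN  },
			{ .fd = out_fd, .events = POLLOUT },
		};

		poll(pfd, 2, -1);
		if (pfd[0].revents & POLLIN)
			read(in_fd, tmp, sizeof(tmp));	/* real code would buffer/handle this */
	}
	return 0;
}

The cost of (b) is that every writer now has to be prepared to consume and
queue traffic from the other direction mid-write, which is exactly why the
"stay under PIPE_BUF" discipline of (a) is usually the simpler fix.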