Hi,

On 10/19/2010 09:10 AM, Amit Shah wrote:
> On (Tue) Oct 19 2010 [08:55:16], Hans de Goede wrote:
>> Hi,
>>
>> On 10/19/2010 07:45 AM, Amit Shah wrote:
>>> If the host is slow in reading data or doesn't read data at all,
>>> blocking write calls not only blocked the program that called write()
>>> but the entire guest itself.
>>>
>>> To overcome this, let's not block till the host signals it has given
>>> back the virtio ring element we passed it. Instead, send the buffer to
>>> the host and return to userspace. This operation then becomes similar
>>> to how non-blocking writes work, so let's use the existing code for this
>>> path as well.
>>>
>>> This code change also ensures blocking write calls do get blocked if
>>> there's not enough room in the virtio ring as well as they don't return
>>> -EAGAIN to userspace.
>>>
>>> Signed-off-by: Amit Shah <amit.shah@xxxxxxxxxx>
>>> CC: stable@xxxxxxxxxx
>>> ---
>>>  drivers/char/virtio_console.c |   17 ++++++++++++++---
>>>  1 files changed, 14 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
>>> index c810481..0f69c5e 100644
>>> --- a/drivers/char/virtio_console.c
>>> +++ b/drivers/char/virtio_console.c
>>> @@ -459,9 +459,12 @@ static ssize_t send_buf(struct port *port, void *in_buf, size_t in_count,
>>>
>>>  	/*
>>>  	 * Wait till the host acknowledges it pushed out the data we
>>> -	 * sent. This is done for ports in blocking mode or for data
>>> -	 * from the hvc_console; the tty operations are performed with
>>> -	 * spinlocks held so we can't sleep here.
>>> +	 * sent. This is done for data from the hvc_console; the tty
>>> +	 * operations are performed with spinlocks held so we can't
>>> +	 * sleep here. An alternative would be to copy the data to a
>>> +	 * buffer and relax the spinning requirement. The downside is
>>> +	 * we need to kmalloc a GFP_ATOMIC buffer each time the
>>> +	 * console driver writes something out.
>>>  	 */
>>>  	while (!virtqueue_get_buf(out_vq, &len))
>>>  		cpu_relax();
>>> @@ -626,6 +629,14 @@ static ssize_t port_fops_write(struct file *filp, const char __user *ubuf,
>>>  		goto free_buf;
>>>  	}
>>>
>>> +	/*
>>> +	 * We now ask send_buf() to not spin for generic ports -- we
>>> +	 * can re-use the same code path that non-blocking file
>>> +	 * descriptors take for blocking file descriptors since the
>>> +	 * wait is already done and we're certain the write will go
>>> +	 * through to the host.
>>> +	 */
>>> +	nonblock = true;
>>>  	ret = send_buf(port, buf, count, nonblock);
>>>
>>>  	if (nonblock && ret > 0)
>>
>> 1) Hmm, this changes the code to kfree the buffer, but only if the send_buf
>> succeeded (which it always should given we did a will_block check first).
>
> The change is to *not* free the buffer. It will be freed later when the
> host indicates it's done with it (happens in reclaim_consumed_buffers()).
>

Ah, thanks for explaining that.

>> I cannot help but notice that the data was not freed on a blocking fd
>> before this patch, but is freed now. And I see nothing in send_buf to make
>> it take ownership of the buffer / free it in the blocking case, and not take
>> ownership in the blocking case. More over if anything I would expect send_buf
>> to take ownership in the non blocking case (as the data is not directly
>> consumed there), and not take owner ship in the blocking case, but the check
>> is the reverse. Also why is the buffer not freed if the write failed, that
>> makes no sense.
>
> The buffer used to be freed in the blocking case, as we knew for certain
> the host was done with the buffer. Now it's not, we'll free it later.
>
>> 2) Assuming that things are changed so that send_buf does take ownership of the
>> buffer in the nonblocking case, shouldn't the buffer then be allocated
>> with GFP_ATOMIC?
>
> Why? We're not called from irq context.
>

Ok, my bad.

>> 3) This patch will cause processes filling the virtqueue fast enough to block
>> to never wake up again, due to a missing waitqueue wakeup, see:
>> https://bugzilla.redhat.com/show_bug.cgi?id=643750
>
> Doesn't happen in my testcase, but this patch shouldn't cause that
> problem if it exists -- it's a problem that exists even now for
> nonblocking ports. So if such a bug exists, it needs to be fixed
> independently.

First of all, let's agree that this is a real problem: there is simply nothing
waking the waitqueue where fops_write (or poll) blocks when buffers become
available in out_vq. It may be hard to come up with a test case which fills the
queue fast enough to hit this scenario, but it is very real.

I agree it is an independent problem and should be fixed in a separate patch,
but that patch should be part of the same set and come *before* this one, as
this patch now extends the problem to ports opened in blocking mode too.

BTW, many thanks for working on this, it is appreciated :)

Regards,

Hans

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/virtualization
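[Note on the reclaim path discussed in point 1. The email names
reclaim_consumed_buffers() but does not show it; the sketch below is a
reconstruction of what such a reclaim loop looks like, assuming the driver's
port->out_vq and an outvq_full flag consulted by the will_block check Hans
mentions. It is illustrative only, not the exact code under review.]

/*
 * Sketch: free output buffers only once the host has handed them back.
 * This is why port_fops_write() must not kfree() a buffer after a
 * successful send_buf() in the non-spinning path -- ownership has moved
 * to the host until virtqueue_get_buf() returns it here.
 */
static void reclaim_consumed_buffers(struct port *port)
{
	void *buf;
	unsigned int len;

	while ((buf = virtqueue_get_buf(port->out_vq, &len))) {
		kfree(buf);
		/* Assumed flag: there is room in the out_vq again, so a
		 * subsequent will_block-style check can let writers through. */
		port->outvq_full = false;
	}
}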
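[Note on the missing wakeup in point 3. One possible shape of a fix, sketched
along the lines of the driver's existing in_intr() handler: give the output
virtqueue a callback that wakes the port's waitqueue whenever the host returns
buffers, so writers blocked in fops_write (and pollers) re-check for space.
The callback wiring at virtqueue-creation time is not shown, and this is a
sketch of the idea Hans asks for, not the patch that was actually posted.]

/*
 * Sketch: wake up anyone waiting for room in the out_vq when the host
 * notifies us that it has consumed buffers.
 */
static void out_intr(struct virtqueue *vq)
{
	struct port *port;

	port = find_port_by_vq(vq->vdev->priv, vq);
	if (!port)
		return;

	/* Blocked writers and pollers sleep on port->waitqueue; wake them
	 * so they can reclaim consumed buffers and retry the write. */
	wake_up_interruptible(&port->waitqueue);
}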