Re: [PATCH v12 5/8] virtio-balloon: VIRTIO_BALLOON_F_SG

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/23/2017 09:45 AM, Michael S. Tsirkin wrote:
On Fri, Jul 14, 2017 at 03:12:43PM +0800, Wei Wang wrote:
On 07/14/2017 04:19 AM, Michael S. Tsirkin wrote:
On Thu, Jul 13, 2017 at 03:42:35PM +0800, Wei Wang wrote:
On 07/12/2017 09:56 PM, Michael S. Tsirkin wrote:
So the way I see it, there are several issues:

- internal wait - forces multiple APIs like kick/kick_sync
     note how kick_sync can fail but your code never checks return code
- need to re-write the last descriptor - might not work
     for alternative layouts which always expose descriptors
     immediately
Probably it wasn't clear. Please let me explain the two functions here:

1) virtqueue_add_chain_desc(vq, head_id, prev_id,..):
grabs a desc from the vq and inserts it to the chain tail (which is indexed
by
prev_id, probably better to call it tail_id). Then, the new added desc
becomes
the tail (i.e. the last desc). The _F_NEXT flag is cleared for each desc
when it's
added to the chain, and set when another desc comes to follow later.
And this only works if there are multiple rings like
avail + descriptor ring.
It won't work e.g. with the proposed new layout where
writing out a descriptor exposes it immediately.
I think it can support the 1.1 proposal, too. But before getting
into that, I think we first need to deep dive into the implementation
and usage of _first/next/last. The usage would need to lock the vq
from the first to the end (otherwise, the returned info about the number
of available desc in the vq, i.e. num_free, would be invalid):

lock(vq);
add_first();
add_next();
add_last();
unlock(vq);

However, I think the case isn't this simple, since we need to check more
things
after each add_xx() step. For example, if only one entry is available at the
time
we start to use the vq, that is, num_free is 0 after add_first(), we
wouldn't be
able to add_next and add_last. So, it would work like this:

start:
     ...get free page block..
     lock(vq)
retry:
     ret = add_first(..,&num_free,);
     if(ret == -ENOSPC) {
         goto retry;
     } else if (!num_free) {
         add_chain_head();
         unlock(vq);
         kick & wait;
         goto start;
     }
next_one:
     ...get free page block..
     add_next(..,&num_free,);
     if (!num_free) {
         add_chain_head();
         unlock(vq);
         kick & wait;
         goto start;
     } if (num_free == 1) {
         ...get free page block..
         add_last(..);
         unlock(vq);
         kick & wait;
         goto start;
     } else {
         goto next_one;
     }

The above seems unnecessary to me to have three different APIs.
That's the reason to combine them into one virtqueue_add_chain_desc().

-- or, do you have a different thought about using the three APIs?


Implementation Reference:

struct desc_iterator {
     unsigned int head;
     unsigned int tail;
};

add_first(*vq, *desc_iterator, *num_free, ..)
{
     if (vq->vq.num_free < 1)
         return -ENOSPC;
     get_desc(&desc_id);
     desc[desc_id].flag &= ~_F_NEXT;
     desc_iterator->head = desc_id
     desc_iterator->tail = desc_iterator->head;
     *num_free = vq->vq.num_free;
}

add_next(vq, desc_iterator, *num_free,..)
{
     get_desc(&desc_id);
     desc[desc_id].flag &= ~_F_NEXT;
     desc[desc_iterator.tail].next = desc_id;
     desc[desc_iterator->tail].flag |= _F_NEXT;
     desc_iterator->tail = desc_id;
     *num_free = vq->vq.num_free;
}

add_last(vq, desc_iterator,..)
{
     get_desc(&desc_id);
     desc[desc_id].flag &= ~_F_NEXT;
     desc[desc_iterator.tail].next = desc_id;
     desc_iterator->tail = desc_id;

     add_chain_head(); // put the desc_iterator.head to the ring
}


Best,
Wei
OK I thought this over. While we might need these new APIs in
the future, I think that at the moment, there's a way to implement
this feature that is significantly simpler. Just add each s/g
as a separate input buffer.


Should it be an output buffer? I think output means from the
driver to device (i.e. DMA_TO_DEVICE).


This needs zero new APIs.

I know that follow-up patches need to add a header in front
so you might be thinking: how am I going to add this
header? The answer is quite simple - add it as a separate
out header.

Host will be able to distinguish between header and pages
by looking at the direction, and - should we want to add
IN data to header - additionally size (<4K => header).


I think this works fine when the cmdq is only used for
reporting the unused pages. It would be an issue
if there are other usages (e.g. report memory statistics)
interleaving. I think one solution would be to lock the cmdq until
a cmd usage is done ((e.g. all the unused pages are reported) ) -
in this case, the periodically updated guest memory statistics
may be delayed for a while occasionally when live migration starts.
Would this be acceptable? If not, probably we can have the cmdq
for one usage only.


Best,
Wei

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]
  Powered by Linux