Re: [PATCH net-next v2 3/4] virtio-net: auto-tune mergeable rx buffer size for improved performance

Michael Dalton <mwdalton@xxxxxxxxxx> · Wed, 8 Jan 2014 19:16:18 -0800

Hi Michael,

On Wed, Jan 8, 2014 at 5:42 PM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> Sorry that I didn't notice early, but there seems to be a bug here.
> See below.
Yes, that is definitely a bug. The virtio spec permits out-of-order (OOO)
completions, but the current code assumes in-order completion. Thanks for
catching this.

> Don't need full int really, it's up to 4K/cache line size,
> 1 byte would be enough, maximum 2 ...
> So if all we want is extra 1-2 bytes per buffer, we don't really
> need this extra level of indirection I think.
> We can just allocate them before the header together with an skb.
I'm not sure if I'm parsing the above correctly, but do you mean using a
few bytes at the beginning of the packet buffer to store truesize? I
think that will break Jason's virtio-net RX frag coalescing
code. To coalesce consecutive RX packet buffers, our packet buffers must
be physically adjacent, and any extra bytes before the start of the
buffer would break that.
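
To make the adjacency requirement concrete, here is a minimal userspace
sketch of the kind of check frag coalescing relies on. It is modeled loosely
on the page/offset comparison the kernel uses when deciding whether a new
fragment can be merged into the previous one; `struct frag` and
`frag_can_coalesce` are hypothetical names for illustration, not the driver's
actual types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb fragment: a page plus a byte range. */
struct frag {
	const void *page;   /* backing page of the fragment */
	size_t offset;      /* start of the data within the page */
	size_t size;        /* length of the data in bytes */
};

/*
 * A new buffer can be merged into the last fragment only if it lives in
 * the same page and starts exactly where the last fragment ends.
 */
static bool frag_can_coalesce(const struct frag *last,
			      const void *page, size_t offset)
{
	return last->page == page && offset == last->offset + last->size;
}
```

With back-to-back 1536-byte buffers the second buffer starts at offset 1536
and the check passes. If each buffer were preceded by, say, 2 bytes of
truesize metadata, the second buffer's data would start at offset 1538 and
the check would fail, so no coalescing would happen.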

We could allocate an SKB per packet buffer, but if we have multi-buffer
packets often (e.g., netperf benefiting from GSO/GRO), we would be
allocating 1 SKB per packet buffer instead of 1 SKB per MAX_SKB_FRAGS
buffers. How do you feel about any of the below alternatives:

(1) Modify the existing mrg_buf_ctx to chain together free entries
We can use the 'buf' pointer in mergeable_receive_buf_ctx to chain
together free entries so that we can support OOO completions. This would
be similar to how the virtqueue code manages free sg entries.
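
A rough userspace sketch of what (1) could look like, assuming a fixed-size
context ring. The struct layout, RING_SIZE, and the ctx_* helper names are
hypothetical and chosen only for illustration; the point is that a free
entry reuses its 'buf' field as the link to the next free entry, so entries
can be returned in any order.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8	/* hypothetical ring size for illustration */

/* Hypothetical stand-in for mergeable_receive_buf_ctx. */
struct buf_ctx {
	void *buf;		/* in use: packet buffer; free: next free entry */
	unsigned int truesize;	/* only meaningful while the entry is in use */
};

struct ctx_ring {
	struct buf_ctx entries[RING_SIZE];
	struct buf_ctx *free_head;	/* NULL when every entry is in use */
};

/* Chain all entries into one free list through their 'buf' fields. */
static void ctx_ring_init(struct ctx_ring *r)
{
	size_t i;

	for (i = 0; i + 1 < RING_SIZE; i++)
		r->entries[i].buf = &r->entries[i + 1];
	r->entries[RING_SIZE - 1].buf = NULL;
	r->free_head = &r->entries[0];
}

/* Pop the head of the free list and fill it in; NULL if the ring is full. */
static struct buf_ctx *ctx_alloc(struct ctx_ring *r, void *buf,
				 unsigned int truesize)
{
	struct buf_ctx *c = r->free_head;

	if (!c)
		return NULL;
	r->free_head = c->buf;
	c->buf = buf;
	c->truesize = truesize;
	return c;
}

/* Push an entry back onto the free list; completion order does not matter. */
static void ctx_free(struct ctx_ring *r, struct buf_ctx *c)
{
	c->buf = r->free_head;
	r->free_head = c;
}
```

Because ctx_free() only touches the entry being returned and the list head,
a buffer completed out of order does not disturb the metadata of buffers
that are still outstanding.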

(2) Combine the buffer pointer and truesize into a single void* value
Your point about there only being a byte needed to encode truesize is
spot on, and I think we could leverage this to eliminate the out-of-band
metadata ring entirely. If we were willing to change the packet buffer
alignment from L1_CACHE_BYTES to 256 (or max(256, L1_CACHE_BYTES)), we
could encode the truesize in the least significant 8 bits of the buffer
address (encoded as truesize >> 8, as we know all sizes are a multiple
of 256). This would allow packet buffers of just under 64KB in length
(255 * 256 = 65280 bytes).
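
A minimal userspace sketch of the encoding in (2), assuming 256-byte buffer
alignment and truesize values that are multiples of 256. BUF_ALIGN,
buf_to_ctx, ctx_to_buf, and ctx_to_truesize are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BUF_ALIGN	256u	/* assumed packet buffer alignment */
#define BUF_ALIGN_SHIFT	8	/* log2(BUF_ALIGN) */

/* Pack truesize (in 256-byte units) into the low 8 bits of the address. */
static void *buf_to_ctx(void *buf, unsigned int truesize)
{
	uintptr_t units = truesize >> BUF_ALIGN_SHIFT;

	/* The low 8 address bits must be free, and the size must fit in them. */
	assert(((uintptr_t)buf & (BUF_ALIGN - 1)) == 0);
	assert(truesize % BUF_ALIGN == 0 && units < BUF_ALIGN);
	return (void *)((uintptr_t)buf | units);
}

/* Recover the buffer address by masking off the low 8 bits. */
static void *ctx_to_buf(void *ctx)
{
	return (void *)((uintptr_t)ctx & ~(uintptr_t)(BUF_ALIGN - 1));
}

/* Recover truesize by scaling the low 8 bits back up to bytes. */
static unsigned int ctx_to_truesize(void *ctx)
{
	return (unsigned int)((uintptr_t)ctx & (BUF_ALIGN - 1))
		<< BUF_ALIGN_SHIFT;
}
```

The single void* returned by buf_to_ctx() is what would be handed to the
virtqueue as the per-buffer token, so no separate metadata entry is needed
and completion order no longer matters. With this exact encoding the largest
representable truesize is 255 * 256 = 65280 bytes; storing
(truesize >> 8) - 1 instead would cover 256 through 65536 bytes at the cost
of not being able to represent zero.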

Is there another approach you would prefer to any of these? If the
cleanliness issues and larger alignment aren't too bad, I think (2)
sounds promising and would allow us to eliminate the metadata ring
entirely while still permitting RX frag coalescing.

Best,

Mike