On Tue, Sep 3, 2019 at 6:42 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> So if I understand correctly, what you want is:
>
> 1) epf virtio actually represent a full virtio pci device to the host
> Linux.
> 2) to endpoint Linux, you also want to represent a virtio device (by
> copying data between two vrings) that has its own config ops
>
> This looks feasible but tricky. One part is the feature negotiation. You
> probably need to prepare two sets of features, one for each side. Consider in
> your case, you claim the device to support GUEST_CSUM, but since no
> HOST_CSUM is advertised, neither side will send packets without csum. And
> if you claim HOST_CSUM, you need to deal with the case where one side
> does not support GUEST_CSUM (e.g. checksum by yourself). And things will
> be even more complex for other offloading features. Another part is the
> configuration space. You need to handle the inconsistency between the two
> sides, e.g. one side wants 4 queues but the other only does 1.

You are right about the two bullet points, and also about the two sets of
features. When I put GUEST_CSUM and HOST_CSUM in both devices' features, I
always got errors about packets having an incorrect "total length" in their
IP headers, and there were a number of other problems when I tried to
implement the other kinds of offloading.

I also ran into an inconsistency with the virtio 1.1 spec. According to the
spec, when the legacy interface is used, the virtio_net_hdr and the actual
packet are supposed to go into two separate descriptors in the rx queue.
After a lot of trial and error, it turned out that the packet had to be
placed directly after the virtio_net_hdr struct, in the same descriptor.

Even with that resolved, I still have not addressed the situations where the
two sides have different features. So the solution right now is to hardcode
the features the epf supports in the source code, including the offloading
features, mergeable buffers and the number of queues.
> > Also that design uses the conventional virtio/vhost framework. In this
> > epf, are you implying instead of creating a Device A, create some sort
> > of vhost instead?
>
> Kind of, in order to address the above limitation, you probably want to
> implement a vringh based netdevice and driver. It will work like,
> instead of trying to represent a virtio-net device to endpoint,
> represent a new type of network device, it uses two vringh rings instead
> of virtio rings. The vringh ring is usually used to implement the
> counterpart of virtio driver. The advantages are obvious:
>
> - no need to deal with two sets of features, config space etc.
> - network specific, from the point of endpoint linux, it's not a virtio
> device, no need to care about transport stuffs or embedding internal
> virtio-net specific data structures
> - reuse the existing code (vringh) to avoid duplicated bugs, implementing
> a virtqueue is kind of a challenge

Now I see what you mean. The data copying part stays the same, but it
becomes transparent to the whole vhost/virtio framework: instead of an
epf_virtio_device, I create a new type of network_device built on the vhost
side of things. Yes, that is doable.

There could be performance overhead from using vhost, though. The
epf_virtio_device invokes its callback functions in the most straightforward
way, while in vhost I imagine there is some kind of task
management/scheduling going on. But all of this is conjecture; I will write
the code and see whether throughput really drops.

Thanks for clarifying.

Best,
Haotian