On Tue, Mar 30, 2021 at 3:42 AM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote:
>
> On Mon, Mar 29, 2021 at 04:22:28PM -0700, Jiang Wang . wrote:
> > On Mon, Mar 29, 2021 at 2:26 AM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote:
> > >
> > > On Fri, Mar 26, 2021 at 04:40:09PM -0700, Jiang Wang . wrote:
> > > > I thought about this and discussed it with my colleague Cong Wang.
> > > > One idea is to make the current asynchronous send_pkt flow synchronous,
> > > > so that if the virtqueue is full, the function can return ENOMEM all
> > > > the way back to the caller, and the caller can check the return value
> > > > of sendmsg and slow down when necessary.
> > > >
> > > > In the spec, we can put something like: if the virtqueue is full, the
> > > > caller should be notified with an error, etc.
> > > >
> > > > In terms of implementation, that means we will remove the current
> > > > send_pkt_work for both stream and dgram sockets. Currently, the
> > > > code path uses RCU and a work queue, then grabs a mutex in the
> > > > work queue function. Since we cannot grab a mutex inside an RCU
> > > > critical section, we would have to change the RCU to a normal
> > > > reference counting mechanism. I think this is doable. The drawback
> > > > is that reference counting in general costs a few more cycles than
> > > > RCU, so there is a small price to pay. Another option is to use
> > > > sleepable RCU (SRCU) and remove the work queue.
> > > >
> > > > What do you guys think?
> > >
> > > I think the tx code path is like this because of reliable delivery.
> > > Maybe a separate datagram rx/tx code path would be simpler?
> >
> > I thought about this too. dgram can have a separate rx/tx path from
> > the stream types. In this case, the_virtio_vsock will still be shared
> > because the virtqueues have to be in one structure. the_virtio_vsock
> > will then be protected by RCU plus a reference count (or a sleepable
> > RCU).
> >
> > In vhost_vsock_dev_release, it will wait for both the RCU grace
> > period and the reference count to drain, and then free the memory.
> > I think this is doable. Let me know if there is a better way to do
> > it. Btw, I think the dgram tx code path will be quite different from
> > stream, but the dgram rx path will be similar to the stream type.
> >
> > > Take the datagram tx virtqueue lock, try to add the packet into the
> > > virtqueue, and return -ENOBUFS if the virtqueue is full. Then use the
> > > datagram socket's sndbuf accounting to prevent queuing up too many
> > > packets. When a datagram tx virtqueue buffer is used by the device,
> > > select queued packets for transmission.
> >
> > I am not sure about the last sentence. In the new design, there will
> > be no other queued packets for dgram (except those in the virtqueue).
> > When a dgram tx virtqueue buffer is freed by the device, the driver
> > side just needs to free some space. There is no need to trigger more
> > transmission.
>
> This approach drops packets when the virtqueue is full. In order to get
> close to UDP semantics I think the following is necessary:
>
> 1. Packets are placed directly into the tx virtqueue when possible.
> 2. Packets are queued up to sndbuf bytes if the tx virtqueue is full.
> 3. When a full tx virtqueue gets some free space again, there is an
>    algorithm that selects from among the queued sockets.
>
> This is how unreliable delivery can be somewhat fair between competing
> threads. I thought you were trying to achieve this (it's similar to
> what UDP does)?

I see. I was thinking of something different, without an extra queue.
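For reference, what I had in mind was roughly the following: try the
virtqueue directly under the tx lock and let sendmsg() see the error
(ENOMEM in my earlier mail, -ENOBUFS as you suggested). This is only a
sketch and untested; VSOCK_VQ_DGRAM_TX and dgram_tx_lock are made-up
names:

/*
 * Sketch only: synchronous dgram send, no work queue and no extra
 * packet queue.  The virtqueue is tried directly and a full ring is
 * reported to the caller.
 */
static int virtio_transport_dgram_send_pkt(struct virtio_vsock *vsock,
					   struct virtio_vsock_pkt *pkt)
{
	struct scatterlist hdr_sg, buf_sg, *sgs[2];
	struct virtqueue *vq = vsock->vqs[VSOCK_VQ_DGRAM_TX];
	int ret;

	sg_init_one(&hdr_sg, &pkt->hdr, sizeof(pkt->hdr));
	sg_init_one(&buf_sg, pkt->buf, pkt->len);
	sgs[0] = &hdr_sg;
	sgs[1] = &buf_sg;

	mutex_lock(&vsock->dgram_tx_lock);
	ret = virtqueue_add_sgs(vq, sgs, 2, 0, pkt, GFP_KERNEL);
	if (ret < 0) {
		/* Ring full: surface the error so sendmsg() sees it. */
		mutex_unlock(&vsock->dgram_tx_lock);
		virtio_transport_free_pkt(pkt);
		return -ENOBUFS;
	}
	virtqueue_kick(vq);
	mutex_unlock(&vsock->dgram_tx_lock);
	return 0;
}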
I think your approach is better. For step 3, I think I will just start
with a simple FIFO algorithm first (rough sketch below the quote).

> Stefan
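To make that concrete, steps 1-3 with the simple FIFO could look
something like this. Again just an untested sketch: dgram_tx_queue,
dgram_tx_lock and virtio_vsock_dgram_add() (a thin GFP_ATOMIC wrapper
around virtqueue_add_sgs()) are invented names:

/* 1-2. Try the ring first; if it is full, park the packet on a FIFO.
 * The socket layer has already charged the packet against sndbuf,
 * which is what bounds this list.
 */
static int virtio_vsock_dgram_send(struct virtio_vsock *vsock,
				   struct virtio_vsock_pkt *pkt)
{
	int ret;

	spin_lock_bh(&vsock->dgram_tx_lock);
	ret = virtio_vsock_dgram_add(vsock, pkt);
	if (ret < 0) {
		list_add_tail(&pkt->list, &vsock->dgram_tx_queue);
		ret = 0;
	}
	spin_unlock_bh(&vsock->dgram_tx_lock);
	return ret;
}

/* 3. Runs when the device marks tx buffers used: move queued packets,
 * oldest first, into the space that was just freed.
 */
static void virtio_vsock_dgram_tx_done(struct virtio_vsock *vsock)
{
	struct virtio_vsock_pkt *pkt, *tmp;

	spin_lock_bh(&vsock->dgram_tx_lock);
	list_for_each_entry_safe(pkt, tmp, &vsock->dgram_tx_queue, list) {
		if (virtio_vsock_dgram_add(vsock, pkt) < 0)
			break;	/* ring full again, keep the rest queued */
		list_del(&pkt->list);
	}
	spin_unlock_bh(&vsock->dgram_tx_lock);
}

A single per-device packet FIFO is not fair between sockets the way
step 3 ultimately wants, but it matches the "simple FIFO first"
starting point; a per-socket round-robin pick could be swapped in
later without changing the enqueue side.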