On Tue, May 18, 2021 at 6:02 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
>
> On Mon, May 17, 2021 at 11:33:06PM -0700, Jiang Wang . wrote:
> >On Mon, May 17, 2021 at 4:02 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
> >>
> >> On Fri, May 14, 2021 at 11:55:29AM -0700, Jiang Wang . wrote:
> >> >On Fri, May 14, 2021 at 8:17 AM Stefano Garzarella <sgarzare@xxxxxxxxxx> wrote:
> >> >> On Thu, May 13, 2021 at 04:26:03PM -0700, Jiang Wang . wrote:
> >>
> >> [...]
> >>
> >> >> >I see. I will add some limit to dgram packets. Also, when the
> >> >> >virtqueues
> >> >> >are shared between stream and dgram, both of them need to grab a lock
> >> >> >before using the virtqueue, so one will not completely block another one.
> >> >>
> >> >> I'm not worried about the concurrent access that we definitely need to
> >> >> handle with a lock, but more about the uncontrolled packet sending that
> >> >> dgram might have, flooding the queues and preventing others from
> >> >> communicating.
> >> >
> >> >That is a valid concern. Let me explain how I would handle that if we
> >> >don't add two new virtqueues. For dgram, I also add a dgram_send_pkt_list,
> >> >which is similar to send_pkt_list for stream (and seqpacket). But there
> >> >is one difference. The dgram_send_pkt_list has a maximum size setting,
> >> >and keep tracking how many pkts are in the list. The track number
> >> >(dgram_send_pkt_list_size) is increased when a packet is added
> >> >to the list and is decreased when a packet
> >> >is removed from the list and added to the virtqueue. In
> >> >virtio_transport_send_pkt, if the current
> >> >dgram_send_pkt_list_size is equal
> >> >to the maximum ( let's say 128), then it will not add to the
> >> >dgram_send_pkt_list and return an error to the application.
> >>
> >> For stream socket, we have the send_pkt_list and the send worker because
> >> the virtqueue can be full and the transmitter needs to wait available
> >> slots, because we can't discard packets.
> >>
> >> For dgram I think we don't need this, so we can avoid the
> >> dgram_send_pkt_list and directly enqueue packets in the virtqueue.
> >>
> >> If there are no slots available, we can simply drop the packet.
> >> In this way we don't have to put limits other than the available space
> >> in the virtqueue.
> >
> >I considered this approach before. One question not clear to me is
> >whether we should call virtqueue_kick for every dgram packet. If I
> >am not wrong, each virtqueue_kick will lead to a vm_exit to the host.
> >If we call virtqueue_kick() for each dgram packet, the performance
> >might be bad when there are lots of packets.
>
> Not every virtqueue_kick() will lead to a vm_exit. If the host (see
> vhost_transport_do_send_pkt()) is handling the virtqueue, it disables
> the notification through vhost_disable_notify().

Got it.

> >One idea is to set
> >a threshold and a timer to call virtqueue_kick(). When there are
> >lots of packets, we wait until a threshold. When there are few packets,
> >the timer will make sure those packets will be delivered not too late.
>
> I think is better to keep the host polling a bit if we want to avoid
> some notifications (see vhost_net_busy_poll()).

Sure.

> >
> >Any other ideas for virtqueue_kick? If the threshold plus timer is fine,
> >I can go this direction too.
>
> I think is better to follow vhost_net_busy_poll() approach.

OK. I will follow the vhost_net_busy_poll() approach.
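
To be concrete, with your suggestion the guest-side dgram send path would
reduce to roughly the sketch below. This is only a sketch, not the actual
patch: the function name, error code and details are illustrative.

static int virtio_transport_send_dgram_pkt(struct virtio_vsock *vsock,
                                           struct virtio_vsock_pkt *pkt)
{
        struct virtqueue *vq = vsock->vqs[VSOCK_VQ_TX];
        struct scatterlist hdr, buf, *sgs[2];
        int out_sg = 0;
        int ret;

        sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr));
        sgs[out_sg++] = &hdr;
        if (pkt->buf) {
                sg_init_one(&buf, pkt->buf, pkt->len);
                sgs[out_sg++] = &buf;
        }

        mutex_lock(&vsock->tx_lock);

        ret = virtqueue_add_sgs(vq, sgs, out_sg, 0, pkt, GFP_KERNEL);
        if (ret < 0) {
                /* No slot available: simply drop the datagram, so no
                 * dgram_send_pkt_list and no extra accounting are needed.
                 */
                mutex_unlock(&vsock->tx_lock);
                virtio_transport_free_pkt(pkt);
                return -ENOBUFS;
        }

        /* Most of these kicks should become no-ops while the host is busy
         * polling, since it disables notifications via vhost_disable_notify().
         */
        virtqueue_kick(vq);

        mutex_unlock(&vsock->tx_lock);
        return 0;
}

The only back-pressure left on the send side is then the available space in
the TX virtqueue itself.
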
> >
> >> >
> >> >In this way, the number of pending dgram pkts to be sent is limited.
> >> >Then both stream and dgram sockets will compete to hold a lock
> >> >for the tx virtqueue. Depending on the linux scheduler, this competition
> >> >will be somewhat fair. As a result, dgram will not block stream completely.
> >> >It will compete and share the virtqueue with stream, but stream
> >> >can still send some pkts.
> >> >
> >> >Basically, the virtqueue becomes a first-come first-serve resource for
> >> >the stream and dgram. When both stream and dgram applications
> >> >have lots of data to send, dgram_send_pkt_list and send_pkt_list
> >> >will still be a limited size, and each will have some chance to send out
> >> >the data via virtqueue. Does this address your concern?
> >> >
> >> >
> >> >> So having 2 dedicated queues could avoid a credit mechanism at all for
> >> >> connection-less sockets, and simply the receiver discards packets that
> >> >> it can't handle.
> >> >
> >> >With 2 dedicated queues, we still need some kind of accounting for
> >> >the dgram. Like the dgram_send_pkt_size I mentioned above. Otherwise,
> >> >it will cause OOM. It is not a real credit mechanism, but code is similar
> >> >with or without 2 dedicated queues in my current prototype.
> >>
> >> I think in both cases we don't need an accounting, but we can drop
> >> packets if the virtqueue is full.
> >>
> >> It's still not clear to me where OOM may occur. Can you elaborate on
> >> this?
> >
> >OOM depending on the implementation details. If we keep
> >dgram_send_pkt_list, and put no limitation, it may increase indefinitely
> >and cause OOM. As long as we have some limitations or drop packets
> >quickly, then we should be fine.
>
> Sure, with the extra list we need some limitations.
>
> >
> >> >
> >> >For receiver discarding packets part, could you explain more? I think
> >> >receiver discarding pkts code also is same with or without 2 dedicated
> >> >queues.
> >>
> >> Yep.
> >>
> >> > Both will use can_enqueue = virtio_transport_inc_rx_pkt(vvs, pkt);
> >> >to check if a packet should be discarded or not.
> >>
> >> I don't think we can use virtio_transport_inc_rx_pkt() for dgram.
> >> This is based on the credit mechanism (e.g. buf_alloc) that works when
> >> you have a connection oriented socket.
> >>
> >> With dgram you can have multiple transmitter for a single socket, so how
> >> we can exchange that information?
> >>
> >In my view, we don't need to exchange that information. Buf_alloc for
> >dgram is local to a receiving socket. The sending sockets do not know
> >about that. For receiving socket, it just checks if there is still buffer
> >available to decide to drop a packet or not. If multiple transmitter
> >send to a single socket, they will share the same buf_alloc, and packets
> >will be dropped randomly for those transmitters.
>
> I see, but if we reuse skbuff I think we don't need this.
>
> >
> >> If we reuse the VMCI implementation with skbuff, the sk_receive_skb()
> >> already checks if the sk_rcvbuf is full and discards the packet in
> >> this
> >> case, so we don't need an internal rx_queue.
> >
> >OK.
> >
> >> Maybe someday we should convert stream to skbuff too, it would make our
> >> lives easier.
> >>
> >Yeah. I remember the original virtio vsock patch did use skb, but somehow
> >it did not use skb anymore when merging to the upstream, not sure why.
>
> Really? I didn't know, then I'll take a look. If you have any links,
> please send :-)

No. I take it back. The skb is only used for dgram, not for stream sockets.
Sorry for the confusion.
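
For what it's worth, the skb-based dgram receive path I am picturing
(borrowing from the VMCI transport, as you suggested) is roughly the sketch
below. The socket lookup helper name is just a placeholder; assume it
returns the bound socket with a reference held.

static void virtio_transport_recv_dgram(struct sk_buff *skb,
                                        struct sockaddr_vm *dst)
{
        struct sock *sk;

        sk = vsock_find_bound_dgram_socket(dst);        /* placeholder name */
        if (!sk) {
                kfree_skb(skb);
                return;
        }

        /* sk_receive_skb() drops the skb by itself when the socket receive
         * buffer (sk_rcvbuf) is full and releases the socket reference, so
         * no internal rx_queue and no buf_alloc /
         * virtio_transport_inc_rx_pkt() accounting are needed for dgram.
         */
        sk_receive_skb(sk, skb, 0);
}

With that, sk_rcvbuf becomes the only receive-side limit, and multiple
transmitters to the same socket simply share it.
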
The original RFC code I found is not in an easy-to-read format. Anyway, here
is the link if you are interested:

https://kvm.vger.kernel.narkive.com/BMvH9eEr/rfc-v2-0-7-introduce-vm-sockets-virtio-transport#post1

btw: as major parts of the spec are kind of settled, I will send my current
POC code to the mailing list just to have a reference and see if any other
major changes are needed. I will add the to-do's in the cover letter. Will
update the spec too.

> Thanks,
> Stefano
>
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization