On Wed, May 15, 2024 at 8:54 PM Jiri Pirko <jiri@xxxxxxxxxxx> wrote:
>
> Wed, May 15, 2024 at 12:12:51PM CEST, jiri@xxxxxxxxxxx wrote:
> >Wed, May 15, 2024 at 10:20:04AM CEST, mst@xxxxxxxxxx wrote:
> >>On Wed, May 15, 2024 at 09:34:08AM +0200, Jiri Pirko wrote:
> >>> Fri, May 10, 2024 at 01:27:08PM CEST, mst@xxxxxxxxxx wrote:
> >>> >On Fri, May 10, 2024 at 01:11:49PM +0200, Jiri Pirko wrote:
> >>> >> Fri, May 10, 2024 at 12:52:52PM CEST, mst@xxxxxxxxxx wrote:
> >>> >> >On Fri, May 10, 2024 at 12:37:15PM +0200, Jiri Pirko wrote:
> >>> >> >> Thu, May 09, 2024 at 04:28:12PM CEST, mst@xxxxxxxxxx wrote:
> >>> >> >> >On Thu, May 09, 2024 at 03:31:56PM +0200, Jiri Pirko wrote:
> >>> >> >> >> Thu, May 09, 2024 at 02:41:39PM CEST, mst@xxxxxxxxxx wrote:
> >>> >> >> >> >On Thu, May 09, 2024 at 01:46:15PM +0200, Jiri Pirko wrote:
> >>> >> >> >> >> From: Jiri Pirko <jiri@xxxxxxxxxx>
> >>> >> >> >> >>
> >>> >> >> >> >> Add support for Byte Queue Limits (BQL).
> >>> >> >> >> >>
> >>> >> >> >> >> Signed-off-by: Jiri Pirko <jiri@xxxxxxxxxx>
> >>> >> >> >> >
> >>> >> >> >> >Can we get more detail on the benefits you observe etc?
> >>> >> >> >> >Thanks!
> >>> >> >> >>
> >>> >> >> >> More info about BQL in general is here:
> >>> >> >> >> https://lwn.net/Articles/469652/
> >>> >> >> >
> >>> >> >> >I know about BQL in general. We discussed BQL for virtio in the
> >>> >> >> >past; mostly I got feedback from the net core maintainers that it
> >>> >> >> >likely won't benefit virtio.
> >>> >> >>
> >>> >> >> Do you have a link to that, or is it this thread:
> >>> >> >> https://lore.kernel.org/netdev/21384cb5-99a6-7431-1039-b356521e1bc3@xxxxxxxxxx/
> >>> >> >
> >>> >> >A quick search on lore turned up this, for example:
> >>> >> >https://lore.kernel.org/all/a11eee78-b2a1-3dbc-4821-b5f4bfaae819@xxxxxxxxx/
> >>> >>
> >>> >> It says:
> >>> >> "Note that NIC with many TX queues make BQL almost useless, only adding extra
> >>> >> overhead."
> >>> >>
> >>> >> But virtio can have a single TX queue, which I guess could be quite a
> >>> >> common configuration in a lot of deployments.
> >>> >
> >>> >Not sure we should worry about performance for these, though.
> >>> >What I am saying is that this should come with some benchmarking
> >>> >results.
> >>>
> >>> I did some measurements with vDPA, backed by a ConnectX-6 Dx NIC, with
> >>> a single queue pair:
> >>>
> >>> super_netperf 200 -H $ip -l 45 -t TCP_STREAM &
> >>> nice -n 20 netperf -H $ip -l 10 -t TCP_RR
> >>>
> >>> RR result with no bql:
> >>> 29.95
> >>> 32.74
> >>> 28.77
> >>>
> >>> RR result with bql:
> >>> 222.98
> >>> 159.81
> >>> 197.88
> >>>
> >>
> >>Okay. And on the other hand, is there any measurable degradation with
> >>multiqueue and when testing throughput?
> >
> >With multiqueue it depends on whether the flows hit the same queue or
> >not. If they do, we will likely see the same results.
>
> RR 1q, w/o bql:
> 29.95
> 32.74
> 28.77
>
> RR 1q, with bql:
> 222.98
> 159.81
> 197.88
>
> RR 4q, w/o bql:
> 355.82
> 364.58
> 233.47
>
> RR 4q, with bql:
> 371.19
> 255.93
> 337.77
>
> So the answer to your question is: "no measurable degradation with 4
> queues".

Thanks, but I think we also need benchmarks in cases other than vDPA.
For example, a simple virtualization setup.
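
For context, the mechanics being discussed are small: a BQL conversion
boils down to bracketing a driver's TX path with two accounting calls,
netdev_tx_sent_queue() on submission and netdev_tx_completed_queue() on
completion (plus netdev_tx_reset_queue() when the ring is torn down).
The sketch below is hand-written for illustration and is not the actual
virtio-net patch; my_xmit() and my_clean_tx() are hypothetical stand-ins
for a driver's xmit and TX-completion handlers.

#include <linux/netdevice.h>
#include <linux/skbuff.h>

static netdev_tx_t my_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct netdev_queue *txq;

	txq = netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));

	/* ... post the skb to the hardware/virtqueue ring here ... */

	/* Account the in-flight bytes; once the dynamically computed
	 * limit is reached, BQL stops the queue so completions drain
	 * before more data piles up in the ring.
	 */
	netdev_tx_sent_queue(txq, skb->len);

	return NETDEV_TX_OK;
}

/* Called from the TX-completion path (e.g. NAPI poll) after reclaiming
 * descriptors; pkts/bytes are the totals reclaimed in this batch.
 */
static void my_clean_tx(struct net_device *dev, unsigned int qidx,
			unsigned int pkts, unsigned int bytes)
{
	struct netdev_queue *txq = netdev_get_tx_queue(dev, qidx);

	/* BQL adapts its byte limit from the completion rate and
	 * restarts the queue if it was stopped.
	 */
	netdev_tx_completed_queue(txq, pkts, bytes);
}

The RR improvement reported above is the classic BQL effect: with a
single queue and 200 competing bulk streams, this feedback loop keeps
the device ring short, so the latency-sensitive TCP_RR transactions are
not queued behind a deep ring of bulk-stream bytes.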