On Fri, Oct 13, 2023 at 12:58 PM Krishna Kurapati PSSNV
<quic_kriskura@xxxxxxxxxxx> wrote:
>
>
>
> On 10/14/2023 12:09 AM, Maciej Żenczykowski wrote:
> > On Thu, Oct 12, 2023 at 8:40 AM Krishna Kurapati PSSNV
> > <quic_kriskura@xxxxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 10/12/2023 6:02 PM, Maciej Żenczykowski wrote:
> >>> On Thu, Oct 12, 2023 at 1:48 AM Krishna Kurapati PSSNV
> >>>
> >>> Could you paste the full patch?
> >>> This is hard to review without looking at much more context than email
> >>> is providing
> >>> (or, even better, send me a link to a CL in gerrit somewhere - for
> >>> example aosp ACK mainline tree)
> >>
> >> Sure. Will provide a gerrit on ACK for review before posting v2.
> >>
> >> The intent of posting the diff was twofold:
> >>
> >> 1. The question Greg asked regarding why the max segment size was
> >> limited to 15014 was valid. When I thought about it, I actually wanted
> >> to limit the max MTU to 15000, so the max segment size automatically
> >> needs to be limited to 15014.
> >
> > Note that this is a *very* abstract value.
> > I get you want L3 MTU of 10 * 1500, but this value is not actually meaningful.
> >
> > IPv4/IPv6 fragmentation and IPv4/IPv6 TCP segmentation
> > do not result in a trivial multiplication of the standard 1500 byte
> > ethernet L3 MTU.
> > Indeed aggregating 2 1500 L3 mtu frames results in *different* sized
> > frames depending on which type of aggregation you do.
> > (and for tcp it even depends on the number and size of tcp options,
> > though it is often assumed that those take up 12 bytes, since that's the
> > norm for Linux-to-Linux tcp connections)
> >
> > For example if you aggregate N standard Linux ipv6/tcp L3 1500 mtu frames,
> > this means you have
> > N frames: ethernet (14) + ipv6 (40) + tcp (20) + tcp options (12) +
> > payload (1500-12-20-40=1500-72=1428)
> > post aggregation:
> > 1 frame: ethernet (14) + ipv6 (40) + tcp (20) + tcp options (12) +
> > payload (N*1428)
> >
> > so N * 1500 == N * (72 + 1428) --> 1 * (72 + N * 1428)
> >
> > That value of 72 is instead 52 for 'standard Linux ipv4/tcp',
> > it's 40/60 if there's no tcp options (which I think happens when
> > talking to windows)
> > it's different still with ipv4 fragmentation... and again different
> > with ipv6 fragmentation...
> > etc.
> >
> > ie. 15000 L3 mtu is exactly as meaningless as 14000 L3 mtu.
> > Either way you don't get full frames.
> >
> > As such I'd recommend going with whatever is the largest mtu that can
> > be meaningfully made to fit in 16K with all the NCM header overhead.
> > That's likely closer to 15500-16000 (though I have *not* checked).
> >
> >> But my commit text didn't mention this
> >> properly, which was a mistake on my part. But when I looked at the
> >> code, limiting the max segment size to 15014 would force the practical
> >> max_mtu to not cross 15000, although the theoretical max_mtu was set to
> >> GETHER_MAX_MTU_SIZE (15412) during registration of the net device.
> >>
> >> So my assumption of limiting it to 15000 was wrong. It must be limited
> >> to 15412 as mentioned in u_ether.c. This in turn means we must limit
> >> max_segment_size to:
> >> GETHER_MAX_ETH_FRAME_LEN (GETHER_MAX_MTU_SIZE + ETH_HLEN)
> >> as mentioned in u_ether.c.
> >>
> >> I wanted to confirm that setting MAX_DATAGRAM_SIZE to
> >> GETHER_MAX_ETH_FRAME_LEN was correct.
> >>
> >> 2. I am not actually able to test with MTU beyond 15000.
> >> When my host device is a linux machine, the cdc_ncm.c limits
> >> max_segment_size to:
> >> CDC_NCM_MAX_DATAGRAM_SIZE 8192 /* bytes */
> > In practice you get 50% of the benefits of infinitely large mtu by
> > going from 1500 to ~2980.
> > you get 75% of the benefits by going to ~6K
> > you get 87.5% of the benefits by going to ~12K
> > the benefits of going even higher are smaller and smaller...
>
> > If the host side is limited to 8192, maybe we should match that here too?
>
> Hi Maciej,
>
> Thanks for the detailed explanation. I agree with you on setting the
> device side also to 8192 instead of whatever max_mtu is present in u_ether
> or the practical max segment size possible.
> >
> > But the host side limitation of 8192 doesn't seem particularly sane either...
> > Maybe we should relax that instead?
>
> I really didn't understand why it was set to 8192 in the first place.
>
> > (especially since for things like tcp zero copy you want an mtu which
> > is slightly more than N * 4096,
> > ie. around 4.5KB, 8.5KB, 12.5KB or something like that)
> >
>
> I am not sure about host mode completely. If we want to increase it though,
> would just increasing MAX_DATAGRAM_SIZE to some bigger value help? (I
> don't know the entire code of cdc_ncm, so I might be wrong).
>
> Regards,
> Krishna

Hmm, I'm not sure.
I know I've experimented with high mtu ncm in the past (around 2.5 years ago).
I got it working between my Linux desktop (host) and a Pixel 6
(device/gadget) with absolutely no problems.
I'm pretty sure I didn't change my desktop kernel, so I was probably
limited to 8192 there (and I do more or less remember that).

From what I vaguely remember, it wasn't difficult (at all) to hit
upwards of 7gbps for iperf tests.
I don't remember how close to the theoretical USB 10gbps maximum of
9.7gbps I could get...
[this was never the real bottleneck / issue, so I didn't ever dig
particularly deep]

I'm pretty sure my gadget side changes were non-configurable...
Probably just bumped one or two constants...

I do *very* *vaguely* recall there being some funkiness though, where
8192 was *less* efficient than some slightly smaller value.
If I recall correctly the issue is that 8192 + ethernet overhead + NCM
overhead only fits *once* into 16384, which leaves a lot of space wasted.
While ~7.5 kb + overhead fits twice and is thus a fair bit better.
I don't remember if I found a way to boost the 16384 to double or triple
that. That should have been a win; I can't remember if we were usb3 spec
limited there.
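
For what it's worth, here's a rough back-of-the-envelope sketch of that
packing math. It's standalone userspace C, *not* the actual f_ncm.c
accounting: it assumes a 12 byte NTH16, an 8 byte NDP16 plus one 4 byte
index entry per datagram (plus the null terminating entry), a 14 byte
ethernet header, and it ignores alignment padding, so real numbers will
be slightly worse:

#include <stdio.h>

#define NTB_SIZE	16384	/* the 16K NTB size discussed above */
#define NTH16_LEN	12	/* assumed NTB header size */
#define NDP16_HDR	8	/* assumed NDP16 header size */
#define NDP16_ENTRY	4	/* assumed per-datagram index entry */
#define ETH_HLEN	14

/* how many (ethernet header + mtu) datagrams fit into one NTB */
static int frames_per_ntb(int mtu)
{
	int n = 0;

	/* the extra NDP16_ENTRY accounts for the null terminating entry */
	while (NTH16_LEN + NDP16_HDR + NDP16_ENTRY * (n + 2) +
	       (n + 1) * (ETH_HLEN + mtu) <= NTB_SIZE)
		n++;
	return n;
}

int main(void)
{
	int mtus[] = { 1500, 7500, 8134, 8192, 15000 };
	unsigned i;

	for (i = 0; i < sizeof(mtus) / sizeof(mtus[0]); i++)
		printf("mtu %5d -> %d datagram(s) per 16K NTB\n",
		       mtus[i], frames_per_ntb(mtus[i]));
	return 0;
}

With those (assumed) overheads an 8192 byte datagram fits only once into
a 16384 byte NTB, while anything up to roughly ~8.1K fits twice, which is
probably the funkiness I was remembering.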