On Fri, Oct 13, 2023 at 12:58 PM Krishna Kurapati PSSNV
<quic_kriskura@xxxxxxxxxxx> wrote:
>
>
>
> On 10/14/2023 12:09 AM, Maciej Żenczykowski wrote:
> > On Thu, Oct 12, 2023 at 8:40 AM Krishna Kurapati PSSNV
> > <quic_kriskura@xxxxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 10/12/2023 6:02 PM, Maciej Żenczykowski wrote:
> >>> On Thu, Oct 12, 2023 at 1:48 AM Krishna Kurapati PSSNV
> >>>
> >>> Could you paste the full patch?
> >>> This is hard to review without looking at much more context than email
> >>> is providing
> >>> (or, even better, send me a link to a CL in gerrit somewhere - for
> >>> example aosp ACK mainline tree)
> >>
> >> Sure. Will provide a gerrit on ACK for review before posting v2.
> >>
> >> The intent of posting the diff was twofold:
> >>
> >> 1. The question Greg asked regarding why the max segment size was
> >> limited to 15014 was valid. When I thought about it, I actually wanted
> >> to limit the max MTU to 15000, so the max segment size automatically
> >> needs to be limited to 15014.
> >
> > Note that this is a *very* abstract value.
> > I get you want L3 MTU of 10 * 1500, but this value is not actually meaningful.
> >
> > IPv4/IPv6 fragmentation and IPv4/IPv6 TCP segmentation
> > do not result in a trivial multiplication of the standard 1500 byte
> > ethernet L3 MTU.
> > Indeed aggregating 2 1500 L3 mtu frames results in *different* sized
> > frames depending on which type of aggregation you do.
> > (and for tcp it even depends on the number and size of tcp options,
> > though it is often assumed that those take up 12 bytes, since that's the
> > norm for Linux-to-Linux tcp connections)
> >
> > For example if you aggregate N standard Linux ipv6/tcp L3 1500 mtu frames,
> > this means you have
> > N frames: ethernet (14) + ipv6 (40) + tcp (20) + tcp options (12) +
> > payload (1500-12-20-40=1500-72=1428)
> > post aggregation:
> > 1 frame: ethernet (14) + ipv6 (40) + tcp (20) + tcp options (12) +
> > payload (N*1428)
> >
> > so N * 1500 == N * (72 + 1428) --> 1 * (72 + N * 1428)
> >
> > That value of 72 is instead 52 for 'standard Linux ipv4/tcp',
> > it's 40/60 if there's no tcp options (which I think happens when
> > talking to windows)
> > it's different still with ipv4 fragmentation... and again different
> > with ipv6 fragmentation...
> > etc.
> >
> > ie. 15000 L3 mtu is exactly as meaningless as 14000 L3 mtu.
> > Either way you don't get full frames.
> >
> > As such I'd recommend going with whatever is the largest mtu that can
> > be meaningfully made to fit in 16K with all the NCM header overhead.
> > That's likely closer to 15500-16000 (though I have *not* checked).
> >
> >> But my commit text didn't mention this
> >> properly, which was a mistake on my part. But when I looked at the
> >> code, limiting the max segment size to 15014 would force the practical
> >> max_mtu to not cross 15000, although the theoretical max_mtu was set to
> >> GETHER_MAX_MTU_SIZE (15412) during registration of the net device.
> >>
> >> So my assumption of limiting it to 15000 was wrong. It must be limited
> >> to 15412 as mentioned in u_ether.c. This in turn means we must limit
> >> max_segment_size to:
> >> GETHER_MAX_ETH_FRAME_LEN (GETHER_MAX_MTU_SIZE + ETH_HLEN)
> >> as mentioned in u_ether.c.
> >>
> >> I wanted to confirm that setting MAX_DATAGRAM_SIZE to
> >> GETHER_MAX_ETH_FRAME_LEN was correct.
> >>
> >> 2. I am not actually able to test with MTU beyond 15000.
> >> When my host device is a linux machine, the cdc_ncm.c limits
> >> max_segment_size to:
> >> CDC_NCM_MAX_DATAGRAM_SIZE 8192 /* bytes */
> > In practice you get 50% of the benefits of infinitely large mtu by
> > going from 1500 to ~2980.
> > you get 75% of the benefits by going to ~6K
> > you get 87.5% of the benefits by going to ~12K
> > the benefits of going even higher are smaller and smaller...
>
> > If the host side is limited to 8192, maybe we should match that here too?
>
> Hi Maciej,
>
> Thanks for the detailed explanation. I agree with you on setting the
> device side also to 8192 instead of whatever max_mtu is present in u_ether
> or the practical max segment size possible.
> >
> > But the host side limitation of 8192 doesn't seem particularly sane either...
> > Maybe we should relax that instead?
>
> I really didn't understand why it was set to 8192 in the first place.
>
> > (especially since for things like tcp zero copy you want an mtu which
> > is slightly more than N * 4096,
> > ie. around 4.5KB, 8.5KB, 12.5KB or something like that)
> >
>
> I am not sure about host mode completely. If we want to increase it though,
> would just increasing MAX_DATAGRAM_SIZE to some bigger value help? (I
> don't know the entire code of cdc_ncm, so I might be wrong).
>
> Regards,
> Krishna

Hmm, I'm not sure.
I know I've experimented with high mtu ncm in the past (around 2.5 years ago).
I got it working between my Linux desktop (host) and a Pixel 6
(device/gadget) with absolutely no problems.
I'm pretty sure I didn't change my desktop kernel, so I was probably
limited to 8192 there (and I do more or less remember that).

From what I vaguely remember, it wasn't difficult (at all) to hit
upwards of 7gbps for iperf tests.
I don't remember how close to the theoretical USB 10gbps maximum of
9.7gbps I could get...
[this was never the real bottleneck / issue, so I didn't ever dig
particularly deep]

I'm pretty sure my gadget side changes were non-configurable...
Probably just bumped one or two constants...

I do *very* *vaguely* recall there being some funkiness though, where
8192 was *less* efficient than some slightly smaller value.
If I recall correctly the issue is that 8192 + ethernet overhead + NCM
overhead only fits *once* into 16384, which leaves a lot of space wasted.
While ~7.5 kb + overhead fits twice and is thus a fair bit better.
I don't remember if I found a way to boost the 16384 to double or triple
that. That should have been a win; I can't remember if we were usb3 spec
limited there.
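
For what it's worth, here's a rough back-of-the-envelope sketch of that
packing math. It's standalone userspace C, *not* the actual f_ncm.c
accounting: it assumes a 12 byte NTH16, an 8 byte NDP16 plus one 4 byte
index entry per datagram (plus the null terminating entry), a 14 byte
ethernet header, and it ignores alignment padding, so real numbers will
be slightly worse:

#include <stdio.h>

#define NTB_SIZE	16384	/* the 16K NTB size discussed above */
#define NTH16_LEN	12	/* assumed NTB header size */
#define NDP16_HDR	8	/* assumed NDP16 header size */
#define NDP16_ENTRY	4	/* assumed per-datagram index entry */
#define ETH_HLEN	14

/* how many (ethernet header + mtu) datagrams fit into one NTB */
static int frames_per_ntb(int mtu)
{
	int n = 0;

	/* the extra NDP16_ENTRY accounts for the null terminating entry */
	while (NTH16_LEN + NDP16_HDR + NDP16_ENTRY * (n + 2) +
	       (n + 1) * (ETH_HLEN + mtu) <= NTB_SIZE)
		n++;
	return n;
}

int main(void)
{
	int mtus[] = { 1500, 7500, 8134, 8192, 15000 };
	unsigned i;

	for (i = 0; i < sizeof(mtus) / sizeof(mtus[0]); i++)
		printf("mtu %5d -> %d datagram(s) per 16K NTB\n",
		       mtus[i], frames_per_ntb(mtus[i]));
	return 0;
}

With those (assumed) overheads an 8192 byte datagram fits only once into
a 16384 byte NTB, while anything up to roughly ~8.1K fits twice, which is
probably the funkiness I was remembering.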