Link layer (line coding) overhead depends on the generation:

  USB3.0 / USB3.1 gen1 / USB3.2 gen 1x1 (5 Gbps):
    overhead is upwards of 20% (8b10b coding is 80% efficient).
  USB3.1 gen2 / USB3.2 gen 2x1 (10 Gbps):
    overhead is upwards of 3% (128b132b coding is nearly 97% efficient).

however:

  USB3.2 gen 1x2 (10 Gbps):
    overhead is again 20% (this is 8b10b on two 5 Gbps links in one cable).
  USB3.2 gen 2x2 (20 Gbps):
    overhead is again 3% (this is 128b132b on two 10 Gbps links in one cable).

On top of that you need to layer USB protocol overhead (the above is just
link layer overhead).

AFAICT for optimal transfer you need to send 16 KiB chunks, which get split
into 16 pieces of 1 KiB each. Each piece has its own overhead, plus there's
a begin packet and a final ack packet (ie. 18 packets total).

I'm not entirely sure what the overhead is here, but my estimate:

  16384 / (32 + 16*(32 + 1024) + 32) = 16384 / 16960 ~= 0.966

puts it at roughly another 3.5% loss on top of the link layer overhead
above (ie. multiplicative).

[Note: I'm not entirely sure the first and final 32 are correct, but I'm
pretty sure it's at least this much. If anything, things are worse due to
some unavoidable delays between data reception and ack. The upstream
direction (to the host) is worse still: the host asks for data, the device
provides it, and the host acks it, so there are 2 data direction flip
delays.]

This means:

   5 Gbps ->  5 * 8/10    * 0.965 =  3.86 Gbps  (USB3.0 / USB3.1 gen1 / USB3.2 gen 1x1)
  10 Gbps -> 10 * 128/132 * 0.965 =  9.36 Gbps  (USB3.1 gen2 / USB3.2 gen 2x1)
  10 Gbps -> 10 * 8/10    * 0.965 =  7.72 Gbps  (dual link USB3.2 gen 1x2)
  20 Gbps -> 20 * 128/132 * 0.965 = 18.72 Gbps  (dual link USB3.2 gen 2x2)

At least I'm pretty sure you physically can't go faster, though there might
still be extra overhead I missed (which would make it even slower). In
particular, the dual link cases seem to duplicate some control traffic
across both links, so their overhead is probably a tad higher.

> > > +	/* the following 2 values can be tweaked if necessary */
> > > +	/* .bMaxBurst =		0, */
> >
> > should you add bMaxBurst = 15 here?
>
> I'm not sure. On my setup, it provides a fair performance boost (goes
> from ~1.7Gbps to ~2.3Gbps in, and ~620Mbps to ~720Mbps out). But I
> don't know whether there might be any compatibility constraints or
> hardware dependencies. I do see that the f_mass_storage driver sets it
> to 15:
>
>	/* Calculate bMaxBurst, we know packet size is 1024 */
>	max_burst = min_t(unsigned, FSG_BUFLEN / 1024, 15);
>
> so perhaps this is fine to do in NCM too? If we want to set bMaxBurst
> to 15, should that be in this patch, or in a separate patch?

I think we should set it. I would imagine this is the 16*1024 I referred to
up above. Though it should probably be bumped in a separate commit.
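
For anyone who wants to play with the numbers, here is a minimal C sketch of
the throughput estimate from earlier in this mail. It is only a model of my
guess, not of the spec: the 32 bytes of per-packet overhead is an assumption,
and the helper names (burst_efficiency, est_payload_gbps) are made up for
illustration. The one thing taken from the descriptor definition is that
bMaxBurst is zero-based, so 15 means 16 packets per burst.

```c
/* ASSUMPTION: 32 bytes of overhead per packet, as guessed in the estimate
 * above. This is not a value taken from the USB spec. */
#define PKT_OVERHEAD_BYTES 32.0
/* SuperSpeed bulk packet payload size (the 1024 in the estimate). */
#define PKT_PAYLOAD_BYTES 1024.0

/*
 * Hypothetical helper: fraction of bytes in one burst that is payload.
 * bMaxBurst is zero-based: 0 means 1 packet per burst, 15 means 16.
 * Models one begin packet, N data packets and one final ack packet.
 */
static double burst_efficiency(unsigned int max_burst)
{
	double packets = (double)max_burst + 1.0;
	double payload = packets * PKT_PAYLOAD_BYTES;
	double total = PKT_OVERHEAD_BYTES
		     + packets * (PKT_OVERHEAD_BYTES + PKT_PAYLOAD_BYTES)
		     + PKT_OVERHEAD_BYTES;

	return payload / total;
}

/*
 * Hypothetical helper: estimated payload rate in Gbps.
 * raw_gbps is the nominal line rate, coding_eff the line coding
 * efficiency (8.0 / 10.0 or 128.0 / 132.0).
 */
static double est_payload_gbps(double raw_gbps, double coding_eff,
			       unsigned int max_burst)
{
	return raw_gbps * coding_eff * burst_efficiency(max_burst);
}
```

With these assumptions, est_payload_gbps(5.0, 8.0 / 10.0, 15) comes out at
about 3.86 and est_payload_gbps(10.0, 128.0 / 132.0, 15) at about 9.37,
marginally above the table because the protocol efficiency isn't rounded
down to 0.965 first. burst_efficiency(0) is about 0.914, so the model only
accounts for a small part of the gain measured with bMaxBurst = 15. It counts
bytes only and ignores the ack and turnaround delays mentioned above.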