On (12/01/15 10:17), Rick Jones wrote:
>
> What do the perf profiles show? Presumably, loss of TSO/GSO means
> an increase in the per-packet costs, but if the ipsec path
> significantly increases the per-byte costs...

For ESP-null, there's actually very little work to do - we just
need to add the 8 byte ESP header with an SPI and a seq#. There is no
crypto work to do, so the overhead *should* be minimal, else
we've painted ourselves into a corner where we can't touch anything,
including TCP options like md5.

perf profiles: I used perf tracepoints to instrument latency.
Yes, there is function call overhead for the xfrm path. So, for example,
the stack ends up being like this:
  :
  e5d2f2 ip_finish_output ([kernel.kallsyms])
  75d6d0 ip_output ([kernel.kallsyms])
      7c08ad xfrm_output_resume ([kernel.kallsyms])
      7c0aae xfrm_output ([kernel.kallsyms])
      7b1bdd xfrm4_output_finish ([kernel.kallsyms])
      7b1c7e __xfrm4_output ([kernel.kallsyms])
      7b1dbe xfrm4_output ([kernel.kallsyms])
  75bac4 ip_local_out ([kernel.kallsyms])
  75c012 ip_queue_xmit ([kernel.kallsyms])
  7736a3 tcp_transmit_skb ([kernel.kallsyms])
  :

where the detour into xfrm has been indented out, and esp_output
gets called out of xfrm_output_resume().

And as I said, there's some nickel-and-dime perf to be squeezed out
from better memory management in xfrm, but the fact that throughput
doesn't move beyond 3 Gbps strikes me as a sign of some other
bottleneck/serialization.

> Short of a perf profile, I suppose one way to probe for per-packet
> versus per-byte would be to up the MTU. That should reduce the
> per-packet costs while keeping the per-byte roughly the same.

Actually, the hack/rfc I sent out does help (in that it almost doubles
the existing 1.8 Gbps). The problem is that this cliff is much steeper
than that, and there's more hidden somewhere.

--Sowmini
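
P.S. To make the "8 byte ESP header" point concrete, here is a rough
sketch of what that header carries: a 32-bit SPI followed by a 32-bit
sequence number, both in network byte order. This is only an
illustration in Python, not the kernel's esp_output() code; the helper
names build_esp_header() and parse_esp_header() and the SPI value
0x1000 are made up for the example.

```python
import struct
from typing import Tuple

# Illustrative only: the fixed ESP header is two 32-bit fields,
# SPI then sequence number, in network (big-endian) byte order.
ESP_HDR = struct.Struct("!II")


def build_esp_header(spi: int, seq: int) -> bytes:
    """Pack a hypothetical SPI and sequence number into 8 bytes."""
    return ESP_HDR.pack(spi, seq)


def parse_esp_header(data: bytes) -> Tuple[int, int]:
    """Unpack (spi, seq) from the first 8 bytes of an ESP packet."""
    return ESP_HDR.unpack_from(data)


if __name__ == "__main__":
    hdr = build_esp_header(0x1000, 1)  # 0x1000 is an arbitrary example SPI
    print(len(hdr), hdr.hex())         # prints: 8 0000100000000001
    print(parse_esp_header(hdr))       # prints: (4096, 1)
```

Building the header is a fixed-size pack of two integers per packet,
which is why I'd expect it to show up as per-packet cost rather than
per-byte cost.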