On Mon, Jun 25, 2018 at 11:41 PM Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
>
> On 06/25/2018 09:15 PM, Cong Wang wrote:
> > On Mon, Jun 25, 2018 at 8:59 AM Flavio Leitner <fbl@xxxxxxxxxx> wrote:
> >>
> >> The sock reference is lost when scrubbing the packet and that breaks
> >> TSQ (TCP Small Queues) and XPS (Transmit Packet Steering) causing
> >> performance impacts of about 50% in a single TCP stream when crossing
> >> network namespaces.
> >>
> >> XPS breaks because the queue mapping stored in the socket is not
> >> available, so another random queue might be selected when the stack
> >> needs to transmit something like a TCP ACK, or TCP Retransmissions.
> >> That causes packet re-ordering and/or performance issues.
> >>
> >> TSQ breaks because it orphans the packet while it is still in the
> >> host, so packets are queued contributing to the buffer bloat problem.
> >
> > Why should TSQ in one stack care about buffer bloat in another stack?
> >
> > Actually, I think the current behavior is correct, once the packet leaves
> > its current stack (or netns), it should relieve the backpressure on TCP
> > socket in this stack, whether it will be queued in another stack is beyond
> > its concern. This breaks the isolation between networking stacks.
> >
>
> We discussed this during netconf, Cong; nobody was against this planned removal.

I agreed to keep skb->sk, but didn't realize it actually impacts TSQ too.

>
> When a packet is attached to a socket, we should keep the association as much as possible.

As much as possible within one stack, I agree. I still don't understand
why we should keep it across the stack boundary.

>
> Only when a new association needs to be done, skb_orphan() needs to be called.
>
> Doing this skb_orphan() too soon breaks back pressure in general, this is bad, since a socket
> can evade SO_SNDBUF limits.

Right before leaving the stack is not too soon; it is actually the latest
possible point, for the veth case.
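
To make the point of disagreement concrete, here is a toy userspace model of the accounting involved. It is only a sketch, not kernel code: every toy_* name is hypothetical, the 1500-byte packets and the four-packet budget are arbitrary assumptions, and the model covers nothing except how releasing the socket charge at the namespace boundary changes when the sender gets throttled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins that only loosely mirror struct sock / sk_buff. */
struct toy_sock {
	size_t wmem_alloc;	/* bytes currently charged to the socket */
	size_t limit;		/* assumed in-flight budget for this socket */
};

struct toy_skb {
	struct toy_sock *sk;	/* owning socket, NULL once orphaned */
	size_t truesize;	/* bytes this packet charges to its owner */
};

/* Charge a packet to the socket; refuse once the budget is used up. */
static int toy_send(struct toy_sock *sk, struct toy_skb *skb, size_t len)
{
	if (sk->wmem_alloc + len > sk->limit)
		return -1;	/* backpressure: the sender has to wait */
	skb->sk = sk;
	skb->truesize = len;
	sk->wmem_alloc += len;
	return 0;
}

/* Drop the association and return the bytes, in the spirit of skb_orphan(). */
static void toy_orphan(struct toy_skb *skb)
{
	if (skb->sk) {
		skb->sk->wmem_alloc -= skb->truesize;
		skb->sk = NULL;
	}
}

/*
 * Count how many packets a sender can queue before it is throttled.
 * Nothing is ever transmitted here, so the queue only grows.
 */
int toy_queued_before_throttle(int orphan_at_boundary, int max_pkts)
{
	struct toy_sock sk = { 0, 4 * 1500 };
	struct toy_skb skbs[64];
	int n = 0;

	while (n < max_pkts && n < 64 &&
	       toy_send(&sk, &skbs[n], 1500) == 0) {
		if (orphan_at_boundary)
			toy_orphan(&skbs[n]);	/* charge released at the boundary */
		n++;
	}
	return n;
}
```

In this model, toy_queued_before_throttle(0, 64) stops after 4 packets because the charge stays with the socket, while toy_queued_before_throttle(1, 64) queues all 64 because each charge is released as soon as the packet crosses the boundary. Whether that second behavior is correct isolation or lost backpressure is the question being argued here.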