On Fri, 2013-12-13 at 14:41 +0800, Li Zhong wrote:
> On Thu, 2013-12-12 at 08:18 +0100, Jesper Dangaard Brouer wrote:
> > On Fri, 13 Dec 2013 07:55:03 +0800 Li Zhong <zhong@xxxxxxxxxxxxxxxxxx> wrote:
> > > On Wed, 2013-12-11 at 10:55 +0100, Daniel Borkmann wrote:
> > > > On 12/12/2013 07:10 AM, Li Zhong wrote:
> > [...]
> > > > > Also, it seems that we could move skb_set_queue_mapping() into
> > > > > packet_pick_tx_queue(), so we avoid calling it one more time
> > > > > unnecessarily if we are going into the normal dev_queue_xmit() code
> > > > > path.
> > > >
> > > > I don't agree with that part; I think this can also be beneficial for
> > > > packets without direct xmit, as in PF_PACKET we don't have a notion of
> > > > "flow" but just raw packets instead, and can keep the mapping local
> > > > depending on the current CPU, as we do queue setting elsewhere in the
> > > > stack just as well.
> > >
> > > It seems to me that the newly added xmit in packet_sock is
> > > dev_queue_xmit() by default, and in this default case, dev_queue_xmit()
> > > would call netdev_pick_tx(), which would set the skb queue_mapping again,
> > > overriding the value based on the current CPU.
> >
> > Yes, I think you are right; that is also my experience with the code path.
> >
> > > Or did I miss something here?
> >
> > A bit related: one thing I'm missing to understand is why the
> > RAW/PF_PACKET sockets have a NULL in skb->sk when they reach
> > __netdev_pick_tx() (with the result that they cannot store/cache the
> > queue via sk_tx_queue_set).
>
> I checked the code; it seems skb->sk is not set in this code path (for
> TCP, tcp_transmit_skb() seems to set it).
>
> Do you think we could set it here, where skb_set_queue_mapping() is used,
> so that for this code path we could also have it cached? (For devices
> which define their own queue selection method, it will not take effect.)
Sorry, it seems it will not take effect, as the sk_tx_queue_set/get logic
depends on sk->sk_dst_cache, which is used by TCP, connected UDP ...
Not sure why we need this dependency ...

Thanks, Zhong

> something like below:
>
> @@ -2219,7 +2219,7 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
> 		}
> 	}
>
> -	skb_set_queue_mapping(skb, packet_pick_tx_queue(dev));
> +	skb->sk = (struct sock *)po;
> 	skb->destructor = tpacket_destruct_skb;
> 	__packet_set_status(po, ph, TP_STATUS_SENDING);
> 	atomic_inc(&po->tx_ring.pending);
> @@ -2429,7 +2429,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
> 	skb->dev = dev;
> 	skb->priority = sk->sk_priority;
> 	skb->mark = sk->sk_mark;
> -	skb_set_queue_mapping(skb, packet_pick_tx_queue(dev));
> +	skb->sk = sk;
>
> 	if (po->has_vnet_hdr) {
> 		if (vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {

--
To unsubscribe from this list: send the line "unsubscribe linux-next" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html