Re: [PATCH nf-next v4 1/5] net: sched: Micro-optimize egress handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 22, 2021 at 9:55 AM Lukas Wunner <lukas@xxxxxxxxx> wrote:
>
> sch_handle_egress() returns either the skb or NULL to signal to its
> caller __dev_queue_xmit() whether a packet should continue to be
> processed.
>
> The skb is always non-NULL, otherwise __dev_queue_xmit() would hit a
> NULL pointer deref right at its top.
>
> But the compiler doesn't know that.  So if sch_handle_egress() signals
> success by returning the skb, the "if (!skb) goto out;" statement
> results in a gratuitous NULL pointer check in the Assembler output.
>
> Avoid by telling the compiler that __dev_queue_xmit() is never passed a
> NULL skb.  This also eliminates another gratuitous NULL pointer check in
> __dev_queue_xmit()
>   qdisc_pkt_len_init()
>     skb_header_pointer()
>       __skb_header_pointer()
>
> The speedup is barely measurable:
> Before: 1877 1875 1878 1874 1882 1873 Mb/sec
> After:  1877 1877 1880 1883 1888 1886 Mb/sec
>
> However we're about to add a netfilter egress hook to __dev_queue_xmit()
> and without the micro-optimization, it will result in a performance
> degradation which is indeed measurable:
> With netfilter hook:               1853 1852 1850 1848 1849 1851 Mb/sec
> With netfilter hook + micro-optim: 1874 1877 1881 1875 1876 1876 Mb/sec
>
> The performance degradation is caused by a JNE instruction ("if (skb)")
> being flipped to a JE instruction ("if (!skb)") once the netfilter hook
> is added.  The micro-optimization removes the test and jump instructions
> altogether.
>
> Measurements were performed on a Core i7-3615QM.  Reproducer:
> ip link add dev foo type dummy
> ip link set dev foo up
> tc qdisc add dev foo clsact
> tc filter add dev foo egress bpf da bytecode '1,6 0 0 0,'
> modprobe pktgen
> echo "add_device foo" > /proc/net/pktgen/kpktgend_3
> samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh -i foo -n 400000000 -m "11:11:11:11:11:11" -d 1.1.1.1
>
> Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx>
> Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> Cc: Thomas Graf <tgraf@xxxxxxx>
> ---
>  net/core/dev.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 7afbb642e203..4c16b9932823 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4072,6 +4072,7 @@ struct netdev_queue *netdev_core_pick_tx(struct net_device *dev,
>   *      the BH enable code must have IRQs enabled so that it will not deadlock.
>   *          --BLG
>   */
> +__attribute__((nonnull(1)))
>  static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>  {
>         struct net_device *dev = skb->dev;
> --
> 2.29.2
>

Interesting !

It is a bit sad the compilers do not automatically get this knowledge
from the very first instruction :

 struct net_device *dev = skb->dev;

I see this also makes qdisc_pkt_len_init() slightly faster because
this removes the if (!skb) test from __skb_header_pointer()

I guess we also could add this patch to benefit all
skb_header_pointer() callers :

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5f60c9e907c9d8eae1e85ae0329838383e3325df..db8774c50cc6ab99deaecdb1dfc31dc5306b9a37
100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3627,6 +3627,7 @@ __skb_header_pointer(const struct sk_buff *skb,
int offset,
        return buffer;
 }

+__attribute__((nonnull(1)))
 static inline void * __must_check
 skb_header_pointer(const struct sk_buff *skb, int offset, int len,
void *buffer)
 {



[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux