Re: [PATCH bpf 2/2] libbpf: remove dependency on barrier.h in xsk.h

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/10/2019 12:28 AM, Georg Müller wrote:
> Am 09.04.19 um 13:29 schrieb Magnus Karlsson:
>> On Tue, Apr 9, 2019 at 11:11 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>>> On 04/09/2019 08:44 AM, Magnus Karlsson wrote:
>>>> The use of smp_rmb() and smp_wmb() creates a Linux header dependency
>>>> on barrier.h that is uneccessary in most parts. This patch implements
>>>> the two small defines that are needed from barrier.h. As a bonus, the
>>>> new implementations are faster than the default ones as they default
>>>> to sfence and lfence for x86, while we only need a compiler barrier in
>>>> our case. Just as it is when the same ring access code is compiled in
>>>> the kernel.
>>>>
>>>> Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
>>>> Signed-off-by: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>
>>>> ---
>>>>  tools/lib/bpf/xsk.h | 19 +++++++++++++++++--
>>>>  1 file changed, 17 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
>>>> index 3638147..317b44f 100644
>>>> --- a/tools/lib/bpf/xsk.h
>>>> +++ b/tools/lib/bpf/xsk.h
>>>> @@ -39,6 +39,21 @@ DEFINE_XSK_RING(xsk_ring_cons);
>>>>  struct xsk_umem;
>>>>  struct xsk_socket;
>>>>
>>>> +#if !defined bpf_smp_rmb && !defined bpf_smp_wmb
>>>> +# if defined(__i386__) || defined(__x86_64__)
>>>> +#  define bpf_smp_rmb() asm volatile("" : : : "memory")
>>>> +#  define bpf_smp_wmb() asm volatile("" : : : "memory")
>>>> +# elif defined(__aarch64__)
>>>> +#  define bpf_smp_rmb() asm volatile("dmb ishld" : : : "memory")
>>>> +#  define bpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
>>>> +# elif defined(__arm__)
>>>> +#  define bpf_smp_rmb() asm volatile("dmb ish" : : : "memory")
>>>> +#  define bpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
>>>> +# else
>>>> +#  error Architecture not supported by the XDP socket code in libbpf.
>>>> +# endif
>>>> +#endif
>>>
>>> Hmm, bit unfortunate, but I can see why it's needed inside the header here
>>> as it's installed into the system and needs to be included by applications.
>>> Fine with me. In any case, I still think we should also fix it for tooling
>>> infra to avoid others running into the same issue, lemme resurrect [0] from
>>> back in the days.
>>>
>>> One question though regarding above arm32 implementation. You match these
>>> barriers against the ones from af_xdp in kernel. So smp_* variants from the
>>> arch/arm/include/asm/barrier.h header need to be matched. This selects the
>>> dmb(ish) and dmb(ishst) variant. The implementation of dmb() however is
>>> quite CPU dependent, the above assumes to match with __LINUX_ARM_ARCH__ >= 7.
>>> Perhaps this assumption needs at minimum a comment to make it clear for
>>> developers.
>>
>> Good idea. I will put a comment in there to clarify this.
> 
> Since this is a userspace library, what about using the C11 atomic_thread_fence()
> from stdatomic.h - wouldn't that be sufficient?

These are paired with smp_{w,r}mb() in AF_XDP's kernel code, therefore they
would need the exact same counterpart in user space. We had similar discussion
earlier wrt perf ring buffer, conclusion was: the C11 memory model doesn't
generally align with the kernel's, but even if it would emit the same code,
it's still implementation dependent from compiler side, so one would need to
have a guarantee that the mapping from supported compilers must be compatible
at all times to the kernels for the various supported archs.

>>>   [0] https://lore.kernel.org/netdev/20181017144156.16639-2-daniel@xxxxxxxxxxxxx/
>>>
>>>>  static inline __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill,
>>>>                                             __u32 idx)
>>>>  {
>>>> @@ -119,7 +134,7 @@ static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, size_t nb)
>>>>       /* Make sure everything has been written to the ring before signalling
>>>>        * this to the kernel.
>>>>        */
>>>> -     smp_wmb();
>>>> +     bpf_smp_wmb();
>>>>
>>>>       *prod->producer += nb;
>>>>  }
>>>> @@ -133,7 +148,7 @@ static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
>>>>               /* Make sure we do not speculatively read the data before
>>>>                * we have received the packet buffers from the ring.
>>>>                */
>>>> -             smp_rmb();
>>>> +             bpf_smp_rmb();
>>>>
>>>>               *idx = cons->cached_cons;
>>>>               cons->cached_cons += entries;
>>>>
>>>




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux