On 04/11/2019 10:23 PM, Daniel Borkmann wrote:
> On 04/11/2019 09:54 AM, Magnus Karlsson wrote:
>> On Wed, Apr 10, 2019 at 9:08 PM Y Song <ys114321@xxxxxxxxx> wrote:
>>> On Wed, Apr 10, 2019 at 12:21 AM Magnus Karlsson
>>> <magnus.karlsson@xxxxxxxxx> wrote:
>>>>
>>>> The use of smp_rmb() and smp_wmb() creates a Linux header dependency
>>>> on barrier.h that is unnecessary in most parts. This patch implements
>>>> the two small defines that are needed from barrier.h. As a bonus, the
>>>> new implementations are faster than the default ones as they default
>>>> to sfence and lfence for x86, while we only need a compiler barrier in
>>>> our case. Just as it is when the same ring access code is compiled in
>>>> the kernel.
>>>>
>>>> Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
>>>> Signed-off-by: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>
>>>> ---
>>>>  tools/lib/bpf/xsk.h | 20 ++++++++++++++++++--
>>>>  1 file changed, 18 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
>>>> index 3638147..69136d9 100644
>>>> --- a/tools/lib/bpf/xsk.h
>>>> +++ b/tools/lib/bpf/xsk.h
>>>> @@ -39,6 +39,22 @@ DEFINE_XSK_RING(xsk_ring_cons);
>>>>  struct xsk_umem;
>>>>  struct xsk_socket;
>>>>
>>>> +#if !defined bpf_smp_rmb && !defined bpf_smp_wmb
>>>
>>> Maybe add some comments to explain the difference between bpf_smp_{r,w}mb
>>> and smp_{r,w}mb so later users will have a better idea which to pick?
>>
>> Ouch, that is a hard one. I would just recommend people to read
>> Documentation/memory-barriers.txt. My attempt at explaining all this
>> would not be pretty and likely sprinkled with errors ;-).
>
> I think Yonghong meant here to place a comment wrt when to use the below
> versus when to use smp_{r,w}mb(). Both are essentially the same, just that
> the main difference here would be that this header needs to be installed in
> the system, so users need to have it.
> I think it indeed makes sense to add a comment about this specific fact,
> otherwise we might forget about it in a few months.
>
>>>> +# if defined(__i386__) || defined(__x86_64__)
>>>> +# define bpf_smp_rmb() asm volatile("" : : : "memory")
>>>> +# define bpf_smp_wmb() asm volatile("" : : : "memory")
>>>> +# elif defined(__aarch64__)
>>>> +# define bpf_smp_rmb() asm volatile("dmb ishld" : : : "memory")
>>>> +# define bpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
>>>> +# elif defined(__arm__)
>>>> +/* These are only valid for armv7 and above */
>>>> +# define bpf_smp_rmb() asm volatile("dmb ish" : : : "memory")
>>>> +# define bpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
>>>> +# else
>>>> +# error Architecture not supported by the XDP socket code in libbpf.
>>>> +# endif
>>>> +#endif
>>>
>>> Since this is generic enough and could be used by other files as well,
>>> maybe put it into libbpf_util.h?
>
> Hmm, maybe a good point. We could place it into libbpf.h as there are
> already various misc helpers and xsk.h includes it anyway. But: if we do
> that, then the above 'else' part would need some generic fallback
> (__sync_synchronize() plus a warning?) as otherwise compilation would break
> for everyone with 'error'. Ideally this should then cover as much as
> possible from mainstream archs, though. (And if so, then prefixed with
> libbpf_smp_{r,w}mb() to denote it's a misc libbpf-internal function.)
>
>> Good question. Do not know. Daniel suggested introducing [0] and
>> perhaps that can be used by the broader libbpf code base? The
>> important part for this patch set is that these operations match the
>> ones in the kernel on the other end of the ring.
>
> Yeah, it can be used generally, except for headers that are going to be
> installed where these are present in inline helper functions.
>
>> [0] https://lore.kernel.org/netdev/20181017144156.16639-2-daniel@xxxxxxxxxxxxx/
>>
>>>> +
>>>>  static inline __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill,
>>>>  					      __u32 idx)
>>>>  {
>>>> @@ -119,7 +135,7 @@ static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, size_t nb)
>>>>  	/* Make sure everything has been written to the ring before signalling
>>>>  	 * this to the kernel.
>>>>  	 */
>>>> -	smp_wmb();
>>>> +	bpf_smp_wmb();
>>>>
>>>>  	*prod->producer += nb;
>>>>  }
>>>> @@ -133,7 +149,7 @@ static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
>>>>  	/* Make sure we do not speculatively read the data before
>>>>  	 * we have received the packet buffers from the ring.
>>>>  	 */
>>>> -	smp_rmb();
>>>> +	bpf_smp_rmb();
>>>
>>> Could you explain why a compiler barrier is good enough here on x86? Note
>>> that the load cons->cached_cons could be reordered with earlier
>>> non-overlapping stores at runtime.
>>
>> The bpf_smp_rmb() is there to prevent the data in the ring itself from
>> being read by the consumer before the producer has signaled that it has
>> finished "producing" it by updating the producer (head) pointer. As
>> stores are not reordered with other stores on x86 (nor loads with
>> other loads), the update of the producer pointer will always be
>> observed after the writing of the data in the ring, as that is done
>> before the update of the producer pointer in xsk_ring_prod__submit().
>> One side only updates and the other side only reads. cached_cons is a
>> local variable, and only for operations done by another core can we
>> observe loads being reordered with older stores to different
>> locations. Since no one else is touching cached_cons, this will not
>> happen.
>
> From the perf RB side, I found this one, kernel/events/ring_buffer.c +72,
> to be very helpful.
> It's independent of this series, but I would appreciate it if you could
> make a similar scheme / comment somewhere in the AF_XDP code such that all
> barriers in there can be more easily followed wrt how they pair to user
> space.
>
> Thanks,
> Daniel
>
>> /Magnus
>>
>>>>
>>>>  	*idx = cons->cached_cons;
>>>>  	cons->cached_cons += entries;
>>>> --
>>>> 2.7.4
>>>>
>
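[Editor's note: the comment style Daniel points at in kernel/events/ring_buffer.c documents barrier pairing as a side-by-side diagram. An AF_XDP-flavored analogue might read roughly as follows; this is an illustrative adaptation of that style, not a quote of the kernel source.]

```
 *   kernel (producer)                  user space (consumer)
 *
 *   STORE descriptor into ring         LOAD ->producer
 *   smp_wmb()             (A)          bpf_smp_rmb()         (B)
 *   STORE ->producer                   LOAD descriptor from ring
 *
 *   (A) pairs with (B): the consumer must not read descriptor data
 *   before it has observed the producer pointer that publishes it.
```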