Hi, On 12/21/2023 2:46 AM, Daniel Borkmann wrote: > On 12/20/23 7:25 PM, Alexei Starovoitov wrote: >> On Wed, Dec 20, 2023 at 6:58 AM Daniel Borkmann >> <daniel@xxxxxxxxxxxxx> wrote: >>> >>> On 12/19/23 2:56 PM, Hou Tao wrote: >>>> From: Hou Tao <houtao1@xxxxxxxxxx> >>>> >>>> According to the implementation of atomic_xchg() under x86-64, the >>>> lock >>>> prefix is not necessary for BPF_XCHG atomic operation, so just remove >>>> it. >>> >>> It's probably a good idea for the commit message to explicitly quote >>> the >>> Intel docs in here, so it's easier to find on why the lock prefix would >>> not be needed for the xchg op. >> >> It's a surprise to me as well. >> Definitely more info would be good. >> >> Also if xchg insn without lock somehow implies lock in the HW >> what is the harm of adding it explicitly? >> If it's a lock in HW than performance with and without lock prefix >> should be the same, right? > > e.g. 7.3.1.2 Exchange Instructions says: > > The XCHG (exchange) instruction swaps the contents of two operands. > This > instruction takes the place of three MOV instructions and does not > require > a temporary location to save the contents of one operand location while > the other is being loaded. When a memory operand is used with the XCHG > instruction, the processor’s LOCK signal is automatically asserted. > > Also curious if there is any harm adding it explicitly. > > . I could use the bpf ma benchmark to test it, but I doubt it will make any visible difference.