Re: [PATCH bpf-next 2/3] bpf, x86: Don't generate lock prefix for BPF_XCHG

Hou Tao <houtao@xxxxxxxxxxxxxxx> · Thu, 21 Dec 2023 19:37:03 +0800

Hi,

On 12/21/2023 2:46 AM, Daniel Borkmann wrote:
> On 12/20/23 7:25 PM, Alexei Starovoitov wrote:
>> On Wed, Dec 20, 2023 at 6:58 AM Daniel Borkmann
>> <daniel@xxxxxxxxxxxxx> wrote:
>>>
>>> On 12/19/23 2:56 PM, Hou Tao wrote:
>>>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>>>
>>>> According to the implementation of atomic_xchg() under x86-64, the
>>>> lock
>>>> prefix is not necessary for BPF_XCHG atomic operation, so just remove
>>>> it.
>>>
>>> It's probably a good idea for the commit message to explicitly quote
>>> the
>>> Intel docs in here, so it's easier to find on why the lock prefix would
>>> not be needed for the xchg op.
>>
>> It's a surprise to me as well.
>> Definitely more info would be good.
>>
>> Also if xchg insn without lock somehow implies lock in the HW
>> what is the harm of adding it explicitly?
>> If it's a lock in HW than performance with and without lock prefix
>> should be the same, right?
>
> e.g. 7.3.1.2 Exchange Instructions says:
>
>   The XCHG (exchange) instruction swaps the contents of two operands.
> This
>   instruction takes the place of three MOV instructions and does not
> require
>   a temporary location to save the contents of one operand location while
>   the other is being loaded. When a memory operand is used with the XCHG
>   instruction, the processor’s LOCK signal is automatically asserted.
>
> Also curious if there is any harm adding it explicitly.
>
> .

I could use the bpf ma benchmark to test it, but I doubt it will make
any visible difference.