Re: [PATCH net-next v2 16/18] xsk: add helper to get &xdp_desc's DMA and meta pointer in one go

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx>
Date: Tue, 22 Oct 2024 17:42:13 +0200

> On Tue, Oct 15, 2024 at 04:53:48PM +0200, Alexander Lobakin wrote:
>> Currently, when you send an XSk frame without metadata, you need to do
> 
> you meant *with* metadata?

Eeeeh... Maybe, I forgot already what I wanted to say =\

> 
>> the following:
>>
>> * call external xsk_buff_raw_get_dma();
>> * call inline xsk_buff_get_metadata(), which calls external
>>   xsk_buff_raw_get_data() and then do some inline checks.
>>
>> This effectively means that the following piece:
>>
>> addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
>>
>> is done twice per frame, plus you have 2 external calls per frame, plus
>> this:
>>
>> 	meta = pool->addrs + addr - pool->tx_metadata_len;
>> 	if (unlikely(!xsk_buff_valid_tx_metadata(meta)))
>>
>> is always inlined, even if there's no meta or it's invalid.
> 
> when there is no meta you bail out early in xsk_buff_get_metadata() as
> tx_metadata_len was not set, no?

Yes, but this code is still inlined.
See below (at the end of the reply).

> 
>>
>> Add xsk_buff_raw_get_ctx() (xp_raw_get_ctx() to be precise) to do that
>> in one go. It returns a small structure with 2 fields: DMA address,
>> filled unconditionally, and metadata pointer, valid only if it's
>> present. The address correction is performed only once and you also
>> have only 1 external call per XSk frame, which does all the calculations
>> and checks outside of your hotpath. You only need to check
>> `if (ctx.meta)` for the metadata presence.
> 
> IMHO adding this might confuse future users which approach should be
> preferred.

It's a regular practice in the kernel that we have several functions to
do +/- the same. It's up to the developer which one to pick, he reads
the code and decides himself.

> 
> Thinking out loud...couldn't we export address correction logic and pass
> the corrected addr to xsk_buff_get_metadata and then add it to
> pool->addrs. But that would require modifying existing callsites +
> addressing xp_raw_get_dma() as well :<

Yes, modifying current API requires touching the users.
+ keeping xsk_buff_get_metadata negates most the main purpose of this
patch, see below.

> 
> Standard question - any perf improvement when micro benchmarking? :P

TBH I didn't test before/after with the meta enabled, but it was enough
for me that using this function instead of the get_dma + get_meta pair
reduces the object code size by 1 Kb when unrolling by 8.

Thanks,
Olek




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux