Re: [RFC PATCH for-next 3/3] libhns: Add support for SVE Direct WQE function

On 2023/3/27, Haoyue Xu wrote:
On 2023/3/23 3:02:47, Jason Gunthorpe wrote:
> On Sat, Feb 25, 2023 at 06:02:53PM +0800, Haoyue Xu wrote:
> 
>> +
>> +set_source_files_properties(hns_roce_u_hw_v2.c PROPERTIES COMPILE_FLAGS "${SVE_FLAGS}")
>> diff --git a/providers/hns/hns_roce_u_hw_v2.c b/providers/hns/hns_roce_u_hw_v2.c
>> index 3a294968..bd457217 100644
>> --- a/providers/hns/hns_roce_u_hw_v2.c
>> +++ b/providers/hns/hns_roce_u_hw_v2.c
>> @@ -299,6 +299,11 @@ static void hns_roce_update_sq_db(struct hns_roce_context *ctx,
>>  	hns_roce_write64(qp->sq.db_reg, (__le32 *)&sq_db);
>>  }
>>  
>> +static void hns_roce_sve_write512(uint64_t *dest, uint64_t *val)
>> +{
>> +	mmio_memcpy_x64_sve(dest, val);
>> +}
> 
> This is not the right way, you should make this work like the x86 SSE
> stuff, using a "__attribute__((target(xx)))"
> 
> Look in util/mmio.c and implement a mmio_memcpy_x64 for ARM SVE
> 
> mmio_memcpy_x64 is defined to try to generate a 64 byte PCI-E TLP.
> 
> If you don't want or can't handle that then you should write your own
> loop of 8 byte stores.
> 

We will follow the approach in util/mmio.c and rework this in v2.
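
For reference, the fallback Jason mentions (a loop of 8 byte stores) could look roughly like the sketch below. This is only an illustration: sketch_write_8byte_loop is a hypothetical name, not an existing rdma-core function, and a real provider would use the project's MMIO accessors and barriers instead of plain volatile stores. Note that such a loop issues eight separate 8-byte stores and does not by itself guarantee a single 64 byte PCI-E TLP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of rdma-core): copy bytecnt bytes from
 * src to dest using one 8-byte store per loop iteration.
 *
 * dest is volatile so the compiler emits every store and does not
 * merge or drop them. bytecnt must be a multiple of 8.
 */
static void sketch_write_8byte_loop(volatile uint64_t *dest,
				    const uint64_t *src, size_t bytecnt)
{
	size_t i;

	assert(bytecnt % sizeof(uint64_t) == 0);

	for (i = 0; i < bytecnt / sizeof(uint64_t); i++)
		dest[i] = src[i];
}
```

For a 64 byte WQE this would be called with bytecnt = 64, giving eight stores.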


>>  static void hns_roce_write512(uint64_t *dest, uint64_t *val)
>>  {
>>  	mmio_memcpy_x64(dest, val, sizeof(struct hns_roce_rc_sq_wqe));
>> @@ -314,7 +319,10 @@ static void hns_roce_write_dwqe(struct hns_roce_qp *qp, void *wqe)
>>  	hr_reg_write(rc_sq_wqe, RCWQE_DB_SL_H, qp->sl >> HNS_ROCE_SL_SHIFT);
>>  	hr_reg_write(rc_sq_wqe, RCWQE_WQE_IDX, qp->sq.head);
>>  
>> -	hns_roce_write512(qp->sq.db_reg, wqe);
>> +	if (qp->flags & HNS_ROCE_QP_CAP_SVE_DIRECT_WQE)
> 
> Why do you need a device flag here?

Our CPU dies support both NEON and SVE instructions,
but on some dies only the SVE instructions can accelerate direct WQE performance.
Therefore, we need a flag bit to distinguish these cases.
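
To illustrate how such a flag could be combined with a runtime check that the CPU actually implements SVE, here is a rough sketch. It is an assumption, not what the patch does: the names sketch_cpu_has_sve, sketch_use_sve_dwqe and SKETCH_QP_CAP_SVE_DIRECT_WQE (including its bit value) are hypothetical. The CPU check uses getauxval(AT_HWCAP) with HWCAP_SVE, which is available on Linux/arm64; on other platforms the sketch simply reports no SVE.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#endif

/* Hypothetical flag bit, standing in for HNS_ROCE_QP_CAP_SVE_DIRECT_WQE. */
#define SKETCH_QP_CAP_SVE_DIRECT_WQE (1u << 0)

/* Return true if the running CPU advertises SVE to userspace. */
static bool sketch_cpu_has_sve(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SVE)
	return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
	/* Not Linux/arm64, or headers too old to define HWCAP_SVE. */
	return false;
#endif
}

/* Use the SVE path only if the device asks for it and the CPU has SVE. */
static bool sketch_use_sve_dwqe(uint32_t qp_flags)
{
	return (qp_flags & SKETCH_QP_CAP_SVE_DIRECT_WQE) &&
	       sketch_cpu_has_sve();
}
```

The result could be computed once at QP creation time rather than on every post, since neither input changes afterwards.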


> 
>> +		hns_roce_sve_write512(qp->sq.db_reg, wqe);
>> +	else
>> +		hns_roce_write512(qp->sq.db_reg, wqe);
> 
> Isn't this function being called on WC memory already? The usual way
> to make the 64 byte write is with stores to WC memory..
> 
> Jason
> .
> 
We are currently using WC memory.

Sincerely,
Haoyue


