Re: [RFC PATCH for-next 3/3] libhns: Add support for SVE Direct WQE function

On Fri, Mar 31, 2023 at 11:38:26AM +0800, xuhaoyue (A) wrote:
> 
> 
> On 2023/3/30 21:01:20, Jason Gunthorpe wrote:
> > On Thu, Mar 30, 2023 at 08:57:41PM +0800, xuhaoyue (A) wrote:
> >>
> >>
> >> On 2023/3/27 20:55:59, Jason Gunthorpe wrote:
> >>> On Mon, Mar 27, 2023 at 08:53:35PM +0800, xuhaoyue (A) wrote:
> >>>
> >>>>>>  static void hns_roce_write512(uint64_t *dest, uint64_t *val)
> >>>>>>  {
> >>>>>>  	mmio_memcpy_x64(dest, val, sizeof(struct hns_roce_rc_sq_wqe));
> >>>>>> @@ -314,7 +319,10 @@ static void hns_roce_write_dwqe(struct hns_roce_qp *qp, void *wqe)
> >>>>>>  	hr_reg_write(rc_sq_wqe, RCWQE_DB_SL_H, qp->sl >> HNS_ROCE_SL_SHIFT);
> >>>>>>  	hr_reg_write(rc_sq_wqe, RCWQE_WQE_IDX, qp->sq.head);
> >>>>>>  
> >>>>>> -	hns_roce_write512(qp->sq.db_reg, wqe);
> >>>>>> +	if (qp->flags & HNS_ROCE_QP_CAP_SVE_DIRECT_WQE)
> >>>>>
> >>>>> Why do you need a device flag here?
> >>>>
> >>>> Our CPU die can support NEON instructions and SVE instructions,
> >>>> but some CPU dies only have SVE instructions that can accelerate our direct WQE performance.
> >>>> Therefore, we need to add such a flag bit to distinguish.
> >>>
> >>> NEON vs SVE is available to userspace already, it shouldn't come
> >>> through a driver flag. You need another reason to add this flag
> >>>
> >>> The userspace should detect the right instruction to use based on the
> >>> cpu flags using the attribute stuff I pointed you at
> >>>
> >>> Jason
> >>> .
> >>>
> >>
> >> We optimized direct wqe based on different instructions for
> >> different CPUs, but the architecture of the CPUs is the same and
> >> supports both SVE and NEON instructions.  We plan to use cpuid to
> >> distinguish between them. Is this more reasonable?
> > 
> > Uhh, do you mean certain CPUs won't work with SVE and others won't
> > work with NEON?
> > 
> > That is quite horrible
> > 
> > Jason
> > .
> > 
> 
> No, actually for general scenarios, our CPU supports two types of instructions, SVE and NEON.
> However, for the CPU that requires high fp64 floating-point computing power, the SVE instruction is enhanced and the NEON instruction is weakened.

Ideally the decision of what CPU instruction to use will be made by
rdma-core, using the various schemes for dynamic link time
selection

It should apply universally to all providers

Jason
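
As a rough illustration of the point above, userspace can pick a copy
routine from the CPU feature bits the kernel already exports, without
any driver flag. This is only a sketch under stated assumptions, not
rdma-core code: wqe_copy_fn, copy64_generic/neon/sve, detect_copy_impl
and select_wqe_copy are hypothetical names, the three copy bodies are
memcpy placeholders standing in for real NEON/SVE MMIO stores, and the
"prefer SVE when present" policy is an assumption. It also uses a plain
function pointer chosen at run time rather than a link-time scheme such
as ifunc.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_ASIMD, HWCAP_SVE */
#endif

/* Hypothetical type: copies one 64-byte WQE from src to dest. */
typedef void (*wqe_copy_fn)(void *dest, const void *src);

/*
 * Placeholder bodies. A real provider would issue NEON or SVE stores
 * to the MMIO doorbell page here; memcpy is NOT suitable for MMIO and
 * is used only so the sketch builds and runs on any host.
 */
static void copy64_generic(void *dest, const void *src) { memcpy(dest, src, 64); }
static void copy64_neon(void *dest, const void *src)    { memcpy(dest, src, 64); }
static void copy64_sve(void *dest, const void *src)     { memcpy(dest, src, 64); }

enum copy_impl { IMPL_GENERIC, IMPL_NEON, IMPL_SVE };

/* Query the CPU feature bits the kernel exposes to every process. */
static enum copy_impl detect_copy_impl(void)
{
#if defined(__linux__) && defined(__aarch64__)
	unsigned long hwcap = getauxval(AT_HWCAP);

#ifdef HWCAP_SVE
	if (hwcap & HWCAP_SVE)		/* assumed policy: prefer SVE */
		return IMPL_SVE;
#endif
#ifdef HWCAP_ASIMD
	if (hwcap & HWCAP_ASIMD)	/* ASIMD is the HWCAP name for NEON */
		return IMPL_NEON;
#endif
#endif
	return IMPL_GENERIC;		/* non-arm64 hosts land here */
}

/* Resolve the routine once; callers cache the returned pointer. */
static wqe_copy_fn select_wqe_copy(void)
{
	switch (detect_copy_impl()) {
	case IMPL_SVE:
		return copy64_sve;
	case IMPL_NEON:
		return copy64_neon;
	default:
		return copy64_generic;
	}
}
```

A caller would run select_wqe_copy() once at context setup and invoke
the returned pointer for each direct WQE. Because the choice depends
only on CPU features, the same helper could in principle be shared by
all providers.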


