On Wed, Feb 14, 2018 at 4:08 PM, Anuj Kalia <anujkaliaiitd@xxxxxxxxx> wrote:
>
> 1. FP stands for fast path
> 2. Each UAR page has four BlueFlame registers. This is specific to the
> NIC and firmware and is mentioned in the PRM. ConnectX-5 InfiniBand
> has four, ConnectX-4 in Ethernet mode seems to have two according to
> the online PRM.
> 3. No idea
>
> I'm curious: are you seeing any performance impact of
> "latency_sensitve" or "low latency UARs"?

Sorry for the delayed response; I am getting back to this topic only
now. I do see some performance impact on the QPs assigned to
low_lat_uuars versus the other uUARs. But I see it when I turn off some
optimizations such as postlist (so there are a lot more doorbells).
Also, if you set MLX5_SINGLE_THREADED=1 while running your bench, the
locks are never taken, and you won't see any difference in performance
between the different uUARs.

> I've looked at this briefly
> in the past and couldn't find perf impact. IIRC, the driver uses "fast
> path" for only the first few QPs, but in my experiment even later QPs
> got similar latency. But I might have missed something. If there's an
> observable perf impact, you can probably figure out the answer to (3).
>
> --Anuj

I studied the code a little further and have some questions:

(1) It seems like only BFREG_0 and BFREG_1 of a UAR are being used to
ring the doorbell. BFREG_2 and BFREG_3, while being allocated in
`ibv_open_device`, are not assigned to the QPs belonging to the same
context. The PRM says that BFREG_2 and BFREG_3 of a UAR are used for
"fast path posting" (page 65). What do these "fast path" operations
refer to: are they control operations?

(2) From my understanding of the `allocate_uars` and `alloc_bfreg` kernel
functions, there is no difference (in terms of hardware uUAR
functionality) between the `low_lat_uuars` and the other uUARs that are
allocated during context creation.
The only reason why `low_lat_uuars` are called so is that no lock is
taken for the QPs assigned to the `low_lat_uuars`, hence "low latency".
The QPs assigned to the other uUARs take a lock and might contend for it
if multiple QPs are assigned the same uUAR, hence not low latency. Can
someone (maybe Yishai Hadas) confirm this?

-- Rohit

> On Wed, Feb 14, 2018 at 10:35 AM, Rohit Zambre <rzambre@xxxxxxx> wrote:
> > Hi,
> >
> > I am trying to understand the uuar-related calculations of the
> > userspace mlx5 driver (in mlx5_alloc_context) that use the various
> > constants introduced in
> > https://github.com/linux-rdma/rdma-core/commit/166d34841dd6a835e4f4f0196143880a73ce71bd.
> >
> > My understanding is that a uar is comprised of uuars. bfregs mean the
> > same thing as uuars. A doorbell is written to the bfreg/uuar assigned
> > to the QP.
> >
> > (1) W.r.t. MLX5_NUM_NON_FP_BFREGS_PER_UAR, what is a NON_FP_BFREG in a
> > UAR? I am curious to know what FP stands for.
> >
> > (2) Is the value of NUM_BFREGS_PER_UAR set to 4 because the driver
> > will have at least 4 low_lat_uuars for a context? This constant is
> > used only in the calculation of gross_uuars. My assumption is that the
> > NUM_BFREGS_PER_UAR value is not specific to the mlx5 device because of
> > the missing MLX5_ prefix. Please correct me if I am wrong.
> >
> > (3) Given that the value of tot_uuars and low_lat_uuars can be
> > configured through the MLX5_TOTAL_UUARS and MLX5_NUM_LOW_LAT_UUARS
> > environment variables respectively, it is possible for me to have, in
> > 1 context, 512 tot_uuars (=MLX5_MAX_BFREGS) and set all 511 of them as
> > low_lat_uuars. Is this correct?
> >
> > Thanks,
> >
> > Rohit Zambre
> > Ph.D. Student, Computer Engineering
> > University of California, Irvine
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
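
To make the hypothesis in (2) concrete, here is a minimal sketch of the
locking behavior described there. It only illustrates that hypothesis
and is not code from rdma-core or the kernel: `SketchBfreg`,
`ring_doorbell`, and `post_sends` are hypothetical names, the counter
stands in for the real doorbell write, and the `single_threaded` flag
models the effect of MLX5_SINGLE_THREADED=1 as described above.

```python
import threading


class SketchBfreg:
    """Hypothetical stand-in for one uUAR/bfreg; not an rdma-core structure."""

    def __init__(self, low_latency, single_threaded=False):
        # Assumption from (2): the hardware register is the same either way;
        # only the software locking differs.
        self.need_lock = not low_latency and not single_threaded
        self.lock = threading.Lock()
        self.doorbells = 0          # stands in for the doorbell write
        self.lock_acquisitions = 0  # how often the lock was taken

    def ring_doorbell(self):
        if self.need_lock:
            # Shared uUAR: QPs mapped here may contend for this lock.
            with self.lock:
                self.lock_acquisitions += 1
                self.doorbells += 1
        else:
            # "Low latency" uUAR or single-threaded mode: no lock taken.
            self.doorbells += 1


def post_sends(bfreg, count):
    """Ring the doorbell `count` times; return (doorbells, lock acquisitions)."""
    for _ in range(count):
        bfreg.ring_doorbell()
    return bfreg.doorbells, bfreg.lock_acquisitions


if __name__ == "__main__":
    print("shared uUAR:     ", post_sends(SketchBfreg(low_latency=False), 4))
    print("low-latency uUAR:", post_sends(SketchBfreg(low_latency=True), 4))
    print("single-threaded: ",
          post_sends(SketchBfreg(low_latency=False, single_threaded=True), 4))
```

In this model the only cost that differs between the uUAR kinds is the
lock acquisition per doorbell, which would match seeing a difference
only when many doorbells are rung (for example, with postlist turned
off).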