On Mon, Apr 30, 2018 at 4:14 AM, Yishai Hadas <yishaih@xxxxxxxxxxxxxxxxxx> wrote:
> On 4/29/2018 8:34 PM, Rohit Zambre wrote:
>> On Sun, Apr 29, 2018 at 11:12 AM, Yishai Hadas
>> <yishaih@xxxxxxxxxxxxxxxxxx> wrote:
>>> On 4/23/2018 8:58 PM, Rohit Zambre wrote:
>>>> The calculation of need_uuar_lock rightly accounts for at least 1 medium
>>>> bfreg.
>>>
>>> The code should support even 0 medium bfregs, per the suggested
>>> patch below.
>>
>> So when the user sets MLX5_TOTAL_UUARS=16 and
>> MLX5_NUM_LOW_LAT_UUARS=15, the first 15 QPs will map to the 15 low
>> latency bfregs. To which bfreg will the 16th QP map?
>
> The 16th QP expects to get bfreg index 0, which is returned from the kernel
> once there are *no* low-latency high-class nor medium bfregs available. In
> this case, DB is used in rdma-core instead of BF and no lock is needed
> (see the usage of bfs[bfi].buf_size).

OK, so all the QPs after the 15th one will map to bfreg_0. I am trying to
understand why a lock would not be needed on bfreg_0. I know that the DB is
a 64-bit atomic. Is there an implicit flush/sfence with the atomic? When
CPU_A posts on the 16th QP and CPU_B posts on the 17th QP simultaneously,
without the lock, what guarantees that the device will see both DBs?

> However, looking in the kernel code, it seems that there is some bug in
> 'first_med_bfreg()', which had to handle this non-default case when there
> are no medium bfregs.
>
> The patch below expects to fix that. Can you please give it a try and
> confirm the fix? It should come in parallel with the candidate patch that
> I sent yesterday for need_uuar_lock() in rdma-core.

Both the rdma-core and kernel patches will fix the problem, given that the
device will see all the doorbells written simultaneously to bfreg_0 without
a lock. In the rdma-core patch, the mapping of the uuarn index to i is
quite specific to the values defined as MLX5 enums.
I would do it the following way, but the final decision is up to you:

 static int need_uuar_lock(struct mlx5_context *ctx, int uuarn)
 {
+	int i;
+
 	if (uuarn == 0 || mlx5_single_threaded)
 		return 0;

-	if (uuarn >= (ctx->tot_uuars - ctx->low_lat_uuars) * 2)
+	i = (uuarn / NUM_BFREGS_PER_UAR * MLX5_NUM_NON_FP_BFREGS_PER_UAR) +
+	    (uuarn % NUM_BFREGS_PER_UAR);
+	if (i >= ctx->tot_uuars - ctx->low_lat_uuars)
 		return 0;

Thanks,
Rohit