On Tue, May 1, 2018 at 2:42 AM, Yishai Hadas <yishaih@xxxxxxxxxxxxxxxxxx> wrote:
> On 4/30/2018 9:23 PM, Rohit Zambre wrote:
>> On Mon, Apr 30, 2018 at 4:14 AM, Yishai Hadas <yishaih@xxxxxxxxxxxxxxxxxx> wrote:
>>> On 4/29/2018 8:34 PM, Rohit Zambre wrote:
>>>> On Sun, Apr 29, 2018 at 11:12 AM, Yishai Hadas <yishaih@xxxxxxxxxxxxxxxxxx> wrote:
>>>>> On 4/23/2018 8:58 PM, Rohit Zambre wrote:
>>>>>> The calculation of need_uuar_lock rightly accounts for at least 1 medium bfreg.
>>>>>
>>>>> The code should support even 0 medium bfregs; see the suggested patch below.
>>>>
>>>> So when the user sets MLX5_TOTAL_UUARS=16 and MLX5_NUM_LOW_LAT_UUARS=15, the first 15 QPs will map to the 15 low-latency bfregs. To which bfreg will the 16th QP map?
>>>
>>> The 16th QP is expected to get bfreg index 0, which the kernel returns once there are *no* low-latency high-class or medium bfregs available. In this case the DB is used in rdma-core instead of BF and no lock is needed (see the usage of bfs[bfi].buf_size).
>>
>> OK, so all the QPs after the 15th one will map to bfreg_0. I am trying to understand why a lock would not be needed on bfreg_0. I know that the DB is a 64-bit atomic. Is there an implicit flush/sfence with the atomic? When CPU_A posts on the 16th QP and CPU_B posts on the 17th QP simultaneously, without the lock, what guarantees that the device will see both DBs?
>>
> It's guaranteed: a 64-bit write is atomic on 64-bit systems, and on 32-bit systems a global lock is taken as part of the mmio_write64_be() implementation.
>
>>> However, looking in the kernel code, it seems that there is a bug in 'first_med_bfreg()', which has to handle this non-default case where there are no medium bfregs.
>>>
>>> The patch below should fix that. Can you please give it a try and confirm the fix? It should go in parallel with the candidate patch for need_uuar_lock() that I sent yesterday for rdma-core.
>>
>> Both the rdma-core and the kernel patch will fix the problem, given that the device will see all the doorbells written simultaneously to bfreg_0 without a lock.
>
> Thanks for confirming those patches; I will take them into our regression system to be formally approved before sending them.

Thanks for looking into it!

-Rohit
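
For readers following the bfreg discussion above, here is a minimal sketch of the locking rule the thread describes. It is an illustration only, not the actual rdma-core need_uuar_lock() or kernel first_med_bfreg() code; the enum and function names are made up. The point taken from the thread is that bfreg index 0 (the fallback once no low-latency and no medium bfregs remain) is rung through a plain 64-bit DB write and needs no lock, a low-latency bfreg is dedicated to a single QP, and only a shared medium bfreg requires the lock.

#include <stdbool.h>

/*
 * Illustrative sketch only -- not the actual rdma-core need_uuar_lock()
 * or kernel first_med_bfreg() logic. It just encodes the rule described
 * in the thread: the DB fallback and dedicated low-latency bfregs need
 * no lock; only a shared medium bfreg does.
 */
enum bfreg_kind {
	BFREG_FALLBACK_DB,	/* bfreg index 0: plain 64-bit DB write */
	BFREG_LOW_LAT,		/* low-latency bfreg dedicated to one QP */
	BFREG_MEDIUM,		/* medium bfreg shared among several QPs */
};

static bool bfreg_needs_lock(enum bfreg_kind kind)
{
	switch (kind) {
	case BFREG_FALLBACK_DB:
		return false;	/* single atomic doorbell write, no BlueFlame copy */
	case BFREG_LOW_LAT:
		return false;	/* only one QP ever writes to it */
	case BFREG_MEDIUM:
	default:
		return true;	/* concurrent QPs share the BlueFlame buffer */
	}
}

With MLX5_TOTAL_UUARS=16 and MLX5_NUM_LOW_LAT_UUARS=15 there are no medium bfregs at all, so only the first two cases ever occur; that zero-medium configuration is exactly what the two candidate patches make both rdma-core and the kernel handle.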
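
The atomicity argument for the unlocked bfreg_0 doorbell can also be shown as a sketch. This is an assumption-laden illustration, not the real mmio_write64_be() from rdma-core: ring_db64_be() and db_global_lock are hypothetical names, and the real code may differ (for example in how it handles write-combining flushes, which the thread does not settle). The only idea shown is the one stated above: a 64-bit MMIO store cannot tear on a 64-bit CPU, while on a 32-bit CPU the two half-stores must be serialized by a single global lock.

#include <stdint.h>
#include <pthread.h>
#include <endian.h>

/* Hypothetical global lock, assumed to be initialized at library load. */
static pthread_spinlock_t db_global_lock;

static inline void ring_db64_be(volatile void *db_reg, uint64_t val)
{
	uint64_t be = htobe64(val);	/* device expects big-endian doorbells */

#if UINTPTR_MAX == 0xffffffffffffffffULL
	/* 64-bit CPU: one aligned store; concurrent ringers cannot tear it. */
	*(volatile uint64_t *)db_reg = be;
#else
	/*
	 * 32-bit CPU: the value goes out as two 32-bit stores, so a global
	 * lock keeps another CPU's doorbell from interleaving between them.
	 */
	const uint32_t *half = (const uint32_t *)&be;

	pthread_spin_lock(&db_global_lock);
	((volatile uint32_t *)db_reg)[0] = half[0];
	((volatile uint32_t *)db_reg)[1] = half[1];
	pthread_spin_unlock(&db_global_lock);
#endif
}

This matches the answer in the thread: two CPUs ringing bfreg_0 for different QPs cannot produce a torn doorbell value, so no per-bfreg lock is required on that path.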