Re: What is synchronizing MMIO-writes on a shared UAR?


 



My understanding is that an sfence guarantees that all stores issued
before the barrier become visible before any store issued after it.
But the sfence does not impose any order among the stores that precede
the barrier. So when the two writes (by separate threads) to bfreg_0
and bfreg_1 on the same UAR page both occur before the sfence, how are
those writes serialized?

Thanks,
Rohit

On Mon, Mar 12, 2018 at 12:09 AM, Anuj Kalia <anujkaliaiitd@xxxxxxxxx> wrote:
> Why is wc_wmb() not sufficient? It flushes all WC buffers in the CPU
> core (around 10 per core).
>
> On Sun, Mar 11, 2018 at 9:07 PM, Rohit Zambre <rzambre@xxxxxxx> wrote:
>> Hi,
>>
>> When we create 2 QPs each with a separate context, the QPs are
>> naturally assigned to different bfregs on different UAR pages. When we
>> create 2 QPs within the same context, the QPs are assigned to
>> different bfregs but on the same UAR page. The first 2 QPs are
>> assigned to low_lat_uuars and so, no locks are taken while writing to
>> the different bfregs. However, the Mellanox PRM states that doorbells
>> to the same UAR page must be serialized. I see the serialization
>> effect in the attached graph when I post one WQE per ibv_post_send
>> (multi-threaded, one thread per QP, using MOFED). But I am failing
>> to understand how this serialization is enforced in the one-context
>> case:
>>
>> The only synchronization mechanism I see is the sfence barrier. The
>> sfence is imperative in the multiple-threads-per-QP case when the
>> order of doorbells needs to be preserved. But how does this sfence
>> synchronize writes to different bfregs of the same UAR? Since the
>> message size is 2 bytes, each of the 2 QPs' MMIO-writes is only 64
>> bytes. My understanding is that the size of the write-combining buffer
>> is 64 bytes. How many WC buffers are there per UAR page?
>>
>> Here is the doorbell ringing code from MOFED-4.1
>>
>>     case MLX5_DB_METHOD_DEDIC_BF:
>>         /* The QP has dedicated blue-flame */
>>
>>         /*
>>          * Make sure that descriptors are written before
>>          * updating doorbell record and ringing the doorbell
>>          */
>>         wmb();
>>         qp->gen_data.db[MLX5_SND_DBR] = htonl(curr_post);
>>
>>         /* This wc_wmb ensures ordering between DB record and BF copy */
>>         wc_wmb();
>>         if (size <= bf->buf_size / 64)
>>             mlx5_bf_copy(bf->reg + bf->offset, seg,
>>                      size * 64, qp);
>>         else
>>             mlx5_write_db(bf->reg + bf->offset, seg);
>>         /*
>>          * use wc_wmb to ensure write combining buffers are flushed out
>>          * of the running CPU. This must be carried inside the spinlock.
>>          * Otherwise, there is a potential race. In the race, CPU A
>>          * writes doorbell 1, which is waiting in the WC buffer. CPU B
>>          * writes doorbell 2, and its write is flushed earlier. Since
>>          * the wc_wmb is CPU local, this will result in the HCA seeing
>>          * doorbell 2, followed by doorbell 1.
>>          */
>>         wc_wmb();
>>         bf->offset ^= bf->buf_size;
>>         break;
>>
>> Thanks,
>> Rohit


