Re: [linux-rdma/rdma-core] rxe extended verbs support (#878)

On 12/7/20 1:42 PM, Pearson, Robert B wrote:
> 
> 
> From: Jason Gunthorpe <notifications@xxxxxxxxxx>
> Sent: Monday, December 7, 2020 11:17 AM
> To: linux-rdma/rdma-core <rdma-core@xxxxxxxxxxxxxxxxxx>
> Cc: Pearson, Robert B <robert.pearson2@xxxxxxx>; Author <author@xxxxxxxxxxxxxxxxxx>
> Subject: Re: [linux-rdma/rdma-core] rxe extended verbs support (#878)
> 
> 
> @jgunthorpe requested changes on this pull request.
> 
> ________________________________
> 
> In providers/rxe/rxe_queue.h<https://github.com/linux-rdma/rdma-core/pull/878#discussion_r537679241>:
> 
>>       atomic_store(
>             &q->consumer_index,
>             (atomic_load_explicit(&q->consumer_index, memory_order_relaxed) +
>              1) &
>                q->index_mask);
>  }
>
> +/* Must hold producer_index lock */
> +static inline uint32_t load_producer_index(struct rxe_queue *q)
> +{
> +       return atomic_load_explicit(&q->producer_index,
> +                                  memory_order_relaxed);
> +}
> +
> +/* Must hold producer_index lock */
> +static inline void store_producer_index(struct rxe_queue *q, uint32_t index)
> +{
> +       /* flush writes to work queue before moving index */
> +       atomic_thread_fence(memory_order_release);
> +       atomic_store(&q->producer_index, index);
>
> This combination is just atomic_store_explicit(... memory_order_release);
> 
> ________________________________
> 
> In providers/rxe/rxe_queue.h<https://github.com/linux-rdma/rdma-core/pull/878#discussion_r537679461>:
> 
>> +                                memory_order_relaxed);
> +}
> +
> +/* Must hold producer_index lock */
> +static inline void store_producer_index(struct rxe_queue *q, uint32_t index)
> +{
> +       /* flush writes to work queue before moving index */
> +       atomic_thread_fence(memory_order_release);
> +       atomic_store(&q->producer_index, index);
> +}
> +
> +/* Must hold consumer_index lock */
> +static inline uint32_t load_consumer_index(struct rxe_queue *q)
> +{
> +       return atomic_load_explicit(&q->consumer_index,
> +                                  memory_order_relaxed);
>
> I suspect this should be memory_order_acquire
> 
> ________________________________
> 
> In providers/rxe/rxe_queue.h<https://github.com/linux-rdma/rdma-core/pull/878#discussion_r537679679>:
> 
>> +     atomic_store(&q->producer_index, index);
> +}
> +
> +/* Must hold consumer_index lock */
> +static inline uint32_t load_consumer_index(struct rxe_queue *q)
> +{
> +       return atomic_load_explicit(&q->consumer_index,
> +                                  memory_order_relaxed);
> +}
> +
> +/* Must hold consumer_index lock */
> +static inline void store_consumer_index(struct rxe_queue *q, uint32_t index)
> +{
> +       /* flush writes to work queue before moving index */
> +       atomic_thread_fence(memory_order_release);
> +       atomic_store(&q->consumer_index, index);
>
> atomic_store_explicit
> 
> ________________________________
> 
> In providers/rxe/rxe.c<https://github.com/linux-rdma/rdma-core/pull/878#discussion_r537680251>:
> 
>> @@ -162,20 +162,144 @@ static int rxe_dereg_mr(struct verbs_mr *vmr)
>         return 0;
>  }
>
> +static int cq_start_poll(struct ibv_cq_ex *current,
> +                       struct ibv_poll_cq_attr *attr)
> +{
> +       struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex);
> +
> +       pthread_spin_lock(&cq->lock);
> +
> +       atomic_thread_fence(memory_order_acquire);
>
> atomic_thread_fence/atomic_XX pairs should always be merged into an atomic_XX_explicit operation for performance
> 
> ________________________________
> 
> In providers/rxe/rxe.c<https://github.com/linux-rdma/rdma-core/pull/878#discussion_r537680632>:
> 
>> +
> +       if (attr->wc_flags & IBV_WC_EX_WITH_SL)
> +              cq->vcq.cq_ex.read_sl
> +                      = cq_read_sl;
> +
> +       if (attr->wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS)
> +              cq->vcq.cq_ex.read_dlid_path_bits
> +                      = cq_read_dlid_path_bits;
> +
> +       return &cq->vcq.cq_ex;
> +
> +err_unmap:
> +       if (cq->mmap_info.size)
> +              munmap(cq->queue, cq->mmap_info.size);
> +err_destroy:
> +       (void)ibv_cmd_destroy_cq(&cq->vcq.cq);
>
> No (void) please
> 
>
> 

Jason,

This is the first time I have ever used the atomic_xx APIs, so apologies if I got this wrong.
I updated the PR to what I think is correct and removed the two (void) casts.

There are two types of memory access issues. (1) Between kernel and user space there is no explicit locking;
synchronization is achieved by separate producer and consumer indices, which must be atomic and must fence
the memory operations that access the WC and WR structs before the indices move. This calls for
acquire/release memory ordering. (2) Between separate user threads there is explicit locking, which already
provides a memory barrier, so relaxed memory ordering can be used. The existing code was a bit of a
hodge-podge. I think it is correct now, but it needs a look over. If I understand this correctly, most of
these operations turn into no-ops on x86_64, so I can't really test the ordering here.

Synchronization between kernel threads is also protected by spinlocks. The current PR is complete as far as
user space is concerned, but a similar exercise will be required to optimize the atomic accesses in the
kernel driver and to bring the inline functions in the kernel's rxe_queue.h in line with the user-space
ones. It may make sense to move rxe_queue.h into kernel_headers/rdma/ or to merge it with rdma_user_rxe.h. Thoughts?

Bob


