Re: [PATCH v8 02/12] SIW main include file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



-----"Leon Romanovsky" <leon@xxxxxxxxxx> wrote: -----

>To: "Bernard Metzler" <BMT@xxxxxxxxxxxxxx>
>From: "Leon Romanovsky" <leon@xxxxxxxxxx>
>Date: 05/05/2019 07:10PM
>Cc: linux-rdma@xxxxxxxxxxxxxxx, "Bernard Metzler"
><bmt@xxxxxxxxxxxxxxxxxxx>
>Subject: Re: [PATCH v8 02/12] SIW main include file
>
>On Sun, May 05, 2019 at 04:54:50PM +0000, Bernard Metzler wrote:
>> -----"Leon Romanovsky" <leon@xxxxxxxxxx> wrote: -----
>>
>> >To: "Bernard Metzler" <bmt@xxxxxxxxxxxxxx>
>> >From: "Leon Romanovsky" <leon@xxxxxxxxxx>
>> >Date: 04/28/2019 01:07PM
>> >Cc: linux-rdma@xxxxxxxxxxxxxxx, "Bernard Metzler"
>> ><bmt@xxxxxxxxxxxxxxxxxxx>
>> >Subject: Re: [PATCH v8 02/12] SIW main include file
>> >
>> >On Fri, Apr 26, 2019 at 03:18:42PM +0200, Bernard Metzler wrote:
>> >> From: Bernard Metzler <bmt@xxxxxxxxxxxxxxxxxxx>
>> >>
>> >> Signed-off-by: Bernard Metzler <bmt@xxxxxxxxxxxxxx>
>> >> ---
>> >>  drivers/infiniband/sw/siw/siw.h | 733
>> >++++++++++++++++++++++++++++++++
>> >>  1 file changed, 733 insertions(+)
>> >>  create mode 100644 drivers/infiniband/sw/siw/siw.h
>> >>
>> >> diff --git a/drivers/infiniband/sw/siw/siw.h
>> >b/drivers/infiniband/sw/siw/siw.h
>> >> new file mode 100644
>> >> index 000000000000..9a3c2abbd858
>> >> --- /dev/null
>> >> +++ b/drivers/infiniband/sw/siw/siw.h
>> >> @@ -0,0 +1,733 @@
>> >> +/* SPDX-License-Identifier: GPL-2.0 or BSD-3-Clause */
>> >> +
>> >> +/* Authors: Bernard Metzler <bmt@xxxxxxxxxxxxxx> */
>> >> +/* Copyright (c) 2008-2019, IBM Corporation */
>> >> +
>> >> +#ifndef _SIW_H
>> >> +#define _SIW_H
>> >> +
>> >> +#include <linux/idr.h>
>> >> +#include <rdma/ib_verbs.h>
>> >> +#include <linux/socket.h>
>> >> +#include <linux/skbuff.h>
>> >> +#include <linux/in.h>
>> >> +#include <linux/fs.h>
>> >> +#include <linux/netdevice.h>
>> >> +#include <crypto/hash.h>
>> >> +#include <linux/resource.h> /* MLOCK_LIMIT */
>> >> +#include <linux/module.h>
>> >> +#include <linux/version.h>
>> >> +#include <linux/llist.h>
>> >> +#include <linux/mm.h>
>> >> +#include <linux/sched/signal.h>
>> >> +
>> >> +#include <rdma/siw_user.h>
>> >> +#include "iwarp.h"
>> >> +
>> >> +/* driver debugging enabled */
>> >> +#define DEBUG
>> >
>> >I clearly remember that we asked to remove this.
>>
>> Absolutely. Sorry, it sneaked in again since I did some
>> debugging. Will remove...
>> >
>> >> +	spinlock_t lock;
>> >> +
>> >> +	/* object management */
>> >> +	struct idr qp_idr;
>> >> +	struct idr mem_idr;
>> >
>> >Why IDR and not XArray?
>>
>> Memory access keys and QP IDs are generated as random
>> numbers, since both are exposed to the application.
>> Since XArray is not designed for sparsely distributed
>> id ranges, I am still in favor of IDR for these two
>> resources.
>
>Why do those number need to be "sparsely distributed"?
>Isn't this ID to query in some internal object database?
>
>At lease QP number is for sure can be sequential, there is no extra
>value of creating it to be random.
>
Yes, I can drop that for QP's. At the other hand, those
QP numbers are used by the application to reference the
right QP during siw_accept and siw_connect. So for
the application it becomes easy to guess amd hijack
a valid QP number, if it is not random. I am not sure
if the RDMA midlayer takes care if the application uses
the right QP number during accept/connect. I can add
checks into siw (e.g. right PD or some such).

>>
>> >
>> >> +	/* active objects statistics */
>> >
>> >refcount_t please
>> >
>> >> +	atomic_t num_qp;
>> >> +	atomic_t num_cq;
>> >> +	atomic_t num_pd;
>> >> +	atomic_t num_mr;
>> >> +	atomic_t num_srq;
>> >> +	atomic_t num_cep;
>> >> +	atomic_t num_ctx;
>> >> +
>> >
>>
>> These counters are only used to limit the amount of
>> resources allocated to their max values per device.
>>
>> Since there is no equivalent for atomic_inc_return()
>> for refcounters I'd suggest to stay with atomic_t:
>>
>>         refcount_inc(&sdev->num_mr);
>>         if (refcount_read(&sdev->num_mr) > SIW_MAX_MR) {
>>                 siw_dbg_pd(pd, "too many mr's\n");
>>                 rv = -ENOMEM;
>>                 goto err_out;
>>         }
>> vs.
>>         if (atomic_inc_return(&sdev->num_mr) > SIW_MAX_MR) {
>>                 siw_dbg_pd(pd, "too many mr's\n");
>>                 rv = -ENOMEM;
>>                 goto err_out;
>>         }
>>
>
>No problem, anyway we are planning to generalize it and remove from
>all drivers.
>
Okay
>>
>>
>> ><...>
>> >
>> >> +/*
>> >> + * Generic memory representation for registered siw memory.
>> >> + * Memory lookup always via higher 24 bit of STag (STag index).
>> >> + * Object relates to memory window if embedded mr pointer is
>valid
>> >> + */
>> >> +struct siw_mem {
>> >> +	struct siw_device *sdev;
>> >> +	struct kref ref;
>> >
>> ><...>
>> >
>> >> +struct siw_qp {
>> >> +	struct ib_qp base_qp;
>> >> +	struct siw_device *sdev;
>> >> +	struct kref ref;
>> >
>> >I wonder if kref is needed in driver code.
>> >
>> ><...>
>>
>> Memory and QP objects are, while generally maintained
>> by the RDMA midlayer, also guarded against deallocation
>> while still in use by the driver. The code tries to avoid
>> taking resource locks for operations like in flight RDMA reads
>> on a memory object. So it makes use of the release function
>> in kref_put().
>
>krefs are not replacing the locks, but protect from release
>during the operation on that object. I don't understand
>the connection that your draw here between RDMA midlayer
>and in-flight operations.

An in flight RDMA Read/Write to/from a memory must finish before
the memory gets invalidated. I avoid spinlocking + irq
disabling each time memory gets accessed.

>
>> >
>> >> +/* Varia */
>> >
>> >????
>> >
>> right, will remove useless comment.
>>
>> >> +extern void siw_cq_flush(struct siw_cq *cq);
>> >> +extern void siw_sq_flush(struct siw_qp *qp);
>> >> +extern void siw_rq_flush(struct siw_qp *qp);
>> >> +extern int siw_reap_cqe(struct siw_cq *cq, struct ib_wc *wc);
>> >> +extern void siw_print_hdr(union iwarp_hdr *hdr, int qp_id, char
>> >*string);
>> >> +#endif
>> >> --
>> >> 2.17.2
>> >>
>> >
>> >
>>
>
>




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux