Re: [PATCH 02/10] IB/uverbs: Build the specs into a radix tree at runtime

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 06, 2018 at 01:30:20PM -0600, Jason Gunthorpe wrote:
> On Mon, Aug 06, 2018 at 08:32:35AM +0300, Leon Romanovsky wrote:
> > On Fri, Aug 03, 2018 at 01:31:35PM -0600, Jason Gunthorpe wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> > >
> > > This radix tree datastructure is intended to replace the 'hash' structure
> > > used today for parsing ioctl methods during system calls. This first
> > > commit introduces the structure and builds it from the existing .rodata
> > > descriptions.
> > >
> > > The so-called hash arrangement is actually a 5 level open coded radix tree.
> > > This new version uses a 3 level radix tree built using the radix tree
> > > library.
> > >
> > > Overall this is much less code and much easier to build as the radix tree
> > > API allows for dynamic modification during the building. There is a small
> > > memory penalty to pay for this, but since the radix tree is allocated on
> > > a per device basis, a few kb of RAM seems immaterial considering the
> > > gained simplicity.
> > >
> > > The radix tree is similar to the existing tree, but also has a 'attr_bkey'
> > > concept, which is a small value'd index for each method attribute. This is
> > > used to simplify and improve performance of everything in the next
> > > patches.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> > >  drivers/infiniband/core/Makefile      |   3 +-
> > >  drivers/infiniband/core/rdma_core.h   |  50 ++++
> > >  drivers/infiniband/core/uverbs.h      |   1 +
> > >  drivers/infiniband/core/uverbs_main.c |  14 +-
> > >  drivers/infiniband/core/uverbs_uapi.c | 350 ++++++++++++++++++++++++++
> > >  include/rdma/uverbs_ioctl.h           | 137 ++++++++++
> > >  6 files changed, 552 insertions(+), 3 deletions(-)
> > >  create mode 100644 drivers/infiniband/core/uverbs_uapi.c
> > >
> >
> > <...>
> >
> > > +static inline __attribute_const__ u32 uapi_key_attrs_start(u32 ioctl_method_key)
> > > +{
> > > +	/* 0 is the method slot itself */
> > > +	return ioctl_method_key + 1;
> > > +}
> >
> > <...>
> >
> > > +/*
> > > + * This returns a value in the range [0 to UVERBS_API_ATTR_BKEY_LEN),
> > > + * basically it undoes the reservation of 0 in the ID numbering. attr_key
> > > + * must already be masked with UVERBS_API_ATTR_KEY_MASK, or be the output of
> > > + * uapi_key_attr().
> > > + */
> > > +static inline __attribute_const__ u32 uapi_bkey_attr(u32 attr_key)
> > > +{
> > > +	return attr_key - 1;
> > > +}
> >
> > It is very unclear that attr_key starts from 1 and not from 0.
>
> There are lots of comments talking about the construction of the radix
> tree key, including the one you clipped above that directly says what
> the zero index is, and the larger comment at the top explaining that 0
> is always used for something.
>
> Maybe it should be more explicit??
>
> There are three 'identifiers' for each attribute:
>  1) The kernel enum value (ie something like
>      UVERBS_ATTR_ALLOC_DM_HANDLE)
>     This is a 16 bit value
>  2) The radix tree leaf key (called attr_key). This is the 16 bit
>     value manipulated down to 6 bits, with 0 reserved
>  3) The bitmap/linear array index (called attr_bkey). This is the 16 bit
>     value manipulated down to 6 bits, without 0 reserved.
>
> Would the above help in one of the comment blocks?

I think that the best will be to have proper description under
Documentation/ folder.

>
> The manipulation algorithm is best understood by reading the code I
> think. It has been designed to minimize the amount of memory required
> during command execution, which allows commands to avoid the kmalloc
> overhad for each syscall.
>
> I admit this approach is tricky without a solid understanding of how a
> radix tree works and what factors are needed to optimize it, but the
> technique of using the '0' bits of the suffix turns out to be a very
> optimal way to use this datastructure.
>
> Ie review the attribute search code in the next patch, it is basically
> a constant time, single array offset calcuation due to this
> construction.
>
> This is part of why this serise gets the 6% performance win.
>
> > Other than that, I have no clue how this patch can be properly reviewed.
>
> Well, it isn't very complicated, particularly compared to the last
> series..
>
> Look at the radix tree construction code and see if there are the
> usual classes of issues. Check if the semantics of the new
> data structure make sense and have enough clarity.
>
> This will become an important datastructure so it is important that
> this code very readable and clear.

The radix tree API doesn't really fit to this requirement (readable and
clear). For example the article about xArray talks about it [1].

Thanks

[1] https://lwn.net/Articles/745073/

>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux