Re: [PATCH 3/4] IB/uverbs: ex_query_device: check request's comp_mask

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

(adding linux-api@ for comments:

 We're introducing a new "uverb" in the InfiniBand subsystem:
 extended QUERY_DEVICE in v3.19:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.19-rc6#n3297
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/rdma/ib_user_verbs.h?id=v3.19-rc6#n209
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/rdma/ib_user_verbs.h?id=v3.19-rc6#n224

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5a77abf9a97a
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=860f10a799c8

 As you may not already know, InfiniBand subsystem use a write() 
 syscall[1] to issue ioctl()-like operations. Many operations (aka   
 verbs) are available [2], for each there's a query data structures
 and for some there's a response data structure [3]. As a result to a 
 write() operation, kernel could allocate resources on the task behalf 
 and/or write data back to userspace provided buffers whose pointers 
 were part of buffer passed to write().

 I've expressed concern on the way the new extended QUERY_DEVICE
 was trying to be itself expandable: by silently ignoring shorter
 buffer, returning truncated data, the interface seems awkward.

"Re: [PATCH v2 06/17] IB/core: Add support for extended query device caps"
http://mid.gmane.org/1418216676.11111.45.camel@xxxxxxxxxx

"Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps"
http://mid.gmane.org/1418733236.2779.26.camel@xxxxxxxxxx
http://mid.gmane.org/1418824811.3334.3.camel@dworkin
http://mid.gmane.org/1421844612.13543.40.camel@xxxxxxxxxx

 I recognize the author's intention to provide a way for userspace
 to retrieve easily the highest supported information as something
 desirable.

 But I believe we must be more strict on the request content and
 fail for any unrecognized, unsupported, incorrect bits to make
 safer to extend the interface latter.

 I've submitted a patchset[4] to address these issues.
 But, while I'm convinced it the way to go, I'm not able to find
 how it could be made to satisfy the original author expectations.
 
 I hope linux-api@ readers could provide some insight regarding
 the way we should implement such interface

[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_main.c?id=v3.19-rc6#n599

[2] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_main.c?id=v3.19-rc6#n81

[3] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/uapi/rdma/ib_user_verbs.h?id=v3.19-rc6

[4] http://mid.gmane.org/cover.1421931555.git.ydroneaud@xxxxxxxxxx

)

Le dimanche 25 janvier 2015 à 17:23 +0200, Haggai Eran a écrit :
> On 22/01/2015 15:28, Yann Droneaud wrote:
> > This patch ensures the extended QUERY_DEVICE uverbs request's
> > comp_mask has only known values. If userspace returns unknown
> > features, -EINVAL will be returned, allowing to probe/discover
> > which features are currently supported by the kernel.
> 
> This probing process will be much more cumbersome than it needs to be
> because userspace will have to call QUERY_DEVICE repetitively with
> different comp_mask fields until it finds out the exact set of supported
> bits.
> 

O(log2(N))

Or you had to add a different interface, dedicated to retrieve the exact
supported feature mask.

> > Moreover, it also ensure the requested features set in comp_mask
> > are sequentially set, not skipping intermediate features, eg. the
> > "highest" requested feature also request all the "lower" ones.
> > This way each new feature will have to be stacked on top of the
> > existing ones: this is especially important for the request and
> > response data structures where fields are added after the
> > current ones when expanded to support a new feature.
> 
> I think it is perfectly acceptable that not all drivers will implement
> all available features, and so you shouldn't enforce this requirement.

With regard to QUERY_DEVICE: the data structure layout depends on the
comp_mask value. So either you propose a way to express multipart data
structure (see CMSG or NETLINK), or we have to ensure the data structure
is ever-growing, with each new chunck stacked over the existing ones:
that's the purpose of :

	if (cmd.comp_mask & (cmd.comp_mask + 1))
		return -EINVAL;

> Also, it makes the comp_mask nothing more than a wasteful version number
> between 0 and 63.

That's what I've already explained earlier in "Re: [PATCH v3 06/17]
IB/core: Add support for extended query device caps":

http://mid.gmane.org/1421844612.13543.40.camel@xxxxxxxxxx

> 
> In the specific case of QUERY_DEVICE you might argue that there isn't
> any need for input comp_mask, only for output, and then you may enforce
> the input comp_mask will always be zero.

The extended QUERY_DEVICE uverbs as currently merged is using comp_mask
from input to choose to report on-demand-paging related value. So it
seems it's needed.

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.19-rc6#n3297

> However, you will in any case need to be able to extended the size of the response in the future.
> 

That's already the case for on demand paging.

> > 
> > Link: http://mid.gmane.org/cover.1421931555.git.ydroneaud@xxxxxxxxxx
> > Cc: Sagi Grimberg <sagig@xxxxxxxxxxxx>
> > Cc: Shachar Raindel <raindel@xxxxxxxxxxxx>
> > Cc: Eli Cohen <eli@xxxxxxxxxxxx>
> > Cc: Haggai Eran <haggaie@xxxxxxxxxxxx>
> > Signed-off-by: Yann Droneaud <ydroneaud@xxxxxxxxxx>
> > ---
> >  drivers/infiniband/core/uverbs_cmd.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> > index 8668b328b7e6..80a1c90f1dbf 100644
> > --- a/drivers/infiniband/core/uverbs_cmd.c
> > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > @@ -3313,6 +3313,12 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
> >  	if (err)
> >  		return err;
> >  
> > +	if (cmd.comp_mask & (cmd.comp_mask + 1))
> > +		return -EINVAL;
> > +
> > +	if (cmd.comp_mask & ~(__u32)IB_USER_VERBS_EX_QUERY_DEVICE_ODP)
> > +		return -EINVAL;
> > +

If we keep the checks on output buffer size from patch 1, returning
-ENOSPC in case of size mismatch, and if we are sure that no bit in
input comp_mask will ever trigger a call to a kernel function that can
make the uverb fail, the latter check on known value could be dropped,
allowing the userspace to set the highest mask (-1): userspace
will use -ENOSPC to probe the expected size of the response buffer
to match the highest supported comp_mask. But it's going to hurt
userspace application not ready to receive -ENOSPC on newer kernel
with extended QUERY_DEVICE ABI ... Oops.

So in this end, the safest way to ensure userspace is doing the correct
thing so that we have backward and forward compatibility is to check
for known values in comp_mask, check for response buffer size and ensure
that data structure chunk are stacked.

The tighter are the checks now, the easier the interface could be
extended latter.

> >  	if (cmd.reserved)
> >  		return -EINVAL;
> >  
> > 
> 

Regards.

-- 
Yann Droneaud
OPTEYA


--
To unsubscribe from this list: send the line "unsubscribe linux-api" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux