Re: [PATCHv12 0/3] rdmacg: IB/core: rdma controller support

On Mon, Oct 31, 2016 at 12:24 PM, Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> On Thu, Oct 20, 2016 at 01:48:27AM +0530, Parav Pandit wrote:
>> On Thu, Oct 20, 2016 at 1:35 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
>> > Hello, Parav.
>> >
>> > On Thu, Oct 20, 2016 at 01:24:42AM +0530, Parav Pandit wrote:
>> >> userland can get the max numbers using other framework which is used
>> >> by control & data plane available in C library form or in form of
>> >> system tools.
>> >> I was preferring to get and set through same interface because,
>> >> It simplifies user land software which is often not written in C so
>> >> its likely that it needs to rely on system tools and parse the
>> >> content, iterate through devices etc.
>> >> Getting these info through rdma.max just makes it simple. There will
>> >> be logic built to read/write rdma.max in userland anyway, which can be
>> >> leveraged for percentage calculation instead of doing it from two
>> >> places.
>> >
>> > Yeah, I get that this can be convenient in this case but it isn't a
>> > generic approach.  I'd much prefer keeping it in line with other
>> > resources.
>> >
>> Hmm. we don't have /proc/sys/kernel/pid_max type of simple interface
>> to get the max values for rdma resources.
>> rdma.max is close to that simplicity.
>
> Sorry for my late response (very long weekends and piles of mails after it) and
> for not clarifying our requirements better, which are very simple.
>
> 1. We will have vendor specific objects in the future (the new ABI
> supports it and is designed for that).
I will let others comment on it. The patch_v11 design allowed
vendor-specific objects and standard objects to be defined in IB core,
with the rdma cgroup acting as the facilitator. We didn't reach
consensus on that approach.

> 2. We don't want to fight for every addition of such objects to cgroup list.
Same comment as above.

> 3. We don't want to teach and/or rewrite scripts for "average" user after
> addition of new objects.
We can possibly do this by having a new rdma.percentage knob, which
gets configured by default for every new object in the rdma cgroup.
This way the average user/administrator doesn't have to know about it.

> 4. Cgroup configuration should be as close as possible to "standard" if
> such exists, so all infinite internet guides will work for RDMA too.
I didn't follow this comment. Can you please explain? Are you saying
the rdma cgroup should define all the objects of the IB spec?
>
> From my understanding of current status.
> My naive approach of introducing GLOBAL_HCA object is the way to go and the real question
> is to understand how to configure it, am I right?
>
A global object won't work, for the reason below.
Let's take an example to make this easier.
Let's say two new RDMA objects exist which are not part of the rdma
cgroup's standard resource definition, say, indirection tables and
PSM tags.
Both are abstracted using one GLOBAL_HCA resource object.
Say it is given 10%.
Now IB core charges each such object against GLOBAL_HCA
(because at the cgroup level there is only one object, GLOBAL_HCA).
So two or more resources are mapped to a single object.
This means one resource can be charged more than the other while the
total still stays under the 10% limit, which leads to the same problem
as not having a cgroup at all.
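
To make the example above concrete, here is a small user-space model
(not kernel code). The Pool class, the device capacities, and the way
the 10% budget is computed from the combined capacity are all
assumptions made for illustration; none of them come from the patch
set.

```python
# Illustrative model only; all names and numbers are assumptions.

class Pool:
    """A counter with a hard limit, loosely modelled on a cgroup resource counter."""

    def __init__(self, limit):
        self.limit = limit
        self.usage = 0

    def try_charge(self, n=1):
        # Refuse the charge if it would push usage past the limit.
        if self.usage + n > self.limit:
            return False
        self.usage += n
        return True


# Assumed device-wide capacities for the two hypothetical resource types.
HCA_MAX = {"indirection_table": 100, "psm_tag": 1000}
SHARE = 10  # percent given to the cgroup

# Case 1: a single GLOBAL_HCA counter covers both resource types.
# Assumption: its budget is 10% of the combined capacity.
global_hca = Pool(sum(HCA_MAX.values()) * SHARE // 100)
granted = {name: 0 for name in HCA_MAX}
for _ in range(HCA_MAX["indirection_table"]):
    if global_hca.try_charge():
        granted["indirection_table"] += 1

# Case 2: one counter per resource type, each limited to 10% of its own capacity.
per_type = {name: Pool(cap * SHARE // 100) for name, cap in HCA_MAX.items()}
granted_split = {name: 0 for name in HCA_MAX}
for _ in range(HCA_MAX["indirection_table"]):
    if per_type["indirection_table"].try_charge():
        granted_split["indirection_table"] += 1

print("shared counter:   ", granted, "usage", global_hca.usage, "of", global_hca.limit)
print("per-type counters:", granted_split)
```

With the shared counter, the cgroup takes every indirection table on
the device while staying inside its budget; with per-type counters it
is stopped at its share of that one resource.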

So my opinion is:
(a) Let the cgroup define the current standard objects and, in future,
a reasonable set of new vendor-specific objects.
(b) Add a new rdma.percentage parameter so that any new standard or
vendor-specific object can be abstracted away from the average end
user and from applications which are yet to catch up.
I believe this takes care of your points (1), (3) and (4)?
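
As a rough sketch of how (b) could behave, the snippet below derives
per-object limits from one percentage value. Everything here is
hypothetical: the function name, the object names, the device maxima,
and the rule that an explicitly set limit overrides the
percentage-derived default are assumptions, since the exact semantics
of rdma.percentage are not settled in this thread.

```python
def limits_from_percentage(device_max, percentage, explicit=None):
    """Derive per-object limits from a single percentage value.

    device_max: mapping of object name -> device-wide maximum (assumed known).
    percentage: hypothetical rdma.percentage value, 0..100.
    explicit:   optional per-object limits set by the administrator; assumed
                to take precedence over the percentage-derived default.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    explicit = explicit or {}
    limits = {}
    for name, cap in device_max.items():
        if name in explicit:
            # Never allow more than the device can provide.
            limits[name] = min(explicit[name], cap)
        else:
            limits[name] = cap * percentage // 100
    return limits


# Illustrative object names and maxima.
device_max = {"qp": 2000, "mr": 5000}
before = limits_from_percentage(device_max, 10)

# A new object appears later; the same percentage covers it without
# the administrator having to learn about it.
device_max["psm_tag"] = 1000
after = limits_from_percentage(device_max, 10, explicit={"qp": 50})

print(before)
print(after)
```

The point of the sketch is that a new object picks up a limit from the
existing percentage, so scripts written before the object existed keep
working unchanged.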

In another hypothetical design,
we could have the rdma cgroup act as just a pid-to-cgroup mapping
facilitator. All the charging/uncharging logic moves to IB core in the
form of a library that the standard uverbs ABI and the vendor-specific
layer invoke. In this approach, code will be duplicated in every such
vendor driver. More callbacks will also have to be pushed down to IB
core and the vendor drivers for cgroup creation/deletion/offline etc.

This also means that the lack of standard object definitions may
create more confusion for end users and orchestration applications.
I prefer to avoid such a design.

Parav