On 12/06/18 10:20, Joonas Lahtinen wrote:
Quoting Chris Wilson (2018-06-11 18:02:37)
Quoting Lionel Landwerlin (2018-06-11 14:46:07)
On 11/06/18 13:10, Tvrtko Ursulin wrote:
On 30/05/2018 15:33, Lionel Landwerlin wrote:
There are concerns about denial of service around the per context sseu
configuration capability. In a previous commit introducing the
capability we allowed it only for capable users. This change adds a
new debugfs entry to let any user configure their own context
powergating setup.
As far as I understood it, Joonas' concerns here are:
1) That in the containers use case individual containers wouldn't be
able to turn on the sysfs toggle for themselves.
2) That also in the containers use case if box admin turns on the
feature, some containers would potentially start negatively affecting
the others (via the accumulated cost of slice re-configuration on
context switching).
I am not familiar with typical container setups to be authoritative
here, but intuitively I find it reasonable that a low-level hardware
switch like this would be under the control of a master domain
administrator. ("If you are installing our product in the container
environment, make sure your system administrator enables this hardware
feature.", "Note to system administrators: Enabling this features may
negatively affect the performance of other containers.")
An alternative proposal is for the i915 to apply an "or" filter on all
requested masks and in that way ensure dynamic re-configuration
doesn't happen on context switches, but is instead driven from
userspace via ioctls.
In other words, should _all_ userspace agree between themselves that
they want to turn off a slice, they would then need to send out a
concerted ioctl storm, where the number of needed ioctls equals the number
of currently active contexts. (This may have its own performance
consequences caused by the barriers needed to modify all context images.)
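
To make the "or" filter concrete, here is a rough sketch (every name
here is made up for illustration, this is not actual driver code):

#include <linux/list.h>
#include <linux/types.h>

/* One entry per active context's requested slice mask. */
struct sseu_request {
	struct list_head link;
	u8 slice_mask;
};

/*
 * The effective mask is the union of all requests, so a slice only
 * powers down once every context has stopped asking for it.
 */
static u8 effective_slice_mask(struct list_head *requests, u8 full_mask)
{
	struct sseu_request *req;
	u8 mask = 0;

	list_for_each_entry(req, requests, link)
		mask |= req->slice_mask;

	return mask ?: full_mask;	/* no requests: keep everything on */
}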
This was deemed acceptable for the media use case, but my concern is
the approach is not elegant and will tie us with the "or" policy in
the ABI. (Performance concerns I haven't evaluated yet, but they also
may be significant.)
If we go back thinking about the containers use case, then it
transpires that even though the "or" policy does prevent one container
from affecting the other from one angle, it also prevents one
container from exercising the feature unless all containers co-operate.
As such, we can view the original problem statement where we have an
issue if not everyone co-operates, as conceptually the same just from
an opposite angle. (Rather than one container imposing the increased
cost of context switches on the rest, we would have one container
denying the optimized slice configuration to the others.)
From this it follows that both proposals require complete co-operation
from all running userspace before anyone gets full control of the feature.
Since the balance between the benefit of optimized slice configuration
(or penalty of suboptimal one), versus the penalty of increased
context switch times, cannot be known by the driver (barring venturing
into the heuristics territory), that is another reason why I find the
"or" policy in the driver questionable.
We can also ask a question of - If we go with the "or" policy, why
require N per-context ioctls to modify the global GPU configuration
and not instead add a global driver ioctl to modify the state?
If a future hardware requires, or enables, the per-context behaviour
in a more efficient way, we could then revisit the problem space.
In the meantime I see the "or" policy solution as adding some ABI
which doesn't do anything for many use cases without any way for the
sysadmin to enable it. At the same time master sysfs knob at least
enables the sysadmin to make a decision. Here I am thinking about a
random client environment where not all userspace co-operates; for
instance, the user is running a feature-aware media stack and a
non-feature-aware OpenCL/3D stack.
I guess the complete story boils down to - is the master sysfs knob
really a problem in container use cases?
Regards,
Tvrtko
Hey Tvrtko,
Thanks for summarizing a bunch of discussions.
Essentially I agree with everything you wrote above.
If we have a global setting (determined by the OR policy), what's the
point of per context settings?
In Dmitry's scenario, all userspace applications will work together to
reach a consensus, so it sounds like we're reimplementing a policy
that already exists in userspace.
Anyway, I'm implementing Joonas' suggestion. Hopefully somebody other
than me will pick one or the other :)
I'll just mention the voting/consensus approach to see if anyone else
likes it.
Each context has a CONTEXT_PARAM_HINT_SSEU { small, dontcare, large }
(or some other abstract names).
Yeah, the param name should have the word _HINT_ in it when it's not a
definitive set.
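
Roughly along these lines (purely illustrative, nothing here is
committed uapi, and the param number is deliberately left out):

/* Illustrative only, not a committed interface. */
enum drm_i915_sseu_hint {
	I915_SSEU_HINT_DONTCARE = 0,	/* default: no preference */
	I915_SSEU_HINT_SMALL,		/* prefer a reduced slice config */
	I915_SSEU_HINT_LARGE,		/* prefer full slice enablement */
};

Userspace would set it through the usual
DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM path with a hypothetical
I915_CONTEXT_PARAM_SSEU_HINT param.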
There's no global setter across containers, only the scenario where
everyone agrees (or doesn't). Tallying up the votes and going with a majority
vote might be an option, too.
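
A sketch of what the tally could look like (again just illustrative,
reusing the hypothetical enum above):

static enum drm_i915_sseu_hint tally_hints(const enum drm_i915_sseu_hint *hints,
					   unsigned int count)
{
	unsigned int small = 0, large = 0, i;

	for (i = 0; i < count; i++) {
		if (hints[i] == I915_SSEU_HINT_SMALL)
			small++;
		else if (hints[i] == I915_SSEU_HINT_LARGE)
			large++;
	}

	/* DONTCARE votes abstain; a tie keeps the current configuration. */
	if (small == large)
		return I915_SSEU_HINT_DONTCARE;
	return small > large ? I915_SSEU_HINT_SMALL : I915_SSEU_HINT_LARGE;
}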
Regards, Joonas
Trying to test the "everyone agrees" approach here.
There are a number of processes that can hold onto a gem context and
therefore prevent agreement.
On my system plymouthd & systemd-logind have a number of contexts open.
A process simply opening & closing a render node could flip the system
back and forth between two configurations.
Does that change people's mind about how we should go about this?
-
Lionel
Then whenever the host cares, they can evaluate the set of hints
provided and make a choice on sseu config. One presumes a simple greater
good method (but you could extend that to include batch
frequency/duration to try and determine system impact on one setting or
another). Keeping it a hint helps reduce the effect of policy, though it
may still be policy and merit a switch for different implementations (or
BPF!).
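
Purely as a sketch of that weighting idea (every name here is
invented, and the busy-time accounting is hand-waved):

/*
 * Weight each context's hint by how busy it has recently been, so an
 * idle context that merely holds a handle barely sways the outcome.
 */
struct hint_sample {
	enum drm_i915_sseu_hint hint;
	u64 busy_ns;	/* recent GPU time attributed to the context */
};

static enum drm_i915_sseu_hint evaluate_hints(const struct hint_sample *s,
					      unsigned int count)
{
	u64 small = 0, large = 0;
	unsigned int i;

	for (i = 0; i < count; i++) {
		if (s[i].hint == I915_SSEU_HINT_SMALL)
			small += s[i].busy_ns;
		else if (s[i].hint == I915_SSEU_HINT_LARGE)
			large += s[i].busy_ns;
	}

	if (small == large)
		return I915_SSEU_HINT_DONTCARE;	/* tie: keep current config */
	return small > large ? I915_SSEU_HINT_SMALL : I915_SSEU_HINT_LARGE;
}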
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx