Re: [Intel-gfx] [PATCH v6 0/5] drm/i915: Expose more GPU properties through sysfs

On 13/12/17 08:17, Daniel Vetter wrote:
On Tue, Dec 12, 2017 at 02:33:38PM +0000, Lionel Landwerlin wrote:
On 12/12/17 11:19, Tvrtko Ursulin wrote:
On 11/12/2017 21:05, Daniel Vetter wrote:
On Mon, Dec 11, 2017 at 02:38:53PM +0000, Tvrtko Ursulin wrote:
On 11/12/2017 10:50, Joonas Lahtinen wrote:
+ Daniel, Chris

On Thu, 2017-12-07 at 09:21 +0000, Tvrtko Ursulin wrote:
On 04/12/2017 15:02, Lionel Landwerlin wrote:
Hi,

After discussion with Chris, Joonas & Tvrtko, this series adds an
additional commit to link the render node back to the card through a
symlink, making it obvious to an application using a render node where
to get the information it needs.

Another important thing to mention is that it is trivial to get from
the master drm fd to the sysfs root, via fstat() and opendir() on
/sys/dev/char/<major>:<minor>. With the addition of the card symlink
to render nodes, the same is trivial for a render node fd as well.
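
For illustration, a minimal userspace sketch of that lookup (the helper
name and the error handling here are my own assumption, not part of the
series):

#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/* Hypothetical helper: resolve the sysfs directory backing an already
 * open DRM fd (card or render node). */
static int drm_fd_to_sysfs_path(int fd, char *buf, size_t len)
{
        struct stat st;

        if (fstat(fd, &st) || !S_ISCHR(st.st_mode))
                return -1;

        /* /sys/dev/char/<major>:<minor> is a symlink to the device's
         * sysfs node; the result can be handed straight to opendir(). */
        snprintf(buf, len, "/sys/dev/char/%u:%u",
                 major(st.st_rdev), minor(st.st_rdev));
        return 0;
}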

I am happy with this approach - it is extensible, flexible and avoids
issues with ioctl versioning or whatnot. With one value per file it is
trivial for userspace to access.

So as far as I'm concerned, given that gputop would use all of this and
so be the userspace consumer, if everyone else is happy, I think we
could do a detailed review and perhaps also think about including
gputop in some distribution to make the case 100% straightforward.
For the GPU topology I agree this is the right choice, it's going to
be about topology after all, and a directory tree is the perfect
candidate. And if a new platform appears, then it's a new platform and
may change the topology; well, the hardware topology has changed.

For the engine enumeration, I'm not equally sold on exposing it in
sysfs. From userspace's point of view it's a "linear list of engine
instances with flags". And it's also information about what to pass to
an IOCTL as arguments once a decision has been made, and at that point
you already have the FD you know you'll be dealing with at hand. So
another IOCTL for that seems more convenient.
Apart from being more flexible and easier to extend, sysfs might be a
better fit for applications which do not otherwise need a drm fd. Say
a top-like tool which shows engine utilization, or those patches I
RFC-ed recently which do the same but per DRM client.

Okay, these stats are now also available via PMU, so the argument is
not the strongest, I admit, but I still find it quite neat. It also
might allow us to define our own policy with regard to the privilege
needed to access these stats, and not be governed by the perf API
rules.
How exactly is sysfs easier to extend than ioctl? There's lots of
serializing and deserializing going on, ime that's more boilerplate.
Imo the only reason for sysfs is when you _must_ access it without
having an fd to the gpu. The inverse is generally not true (i.e. using
sysfs when you have the fd already), and as soon as you add a
writeable field here you're screwed (because sysfs can't be passed to
anyone else but root, compared to a drm fd - viz the backlight
fiasco).

Easier as in no need to version, add has_this/has_that markers, or try
to guess today how big a field for something might need to be in the
future, and similar.
I would perhaps expand the "must access without having a drm fd" to
"more convenient to access without a drm fd". My use case here was the
per-client engine usage stats. Equivalence with /proc/<pid>/stat, or
even /proc/stat if you want. So I was interested in creating a foothold
in sysfs for that purpose.

No writable fields were imagined in any of these two or three use cases.

But even without writeable fields: Think of highly contained
containers/clients which only get the drm fd to render. If sysfs is
gone, will they fall on their faces? Atm "drm fd is all you need" is
very deeply ingrained into our various OS stacks.
Okay, this is something which was mentioned but I think the answer was
unclear. If current stacks do work without sysfs then this is a good
argument to keep that ability.

As I said, I am OK to drop the engine info bits from this series. The
question for Lionel, gpu-top and Mesa then is whether sysfs works for
them for the remaining topology information. The attractiveness of
sysfs there was that it looked easier to make future-proof for more or
less hypothetical future topologies.
We did a couple of versions with ioctls:

https://patchwork.freedesktop.org/patch/185959/ (through GET_PARAM)
https://patchwork.freedesktop.org/series/33436/ (through a new
discovery uAPI, initially targeted at engine discovery)

Eventually Chris suggested sysfs (which I find kind of convenient),
even though you, Daniel, raised a valid point about sandboxed
processes.

I personally don't care much in which way this should be implemented.
Just give me some direction :)

Daniel: Would something in the style of
https://patchwork.freedesktop.org/series/33436/ work? If yes, what
would you recommend changing?
tbh I feel bad derailing this even further, but I think if you want a more
flexible query ioctl that's easier to extend we could maybe look at the
chunk list approach amdgpu_cs_ioctl uses: All the massive pile of execbuf
metadata is supplied in a chunk list, and if you want to extend things
or add a new type, you just add a new chunk_id. The chunks themselves
are all
linearized into one allocation.

We could use the exact same trick for output, and require that userspace
simply jumps over chunks it doesn't understand. Pronto, you have your
insta-extensibility.

If you want to go fancy, add a huge flags field so you can select what
kinds of chunks you want (you need the flags field anyways).
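
To make the idea concrete, here is a rough sketch of what such a
self-describing output chunk could look like from the userspace side
(every name below is made up for illustration, none of it is an
existing or proposed uAPI):

#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk layout, purely illustrative -- not an existing
 * i915 struct. Each chunk is self-describing: an id plus its total
 * length. */
struct query_chunk {
        uint32_t chunk_id;  /* e.g. TOPOLOGY, ENGINES, ... */
        uint32_t length;    /* chunk size in bytes, header included */
        uint8_t  data[];
};

/* Walk the linearized buffer; chunk ids the caller does not understand
 * are simply jumped over. */
static void for_each_chunk(const uint8_t *buf, size_t total,
                           void (*cb)(const struct query_chunk *))
{
        size_t offset = 0;

        while (offset + sizeof(struct query_chunk) <= total) {
                const struct query_chunk *c =
                        (const struct query_chunk *)(buf + offset);

                if (c->length < sizeof(*c) || c->length > total - offset)
                        break;               /* malformed chunk, bail out */

                cb(c);                       /* callback ignores unknown ids */
                offset += c->length;         /* skipping is just a length jump */
        }
}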

Querying would be the usual two-step process:

1. An ioctl call with chunk_size == 0, to ask the kernel how much
memory userspace needs to allocate. The kernel would just walk all the
available chunks and add up their sizes (with a bit of seq_file style
abstraction this could look very neat, even when dealing with a huge
number of chunks).

2. Userspace allocates a sufficient amount of memory for all the
chunks it wants, calls the ioctl again, and the kernel fills them all
out into this single array.
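
Roughly, the userspace side of that dance could look like this (the
ioctl number, struct and field names below are all invented for the
sake of the example; only the two-step pattern is the point):

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/* Hypothetical query struct and ioctl number -- both made up. */
struct hypothetical_query {
        uint64_t flags;       /* select which kinds of chunks are wanted */
        uint32_t chunk_size;  /* 0 on the first call, buffer size afterwards */
        uint32_t pad;
        uint64_t chunks_ptr;  /* pointer to the output buffer */
};

#define HYPOTHETICAL_IOCTL_QUERY \
        _IOWR('x', 0x42, struct hypothetical_query)

static void *query_chunks(int fd, uint64_t flags, uint32_t *size_out)
{
        struct hypothetical_query q = { .flags = flags };
        void *buf;

        /* Step 1: chunk_size == 0, the kernel reports how much space
         * all the requested chunks need. */
        if (ioctl(fd, HYPOTHETICAL_IOCTL_QUERY, &q))
                return NULL;

        /* Step 2: allocate that much and let the kernel fill every
         * chunk out into the single buffer. */
        buf = malloc(q.chunk_size);
        if (!buf)
                return NULL;
        q.chunks_ptr = (uintptr_t)buf;
        if (ioctl(fd, HYPOTHETICAL_IOCTL_QUERY, &q)) {
                free(buf);
                return NULL;
        }

        *size_out = q.chunk_size;
        return buf;
}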

Aside: VBT almost works like that (minus the few places they screwed up
and you need to know the size of the next block).

Okay, cool. Looks a lot like how some vulkan entrypoints work.
That's also not too dissimilar from what was done in the second ioctl attempt. I guess I could rename a few things and drop the engine discovery (at least initially) and submit again.

Thanks!


Cheers, Daniel

Thanks!

- Lionel

Regards,

Tvrtko


-Daniel

So I'd say for the GPU topology part, we go forward with the review
and make sure we don't expose driver internal bits that could break
when refactoring code. If the exposed N bits of information are
strictly tied to the underlying hardware, we should have no trouble
maintaining that for the foreseeable future.

Then we can continue on about the engine discovery in parallel, not
blocking GPU topology discovery.
I can live with that, but would like to keep the gt/engines/ namespace
reserved for the eventuality that we go with engine info in sysfs at a
later stage then.

Also, Lionel, did you have plans to use the engine info straight away
in gputop, or did you only need the topology? I think you were drawing
a nice block diagram of a GPU, so do you need it for that?

Regards,

Tvrtko





