Re: [PATCH 4/4] drm/xe/guc: Expose raw access to GuC log over debugfs

Matthew Brost <matthew.brost@xxxxxxxxx> · Tue, 14 May 2024 20:31:47 +0000

On Tue, May 14, 2024 at 11:13:14AM -0700, John Harrison wrote:
> On 5/14/2024 07:58, Michal Wajdeczko wrote:
> > On 13.05.2024 18:53, John Harrison wrote:
> > > On 5/12/2024 08:36, Michal Wajdeczko wrote:
> > > > We already provide the content of the GuC log in debugsfs, but it
> > > > is in a text format where each log dword is printed as hexadecimal
> > > > number, which does not scale well with large GuC log buffers.
> > > > 
> > > > To allow more efficient access to the GuC log, which could benefit
> > > > our CI systems, expose raw binary log data.  In addition to less
> > > > overhead in preparing text based GuC log file, the new GuC log file
> > > > in binary format is also almost 3x smaller.
> > > > 
> > > > Any existing script that expects the GuC log buffer in text format
> > > > can use command like below to convert from new binary format:
> > > > 
> > > >      hexdump -e '4/4 "0x%08x " "\n"'
> > > > 
> > > > but this shouldn't be the case as most decoders expect GuC log data
> > > > in binary format.
> > > I strongly disagree with this.
> > > 
> > > Efficiency and file size is not an issue when accessing the GuC log via
> > > debugfs on actual hardware.
> > to some extend it is as CI team used to refuse to collect GuC logs after
> > each executed test just because of it's size
> I've never heard that argument. I've heard many different arguments but not
> one about file size. The default GuC log size is pretty tiny. So size really
> is not an issue.
> 
> > 
> > > It is an issue when dumping via dmesg but
> > > you definitely should not be dumping binary data to dmesg. Whereas,
> > not following here - this is debugfs specific, not a dmesg printer
> Except that it is preferable to have common code for both if at all
> possible.
> 
> > 
> > > dumping in binary data is much more dangerous and liable to corruption
> > > because some tool along the way tries to convert to ASCII, or truncates
> > > at the first zero, etc. We request GuC logs be sent by end users,
> > > customer bug reports, etc. all doing things that we have no control over.
> > hmm, how "cp gt0/uc/guc_log_raw FILE" could end with a corrupted file ?
> Because someone then tries to email it, or attach it or copy it via Windows
> or any number of other ways in which a file can get munged.
> 
> > 
> > > Converting the hexdump back to binary is trivial for those tools which
> > > require it. If you follow the acquisition and decoding instructions on
> > > the wiki page then it is all done for you automatically.
> > I'm afraid I don't know where this wiki page is, but I do know that hex
> > conversion dance is not needed for me to get decoded GuC log the way I
> > used to do
> Look for the 'GuC Debug Logs' page on the developer wiki. It's pretty easy
> to find.
> 
> > > These patches are trying to solve a problem which does not exist and are
> > > going to make working with GuC logs harder and more error prone.
> > it at least solves the problem of currently super inefficient way of
> > generating the GuC log in text format.
> > 
> > it also opens other opportunities to develop tools that could monitor or
> > capture GuC log independently on  top of what driver is able to offer
> > today (on i915 there was guc-log-relay, but it was broken for long time,
> > not sure what are the plans for Xe)
> > 
> > also still not sure how it can be more error prone.
> As already explained, the plan is move to LFD - an extensible, streamable,
> logging format. Any non-trivial effort that is not helping to move to LFD is
> not worth the effort.
> 
> > 
> > > On the other hand, there are many other issues with GuC logs that it
> > > would be useful to solves - including extra meta data, reliable output
> > > via dmesg, continuous streaming, pre-sizing the debugfs file to not have
> > > to generate it ~12 times for a single read, etc.
> > this series actually solves last issue but in a bit different way (we
> > even don't need to generate full GuC log dump at all if we would like to
> > capture only part of the log if we know where to look)
> No, it doesn't solve it. Your comment below suggests it will be read in 4KB
> chunks. Which means your 16MB buffer now requires 4096 separate reads! And
> you only doing partial reads of the section you think you need is never
> going to be reliable on live system. Not sure why you would want to anyway.
> It is just making things much more complex. You now need an intelligent user
> land program to read the log out and decode at least the header section to
> know what data section to read. You can't just dump the whole thing with
> 'cat' or 'dd'.
> 

Briefly have read this thread but seconding John's opinion that anything
which breaks GuC log collection via a simple command like cat is a no
go. Also anything that can allow windows to mangle the output would be
less than idle too. Lastly, GuC log collection is not a critical path
operation so it likely does not need to optimized for speed.

Matt

> > 
> > for reliable output via dmesg - see my proposal at [1]
> > 
> > [1] https://patchwork.freedesktop.org/series/133613/
> 
> > 
> > > Hmm. Actually, is this interface allowing the filesystem layers to issue
> > > multiple read calls to read the buffer out in small chunks? That is also
> > > going to break things. If the GuC is still writing to the log as the
> > > user is reading from it, there is the opportunity for each chunk to not
> > > follow on from the previous chunk because the data has just been
> > > overwritten. This is already a problem at the moment that causes issues
> > > when decoding the logs, even with an almost atomic copy of the log into
> > > a temporary buffer before reading it out. Doing the read in separate
> > > chunks is only going to make that problem even worse.
> > current solution, that converts data into hex numbers, reads log buffer
> > in chunks of 128 dwords, how proposed here solution that reads in 4K
> > chunks could be "even worse" ?
> See above, 4KB chunks means 4096 separate reads for a 16M buffer. And each
> one of those reads is a full round trip to user land and back. If you want
> to get at all close to an atomic read of the log then it needs to be done as
> a single call that copies the log into a locally allocated kernel buffer and
> then allows user land to read out from that buffer rather than from the live
> log. Which can be trivially done with the current method (at the expense of
> a large memory allocation) but would be much more difficult with random
> access reader like this as you would need to say the copied buffer around
> until the reads have all been done. Which would presumably mean adding
> open/close handlers to allocate and free that memory.
> 
> > 
> > and in case of some smart tool, that would understands the layout of the
> > GuC log buffer, we can even fully eliminate problem of reading stale
> > data, so why not to choose a more scalable solution ?
> You cannot eliminate the problem of stale data. You read the header, you
> read the data it was pointing to, you re-read the header and find that the
> GuC has moved on. That is an infinite loop of continuously updating
> pointers.
> 
> John.
> 
> > 
> > > John.
> > > 
> > > > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@xxxxxxxxx>
> > > > Cc: Lucas De Marchi <lucas.demarchi@xxxxxxxxx>
> > > > Cc: John Harrison <John.C.Harrison@xxxxxxxxx>
> > > > ---
> > > > Cc: linux-fsdevel@xxxxxxxxxxxxxxx
> > > > Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > > > ---
> > > >    drivers/gpu/drm/xe/xe_guc_debugfs.c | 26 ++++++++++++++++++++++++++
> > > >    1 file changed, 26 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_debugfs.c
> > > > b/drivers/gpu/drm/xe/xe_guc_debugfs.c
> > > > index d3822cbea273..53fea952344d 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_debugfs.c
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_debugfs.c
> > > > @@ -8,6 +8,7 @@
> > > >    #include <drm/drm_debugfs.h>
> > > >    #include <drm/drm_managed.h>
> > > >    +#include "xe_bo.h"
> > > >    #include "xe_device.h"
> > > >    #include "xe_gt.h"
> > > >    #include "xe_guc.h"
> > > > @@ -52,6 +53,29 @@ static const struct drm_info_list debugfs_list[] = {
> > > >        {"guc_log", guc_log, 0},
> > > >    };
> > > >    +static ssize_t guc_log_read(struct file *file, char __user *buf,
> > > > size_t count, loff_t *pos)
> > > > +{
> > > > +    struct dentry *dent = file_dentry(file);
> > > > +    struct dentry *uc_dent = dent->d_parent;
> > > > +    struct dentry *gt_dent = uc_dent->d_parent;
> > > > +    struct xe_gt *gt = gt_dent->d_inode->i_private;
> > > > +    struct xe_guc_log *log = &gt->uc.guc.log;
> > > > +    struct xe_device *xe = gt_to_xe(gt);
> > > > +    ssize_t ret;
> > > > +
> > > > +    xe_pm_runtime_get(xe);
> > > > +    ret = xe_map_read_from(xe, buf, count, pos, &log->bo->vmap,
> > > > log->bo->size);
> > > > +    xe_pm_runtime_put(xe);
> > > > +
> > > > +    return ret;
> > > > +}
> > > > +
> > > > +static const struct file_operations guc_log_ops = {
> > > > +    .owner        = THIS_MODULE,
> > > > +    .read        = guc_log_read,
> > > > +    .llseek        = default_llseek,
> > > > +};
> > > > +
> > > >    void xe_guc_debugfs_register(struct xe_guc *guc, struct dentry *parent)
> > > >    {
> > > >        struct drm_minor *minor = guc_to_xe(guc)->drm.primary;
> > > > @@ -72,4 +96,6 @@ void xe_guc_debugfs_register(struct xe_guc *guc,
> > > > struct dentry *parent)
> > > >        drm_debugfs_create_files(local,
> > > >                     ARRAY_SIZE(debugfs_list),
> > > >                     parent, minor);
> > > > +
> > > > +    debugfs_create_file("guc_log_raw", 0600, parent, NULL,
> > > > &guc_log_ops);
> > > >    }
>