Re: [PATCH v2] drm/print: Introduce drm_line_printer

Jani Nikula <jani.nikula@xxxxxxxxx> · Thu, 30 May 2024 10:49:56 +0300

On Wed, 29 May 2024, John Harrison <john.c.harrison@xxxxxxxxx> wrote:
> On 5/28/2024 06:06, Michal Wajdeczko wrote:
>> This drm printer wrapper can be used to increase the robustness of
>> the captured output generated by any other drm_printer to make sure
>> we didn't lost any intermediate lines of the output by adding line
>> numbers to each output line. Helpful for capturing some crash data.
>>
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@xxxxxxxxx>
>> Cc: Jani Nikula <jani.nikula@xxxxxxxxx>
>> Cc: John Harrison <John.C.Harrison@xxxxxxxxx>
>> ---
>> v2: don't abuse prefix, use union instead (Jani)
>>      don't use 'dp' as name, prefer 'p' (Jani)
>>      add support for unique series identifier (John)
>> ---
>>   drivers/gpu/drm/drm_print.c | 14 ++++++++
>>   include/drm/drm_print.h     | 68 ++++++++++++++++++++++++++++++++++++-
>>   2 files changed, 81 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
>> index cf2efb44722c..be9cbebff5b3 100644
>> --- a/drivers/gpu/drm/drm_print.c
>> +++ b/drivers/gpu/drm/drm_print.c
>> @@ -214,6 +214,20 @@ void __drm_printfn_err(struct drm_printer *p, struct va_format *vaf)
>>   }
>>   EXPORT_SYMBOL(__drm_printfn_err);
>>   
>> +void __drm_printfn_line(struct drm_printer *p, struct va_format *vaf)
>> +{
>> +	unsigned int counter = ++p->line.counter;
> Wrong units, but see below anyway...
>
>> +	const char *prefix = p->prefix ?: "";
>> +	const char *pad = p->prefix ? " " : "";
>> +
>> +	if (p->line.series)
>> +		drm_printf(p->arg, "%s%s%u.%u: %pV",
>> +			   prefix, pad, p->line.series, counter, vaf);
>> +	else
>> +		drm_printf(p->arg, "%s%s%u: %pV", prefix, pad, counter, vaf);
>> +}
>> +EXPORT_SYMBOL(__drm_printfn_line);
>> +
>>   /**
>>    * drm_puts - print a const string to a &drm_printer stream
>>    * @p: the &drm printer
>> diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
>> index 089950ad8681..f4d9b98d7909 100644
>> --- a/include/drm/drm_print.h
>> +++ b/include/drm/drm_print.h
>> @@ -176,7 +176,13 @@ struct drm_printer {
>>   	void (*puts)(struct drm_printer *p, const char *str);
>>   	void *arg;
>>   	const char *prefix;
>> -	enum drm_debug_category category;
>> +	union {
>> +		enum drm_debug_category category;
>> +		struct {
>> +			unsigned short series;
>> +			unsigned short counter;
> Any particular reason for using 'short' rather than 'int'? Short is only 
> 16bits right? That might seem huge but a GuC log buffer with 16MB debug 
> log (and minimum sizes for the other sections) when dumped via the 
> original debugfs hexdump mechanism is 1,245,444 lines. That line count 
> goes down a lot when you start using longer lines for the dump, but it 
> is still in the tens of thousands of lines.  So limiting to 16 bits is 
> an overflow just waiting to happen.
>
>> +		} line;
>> +	};
>>   };
>>   
>>   void __drm_printfn_coredump(struct drm_printer *p, struct va_format *vaf);
>> @@ -186,6 +192,7 @@ void __drm_puts_seq_file(struct drm_printer *p, const char *str);
>>   void __drm_printfn_info(struct drm_printer *p, struct va_format *vaf);
>>   void __drm_printfn_dbg(struct drm_printer *p, struct va_format *vaf);
>>   void __drm_printfn_err(struct drm_printer *p, struct va_format *vaf);
>> +void __drm_printfn_line(struct drm_printer *p, struct va_format *vaf);
>>   
>>   __printf(2, 3)
>>   void drm_printf(struct drm_printer *p, const char *f, ...);
>> @@ -357,6 +364,65 @@ static inline struct drm_printer drm_err_printer(struct drm_device *drm,
>>   	return p;
>>   }
>>   
>> +/**
>> + * drm_line_printer - construct a &drm_printer that prefixes outputs with line numbers
>> + * @p: the &struct drm_printer which actually generates the output
>> + * @prefix: optional output prefix, or NULL for no prefix
>> + * @series: optional unique series identifier, or 0 to omit identifier in the output
>> + *
>> + * This printer can be used to increase the robustness of the captured output
>> + * to make sure we didn't lost any intermediate lines of the output. Helpful
>> + * while capturing some crash data.
>> + *
>> + * Example 1::
>> + *
>> + *	void crash_dump(struct drm_device *drm)
>> + *	{
>> + *		static unsigned short id;
>> + *		struct drm_printer p = drm_err_printer(drm, "crash");
>> + *		struct drm_printer lp = drm_line_printer(&p, "dump", ++id);
> Is there any benefit or other reason to split the prefix across two 
> separate printers? It seems like a source of confusion. As in, the code 
> will allow a double prefix, there is not much you can do about that 
> because losing the prefix from drm_line_printer would mean no prefix at 
> all when not using drm_err_printer underneath. But why explicitly split 
> the message across both printers in the usage example? This is saying 
> that this is the recommended way to use the interface, but with no 
> explanation of why the split is required or how the split should be done.

You could have a printer, and then add two separate line counted blocks.

	struct drm_printer p = drm_err_printer(drm, "parent");
	struct drm_printer lp1 = drm_line_printer(&p, "child 1", 0);

	...

	struct drm_printer lp2 = drm_line_printer(&p, "child 2", 0);

	...

p could be defined elsewhere and passed into separate functions which
each have the line printing. The two prefixes can be useful.

> Also, there is really no specific connection to crashes. The purpose of 
> this is for allowing the dumping of multi-line blocks of data. One use 
> is for debugging crashes. But there are many debug operations that 
> require the same dump and do not involve a crash. And this support is 
> certainly not intended to be used on general end-user GPU hangs. For 
> those, everything should be part of the devcoredump core file that is 
> produced and accessible via sysfs. We absolutely should not be dumping 
> huge data blocks in customer release drivers except in very extreme 
> circumstances.

A devcoredump implementation could use a drm_printer too?

Is this only about bikeshedding the example? I'm not sure I get the
point here.

>
>> + *
>> + *		drm_printf(&lp, "foo");
>> + *		drm_printf(&lp, "bar");
>> + *	}
>> + *
>> + * Above code will print into the dmesg something like::
>> + *
>> + *	[ ] 0000:00:00.0: [drm] *ERROR* crash dump 1.1: foo
>> + *	[ ] 0000:00:00.0: [drm] *ERROR* crash dump 1.2: bar
>> + *
>> + * Example 2::
>> + *
>> + *	void line_dump(struct device *dev)
>> + *	{
>> + *		struct drm_printer p = drm_info_printer(dev);
>> + *		struct drm_printer lp = drm_line_printer(&p, NULL, 0);
>> + *
>> + *		drm_printf(&lp, "foo");
>> + *		drm_printf(&lp, "bar");
>> + *	}
>> + *
>> + * Above code will print::
>> + *
>> + *	[ ] 0000:00:00.0: [drm] 1: foo
>> + *	[ ] 0000:00:00.0: [drm] 2: bar
> Not really seeing the point of having two examples listed. The first 
> includes all the optional extras, the second is just repeating with no 
> new information.

You see how the "series" param behaves?

BR,
Jani.

>
> John.
>
>> + *
>> + * RETURNS:
>> + * The &drm_printer object
>> + */
>> +static inline struct drm_printer drm_line_printer(struct drm_printer *p,
>> +						  const char *prefix,
>> +						  unsigned short series)
>> +{
>> +	struct drm_printer lp = {
>> +		.printfn = __drm_printfn_line,
>> +		.arg = p,
>> +		.prefix = prefix,
>> +		.line = { .series = series, },
>> +	};
>> +	return lp;
>> +}
>> +
>>   /*
>>    * struct device based logging
>>    *
>

-- 
Jani Nikula, Intel