Re: [PATCH v2 1/3] drm: Add support for panic message output

John Ogness <john.ogness@xxxxxxxxxxxxx> · Wed, 13 Mar 2019 08:49:17 +0100

On 2019-03-12, Ahmed S. Darwish <darwish.07@xxxxxxxxx> wrote:
>>>> +
>>>> +static void drm_panic_kmsg_dump(struct kmsg_dumper *dumper,
>>>> +				enum kmsg_dump_reason reason)
>>>> +{
>>>> +	class_for_each_device(drm_class, NULL, dumper, drm_panic_dev_iter);
>>>
>>> class_for_each_device uses klist, which only uses an irqsave
>>> spinlock. I think that's good enough. Comment to that effect would
>>> be good e.g.
>>>
>>> 	/* based on klist, which uses only a spin_lock_irqsave, which we
>>> 	 * assume still works */
>>>
>>> If we aim for perfect this should be a trylock still, maybe using
>>> our own device list.
>>>
>
> I definitely agree here.
>
> The lock may already be locked either by a stopped CPU, or by the
> very same CPU we execute panic() on (e.g. NMI panic() on the
> printing CPU).
>
> This is why it's very common for example in serial consoles, which
> are usually careful about re-entrance and panic contexts, to do:
>
>   xxxxxx_console_write(...) {
> 	if (oops_in_progress)
> 		locked = spin_trylock_irqsave(&port->lock, flags);
> 	else
> 		spin_lock_irqsave(&port->lock, flags);
>   }
>
> I'm quite positive we should do the same for panic drm drivers.

This construction will continue, even if the trylock fails. It only
makes sense to do this if the driver has a chance of being
successful. Ignoring locks is a serious issue. I personally am quite
unhappy that the serial drivers do this, which was part of my motivation
for the new printk design I'm working on.

If the driver is not capable of doing something useful on a failed
trylock, then I recommend just skipping that device. Maybe trying it
again later after trying all the devices?

John Ogness
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx