Re: [PATCH v3 2/3] drm/panic: Add a drm panic handler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 28/09/2023 11:30, Thomas Zimmermann wrote:


Am 27.09.23 um 19:22 schrieb Jocelyn Falempe:
This module displays a user friendly message when a kernel panic
occurs. It currently doesn't contain any debug information,
but that can be added later.

v2
  * Use get_scanout_buffer() instead of the drm client API.
   (Thomas Zimmermann)
  * Add the panic reason to the panic message (Nerdopolis)
  * Add an exclamation mark (Nerdopolis)

v3
  * Rework the drawing functions, to write the pixels line by line and
  to use the drm conversion helper to support other formats.
  (Thomas Zimmermann)

Signed-off-by: Jocelyn Falempe <jfalempe@xxxxxxxxxx>
---
  drivers/gpu/drm/Kconfig     |  11 ++
  drivers/gpu/drm/Makefile    |   1 +
  drivers/gpu/drm/drm_drv.c   |   3 +
  drivers/gpu/drm/drm_panic.c | 366 ++++++++++++++++++++++++++++++++++++
  include/drm/drm_drv.h       |  14 ++
  include/drm/drm_panic.h     |  41 ++++
  6 files changed, 436 insertions(+)
  create mode 100644 drivers/gpu/drm/drm_panic.c
  create mode 100644 include/drm/drm_panic.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index afb3b2f5f425..98d78f865180 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -99,6 +99,17 @@ config DRM_KMS_HELPER
      help
        CRTC helpers for KMS drivers.
+config DRM_PANIC
+    bool "Display a user-friendly message when a kernel panic occurs"
+    depends on DRM && !FRAMEBUFFER_CONSOLE
+    select FONT_SUPPORT
+    default y if DRM_SIMPLEDRM

No 'default y' please. We sometimes do this for compatibility with existing configs, but that's clearly not the case here.

ok, yes it's better not enabled by default.


+    help
+      Enable a drm panic handler, which will display a user-friendly message
+      when a kernel panic occurs. It's useful when using a user-space
+      console instead of fbcon.
+      It will only work if your graphic driver supports this feature.
+
  config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
          bool "Enable refcount backtrace history in the DP MST helpers"
      depends on STACKTRACE_SUPPORT
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 7a09a89b493b..a525dd9a2751 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -72,6 +72,7 @@ drm-$(CONFIG_DRM_PRIVACY_SCREEN) += \
      drm_privacy_screen_x86.o
  drm-$(CONFIG_DRM_ACCEL) += ../../accel/drm_accel.o
  obj-$(CONFIG_DRM)    += drm.o
+drm-$(CONFIG_DRM_PANIC) += drm_panic.o
  obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 12687dd9e1ac..0e0f1e258845 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -45,6 +45,7 @@
  #include <drm/drm_mode_object.h>
  #include <drm/drm_print.h>
  #include <drm/drm_privacy_screen_machine.h>
+#include <drm/drm_panic.h>

In alphabetical order please.

ok,


  #include "drm_crtc_internal.h"
  #include "drm_internal.h"
@@ -1067,6 +1068,7 @@ static void drm_core_exit(void)
      unregister_chrdev(DRM_MAJOR, "drm");
      debugfs_remove(drm_debugfs_root);
      drm_sysfs_destroy();
+    drm_panic_exit();
      idr_destroy(&drm_minors_idr);
      drm_connector_ida_destroy();
  }
@@ -1078,6 +1080,7 @@ static int __init drm_core_init(void)
      drm_connector_ida_init();
      idr_init(&drm_minors_idr);
      drm_memcpy_init_early();
+    drm_panic_init();
      ret = drm_sysfs_init();
      if (ret < 0) {
diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c
new file mode 100644
index 000000000000..1d93e4dbed9a
--- /dev/null
+++ b/drivers/gpu/drm/drm_panic.c
@@ -0,0 +1,366 @@
+// SPDX-License-Identifier: GPL-2.0 or MIT
+/*
+ * Copyright (c) 2023 Jocelyn Falempe <jfalempe@xxxxxxxxxx>
+ * inspired by the drm_log driver from David Herrmann <dh.herrmann@xxxxxxxxx>
+ * Tux Ascii art taken from cowsay written by Tony Monroe
+ */
+
+#include <linux/font.h>
+#include <linux/iosys-map.h>
+#include <linux/kdebug.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/panic_notifier.h>
+#include <linux/types.h>
+
+#include <drm/drm_drv.h>
+#include <drm/drm_format_helper.h>
+#include <drm/drm_fourcc.h>
+#include <drm/drm_panic.h>
+#include <drm/drm_print.h>
+
+
+MODULE_AUTHOR("Jocelyn Falempe");
+MODULE_DESCRIPTION("DRM PANIC");

A better descrition please. At lease something like "DRM panic handler".
ok,


+MODULE_LICENSE("GPL");
+
+/**
+ * DOC: DRM Panic
+ *
+ * This module displays a user friendly message on screen when a kernel panic

Drop 'user friendly' here. That message is *never* user friendly. :D

Yes, but at the same time when you look at:
https://en.wikipedia.org/wiki/Screen_of_death
If you're not a computer scientist, the Linux one is the scariest.
I added the friendly tux logo to comfort you when you may have lost your last hour of work.


+ * occurs. It's useful when using a user-space console instead of fbcon.
+ * It's intended for end-user, so have minimal technical/debug information.

These two sentences miss the point. If the kernel crashes, the only useful message is a technical one, such as a stack trace. The panic handler should provide information for finding the bug.

My point of view is that technical/debug information should come with kdump. Or maybe encode the stack trace in a qrcode, so you can report the bug easily. But displaying the stack frame is useless to non-developers.

For the first version, I prefer to have only the panic reason, and then we can add other features incrementaly.

+ *
+ * It will display only one static frame, so performance optimizations are low
+ * priority as the machine is already in an unusable state.

An implementation detail.

In these docs, I'd rather say that it display an error message if a kernel panic occurs on kernels without framebuffer console.

Ok, I will clarify this point.

We should clarify if the message appears only if the framebuffer console is absent from the kernel (the current behavior), or if the console can be running, but not be the active DRM master.

I prefer to keep it this way, because otherwise the framebuffer console and drm_panic might compete for the framebuffer, and I don't want to handle this case.


+ */
+
+/*
+ * List of active drm devices that can render a panic
+ */
+struct dpanic_drm_device {

The naming is odd. This isn't driver code, but a DRM core feature. The common style for DRM core code is to start with drm_. Calling this structure drm_panic_device seems more appropriate.  The same gos for all the dpanic_ names below.

Yes, I'm not good at variable naming, I will change that.

+    struct list_head head;
+    struct drm_device *dev;
+};
+
+struct dpanic_line {
+    u32 len;
+    const char *txt;
+};
+
+#define PANIC_LINE(s) {.len = sizeof(s) - 1, .txt = s}
+
+struct dpanic_line panic_msg[] = {
+    PANIC_LINE("KERNEL PANIC !"),
+    PANIC_LINE(""),
+    PANIC_LINE("Please reboot your computer."),
+    PANIC_LINE(""),
+    PANIC_LINE(""), /* overwritten with panic reason */
+};
+
+const struct dpanic_line logo[] = {
+    PANIC_LINE("     .--.        _"),
+    PANIC_LINE("    |o_o |      | |"),
+    PANIC_LINE("    |:_/ |      | |"),
+    PANIC_LINE("   //   \\ \\     |_|"),
+    PANIC_LINE("  (|     | )     _"),
+    PANIC_LINE(" /'\\_   _/`\\    (_)"),
+    PANIC_LINE(" \\___)=(___/"),
+};

For merging this code, it should print something useful. It looks like stack traces can be quieried and formatted with stack_trace_save() [1] and stack_trace_snprint(). [2] I'd reserve a few pages for the stack trace and display the top-most information from the stack.

For most users, just saying to reboot the computer is the only useful thing to print. But I know that for developpers, it would be handy to add this later, as a config option.

[1] https://elixir.bootlin.com/linux/latest/source/kernel/stacktrace.c#L259
[2] https://elixir.bootlin.com/linux/latest/source/kernel/stacktrace.c#L37

Thanks for the pointer. I already looked at dump_stack(), but didn't find a way to get the string.

+
+static LIST_HEAD(dpanic_devices);
+static DEFINE_MUTEX(dpanic_lock);
+
+/*
+ * This buffer is used to convert xrgb8888 to the scanout buffer format.
+ * It is allocated when the first client register, and freed when the last client
+ * unregisters.
+ * There is no need for mutex, as the panic garantee that only 1 CPU is still
+ * running, and preemption is disabled.
+ */
+#define DRM_PANIC_MAX_WIDTH 8096
+static u32 *dpanic_line_buf;
+static u32 fg_color = le32_to_cpu(0x00ffffff);
+static u32 bg_color = le32_to_cpu(0x00000000);
+
+static void dpanic_draw_line(const struct dpanic_line *msg, size_t left_margin,
+                 size_t l, size_t width, const struct font_desc *font)
+{
+    size_t x, i, j;
+    const u8 *src;
+    size_t src_stride = DIV_ROUND_UP(font->width, 8);
+    u32 *dst = dpanic_line_buf;
+
+    for (x = 0; x < left_margin * font->width; x++)
+        *dst++ = bg_color;
+
+    for (i = 0; i < msg->len; i++) {
+        for (j = 0; j < font->width; j++) {
+            src = font->data + (msg->txt[i] * font->height + l) * src_stride; +            *dst++ = (src[j / 8] & (0x80 >> (j % 8))) ? fg_color : bg_color;
+        }
+    }
+    for (x = (left_margin + msg->len) * font->width; x < width ; x++)
+        *dst++ = bg_color;
+}

This helper is effectively an implementation of drm_fb_r1_to_xrgb8888(). At some point, we should attempt to implementent it as such.
Yes, I can add this to the drm_format_helper, and call it from here.


In the recent patchset for caching conversion memory, there's struct drm_xfrm_buf. It can later be used to store palette information and/or fg/bg colors.
Yes for the r1->xrgb8888 conversion, it's better to have a way to pass fg/bg colors.


+
+static void dpanic_draw_txt_line(const struct dpanic_line *msg, size_t left_margin, size_t y,
+              struct drm_scanout_buffer *sb,
+              const struct font_desc *font,
+              void (*convert)(void *dbuf, const void *sbuf, unsigned int npixels))
+{
+    size_t dst_off;
+    size_t l;
+
+    for (l = 0; l < font->height; l++) {
+        dpanic_draw_line(msg, left_margin, l, sb->width, font);
+        if (convert)
+            convert(dpanic_line_buf, dpanic_line_buf, sb->width);
+
+        dst_off = (y * font->height + l) * sb->pitch;
+        iosys_map_memcpy_to(&sb->map, dst_off, dpanic_line_buf,
+                    sb->width * sb->format->cpp[0]);
+    }
+}
+
+static void dpanic_draw_empty_txt_line(size_t y,
+              struct drm_scanout_buffer *sb,
+              const struct font_desc *font,
+              void (*convert)(void *dbuf, const void *sbuf, unsigned int npixels))
+{
+    size_t i, dst_off;
+    u32 *dst = dpanic_line_buf;
+
+    for (i = 0; i < sb->width; i++)
+        *dst++ = bg_color;
+
+    if (convert)
+        convert(dpanic_line_buf, dpanic_line_buf, sb->width);
+
+    for (i = 0; i < font->height; i++) {
+        dst_off = (y * font->height + i) * sb->pitch;
+        iosys_map_memcpy_to(&sb->map, dst_off, dpanic_line_buf,
+                    sb->width * sb->format->cpp[0]);
+    }
+}
+
+static size_t dpanic_needed_lines(size_t chars_per_line)
+{
+    size_t msg_len = ARRAY_SIZE(panic_msg);
+    size_t lines = 0;
+    size_t i;
+
+    for (i = 0; i < msg_len; i++)
+        lines += panic_msg[i].len ? DIV_ROUND_UP(panic_msg[i].len, chars_per_line) : 1;
+    return lines;
+}
+
+static bool dpanic_can_draw_logo(size_t chars_per_line, size_t lines, size_t msg_lines)
+{
+    size_t i;
+
+    for (i = 0; i < ARRAY_SIZE(logo); i++) {
+        if (logo[i].len > chars_per_line)
+            return false;
+    }
+    if (lines < msg_lines + ARRAY_SIZE(logo))
+        return false;
+    return true;
+}
+
+static size_t get_start_line(size_t lines, size_t msg_lines, bool can_draw_logo)
+{
+    size_t remaining;
+    size_t logo_len = ARRAY_SIZE(logo);
+
+    if (lines < msg_lines)
+        return 0;
+    remaining = lines - msg_lines;
+    if (can_draw_logo && remaining / 2 <= logo_len)
+        return logo_len + (remaining - logo_len) / 4;
+    return remaining / 2;
+}
+
+/*
+ * Return the function to convert xrgb8888 to the scanout buffer format
+ * NULL if no conversion is needed, and ERR_PTR if format can't be converted
+ */
+static void (*get_convert_func(const struct drm_format_info *format))
+        (void *, const void *, unsigned int)
+{
+    switch (format->format) {
+    case DRM_FORMAT_XRGB8888:
+        return NULL;
+    case DRM_FORMAT_RGB565:
+        return drm_fb_xrgb8888_to_rgb565_line;
+    default:
+        return ERR_PTR(EINVAL);
+    }
+}

This function plus dpanic_draw_*_line() will turn into a reimplementation of drm_fb_blit(). We should attempt to use drm_fb_blit() at some point, or at least share the majority of code. Maybe I have to do some work in the format-helper side to make this happen.

The problem I have with using drm_fb_blit(), is that I will need to call it for each character, which means doing the "big format switch" a lot of times. With this approach, it's done only once for the frame.

Also that means that instead of doing one memcpy_to_io() for the whole framebuffer line (so typically 1024~4096 pixels), that will be done only for one character width (8 pixels).
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_format_helper.c#L122

Thanks for the review,

--

Jocelyn


Best regards
Thomas

+
+
+/*
+ * Draw the panic message at the center of the screen
+ */
+static void dpanic_static_draw(struct drm_scanout_buffer *sb, const char *msg)
+{
+    size_t lines, msg_lines, l, msg_start_line, msgi;
+    size_t center, chars_per_line;
+    bool can_draw_logo;
+    struct dpanic_line panic_line;
+    size_t msg_len = ARRAY_SIZE(panic_msg);
+    size_t logo_len = ARRAY_SIZE(logo);
+    void (*convert)(void *dbuf, const void *sbuf, unsigned int npixels);
+    const struct font_desc *font = get_default_font(sb->width, sb->height, 0x8080, 0x8080);
+
+    if (!font)
+        return;
+
+    /* Set the panic reason */
+    panic_msg[msg_len - 1].len = strlen(msg);
+    panic_msg[msg_len - 1].txt = msg;
+
+    lines = sb->height / font->height;
+    chars_per_line = sb->width / font->width;
+
+    msg_lines = dpanic_needed_lines(chars_per_line);
+    can_draw_logo = dpanic_can_draw_logo(chars_per_line, lines, msg_lines);
+    msg_start_line = get_start_line(lines, msg_lines, can_draw_logo);
+
+    convert = get_convert_func(sb->format);
+    if (IS_ERR(convert))
+        return;
+
+    msgi = 0;
+    panic_line.len = 0;
+    for (l = 0; l < lines; l++) {
+        if (can_draw_logo && l < logo_len)
+            dpanic_draw_txt_line(&logo[l], 0, l, sb, font, convert);
+        else if (l >= msg_start_line && msgi < msg_len) {
+            if (!panic_line.len) {
+                panic_line.txt = panic_msg[msgi].txt;
+                panic_line.len = panic_msg[msgi].len;
+            }
+            if (!panic_line.len)
+                dpanic_draw_empty_txt_line(l, sb, font, convert);
+            else {
+                center = panic_line.len > chars_per_line ?
+                     0 : (chars_per_line - panic_line.len) / 2;
+                dpanic_draw_txt_line(&panic_line, center, l, sb, font, convert);
+            }
+            if (panic_line.len > chars_per_line) {
+                panic_line.len -= chars_per_line;
+                panic_line.txt += chars_per_line;
+            } else {
+                panic_line.len = 0;
+                msgi++;
+            }
+        } else {
+            dpanic_draw_empty_txt_line(l, sb, font, convert);
+        }
+    }
+}
+
+static void drm_panic_device(struct drm_device *dev, const char *msg)
+{
+    struct drm_scanout_buffer sb;
+
+    if (dev->driver->get_scanout_buffer(dev, &sb))
+        return;
+    if (!dpanic_line_buf)
+        return;
+
+    /* to avoid buffer overflow on dpanic_line_buf */
+    if (sb.width > DRM_PANIC_MAX_WIDTH)
+        sb.width = DRM_PANIC_MAX_WIDTH;
+
+    dpanic_static_draw(&sb, msg);
+}
+
+static int drm_panic(struct notifier_block *this, unsigned long event,
+             void *ptr)
+{
+    struct dpanic_drm_device *dpanic_device;
+
+    list_for_each_entry(dpanic_device, &dpanic_devices, head) {
+        drm_panic_device(dpanic_device->dev, ptr);
+    }
+    return NOTIFY_OK;
+}
+
+struct notifier_block drm_panic_notifier = {
+    .notifier_call = drm_panic,
+};
+
+/**
+ * drm_panic_register() - Initialize DRM panic for a device
+ * @dev: the DRM device on which the panic screen will be displayed.
+ */
+void drm_panic_register(struct drm_device *dev)
+{
+    struct dpanic_drm_device *new;
+
+    new = kzalloc(sizeof(*new), GFP_KERNEL);
+    if (!new)
+        return;
+
+    new->dev = dev;
+    mutex_lock(&dpanic_lock);
+    if (!dpanic_line_buf)
+        dpanic_line_buf = kmalloc(DRM_PANIC_MAX_WIDTH * sizeof(u32), GFP_KERNEL);
+    if (!dpanic_line_buf)
+        goto unlock;
+    list_add_tail(&new->head, &dpanic_devices);
+    drm_info(dev, "Registered with drm panic\n");
+unlock:
+    mutex_unlock(&dpanic_lock);
+}
+EXPORT_SYMBOL(drm_panic_register);
+
+/**
+ * drm_panic_unregister()
+ * @dev: the DRM device previously registered.
+ */
+void drm_panic_unregister(struct drm_device *dev)
+{
+    struct dpanic_drm_device *dpanic_device;
+    struct dpanic_drm_device *found = NULL;
+
+    mutex_lock(&dpanic_lock);
+    list_for_each_entry(dpanic_device, &dpanic_devices, head) {
+        if (dpanic_device->dev == dev)
+            found = dpanic_device;
+    }
+    if (found) {
+        list_del(&found->head);
+        kfree(found);
+        drm_info(dev, "Unregistered with drm panic\n");
+    }
+    if (dpanic_line_buf && list_empty(&dpanic_devices)) {
+        kfree(dpanic_line_buf);
+        dpanic_line_buf = NULL;
+    }
+    mutex_unlock(&dpanic_lock);
+}
+EXPORT_SYMBOL(drm_panic_unregister);
+
+/**
+ * drm_panic_init() - Initialize drm-panic subsystem
+ *
+ * register the panic notifier
+ */
+void drm_panic_init(void)
+{
+    atomic_notifier_chain_register(&panic_notifier_list,
+                       &drm_panic_notifier);
+}
+
+/**
+ * drm_log_exit() - Shutdown drm-panic subsystem
+ */
+void drm_panic_exit(void)
+{
+    atomic_notifier_chain_unregister(&panic_notifier_list,
+                     &drm_panic_notifier);
+}
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 89e2706cac56..e538c87116d3 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -43,6 +43,7 @@ struct dma_buf_attachment;
  struct drm_display_mode;
  struct drm_mode_create_dumb;
  struct drm_printer;
+struct drm_scanout_buffer;
  struct sg_table;
  /**
@@ -408,6 +409,19 @@ struct drm_driver {
       */
      void (*show_fdinfo)(struct drm_printer *p, struct drm_file *f);
+    /**
+     * @get_scanout_buffer:
+     *
+     * Get the current scanout buffer, to display a panic message with drm_panic. +     * It is called from a panic callback, and must follow its restrictions.
+     *
+     * Returns:
+     *
+     * Zero on success, negative errno on failure.
+     */
+    int (*get_scanout_buffer)(struct drm_device *dev,
+                  struct drm_scanout_buffer *sb);
+
      /** @major: driver major number */
      int major;
      /** @minor: driver minor number */
diff --git a/include/drm/drm_panic.h b/include/drm/drm_panic.h
new file mode 100644
index 000000000000..db430b8dfbb2
--- /dev/null
+++ b/include/drm/drm_panic.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 or MIT */
+#ifndef __DRM_PANIC_H__
+#define __DRM_PANIC_H__
+
+/*
+ * Copyright (c) 2023 Jocelyn Falempe <jfalempe@xxxxxxxxxx>
+ */
+
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/iosys-map.h>
+
+struct drm_device;
+
+struct drm_scanout_buffer {
+    const struct drm_format_info *format;
+    struct iosys_map map;
+    unsigned int pitch;
+    unsigned int width;
+    unsigned int height;
+};
+
+#ifdef CONFIG_DRM_PANIC
+
+void drm_panic_init(void);
+void drm_panic_exit(void);
+
+void drm_panic_register(struct drm_device *dev);
+void drm_panic_unregister(struct drm_device *dev);
+
+#else
+
+static inline void drm_panic_init(void) {}
+static inline void drm_panic_exit(void) {}
+
+static inline void drm_panic_register(struct drm_device *dev) {}
+static inline void drm_panic_unregister(struct drm_device *dev) {}
+
+#endif
+
+#endif /* __DRM_LOG_H__ */





[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux