Re: Tips for Kernel Module Debugging

On Sat, Sep 12, 2015 at 4:27 PM, <Valdis.Kletnieks@xxxxxx> wrote:
On Sat, 12 Sep 2015 16:04:43 -0300, Lucas Tanure said:

> I'm testing the linux-next tree and I got this stack:
>
> [    2.158054] Call Trace:
> [    2.158058]  [<ffffffff812b9159>] dump_stack+0x4b/0x72
> [    2.158061]  [<ffffffff81074e62>] warn_slowpath_common+0x82/0xc0
> [    2.158063]  [<ffffffff81074faa>] warn_slowpath_null+0x1a/0x20
> [    2.158066]  [<ffffffffa0572291>] drm_dev_alloc+0x251/0x320 [drm]
> [    2.158070]  [<ffffffffa0574d0b>] drm_get_pci_dev+0x3b/0x1e0 [drm]
> [    2.158081]  [<ffffffffa07062d4>] i915_pci_probe+0x34/0x50 [i915]
>
> What is the best way to debug this? Do I really need to add a print, recompile,
> and reboot many times?
> How would you guys debug this?

Step 0:  Include the last few lines *before* the Call Trace - if indeed
it was a warning, they will give you the file and line number of where the
WARN_ON was.

[26636.029711] ------------[ cut here ]------------
[26636.029724] WARNING: CPU: 3 PID: 19157 at ./arch/x86/include/asm/thread_info.h:239 sigsuspend+0xa4/0xb0()

Bummer of a birthmark, Hal.  The one my laptop hit was a WARN_ON inside
either a macro or static inline from that .h file. Fortunately, yours
was inside a .c file and pointed in the right place (see below for how
I know that...)

That 'cut here' is where you should start the cut-n-paste, and include
everything down to 'end trace'.
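
If the log is long, that span can be pulled out mechanically. What follows is only a rough sketch in Python, assuming the log has been saved as plain text; the function name extract_warnings is made up for illustration, and the sample is assembled from lines quoted in this thread plus two placeholder lines.

```python
def extract_warnings(log_text):
    """Return each block from a 'cut here' line through its 'end trace' line."""
    blocks = []
    current = None  # lines of the block being collected, or None when outside one
    for line in log_text.splitlines():
        if "[ cut here ]" in line:
            current = [line]          # start a new block at the marker
        elif current is not None:
            current.append(line)
            if "[ end trace" in line:
                blocks.append("\n".join(current))
                current = None        # block is complete
    return blocks


# Sample input: the first and last lines are placeholders, not real kernel output.
sample = """\
unrelated line before the warning
------------[ cut here ]------------
WARNING: CPU: 3 PID: 243 at drivers/gpu/drm/drm_drv.c:569 drm_dev_alloc+0x251/0x320 [drm]()
Call Trace:
 [<ffffffff812b9159>] dump_stack+0x4b/0x72
---[ end trace d2652104b24a32ff ]---
unrelated line after the warning
"""

for block in extract_warnings(sample):
    print(block)
```

The matching is done on substrings rather than whole lines, so it should also cope with the [timestamp] prefix that dmesg puts in front of each line.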

Having said that, looking at drivers/gpu/drm/drm_drv.c:drm_dev_alloc() we
find only one WARN_ON:

        if (drm_core_check_feature(dev, DRIVER_MODESET)) {
                ret = drm_minor_alloc(dev, DRM_MINOR_CONTROL);
                if (ret)
                        goto err_minors;

                WARN_ON(driver->suspend || driver->resume);
        }

As to *why* that one triggered, you'll have to ask an actual i915 expert.

Hi, 

My full warning:

------------[ cut here ]------------
WARNING: CPU: 3 PID: 243 at drivers/gpu/drm/drm_drv.c:569 drm_dev_alloc+0x251/0x320 [drm]()
Modules linked in: i915(+) joydev input_leds mousedev intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ttm hid_generic drm_kms_helper crct10dif_pclmul snd_hda_intel crc32_pclmul usbhid snd_hda_codec crc32c_intel drm hid ghash_clmulni_intel snd_hda_core eeepc_wmi asus_wmi aesni_intel iTCO_wdt sparse_keymap snd_hwdep led_class aes_x86_64 lrw snd_pcm iTCO_vendor_support rfkill mxm_wmi evdev gf128mul intel_gtt e1000e glue_helper mac_hid snd_timer syscopyarea ablk_helper cryptd sysfillrect psmouse snd sysimgblt pcspkr fb_sys_fops ptp mei_me i2c_i801 i2c_algo_bit soundcore mei shpchp i2c_core pps_core lpc_ich serio_raw wmi fan battery processor thermal video button sch_fq_codel ip_tables x_tables ext4 crc16 mbcache jbd2 sd_mod atkbd libps2 ahci libahci libata
 xhci_pci xhci_hcd ehci_pci ehci_hcd scsi_mod usbcore usb_common i8042 serio
CPU: 3 PID: 243 Comm: systemd-udevd Not tainted 4.2.0-next-20150912-ARCH #5
Hardware name: System manufacturer System Product Name/Maximus IV GENE-Z, BIOS 3603 11/09/2012
 0000000000000000 000000005ca47666 ffff88060f70b9d0 ffffffff812b9159
 0000000000000000 ffff88060f70ba08 ffffffff81074e62 ffff880612d39000
 ffffffffa06c7100 ffff880612f66098 ffffffffa06c7100 ffffffffa0691760
Call Trace:
 [<ffffffff812b9159>] dump_stack+0x4b/0x72
 [<ffffffff81074e62>] warn_slowpath_common+0x82/0xc0
 [<ffffffff81074faa>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa0422291>] drm_dev_alloc+0x251/0x320 [drm]
 [<ffffffffa0424d0b>] drm_get_pci_dev+0x3b/0x1e0 [drm]
 [<ffffffffa05dd2d4>] i915_pci_probe+0x34/0x50 [i915]
 [<ffffffff812fdec5>] local_pci_probe+0x45/0xa0
 [<ffffffff812fde10>] ? pci_match_device+0xe0/0x110
 [<ffffffff812ff053>] pci_device_probe+0x103/0x150
 [<ffffffff813d7942>] driver_probe_device+0x222/0x490
 [<ffffffff813d7c34>] __driver_attach+0x84/0x90
 [<ffffffff813d7bb0>] ? driver_probe_device+0x490/0x490
 [<ffffffff813d557c>] bus_for_each_dev+0x6c/0xc0
 [<ffffffff813d70fe>] driver_attach+0x1e/0x20
 [<ffffffff813d6c4b>] bus_add_driver+0x1eb/0x280
 [<ffffffff813d8540>] driver_register+0x60/0xe0
 [<ffffffff812fd73c>] __pci_register_driver+0x4c/0x50
 [<ffffffffa0424f90>] drm_pci_init+0xe0/0x110 [drm]
 [<ffffffffa06e6000>] ? 0xffffffffa06e6000
 [<ffffffffa06e60a4>] i915_init+0xa4/0xab [i915]
 [<ffffffff81002123>] do_one_initcall+0xb3/0x200
 [<ffffffff81199801>] ? __vunmap+0x91/0xe0
 [<ffffffff811589a0>] do_init_module+0x5f/0x1ef
 [<ffffffff810fa707>] load_module+0x2197/0x27e0
 [<ffffffff810f7550>] ? symbol_put_addr+0x50/0x50
 [<ffffffff81188695>] ? __pte_alloc_kernel+0xa5/0xf0
 [<ffffffff810fae9e>] SyS_init_module+0x14e/0x190
 [<ffffffff8157046e>] entry_SYSCALL_64_fastpath+0x12/0x71
---[ end trace d2652104b24a32ff ]---
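
For reference, the WARNING line at the top of this block already carries the source file, the line number, and the function plus offset. A small sketch of pulling those fields apart in Python follows; it assumes the header format shown in this thread (other kernel versions may print it differently), and the names WARNING_RE and parse_warning are made up for illustration.

```python
import re

# Pattern for the header format seen in this thread, e.g.
# "WARNING: CPU: 3 PID: 243 at <file>:<line> <symbol>+<offset>/<size> [<module>]()"
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"   # the module name is absent for built-in code
)


def parse_warning(line):
    """Return the fields of a WARNING header as a dict, or None if no match."""
    m = WARNING_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    info["line"] = int(info["line"])
    info["offset"] = int(info["offset"], 16)  # offset of the WARN within the function
    info["size"] = int(info["size"], 16)      # total size of the function
    return info


header = ("WARNING: CPU: 3 PID: 243 at drivers/gpu/drm/drm_drv.c:569 "
          "drm_dev_alloc+0x251/0x320 [drm]()")
info = parse_warning(header)
print(f"{info['file']}:{info['line']} in {info['symbol']} (module {info['module']})")
```

The file and line say where to open the source; the symbol and offset are the kind of input that address-to-line tools work from.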

I can see that the real problem is in drm_dev_alloc, because it is the function that called warn_slowpath_null, and warn_slowpath_null is what prints the warning.
So how can I debug this?

Thanks!

--
Lucas Tanure 
+55 (19) 988176559
 

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
