Re: OHCI-PCI: Thin client does not shutdown properly

Oops, pressed the "send" button too quickly, sorry.

On 16.05.2017 at 16:20, Alan Stern wrote:
> You've got a BIOS developer in the same building?  That's a great
> resource!  Maybe together you can find out what condition is causing
> the BIOS to initiate a reboot.

We have everything here: hardware developers for our mainboards and
systems, BIOS developers, QA engineers, product management and a
factory to build servers and PCs. We are the only site in Europe that
still develops and produces PCs and servers.

> For example, exactly what does "Power-On via USB" in the BIOS do?

In that case the BIOS waits for a "resume". When a resume arrives on
USB, the machine starts. We have special keyboards with a power-on
button, and the trick is that this button issues a "resume" even if the
keyboard itself is not programmed to send resumes.

> I didn't expect the patch to solve the problem.  Nevertheless, I would 
> like to know exactly what effect it has on both kernels.  Can you 
> provide more details?

Here we go. That's the USB-related log with the patch:
usbhid 5-2:1.0: rkoenig handling
usbhid 5-2:1.0: shutdown
usbhid 5-1:1.1: rkoenig handling
usbhid 5-1:1.1: shutdown
usbhid 5-1:1.0: rkoenig handling
usbhid 5-1:1.0: shutdown
usb 5-2: rkoenig handling
usb 5-2: shutdown
usb 5-1: rkoenig handling
usb 5-1: shutdown
usb usb6-port4: rkoenig handling
usb usb6-port3: rkoenig handling
usb usb6-port2: rkoenig handling
usb usb6-port1: rkoenig handling
usb usb6: rkoenig handling
usb usb6: shutdown
ohci-pci 0000:00:13.0: rkoenig handling
ohci-pci 0000:00:13.0: shutdown
usb usb5-port4: rkoenig handling
usb usb5-port3: rkoenig handling
usb usb5-port2: rkoenig handling
usb usb5-port1: rkoenig handling
usb usb5: rkoenig handling
usb usb5: shutdown
ohci-pci 0000:00:12.0: rkoenig handling
ohci-pci 0000:00:12.0: shutdown
usb usb4-port4: rkoenig handling
usb usb4-port3: rkoenig handling
usb usb4-port2: rkoenig handling
usb usb4-port1: rkoenig handling
usb usb4: rkoenig handling
usb usb4: shutdown
ehci-pci 0000:00:13.2: rkoenig handling
ehci-pci 0000:00:13.2: shutdown
usb usb3-port4: rkoenig handling
usb usb3-port3: rkoenig handling
usb usb3-port2: rkoenig handling
usb usb3-port1: rkoenig handling
usb usb3: rkoenig handling
usb usb3: shutdown
ehci-pci 0000:00:12.2: rkoenig handling
ehci-pci 0000:00:12.2: shutdown
usb usb2-port2: rkoenig handling
usb usb2-port1: rkoenig handling
usb usb2: rkoenig handling
usb usb2: shutdown
usb usb1-port2: rkoenig handling
usb usb1-port1: rkoenig handling
usb usb1: rkoenig handling
usb usb1: shutdown
xhci_hcd 0000:00:10.0: rkoenig handling
xhci_hcd 0000:00:10.0: shutdown

> You should have unbound the controllers, not the devices.  That is, you
> should have unbound PCI devices 0000:00:12.0 and 0000:00:13.0 from
> ohci-pci (in /sys/bus/pci/drivers/ohci_pci), and 0000:00:12.2 and
> 0000:00:13.2 from ehci-pci (in /sys/bus/pci/drivers/ehci_pci).

Yeah, that one I tried and it works perfectly, even when the drivers
are compiled into the kernel.
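
For reference, the unbind step can be sketched as a small shell function.
This is only an illustration, not the exact commands that were run here:
`unbind_pci_device` and the `SYSFS_ROOT` override are made-up names, and
the driver directories are assumed to be named `ehci-pci` and `ohci-pci`,
matching the driver names in the log above. On a real system it needs root.

```shell
#!/bin/sh
# Sketch: unbind one PCI function from its host-controller driver via sysfs.
# SYSFS_ROOT is a hypothetical override so the function can be tried against
# a scratch directory; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

unbind_pci_device() {
    driver="$1"     # driver name, e.g. ehci-pci
    address="$2"    # PCI address, e.g. 0000:00:12.2
    driver_dir="$SYSFS_ROOT/bus/pci/drivers/$driver"

    # A bound device appears in the driver directory as an entry
    # (a symlink on real sysfs) named after its PCI address.
    if [ ! -e "$driver_dir/$address" ]; then
        echo "$address is not bound to $driver" >&2
        return 1
    fi

    # Writing the address to the driver's "unbind" file detaches the device.
    printf '%s' "$address" > "$driver_dir/unbind"
}
```

For example, `unbind_pci_device ehci-pci 0000:00:12.2` followed by the same
call for `0000:00:13.2` would detach both EHCI controllers.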

>> The keyboard/mouse still continued to work on my system (which btw is
> 
> Are they connected over USB?  If they are, removing ehci-pci won't make 
> any difference.  But without ohci-pci, they won't work -- unless they 
> are plugged into a USB-3 port.

Yes, keyboard and mouse are attached to USB ports.

>> running Ubuntu 16.04 for this tests). But now its getting strange:
>>
>> - if I shutdown the system at this point with "init 0" from a root shell
>>   it performs a shutdown, and it turns off! Yeah.
>>
>> - if I shutdown the system at this point by using the shutdown menu from
>>   the Ubuntu menu then the shutdown ends up in a kernel panic.
> 
> Don't you get any information about the panic on your serial console?  
> I would expect it to have a stack dump, at least.  A panic means 
> there's a bug, and it needs to be fixed.

That's what I get:

[  297.243132] general protection fault: 0000 [#1] SMP
[  297.250152] Modules linked in: amd_freq_sensitivity(E)
snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E)
kvm_amd(E) snd_hda_intel(E) kvm(E) snd_hda_codec(E) crct10dif_pclmul(E)
snd_hda_core(E) crc32_pclmul(E) snd_hwdep(E) ghash_clmulni_intel(E)
snd_pcm(E) aesni_intel(E) aes_x86_64(E) snd_seq_midi(E) lrw(E)
snd_seq_midi_event(E) input_leds(E) gf128mul(E) snd_rawmidi(E)
glue_helper(E) snd_seq(E) ablk_helper(E) snd_seq_device(E) snd_timer(E)
cryptd(E) edac_mce_amd(E) snd(E) serio_raw(E) edac_core(E)
fam15h_power(E) k10temp(E) soundcore(E) shpchp(E) fujitsu_laptop(E)
8250_fintek(E) i2c_piix4(E) mac_hid(E) parport_pc(E) ppdev(E) lp(E)
parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) amdkfd(E)
amd_iommu_v2(E) radeon(E) i2c_algo_bit(E) ttm(E) ohci_pci(E) ohci_hcd(E)
drm_kms_helper(E) ahci(E) xhci_pci(E) psmouse(E) r8169(E) libahci(E)
drm(E) xhci_hcd(E) mii(E) video(E) [last unloaded: ehci_hcd]
[  297.346784] CPU: 2 PID: 1 Comm: systemd-shutdow Tainted: G
E   4.2.0-rc4-bad #10
[  297.357795] Hardware name: FUJITSU D3313-A1/D3313-A1, BIOS V4.6.5.4
R1.17.0 for D3313-A1x 09/02/2016
[  297.369635] task: ffff8800688a0000 ti: ffff88006882c000 task.ti:
ffff88006882c000
[  297.379843] RIP: 0010:[<ffffffff815ce697>]  [<ffffffff815ce697>]
recursively_mark_NOTATTACHED+0x37/0xb0
[  297.392045] RSP: 0018:ffff88006882fd38  EFLAGS: 00010002
[  297.400146] RAX: 00000000355bd858 RBX: ffff8800355bd3a0 RCX:
00000000ffffffff
[  297.410114] RDX: 68894420894cc389 RSI: 0000000000000292 RDI:
ffff8800355bd3a0
[  297.420111] RBP: ffff88006882fd58 R08: 0000000000000000 R09:
0000000000000a72
[  297.430132] R10: ffffffff818657a0 R11: 0000000000000a72 R12:
ffff8800355bd3a0
[  297.440159] R13: ffff8800355bd430 R14: ffff8800355bd490 R15:
00000000fee1dead
[  297.450205] FS:  00007f7bcec7c840(0000) GS:ffff88006b300000(0000)
knlGS:0000000000000000
[  297.461248] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  297.469957] CR2: 00007f7bce1d1840 CR3: 0000000035845000 CR4:
00000000000406e0
[  297.480092] Stack:
[  297.485092]  ffff88006882fda8 ffff8800355bd3a0 0000000000000000
ffff8800355bd430
[  297.495634]  ffff88006882fd98 ffffffff815cf098 ffff88006882fd78
0000000000000292
[  297.506189]  ffff88006882fda8 ffff8800355bd448 ffff880062a46090
ffff8800355bd430
[  297.516754] Call Trace:
[  297.522287]  [<ffffffff815cf098>] usb_set_device_state+0xb8/0x130
[  297.531509]  [<ffffffff815deb17>] usb_dev_shutdown+0x17/0x20
[  297.540300]  [<ffffffff8151995d>] device_shutdown+0xed/0x1b0
[  297.549096]  [<ffffffff8109d875>] kernel_power_off+0x35/0x70
[  297.557879]  [<ffffffff8109da53>] SYSC_reboot+0x1a3/0x220
[  297.566401]  [<ffffffff810b3126>] ? set_next_entity+0xa6/0x440
[  297.575355]  [<ffffffff81013689>] ? __switch_to+0x1f9/0x5c0
[  297.584044]  [<ffffffff817b4f2a>] ? __schedule+0x36a/0x930
[  297.592645]  [<ffffffff811fd499>] ? vfs_writev+0x39/0x50
[  297.601064]  [<ffffffff8109db2e>] SyS_reboot+0xe/0x10
[  297.609228]  [<ffffffff817b9572>] entry_SYSCALL_64_fastpath+0x16/0x75
[  297.618799] Code: 54 53 49 89 fc 48 83 ec 08 48 85 ff 74 7f 48 8b 97
78 03 00 00 48 85 d2 74 73 8b 87 c0 04 00 00 85 c0 74 3c 48 8b 92 98 00
00 00 <4c> 8b aa d0 00 00 00 85 c0 7e 2a 31 db 49 8b 95 38 02 00 00 48
[  297.645744] RIP  [<ffffffff815ce697>]
recursively_mark_NOTATTACHED+0x37/0xb0
[  297.656161]  RSP <ffff88006882fd38>
[  297.662997] ---[ end trace 0ff1895c565b8fbf ]---
[  297.672080] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0000000b
[  297.672080]
[  297.688072] Kernel Offset: disabled
[  297.694987] drm_kms_helper: panic occurred, switching back to text
console
[  297.705361] ------------[ cut here ]------------
[  297.713485] WARNING: CPU: 2 PID: 130 at arch/x86/kernel/smp.c:124
native_smp_send_reschedule+0x60/0x70()
[  297.726561] Modules linked in: amd_freq_sensitivity(E)
snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E)
kvm_amd(E) snd_hda_intel(E) kvm(E) snd_hda_codec(E) crct10dif_pclmul(E)
snd_hda_core(E) crc32_pclmul(E) snd_hwdep(E) ghash_clmulni_intel(E)
snd_pcm(E) aesni_intel(E) aes_x86_64(E) snd_seq_midi(E) lrw(E)
snd_seq_midi_event(E) input_leds(E) gf128mul(E) snd_rawmidi(E)
glue_helper(E) snd_seq(E) ablk_helper(E) snd_seq_device(E) snd_timer(E)
cryptd(E) edac_mce_amd(E) snd(E) serio_raw(E) edac_core(E)
fam15h_power(E) k10temp(E) soundcore(E) shpchp(E) fujitsu_laptop(E)
8250_fintek(E) i2c_piix4(E) mac_hid(E) parport_pc(E) ppdev(E) lp(E)
parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) amdkfd(E)
amd_iommu_v2(E) radeon(E) i2c_algo_bit(E) ttm(E) ohci_pci(E) ohci_hcd(E)
drm_kms_helper(E) ahci(E) xhci_pci(E) psmouse(E) r8169(E) libahci(E)
drm(E) xhci_hcd(E) mii(E) video(E) [last unloaded: ehci_hcd]
[  297.832739] CPU: 2 PID: 130 Comm: kworker/2:2 Tainted: G      D     E
  4.2.0-rc4-bad #10
[  297.845183] Hardware name: FUJITSU D3313-A1/D3313-A1, BIOS V4.6.5.4
R1.17.0 for D3313-A1x 09/02/2016
[  297.858645]  0000000000000000 0000000058f5650f ffff88006b303d88
ffffffff817b2785
[  297.870479]  0000000000000000 0000000000000000 ffff88006b303dc8
ffffffff8107b7b6
[  297.882295]  ffff880062be2a68 0000000000000000 ffff88006b216580
0000000000000002
[  297.894116] Call Trace:
[  297.900893]  <IRQ>  [<ffffffff817b2785>] dump_stack+0x45/0x57
[  297.911041]  [<ffffffff8107b7b6>] warn_slowpath_common+0x86/0xc0
[  297.921444]  [<ffffffff8107b8ea>] warn_slowpath_null+0x1a/0x20
[  297.931660]  [<ffffffff8104c480>] native_smp_send_reschedule+0x60/0x70
[  297.942600]  [<ffffffff810b86fb>] trigger_load_balance+0x13b/0x230
[  297.953190]  [<ffffffff810a7a76>] scheduler_tick+0xa6/0xd0
[  297.962998]  [<ffffffff810f72a0>] ? tick_sched_handle.isra.14+0x60/0x60
[  297.973858]  [<ffffffff810e7c61>] update_process_times+0x51/0x60
[  297.984014]  [<ffffffff810f7265>] tick_sched_handle.isra.14+0x25/0x60
[  297.994508]  [<ffffffff810f72e4>] tick_sched_timer+0x44/0x80
[  298.004118]  [<ffffffff810e8573>] __hrtimer_run_queues+0xf3/0x220
[  298.014079]  [<ffffffff810e8ca8>] hrtimer_interrupt+0xa8/0x1a0
[  298.023704]  [<ffffffff8104ec7c>] local_apic_timer_interrupt+0x3c/0x70
[  298.033995]  [<ffffffff817bc201>] smp_apic_timer_interrupt+0x41/0x60
[  298.044072]  [<ffffffff817ba39b>] apic_timer_interrupt+0x6b/0x70
[  298.053757]  <EOI>  [<ffffffff810a40eb>] ? finish_task_switch+0x6b/0x1c0
[  298.064148]  [<ffffffff817b4f2a>] __schedule+0x36a/0x930
[  298.073112]  [<ffffffff817b5527>] schedule+0x37/0x80
[  298.081706]  [<ffffffff81094e6b>] worker_thread+0xcb/0x4c0
[  298.090819]  [<ffffffff81094da0>] ? process_one_work+0x440/0x440
[  298.100448]  [<ffffffff81094da0>] ? process_one_work+0x440/0x440
[  298.110042]  [<ffffffff8109b1d8>] kthread+0xd8/0xf0
[  298.118466]  [<ffffffff8109b100>] ? kthread_create_on_node+0x1b0/0x1b0
[  298.128534]  [<ffffffff817b998f>] ret_from_fork+0x3f/0x70
[  298.137454]  [<ffffffff8109b100>] ? kthread_create_on_node+0x1b0/0x1b0
[  298.147490] ---[ end trace 0ff1895c565b8fc0 ]---
[  298.155670] ---[ end Kernel panic - not syncing: Attempted to kill
init! exitcode=0x0000000b
[  298.155670]


>> I also rebuild the initrd image, but I really couldn't get rid of those
>> modules, after every new start lsmod still showed the ehci modules
>> despite the blacklist entries.
> 
> You probably have to tell the program that creates the initrd image to 
> blacklist them or leave them out entirely.  I don't know how to do this 
> for Ubuntu.

I unpacked the initrd image and it contains the /etc/modprobe.d/
directory and all the conf files there, so theoretically blacklisting
should work.
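
A quick way to double-check an unpacked image is to search its modprobe.d
files for the expected entries. The helper below is a minimal sketch:
`has_blacklist_entry` is a made-up name, and the extraction directory is
whatever path the image was unpacked to. It only looks for literal,
uncommented "blacklist <module>" lines. Note that, per modprobe.d(5), a
blacklist entry only makes modprobe ignore the module's aliases; a module
can still be loaded by name or as a dependency of another module, which
may or may not be what happened here.

```shell
#!/bin/sh
# Sketch: report whether a module has an uncommented "blacklist" line in
# the modprobe.d directory of an unpacked initrd image.
has_blacklist_entry() {
    initrd_root="$1"   # directory the initrd was extracted to
    module="$2"        # module name exactly as written in the conf file

    # -q: quiet, -s: no error output if no *.conf file exists.
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}([[:space:]]|\$)" \
        "$initrd_root"/etc/modprobe.d/*.conf
}
```

The function returns success only if a matching line exists, so it can be
used as `has_blacklist_entry /tmp/initrd ehci_pci && echo found`, where
`/tmp/initrd` is a placeholder path.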

>> Next step was disabling ehci support in the kernel config. Rebuilding
>> everything and now I have a bad kernel without ehci support that boots
>> up, is able to handle keyboard and mouse and I shutdown the system (even
>> from the menu) its shuts down and keeps off. So now it seems to behave
>> like the "good" kernel.
> 
> Therefore it appears that the problem is somehow caused by the 
> operation of shutting down the EHCI controller.  Perhaps it interrupts 
> the connections to the OHCI controller briefly, in a way that leads the 
> BIOS to believe that a "Power-On via USB" event has occurred.

Looks like it.

> Another possibility is to unbind ehci-pci just before shutting down, 
> for example as part of a shutdown script.

Yes, that is what I have now tried on Ubuntu, and it works perfectly.
This is what we will communicate to our partner who builds the thin
client Linux distribution for those machines. So even though we didn't
find the root cause of this problem, we now have an easy workaround
that solves the issue for the affected customers.
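
Such a shutdown script could look roughly like the sketch below. It is an
illustration under assumptions, not the script actually used: the function
names and the `SYSFS_ROOT` override are invented, the driver directory is
assumed to be `ehci-pci`, and the script is assumed to be called with the
shutdown action as its first argument, as systemd does for executables in
its system-shutdown directory. Other init systems would need a different
hook.

```shell
#!/bin/sh
# Sketch of a shutdown hook: detach every PCI function still bound to
# ehci-pci before the machine powers off.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"   # hypothetical override for dry runs

unbind_all_from_driver() {
    driver_dir="$SYSFS_ROOT/bus/pci/drivers/$1"
    [ -d "$driver_dir" ] || return 0        # driver absent: nothing to do

    # Bound devices are entries named like 0000:00:12.2; control files
    # such as "bind" and "unbind" do not match this pattern.
    for entry in "$driver_dir"/*:*:*.*; do
        [ -e "$entry" ] || continue         # pattern matched nothing
        printf '%s' "${entry##*/}" > "$driver_dir/unbind"
    done
}

ehci_shutdown_hook() {
    # Only act when the system is going down for good, not on reboot.
    case "$1" in
        poweroff|halt) unbind_all_from_driver ehci-pci ;;
    esac
}

# Without a matching action argument this call does nothing.
ehci_shutdown_hook "${1:-}"
```

Restricting the hook to poweroff and halt keeps reboots untouched, since
the unwanted restart was only seen when the machine should stay off.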

> Let me know what you find out.

If we ever meet in real life remind me that I owe you a beer. ;-)

Best regards
Rainer
-- 
Dipl.-Inf. (FH) Rainer Koenig
Project Manager Linux Clients
FJ EMEIA PR PSO PM&D CCD ENG SW OSS&C

Fujitsu Technology Solutions
Bürgermeister-Ullrich-Str. 100
86199 Augsburg
Germany

Telephone: +49-821-804-3321
Telefax:   +49-821-804-2131
Mail:      mailto:Rainer.Koenig@xxxxxxxxxxxxxx

Internet         ts.fujtsu.com
Company Details  ts.fujitsu.com/imprint.html