On 08/27/2015 02:29 PM, Casey Peter wrote:
> I'm running a Gigabyte 970A-D3P, and with "iommu=soft" kernel parameter set
> up, I don't have those errors either. (I did have them before turning iommu
> on in BIOS and setting the kernel parameter).
I think we are getting somewhere. There is an mce message near the CPU count:
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] smpboot: Allowing 8 CPUs, 0 hotplug CPUs
[ 0.000000] Booting paravirtualized kernel on bare hardware
[ 0.000000] setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:8
nr_node_ids:1
[ 0.000000] PERCPU: Embedded 33 pages/cpu @ffff88042ec00000 s95576 r8192
d31400 u262144
[ 0.000000] pcpu-alloc: s95576 r8192 d31400 u262144 alloc=1*2097152
[ 0.000000] pcpu-alloc: [0] 0 1 2 3 4 5 6 7
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-linux
root=UUID=515ef9dc-769f-4548-9a08-3a92fa83d86b rw quiet
[ 0.000000] Memory: 16395952K/16740972K available (5699K kernel code, 893K
rwdata, 1732K rodata, 1180K init, 1152K bss, 345020K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=8, Nodes=1
[ 0.000000] RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=8.
[ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[ 0.009332] CPU: Physical Processor ID: 0
[ 0.009333] CPU: Processor Core ID: 0
[ 0.009334] mce: CPU supports 7 MCE banks
[ 0.230921] smpboot: CPU0: AMD FX(tm)-8350 Eight-Core Processor (fam: 15,
model: 02, stepping: 00)
[ 0.247684] NMI watchdog: enabled on all CPUs, permanently consumes one
hw-PMU counter.
[ 0.254353] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7
[ 0.364267] x86: Booted up 1 node, 8 CPUs
[ 0.391139] cpuidle: using governor ladder
[ 0.404490] cpuidle: using governor menu
[ 0.405039] mtrr: your CPUs had inconsistent variable MTRR settings
[ 0.405040] mtrr: probably your BIOS does not setup all CPUs.
I've tried setting "amd_iommu=on" in /etc/default/grub. I'll try "iommu=soft"
and report back. Is there anything else to check? Funny, the IOMMU
initialization itself doesn't seem to trigger any issue:
[ 0.792454] Unpacking initramfs...
[ 0.843735] Freeing initrd memory: 3924K (ffff880037846000 - ffff880037c1b000)
[ 0.844350] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 0.844351] AMD-Vi: Interrupt remapping enabled
[ 0.855146] AMD-Vi: Lazy IO/TLB flushing enabled
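One thing worth ruling out: the "Kernel command line" entry in the log above
shows no iommu option at all, although that boot may predate my grub change.
A minimal sketch in Python for checking which IOMMU options a kernel command
line carries. The function names are hypothetical, and it only looks at the
"iommu" and "amd_iommu" keys:

```python
def iommu_options(cmdline):
    """Return the iommu-related options found in a kernel command line."""
    found = {}
    for token in cmdline.split():
        # Options look like key=value; bare flags such as "quiet" have no "=".
        key, _, value = token.partition("=")
        if key in ("iommu", "amd_iommu"):
            found[key] = value
    return found


def running_kernel_options(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as fh:
        return iommu_options(fh.read())
```

Calling running_kernel_options() after a reboot should show whether the
parameter actually reached the kernel. An empty result would suggest the grub
configuration was not regenerated after editing the defaults file.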
My issue explodes right after xhci_hcd brings up the controller at 0000:02:00.0:
[ 1.159635] ohci-pci: OHCI PCI platform driver
[ 1.165660] ehci-pci 0000:00:12.2: USB 2.0 started, EHCI 1.00
[ 1.165859] hub 1-0:1.0: USB hub found
[ 1.165868] hub 1-0:1.0: 5 ports detected
[ 1.166060] xhci_hcd 0000:02:00.0: xHCI Host Controller
[ 1.166068] xhci_hcd 0000:02:00.0: new USB bus registered, assigned bus number 2
[ 1.166126] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
[ 1.167066] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
[ 1.168025] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 1.202519] AMD-Vi: Event logged [
[ 1.202571] input: AT Translated Set 2 keyboard as
/devices/platform/i8042/serio0/input/input0
[ 1.202829] IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
[ 1.202843] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 1.216256] AMD-Vi: Event logged [
[ 1.216326] firewire_ohci 0000:04:0e.0: added OHCI v1.10 device as card 0, 4
IR + 8 IT contexts, quirks 0x11
[ 1.216547] IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
[ 1.216563] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 1.716168] firewire_core 0000:04:0e.0: created device fw0: GUID
0014aafc64aa2c00, S400
[ 1.716813] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 1.932839] tsc: Refined TSC clocksource calibration: 4018.289 MHz
[ 1.932842] clocksource tsc: mask: 0xffffffffffffffff max_cycles:
0x39ebd986d5e, max_idle_ns: 440795317543 ns
[ 1.935061] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 2.937205] AMD-Vi: Event logged [
[ 2.937208] Switched to clocksource tsc
[ 2.937495] IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
[ 2.941453] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016
address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 20.090108] xhci_hcd 0000:02:00.0: can't setup: -110
[ 20.094746] xhci_hcd 0000:02:00.0: USB bus 2 deregistered
[ 20.094771] ehci-pci 0000:00:13.2: EHCI Host Controller
[ 20.094778] ehci-pci 0000:00:13.2: new USB bus registered, assigned bus number 2
[ 20.094783] ehci-pci 0000:00:13.2: applying AMD SB700/SB800/Hudson-2/3 EHCI
dummy qh workaround
[ 20.094791] ehci-pci 0000:00:13.2: debug port 1
[ 20.094796] xhci_hcd 0000:02:00.0: init 0000:02:00.0 fail, -110
[ 20.094837] ehci-pci 0000:00:13.2: irq 17, io mem 0xfe507000
[ 20.099716] xhci_hcd: probe of 0000:02:00.0 failed with error -110
[ 20.104621] ehci-pci 0000:00:13.2: USB 2.0 started, EHCI 1.00
[ 20.104805] hub 2-0:1.0: USB hub found
[ 20.104811] hub 2-0:1.0: 5 ports detected
[ 20.105034] ehci-pci 0000:00:16.2: EHCI Host Controller
[ 20.105039] ehci-pci 0000:00:16.2: new USB bus registered, assigned bus number 3
[ 20.105042] ehci-pci 0000:00:16.2: applying AMD SB700/SB800/Hudson-2/3 EHCI
dummy qh workaround
[ 20.105050] ehci-pci 0000:00:16.2: debug port 1
[ 20.105073] ehci-pci 0000:00:16.2: irq 17, io mem 0xfe504000
[ 20.114633] ehci-pci 0000:00:16.2: USB 2.0 started, EHCI 1.00
[ 20.114787] hub 3-0:1.0: USB hub found
[ 20.114794] hub 3-0:1.0: 4 ports detected
[ 20.115031] ohci-pci 0000:00:12.0: OHCI PCI host controller
[ 20.115039] ohci-pci 0000:00:12.0: new USB bus registered, assigned bus number 4
[ 20.115065] ohci-pci 0000:00:12.0: irq 18, io mem 0xfe50a000
[ 20.172168] hub 4-0:1.0: USB hub found
[ 20.172177] hub 4-0:1.0: 5 ports detected
[ 20.172396] ohci-pci 0000:00:13.0: OHCI PCI host controller
[ 20.172401] ohci-pci 0000:00:13.0: new USB bus registered, assigned bus number 5
[ 20.172418] ohci-pci 0000:00:13.0: irq 18, io mem 0xfe508000
[ 20.228880] hub 5-0:1.0: USB hub found
[ 20.228889] hub 5-0:1.0: 5 ports detected
[ 20.229111] ohci-pci 0000:00:14.5: OHCI PCI host controller
[ 20.229117] ohci-pci 0000:00:14.5: new USB bus registered, assigned bus number 6
[ 20.229134] ohci-pci 0000:00:14.5: irq 18, io mem 0xfe506000
[ 20.285567] hub 6-0:1.0: USB hub found
[ 20.285575] hub 6-0:1.0: 2 ports detected
[ 20.285739] ohci-pci 0000:00:16.0: OHCI PCI host controller
[ 20.285744] ohci-pci 0000:00:16.0: new USB bus registered, assigned bus number 7
[ 20.285759] ohci-pci 0000:00:16.0: irq 18, io mem 0xfe505000
<snip boot continues normally>
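Since I snipped most of the repeats, here is a rough sketch in Python for
tallying the IO_PAGE_FAULT events per device and address from saved dmesg
output. The regular expression is my own guess based on the lines above, not
anything taken from the kernel source, and count_faults is a hypothetical
helper name:

```python
import re
from collections import Counter

# Matches the event body even when the line is wrapped or when another
# driver's message was interleaved after "Event logged [".
FAULT_RE = re.compile(
    r"IO_PAGE_FAULT\s+device=(?P<device>\S+)"
    r"\s+domain=(?P<domain>0x[0-9a-fA-F]+)"
    r"\s+address=(?P<address>0x[0-9a-fA-F]+)"
    r"\s+flags=(?P<flags>0x[0-9a-fA-F]+)"
)


def count_faults(log_text):
    """Count IO_PAGE_FAULT events, keyed by (device, address)."""
    return Counter(
        (m["device"], m["address"]) for m in FAULT_RE.finditer(log_text)
    )
```

Feeding it the text of a saved dmesg file should show whether every fault
comes from 02:00.0 at the same address, as it appears to in the excerpt.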
I'll keep digging, but this has got me stumped.
--
David C. Rankin, J.D.,P.E.