sysfs: cannot create duplicate filename

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi

I ran into this error when upgrading to kernel 5.14 (it works
pre-5.14). The driver gets loaded (verified via lsmod), but the ib
utils don't show any ports.

There's a bugreport on elrepo
(https://elrepo.org/bugs/view.php?id=1176) that looks pretty much
identical, but it looks stalled.
I've bisected the kernel and commit
4a7aaf88c89f12f8048137e274ce0d40fe1056b2 seems to be the culprit.
Since I'm absolutely no dev, I'm hoping someone here can figure out
what exactly is going wrong. Latest mainline kernel still has the same
behavior.

Relevant output and logs below.

$ sudo lspci -s 04:00.0 -vv
04:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02)
Subsystem: QLogic Corp. IBA7322 QDR InfiniBand HCA
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 39
IOMMU group: 22
Region 0: Memory at fbc00000 (64-bit, non-prefetchable) [size=4M]
Expansion ROM at fc000000 [disabled] [size=128K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <4us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s (ok), Width x4 (downgraded)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+ NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-
EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [b0] MSI-X: Enable- Count=32 Masked-
Vector table: BAR=0 offset=00008000
PBA: BAR=0 offset=00009000
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+
MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Kernel modules: ib_qib

Relevant /var/log/messages section:
Feb 13 09:43:29 localhost kernel: ib_qib 0000:04:00.0: qib0: Reserving
QPNs from 0x656b78 to 0x656b78 for non-verbs use
Feb 13 09:43:29 localhost kernel: sysfs: cannot create duplicate
filename '/devices/pci0000:00/0000:00:01.2/0000:01:00.0/0000:02:02.0/0000:04:00.0/infiniband/qib0/ports/1/linkcontrol'
Feb 13 09:43:29 localhost kernel: CPU: 15 PID: 1118 Comm: modprobe
Tainted: G           OE     5.17.0-rc3-robin+ #20
Feb 13 09:43:29 localhost kernel: Hardware name: ASUS System Product
Name/TUF GAMING X570-PRO (WI-FI), BIOS 3001 12/04/2020
Feb 13 09:43:29 localhost kernel: Call Trace:
Feb 13 09:43:29 localhost kernel: <TASK>
Feb 13 09:43:29 localhost kernel: dump_stack_lvl+0x48/0x5e
Feb 13 09:43:29 localhost kernel: sysfs_warn_dup.cold+0x17/0x24
Feb 13 09:43:29 localhost kernel: internal_create_group+0x348/0x350
Feb 13 09:43:29 localhost kernel: internal_create_groups.part.0+0x3d/0xa0
Feb 13 09:43:29 localhost kernel: setup_port+0x370/0x680 [ib_core]
Feb 13 09:43:29 localhost kernel: ? kobject_add+0x6e/0x90
Feb 13 09:43:29 localhost kernel: ib_setup_port_attrs+0x88/0x220 [ib_core]
Feb 13 09:43:29 localhost kernel: ib_register_device+0x576/0x650 [ib_core]
Feb 13 09:43:29 localhost kernel: ? __vmalloc_node+0x4a/0x70
Feb 13 09:43:29 localhost kernel: rvt_register_device+0x10c/0x270 [rdmavt]
Feb 13 09:43:29 localhost kernel: qib_register_ib_device+0x5f8/0x7a0 [ib_qib]
Feb 13 09:43:29 localhost kernel: qib_init_one+0x17f/0x470 [ib_qib]
Feb 13 09:43:29 localhost kernel: local_pci_probe+0x45/0x80
Feb 13 09:43:29 localhost kernel: ? pci_match_device+0xd7/0x130
Feb 13 09:43:29 localhost kernel: pci_device_probe+0xaa/0x1a0
Feb 13 09:43:29 localhost kernel: really_probe+0x1f5/0x3d0
Feb 13 09:43:29 localhost kernel: __driver_probe_device+0xfe/0x180
Feb 13 09:43:29 localhost kernel: driver_probe_device+0x1e/0x90
Feb 13 09:43:29 localhost kernel: __driver_attach+0xc0/0x1c0
Feb 13 09:43:29 localhost kernel: ? __device_attach_driver+0xe0/0xe0
Feb 13 09:43:29 localhost kernel: ? __device_attach_driver+0xe0/0xe0
Feb 13 09:43:29 localhost kernel: bus_for_each_dev+0x64/0x90
Feb 13 09:43:29 localhost kernel: bus_add_driver+0x149/0x1e0
Feb 13 09:43:29 localhost kernel: driver_register+0x8f/0xe0
Feb 13 09:43:29 localhost kernel: ? qib_init_qibfs+0x11/0x11 [ib_qib]
Feb 13 09:43:29 localhost kernel: qib_ib_init+0x3e/0xf62 [ib_qib]
Feb 13 09:43:29 localhost kernel: do_one_initcall+0x44/0x200
Feb 13 09:43:29 localhost kernel: ? kmem_cache_alloc_trace+0x163/0x2c0
Feb 13 09:43:29 localhost kernel: do_init_module+0x4c/0x260
Feb 13 09:43:29 localhost kernel: __do_sys_finit_module+0x9b/0xf0
Feb 13 09:43:29 localhost kernel: do_syscall_64+0x3b/0x90
Feb 13 09:43:29 localhost kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
Feb 13 09:43:29 localhost kernel: RIP: 0033:0x7fae66bf5ecd
Feb 13 09:43:29 localhost kernel: Code: 5b 41 5c c3 66 0f 1f 84 00 00
00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d
89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b ef
0e 00 f7 d8 64 89 01 48
Feb 13 09:43:29 localhost kernel: RSP: 002b:00007ffdb7beb198 EFLAGS:
00000246 ORIG_RAX: 0000000000000139
Feb 13 09:43:29 localhost kernel: RAX: ffffffffffffffda RBX:
00007fae677c8c80 RCX: 00007fae66bf5ecd
Feb 13 09:43:29 localhost kernel: RDX: 0000000000000000 RSI:
00007fae67173a2a RDI: 0000000000000003
Feb 13 09:43:29 localhost kernel: RBP: 0000000000040004 R08:
0000000000000000 R09: 0000000000000000
Feb 13 09:43:29 localhost kernel: R10: 0000000000000003 R11:
0000000000000246 R12: 00007fae67173a2a
Feb 13 09:43:29 localhost kernel: R13: 00007fae677c8ed0 R14:
00007fae677c8c80 R15: 00007fae677c9980
Feb 13 09:43:29 localhost kernel: </TASK>
Feb 13 09:43:29 localhost kernel: infiniband qib0: Couldn't register
device with driver model
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: qib0: Failed to
register driver with ib core.
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: qib0: cannot
register verbs: 17!
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling
notifier on HCA 0 irq 140
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling
notifier on HCA 0 irq 141
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling
notifier on HCA 0 irq 142
Feb 13 09:43:33 localhost kernel: ib_qib 0000:04:00.0: Disabling
notifier on HCA 0 irq 144
Feb 13 09:43:33 localhost kernel: ib_qib: probe of 0000:04:00.0 failed
with error -17

$ ibstatus
Fatal error:  device '*': sys files not found (/sys/class/infiniband/*/ports)

$ git bisect bad
4a7aaf88c89f12f8048137e274ce0d40fe1056b2 is the first bad commit
commit 4a7aaf88c89f12f8048137e274ce0d40fe1056b2
Author: Jason Gunthorpe <jgg@xxxxxxxxxx>
Date:   Fri Jun 11 19:00:30 2021 +0300

    RDMA/qib: Use attributes for the port sysfs

    qib should not be creating a mess of kobjects to attach to the port
    kobject - this is all attributes. The proper API is to create an
    attribute_group list and create it against the port's kobject.

    Link: https://lore.kernel.org/r/911e0031e1ed495b0006e8a6efec7b67a702cd5e.1623427137.git.leonro@xxxxxxxxxx
    Tested-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxxxxxxxxxxxxx>
    Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
    Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>

 drivers/infiniband/hw/qib/qib.h       |   4 -
 drivers/infiniband/hw/qib/qib_sysfs.c | 606 ++++++++++++++--------------------
 2 files changed, 254 insertions(+), 356 deletions(-)


Cheers
Robin



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux