On Thu, 17 Dec 2020, 9:48am, Bjorn Helgaas wrote: > ---------------------------------------------------------------------- > On Wed, Dec 16, 2020 at 04:44:47PM -0800, Arun Easi wrote: > > Hi Bjorn, > > > > This is regarding the blacklisting of one of the Marvell QLogic FC > > adapter (1077/2261) on VPD area access. The commit that did was > > this: > > > > --8<-- pruned commit message --8<-- > > | commit 0d5370d1d85251e5893ab7c90a429464de2e140b > > | Author: Ethan Zhao <ethan.zhao@xxxxxxxxxx> > > | Date: Mon Feb 27 17:08:44 2017 +0900 > > | > > | PCI: Prevent VPD access for QLogic ISP2722 > > | Call Trace: > > | <NMI> [<ffffffff817193de>] dump_stack+0x63/0x81 > > | [<ffffffff81714072>] panic+0xd0/0x20e > > | [<ffffffff8101c8b4>] do_nmi+0xf4/0x170 > > | <<EOE>> [<ffffffff815db4b3>] raw_pci_read+0x23/0x40 > > | [<ffffffff815db4fc>] pci_read+0x2c/0x30 > > | [<ffffffff8136f612>] pci_user_read_config_word+0x72/0x110 > > | [<ffffffff8136f746>] pci_vpd_pci22_wait+0x96/0x130 > > | [<ffffffff8136ff9b>] pci_vpd_pci22_read+0xdb/0x1a0 > > | [<ffffffff8136ea30>] pci_read_vpd+0x20/0x30 > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_kernel_git_torvalds_linux.git_commit_-3Fid-3D0d5370d1d85251e5893ab7c90a429464de2e140b&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=P-q_Qkt75qFy33SvdD2nAxAyN87eO1d-mFO-lqNOomw&m=ELM92cQ6_T7TDXwxkSNKe9OLqFeqz7taCSqUM65T0s0&s=FIa8kmuzs8pePhWEFklKpu49_i72T5mJMZgLp1QrEg0&e= > > > > While investigating the original report of the issue, I found an > > interesting information that may explain why Ethan Zhao was > > hitting the NMI/crash. > > > > If you notice the stack referred in the commit, you could see that > > a bunch of old vpd access functions, pci_vpd_pci22*, were referred. > > When these functions were in use (these functions were renamed > > around 2016 Feb by f1cd93f9aabe), there was a critical VPD > > access bug missing; missing in fact most of the life of those > > functions. > > > > This one: > > 104daa71b396: PCI: Determine actual VPD size on first access > > > > Without this patch, a read of the vpd sysfs file can access area > > outside of VPD space, which is not allowed by the spec. > > > > My guess here is that, Ethan, when trying to access VPD of the QLogic > > 1077/2261 adapter, was using a kernel that had the bug present and > > it led up to the NMI/crash he was observing on his machine. > > > > We had an early firmware that returned CA on VPD access beyond > > bounds that is known to NMI some servers. The FW has since changed to > > not clear the VPD flag upon out-of-bound access. > > > > In light of the above, plus the fact that I did try the > > experiment on multiple setups and was not able to reproduce the > > issue, would you be willing to revert the above patch? If so, I > > could send a git revert (or equivalent) patch of the commit. > > > > This blacklisting is preventing multiple customers from accessing > > the VPD area of the said production adapter and making their life > > a bit difficult. > > > > Regards, > > -Arun > > > > Old discussion of the topic: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lkml.org_lkml_2019_5_21_991&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=P-q_Qkt75qFy33SvdD2nAxAyN87eO1d-mFO-lqNOomw&m=ELM92cQ6_T7TDXwxkSNKe9OLqFeqz7taCSqUM65T0s0&s=1vnPz2Y-ziZnOVBFLuc9QupJeVDdtUUJTMJrNknSe7o&e= > > It makes sense to revert 0d5370d1d852 ("PCI: Prevent VPD access for > QLogic ISP2722") as long as the resulting kernel works correctly on > all versions of the 2722 firmware. We have to assume some customers > still have the old firmware on their adapters. > > Bjorn > Thanks Bjorn. I'll post the patch with a stable cc of v4.6+, which has the VPD size check available. Regards, -Arun