On Tue, 14 Jan 2014 17:15:25 +0800
Wei Yang <weiyang@xxxxxxxxxxxxxxxxxx> wrote:

> >> Error log:
> >> mlx4_core 0003:05:00.0: Multiple PFs not yet supported. Skipping
> >> PF. mlx4_core: probe of 0003:05:00.0 failed with error -22
> >
> >1. Please include the full error log starting from host bootup and
> >driver start
>
> Will provide one later.

From procedure __mlx4_init_one() in file
drivers/net/ethernet/mellanox/mlx4/main.c:

        /* We reset the device and enable SRIOV only for physical
         * devices.  Try to claim ownership on the device;
         * if already taken, skip -- do not allow multiple PFs */
==>     err = mlx4_get_ownership(dev);
        if (err) {
                if (err < 0)
                        goto err_free_dev;
                else {
                        mlx4_warn(dev, "Multiple PFs not yet supported."
                                  " Skipping PF.\n");
                        err = -EINVAL;
                        goto err_free_dev;
                }
        }

===================

#define MLX4_OWNER_BASE 0x8069c
#define MLX4_OWNER_SIZE 4

static int mlx4_get_ownership(struct mlx4_dev *dev)
{
        void __iomem *owner;
        u32 ret;

        if (pci_channel_offline(dev->pdev))
                return -EIO;

==>     owner = ioremap(pci_resource_start(dev->pdev, 0) + MLX4_OWNER_BASE,
                        MLX4_OWNER_SIZE);
        if (!owner) {
                mlx4_err(dev, "Failed to obtain ownership bit\n");
                return -ENOMEM;
        }

        ret = readl(owner);
        iounmap(owner);
==>     return (int) !!ret;
}

I suspect one of the following scenarios:

1. BAR 0 contains a "PPF Selection" (i.e., ownership) semaphore.
   The first PF probe that reads the semaphore "acquires" it: the first
   read returns zero and grabs the semaphore, and subsequent reads return
   non-zero. When the PF driver is unloaded, it calls
   mlx4_free_ownership(), which writes a zero into the semaphore dword so
   that the next read will return zero.

   In this scenario, the initialization of the "PPF Selection" semaphore
   to zero has been compromised somehow, so that even the first read
   attempt returns a non-zero value. Note that the ioremap DID succeed,
   or we would see the "Failed to obtain ownership bit" message in the
   error log.
   Maybe pre-fetching has something to do with this? (I.e., if the BAR
   is not prefetched, the initial value of the semaphore may be
   compromised.)

2. For some reason the same PF is being probed twice by the kernel.
   In this case the second probe attempt fails because the PF has
   already been probed once.

This is why I want to see the entire log: to check whether the device is
indeed being "double-probed".

-Jack
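
The two scenarios can be compared with a small userspace model of the
semaphore. This is only a sketch under the assumption that the ownership
dword behaves as described in scenario 1 (a read returns the old value and
takes the semaphore; a write of zero releases it). The names owner_sem,
sem_read(), sem_release(), model_get_ownership() and model_selfcheck() are
hypothetical and are not part of the mlx4 driver; only the return
convention is copied from mlx4_get_ownership() quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the ownership dword: 0 = free, non-zero = owned.
 * This is NOT the real hardware interface, just the assumed behaviour. */
struct owner_sem {
        uint32_t value;
};

/* Assumed device behaviour on a read: hand back the current value and
 * mark the semaphore as taken. */
static uint32_t sem_read(struct owner_sem *sem)
{
        uint32_t old = sem->value;

        sem->value = 1;
        return old;
}

/* Assumed release path: writing zero frees the semaphore again. */
static void sem_release(struct owner_sem *sem)
{
        sem->value = 0;
}

/* Same return convention as the quoted mlx4_get_ownership():
 * 0 = ownership acquired, 1 = semaphore was already taken. */
static int model_get_ownership(struct owner_sem *sem)
{
        return (int) !!sem_read(sem);
}

/* Walks through both scenarios; aborts if the model misbehaves. */
static void model_selfcheck(void)
{
        struct owner_sem good = { 0 };          /* correctly initialized */
        struct owner_sem bad = { 0xdeadbeef };  /* scenario 1: bad init */

        assert(model_get_ownership(&good) == 0); /* first probe wins */
        assert(model_get_ownership(&good) == 1); /* scenario 2: re-probe */
        sem_release(&good);
        assert(model_get_ownership(&good) == 0); /* free again after unload */

        assert(model_get_ownership(&bad) == 1);  /* fails on first read */
}
```

In this model both scenarios end with model_get_ownership() returning 1,
which is the path that prints "Multiple PFs not yet supported" in the
driver. The single failing line therefore cannot tell them apart; only the
number of probe attempts for 0003:05:00.0 in the full log can.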