Re: [PATCH 4/4] block: expose devt for GENHD_FL_HIDDEN disks


 



On 12/14/18 10:54 AM, Hannes Reinecke wrote:
On 12/14/18 10:06 AM, Thadeu Lima de Souza Cascardo wrote:
On Fri, Dec 14, 2018 at 06:56:06AM -0200, Thadeu Lima de Souza Cascardo wrote:
On Fri, Dec 14, 2018 at 08:47:20AM +0100, Hannes Reinecke wrote:
But you haven't answered my question:

Why can't we patch 'lsblk' to provide the required information even with the
current sysfs layout?



Just to be clear here: if by 'current sysfs layout' you mean without any of the patches we have been talking about, lsblk is not broken. It just works with nvme multipath enabled. It shows the multipath devices and simply ignores the underlying/hidden paths. If we hid them, we meant for them to be hidden, right?

What I am trying to fix here is how to find out which PCI device/driver is needed to get to the block device holding the root filesystem, which is what initramfs needs. The nvme multipath device is a virtual device: it points to no driver at all and has no relation to the underlying devices it needs in order to work.
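
To make the initramfs requirement concrete, here is a minimal Python sketch of the lookup described above: start at a device directory in sysfs and walk towards the root, collecting the name of every bound driver found along the way. The function name `drivers_on_path` and the `stop_at` parameter are hypothetical, introduced only for illustration; this is not how any particular initramfs generator is implemented.

```python
import os

def drivers_on_path(device_dir, stop_at="/sys"):
    """Collect driver names found while walking up from device_dir.

    Hypothetical helper: each sysfs device directory that is bound to a
    driver carries a 'driver' symlink; its basename is the driver name.
    """
    found = []
    current = os.path.realpath(device_dir)
    stop = os.path.realpath(stop_at)
    while current != stop:
        link = os.path.join(current, "driver")
        if os.path.islink(link):
            found.append(os.path.basename(os.path.realpath(link)))
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return found
```

For a device that lives under /sys/devices/virtual and has no bound ancestor, such a walk returns an empty list, which is the situation described above for the multipath device.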


Well ...
But this is an entirely different proposition.
The 'slaves'/'holders' trick just allows mapping the relationship between _block_ devices, which arguably is a bit pointless here seeing that we don't actually have block devices for the underlying devices. But even if we _were_ implementing that, you would still fail to get to the PCI device providing the block devices, as there is no link pointing from one to the other.

With the current layout we have this hierarchy:

NVMe namespace (/dev/nvmeXn1Y) -> NVMe-subsys -> NVMe controller

and the NVMe controller is missing a link pointing to the device presenting the controller:

# ls -l /sys/devices/virtual/nvme-fabrics/ctl/nvme2
total 0
-r--r--r-- 1 root root 4096 Dec 13 13:18 address
-r--r--r-- 1 root root 4096 Dec 13 13:18 cntlid
--w------- 1 root root 4096 Dec 13 13:18 delete_controller
-r--r--r-- 1 root root 4096 Dec 13 13:18 dev
lrwxrwxrwx 1 root root    0 Dec 13 13:18 device -> ../../ctl
-r--r--r-- 1 root root 4096 Dec 13 13:18 firmware_rev
-r--r--r-- 1 root root 4096 Dec 13 13:18 model
drwxr-xr-x 9 root root    0 Dec  3 13:55 nvme2c64n1
drwxr-xr-x 2 root root    0 Dec 13 13:18 power
--w------- 1 root root 4096 Dec 13 13:18 rescan_controller
--w------- 1 root root 4096 Dec 13 13:18 reset_controller
-r--r--r-- 1 root root 4096 Dec 13 13:18 serial
-r--r--r-- 1 root root 4096 Dec 13 13:18 state
-r--r--r-- 1 root root 4096 Dec 13 13:18 subsysnqn
lrwxrwxrwx 1 root root    0 Dec  3 13:44 subsystem -> ../../../../../class/nvme
-r--r--r-- 1 root root 4096 Dec 13 13:18 transport
-rw-r--r-- 1 root root 4096 Dec 13 13:18 uevent

So what we need to do is to update the 'device' link to point to the PCI device providing the controller. (Actually, the 'device' link would need to point to the entity providing the transport address, but I guess we don't have that for now.)

And _that's_ what we need to fix; the slaves/holders stuff doesn't solve the underlying problem, and really shouldn't be merged at all.
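
As an illustration of the check involved, the following Python sketch resolves a controller's 'device' link and reports whether it ends up under /sys/devices/virtual (as in the fabrics listing above) or on a real device. The helper name `controller_parent` and the `sysfs_root` parameter are assumptions made for this example only.

```python
import os

def controller_parent(ctrl_dir, sysfs_root="/sys"):
    """Return (resolved 'device' target, is_virtual) for a controller dir.

    Hypothetical helper: returns (None, False) when the controller
    directory has no 'device' symlink at all.
    """
    link = os.path.join(ctrl_dir, "device")
    if not os.path.islink(link):
        return None, False
    target = os.path.realpath(link)
    virtual_root = os.path.join(os.path.realpath(sysfs_root),
                                "devices", "virtual")
    # The target is virtual if it sits inside <sysfs_root>/devices/virtual.
    is_virtual = os.path.commonpath([target, virtual_root]) == virtual_root
    return target, is_virtual
```

Called on a controller such as /sys/class/nvme/nvme0, the first element would be the parent device directory; a True second element means there is no hardware device to derive a driver from.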

Mind you, it _does_ work for PCI-NVMe:

# ls -l /sys/class/nvme/nvme0
total 0
-r--r--r--  1 root root 4096 Dec 14 11:14 cntlid
-r--r--r--  1 root root 4096 Dec 14 11:14 dev
lrwxrwxrwx  1 root root    0 Dec 14 11:14 device -> ../../../0000:45:00.0
-r--r--r--  1 root root 4096 Dec 14 11:14 firmware_rev
-r--r--r--  1 root root 4096 Dec 14 11:14 model
drwxr-xr-x 12 root root    0 Dec  3 13:43 nvme1n1
drwxr-xr-x  2 root root    0 Dec 14 11:14 power
--w-------  1 root root 4096 Dec 14 11:14 rescan_controller
--w-------  1 root root 4096 Dec 14 11:14 reset_controller
-r--r--r--  1 root root 4096 Dec 14 11:14 serial
-r--r--r--  1 root root 4096 Dec 14 11:14 state
-r--r--r--  1 root root 4096 Dec 14 11:14 subsysnqn
lrwxrwxrwx  1 root root    0 Dec  3 13:43 subsystem -> ../../../../../../class/nvme
-r--r--r--  1 root root 4096 Dec 14 11:14 transport
-rw-r--r--  1 root root 4096 Dec 14 11:14 uevent

So it might be as simple as this patch:

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index feb86b59170e..1ecdec6b8b4a 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3117,7 +3117,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
         * Defer this to the connect path.
         */

-       ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_fc_ctrl_ops, 0);
+       ret = nvme_init_ctrl(&ctrl->ctrl, ctrl->dev, &nvme_fc_ctrl_ops, 0);
        if (ret)
                goto out_cleanup_admin_q;


As for RDMA / TCP we're running on a network address which really isn't tied to a specific device, so we wouldn't have any device to hook on without some trickery.
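
For the network transports, about the only identifying information available from the controller directory is the 'transport' and 'address' attributes shown in the fabrics listing above. A rough Python sketch for reading them follows; the helper names are hypothetical, and the comma-separated key=value layout of 'address' (for example traddr=...,trsvcid=...) is an assumption about the attribute's format rather than something shown in this thread.

```python
import os

def read_attr(ctrl_dir, name):
    """Return the stripped contents of a sysfs attribute, or None."""
    try:
        with open(os.path.join(ctrl_dir, name)) as f:
            return f.read().strip()
    except OSError:
        return None

def transport_info(ctrl_dir):
    """Hypothetical helper: gather transport name and address fields."""
    info = {"transport": read_attr(ctrl_dir, "transport"), "fields": {}}
    address = read_attr(ctrl_dir, "address")
    if address:
        # Assumed format: comma-separated key=value pairs.
        for part in address.split(","):
            key, sep, value = part.partition("=")
            if sep:
                info["fields"][key] = value
    return info
```

A PCI controller has no 'address' attribute in the listing above, so for it the fields dictionary would simply stay empty.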

Cheers,

Hannes
--
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@xxxxxxx			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


