[...] >>> >> I took a peek at the v1, but suffice to say didn't follow it that >> closely since there's a number of changes in v2 related to patch >> ordering. You could get 3 different opinions on that matter too! > > Hopefully a consensus before too long. :) > yeah - v1 was lots of little steps that you were asked to combine. OTOH there are those that like to combine multiple things in one patch which can make it really difficult to follow. The order here was "followable" - my suggestion was more to try and get the guts (qemu) done first, then the fluff (domain/xml). Consider your patch order to be making progress towards some sustainable feature/bug fix - each step along the way being able to pass "make check syntax-check" so that future attempts to git bisect can point the fickle finger of fate (at someone else!). If a patch adds something that will be external that a followup patch will use - then great. If it's testable, even better. For example, adding virSCSIOpenVhost as a separate patch with the tests/qemuxml2argvmock.c is fine. One more tidbit here - it's a very faint memory, but check out tests/commandtest.c and the lengthy comment in mymain() regarding "expected" fd values. I think you're "safe" in what you've done - although your mock should mimic whatever changes you make to the primary function. [...] >> A wwpn doesn't generally have the "naa." prefix on it. So, why not just >> add that to the command line instead? That is a wwpn is a 16 hex digit >> value. I'm assuming that in order to support this feature a "naa." is >> prefixed and sent to qemu. Does that necessarily mean some other >> hypervisor would choose to place the "naa." prefix or would then code >> need to be added to strip it off for those other hypervisors? Looking >> at the qemu code - it seems fairly specific. > > Fairly certain that it's more a vhost thing of which the hypervisor > utilizes. Per a recommendation made on this list years ago, I used > targetcli to do the configuration rather than direct manipulation of > sysfs, and according to its man page: > > All WWNs are prefaced by their type, such as "iqn", "naa", or "ib", > and may be given with or without colons. > I don't use targetcli so I'm not familiar with it. Still not convinced using naa.# in the XML is the best way to go, but I'm not against it. I guess a lot depends on how the sysfs views things. > As to whether other hypervisors follow the same behavior, I do not > know. I was just trying to conform to what it would allow in the case > of vhost-scsi. Now, I suppose I'm confusing matters by saying "wwpn" > here since as you say that's 16 hex digits, and should probably say just > "wwn" ? > I guess I've always thought of a wwn/wwpn/wwnn as that string of bytes... Usually WWN/WWNN have been interchangeable, while WWPN is a WWN assigned to a port in a Fibre Channel fabric FWIW: virsh nodedev-dumpxml on a vport capable HBA and the created vHBA will list a wwnn, wwpn, and fabric_wwn. The fabric_wwn's will match. The wwnn/wwpn for the vport is what was used when running the HBA's/vport_create function. It's a tangled web. Wait until vGPU's show up (that's another black magic trick). [...] >> It would be nice to tell a user how to find that wwpn value via some >> command (virsh or otherwise) in order to fill in the wwpn attribute. >> Although there's always varying opinions on this since many times it's >> distro dependent. That's why I suggest virsh - at least if it's >> supportable we can display the wwpn there using nodedev-dumpxml. > > Hrm... nodedev-dumpxml is new to me, neat! I'll go down that bunny > trail, but it seems like it'd be a bit of work to add the smarts for this. > Hours of all sorts of fun.... The 'nodedev-{list|dumpxml}' are essentially the libvirt mechanism to list/describe the various rabbit holes that is/are the sysfs tree and especially branches. The nodedev-list --tree will probably be of most interest, but so is virsh nodedev-list {scsi_host|pci|usb|scsi|scsi_generic|scs_target|net} >> >> What/where is the 'vhost_scsi_wwpn'? IOW: It's not something defined >> yet - I'm assuming there's some output somewhere where that's pulled >> from. > > As near as I can tell, it's not anything that can be autogenerated off > the bat. Somebody somewhere needs to first establish the sysfs entries > for the vhost target, which can then be stored and loaded on a next > (host) boot. The end result looks something like this: > AFAIK systemd handles the sysfs creation (ok at least for my purposes). This is what I was concerned about - we really need a way to tell someone how to "easily" find the wwpn that's to be used. Although I don't have vhost-scsi on my laptop ;-) - I have to figure that by using 'ls -al' on the path from a 'virsh nodedev-dumpxml' of a scsi_hostN that is an HBA will show a <path> that path would then be a starting point... Googling "scsi_host HBA wwn" shows me some interesting results including ones that seem to indicate 'fc_host' is involved in some way even for the HBA (I guess that makes sense - where the vHBA is just an abstraction off of that). That said, a '/sys/class/fc_host/' is the starting point then to find things and virsh nodedev-list --cap vports would be the way to get capable HBA's. > [root@xxxxxxxx ~]# ls -l > /sys/kernel/config/target/vhost/naa.50014056e4794195/tpgt_1/lun/lun_0/ > total 0 Is the 'vhost' or 'vhost/naa.%s' linked from some /sys/class/{scsi|fc}_host/host%d directory? I think that becomes the key for traversal. Or maybe finding "5001...4195" in some other link from /sys/class/{scsi|fc}_host/.... > lrwxrwxrwx 1 root root 0 Sep 13 15:44 752e150d11 -> > ../../../../../../target/core/iblock_0/disk0 > -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_gp > -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_offline > -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_status > -rw-r--r-- 1 root root 4096 Sep 13 15:45 alua_tg_pt_write_md > drwxr-xr-x 5 root root 0 Sep 13 15:44 statistics > [root@xxxxxxxx ~]# cat /sys/kernel/config/target/core/iblock_0/disk0/info > Status: ACTIVATED Max Queue Depth: 0 SectorSize: 512 HwMaxSectors: 4592 > iBlock device: dm-0 UDEV PATH: /dev/mapper/mpathb readonly: 0 > Major: 252 Minor: 0 CLAIMED: IBLOCK > > (Aside: note the folder name in /sys/kernel/config that contains the > wwn/wwpn name with the "naa." prefix.) > > I see... Then again "/sys/kernel" isn't the path used by nodedev driver, hence my question above. >> >> One final question - what is this exposed as on the guest? "Some" >> scsi_hostM? For a SCSI LUN it's some 'sgN' value. I see the qemu side >> seems to infer some amount of scsi_generic code (read_naa_id), so it's >> mostly curiosity/learning for me. > > Anything that's put behind the target gets seen to the guest. So that > can be one or more sgN LUNs and their sdX equivalents, and as near as I > can tell a scsi_hostM as you describe. A quick test with two LUNs > behind one vhost-scsi target gives me the following in the guest: > > # ls -l /sys/bus/scsi/devices > total 0 > lrwxrwxrwx 1 root root 0 Sep 13 16:10 0:0:1:0 -> > ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1/0:0:1:0 > lrwxrwxrwx 1 root root 0 Sep 13 16:10 0:0:1:1 -> > ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1/0:0:1:1 > lrwxrwxrwx 1 root root 0 Sep 13 16:10 host0 -> > ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0 > lrwxrwxrwx 1 root root 0 Sep 13 16:11 target0:0:1 -> > ../../../devices/css0/0.0.0002/0.0.beef/virtio2/host0/target0:0:1 > # lsscsi > [0:0:1:0] disk LIO-ORG disk0 4.0 /dev/sda > [0:0:1:1] disk LIO-ORG disk1 4.0 /dev/sdb > > FYI, hotplugging/unplugging devices to a target works just fine, except > there's no notification that a guest can pick up on to know that a > device had been added. Or worse, taken away. > Hmm. Maybe I'm just not asking the right question. I guess my perception was that if you expose the 'scsi_hostN' HBA then you are giving the guest the whole HBA to use as it sees fit. The host shouldn't be touching the luns afterwards. What you've shown is 2 LUNS in the /sys/bus/scsi/ tree; however, is there a /sys/bus/scsi_host/hostN that corresponds to the passed through HBA? [...] >>> + <attribute name="protocol"> >>> + <choice> >>> + <value>vhost</value> <!-- vhost, required --> >>> + </choice> >>> + </attribute> >>> + <attribute name="wwpn"> >>> + <data type="string"> >>> + <param name="pattern">(naa\.)[0-9a-fA-F]{16}</param> >>> + </data> >> If you go with the we'll add "naa." later, then you could just: >> >> <ref name='wwn'/> > > This rings a bell with me, as to why I had the label be "wwpn" instead > of "wwn" above. My attempt at trying to distinguish libvirt internals > with something user-facing, maybe? > I think wwpn will still be technically correct (for some strange reason after skimming google results). The HBA will have a port which is how it's reached. I suppose the answer though is in how sysfs views things. [...] >>> "pci", >>> - "scsi") >>> + "scsi", >>> + "scsi_host") >>> VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend, >>> VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST, >>> @@ -660,6 +661,10 @@ VIR_ENUM_IMPL(virDomainHostdevSubsysSCSIProtocol, >>> "adapter", >>> "iscsi") >>> +VIR_ENUM_IMPL(virDomainHostdevSubsysHostProtocol, >>> + VIR_DOMAIN_HOSTDEV_HOST_PROTOCOL_TYPE_LAST, >>> + "vhost") >>> + >> Notice how the virDomainHostdevSubsysSCSIProtocol adds 'adapter' first - >> this should do something similar (although the preference is to use >> "none" - I cannot recall why that was different). > > Because "none" == "adapter" in the case of SCSI? > Some sort of RNG magic in commit id '54ac483e6' - I forget the details (too long ago), but I think it had to do with being able to reuse the existing RNG without creating a whole new option. [...] >> hmmm... If you're going to change this here, then >> qemuDomainAttachHostDevice would be changed at the same time. Hence why >> this ordering stuff to patches is important. >> >> I cannot recall if removing would cause a build error in this case. > > It does, because of the switch statement not having a "default" case. > Perhaps a different ordering of patches removes the need.... But I know it's tough with this new type... I guess it's odd the *RemoveHostDevice needed it, but the corollary Attach didn't... Ahhh - I see why - the switch in Attach doesn't have the typecast (virDomainHostdevSubsysType). I can send a patch for that or you can make the modification too... [...] > Thanks for the review. (Silent ACK on a lot of the above comments.) > I'll try to get these all in place for a v3 with a little runway before > the next freeze. > OK - hopefully I won't be neck deep in some other firefight and will be able to give them quicker attention. At the very least getting the capabilities changes in which is generally the most conflict... John -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list