On 01/26/2018 04:07 AM, John Ferlan wrote:
>
>
> On 01/18/2018 11:04 AM, Michal Privoznik wrote:
>> This is a definition that holds information on SCSI persistent
>> reservation settings. The XML part looks like this:
>>
>>   <reservations enabled='yes' managed='no'>
>>     <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/>
>>   </reservations>
>>
>> If @managed is set to 'yes' then the <source/> is not parsed.
>> This design was agreed on here:
>>
>> https://www.redhat.com/archives/libvir-list/2017-November/msg01005.html
>>
>> Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx>
>> ---
>>  docs/formatdomain.html.in | 25 +++-
>>  docs/schemas/domaincommon.rng | 19 +--
>>  docs/schemas/storagecommon.rng | 34 +++++
>>  src/conf/domain_conf.c | 36 +++++
>>  src/libvirt_private.syms | 3 +
>>  src/util/virstoragefile.c | 148 +++++++++++++++++++++
>>  src/util/virstoragefile.h | 15 +++
>>  .../disk-virtio-scsi-reservations-not-managed.xml | 40 ++++++
>>  .../disk-virtio-scsi-reservations.xml | 38 ++++++
>>  .../disk-virtio-scsi-reservations-not-managed.xml | 1 +
>>  .../disk-virtio-scsi-reservations.xml | 1 +
>>  tests/qemuxml2xmltest.c | 4 +
>>  12 files changed, 348 insertions(+), 16 deletions(-)
>>  create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations-not-managed.xml
>>  create mode 100644 tests/qemuxml2argvdata/disk-virtio-scsi-reservations.xml
>>  create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations-not-managed.xml
>>  create mode 120000 tests/qemuxml2xmloutdata/disk-virtio-scsi-reservations.xml
>>
>
> Before digging too deep into this...
>
>  - I assume we're avoiding <disk> iSCSI mainly because those
> reservations would take place elsewhere, safe assumption?

I believe so, but I'll let Paolo answer that.
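
For reference, the parsing rule from the commit message quoted above (when
@managed is 'yes' the <source/> is not parsed) can be sketched roughly as
follows. This is only a minimal Python illustration, not the C parser the
patch adds; the function name parse_reservations and the returned dict
layout are made up for this example.

```python
import xml.etree.ElementTree as ET

# The example element from the commit message.
EXAMPLE = """
<reservations enabled='yes' managed='no'>
  <source type='unix' path='/path/to/qemu-pr-helper.sock' mode='client'/>
</reservations>
"""


def parse_reservations(xml_text):
    """Parse a <reservations/> element into a plain dict (illustrative only)."""
    node = ET.fromstring(xml_text)
    if node.tag != "reservations":
        raise ValueError("expected <reservations/>, got <%s/>" % node.tag)

    result = {
        "enabled": node.get("enabled") == "yes",
        "managed": node.get("managed") == "yes",
        "source": None,
    }

    # Managed helper: any <source/> child is deliberately not parsed.
    if result["managed"]:
        return result

    # Unmanaged helper: record where the externally spawned helper listens.
    source = node.find("source")
    if source is not None:
        result["source"] = {
            "type": source.get("type"),
            "path": source.get("path"),
            "mode": source.get("mode"),
        }
    return result
```

Calling parse_reservations(EXAMPLE) yields the socket path from the
<source/> element; the same XML with managed='yes' leaves "source" unset.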
The way I understand reservations is that qemu needs to issue
'privileged' SCSI commands. Thus for regular SCSI (which, for the
purpose of this argument, includes iSCSI emulated by the kernel) qemu
needs either CAP_SYS_RAWIO or a helper process to which it passes the FD
and which issues the 'privileged' SCSI commands on qemu's behalf.

>
>  - What about using lun's from a storage pool (and what could become
> your favorite, NPIV devices ;-))
>
>   <disk type='volume' device='lun'>
>     <driver name='qemu' type='raw'/>
>     <source pool='sourcepool' volume='unit:0:4:0'/>
>     <target dev='sda' bus='scsi'/>
>   </disk>

These should work with my patches too (not tested though - I don't have
any real SCSI machine).

>
>  - What about <hostdev>'s?
>
>   <hostdev mode='subsystem' type='scsi'>
>
> but not iSCSI or vHost hostdev's. I think that creates the SCSI
> generic LUN, but it's been a while since I've thought about the
> terminology used for hostdev's...

I think these don't need the feature since qemu can access the device
directly.

>
> I also have this faint recollection of PR related to sgio filtering
> and perhaps even rawio, but dredging that back up could send me down the
> path of some "downstream only" type bz's. Although searching on just
> VIR_DOMAIN_DISK_DEVICE_LUN does bring up qemuSetUnprivSGIO.
>
> And finally... I assume there is one qemu-pr-manager (qemu.conf changes
> eventually)... Eventually there's magic that allows/adds per domain
> *and* per LUN some socket path. If libvirt provided it's generated via
> the domain temporary directory; however, what's not really clear is how
> that unmanaged path really works. Need a virtual whiteboard...

So, in the case of an unmanaged path, here are the assumptions that my
patches are built on:

1) the unmanaged helper process (UHP) is spawned by somebody other than
libvirtd (hence unmanaged) - it doesn't have to be the user, it can be
systemd for instance.

2) the path to the UHP's socket has to be labeled correctly - libvirt
doesn't touch that, because it knows nothing about the usage scenario:
whether the sysadmin intended one UHP for the whole host and configured
the label that way, or whether it is spawned by a mgmt app (or systemd,
or whoever) per domain, or even per disk. Therefore, we can do nothing
more than shrug our shoulders and require users to label the socket
correctly. Or use the managed helper.

3) in the future, when the UHP dies, libvirt will NOT spawn it again.
It's unmanaged after all. It's the user's/sysadmin's responsibility to
spawn it again.

Now, for the managed helper process (MHP) the assumptions are:

1) there's one MHP per domain (all SCSI disks in the domain share the
same MHP).

2) the MHP runs as root, but is placed into the same CGroup and mount
namespace as the qemu process it serves.

3) the MHP lives and dies with the domain it is associated with.

The code might be more complicated than needed - it is prepared to have
one MHP per disk rather than per domain (should we ever need it).
Therefore, instead of storing one pid_t, we store them in a hash table
where more can be stored.

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
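
The bookkeeping described in the last paragraph of the mail (a table of
helper PIDs instead of a single pid_t) could look roughly like the sketch
below. It is a Python illustration only, not the libvirt code; the class
name HelperTable, the DOMAIN_KEY constant and the idea of keying by a
disk alias are all assumptions made for the example.

```python
class HelperTable:
    """Map a key to the PID of the managed helper serving it (hypothetical)."""

    # Hypothetical key used while there is one helper per domain.
    DOMAIN_KEY = "domain"

    def __init__(self):
        self._pids = {}

    def add(self, pid, key=DOMAIN_KEY):
        """Remember the helper PID for a key, refusing duplicates."""
        if key in self._pids:
            raise ValueError("helper already registered for %r" % key)
        self._pids[key] = pid

    def lookup(self, key=DOMAIN_KEY):
        """Return the helper PID for a key, or None if there is none."""
        return self._pids.get(key)

    def remove_all(self):
        """Forget every helper and return the PIDs so the caller can stop them."""
        pids = list(self._pids.values())
        self._pids.clear()
        return pids
```

With one helper per domain only DOMAIN_KEY is ever used; switching to one
helper per disk would mean passing a per-disk key to add() and lookup()
without changing the storage.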