On 02/21/2017 12:03 PM, Daniel P. Berrange wrote:
> On Tue, Feb 21, 2017 at 11:33:25AM -0500, John Ferlan wrote:
>> Repost: http://www.redhat.com/archives/libvir-list/2017-February/msg00501.html
>>
>> to update to top of branch as of commit id '5ad03b9db2'
>
> BTW, could you include the full cover letter in each new version rather
> than making people follow links all the way back to v1 to find info
> about the patch series goals.

OK - I'll try to remember.

>
> IIUC, the intention here is that we automatically create NPIV devices
> when starting guests and delete them when stopping guests. I can see
> some appeal in this, but at the same time I'm not convinced we should
> add such a feature.

A bit more than that - create the vHBA and assign the LUNs to the guest
as they are discovered and remove them as they are removed (events from
udev). This is a mechanism/idea from Paolo. The RHV team would be the
primary consumer and IIRC they don't use storage pools.

>
> AFAICT, the node device APIs already allow a management application to
> achieve the same end goal without needing this integration. Yes, it
> would simplify usage of NPIV on the surface, but the cost of doing this
> is that it embeds a specific usage policy for NPIV in the libvirt code and
> makes error handling harder. In particular it is possible to get into a
> situation where a VM fails to start and we're also unable to clear up
> the NPIV device we just auto-created. Now this could be said to apply
> to pretty much everything we do during guest startup, but in most cases
> the failure is harmless or gets auto-cleaned up by the kernel (ie the
> tap devices get auto-deleted when the FD is closed, or SELinux labels
> get reset next time a VM wants that file, locks are released when we
> close the virtlockd file handle, etc). NPIV is significantly more
> complicated and more likely to hit failure scenarios due to the fact that
> it involves interactions with off-node hardware resources.

I agree with your points.

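For reference, the node device route mentioned above might look roughly
like this from the management application's side. This is only a sketch
assuming the libvirt-python bindings; build_vhba_xml, VHBATracker and the
"scsi_host3" parent used below are made-up names for illustration, and
remembering vHBAs that could not be destroyed is just one way an
application could deal with a failed cleanup.

```python
import xml.etree.ElementTree as ET


def build_vhba_xml(parent, wwnn=None, wwpn=None):
    """Build node device XML asking libvirt to create a vHBA on `parent`.

    If no WWNN/WWPN pair is given, the fc_host capability is left empty
    so that libvirt generates the pair itself.
    """
    device = ET.Element("device")
    ET.SubElement(device, "parent").text = parent
    host = ET.SubElement(device, "capability", type="scsi_host")
    fc = ET.SubElement(host, "capability", type="fc_host")
    if wwnn and wwpn:
        ET.SubElement(fc, "wwnn").text = wwnn
        ET.SubElement(fc, "wwpn").text = wwpn
    return ET.tostring(device, encoding="unicode")


class VHBATracker:
    """Hypothetical app-side bookkeeping for vHBA lifecycle.

    `conn` is expected to behave like a libvirt.virConnect object.
    """

    def __init__(self, conn):
        self.conn = conn
        self.pending_cleanup = []  # names of vHBAs we failed to destroy

    def create(self, parent):
        # virConnect.nodeDeviceCreateXML() returns a virNodeDevice
        dev = self.conn.nodeDeviceCreateXML(build_vhba_xml(parent), 0)
        return dev.name()

    def destroy(self, name):
        try:
            self.conn.nodeDeviceLookupByName(name).destroy()
            return True
        except Exception:
            # With libvirt-python this would be libvirt.libvirtError;
            # caught broadly here so the sketch has no hard dependency.
            self.pending_cleanup.append(name)
            return False

    def retry_pending(self):
        # Try again for every vHBA that could not be removed earlier.
        names, self.pending_cleanup = self.pending_cleanup, []
        for name in names:
            self.destroy(name)
```

With the real bindings, conn would come from something like
libvirt.open("qemu:///system"), and the application would call create()
before starting the guest, destroy() after it stops, and retry_pending()
at some later point.
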
The "purpose" of libvirt taking care of it would be to let libvirt
handle all those nasty and odd failure or integration issues - including
migration. Of course from a libvirt perspective, I'd rather take the
'scsi_hostX' vHBA and just pass that through to QEMU directly to allow
it (or the guest) to find the LUNs, but that just pushes the problem the
other way.

I said early on that this is something that could be done by the upper
layers, which would be able to receive the add/remove LUN events whether
they created a storage pool just for that purpose or created the vHBA
themselves. It's probably even in the bz's on this.

>
> Is there some aspect of NPIV mgmt that can only be achieved if libvirt
> is explicitly managing the device lifecycle during VM start/stop, as
> opposed to having the mgmt app manage it ?
>

None, beyond the upper layers not needing to handle anything other than
creating the vHBA for the domain and letting libvirt handle the rest.

> If OpenStack were to provide NPIV support I think it'd probably end
> up dealing with device setup explicitly via the node device APIs
> rather than relying on libvirt to create/delete them. That way it
> can track the lifecycle of NPIV devices explicitly, and if it is not
> possible to delete them at time of QEMU shutdown for some reason, it
> can easily arrange to delete them later.
>
>
> Overall I think one of the more successful aspects of libvirt's design
> has been the way we minimise the addition of usage policy decisions, in
> favour of providing mechanisms that applications can use to implement
> policies. This has had a cost in that applications need to do more work
> themselves, but on balance I still think it is a win to avoid adding
> policy driven features to libvirt.
>
> A key question is just where "autocreation/delete of NPIV devices" falls
> in the line between mechanism & policy, since the line is not entirely
> black & white.
> I tend towards it being policy though, since it is just
> providing a less general purpose way to do something that can be achieved
> already via the node device APIs.
>
> Regards,
> Daniel
>

I understand - to a degree I guess I had assumed some of these types of
discussions had taken place among those who wanted the feature added.

One other good thing that's come out of these changes is a bit more
testing for vHBA creation via nodedev/storage pool and quite a bit of
code cleanup once/if most of the patches I posted earlier in the week
are accepted.

John

(FWIW: I'll have limited access to email over the next couple of days...)

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list