On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote: > On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote: > >On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote: > >>On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote: > >>>On Wed, Jul 11, 2012 at 09:15:00PM +0000, Nicholas A. Bellinger wrote: > > > ><SNIP> > > > >>> > >>>It still seems not 100% clear whether this driver will have major > >>>userspace using it. And if not, it would be very hard to support a driver > >>>when recent userspace does not use it in the end. > >> > >>I don't think this is a good reason to exclude something from the kernel. > >>However, there are good reasons why this doesn't make sense for something like > >>QEMU--specifically because we have a large number of features in our block layer > >>that tcm_vhost would bypass. > >> > > > >I can definitely appreciate your concern here as the QEMU maintainer. > > > >>But perhaps it makes sense for something like native kvm tool. And if it did go > >>into the kernel, we would certainly support it in QEMU. > >> > > > >... > > > >>But I do think the kernel should carefully consider whether it wants to support > >>an interface like this. This an extremely complicated ABI with a lot of subtle > >>details around state and compatibility. > >> > >>Are you absolutely confident that you can support a userspace application that > >>expects to get exactly the same response from all possible commands in 20 kernel > >>versions from now? Virtualization requires absolutely precise compatibility in > >>terms of bugs and features. This is probably not something the TCM stack has > >>had to consider yet. > >> > > > >We most certainly have thought about long term userspace compatibility > >with TCM. Our userspace code (that's now available in all major > >distros) is completely forward-compatible with new fabric modules such > >as tcm_vhost. No update required. > > I'm not sure we're talking about the same thing when we say compatibility. > > I'm not talking about the API. I'm talking about the behavior of > the commands that tcm_vhost supports. > > If you add support for a new command, you need to provide userspace > a way to disable this command. If you change what gets reported for > VPD, you need to provide userspace a way to make VPD look like what > it did in a previous version. > > Basically, you need to be able to make a TCM device behave 100% the > same as it did in an older version of the kernel. > > This is unique to virtualization due to live migration. If you > migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure > that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel > because the guest that is interacting with it does not realize that > live migration happened. > > Yes, you can add knobs via configfs to control this behavior, but I > think the question is, what's the plan for this? > > BTW, I think this is a good thing to cover in > Documentation/vhost/tcm_vhost.txt. I think that's probably the only > change that's needed here. > > Regards, > > Anthony Liguori I agree it's needed but it's not a requirement for merging IMHO. As a first step we can disable live migration. > > > >Also, by virtue of the fact that we are using configfs + rtslib (python > >object library) on top, it's very easy to keep any type of compatibility > >logic around in python code. With rtslib, we are able to hide configfs > >ABI changes from higher level apps. > > > >So far we've had a track record of 100% userspace ABI compatibility in > >mainline since .38, and I don't intend to merge a patch that breaks this > >any time soon. But if that ever happens, apps using rtslib are not > >going to be effected. > > > >>>I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. > >>>Then we don't commit to an ABI. > >> > >>I think this is a good idea. Even if it goes in, a really clear policy would be > >>needed wrt the userspace ABI. > >> > >>While tcm_vhost is probably more useful than vhost_blk, it's a much more complex > >>ABI to maintain. > >> > > > >As far as I am concerned, the kernel API (eg: configfs directory layout) > >as it is now in sys/kernel/config/target/vhost/ is not going to change. > >It's based on the same drivers/target/target_core_fabric_configfs.c > >generic layout that we've had since .38. > > > >The basic functional fabric layout in configfs is identical (with fabric > >dependent WWPN naming of course) regardless of fabric driver, and by > >virtue of being generic it means we can add things like fabric dependent > >attributes + parameters in the future for existing fabrics without > >breaking userspace. > > > >So while I agree the ABI is more complex than vhost-blk, the logic in > >target_core_fabric_configfs.c is a basic ABI fabric definition that we > >are enforcing across all fabric modules in mainline for long term > >compatibility. > > > >--nab > > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html