On Mon, Oct 04, 2010 at 08:38:57AM +0200, Daniel Veillard wrote:
> On Sun, Oct 03, 2010 at 11:51:12PM +1100, Justin Clift wrote:
> > On 10/03/2010 08:33 PM, Richard W.M. Jones wrote:
> > <snip>
> > >Indeed.  I'm sure we need a whitelist, not a blacklist as suggested by
> > >the other comment.  All domains I'd ever want to create would match
> > >the regexp
> > >
> > >^[[:alpha:]][-_[:alnum:]]*$
> > >
> > >This might break existing users however.
> >
> > Wonder if there are characters supported by some hypervisors, but not
> > others?
>
>   I remember we had troubles with Xen, a long time ago, yes
> So unfortunately this is really hypervisor specific... Maybe we could
> have a generic checking routine but only providing a warning when
> the name isn't a simple name the XML way. One of the problems of the
> checking too is that most of the hypervisor APIs don't say a word about
> encoding, so you're not manipulating characters but 0 terminated byte
> strings. From there even your simple regexp goes havoc because what is
> an alphanumeric character, requires character analysis and you need the
> encoding for this. At least at libvirt API things are rather clear,
> in XML data there is no ambiguity possible, and outside we expect
> strings to be UTF-8.
>   Actually I think that for ESX since all exchanges with the hypervisor
> are XML based there isn't that ambiguity about encoding at least.
>
> > ie maybe Xen supports '/', '*', '+' in guest names, but ESX doesn't
> >
> > That could lead to some interesting guest import problems. :(
>
>   goes beyond that, someone using any non-ascii name will hit hypervisor
> specific behaviour, ISO-Latin, Asian language ... and we have no control
> over this except for some checking and the possibility of a warning.

I think any reasonable analysis of this should start with where the
names come from:

 - virDomainDefineXML (eg. virsh define, virt-install, V2V import etc)

 - a list of existing domains from a hypervisor API (eg. /etc/xen
   files, Xen hypercall, ESX XMLRPC call)

 - already defined in an older version of libvirt which didn't do
   checking

 - [any others?]

For the virDomainDefineXML route, we (a) know the names are UTF-8, and
(b) know that these domains are being created for the first time.  And
I think for this route we should add a regexp-like restriction.

(Note that when I wrote [:alnum:] before, that ought to cover all
Unicode characters in the alphanumeric classes, so it doesn't exclude
non-US characters).

There are further points that may need to be fixed within the drivers.
The drivers are probably just passing the UTF-8 strings through to
everything, but may need to do conversion.  For example, if I've
learned anything about Microsoft developers, then a hypothetical
Hyper-V driver would almost certainly need to convert between UTF-8
and UTF-16LE.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into Xen guests.
http://et.redhat.com/~rjones/virt-p2v
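
A rough sketch of the whitelist check discussed in this thread for the
virDomainDefineXML route, written in Python only for illustration.  It
assumes the name arrives as UTF-8 bytes, and it approximates
[:alpha:] and [:alnum:] with Python's Unicode-aware str.isalpha() and
str.isalnum(), which may not match a POSIX regexp engine exactly.  The
function names is_simple_domain_name and to_utf16le are made up for
this example and are not part of libvirt.

```python
def is_simple_domain_name(raw: bytes) -> bool:
    """Approximate ^[[:alpha:]][-_[:alnum:]]*$ on a UTF-8 byte string.

    Hypothetical helper, not a libvirt API.
    """
    try:
        # The check only makes sense on characters, so decode first.
        name = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 (eg. raw ISO-Latin bytes): reject.
        return False
    # First character must be alphabetic; this also rejects "".
    if not name or not name[0].isalpha():
        return False
    # Remaining characters: alphanumeric, '-' or '_'.
    return all(ch.isalnum() or ch in "-_" for ch in name[1:])


def to_utf16le(raw: bytes) -> bytes:
    """Convert a UTF-8 name to UTF-16LE, as a hypothetical driver might."""
    return raw.decode("utf-8").encode("utf-16-le")
```

With this sketch, b"fedora-13_x86" and a UTF-8 encoded non-ASCII name
such as "détente" pass, while b"1guest", b"a/b" and ISO-Latin-1
encoded bytes are rejected.  Whether to reject outright or only warn
is the open question above.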