On Fri, Aug 25, 2017 at 08:30:52AM +0200, Martin Kletzander wrote:
On Wed, Aug 23, 2017 at 04:47:03PM -0400, John Ferlan wrote:On 08/23/2017 07:47 AM, Martin Kletzander wrote:We always truncated the name at 20 bytes instead of characters. In case 20 bytes were in the middle of a multi-byte character, then the string became invalid and various parts of the code would error out (e.g. XML parsing of that string). Let's instead properly truncate it after 20 characters instead. In some cases the truncation didn't even work (as can be seen from the modified tests), which is fixed by this as well.Didn't work as well? Try at all... As a test I changed the name of pcie-expander-bus to 80 characters and all 80 were printed. I think this goes against your original intention of a shortname...Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1448766 Signed-off-by: Martin Kletzander <mkletzan@xxxxxxxxxx> --- src/conf/domain_conf.c | 30 +++++++++++++++++++--- .../qemuxml2argv-aarch64-virt-default-nic.args | 2 +- .../qemuxml2argv-hugepages-pages2.args | 4 +-- .../qemuxml2argv-hugepages-pages3.args | 5 ++-- .../qemuxml2argv-hugepages-pages5.args | 4 +-- .../qemuxml2argv-hugepages-pages6.args | 2 +- .../qemuxml2argv-pcie-expander-bus.args | 2 +- 7 files changed, 37 insertions(+), 12 deletions(-)Can we add a multibyte name test? These just modify existing tests (I think wrongly too).Good point, I don't know why I didn't add one. I created one especially for testing this, but didn't add it to the repo it seems. However I'm not sure we will be able to force the characters to be multibyte, we would have to modify the locale of the test, but also skip the test if the NLS for the filesystem is something like ISO 8859-1 or so. I can add one anyway, at least on some systems it might check for something.diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 47eba4dbb315..9a46ceece2d6 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -27131,6 +27131,8 @@ virDomainDefHasMemballoon(const virDomainDef *def) } +#define VIR_DOMAIN_SHORT_NAME_MAX 20 + /** * virDomainObjGetShortName: * @vm: Machine for which to get a name @@ -27141,15 +27143,37 @@ virDomainDefHasMemballoon(const virDomainDef *def) char * virDomainObjGetShortName(const virDomainDef *def) { - const int dommaxlen = 20; + wchar_t wshortname[VIR_DOMAIN_SHORT_NAME_MAX + 1] = {0}; + size_t len = 0; + char *shortname = NULL; char *ret = NULL;It would seem that we could figure there were multibyte chars in the name "if strlen(def->name) != mbstowcs(NULL, def->name, 0)" and not do any conversion, leaving the old code "as is" (mostly) and then implementing something for these wide characters.Good point, we can do that.Of course I'm not getting anything other than -1 returned for the mbstowcs and I didn't really want to dig any further, so I leave it up to you!What the errno in that case?I tried to modify the same test and change the name to "ÄèÎØÛ", but the mbstowcs kept returning -1, until I added: if (setlocale(LC_ALL, "en_US.UTF-8") == NULL) {What is the output of `locale` on your machine?But how does one know which locale to use? From my quick read of the mbstowcs man page it seems a locale needs to be set first. I also tried the man page example of "Grüße!" and that worked with UTF-8.It should use the locale of the system. Maybe if you have e.g. 8859-1 set, there will be no multibyte characters, so it does not do anything. If we can gather that info from the errno, then we can safely skip the conversion as well.
The whole problem is that we don't call virGettextInitialize in tests. I think the (wrong) reasoning behind that was that tests should not depend on the locale from your machine, however setlocale(LC_ALL, "") should be done always. I'll send a v2 with hopefully everything fixed.
- ignore_value(virAsprintf(&ret, "%d-%.*s", - def->id, dommaxlen, def->name)); + if (mbstowcs(wshortname, def->name, VIR_DOMAIN_SHORT_NAME_MAX) == (size_t) -1) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Cannot convert domain name to wide character string")); + return NULL; + } + + len = wcstombs(NULL, wshortname, 0); + if (len == (size_t) -1) + return NULL; + if (VIR_ALLOC_N(shortname, len + 1) < 0) + return NULL; + + if (wcstombs(shortname, wshortname, len) == (size_t) -1) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Cannot convert domain name from wide character string")); + goto cleanup; + } + + ignore_value(virAsprintf(&ret, "%d-%s", def->id, def->name));You go through all this trouble and still write def->name! I didn't realize that at first either - kept wondering why the whole damn name was being printed!What the fsck have I sent?!?!? I see what I did wrong, interesting. So I ran the tests with VIR_TEST_DEBUG=1 and I've seen some truncation not happening and I (by mistake) though that new truncation *is* happening now, so I re-ran with VIR_TEST_REGENERATE_OUTPUT=1. And it fixed the issue because there was no more truncation happening. Anyway, thanks for seeing that, that's what the reviews are for ;) v2 on the way.
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list