On 08/25/2017 02:30 AM, Martin Kletzander wrote: > On Wed, Aug 23, 2017 at 04:47:03PM -0400, John Ferlan wrote: >> >> >> On 08/23/2017 07:47 AM, Martin Kletzander wrote: >>> We always truncated the name at 20 bytes instead of characters. In >>> case 20 bytes were in the middle of a multi-byte character, then the >>> string became invalid and various parts of the code would error >>> out (e.g. XML parsing of that string). Let's instead properly >>> truncate it after 20 characters instead. >>> >>> In some cases the truncation didn't even work (as can be seen from the >>> modified tests), which is fixed by this as well. >> >> Didn't work as well? Try at all... >> >> As a test I changed the name of pcie-expander-bus to 80 characters and >> all 80 were printed. I think this goes against your original intention >> of a shortname... >> >>> >>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1448766 >>> >>> Signed-off-by: Martin Kletzander <mkletzan@xxxxxxxxxx> >>> --- >>> src/conf/domain_conf.c | 30 >>> +++++++++++++++++++--- >>> .../qemuxml2argv-aarch64-virt-default-nic.args | 2 +- >>> .../qemuxml2argv-hugepages-pages2.args | 4 +-- >>> .../qemuxml2argv-hugepages-pages3.args | 5 ++-- >>> .../qemuxml2argv-hugepages-pages5.args | 4 +-- >>> .../qemuxml2argv-hugepages-pages6.args | 2 +- >>> .../qemuxml2argv-pcie-expander-bus.args | 2 +- >>> 7 files changed, 37 insertions(+), 12 deletions(-) >>> >> >> Can we add a multibyte name test? These just modify existing tests (I >> think wrongly too). >> > > Good point, I don't know why I didn't add one. I created one especially > for testing this, but didn't add it to the repo it seems. However I'm > not sure we will be able to force the characters to be multibyte, we > would have to modify the locale of the test, but also skip the test if > the NLS for the filesystem is something like ISO 8859-1 or so. I can > add one anyway, at least on some systems it might check for something. > >> >>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c >>> index 47eba4dbb315..9a46ceece2d6 100644 >>> --- a/src/conf/domain_conf.c >>> +++ b/src/conf/domain_conf.c >>> @@ -27131,6 +27131,8 @@ virDomainDefHasMemballoon(const virDomainDef >>> *def) >>> } >>> >>> >>> +#define VIR_DOMAIN_SHORT_NAME_MAX 20 >>> + >>> /** >>> * virDomainObjGetShortName: >>> * @vm: Machine for which to get a name >>> @@ -27141,15 +27143,37 @@ virDomainDefHasMemballoon(const >>> virDomainDef *def) >>> char * >>> virDomainObjGetShortName(const virDomainDef *def) >>> { >>> - const int dommaxlen = 20; >>> + wchar_t wshortname[VIR_DOMAIN_SHORT_NAME_MAX + 1] = {0}; >>> + size_t len = 0; >>> + char *shortname = NULL; >>> char *ret = NULL; >>> >> >> It would seem that we could figure there were multibyte chars in the >> name "if strlen(def->name) != mbstowcs(NULL, def->name, 0)" and not do >> any conversion, leaving the old code "as is" (mostly) and then >> implementing something for these wide characters. >> > > Good point, we can do that. > >> Of course I'm not getting anything other than -1 returned for the >> mbstowcs and I didn't really want to dig any further, so I leave it up >> to you! >> > > What the errno in that case? > "Invalid or incomplete multibyte or wide character" Same message even after adding a virGettextInitialize call in the function as a test... It was one of those annoying end of the day things too - as in this *should* work, why is it not working. >> I tried to modify the same test and change the name to "ÄèÎØÛ", but the >> mbstowcs kept returning -1, until I added: >> >> if (setlocale(LC_ALL, "en_US.UTF-8") == NULL) { >> > > What is the output of `locale` on your machine? > en_US.UTF-8 >> But how does one know which locale to use? From my quick read of the >> mbstowcs man page it seems a locale needs to be set first. I also tried >> the man page example of "Grüße!" and that worked with UTF-8. >> > > It should use the locale of the system. Maybe if you have e.g. 8859-1 > set, there will be no multibyte characters, so it does not do anything. > If we can gather that info from the errno, then we can safely skip the > conversion as well. > >>> - ignore_value(virAsprintf(&ret, "%d-%.*s", >>> - def->id, dommaxlen, def->name)); >>> + if (mbstowcs(wshortname, def->name, VIR_DOMAIN_SHORT_NAME_MAX) >>> == (size_t) -1) { >>> + virReportError(VIR_ERR_INVALID_ARG, "%s", >>> + _("Cannot convert domain name to wide >>> character string")); >>> + return NULL; >>> + } >>> + >>> + len = wcstombs(NULL, wshortname, 0); >>> + if (len == (size_t) -1) >>> + return NULL; >>> >>> + if (VIR_ALLOC_N(shortname, len + 1) < 0) >>> + return NULL; >>> + >>> + if (wcstombs(shortname, wshortname, len) == (size_t) -1) { >>> + virReportError(VIR_ERR_INVALID_ARG, "%s", >>> + _("Cannot convert domain name from wide >>> character string")); >>> + goto cleanup; >>> + } >>> + >>> + ignore_value(virAsprintf(&ret, "%d-%s", def->id, def->name)); >> >> You go through all this trouble and still write def->name! I didn't >> realize that at first either - kept wondering why the whole damn name >> was being printed! >> > > What the fsck have I sent?!?!? > Yeah well I only saw it after spending more than a few minutes messing around with the code. > I see what I did wrong, interesting. So I ran the tests with > VIR_TEST_DEBUG=1 and I've seen some truncation not happening and I (by > mistake) though that new truncation *is* happening now, so I re-ran with > VIR_TEST_REGENERATE_OUTPUT=1. And it fixed the issue because there was > no more truncation happening. Ahhh... If I run the test directly: VIR_TEST_DEBUG=1 ./run tests/qemuxml2argvtest then things work as (mostly) expected as long as locale is set via a virGettextInitialize in virDomainObjGetShortName; however, if I run via: VIR_TEST_DEBUG=1 make -C tests check TESTS=qemuxml2argvtest then I get the mbtowcs failure util I explicitly set the locale to en_US.UTF-8 in virDomainObjGetShortName. It's very odd/strange... Since I only was messing with pcie-expander-bus-test I can also use VIR_TEST_RANGE=565 in order to see just that output... > > Anyway, thanks for seeing that, that's what the reviews are for ;) v2 on > the way. I see v2 hit the list - I'll look at that... John -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list