On 12.12.2018 17:28, John Ferlan wrote: > > > On 12/12/18 5:59 AM, Nikolay Shirokovskiy wrote: >> >> >> On 11.12.2018 19:41, John Ferlan wrote: >>> >>> >>> On 12/11/18 6:04 AM, Nikolay Shirokovskiy wrote: >>>> >>>> >>>> On 10.12.2018 19:58, John Ferlan wrote: >>>>> >>>>> >>>>> On 11/8/18 8:02 AM, Nikolay Shirokovskiy wrote: >>>>>> For qemu capable of setting l2-cache-size for qcow2 images >>>>>> to INT64_MAX and semantics of upper limit on l2 cache >>>>>> size. We can only check this by qemu version (3.1.0) now. >>>>>> >>>>>> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@xxxxxxxxxxxxx> >>>>>> --- >>>>>> src/qemu/qemu_capabilities.c | 5 +++++ >>>>>> src/qemu/qemu_capabilities.h | 1 + >>>>>> 2 files changed, 6 insertions(+) >>>>>> >>>>> >>>>> This patch needs to be updated to top of tree. >>>>> >>>>>> diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c >>>>>> index 2ca5af3..49a3b60 100644 >>>>>> --- a/src/qemu/qemu_capabilities.c >>>>>> +++ b/src/qemu/qemu_capabilities.c >>>>>> @@ -509,6 +509,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, >>>>>> "vfio-pci.display", >>>>>> "blockdev", >>>>>> "vfio-ap", >>>>>> + "qcow2.l2-cache-size", >>>>> >>>>> When you do update, you will need to fix the comma-less entry for >>>>> "egl-headless.rendernode" as well, unless someone else gets to it first. >>>>> >>>>>> ); >>>>>> >>>>>> >>>>>> @@ -4117,6 +4118,10 @@ virQEMUCapsInitQMPMonitor(virQEMUCapsPtr qemuCaps, >>>>>> virQEMUCapsSet(qemuCaps, QEMU_CAPS_MACHINE_PSERIES_MAX_CPU_COMPAT); >>>>>> } >>>>>> >>>>>> + /* l2-cache-size before 3001000 does not accept INT64_MAX */ >>>>>> + if (qemuCaps->version >= 3001000) >>>>>> + virQEMUCapsSet(qemuCaps, QEMU_CAPS_QCOW2_L2_CACHE_SIZE); >>>>>> + >>>>> >>>>> Not a fan of this. >>>>> >>>>> The 'l2-cache-size' was supported since QEMU 2.2. A "newer" algorithm >>>>> for cache adjustment has been supported since QEMU 2.12 (and there's >>>>> l2-cache-entry-size that "could be used" to know that). Then there's >>>>> some unreferenced commit indicating usage of INT64_MAX. Tracking that >>>>> down, I find qemu.git commit 6c1c8d5d from Max which enforces MAX_INT. >>>> >>>> This is a failure rather than enforcement. And AFAIU code that limit cache >>>> to appropriate maximum when INT64_MAX is given in read_cache_sizes: >>>> >>>> *l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting); >>>> >>>> only appeared after release of 3.0 in >>>> >>>> commit b749562d9822d14ef69c9eaa5f85903010b86c30 >>>> Author: Leonid Bloch <lbloch@xxxxxxxxxxxxx> >>>> Date: Wed Sep 26 19:04:43 2018 +0300 >>>> >>>> qcow2: Assign the L2 cache relatively to the image size >>>> >>>> >>>> ie setting cache size to INT64_MAX before 3.1 will fail. In >>>> other words in earlier versions there were no value to specify that >>>> cache size need to be set to maximum. You can calculate this value >>>> yourself knowing disk size and disk cluster size and set it but >>>> this is not convinient. >>>> >>> >>> So prior to this patch the max could be 32 MB? And now it's calculate-able? >> >> 32 MB is default value on Linux in 3.1 (the patch set it it to 1 MB but later >> patch 80668d0f increases to 32 MB). >> >> Before the patch one can set greater values but can not set INT64_MAX. >> The exact value that can be set depends on cluster size due to the check >> againts INT_MAX you already mentioned. So one can set up to INT_MAX * cluster_size, >> but cluster_size is unknown to us (?). Given minium cluster size is 512 bytes, we >> can set INT_MAX * 512 (1 TB) always. This will cover 64 TB disks for cluster >> size 512 bytes and 8192 PB for default cluster size. So we should refuse >> to start with policy 'maximum' and disk size greater then 64 TB. >> Or may be we can just check cluster size at start up. How do you think? > > I'm not sure I'd go through the trouble of determining the size. If > we're not changing the default cluster_size, then using 512 "seems" > reasonable (whether others feel the same way who knows yet). I'm not > even sure if we already get that information - feel free to research and > add though. > > I'm not sure I would "refuse" to start - think in terms of what is the > downside to not being able to have a large enough value - well that > seems to be I/O performance. If we can get "better" I/O performance by > providing "something" more than "default" for those environments, then I > would think that's better than doing nothing or failing to start. > > I'm not sure how much "effort" you want to put into the window of > support prior to 3.1 where "maximum" is true maximum. It could be > simple enough to document that for prior to 3.1 the maximum value is > limited to XXX. After 3.1 it's essentially unlimited since libvirt can > provide the maximum integer value to QEMU and QEMU then decides the > maximum up to that value. There's ways to wordsmith it - I'm pretty > sure it's been done before, but I don't have an example readily > available (search docs for size, maximum, or default)... > >> >> By the way the patch works only for -blockdev configuration which is >> available since QEMU 2.10 AFAIK. So we could purse to support 2.10, 2.11, >> 2.12 and 3.0 version in principle. > > Ah true, the --blockdev option is required. > > I've put the "artificial" point in time of 2.12 because that when I > recall Berto making adjustments to the cache size algorithms and when > the @l2-cache-entry-size appeared in qapi/block-core.json. I think prior > to those changes the algorithm was even more difficult to decipher. > UPDATE Turns out -blockdev will be supported only starting from QEMU 3.0.0 according to [1]. I thought 2.10 and further versions will be supported too because grepped 'qdev' in tests/qemucapabilitiesdata/ - turns out I found support for query-block command :) So I think extra complexity to support just 3.0.0 is unnessesary. Nikolay [1] [PATCHv2 62/62] DO NOT APPLY: Enable QEMU_CAPS_BLOCKDEV if 'query-blockstats' works with -blockdev https://www.redhat.com/archives/libvir-list/2018-August/msg00755.html --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -1307,6 +1307,7 @@ static struct virQEMUCapsStringFlags virQEMUCapsQMPSchemaQueries[] = { { "blockdev-add/arg-type/+qcow2/encrypt/+luks/key-secret", QEMU_CAPS_QCOW2_LUKS }, { "nbd-server-start/arg-type/tls-creds", QEMU_CAPS_NBD_TLS }, { "screendump/arg-type/device", QEMU_CAPS_SCREENDUMP_DEVICE }, + { "query-blockstats/ret-type/qdev", QEMU_CAPS_BLOCKDEV }, }; >> >>> >>> Do you think it would be "worthwhile" to add logic that knows QEMU 2.12 >>> and 3.0 could still support using l2_cache_size, but with a lower value >>> say 32MB? Then if we find 3.1 installed then we can supply INT_MAX for >>> the value? Oh, and if the "maximum" attribute isn't set, then the >>> default of 0 is used. Thus being able to use "Z" instead of "I" for patch 3? >> >> In default case l2-cache-size just does not get set. >> > > Yes on earlier support and I think using l2-cache-entry-size will give > us a point in time to support from. The max size chosen is up to you - I > think if we document what we set to prior to QEMU 3.1, then we'll be good. > > Adding a second capability that would be (yuck) version based because we > know a bug was fixed is going to have to be acceptable. Then, the logic > to "set" the value would be something like: > > if (domain maximum provided) { > if (some 3.1 capability is set) > max = INT64_MAX; > else if (at least 2.12 capability is set) > max = "whatever smaller value you choose" > else > error this qemu doesn't support setting... > } > >>> >>> BTW: >>> One issue with providing a numerical QEMU version for something is that >>> someone could backport that patch (and it's dependencies) into some >>> maintenance branch or even downstream repo that won't report as 3.1, but >>> would have the patch. But since libvirt would only accept 3.1 or later, >>> it wouldn't allow usage. >> >> Does not look critical. The functionality just won't work from start >> and when they start to investigate why in the result they can backport >> appropriate qemu patches as well and fix libvirt caps code for example. > > True, not critical, but it's one of the reasons numerical version > checking is disliked. Similarly providing in docs that something is > supported as of QEMU maj.min will get "dinged" with the reasoning of > what if someone backports... and I have a similar response to yours for > that. > > John > > >> >> Nikolay >> >>> >>> >>>>> >>>>> Still does that really matter? Let QEMU pick their own max and don't >>>>> have libvirt be the arbiter of that (like I noted in my 1/4 review). My >>>>> reasoning is that there's been quite a few "adjustments" to the >>>>> algorithms along the way. Those adjustments are one of the concerns >>>>> voiced in the past why making any "semi-permanent" change to the value >>>>> in some libvirt XML format has been NACKed historically. THus by placing >>>>> boundaries that are true today we limit ourselves for the future. >>>> >>>> IIUC setting INT64_MAX is ok in this respect. It is not real value for cache >>>> but rather order for QEMU to pick up one. >>>> >>> >>> Right providing some unrealistically large value would appear to work. >>> What that value "could be" for 2.12 and 3.0.0 which do support some >>> level of alteration and would be 'nice' to include in the mix, but not >>> required I suppose. And yes, I do have some downstream representation in >>> mind by thinking about this - it's in beta now ;-). >>> >>> John >>> >>>>> >>>>> BTW: If 3.1 was the "base" from which you want to work, then adjustments >>>>> to the tests/qemucapabilitiesdata/*3_1*{.replies|.xml } files would be >>>>> required as evidenced by your patch 4. The *.xml file would need to have >>>>> the correct <version>3001000</version> setting which should allow patch4 >>>>> to be merged into patch3. >>>> >>>> Yeah, but 3.1 is not yet released and I need blockdev also as >>>> patch only makes difference in latter case. >>>> >>>> Nikolay >>>> >>>>> >>>>> John >>>>> >>>>>> if (virQEMUCapsProbeQMPCommands(qemuCaps, mon) < 0) >>>>>> goto cleanup; >>>>>> >>>>>> diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h >>>>>> index 6bb9a2c..bb2ac17 100644 >>>>>> --- a/src/qemu/qemu_capabilities.h >>>>>> +++ b/src/qemu/qemu_capabilities.h >>>>>> @@ -493,6 +493,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */ >>>>>> QEMU_CAPS_VFIO_PCI_DISPLAY, /* -device vfio-pci.display */ >>>>>> QEMU_CAPS_BLOCKDEV, /* -blockdev and blockdev-add are supported */ >>>>>> QEMU_CAPS_DEVICE_VFIO_AP, /* -device vfio-ap */ >>>>>> + QEMU_CAPS_QCOW2_L2_CACHE_SIZE, /* -blockdev supports l2-cache-size with INT64_MAX value */ >>>>>> >>>>>> QEMU_CAPS_LAST /* this must always be the last item */ >>>>>> } virQEMUCapsFlags; >>>>>> -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list