On 03/14/2012 07:42 PM, Jesse J. Cook wrote:
> 256 (8 bits) is insufficient for large scale deployments. 65536 (16 bits) is a
> more appropriate limit and should be sufficient. You are more likely to run
> into other system limitations first, such as the 31998 inode link limit on
> ext3.
> ---
>  src/remote/remote_protocol.x |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x
> index 59774b2..58f0871 100644
> --- a/src/remote/remote_protocol.x
> +++ b/src/remote/remote_protocol.x
> @@ -103,7 +103,7 @@ const REMOTE_INTERFACE_NAME_LIST_MAX = 256;
>  const REMOTE_DEFINED_INTERFACE_NAME_LIST_MAX = 256;
>
>  /* Upper limit on lists of storage pool names. */
> -const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 256;
> +const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 65536;

While I agree that it would be nice to avoid capping the number of pools
that can actually exist, I have to say NACK to the approach in this patch.

The reason we originally chose a limit of 256 is that each storage pool
name is an arbitrary-length string, and when you pile multiple strings in
a row, you can generate a rather lengthy RPC message, for which the
receiver has to pre-allocate memory in order to receive correctly. See
this comment:

/* Maximum length of a memory peek buffer message.
 * Note applications need to be aware of this limit and issue multiple
 * requests for large amounts of data. */
const REMOTE_DOMAIN_MEMORY_PEEK_BUFFER_MAX = 65536;

That is, you cannot send any RPC with more than 64k of data, because the
receiver will reject anything larger rather than pre-allocate an
arbitrarily large buffer. With a cap of 256 pool names, each pool name
can be (on average) about 256 bytes before you hit the RPC cap
(65536 / 256 = 256). Bumping REMOTE_STORAGE_POOL_NAME_LIST_MAX to 65536
is pointless; each remote_nonnull_string already occupies several bytes
towards the 64k cap (in XDR, even an empty string costs a four-byte
length field, so 65536 names could never fit in a 64k message anyway).
I could perhaps see bumping things to 512, which lowers your average
pool name to about 128 bytes before you hit the RPC cap, or even to a
cap of 1024 names averaging 64 bytes each. But the fact remains that you
will hit a hard wall on RPC sizing, with diminishing returns from
raising the pool name list maximum, which means that the _real_ solution
has to come at a more fundamental level.

We need a new API which lets you specify a starting point and range, in
order to query only part of the pool list. For example, see how the
recent virDomainGetCPUStats has start_cpu and ncpus parameters, as well
as a cap of only 128 cpus per RPC call. If you have a beefy machine with
256 cpus, you have to make two separate RPC calls to collect all of the
statistics. In other words, I think we need:

virConnectListStoragePoolsRange(virConnectPtr conn,
                                char **const names,
                                int start,
                                int maxnames,
                                unsigned int flags);

where flags controls whether we are listing active or defined pools
(virConnectListStoragePools and virConnectListDefinedStoragePools were
added back in the days before we realized how useful a flags argument
would be), where 'maxnames' can be at most 256 (per the RPC limits), but
where 'start' lets you choose the point in the array at which your
enumeration begins, so that multiple calls let you see the entire array.

It may also be worth considering the addition of _just_ a new ranged RPC
call, but no new public API, by making the public API smart enough to
call the ranged RPC multiple times automatically and build up a single
return to the user.
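To make the shape of that loop concrete, here is a minimal client-side
sketch in C. It assumes the proposed signature above;
virConnectListStoragePoolsRange, its return-value semantics, and the
string-ownership convention are all hypothetical (no such API exists
yet), and error handling is abbreviated:

#include <stdio.h>
#include <stdlib.h>
#include <libvirt/libvirt.h>

/* Hypothetical declaration of the API proposed above; assume it returns
 * the number of names stored (possibly fewer than maxnames), or -1 on
 * error. */
int virConnectListStoragePoolsRange(virConnectPtr conn, char **const names,
                                    int start, int maxnames,
                                    unsigned int flags);

#define POOLS_PER_CALL 256  /* per-call ceiling dictated by the RPC limits */

static int
print_all_pools(virConnectPtr conn, unsigned int flags)
{
    char *names[POOLS_PER_CALL];
    int start = 0;
    int got;

    do {
        /* Fetch the next window of up to POOLS_PER_CALL names,
         * beginning at index 'start'. */
        got = virConnectListStoragePoolsRange(conn, names, start,
                                              POOLS_PER_CALL, flags);
        if (got < 0)
            return -1;
        for (int i = 0; i < got; i++) {
            printf("%s\n", names[i]);
            free(names[i]);  /* assume the caller owns the returned strings */
        }
        start += got;
    } while (got == POOLS_PER_CALL);  /* a short batch means we hit the end */

    return start;  /* total number of pools enumerated */
}

Whether that loop lives in every client or just once inside the library,
hidden behind the existing entry points, is exactly the trade-off in
question.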
After all, if we exposed the ranged call to the user, then each client
would have to add the logic to make the multiple calls, whereas our
current limitation is _only_ in the RPC aspect (once the data is across
the RPC and into the target process' memory space, there is no longer a
64k cap to worry about). The same need for a new ranged API or RPC
applies to any situation where the current RPC limits start to feel too
small compared to the amount of data available to transmit in a single
call.

There is also an issue of concurrency: a single API call sees consistent
data, but back-to-back calls over subsets of the array could race with
another client adding or removing a pool in the meantime. I'm not sure
what can be done to prevent issues on that front; you already have
TOCTTOU races between grabbing the list of pools and acting on that
list, even without a range command. Maybe we could invent a new API for
batching several commands into one transaction, with multiple RPC calls
but with a guarantee that no RPCs from any other connection will alter
the state between the batch calls; but then we start to get into issues
with what happens if a network connection goes down in the middle of
operating on a batch command. So it may be something that we declare not
worth trying to solve.

-- 
Eric Blake   eblake@xxxxxxxxxx    +1-919-301-3266
Libvirt virtualization library http://libvirt.org