On 4/22/21 10:22 AM, Alexey Makhalov wrote:
> Hello,
>
>>> 2) It's possible users set particular conditions in percpu variables
>>> that are not tied to just statistics summing (such as the cpu
>>> runqueues). Users would have to provide online init and exit functions
>>> which could get weird.
> I do not think online init/exit function is a right approach.
> There are many places in the Linux where percpu data get initialized right after got allocated:
> ptr = alloc_percpu();
> for_each_possible_cpu(cpu) {
>     initialize (per_cpu_ptr(ptr, cpu));
> }
> Let’s keep all such instances untouched. Hope initialize() just touch content of percpu area without allocating substructures. If so - it should be redesigned.

I'm afraid that 'hope' won't get us far. For example, in mm/page_alloc.c we
use INIT_LIST_HEAD() for percpu structures. That means each one is
initialized to an empty list_head, which is two "self-pointers", and you
can't just memcpy that elsewhere. You could try to special-case this stuff in
your "initialize N from A" approach, but it becomes rather fragile, so we
would indeed need callbacks for proper init/exit on online/offline.

> BTW, this loop does extra work (runtime overhead) to initialize areas for possible cpus which might never arrive.
>
> The proposal:
> - in case of possible_cpus > online_cpus, add additional unit (call it A) to the chunks which will contain initialized image of percpu data for possible cpus.
> - for_each_possible_cpu(cpu) from snippet above should go through all online cpus + 1 (for unit A).
> - on new CPU #N arrival, percpu should allocate corresponding unit N and initialize its content by data from unit A. Repeat for all chunks.
> - on CPU D departure - release unit D from the chunks, keeping unit A intact.
> - in case of possible_cpus > online_cpus, overhead will be +1 (for unit A), while current overhead is +(possible_cpus-online_cpus).
> - in case of possible_cpus == online_cpus (no CPU hotplug) - do not allocate unit A, keep percpu allocator as it is now - no overhead.
>
> Does it fully cover 2nd concern?
>
>>> As Roman mentioned, I think it would be much better to not have the
>>> large discrepancy between the cpu_online_mask and the cpu_possible_mask.
>>
>> Indeed it is quite common on PowerPC to set a VM with a possible high number of CPUs but with a reasonnable number of online CPUs. This allows the user to scale up its VM when needed.

Yeah, somehow it's always PowerPC with this kind of possible vs online
problem :) The last time I recall, it was the SLUB page order. So I'm not
against the hotplug support, but it really won't be simple.

>> For instance we may see up to 1024 possible CPUs while the online number is *only* 128.
> Agree. In VMs, vCPUs there are just threads/processes on the host and can be easily added/removed on demand.
>
> Thanks,
> —Alexey