[resending in modified form since this didn't seem to reach the lists]

Hi,

padata has been undergoing some surgery lately[0] and now seems ready for
another enhancement: splitting up and multithreading CPU-intensive kernel
work.

I'm planning to send an RFC for this, but I wanted to post some thoughts on
the design and a work-in-progress branch first to see if the direction looks
ok.

Quoting from an earlier series[1], this is the problem I'm trying to solve:

  A single CPU can spend an excessive amount of time in the kernel operating
  on large amounts of data.  Often these situations arise during
  initialization- and destruction-related tasks, where the data involved
  scales with system size.  These long-running jobs can slow startup and
  shutdown of applications and the system itself while extra CPUs sit idle.

There are several paths where this problem exists, but here are three to
start:

 - struct page initialization (at boot time, during memory hotplug, and for
   persistent memory)
 - VFIO page pinning (KVM guest initialization)
 - fallocating a HugeTLB page (database initialization)

padata is a general mechanism for parallel work and so seems natural for this
functionality[2], but currently it can only manage a series of small, ordered
jobs.

The coming RFC will bring enhancements to split up a large job among a set of
helper threads according to the user's wishes, load balance the work between
them, set concurrency limits to control the overall number of helpers in the
system and per NUMA node, and run extra helper threads beyond the first at a
low priority level so as not to disturb other activity on the system for the
sake of an optimization.  (While extra helpers are run at low priority for
most of the job, their priority is raised one by one at job end to match the
caller's to avoid starvation on a busy system.)

The existing padata interfaces and features will remain, but serialization
becomes optional because these sorts of jobs don't need it.
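
To make the splitting and load balancing idea concrete, here is a rough
userspace sketch using pthreads and C11 atomics.  It is not code from the
branch and not a padata interface: every name in it (mt_job, mt_job_run,
mt_helper, sum_range, MT_MAX_HELPERS) is hypothetical.  It assumes the job is
a contiguous range of work units that can be processed in any order, and it
leaves out the low-priority helpers and the system-wide and per-NUMA-node
limits described above.  The caller acts as the first thread, and helpers
pull fixed-size chunks from a shared counter so that faster threads simply
take more chunks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* All names here are hypothetical; none of them are padata interfaces. */
#define MT_MAX_HELPERS 16	/* cap on extra threads per job */

typedef void (*mt_chunk_fn)(size_t start, size_t end, void *arg);

struct mt_job {
	size_t start;		/* first unit of work */
	size_t size;		/* total units of work */
	size_t min_chunk;	/* units handed out per grab */
	mt_chunk_fn fn;		/* processes the half-open range [start, end) */
	void *arg;		/* passed through to fn */
	atomic_size_t next;	/* offset of the next unclaimed chunk */
};

/* Claim chunks until the range is exhausted. */
static void *mt_helper(void *data)
{
	struct mt_job *job = data;

	for (;;) {
		size_t off = atomic_fetch_add(&job->next, job->min_chunk);
		size_t len;

		if (off >= job->size)
			break;
		len = job->size - off;
		if (len > job->min_chunk)
			len = job->min_chunk;
		job->fn(job->start + off, job->start + off + len, job->arg);
	}
	return NULL;
}

/*
 * Run the job on the calling thread plus up to max_threads - 1 helpers.
 * Returns the number of threads that took part, including the caller.
 */
static unsigned int mt_job_run(struct mt_job *job, unsigned int max_threads)
{
	pthread_t tids[MT_MAX_HELPERS];
	unsigned int nr_helpers = 0;
	size_t nchunks;

	assert(job->min_chunk > 0);
	atomic_init(&job->next, 0);

	/* Never start more threads than there are chunks to hand out. */
	nchunks = (job->size + job->min_chunk - 1) / job->min_chunk;
	if (max_threads > MT_MAX_HELPERS + 1)
		max_threads = MT_MAX_HELPERS + 1;
	if ((size_t)max_threads > nchunks)
		max_threads = (unsigned int)nchunks;

	for (unsigned int i = 1; i < max_threads; i++) {
		/* Fewer helpers is fine: the caller still finishes the job. */
		if (pthread_create(&tids[nr_helpers], NULL, mt_helper, job))
			break;
		nr_helpers++;
	}

	mt_helper(job);		/* the caller is the first worker */

	for (unsigned int i = 0; i < nr_helpers; i++)
		pthread_join(tids[i], NULL);

	return nr_helpers + 1;
}

/* Example chunk function: adds the integers in [start, end) to a total. */
static void sum_range(size_t start, size_t end, void *arg)
{
	atomic_ullong *total = arg;
	unsigned long long local = 0;

	for (size_t i = start; i < end; i++)
		local += i;
	atomic_fetch_add(total, local);
}
```

A caller would fill in start, size, min_chunk, fn and arg, then call
mt_job_run() with its thread limit.  Handing out chunks from one shared
counter is only one possible way to balance the load; it was picked here
because it is short.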

The advantage to enhancing padata rather than having the multithreading stand
alone is that there would be one central place in the kernel to manage the
number of helper threads that run at any given time.  A machine's idle CPU
resources can be harnessed yet controlled (the low priority idea) to provide
the right amount of multithreading for the system.

Here's a work-in-progress branch with some of this already done in the last
five patches.

    git://oss.oracle.com/git/linux-dmjordan.git padata-mt-wip
    https://oss.oracle.com/git/gitweb.cgi?p=linux-dmjordan.git;a=shortlog;h=refs/heads/padata-mt-wip

Thoughts?  Questions?

Thanks.

Daniel

[0] https://lore.kernel.org/linux-crypto/?q=s%3Apadata+d%3A20190101..
[1] https://lore.kernel.org/lkml/20181105165558.11698-1-daniel.m.jordan@xxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20181107103554.GL9781@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/