padata has been undergoing some surgery over the last year[0] and now seems
ready for another enhancement: splitting up and multithreading CPU-intensive
kernel work.

Quoting from an earlier series[1], the problem I'm trying to solve is

  A single CPU can spend an excessive amount of time in the kernel operating
  on large amounts of data.  Often these situations arise during
  initialization- and destruction-related tasks, where the data involved
  scales with system size.  These long-running jobs can slow startup and
  shutdown of applications and the system itself while extra CPUs sit idle.

Here are the current consumers:

 - struct page init (boot, hotplug, pmem)
 - VFIO page pinning (kvm guest init)
 - fallocating a hugetlb file (database shared memory init)

On a large-memory server, DRAM page init is ~23% of kernel boot (3.5s/15.2s),
and it takes over a minute to start a VFIO-enabled kvm guest or fallocate a
hugetlb file that occupies a significant fraction of memory.  This work
results in 7-20x speedups and is currently increasing the uptime of our
production kernels.

Future areas include munmap/exit, umount, and __ib_umem_release.  Some of
these need coarse locks broken up for multithreading (zone->lock, lru_lock).

Positive outcomes for the session would be...

 - Finding a strategy for capping the maximum number of threads in a job.

 - Agreeing on a way for the job's threads to respect resource controls.

   In the past few weeks I've been thinking about whether remote charging in
   the CPU controller is feasible (RFD to come).  I am also considering
   creating workqueue workers directly in cgroup-specific pools instead, and
   have previously proposed migrating workers in and out of cgroups[2].

   There's also memory policy and sched_setaffinity() to think about.

 - Checking the overall design of this thing with the mm community, given
   that current users are all mm-related.

 - Getting advice from others (hallway track) on why some pmem devices
   perform better than others under multithreading.

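To make the "splitting up" idea and the thread-cap question above concrete,
here is a minimal user-space sketch using pthreads.  It is not the padata
interface and not code from the branch: struct mt_job, mt_job_nr_threads(),
mt_job_run(), and sum_range() are hypothetical names invented for this
illustration, and the capping rule (one thread per min_chunk units of work,
never more than max_threads) is only one assumed strategy among those the
session could consider.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical job description; illustrative only, not the padata API. */
struct mt_job {
	void (*thread_fn)(size_t start, size_t end, void *arg);
	void *arg;
	size_t start;             /* first unit of work */
	size_t size;              /* number of units of work */
	size_t min_chunk;         /* smallest amount worth a thread */
	unsigned int max_threads; /* cap on threads for this job */
};

struct mt_worker {
	pthread_t tid;
	const struct mt_job *job;
	size_t start, end;        /* half-open range [start, end) */
	int started;
};

static void *mt_worker_fn(void *data)
{
	struct mt_worker *w = data;

	w->job->thread_fn(w->start, w->end, w->job->arg);
	return NULL;
}

/* Assumed capping rule: one thread per min_chunk units, at most max_threads. */
unsigned int mt_job_nr_threads(const struct mt_job *job)
{
	size_t min_chunk = job->min_chunk ? job->min_chunk : 1;
	size_t n = job->size / min_chunk;

	if (n > job->max_threads)
		n = job->max_threads;
	if (n < 1)
		n = 1;
	return (unsigned int)n;
}

/* Split the range into near-equal chunks and run one thread per chunk. */
int mt_job_run(const struct mt_job *job)
{
	unsigned int i, nr;
	struct mt_worker *w;
	size_t chunk, rem, pos;

	assert(job && job->thread_fn);
	nr = mt_job_nr_threads(job);
	w = calloc(nr, sizeof(*w));
	if (!w)
		return -1;

	chunk = job->size / nr;
	rem = job->size % nr;
	pos = job->start;
	for (i = 0; i < nr; i++) {
		size_t len = chunk + (i < rem ? 1 : 0);

		w[i].job = job;
		w[i].start = pos;
		w[i].end = pos + len;
		pos += len;
		if (pthread_create(&w[i].tid, NULL, mt_worker_fn, &w[i]) == 0)
			w[i].started = 1;
		else	/* no thread available: do this chunk in the caller */
			job->thread_fn(w[i].start, w[i].end, job->arg);
	}
	for (i = 0; i < nr; i++)
		if (w[i].started)
			pthread_join(w[i].tid, NULL);
	free(w);
	return 0;
}

/* Example thread function: add the indices in [start, end) to a total. */
void sum_range(size_t start, size_t end, void *arg)
{
	atomic_ullong *total = arg;
	unsigned long long local = 0;

	for (size_t i = start; i < end; i++)
		local += i;
	atomic_fetch_add(total, local);
}
```

A caller would fill in a struct mt_job with the range, a min_chunk, and a
max_threads value, then call mt_job_run(); small jobs fall back to a single
thread.  The sketch leaves out everything the session is meant to discuss:
cgroup resource controls, memory policy, and CPU affinity of the helpers.
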
This work-in-progress branch shows what it looks like now.

    git://oss.oracle.com/git/linux-dmjordan.git padata-mt-wip-v0.2
    https://oss.oracle.com/git/gitweb.cgi?p=linux-dmjordan.git;a=shortlog;h=refs/heads/padata-mt-wip-v0.2

[0] https://lore.kernel.org/linux-crypto/?q=s%3Apadata+d%3A20190212..20200212
[1] https://lore.kernel.org/lkml/20181105165558.11698-1-daniel.m.jordan@xxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20190605133650.28545-1-daniel.m.jordan@xxxxxxxxxx/