On 21/05/21 13:02, Quentin Perret wrote:

...

> So I think Will has a point since, IIRC, the root domains get rebuilt
> during hotplug. So you can imagine a case with a single root domain, but
> CPUs 4-7 are offline. In this case, sched_setattr() will happily promote
> a task to DL as long as its affinity mask is a superset of the rd span,
> but things may get ugly when CPUs are plugged back in later on.
>
> This looks like an existing bug though. I just tried the following on a
> system with 4 CPUs:
>
>     // Create a task affined to CPU [0-2]
>     > while true; do echo "Hi" > /dev/null; done &
>     [1] 560
>     > mypid=$!
>     > taskset -p 7 $mypid
>     pid 560's current affinity mask: f
>     pid 560's new affinity mask: 7
>
>     // Try to move it DL, this should fail because of the affinity
>     > chrt -d -T 5000000 -P 16666666 -p 0 $mypid
>     chrt: failed to set pid 560's policy: Operation not permitted
>
>     // Offline CPU 3, so the rd now covers CPUs 0-2 only
>     > echo 0 > /sys/devices/system/cpu/cpu3/online
>     [ 400.843830] CPU3: shutdown
>     [ 400.844100] psci: CPU3 killed (polled 0 ms)
>
>     // Try to admit the task again, which now succeeds
>     > chrt -d -T 5000000 -P 16666666 -p 0 $mypid
>
>     // Plug CPU3 back online
>     > echo 1 > /sys/devices/system/cpu/cpu3/online
>     [ 408.819337] Detected PIPT I-cache on CPU3
>     [ 408.819642] GICv3: CPU3: found redistributor 3 region 0:0x0000000008100000
>     [ 408.820165] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
>
> I don't see any easy way to fix this w/o iterating over all deadline
> tasks in the rd when hotplugging a CPU back on, and blocking the hotplug
> operation if it'll cause affinity issues. Urgh.
>

Yeah this looks like a plain existing bug, joy. :)

We fixed a few around AC lately, but I guess work wasn't complete.

Thanks,
Juri
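
P.S. For anyone following along without a 4-CPU box, the condition being discussed (affinity mask must be a superset of the rd span) can be modelled with plain bitmasks. This is only a toy sketch, not kernel code: `dl_admit` is a made-up helper name, bit N stands for CPU N, and the masks are the ones from the reproducer above.

```shell
#!/bin/sh
# Toy model only: CPU sets are bitmasks, bit N = CPU N. dl_admit is a
# hypothetical helper name, not a kernel or util-linux interface.

# Print "ok" if the affinity mask ($2) covers the whole root-domain
# span ($1), "EPERM" otherwise. This mirrors the "affinity is a
# superset of the rd span" condition described in the thread.
dl_admit() {
    if [ $(( $1 & ~$2 )) -ne 0 ]; then
        echo "EPERM"
    else
        echo "ok"
    fi
}

affinity=0x7    # task pinned to CPUs 0-2, as set by "taskset -p 7"

echo "all CPUs online:  $(dl_admit 0xf "$affinity")"   # span = CPUs 0-3
echo "CPU3 offline:     $(dl_admit 0x7 "$affinity")"   # span = CPUs 0-2
echo "CPU3 back online: $(dl_admit 0xf "$affinity")"   # span = CPUs 0-3 again
```

The middle case is the window in which the task gets admitted. The last line shows that the same check would reject the task once CPU3 is back, but, going by the reproducer, nothing re-runs it on hotplug, so the task stays DL with an affinity that no longer covers the span.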