The following commit has been merged into the sched/core branch of tip:

Commit-ID:     01ccf592362a984534371b3596d4c953da6a7bb2
Gitweb:        https://git.kernel.org/tip/01ccf592362a984534371b3596d4c953da6a7bb2
Author:        Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
AuthorDate:    Wed, 19 Aug 2020 21:55:05 +02:00
Committer:     Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Wed, 26 Aug 2020 12:41:58 +02:00

sched: Bring the PF_IO_WORKER and PF_WQ_WORKER bits closer together

The bits PF_IO_WORKER and PF_WQ_WORKER are tested together in
sched_submit_work() which is considered to be a hot path.
If the two bits cross the 8 or 16 bit boundary then most architectures
require multiple load instructions in order to create the constant
value. Also, such a value cannot be encoded within the compare opcode.

By moving the bit definitions within the same block, the compiler can
create/use one immediate value.

For some reason gcc-10 on ARM64 requires both bits to be next to each
other in order to issue "tst reg, val; bne label". Otherwise the result
is "mov reg1, val; tst reg, reg1; bne label".

Move PF_VCPU out of the way so that PF_IO_WORKER can be next to
PF_WQ_WORKER.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: https://lkml.kernel.org/r/20200819195505.y3fxk72sotnrkczi@xxxxxxxxxxxxx
---
 include/linux/sched.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 93ecd93..2bf0af1 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1489,9 +1489,10 @@ extern struct pid *cad_pid;
 /*
  * Per process flags
  */
+#define PF_VCPU			0x00000001	/* I'm a virtual CPU */
 #define PF_IDLE			0x00000002	/* I am an IDLE thread */
 #define PF_EXITING		0x00000004	/* Getting shut down */
-#define PF_VCPU			0x00000010	/* I'm a virtual CPU */
+#define PF_IO_WORKER		0x00000010	/* Task is an IO worker */
 #define PF_WQ_WORKER		0x00000020	/* I'm a workqueue worker */
 #define PF_FORKNOEXEC		0x00000040	/* Forked but didn't exec */
 #define PF_MCE_PROCESS		0x00000080	/* Process policy on mce errors */
@@ -1515,7 +1516,6 @@ extern struct pid *cad_pid;
 #define PF_NO_SETAFFINITY	0x04000000	/* Userland is not allowed to meddle with cpus_mask */
 #define PF_MCE_EARLY		0x08000000	/* Early kill for mce process policy */
 #define PF_MEMALLOC_NOCMA	0x10000000	/* All allocation request will have _GFP_MOVABLE cleared */
-#define PF_IO_WORKER		0x20000000	/* Task is an IO worker */
 #define PF_FREEZER_SKIP		0x40000000	/* Freezer should not count it as freezable */
 #define PF_SUSPEND_TASK		0x80000000	/* This thread called freeze_processes() and should not be frozen */
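
The effect described in the commit message can be illustrated with a
small userspace sketch that compares the combined test mask before and
after the change. The flag values are taken from the diff above;
everything else (the OLD_/NEW_ prefixes and the helpers
fits_in_low_bits(), is_contiguous_run(), is_worker() and
check_layouts()) is hypothetical and exists only for illustration.
The contiguous-run check is a simplified stand-in for the AArch64
logical-immediate rule, which, as far as I understand it, only accepts
rotated and replicated runs of consecutive one bits; this is offered as
a plausible explanation of the gcc-10 behaviour, not as a confirmed one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unchanged by the commit. */
#define PF_WQ_WORKER		0x00000020u
/* Value removed by the commit (name is hypothetical). */
#define OLD_PF_IO_WORKER	0x20000000u
/* Value added by the commit (name is hypothetical). */
#define NEW_PF_IO_WORKER	0x00000010u

/* Hypothetical helper: true if all set bits lie in the low 'bits' bits. */
static bool fits_in_low_bits(uint32_t mask, unsigned int bits)
{
	return bits >= 32 || (mask >> bits) == 0;
}

/*
 * Hypothetical helper: true if the set bits form one contiguous run.
 * Simplification: ignores the rotation and replication that a real
 * AArch64 logical immediate also allows.
 */
static bool is_contiguous_run(uint32_t mask)
{
	uint32_t lowest;

	if (mask == 0)
		return false;

	/* Isolate the lowest set bit; adding it clears a contiguous run. */
	lowest = mask & (~mask + 1u);
	return ((mask + lowest) & mask) == 0;
}

/* Stand-in for the combined flag test done in the hot path. */
static bool is_worker(uint32_t flags, uint32_t mask)
{
	return (flags & mask) != 0;
}

/* Compare the old and new layouts of the two worker bits. */
static void check_layouts(void)
{
	const uint32_t old_mask = PF_WQ_WORKER | OLD_PF_IO_WORKER; /* 0x20000020 */
	const uint32_t new_mask = PF_WQ_WORKER | NEW_PF_IO_WORKER; /* 0x00000030 */

	/* Old layout: crosses the 16 bit boundary, bits are far apart. */
	assert(!fits_in_low_bits(old_mask, 16));
	assert(!is_contiguous_run(old_mask));

	/* New layout: fits in 8 bits, bits are adjacent. */
	assert(fits_in_low_bits(new_mask, 8));
	assert(is_contiguous_run(new_mask));

	/* The test itself behaves the same; only its encoding changes. */
	assert(is_worker(NEW_PF_IO_WORKER, new_mask));
	assert(!is_worker(0x00000004u, new_mask)); /* PF_EXITING alone */
}
```

Calling check_layouts() from a main() function runs all of the checks.
The sketch says nothing about the code a particular compiler actually
emits; that still has to be confirmed by inspecting the disassembly of
sched_submit_work() on the architecture of interest.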