On Wed, Nov 8, 2023 at 5:15 AM Juraj Marcin <juraj@xxxxxxxxxxxxxxx> wrote: > > Currently, filename transitions are stored separately from other type > enforcement rules and only support exact name matching. However, in > practice, the names contain variable parts. This leads to many > duplicated rules in the policy that differ only in the part of the name, > or it is even impossible to cover all possible combinations. > > This patch changes the filename transition table in the policydb > structure into an array of three tables, where the index determines the > match type for the rules contained (extract, prefix, and suffix match). > Then the patch extends the functions that access the table through the > policydb structure to accompany this change while reusing the majority > of the old filename transitions code. > > This patch also updates the code responsible for finding the right > filename transition based on the context and the name. The rules have > the following order of prioriy, if no matching rule is found, the code > moves on to the next category: > - exact filename transitions, > - prefix filename transitions in the order of the longest prefix match, > - suffix filename transitions in the order of the longest suffix match. > This ensures the compatibility with older policies. > > Without prefix/suffix rules in the policy, this patch has no impact on > performance or policy loading times. Moreover, with prefix/suffix rules, > the overall number of filename transitions can be reduced, which results > in smaller binary policy size and therefore also slightly lower load > time and memory usage. > > Performance tests: > > 1: Reference kernel (f5bbdeda34c63), Fedora policy (format v33) > 2: This patch, Fedora policy (format v33) > 3: This patch, Fedora policy without prefix/suffix rules (format v34) > 4: This patch, Fefora policy with prefix rules (format v35) > > | Mem | Binary | Policy | Create tty [1] | osbench [2] > | Usage | policy | load | | create > | | size | time | (ms/file) | files > | (MiB) | (KiB) | (ms) | real | kernel | (us/file) > ---+--------+--------+---------+----------+-----------+----------- > 1 | 298.7 | 3682 | 58.626 | 1.0228 | 0.6793 | 8.4916 > | sd=4.1 | | sd=0.47 | sd=0.058 | sd=0.0497 | sd=0.131 > 2 | 296.3 | 3682 | 58.915 | 1.0209 | 0.6752 | 8.5728 > | sd=3.9 | | sd=0.28 | sd=0.021 | sd=0.0244 | sd=0.156 > 3 | 295.7 | 3682 | 56.374 | 1.0160 | 0.6616 | 8.7467 > | sd=2.6 | | sd=0.44 | sd=0.008 | sd=0.0141 | sd=0.126 > 4 | 296.2 | 2585 | 51.434 | 1.0116 | 0.6699 | 8.7467 > | sd=4.1 | | sd=0.39 | sd=0.012 | sd=0.0115 | sd=0.126 > > [1] Create test_tty benchmark: > > mknod /dev/test_tty c 4 1 > time for i in `seq 1 10000`; do > mknod /dev/test_tty$i c 4 1 > done > > This benchmark should simulate the worst case scenario as many filename > transitions affect files created in the /dev directory. > > [2] https://github.com/mbitsnbites/osbench > > Reviewed-by: Ondrej Mosnacek <omosnace@xxxxxxxxxx> > Signed-off-by: Juraj Marcin <juraj@xxxxxxxxxxxxxxx> > --- As with the previous patch, seeing these warnings due to performing blocking allocation when atomic is required. [ 2186.277242] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306 [ 2186.277250] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 10017, name: mkdir [ 2186.277256] preempt_count: 0, expected: 0 [ 2186.277260] RCU nest depth: 1, expected: 0 [ 2186.277263] INFO: lockdep is turned off. [ 2186.277268] CPU: 0 PID: 10017 Comm: mkdir Tainted: G B W OE 6.6.0-rc1+ #66 [ 2186.277280] Call Trace: [ 2186.277284] <TASK> [ 2186.277288] dump_stack_lvl+0x75/0x90 [ 2186.277300] __might_resched+0x1e1/0x310 [ 2186.277311] __kmem_cache_alloc_node+0x343/0x380 [ 2186.277320] ? security_compute_sid.part.0+0xa60/0xe90 [ 2186.277332] ? security_compute_sid.part.0+0xa60/0xe90 [ 2186.277341] __kmalloc_node_track_caller+0x55/0x160 [ 2186.277356] kstrdup+0x34/0x60 [ 2186.277364] security_compute_sid.part.0+0xa60/0xe90 [ 2186.277379] ? lock_acquire+0xb5/0x390 [ 2186.277388] ? __pfx_security_compute_sid.part.0+0x10/0x10 [ 2186.277397] ? rcu_is_watching+0x23/0x50 [ 2186.277407] ? lock_acquire+0xb5/0x390 [ 2186.277414] ? rcu_is_watching+0x23/0x50 [ 2186.277423] ? lock_release+0xa0/0x380 [ 2186.277429] ? avc_has_perm_noaudit+0xb4/0x250 [ 2186.277437] ? __pfx_lock_release+0x10/0x10 [ 2186.277446] ? avc_has_perm_noaudit+0xcc/0x250 [ 2186.277455] ? avc_has_perm_noaudit+0xcc/0x250 [ 2186.277471] security_transition_sid+0x63/0xa0 [ 2186.277483] may_create+0x16a/0x1c0 [ 2186.277493] ? __pfx_may_create+0x10/0x10 [ 2186.277501] ? selinux_inode_permission+0x1c6/0x290 [ 2186.277510] ? __pfx_selinux_inode_permission+0x10/0x10 [ 2186.277525] security_inode_mkdir+0x61/0x80 [ 2186.277534] vfs_mkdir+0x226/0x380 [ 2186.277547] do_mkdirat+0x1a8/0x1d0 [ 2186.277557] ? __pfx_do_mkdirat+0x10/0x10 [ 2186.277567] ? getname_flags.part.0+0xc6/0x250 [ 2186.277573] ? __pfx___x64_sys_chdir+0x10/0x10 [ 2186.277583] __x64_sys_mkdir+0x78/0xa0 [ 2186.277592] do_syscall_64+0x3c/0x90 [ 2186.277602] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 2186.277611] RIP: 0033:0x7f2cb7788d5b [ 2186.277637] Code: 8b 05 b1 30 0d 00 bb ff ff ff ff 64 c7 00 16 00 00 00 e9 62 ff ff ff e8 e3 1b 02 00 0f 1f 00 f3 0f 1e fa b8 53 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 79 30 0d 00 f7 d8 [ 2186.277645] RSP: 002b:00007ffe4833a5d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000053 [ 2186.277653] RAX: ffffffffffffffda RBX: 00007ffe4833a840 RCX: 00007f2cb7788d5b [ 2186.277658] RDX: 00007ffe4833a840 RSI: 00000000000001ff RDI: 00007ffe4833c310 [ 2186.277663] RBP: 00007ffe4833a660 R08: 00000000000001ff R09: 0000000000000001 [ 2186.277668] R10: 000055776dfb5252 R11: 0000000000000246 R12: 00007ffe4833c310 [ 2186.277673] R13: 00007ffe4833c2dc R14: 00007ffe4833c319 R15: 00007ffe4833c318 [ 2186.277688] </TASK>