Is it actually a deadlock, or just missing nesting annotation? I'm unable to reproduce (and this was one of my test cases). On my device I use: while true; do load_policy /vendor/etc/selinux/precompiled_sepolicy; echo -n .; sleep 0.1; done & for i in $(seq 1 256); do runcon u:r:untrusted_app:s0:c$i,c256 echo -n x || break done dmesg is full of: <snip> [ 73.731342] c4 8041 SELinux: Converting 1694 SID table entries... [ 73.740946] c4 8041 SELinux: policy capability network_peer_controls=1 [ 73.747800] c4 8041 SELinux: policy capability open_perms=1 [ 73.759296] c4 8041 SELinux: policy capability always_check_network=0 [ 73.771677] c4 8041 SELinux: policy capability nnp_nosuid_transition=1 [ 73.957571] c3 8049 SELinux: Converting 1700 SID table entries... [ 73.968435] c4 8049 SELinux: policy capability network_peer_controls=1 [ 73.975278] c4 8049 SELinux: policy capability open_perms=1 [ 73.986776] c4 8049 SELinux: policy capability always_check_network=0 [ 73.999145] c4 8049 SELinux: policy capability nnp_nosuid_transition=1 <snip> On Mon, Nov 4, 2019 at 9:44 PM Stephen Smalley <sds@xxxxxxxxxxxxx> wrote: > > On 11/4/19 2:15 PM, Jeff Vander Stoep wrote: > > This replaces the reverse table lookup and reverse cache with a > > hashtable which improves cache-miss reverese-lookup times from > > O(n) to O(1)* and maintains the same performance as a reverse > > cache hit. > > > > This reduces the time needed to add a new sidtab entry from ~500us > > to 5us on a Pixel 3 when there are ~10,000 sidtab entries. > > > > The implementation uses the kernel's generic hashtable API, > > It uses the context's string represtation as the hash source, > > and the kernels generic string hashing algorithm full_name_hash() > > to reduce the string to a 32 bit value. > > > > This change also maintains the improvement introduced in commit > > ee1a84fd which removed the need to keep the current sidtab locked > > during policy reload. It does however introduce periodic locking of > > the target sidtab while converting the hashtable. Sidtab entries > > are never modified or removed, so the context struct stored in the > > sid_to_context tree can also be used for the context_to_sid > > hashtable to reduce memory usage. > > > > This bug was reported by: > > - Stephen Smally on the selinux bug tracker. > > BUG: kernel softlockup due to too many SIDs/contexts #37 > > https://github.com/SELinuxProject/selinux-kernel/issues/37 > > - Jovana Knezevic on Android's bugtracker. > > Bug: 140252993 > > "During multi-user performance testing, we create and remove users > > many times. selinux_android_restorecon_pkgdir goes from 1ms to over > > 20ms after about 200 user creations and removals. Accumulated over > > ~280 packages, that adds a significant time to user creation, > > making perf benchmarks unreliable." > > > > * Hashtable lookup is only O(1) when n < the number of buckets. > > > > Changes in V2: > > -The hashtable uses sidtab_entry_leaf objects directly so these > > objects are shared between the sid_to_context lookup tree and the > > context_to_sid hashtable. > > -Move the hash_add() statement to after the smp_store_release() > > to ensure that the contents have been written before the object > > becomes available in the hashtable. > > Don't forget to cc Paul and anyone who commented previously (e.g. Ondrej). > > I could be wrong, but I think that only helps if we also do a > smp_load_acquire() prior to reading/walking the hash table and if there > is an actual data dependency between the target of the store/load and > what we are trying to protect here. I don't think that's the case > here. So I don't think you can or should piggyback on the existing > load_acquire/store_release pair. Ondrej can correct me if I am wrong > (entirely possible). > > I also get this splat upon running Ondrej's reproducer for the bug fixed > by his sidtab rewrite. > > Reproducer: > > while true; do load_policy; echo -n .; sleep 0.1; done & > for (( i = 0; i < 1024; i++ )); do > runcon -l s0:c$i echo -n x || break > # or: > # chcon -l s0:c$i <some_file> || break > done > > Splat: > > 6>[ 1070.843961] SELinux: Converting 3927 SID table entries... > <4>[ 1070.858060] > <4>[ 1070.858064] ============================================ > <4>[ 1070.858066] WARNING: possible recursive locking detected > <4>[ 1070.858068] 5.4.0-rc1+ #50 Not tainted > <4>[ 1070.858070] -------------------------------------------- > <4>[ 1070.858072] runcon/13122 is trying to acquire lock: > <4>[ 1070.858074] ffff88869467c248 (&(&s->lock)->rlock){..-.}, at: > sidtab_context_to_sid+0x379/0x700 > <4>[ 1070.858080] > <4>[ 1070.858080] but task is already holding lock: > <4>[ 1070.858082] ffff8886e608c248 (&(&s->lock)->rlock){..-.}, at: > sidtab_context_to_sid+0x64/0x700 > <4>[ 1070.858086] > <4>[ 1070.858086] other info that might help us debug this: > <4>[ 1070.858088] Possible unsafe locking scenario: > <4>[ 1070.858088] > <4>[ 1070.858089] CPU0 > <4>[ 1070.858090] ---- > <4>[ 1070.858091] lock(&(&s->lock)->rlock); > <4>[ 1070.858093] lock(&(&s->lock)->rlock); > <4>[ 1070.858095] > <4>[ 1070.858095] *** DEADLOCK *** > <4>[ 1070.858095] > <4>[ 1070.858097] May be due to missing lock nesting notation > <4>[ 1070.858097] > <4>[ 1070.858099] 3 locks held by runcon/13122: > <4>[ 1070.858100] #0: ffff8887fa97c820 (sb_writers#9){.+.+}, at: > vfs_write+0x215/0x280 > <4>[ 1070.858105] #1: ffffffffae0230e8 > (&selinux_ss.policy_rwlock){.+.?}, at: > security_context_to_sid_core+0x2c0/0x340 > <4>[ 1070.858109] #2: ffff8886e608c248 (&(&s->lock)->rlock){..-.}, at: > sidtab_context_to_sid+0x64/0x700 > <4>[ 1070.858113] > <4>[ 1070.858113] stack backtrace: > <4>[ 1070.858116] CPU: 1 PID: 13122 Comm: runcon Not tainted 5.4.0-rc1+ #50 > <4>[ 1070.858118] Hardware name: Dell Inc. OptiPlex 7050/0NW6H5, BIOS > 1.8.3 03/23/2018 > <4>[ 1070.858120] Call Trace: > <4>[ 1070.858124] dump_stack+0x9a/0xf0 > <4>[ 1070.858128] __lock_acquire.cold+0x12e/0x212 > <4>[ 1070.858136] ? lockdep_hardirqs_on+0x260/0x260 > <4>[ 1070.858142] lock_acquire+0xe5/0x210 > <4>[ 1070.858160] ? sidtab_context_to_sid+0x379/0x700 > <4>[ 1070.858164] _raw_spin_lock_irqsave+0x44/0x60 > <4>[ 1070.858166] ? sidtab_context_to_sid+0x379/0x700 > <4>[ 1070.858169] sidtab_context_to_sid+0x379/0x700 > <4>[ 1070.858173] ? security_compute_sid.part.0+0x960/0x960 > <4>[ 1070.858177] security_context_to_sid_core+0x177/0x340 > <4>[ 1070.858180] ? context_to_sid.isra.0+0x60/0x60 > <4>[ 1070.858182] ? match_held_lock+0x1b/0x240 > <4>[ 1070.858200] ? __might_fault+0x6f/0xd0 > <4>[ 1070.858203] security_context_to_sid+0x12/0x20 > <4>[ 1070.858206] sel_write_context+0x19b/0x200 > <4>[ 1070.858209] ? selinux_transaction_write+0xa0/0xa0 > <4>[ 1070.858213] ? _copy_from_user+0xa1/0xd0 > <4>[ 1070.858216] ? selinux_transaction_write+0xa0/0xa0 > <4>[ 1070.858218] selinux_transaction_write+0x72/0xa0 > <4>[ 1070.858221] vfs_write+0x12f/0x280 > <4>[ 1070.858224] ksys_write+0xc3/0x160 > <4>[ 1070.858227] ? __ia32_sys_read+0x50/0x50 > <4>[ 1070.858230] ? lockdep_hardirqs_off+0xbe/0x100 > <4>[ 1070.858232] ? mark_held_locks+0x24/0x90 > <4>[ 1070.858236] do_syscall_64+0x74/0xd0 > <4>[ 1070.858239] entry_SYSCALL_64_after_hwframe+0x49/0xbe > <4>[ 1070.858242] RIP: 0033:0x7f7e94d3a467 > <4>[ 1070.858245] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 > 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 > 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 > <4>[ 1070.858247] RSP: 002b:00007ffd48dbae78 EFLAGS: 00000246 ORIG_RAX: > 0000000000000001 > <4>[ 1070.858250] RAX: ffffffffffffffda RBX: 00007ffd48dbc038 RCX: > 00007f7e94d3a467 > <4>[ 1070.858252] RDX: 000000000000002f RSI: 000055e32f081600 RDI: > 0000000000000003 > <4>[ 1070.858254] RBP: 000055e32f081600 R08: 00000000ffffffff R09: > 00007ffd48dbad10 > <4>[ 1070.858255] R10: 0000000000000000 R11: 0000000000000246 R12: > 0000000000000003 > <4>[ 1070.858257] R13: 00007ffd48dbbefc R14: 0000000000000000 R15: > 0000000000000000 > > > -The new sidtab hash stats file in selinuxfs has been moved out of > > the avc dir and into a new "ss" dir. > > > > Signed-off-by: Jeff Vander Stoep <jeffv@xxxxxxxxxx> > > Reported-by: Stephen Smalley <sds@xxxxxxxxxxxxx> > > Reported-by: Jovana Knezevic <jovanak@xxxxxxxxxx> > > --- > > security/selinux/Kconfig | 12 ++ > > security/selinux/include/security.h | 1 + > > security/selinux/selinuxfs.c | 64 +++++++ > > security/selinux/ss/context.h | 11 +- > > security/selinux/ss/policydb.c | 5 + > > security/selinux/ss/services.c | 87 +++++++--- > > security/selinux/ss/services.h | 4 +- > > security/selinux/ss/sidtab.c | 249 +++++++++++++--------------- > > security/selinux/ss/sidtab.h | 16 +- > > 9 files changed, 287 insertions(+), 162 deletions(-) > > > <snip>