On 7/19/2024 10:13 AM, Paul Moore wrote:
On Fri, Jul 12, 2024 at 5:44 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
On Thu, Jul 11, 2024 at 7:13 AM Xu Kuohai <xukuohai@xxxxxxxxxxxxxxx> wrote:
From: Xu Kuohai <xukuohai@xxxxxxxxxx>
An LSM BPF prog that returns a positive number when attached to the
file_alloc_security hook causes a kernel panic.
Here is a panic log:
[ 441.235774] BUG: kernel NULL pointer dereference, address: 00000000000009
[ 441.236748] #PF: supervisor write access in kernel mode
[ 441.237429] #PF: error_code(0x0002) - not-present page
[ 441.238119] PGD 800000000b02f067 P4D 800000000b02f067 PUD b031067 PMD 0
[ 441.238990] Oops: 0002 [#1] PREEMPT SMP PTI
[ 441.239546] CPU: 0 PID: 347 Comm: loader Not tainted 6.8.0-rc6-gafe0cbf23373 #22
[ 441.240496] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b4
[ 441.241933] RIP: 0010:alloc_file+0x4b/0x190
[ 441.242485] Code: 8b 04 25 c0 3c 1f 00 48 8b b0 30 0c 00 00 e8 9c fe ff ff 48 3d 00 f0 ff fb
[ 441.244820] RSP: 0018:ffffc90000c67c40 EFLAGS: 00010203
[ 441.245484] RAX: ffff888006a891a0 RBX: ffffffff8223bd00 RCX: 0000000035b08000
[ 441.246391] RDX: ffff88800b95f7b0 RSI: 00000000001fc110 RDI: f089cd0b8088ffff
[ 441.247294] RBP: ffffc90000c67c58 R08: 0000000000000001 R09: 0000000000000001
[ 441.248209] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000001
[ 441.249108] R13: ffffc90000c67c78 R14: ffffffff8223bd00 R15: fffffffffffffff4
[ 441.250007] FS: 00000000005f3300(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[ 441.251053] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 441.251788] CR2: 00000000000001a9 CR3: 000000000bdc4003 CR4: 0000000000170ef0
[ 441.252688] Call Trace:
[ 441.253011] <TASK>
[ 441.253296] ? __die+0x24/0x70
[ 441.253702] ? page_fault_oops+0x15b/0x480
[ 441.254236] ? fixup_exception+0x26/0x330
[ 441.254750] ? exc_page_fault+0x6d/0x1c0
[ 441.255257] ? asm_exc_page_fault+0x26/0x30
[ 441.255792] ? alloc_file+0x4b/0x190
[ 441.256257] alloc_file_pseudo+0x9f/0xf0
[ 441.256760] __anon_inode_getfile+0x87/0x190
[ 441.257311] ? lock_release+0x14e/0x3f0
[ 441.257808] bpf_link_prime+0xe8/0x1d0
[ 441.258315] bpf_tracing_prog_attach+0x311/0x570
[ 441.258916] ? __pfx_bpf_lsm_file_alloc_security+0x10/0x10
[ 441.259605] __sys_bpf+0x1bb7/0x2dc0
[ 441.260070] __x64_sys_bpf+0x20/0x30
[ 441.260533] do_syscall_64+0x72/0x140
[ 441.261004] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 441.261643] RIP: 0033:0x4b0349
[ 441.262045] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 88
[ 441.264355] RSP: 002b:00007fff74daee38 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
[ 441.265293] RAX: ffffffffffffffda RBX: 00007fff74daef30 RCX: 00000000004b0349
[ 441.266187] RDX: 0000000000000040 RSI: 00007fff74daee50 RDI: 000000000000001c
[ 441.267114] RBP: 000000000000001b R08: 00000000005ef820 R09: 0000000000000000
[ 441.268018] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[ 441.268907] R13: 0000000000000004 R14: 00000000005ef018 R15: 00000000004004e8
This is because the filesystem uses IS_ERR to check if the return value
is an error code. If it is not, the filesystem takes the return value
as a file pointer. Since the positive number returned by the BPF prog
is not a real file pointer, this misinterpretation causes a panic.
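The misinterpretation described above can be sketched in userspace C. This is a minimal re-creation of the kernel's error-pointer convention from include/linux/err.h, not the actual alloc_file code path; the names err_ptr, is_err and treated_as_valid_pointer are hypothetical stand-ins chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* The kernel defines MAX_ERRNO as 4095 in include/linux/err.h. */
#define MAX_ERRNO 4095

/* Stand-in for ERR_PTR(): encode an integer return value as a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Stand-in for IS_ERR(): only the top MAX_ERRNO addresses count as errors. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical helper: true when a nonzero hook return value, once
 * encoded as a pointer, would pass an is_err() check and therefore be
 * dereferenced as if it were a real object.
 */
static inline bool treated_as_valid_pointer(long hook_ret)
{
	return hook_ret != 0 && !is_err(err_ptr(hook_ret));
}
```

Under these assumptions, a return value of -ENOMEM (-12) is recognised as an error, while a return value of 1 becomes the bogus pointer 0x1, which is not recognised as an error and would be used as an object address.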
Since other LSM modules always return either a negative error code
or a valid pointer, this specific issue only exists in BPF LSM. The
proposed solution is to reject LSM BPF progs returning unexpected
values in the verifier. This patch set adds a return value check to
ensure only BPF progs returning expected values are accepted.
Since each LSM hook has different expected return values, we need to
know the expected return values for each individual hook to do the
check. Earlier versions of the patch set used LSM hook annotations
to specify the return value range for each hook. Based on Paul's
suggestion, the current version gets rid of such annotations and instead
converts hook return values to a common pattern: return 0 on success
and negative error code on failure.
Basically, LSM hooks are divided into two types: hooks that return a
negative error code and zero or other values, and hooks that do not
return a negative error code. This patch set converts all hooks of the
first type and part of the second type to return 0 on success and a
negative error code on failure (see patches 1-10). For certain hooks,
like ismaclabel and inode_xattr_skipcap, the hook names already imply
that returning 0 or 1 is the best choice, so they are not converted.
There are four unconverted hooks. Except for ismaclabel, which is not
used by BPF LSM, the other three are specified with a BTF ID list to
only return 0 or 1.
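The per-hook return value check described above might look roughly like the following userspace sketch. It is not the verifier code from the patch set: the enum, the struct and both function names are hypothetical, and the lower bound of -MAX_ERRNO is an assumption (the text above only says "negative error code"). The real verifier also reasons about the range of values a program can return, not a single concrete value.

```c
#include <stdbool.h>

/* The kernel defines MAX_ERRNO as 4095; using it as the bound is an assumption. */
#define MAX_ERRNO 4095

/* Hypothetical classification of LSM hooks, for illustration only. */
enum hook_kind {
	HOOK_RET_ERRNO,	/* converted hooks: 0 on success, negative errno on failure */
	HOOK_RET_BOOL,	/* unconverted hooks restricted to 0 or 1 */
};

struct retval_range {
	int minval;
	int maxval;
};

/* Return the range of values a program attached to this kind of hook may return. */
static struct retval_range expected_range(enum hook_kind kind)
{
	struct retval_range r;

	if (kind == HOOK_RET_BOOL) {
		r.minval = 0;
		r.maxval = 1;
	} else {
		r.minval = -MAX_ERRNO;
		r.maxval = 0;
	}
	return r;
}

/* Accept a return value only if it falls inside the hook's expected range. */
static bool retval_allowed(enum hook_kind kind, int ret)
{
	struct retval_range r = expected_range(kind);

	return ret >= r.minval && ret <= r.maxval;
}
```

With this sketch, a program returning 1 would be rejected for an errno-style hook such as file_alloc_security but accepted for a hook that is limited to 0 or 1.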
Thank you for following up on your initial work with this patchset, Xu
Kuohai. It doesn't look like I'm going to be able to finish my review
by the end of the day today, so expect that a bit later, but so far I
think most of the changes look good and provide a nice improvement :)
You should have my feedback now, let me know if you have any questions.
One additional comment I might make is that you may want to
wait until after v6.11-rc1 is released and I've had a chance to rebase
the lsm/{dev,next} branches and merge the patchsets which are
currently queued; there are a few patches queued up which will have an
impact on this work. While it's an unstable branch, you can take a
peek at those queued patches in the lsm/dev-staging branch.
https://github.com/LinuxSecurityModule/kernel/blob/main/README.md
Got it, thanks for your valuable time and feedback! I'll reply to the
individual comments once I'm sure I understand them or have confirmed
the next step.
Additionally, for the next update, I'll split the series into two,
as the refactoring patches and the BPF patches are not closely
related.