[PATCH 1/1] [FSCACHE] Oops in fscache_op_complete due to race in decrementing refcount of op->n_pages

kiran.modukuri@xxxxxxxxx · Fri, 21 Sep 2018 10:50:10 -0700

From: "kiran.modukuri" <kiran.modukuri@xxxxxxxxx>

[Trace]
Seen on 4.4.x kernels; the same bug affects fscache in the latest upstream kernels.
Jun 25 11:32:08  kernel: [4740718.880898] FS-Cache:
Jun 25 11:32:08  kernel: [4740718.880920] FS-Cache: Assertion failed
Jun 25 11:32:08  kernel: [4740718.880934] FS-Cache: 0 > 0 is false
Jun 25 11:32:08  kernel: [4740718.881001] ------------[ cut here ]------------
Jun 25 11:32:08  kernel: [4740718.881017] kernel BUG at /usr/src/linux-4.4.0/fs/fscache/operation.c:449!
Jun 25 11:32:08  kernel: [4740718.881040] invalid opcode: 0000 [#1] SMP

Jun 25 11:32:08  kernel: [4740718.881656] CPU: 8 PID: 39555 Comm: kworker/u161:6 Tainted: P        W  OE   4.4.0-92-generic #115~14.04.1-Ubuntu
Jun 25 11:32:08  kernel: [4740718.881692] Hardware name: NVIDIA DGX-1/DGX-1, BIOS S2W_3A01.NVD02 11/03/2016
Jun 25 11:32:08  kernel: [4740718.881731] Workqueue: fscache_operation fscache_op_work_func [fscache]
Jun 25 11:32:08  kernel: [4740718.881756] task: ffff8833f0da0e00 ti: ffff8820e1104000 task.ti: ffff8820e1104000
Jun 25 11:32:08  kernel: [4740718.881781] RIP: 0010:[<ffffffffc037eacd>]  [<ffffffffc037eacd>] fscache_op_complete+0x10d/0x180 [fscache]
Jun 25 11:32:08  kernel: [4740718.881825] RSP: 0018:ffff8820e1107d68  EFLAGS: 00010282
Jun 25 11:32:08  kernel: [4740718.881850] RAX: 0000000000000018 RBX: ffff8822829d8000 RCX: 0000000000000006
Jun 25 11:32:08  kernel: [4740718.881879] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff883f7f60dd90
Jun 25 11:32:08  kernel: [4740718.881903] RBP: ffff8820e1107d88 R08: 0000000000000000 R09: ffff883f60f95a00
Jun 25 11:32:08  kernel: [4740718.883193] R10: 0000000000000157 R11: 00000000000011e7 R12: ffff88010527c300
Jun 25 11:32:08  kernel: [4740718.884105] R13: ffff883f3dbd1f00 R14: ffff8804e3ca6ee0 R15: ffff8822829d8000
Jun 25 11:32:08  kernel: [4740718.884932] FS:  0000000000000000(0000) GS:ffff883f7f600000(0000) knlGS:0000000000000000
Jun 25 11:32:08  kernel: [4740718.885747] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 25 11:32:08  kernel: [4740718.886537] CR2: 00007fa5a84d0000 CR3: 0000001b3921f000 CR4: 00000000003406e0
Jun 25 11:32:08  kernel: [4740718.887314] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 25 11:32:08  kernel: [4740718.888092] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jun 25 11:32:08  kernel: [4740718.888896] Stack:
Jun 25 11:32:08  kernel: [4740718.889663]  ffff8833f0da0e00 ffff88010527c300 ffff883f3dbd1f00 ffff8804e3ca6ee0
Jun 25 11:32:08  kernel: [4740718.890472]  ffff8820e1107df8 ffffffffc1464cf9 ffff8833f0da0e00 ffff8820e1108000
Jun 25 11:32:08  kernel: [4740718.891646]  ffff8822829d8088 0000000000000007 ffff8822829d8000 ffff88010527c420
Jun 25 11:32:08  kernel: [4740718.892659] Call Trace:
Jun 25 11:32:08  kernel: [4740718.893506]  [<ffffffffc1464cf9>] cachefiles_read_copier+0x3a9/0x410 [cachefiles]
Jun 25 11:32:08  kernel: [4740718.894374]  [<ffffffffc037e272>] fscache_op_work_func+0x22/0x50 [fscache]
Jun 25 11:32:08  kernel: [4740718.895180]  [<ffffffff81096da0>] process_one_work+0x150/0x3f0
Jun 25 11:32:08  kernel: [4740718.895966]  [<ffffffff8109751a>] worker_thread+0x11a/0x470
Jun 25 11:32:08  kernel: [4740718.896753]  [<ffffffff81808e59>] ? __schedule+0x359/0x980
Jun 25 11:32:08  kernel: [4740718.897783]  [<ffffffff81097400>] ? rescuer_thread+0x310/0x310
Jun 25 11:32:08  kernel: [4740718.898581]  [<ffffffff8109cdd6>] kthread+0xd6/0xf0
Jun 25 11:32:08  kernel: [4740718.899469]  [<ffffffff8109cd00>] ? kthread_park+0x60/0x60
Jun 25 11:32:08  kernel: [4740718.900477]  [<ffffffff8180d0cf>] ret_from_fork+0x3f/0x70
Jun 25 11:32:08  kernel: [4740718.901514]  [<ffffffff8109cd00>] ? kthread_park+0x60/0x60
Jun 25 11:32:08  kernel: [4740718.902376] Code: 48 c7 c7 1b 43 38 c0 e8 c1 51 e0 c0 48 c7 c7 29 43 38 c0 e8 b5 51 e0 c0 49 63 74 24 20 31 d2 48 c7 c7 d8 2e 38 c0 e8 a2 51 e0 c0 <0f> 0b 41 8b 54 24 24 85 d2 0f 8f 22 ff ff ff 48 c7 c7 1b 43 38
Jun 25 11:32:08  kernel: [4740718.904192] RIP  [<ffffffffc037eacd>] fscache_op_complete+0x10d/0x180 [fscache]
Jun 25 11:32:08  kernel: [4740718.905030]  RSP <ffff8820e1107d68>
Jun 25 11:32:08  kernel: [4740718.909227] ---[ end trace a11984da3948fae0 ]---

[Problem]
        atomic_sub(n_pages, &op->n_pages);
        if (atomic_read(&op->n_pages) <= 0)
                fscache_op_complete(&op->op, true);

fscache_retrieval_complete() does an atomic_sub() followed by a separate atomic_read().
The pair is not atomic as a whole, so two threads decrementing the page count can race:
both finish their atomic_sub() before either does its atomic_read(), both then see
op->n_pages at 0, and both call fscache_op_complete(), leading to the oops.
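
The interleaving described above can be replayed deterministically in userspace.
The following is only a sketch using C11 atomics, not kernel code: struct
retrieval_model, racy_sub(), racy_check() and replay_racy_interleaving() are
hypothetical names invented for illustration, and the completions counter
stands in for the call to fscache_op_complete().

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace model only: names are illustrative, not the kernel's. */
struct retrieval_model {
	atomic_int n_pages;	/* pages still outstanding */
	int completions;	/* how many times the completion path ran */
};

/* First half of the old pattern: the decrement. */
static void racy_sub(struct retrieval_model *op, int n_pages)
{
	atomic_fetch_sub(&op->n_pages, n_pages);
}

/* Second half of the old pattern: a separate read and test. */
static void racy_check(struct retrieval_model *op)
{
	if (atomic_load(&op->n_pages) <= 0)
		op->completions++;	/* stands in for fscache_op_complete() */
}

/* Replay one bad ordering of two callers, A and B, in a single thread. */
static int replay_racy_interleaving(void)
{
	struct retrieval_model op = { .completions = 0 };

	atomic_init(&op.n_pages, 2);

	racy_sub(&op, 1);	/* A: 2 -> 1 */
	racy_sub(&op, 1);	/* B: 1 -> 0 */
	racy_check(&op);	/* A reads 0 and completes */
	racy_check(&op);	/* B also reads 0 and completes again */

	assert(atomic_load(&op.n_pages) == 0);
	return op.completions;
}
```

In this model replay_racy_interleaving() returns 2: the operation is completed
twice, which corresponds to the second fscache_op_complete() call in the trace.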

[Fix]

The fix is trivial: use atomic_sub_return() so that each caller tests the value
produced by its own decrement, instead of making two separate calls.
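
As a rough userspace illustration of why the single call helps, the sketch
below models atomic_sub_return() with C11 atomic_fetch_sub(), which returns
the old value, so the new value is the result minus n. The names
fixed_retrieval_model, model_sub_return(), retrieval_complete_model() and
replay_fixed_interleaving() are hypothetical, and the model assumes the
decrements add up to exactly the initial page count.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace model only: names are illustrative, not the kernel's. */
struct fixed_retrieval_model {
	atomic_int n_pages;	/* pages still outstanding */
	int completions;	/* how many times the completion path ran */
};

/*
 * Model of atomic_sub_return(): atomic_fetch_sub() returns the value
 * before the subtraction, so subtract n again to get the new value.
 */
static int model_sub_return(atomic_int *v, int n)
{
	return atomic_fetch_sub(v, n) - n;
}

/* Fixed pattern: decrement and test the result of that same decrement. */
static void retrieval_complete_model(struct fixed_retrieval_model *op,
				     int n_pages)
{
	if (model_sub_return(&op->n_pages, n_pages) <= 0)
		op->completions++;	/* stands in for fscache_op_complete() */
}

/* Two callers, A and B, each retire one of two outstanding pages. */
static int replay_fixed_interleaving(void)
{
	struct fixed_retrieval_model op = { .completions = 0 };

	atomic_init(&op.n_pages, 2);

	retrieval_complete_model(&op, 1);	/* A: 2 -> 1, no completion */
	retrieval_complete_model(&op, 1);	/* B: 1 -> 0, completes once */

	assert(atomic_load(&op.n_pages) == 0);
	return op.completions;
}
```

Here replay_fixed_interleaving() returns 1. Because each caller only sees the
value its own subtraction produced, there is no window between the decrement
and the test in which another caller can change what is observed.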

Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@xxxxxxxxx>
---
 include/linux/fscache-cache.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h
index 34cf0fd..bf98ed8 100644
--- a/include/linux/fscache-cache.h
+++ b/include/linux/fscache-cache.h
@@ -196,11 +196,11 @@ static inline void fscache_enqueue_retrieval(struct fscache_retrieval *op)
 static inline void fscache_retrieval_complete(struct fscache_retrieval *op,
 					      int n_pages)
 {
-	atomic_sub(n_pages, &op->n_pages);
-	if (atomic_read(&op->n_pages) <= 0)
+	if (atomic_sub_return(n_pages, &op->n_pages) <= 0)
 		fscache_op_complete(&op->op, false);
 }
 
+
 /**
  * fscache_put_retrieval - Drop a reference to a retrieval operation
  * @op: The retrieval operation affected
-- 
2.7.4
