Re: [PATCH v4 0/4] ceph: size handling for the fscrypt

On Tue, 2021-11-02 at 06:52 -0400, Jeff Layton wrote:
> On Tue, 2021-11-02 at 17:44 +0800, Xiubo Li wrote:
> > On 11/1/21 6:27 PM, Jeff Layton wrote:
> > > On Mon, 2021-11-01 at 10:04 +0800, xiubli@xxxxxxxxxx wrote:
> > > > From: Xiubo Li <xiubli@xxxxxxxxxx>
> > > > 
> > > > This patch series is based on the "fscrypt_size_handling" branch in
> > > > https://github.com/lxbsz/linux.git, which is based on Jeff's
> > > > "ceph-fscrypt-content-experimental" branch in
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git
> > > > plus two upstream commits, which should already be merged.
> > > > 
> > > > These two upstream commits should be dropped once Jeff rebases
> > > > his "ceph-fscrypt-content-experimental" branch onto upstream code.
> > > > 
> > > I don't think I was clear last time. I'd like for you to post the
> > > _entire_ stack of patches that is based on top of
> > > ceph-client/wip-fscrypt-fnames. wip-fscrypt-fnames is pretty stable at
> > > this point, so I think it's a reasonable place for you to base your
> > > work. That way you're not beginning with a revert.
> > 
> > Hi Jeff,
> > 
> > BTW, have you tested the ceph-client/wip-fscrypt-fnames branch with
> > the CONFIG_FS_ENCRYPTION option disabled?
> > 
> > I tried it today, and the kernel always crashes with the following
> > command. I tried many times; the terminal running 'cat /proc/kmsg'
> > always gets stuck without printing any call trace.
> > 
> > # mkdir dir && echo "123" > dir/testfile
> > 
> > With CONFIG_FS_ENCRYPTION enabled, I haven't encountered any issue yet.
> > 
> > I am still debugging it.
> > 
> > 
> 
> 
> No, I hadn't noticed that, but I can reproduce it too. AFAICT, bash is
> sitting in a pselect() call:
> 
> [jlayton@client1 ~]$ sudo cat /proc/1163/stack
> [<0>] poll_schedule_timeout.constprop.0+0x53/0xa0
> [<0>] do_select+0xb51/0xc70
> [<0>] core_sys_select+0x2ac/0x620
> [<0>] do_pselect.constprop.0+0x101/0x1b0
> [<0>] __x64_sys_pselect6+0x9a/0xc0
> [<0>] do_syscall_64+0x3b/0x90
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> After playing around a bit more, I saw this KASAN pop, which may be
> related:
> 
> [ 1046.013880] ==================================================================
> [ 1046.017053] BUG: KASAN: out-of-bounds in encode_cap_msg+0x76c/0xa80 [ceph]
> [ 1046.019441] Read of size 18446744071716025685 at addr ffff8881011bf558 by task kworker/7:1/82
> [ 1046.022243] 
> [ 1046.022785] CPU: 7 PID: 82 Comm: kworker/7:1 Tainted: G            E     5.15.0-rc6+ #43
> [ 1046.025421] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
> [ 1046.028159] Workqueue: ceph-msgr ceph_con_workfn [libceph]
> [ 1046.030111] Call Trace:
> [ 1046.030983]  dump_stack_lvl+0x57/0x72
> [ 1046.032177]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.033864]  print_address_description.constprop.0+0x1f/0x140
> [ 1046.035807]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.037221]  ? encode_cap_msg+0x76c/0xa80 [ceph]
> [ 1046.038680]  kasan_report.cold+0x7f/0x11b
> [ 1046.039853]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.041317]  ? encode_cap_msg+0x76c/0xa80 [ceph]
> [ 1046.042782]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.044168]  kasan_check_range+0xf5/0x1d0
> [ 1046.045325]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.046679]  memcpy+0x20/0x60
> [ 1046.047555]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.048930]  encode_cap_msg+0x76c/0xa80 [ceph]
> [ 1046.050383]  ? ceph_kvmalloc+0xdd/0x110 [libceph]
> [ 1046.051888]  ? ceph_msg_new2+0xf7/0x210 [libceph]
> [ 1046.053395]  __send_cap+0x40/0x180 [ceph]
> [ 1046.054696]  ceph_check_caps+0x5a2/0xc50 [ceph]
> [ 1046.056482]  ? deref_stack_reg+0xb0/0xb0
> [ 1046.057786]  ? ceph_con_workfn+0x224/0x8b0 [libceph]
> [ 1046.059471]  ? __ceph_should_report_size+0x90/0x90 [ceph]
> [ 1046.061190]  ? lock_is_held_type+0xe0/0x110
> [ 1046.062453]  ? find_held_lock+0x85/0xa0
> [ 1046.063684]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.065089]  ? lock_release+0x1c7/0x3e0
> [ 1046.066225]  ? wait_for_completion+0x150/0x150
> [ 1046.067570]  ? __ceph_caps_file_wanted+0x25a/0x380 [ceph]
> [ 1046.069319]  handle_cap_grant+0x113c/0x13a0 [ceph]
> [ 1046.070962]  ? ceph_kick_flushing_inode_caps+0x240/0x240 [ceph]
> [ 1046.081699]  ? __cap_is_valid+0x82/0x100 [ceph]
> [ 1046.091755]  ? rb_next+0x1e/0x80
> [ 1046.096640]  ? __ceph_caps_issued+0xe0/0x130 [ceph]
> [ 1046.101331]  ceph_handle_caps+0x10f9/0x2280 [ceph]
> [ 1046.106003]  ? mds_dispatch+0x134/0x2470 [ceph]
> [ 1046.110416]  ? ceph_remove_capsnap+0x90/0x90 [ceph]
> [ 1046.114901]  ? __mutex_lock+0x180/0xc10
> [ 1046.119178]  ? release_sock+0x1d/0xf0
> [ 1046.123331]  ? mds_dispatch+0xaf/0x2470 [ceph]
> [ 1046.127588]  ? __mutex_unlock_slowpath+0x105/0x3c0
> [ 1046.131845]  mds_dispatch+0x6fb/0x2470 [ceph]
> [ 1046.136002]  ? tcp_recvmsg+0xe0/0x2c0
> [ 1046.140038]  ? ceph_mdsc_handle_mdsmap+0x3c0/0x3c0 [ceph]
> [ 1046.144255]  ? wait_for_completion+0x150/0x150
> [ 1046.148235]  ceph_con_process_message+0xd9/0x240 [libceph]
> [ 1046.152387]  ? iov_iter_advance+0x8e/0x480
> [ 1046.156239]  process_message+0xf/0x100 [libceph]
> [ 1046.160219]  ceph_con_v2_try_read+0x1561/0x1b00 [libceph]
> [ 1046.164317]  ? __handle_control+0x1730/0x1730 [libceph]
> [ 1046.168345]  ? __lock_acquire+0x830/0x2c60
> [ 1046.172183]  ? __mutex_lock+0x180/0xc10
> [ 1046.175910]  ? ceph_con_workfn+0x41/0x8b0 [libceph]
> [ 1046.179814]  ? lockdep_hardirqs_on_prepare+0x220/0x220
> [ 1046.183688]  ? mutex_lock_io_nested+0xba0/0xba0
> [ 1046.187559]  ? lock_release+0x3e0/0x3e0
> [ 1046.191422]  ceph_con_workfn+0x224/0x8b0 [libceph]
> [ 1046.195464]  process_one_work+0x4fd/0x9a0
> [ 1046.199281]  ? pwq_dec_nr_in_flight+0x100/0x100
> [ 1046.203075]  ? rwlock_bug.part.0+0x60/0x60
> [ 1046.206787]  worker_thread+0x2d4/0x6e0
> [ 1046.210488]  ? process_one_work+0x9a0/0x9a0
> [ 1046.214254]  kthread+0x1e3/0x210
> [ 1046.217911]  ? set_kthread_struct+0x80/0x80
> [ 1046.221694]  ret_from_fork+0x22/0x30
> [ 1046.225553] 
> [ 1046.228927] The buggy address belongs to the page:
> [ 1046.232690] page:000000001ee14099 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1011bf
> [ 1046.237195] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
> [ 1046.241352] raw: 0017ffffc0000000 ffffea0004046fc8 ffffea0004046fc8 0000000000000000
> [ 1046.245998] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [ 1046.250612] page dumped because: kasan: bad access detected
> [ 1046.254948] 
> [ 1046.258789] addr ffff8881011bf558 is located in stack of task kworker/7:1/82 at offset 296 in frame:
> [ 1046.263501]  ceph_check_caps+0x0/0xc50 [ceph]
> [ 1046.267766] 
> [ 1046.271643] this frame has 3 objects:
> [ 1046.275934]  [32, 36) 'implemented'
> [ 1046.275941]  [48, 56) 'oldest_flush_tid'
> [ 1046.280091]  [80, 352) 'arg'
> [ 1046.284281] 
> [ 1046.291847] Memory state around the buggy address:
> [ 1046.295874]  ffff8881011bf400: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 f2 f2 f2
> [ 1046.300247]  ffff8881011bf480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1046.304752] >ffff8881011bf500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1046.309172]                                                     ^
> [ 1046.313414]  ffff8881011bf580: 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00
> [ 1046.318113]  ffff8881011bf600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1046.322543] ==================================================================
> 
> I'll keep investigating too.

Found it -- this patch seems to fix it. I'll plan to roll it into the
earlier patch that caused the bug, and will push an updated branch to
wip-fscrypt-fnames.

Good catch!
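
For reference, the "Read of size 18446744071716025685" in the KASAN report
above cannot be a real copy length. The following is only a minimal sketch
that uses nothing beyond the number printed in the report: it splits the
value into 32-bit halves. The helper names are hypothetical and do not come
from the ceph code.

```c
#include <assert.h>
#include <stdint.h>

/* Size printed by KASAN in the report above. */
#define REPORTED_SIZE 18446744071716025685ULL

/* Upper 32 bits of a 64-bit value. */
static uint32_t upper32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Lower 32 bits of a 64-bit value. */
static uint32_t lower32(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

/* A copy length is only believable if it fits in the space that is left. */
static int len_is_plausible(uint64_t len, uint64_t space_left)
{
	return len <= space_left;
}

/* Checks on the reported size; call this from a test harness. */
static void check_reported_size(void)
{
	assert(upper32(REPORTED_SIZE) == 0xffffffffu);
	assert(lower32(REPORTED_SIZE) == 0x892d3555u);
	/* 'arg' spans [80, 352) in the frame, i.e. 272 bytes. */
	assert(!len_is_plausible(REPORTED_SIZE, 272));
}
```

The upper half being all ones suggests that a garbage length reached
memcpy(), for example a negative or uninitialized value widened to 64 bits.
The report alone does not say which.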

--------------------------------------8<-------------------------------

[PATCH] SQUASH: fix cap encoding when fscrypt is disabled

Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
---
 fs/ceph/caps.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 6e9f4de883d1..80f521dd7254 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1312,11 +1312,16 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg)
 	ceph_encode_64(&p, 0);
 	ceph_encode_64(&p, 0);
 
+#if IS_ENABLED(CONFIG_FS_ENCRYPTION)
 	/* fscrypt_auth and fscrypt_file (version 12) */
 	ceph_encode_32(&p, arg->fscrypt_auth_len);
 	ceph_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len);
 	ceph_encode_32(&p, arg->fscrypt_file_len);
 	ceph_encode_copy(&p, arg->fscrypt_file, arg->fscrypt_file_len);
+#else /* CONFIG_FS_ENCRYPTION */
+	ceph_encode_32(&p, 0);
+	ceph_encode_32(&p, 0);
+#endif /* CONFIG_FS_ENCRYPTION */
 }
 
 /*
-- 
2.31.1
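
The idea behind the fix can be modelled in userspace. This is a rough
sketch, not the kernel code: the model_* names and the struct are
hypothetical, a runtime flag stands in for IS_ENABLED(CONFIG_FS_ENCRYPTION),
and it assumes that the length fields may hold garbage when fscrypt is
compiled out (the thread does not spell out the root cause). Both variants
still emit the two 32-bit length fields, so the layout of the message tail
stays the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fscrypt part of the cap message args. */
struct model_cap_args {
	uint32_t fscrypt_auth_len;
	uint32_t fscrypt_file_len;
	const uint8_t *fscrypt_auth;
	const uint8_t *fscrypt_file;
};

/* Append a 32-bit little-endian value and advance the cursor. */
static void model_encode_32(uint8_t **p, uint32_t v)
{
	(*p)[0] = (uint8_t)(v & 0xff);
	(*p)[1] = (uint8_t)((v >> 8) & 0xff);
	(*p)[2] = (uint8_t)((v >> 16) & 0xff);
	(*p)[3] = (uint8_t)((v >> 24) & 0xff);
	*p += 4;
}

/* Append len raw bytes and advance the cursor. */
static void model_encode_copy(uint8_t **p, const uint8_t *src, uint32_t len)
{
	if (len)
		memcpy(*p, src, len);
	*p += len;
}

/*
 * Encode the fscrypt tail into buf and return the number of bytes written.
 * With fscrypt "disabled", the args are never read: two zero lengths are
 * written instead, so a garbage length can never reach memcpy().
 */
static size_t model_encode_fscrypt_tail(uint8_t *buf,
					const struct model_cap_args *arg,
					int fscrypt_enabled)
{
	uint8_t *p = buf;

	assert(buf && arg);
	if (fscrypt_enabled) {
		model_encode_32(&p, arg->fscrypt_auth_len);
		model_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len);
		model_encode_32(&p, arg->fscrypt_file_len);
		model_encode_copy(&p, arg->fscrypt_file, arg->fscrypt_file_len);
	} else {
		model_encode_32(&p, 0);
		model_encode_32(&p, 0);
	}
	return (size_t)(p - buf);
}
```

In the disabled case the tail is always exactly 8 bytes of zeroes,
regardless of what the length fields contain.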





