Add the ceph-devel in this thread, sorry to missing it.
On 2019/11/14 19:34, Jeff Layton wrote:
On Wed, 2019-11-13 at 21:37 -0500, xiubli@xxxxxxxxxx wrote:
From: Xiubo Li <xiubli@xxxxxxxxxx>
Any reason not to cc ceph-devel here? Was that just an oversight, or do
you think this needs a security embargo?
Sorry, just pressed the button and post it by mistake without the
ceph-devel list, and I had post it again just after this with the list
yesterday.
When mounting if the server path has more than one slash, such as:
'mount.ceph 192.168.195.165:40176:/// /mnt/cephfs/'
In the MDS server side the extra slash will be treated as snap dir,
and then we can get the following debug logs:
It sort of sounds like the real problem is in the MDS.
Shouldn't it just be ignoring the extra '/' characters? I'm not a huge
fan of adding in this complex handling to work around an MDS bug.
Yeah, the MDS is buggy too and I will try to fix it later.
But here in the kclient IMO we still need to ignore the extra slashes in
case that if some one want to do or by type mistake:
1, 'mount.ceph 192.168.195.165:40176:/mydir/ /mnt/cephfs/'
2, 'mount.ceph 192.168.195.165:40176:/mydir// /mnt/cephfs/'
or
1, 'mount.ceph 192.168.195.165:40176:/mydir/ /mnt/cephfs/'
2, 'mount.ceph 192.168.195.165:40176://mydir /mnt/cephfs/'
Both pairs above should be the same in theory, but when running the
Step2 and comparing and searching a existing same sb, it will fail and
then create a new one. The code like:
ceph_mount()-->sget() --> ceph_compare_super() --> compare_mount_options():
ret = strcmp_null(fsopt1->server_path, fsopt2->server_path);
Currently the server_paths here will only remove the leading slash in
the server paths just before mount opening.
With this fix we can do nothing for the Step2 and just return with:
"mount error 16 = Device or resource busy"
<7>[ ...] ceph: mount opening path //
<7>[ ...] ceph: open_root_inode opening '//'
<7>[ ...] ceph: fill_trace 0000000059b8a3bc is_dentry 0 is_target 1
<7>[ ...] ceph: alloc_inode 00000000dc4ca00b
<7>[ ...] ceph: get_inode created new inode 00000000dc4ca00b 1.ffffffffffffffff ino 1
<7>[ ...] ceph: get_inode on 1=1.ffffffffffffffff got 00000000dc4ca00b
And then when creating any new file or directory under the mount
point, we can get the following crash core dump:
<4>[ ...] ------------[ cut here ]------------
<2>[ ...] kernel BUG at fs/ceph/inode.c:1347!
Which BUG_ON is this? I'd guess one of these, but it'd be good to
confirm it:
BUG_ON(ceph_ino(dir) != dvino.ino);
BUG_ON(ceph_snap(dir) != dvino.snap);
Yeah, It is the second one and the bug is:
https://tracker.ceph.com/issues/42771
<4>[ ...] invalid opcode: 0000 [#1] SMP PTI
<4>[ ...] CPU: 0 PID: 7 Comm: kworker/0:1 Tainted: G E 5.4.0-rc5+ #1
<4>[ ...] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
<4>[ ...] Workqueue: ceph-msgr ceph_con_workfn [libceph]
<4>[ ...] RIP: 0010:ceph_fill_trace+0x992/0xb30 [ceph]
<4>[ ...] Code: ff 0f 0b 0f 0b 0f 0b 4c 89 fa 48 c7 c6 4d [...]
<4>[ ...] RSP: 0018:ffffa23d40067c70 EFLAGS: 00010297
<4>[ ...] RAX: fffffffffffffffe RBX: ffff8a229eb566c0 RCX: 0000000000000006
<4>[ ...] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff8a23aec17900
<4>[ ...] RBP: ffff8a226bd91eb0 R08: 0000000000000001 R09: 0000000000000885
<4>[ ...] R10: 000000000002dfd8 R11: ffff8a226bd95b30 R12: ffff8a239347e000
<4>[ ...] R13: 0000000000000000 R14: ffff8a22fabeb000 R15: ffff8a2338b0c900
<4>[ ...] FS: 0000000000000000(0000) GS:ffff8a23aec00000(0000) knlGS:0000000000000000
<4>[ ...] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ ...] CR2: 000055b479d92068 CR3: 00000003764f6004 CR4: 00000000003606f0
<4>[ ...] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[ ...] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[ ...] Call Trace:
<4>[ ...] dispatch+0x2ac/0x12b0 [ceph]
<4>[ ...] ceph_con_workfn+0xd40/0x27c0 [libceph]
<4>[ ...] process_one_work+0x1b0/0x350
<4>[ ...] worker_thread+0x50/0x3b0
<4>[ ...] kthread+0xfb/0x130
<4>[ ...] ? process_one_work+0x350/0x350
<4>[ ...] ? kthread_park+0x90/0x90
<4>[ ...] ret_from_fork+0x35/0x40
<4>[ ...] Modules linked in: ceph(E) libceph fscache [...]
<4>[ ...] ---[ end trace ba883d8ccf9afcb0 ]---
<4>[ ...] RIP: 0010:ceph_fill_trace+0x992/0xb30 [ceph]
<4>[ ...] Code: ff 0f 0b 0f 0b 0f 0b 4c 89 fa 48 c7 c6 [...]
<4>[ ...] RSP: 0018:ffffa23d40067c70 EFLAGS: 00010297
<4>[ ...] RAX: fffffffffffffffe RBX: ffff8a229eb566c0 RCX: 0000000000000006
<4>[ ...] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff8a23aec17900
<4>[ ...] RBP: ffff8a226bd91eb0 R08: 0000000000000001 R09: 0000000000000885
<4>[ ...] R10: 000000000002dfd8 R11: ffff8a226bd95b30 R12: ffff8a239347e000
<4>[ ...] R13: 0000000000000000 R14: ffff8a22fabeb000 R15: ffff8a2338b0c900
<4>[ ...] FS: 0000000000000000(0000) GS:ffff8a23aec00000(0000) knlGS:0000000000000000
<4>[ ...] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ ...] CR2: 000055b479d92068 CR3: 00000003764f6004 CR4: 00000000003606f0
<4>[ ...] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[ ...] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
And we should ignore the extra slashes in the server path when mount
opening in case users have added them by mistake:
"//mydir1///mydir//" --> "mydir1/mydir/"
Shouldn't that be:
"//mydir1///mydir//" --> "/mydir1/mydir/"
Why strip off all of the leading slashes?
There is one old commit From Yan, Zheng:
commit ce2728aaa82bbebae7d20345324af3f0f49eeb20
Author: Yan, Zheng <zyan@xxxxxxxxxx>
Date: Wed Sep 14 14:53:05 2016 +0800
ceph: avoid accessing / when mounting a subpath
Accessing / causes failuire if the client has caps that restrict path
Signed-off-by: Yan, Zheng <zyan@xxxxxxxxxx>
This will remove the leading '/' when mount opening.
Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
---
fs/ceph/super.c | 71 ++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 61 insertions(+), 10 deletions(-)
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index b47f43fc2d68..91985c53e611 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -430,6 +430,39 @@ static int strcmp_null(const char *s1, const char *s2)
return strcmp(s1, s2);
}
+static char *path_remove_extra_slash(const char *path)
+{
+ bool last_is_slash;
+ int i, j;
+ int len;
+ char *p;
+
+ if (!path)
+ return NULL;
+
+ len = strlen(path);
+
+ p = kmalloc(len, GFP_NOFS);
+ if (!p)
+ return ERR_PTR(-ENOMEM);
+
Given that this should only be called when mounting, you can probably
use GFP_KERNEL here.
Yeah, will fix it.
BRs
Xiubo
+ last_is_slash = false;
+ for (j = 0, i = 0; i < len; i++) {
+ if (path[i] == '/') {
+ if (last_is_slash)
+ continue;
+ last_is_slash = true;
+ } else {
+ last_is_slash = false;
+ }
+ p[j++] = path[i];
+ }
+
+ p[j] = '\0';
+
+ return p;
+}
+
static int compare_mount_options(struct ceph_mount_options *new_fsopt,
struct ceph_options *new_opt,
struct ceph_fs_client *fsc)
@@ -437,6 +470,7 @@ static int compare_mount_options(struct ceph_mount_options *new_fsopt,
struct ceph_mount_options *fsopt1 = new_fsopt;
struct ceph_mount_options *fsopt2 = fsc->mount_options;
int ofs = offsetof(struct ceph_mount_options, snapdir_name);
+ char *p1, *p2;
int ret;
ret = memcmp(fsopt1, fsopt2, ofs);
@@ -449,9 +483,21 @@ static int compare_mount_options(struct ceph_mount_options *new_fsopt,
ret = strcmp_null(fsopt1->mds_namespace, fsopt2->mds_namespace);
if (ret)
return ret;
- ret = strcmp_null(fsopt1->server_path, fsopt2->server_path);
+
+ p1 = path_remove_extra_slash(fsopt1->server_path);
+ if (IS_ERR(p1))
+ return PTR_ERR(p1);
+ p2 = path_remove_extra_slash(fsopt2->server_path);
+ if (IS_ERR(p2)) {
+ kfree(p1);
+ return PTR_ERR(p2);
+ }
+ ret = strcmp_null(p1, p2);
+ kfree(p1);
+ kfree(p2);
if (ret)
return ret;
+
ret = strcmp_null(fsopt1->fscache_uniq, fsopt2->fscache_uniq);
if (ret)
return ret;
@@ -507,12 +553,10 @@ static int parse_mount_options(struct ceph_mount_options **pfsopt,
*/
dev_name_end = strchr(dev_name, '/');
if (dev_name_end) {
- if (strlen(dev_name_end) > 1) {
- fsopt->server_path = kstrdup(dev_name_end, GFP_KERNEL);
- if (!fsopt->server_path) {
- err = -ENOMEM;
- goto out;
- }
+ fsopt->server_path = kstrdup(dev_name_end, GFP_KERNEL);
+ if (!fsopt->server_path) {
+ err = -ENOMEM;
+ goto out;
}
} else {
dev_name_end = dev_name + strlen(dev_name);
@@ -945,7 +989,7 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc)
mutex_lock(&fsc->client->mount_mutex);
if (!fsc->sb->s_root) {
- const char *path;
+ const char *path, *p;
err = __ceph_open_session(fsc->client, started);
if (err < 0)
goto out;
@@ -957,17 +1001,24 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc)
goto out;
}
- if (!fsc->mount_options->server_path) {
+ p = path_remove_extra_slash(fsc->mount_options->server_path);
+ if (IS_ERR(p)) {
+ err = PTR_ERR(p);
+ goto out;
+ }
+ /* if the server path is omitted in the dev_name or just '/' */
+ if (!p || (p && strlen(p) == 1)) {
path = "";
dout("mount opening path \\t\n");
} else {
- path = fsc->mount_options->server_path + 1;
+ path = p + 1;
dout("mount opening path %s\n", path);
}
ceph_fs_debugfs_init(fsc);
root = open_root_dentry(fsc, path, started);
+ kfree(p);
if (IS_ERR(root)) {
err = PTR_ERR(root);
goto out;