On Wed, Jul 7, 2021 at 12:05 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Fri, 2021-07-02 at 20:27 +0530, Venky Shankar wrote: > > On Fri, Jul 2, 2021 at 7:20 PM Luis Henriques <lhenriques@xxxxxxx> wrote: > > > > > > On Fri, Jul 02, 2021 at 04:40:18PM +0530, Venky Shankar wrote: > > > > On Fri, Jul 2, 2021 at 4:14 PM Luis Henriques <lhenriques@xxxxxxx> wrote: > > > > > > > > > > On Fri, Jul 02, 2021 at 12:18:19PM +0530, Venky Shankar wrote: > > > > > > The new device syntax requires the cluster FSID as part > > > > > > of the device string. Use this FSID to verify if it matches > > > > > > the cluster FSID we get back from the monitor, failing the > > > > > > mount on mismatch. > > > > > > > > > > > > Also, rename parse_fsid() to ceph_parse_fsid() as it is too > > > > > > generic. > > > > > > > > > > > > Signed-off-by: Venky Shankar <vshankar@xxxxxxxxxx> > > > > > > --- > > > > > > fs/ceph/super.c | 9 +++++++++ > > > > > > fs/ceph/super.h | 1 + > > > > > > include/linux/ceph/libceph.h | 1 + > > > > > > net/ceph/ceph_common.c | 5 +++-- > > > > > > 4 files changed, 14 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/fs/ceph/super.c b/fs/ceph/super.c > > > > > > index 0b324e43c9f4..03e5f4bb2b6f 100644 > > > > > > --- a/fs/ceph/super.c > > > > > > +++ b/fs/ceph/super.c > > > > > > @@ -268,6 +268,9 @@ static int ceph_parse_new_source(const char *dev_name, const char *dev_name_end, > > > > > > if (!fs_name_start) > > > > > > return invalfc(fc, "missing file system name"); > > > > > > > > > > > > + if (ceph_parse_fsid(fsid_start, &fsopt->fsid)) > > > > > > + return invalfc(fc, "invalid fsid format"); > > > > > > + > > > > > > ++fs_name_start; /* start of file system name */ > > > > > > fsopt->mds_namespace = kstrndup(fs_name_start, > > > > > > dev_name_end - fs_name_start, GFP_KERNEL); > > > > > > @@ -750,6 +753,12 @@ static struct ceph_fs_client *create_fs_client(struct ceph_mount_options *fsopt, > > > > > > } > > > > > > opt = NULL; /* fsc->client now owns this */ > > > > > > > > > > > > + /* help learn fsid */ > > > > > > + if (fsopt->new_dev_syntax) { > > > > > > + ceph_check_fsid(fsc->client, &fsopt->fsid); > > > > > > > > > > This call to ceph_check_fsid() made me wonder what would happen if I use > > > > > the wrong fsid with the new syntax. And the result is: > > > > > > > > > > [ 41.882334] libceph: mon0 (1)192.168.155.1:40594 session established > > > > > [ 41.884537] libceph: bad fsid, had d52783e6-efc2-4dce-ad01-aa3272fa5f66 got 90bdb539-9d95-402e-8f23-b0e26cba8b1b > > > > > [ 41.885955] libceph: bad fsid, had d52783e6-efc2-4dce-ad01-aa3272fa5f66 got 90bdb539-9d95-402e-8f23-b0e26cba8b1b > > > > > [ 41.889313] libceph: bad fsid, had d52783e6-efc2-4dce-ad01-aa3272fa5f66 got 90bdb539-9d95-402e-8f23-b0e26cba8b1b > > > > > [ 41.892578] libceph: osdc handle_map corrupt msg > > > > > > > > > > ... followed by a msg dump. > > > > > > > > > > I guess this means that manually setting the fsid requires changes to the > > > > > messenger (I've only tested with v1) so that it gracefully handles this > > > > > scenario. > > > > > > > > Yes, this results in a big dump of messages. I haven't looked at > > > > gracefully handling these. > > > > > > > > I'm not sure if it needs to be done in these set of patches though. > > > > > > Ah, sure! I didn't meant you'd need to change the messenger to handle it > > > (as I'm not even sure it's the messenger or the mons client that require > > > changes). But I also don't think that this patchset can be merged without > > > making sure we can handle a bad fsid correctly and without all this noise. > > > > True. However, for most cases users really won't be filling in the > > fsid and the mount helper would fill the "correct" one automatically. > > > > Yes, but some of them may and I think we do need to handle this > gracefully. Let's step back a moment and consider: > > AIUI, the fsid is only here to disambiguate when you have multiple > clusters. The kernel doesn't really care about this value at all. All it > cares about is whether it's talking to the right mons (as evidenced by > the fact that we don't pass the fsid in at mount time today). > > So probably the right thing to do is to just return an error (-EINVAL?) > to mount() when there is a mismatch between the fsid and the one in the > maps. Is that possible? Gracefully handling this is the correct way forward. With these changes, a fsid mismatch results in a big splat of message dumps and mount failure. I haven't yet figured where (all) to plumb in the work to handle this, but looks doable. > > > > > > > Cheers, > > > -- > > > Luís > > > > > > > > > > > > > > > > > Cheers, > > > > > -- > > > > > Luís > > > > > > > > > > > + fsc->client->have_fsid = true; > > > > > > + } > > > > > > + > > > > > > fsc->client->extra_mon_dispatch = extra_mon_dispatch; > > > > > > ceph_set_opt(fsc->client, ABORT_ON_FULL); > > > > > > > > > > > > diff --git a/fs/ceph/super.h b/fs/ceph/super.h > > > > > > index 8f71184b7c85..ce5fb90a01a4 100644 > > > > > > --- a/fs/ceph/super.h > > > > > > +++ b/fs/ceph/super.h > > > > > > @@ -99,6 +99,7 @@ struct ceph_mount_options { > > > > > > char *server_path; /* default NULL (means "/") */ > > > > > > char *fscache_uniq; /* default NULL */ > > > > > > char *mon_addr; > > > > > > + struct ceph_fsid fsid; > > > > > > }; > > > > > > > > > > > > struct ceph_fs_client { > > > > > > diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h > > > > > > index 409d8c29bc4f..75d059b79d90 100644 > > > > > > --- a/include/linux/ceph/libceph.h > > > > > > +++ b/include/linux/ceph/libceph.h > > > > > > @@ -296,6 +296,7 @@ extern bool libceph_compatible(void *data); > > > > > > extern const char *ceph_msg_type_name(int type); > > > > > > extern int ceph_check_fsid(struct ceph_client *client, struct ceph_fsid *fsid); > > > > > > extern void *ceph_kvmalloc(size_t size, gfp_t flags); > > > > > > +extern int ceph_parse_fsid(const char *str, struct ceph_fsid *fsid); > > > > > > > > > > > > struct fs_parameter; > > > > > > struct fc_log; > > > > > > diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c > > > > > > index 97d6ea763e32..da480757fcca 100644 > > > > > > --- a/net/ceph/ceph_common.c > > > > > > +++ b/net/ceph/ceph_common.c > > > > > > @@ -217,7 +217,7 @@ void *ceph_kvmalloc(size_t size, gfp_t flags) > > > > > > return p; > > > > > > } > > > > > > > > > > > > -static int parse_fsid(const char *str, struct ceph_fsid *fsid) > > > > > > +int ceph_parse_fsid(const char *str, struct ceph_fsid *fsid) > > > > > > { > > > > > > int i = 0; > > > > > > char tmp[3]; > > > > > > @@ -247,6 +247,7 @@ static int parse_fsid(const char *str, struct ceph_fsid *fsid) > > > > > > dout("parse_fsid ret %d got fsid %pU\n", err, fsid); > > > > > > return err; > > > > > > } > > > > > > +EXPORT_SYMBOL(ceph_parse_fsid); > > > > > > > > > > > > /* > > > > > > * ceph options > > > > > > @@ -465,7 +466,7 @@ int ceph_parse_param(struct fs_parameter *param, struct ceph_options *opt, > > > > > > break; > > > > > > > > > > > > case Opt_fsid: > > > > > > - err = parse_fsid(param->string, &opt->fsid); > > > > > > + err = ceph_parse_fsid(param->string, &opt->fsid); > > > > > > if (err) { > > > > > > error_plog(&log, "Failed to parse fsid: %d", err); > > > > > > return err; > > > > > > -- > > > > > > 2.27.0 > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Cheers, > > > > Venky > > > > > > > > > > > > > -- > Jeff Layton <jlayton@xxxxxxxxxx> > -- Cheers, Venky