On Wed, 2021-07-14 at 17:35 +0100, Luis Henriques wrote:
> On Wed, Jul 14, 2021 at 12:17:33PM -0400, Jeff Layton wrote:
> > On Wed, 2021-07-14 at 15:35 +0530, Venky Shankar wrote:
> > > Note that the new monitors are just shown in /proc/mounts.
> > > Ceph does not (re)connect to new monitors yet.
> > >
> > > Signed-off-by: Venky Shankar <vshankar@xxxxxxxxxx>
> > > ---
> > >  fs/ceph/super.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> > > index d8c6168b7fcd..d3a5a3729c5b 100644
> > > --- a/fs/ceph/super.c
> > > +++ b/fs/ceph/super.c
> > > @@ -1268,6 +1268,13 @@ static int ceph_reconfigure_fc(struct fs_context *fc)
> > >  	else
> > >  		ceph_clear_mount_opt(fsc, ASYNC_DIROPS);
> > >
> > > +	if (strcmp(fsc->mount_options->mon_addr, fsopt->mon_addr)) {
> > > +		kfree(fsc->mount_options->mon_addr);
> > > +		fsc->mount_options->mon_addr = fsopt->mon_addr;
> > > +		fsopt->mon_addr = NULL;
> > > +		printk(KERN_NOTICE "ceph: monitor addresses recorded, but not used for reconnection");
> >
> > It's currently more in-vogue to use pr_notice() for this. I'll plan to
> > make that (minor) change before I merge. No need to resend.
>
> Yeah, this was the only comment I had too. I saw some issues in the
> previous revision but the changes to ceph_parse_source() seem to fix it in
> this revision.
>
> The other annoying thing I found isn't related with this patchset but with
> a change that's been done some time ago by Xiubo (added to CC): it looks
> like that if we have an invalid parameter (for example, wrong secret)
> we'll always get -EHOSTUNREACH.
>
> See below a possible fix (although I'm not entirely sure that's the correct
> one).
>
> Cheers,
> --
> Luís
>
> From a988d24d8e72fc4933459f3dd5d303cbc9a566ed Mon Sep 17 00:00:00 2001
> From: Luis Henriques <lhenriques@xxxxxxx>
> Date: Wed, 14 Jul 2021 16:56:36 +0100
> Subject: [PATCH] ceph: don't hide error code if we don't have mdsmap
>
> Since commit 97820058fb28 ("ceph: check availability of mds cluster on mount
> after wait timeout") we're returning -EHOSTUNREACH, even if the error isn't
> related with the MDSs availability. For example, we'll get it even if we're
> trying to mounting a filesystem with an invalid username or secret.
>
> Only return this error if we get -EIO.
>
> Fixes: 97820058fb28 ("ceph: check availability of mds cluster on mount after wait timeout")
> Signed-off-by: Luis Henriques <lhenriques@xxxxxxx>
> ---
>  fs/ceph/super.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> index 086a1ceec9d8..67d70059ce9f 100644
> --- a/fs/ceph/super.c
> +++ b/fs/ceph/super.c
> @@ -1230,7 +1230,8 @@ static int ceph_get_tree(struct fs_context *fc)
>  	return 0;
>
>  out_splat:
> -	if (!ceph_mdsmap_is_cluster_available(fsc->mdsc->mdsmap)) {
> +	if ((err == -EIO) &&
> +	    !ceph_mdsmap_is_cluster_available(fsc->mdsc->mdsmap)) {
>  		pr_info("No mds server is up or the cluster is laggy\n");
>  		err = -EHOSTUNREACH;
>  	}

Yeah, I've noticed that message pop up under all sorts of circumstances
and it is an annoyance. I'm happy to consider such a patch if you send
it separately.

That said, I'm honestly not sure this message is really helpful, and
overriding errors like this at a high level seems sort of sketchy. Maybe
we should just drop that message, or figure out a way to limit it to
_just_ that situation.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
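
For reference, the difference between the current out_splat handling and
the proposed one can be modelled in user space. This is only a sketch:
the function and parameter names below are hypothetical (they are not in
fs/ceph/super.c), cluster_available stands in for the result of
ceph_mdsmap_is_cluster_available(fsc->mdsc->mdsmap), and -EACCES is just
an assumed example of an error unrelated to MDS availability.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * User-space model of the out_splat error handling in ceph_get_tree().
 * Names are hypothetical; cluster_available stands in for
 * ceph_mdsmap_is_cluster_available(fsc->mdsc->mdsmap).
 */

/* Current behaviour: any error is replaced when no MDS is available. */
static int mount_err_current(int err, bool cluster_available)
{
	if (!cluster_available)
		err = -EHOSTUNREACH;
	return err;
}

/* Proposed behaviour: only -EIO is replaced, other errors pass through. */
static int mount_err_proposed(int err, bool cluster_available)
{
	if (err == -EIO && !cluster_available)
		err = -EHOSTUNREACH;
	return err;
}
```

With no MDS available, mount_err_current(-EACCES, false) yields
-EHOSTUNREACH, while mount_err_proposed(-EACCES, false) keeps -EACCES.
Both variants still override the error at a high level, which is the
part questioned above.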