Re: df shows wrong size of cephfs share when a subdirectory is mounted

On Fri, Apr 22, 2022 at 03:39:04PM +0100, Luís Henriques wrote:
> On Thu, Apr 21, 2022 at 08:53:48PM +0000, Ryan Taylor wrote:
> > 
> > Hi  Luís,
> > 
> > I did just that:
> > 
> > [fedora@cephtest ~]$ sudo ./debug.sh 
> ...
> > [94831.006412] ceph:  release inode 000000003bb3ccb2 dir file 00000000b0b84d82
> > [94831.006573] ceph:  do_getattr inode 000000003bb3ccb2 mask AsXsFs mode 040755
> > [94831.006575] ceph:  __ceph_caps_issued_mask ino 0x1001b45c2fa cap 000000000cde56f9 issued pAsLsXsFs (mask AsXsFs)
> > [94831.006576] ceph:  __touch_cap 000000003bb3ccb2 cap 000000000cde56f9 mds0
> > [94831.006581] ceph:  statfs
> 
> 
> OK, this was the point where I expected to see something useful.
> Unfortunately, it looks like the quota code doesn't have good enough debug
> info here :-(
> 
> I've spent a lot of hours today trying to reproduce it: recompiled
> v14.2.22, tried Fedora as a client, but nothing.  I should be able to
> debug this problem, but I'd need to be able to reproduce the issue.
> 
> I'm adding Jeff and Xiubo to CC, maybe they have some further ideas.  I
> must confess I'm clueless at this point.

I *think* I've figured it out (and along the way fixed a somewhat related
bug in the kernel client quotas code).  I've updated the tracker [1] with
what I've found.  The TL;DR is that this is an interaction between Linux
security modules and the cephx authentication capabilities configuration.
Please have a look at the comments there and see if any of the workarounds
work for you.

[1] https://tracker.ceph.com/issues/55090
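
In case it helps while you go through those: a quick way to see how the
mount credentials are restricted is to dump the client's caps from a node
with admin access.  I'm using 'client.rwkey' below only because that's the
name in your mount commands quoted further down -- substitute whatever
entity your secret belongs to:

  ceph auth get client.rwkey

The 'caps mds' line is the interesting one: a 'path=...' restriction there
is the "authentication capabilities" half of the mix above, and the other
half is whatever security module (SELinux on Fedora, for example) is
active on the client.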

Cheers,
--
Luís

> 
> Cheers,
> --
> Luís
> 
> > 
> > Thanks,
> > -rt
> > 
> > Ryan Taylor
> > Research Computing Specialist
> > Research Computing Services, University Systems
> > University of Victoria
> > 
> > ________________________________________
> > From: Luís Henriques <lhenriques@xxxxxxx>
> > Sent: April 21, 2022 1:35 PM
> > To: Ryan Taylor
> > Cc: Hendrik Peyerl; Ramana Venkatesh Raja; ceph-users@xxxxxxx
> > Subject: Re:  Re: df shows wrong size of cephfs share when a subdirectory is mounted
> > 
> > On Thu, Apr 21, 2022 at 07:28:19PM +0000, Ryan Taylor wrote:
> > >
> > >  Hi Luís,
> > >
> > > dmesg looks normal I think:
> > 
> > Yep, I don't see anything suspicious either.
> > 
> > >
> > > [  265.269450] Key type ceph registered
> > > [  265.270914] libceph: loaded (mon/osd proto 15/24)
> > > [  265.303764] FS-Cache: Netfs 'ceph' registered for caching
> > > [  265.305460] ceph: loaded (mds proto 32)
> > > [  265.513616] libceph: mon0 (1)10.30.201.3:6789 session established
> > > [  265.520982] libceph: client3734313 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [  265.539710] ceph: mds0 rejected session
> > > [  265.544592] libceph: mon1 (1)10.30.202.3:6789 session established
> > > [  265.549564] libceph: client3698116 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [  265.552624] ceph: mds0 rejected session
> > > [  316.849402] libceph: mon0 (1)10.30.201.3:6789 session established
> > > [  316.855077] libceph: client3734316 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [  316.886834] ceph: mds0 rejected session
> > > [  372.064685] libceph: mon2 (1)10.30.203.3:6789 session established
> > > [  372.068731] libceph: client3708026 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [  372.071651] ceph: mds0 rejected session
> > > [  372.074641] libceph: mon0 (1)10.30.201.3:6789 session established
> > > [  372.080435] libceph: client3734319 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [  372.083270] ceph: mds0 rejected session
> > > [  443.855530] libceph: mon2 (1)10.30.203.3:6789 session established
> > > [  443.863231] libceph: client3708029 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [  555.889186] libceph: mon2 (1)10.30.203.3:6789 session established
> > > [  555.893677] libceph: client3708032 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [ 1361.181405] libceph: mon0 (1)10.30.201.3:6789 session established
> > > [ 1361.187230] libceph: client3734325 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [ 1415.463391] libceph: mon2 (1)10.30.203.3:6789 session established
> > > [ 1415.467663] libceph: client3708038 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [ 2018.707478] libceph: mon0 (1)10.30.201.3:6789 session established
> > > [ 2018.712834] libceph: client3734337 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [ 2276.564841] libceph: mon1 (1)10.30.202.3:6789 session established
> > > [ 2276.568899] libceph: client3698128 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [ 2435.596579] libceph: mon2 (1)10.30.203.3:6789 session established
> > > [ 2435.600599] libceph: client3708050 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [89805.777644] libceph: mon0 (1)10.30.201.3:6789 session established
> > > [89805.782455] libceph: client3740982 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > > [89868.055719] libceph: mon1 (1)10.30.202.3:6789 session established
> > > [89868.059600] libceph: client3704767 fsid 50004482-d5e3-4b76-9a4c-abd0626c9882
> > >
> > > Pretty sure "mds0 rejected session" was when I was accidentally trying to mount the wrong share yesterday.
> > >
> > > Could it depend on the Ceph version (ours is v14.2.22), or on something Manila is doing?
> > 
> > I thought that too, but then I compiled a 14.2.22 and I still couldn't
> > reproduce it either (note: this was on a vstart cluster, not a *real*
> > one).
> > 
> > > Is there any other useful information I could collect?
> > 
> > I guess you could try to get some more detailed kernel logs, but I'm not
> > sure your kernel is compiled with the required options.  To check if it
> > is, just see if file "/sys/kernel/debug/dynamic_debug/control" exists.  If
> > it does, we're good to go!
> > 
> > # enable kernel client debug:
> > echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
> > # run the mount command on the subdir
> > # run df -h
> > # disable kernel client debug:
> > echo 'module ceph -p' > /sys/kernel/debug/dynamic_debug/control
> > 
> > Note that the kernel logging can be quite verbose.  It's probably a good
> > idea to script the whole thing so it's quick ;-)
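> >
> > Untested, but something along these lines should do it (the mons, share
> > path and key name are the ones from your mount commands -- adjust as
> > needed, and run it as root):
> >
> > #!/bin/bash
> > # debug.sh -- capture ceph kernel client debug output around a subdir mount
> > echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
> > mount -t ceph 10.30.201.3:6789,10.30.202.3:6789,10.30.203.3:6789:/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2/testsubdir \
> >       /mnt/ceph2 -o name=rwkey,secret=...   # fill in the real secret
> > df -h /mnt/ceph2
> > echo 'module ceph -p' > /sys/kernel/debug/dynamic_debug/control
> > dmesg > /tmp/ceph-debug.log                 # this is the log to share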
> > 
> > After that, please share the log (dmesg).  Maybe it has some hint on
> > what's going on.
> > 
> > Cheers,
> > --
> > Luís
> > 
> > >
> > > Thanks,
> > > -rt
> > >
> > > Ryan Taylor
> > > Research Computing Specialist
> > > Research Computing Services, University Systems
> > > University of Victoria
> > >
> > > ________________________________________
> > > From: Luís Henriques <lhenriques@xxxxxxx>
> > > Sent: April 21, 2022 4:32 AM
> > > To: Ryan Taylor
> > > Cc: Hendrik Peyerl; Ramana Venkatesh Raja; ceph-users@xxxxxxx
> > > Subject: Re:  Re: df shows wrong size of cephfs share when a subdirectory is mounted
> > >
> > > On Wed, Apr 20, 2022 at 07:05:37PM +0000, Ryan Taylor wrote:
> > > >
> > > > Hi Luís,
> > > >
> > > > The same cephx key is used for both mounts. It is a regular rw key which
> > > > does not have permission to set any ceph xattrs (that was done
> > > > separately with a different key).  But it can read ceph xattrs and set
> > > > user xattrs.
> > >
> > > Thank you for the very detailed description.  I'm still scratching my head
> > > to figure out what's wrong as I can't reproduce this.  Just out of
> > > curiosity: are you seeing any errors/warnings in the kernel log? (dmesg)
> > >
> > > Cheers,
> > > --
> > > Luís
> > >
> > > >
> > > > I just did a test using the latest Fedora 35 kernel and reproduced the problem:
> > > >
> > > > [fedora@cephtest ~]$ sudo mkdir /mnt/ceph1
> > > > [fedora@cephtest ~]$ sudo mkdir /mnt/ceph2
> > > > [fedora@cephtest ~]$ sudo mount -t ceph 10.30.201.3:6789,10.30.202.3:6789,10.30.203.3:6789:/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2            /mnt/ceph1 -o name=rwkey,secret=...
> > > > [fedora@cephtest ~]$ sudo mkdir /mnt/ceph1/testsubdir
> > > > [fedora@cephtest ~]$ sudo mount -t ceph 10.30.201.3:6789,10.30.202.3:6789,10.30.203.3:6789:/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2/testsubdir /mnt/ceph2 -o name=rwkey,secret=...
> > > > [fedora@cephtest ~]$ df | grep ceph
> > > > 10.30.201.3:6789,10.30.202.3:6789,10.30.203.3:6789:/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2                           5242880000        291385344       4951494656   6% /mnt/ceph1
> > > > 10.30.201.3:6789,10.30.202.3:6789,10.30.203.3:6789:/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2/testsubdir 4287562399744 295238516736 3992323883008   7% /mnt/ceph2
> > > > [fedora@cephtest ~]$ uname -r
> > > > 5.16.20-200.fc35.x86_64
> > > >
> > > > Furthermore, I then repeated my earlier test regarding ceph.quota.max_bytes.
> > > > The volume root already has the right quota based on the size of my Manila share in OpenStack, and it matches the size reported by df (5000 GiB).
> > > >
> > > > [fedora@cephtest ~]$ getfattr -n ceph.quota.max_bytes  /mnt/ceph1/
> > > > getfattr: Removing leading '/' from absolute path names
> > > > # file: mnt/ceph1/
> > > > ceph.quota.max_bytes="5368709120000"
> > > >
> > > > And on a separate system with admin credentials I applied a max_bytes quota to the testsubdir:
> > > >
> > > > sudo setfattr -n  ceph.quota.max_bytes -v 121212 /mnt/cephfs/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2/testsubdir/
> > > >
> > > > I unmounted and remounted testsubdir exactly as before, but even with ceph.quota.max_bytes applied on the subdir it still shows the wrong size:
> > > >
> > > > [fedora@cephtest ~]$ df | grep ceph
> > > > 10.30.201.3:6789,10.30.202.3:6789,10.30.203.3:6789:/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2                           5242880000        291385344       4951494656   6% /mnt/ceph1
> > > > 10.30.201.3:6789,10.30.202.3:6789,10.30.203.3:6789:/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2/testsubdir 4287544954880 295264587776 3992280367104   7% /mnt/ceph2
> > > >
> > > > [fedora@cephtest ~]$ getfattr -n ceph.quota.max_bytes  /mnt/ceph1/testsubdir/
> > > > getfattr: Removing leading '/' from absolute path names
> > > > # file: mnt/ceph1/testsubdir/
> > > > ceph.quota.max_bytes="121212"
> > > >
> > > > [fedora@cephtest ~]$ getfattr -n ceph.quota.max_bytes  /mnt/ceph2
> > > > getfattr: Removing leading '/' from absolute path names
> > > > # file: mnt/ceph2
> > > > ceph.quota.max_bytes="121212"
> > > >
> > > > Thanks,
> > > > -rt
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > ________________________________________
> > > > From: Luís Henriques <lhenriques@xxxxxxx>
> > > > Sent: April 20, 2022 7:16 AM
> > > > To: Ryan Taylor
> > > > Cc: Hendrik Peyerl; Ramana Venkatesh Raja; ceph-users@xxxxxxx
> > > > Subject: Re:  Re: df shows wrong size of cephfs share when a subdirectory is mounted
> > > >
> > > > On Tue, Apr 19, 2022 at 08:51:50PM +0000, Ryan Taylor wrote:
> > > > > Thanks for the pointers! It does look like https://tracker.ceph.com/issues/55090
> > > > > and I am not surprised Dan and I are hitting the same issue...
> > > >
> > > > Just a wild guess (already asked this on the tracker):
> > > >
> > > > Is it possible that you're using different credentials/keys so that the
> > > > credentials used for mounting the subdir are not allowed to access the
> > > > volume base directory?  Would it be possible to get more details on the
> > > > two mount commands being used?
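> > > >
> > > > Something like the output of
> > > >
> > > >   grep ceph /proc/mounts
> > > >
> > > > on the client (the secret isn't shown there) would be enough.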
> > > >
> > > > Cheers,
> > > > --
> > > > Luís
> > > >
> > > > >
> > > > >
> > > > > I am using the latest available Almalinux 8, 4.18.0-348.20.1.el8_5.x86_64
> > > > >
> > > > > Installing kernel-debuginfo-common-x86_64
> > > > > I see in /usr/src/debug/kernel-4.18.0-348.2.1.el8_5/linux-4.18.0-348.2.1.el8_5.x86_64/fs/ceph/quota.c
> > > > > for example:
> > > > >
> > > > > static inline bool ceph_has_realms_with_quotas(struct inode *inode)
> > > > > {
> > > > >         struct super_block *sb = inode->i_sb;
> > > > >         struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(sb);
> > > > >         struct inode *root = d_inode(sb->s_root);
> > > > >
> > > > >         if (atomic64_read(&mdsc->quotarealms_count) > 0)
> > > > >                 return true;
> > > > >         /* if root is the real CephFS root, we don't have quota realms */
> > > > >         if (root && ceph_ino(root) == CEPH_INO_ROOT)
> > > > >                 return false;
> > > > >         /* otherwise, we can't know for sure */
> > > > >         return true;
> > > > > }
> > > > >
> > > > > So this EL8.5 kernel already has at least some of the patches from https://lore.kernel.org/all/20190301175752.17808-1-lhenriques@xxxxxxxx/T/#u
> > > > > for https://tracker.ceph.com/issues/38482
> > > > > That tracker does not mention a specific commit; it just says "Merged into 5.2-rc1."
> > > > >
> > > > > So it seems https://tracker.ceph.com/issues/55090  is either a new issue or a regression of the previous issue.
> > > > >
> > > > > Thanks,
> > > > > -rt
> > > > >
> > > > > Ryan Taylor
> > > > > Research Computing Specialist
> > > > > Research Computing Services, University Systems
> > > > > University of Victoria
> > > > >
> > > > > ________________________________________
> > > > > From: Hendrik Peyerl <hpeyerl@xxxxxxxxxxxx>
> > > > > Sent: April 19, 2022 6:05 AM
> > > > > To: Ramana Venkatesh Raja
> > > > > Cc: Ryan Taylor; ceph-users@xxxxxxx
> > > > > Subject: Re:  df shows wrong size of cephfs share when a subdirectory is mounted
> > > > >
> > > > > I did hit this issue aswell: https://tracker.ceph.com/issues/38482
> > > > >
> > > > > You will need a kernel >= 5.2, which can handle quotas on subdirectories.
> > > > >
> > > > >
> > > > > > On 19. Apr 2022, at 14:47, Ramana Venkatesh Raja <rraja@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Sat, Apr 16, 2022 at 10:15 PM Ramana Venkatesh Raja <rraja@xxxxxxxxxx> wrote:
> > > > > >>
> > > > > >> On Thu, Apr 14, 2022 at 8:07 PM Ryan Taylor <rptaylor@xxxxxxx> wrote:
> > > > > >>>
> > > > > >>> Hello,
> > > > > >>>
> > > > > >>>
> > > > > >>> I am using cephfs via Openstack Manila (Ussuri I think).
> > > > > >>>
> > > > > >>> The cephfs cluster is v14.2.22 and my client has kernel  4.18.0-348.20.1.el8_5.x86_64
> > > > > >>>
> > > > > >>>
> > > > > >>> I have a Manila share
> > > > > >>>
> > > > > >>> /volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2
> > > > > >>>
> > > > > >>>
> > > > > >>> that is 5000 GB in size. When I mount it the size is reported correctly:
> > > > > >>>
> > > > > >>>
> > > > > >>> # df -h /cephfs
> > > > > >>> Filesystem                                                                                                 Size  Used Avail Use% Mounted on
> > > > > >>> 10.30.201.3:6789,10.30.202.3:6789,10.30.203.3:6789:/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2  4.9T  278G  4.7T   6% /cephfs
> > > > > >>>
> > > > > >>>
> > > > > >>> However, when I mount a subpath /test1 of my share, both the size and usage reported are those of the whole cephfs filesystem rather than of my private share.
> > > > > >>>
> > > > > >>>
> > > > > >>> # df -h /cephfs
> > > > > >>> Filesystem                                                                                                       Size  Used Avail Use% Mounted on
> > > > > >>> 10.30.201.3:6789,10.30.202.3:6789,10.30.203.3:6789:/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2/test1  4.0P  277T  3.7P   7% /cephfs
> > > > > >>>
> > > > > >>
> > > > > >> What are the capabilities of the ceph client user ID that you used to
> > > > > >> mount "/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2/test1" ?
> > > > > >> Maybe you're hitting this limitation in
> > > > > >> https://docs.ceph.com/en/latest/cephfs/quota/#limitations ,
> > > > > >> "Quotas must be configured carefully when used with path-based mount
> > > > > >> restrictions. The client needs to have access to the directory inode
> > > > > >> on which quotas are configured in order to enforce them. If the client
> > > > > >> has restricted access to a specific path (e.g., /home/user) based on
> > > > > >> the MDS capability, and a quota is configured on an ancestor directory
> > > > > >> they do not have access to (e.g., /home), the client will not enforce
> > > > > >> it. When using path-based access restrictions be sure to configure the
> > > > > >> quota on the directory the client is restricted too (e.g., /home/user)
> > > > > >> or something nested beneath it. "
> > > > > >>
> > > > > >
> > > > > > Hi Ryan,
> > > > > >
> > > > > > I think you may actually be hitting this
> > > > > > https://tracker.ceph.com/issues/55090 . Are you facing this issue with
> > > > > > the FUSE client?
> > > > > >
> > > > > > -Ramana
> > > > > >
> > > > > >>>
> > > > > >>> I tried setting the  ceph.quota.max_bytes  xattr on a subdirectory but it did not help.
> > > > > >>>
> > > > > >>
> > > > > >> You can't set quota xattr if your ceph client user ID doesn't have 'p'
> > > > > >> flag in its MDS capabilities,
> > > > > >> https://docs.ceph.com/en/latest/cephfs/client-auth/#layout-and-quota-restriction-the-p-flag
> > > > > >> .
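> > > > > >>
> > > > > >> For example, something like this grants it (untested; 'client.rwkey' and
> > > > > >> the path are just the names used elsewhere in this thread, and note that
> > > > > >> 'ceph auth caps' replaces *all* of the entity's caps, so the mon/osd parts
> > > > > >> below must be copied from whatever the key already has -- Manila normally
> > > > > >> manages these for you):
> > > > > >>
> > > > > >>   ceph auth caps client.rwkey \
> > > > > >>       mon 'allow r' \
> > > > > >>       mds 'allow rwp path=/volumes/_nogroup/55e46a89-31ff-4878-9e2a-81b4226c3cb2' \
> > > > > >>       osd 'allow rw tag cephfs data=cephfs'
> > > > > >>
> > > > > >> The extra 'p' in the MDS cap is what lets the client set the quota xattrs.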
> > > > > >>
> > > > > >> -Ramana
> > > > > >>
> > > > > >>> I'm not sure if the issue is in cephfs or Manila, but what would be required to get the right size and usage stats to be reported by df when a subpath of a share is mounted?
> > > > > >>>
> > > > > >>>
> > > > > >>> Thanks!
> > > > > >>>
> > > > > >>> -rt
> > > > > >>>
> > > > > >>>
> > > > > >>> Ryan Taylor
> > > > > >>> Research Computing Specialist
> > > > > >>> Research Computing Services, University Systems
> > > > > >>> University of Victoria
> > > > > >>> _______________________________________________
> > > > > >>> ceph-users mailing list -- ceph-users@xxxxxxx
> > > > > >>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> > > > > >>>
> > > > > >
> > > > > > _______________________________________________
> > > > > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> > > > >
> > > > > _______________________________________________
> > > > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> > >
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



