Re: [Cephfs] Can't get snapshot under a subvolume

Alexander Patrakov <patrakov@xxxxxxxxx> · Sun, 2 Mar 2025 17:57:03 +0800

Hello Gürkan,

On Sun, Mar 2, 2025 at 4:20 PM Gürkan G <ceph@xxxxxxxxx> wrote:
>
> Hi again Alexander,
>
> Thanks for taking the time.
>
> > Subvolumes exist to implement a notion of managed mountable
> > directories with a given maximum size, as required, e.g., by
> > Kubernetes RWX Persistent Volumes.
>
> I highly doubt that this was the main reason, since (afaik) the
> snapshot feature predates Kubernetes.
>
> > However, if Ceph permitted snapshots at arbitrary points within
> > the volume, a malicious pod could have created a snapshot, deleted
> > everything (not for real, "thanks" to the snapshot), written new
> > files, and thus evaded the quota.
>
> Then the admin can opt not to use allow_new_snaps on that FS, or provide
> a key without the "s" flag so that the client is unable to create snapshots.
>
> I am not proficient in C++, but it looks like even the MDS code has special
> handling of snapshots+subvolumes, even within quota-restricted ones.
>
> But again, this might be an oversight and a conflict of features on the MDS side.
>
> > And yes, the documentation you mention does need to be corrected.
>
> Could you point me to the place to report this documentation issue? I
> would have designed my implementation completely differently if the
> documentation had not implied this.

The proper place to report documentation bugs is
https://pad.ceph.com/p/Report_Documentation_Bugs, which is linked from
the top of every Ceph documentation page.

>
> Thanks again,
>
> Gürkan
>
> On 02/03/2025 02.10, Alexander Patrakov wrote:
> > Hello Gürkan,
> >
> > Let me clarify and correct my answer.
> >
> > I incorrectly assumed that you use Kubernetes, because its CSI driver
> > is, by far, the main consumer of subvolumes. Still, let me explain
> > this use case, as the limitations you observe naturally follow from
> > it.
> >
> > Subvolumes exist to implement a notion of managed mountable
> > directories with a given maximum size, as required, e.g., by
> > Kubernetes RWX Persistent Volumes. In Kubernetes, the CSI driver, when
> > it needs to, creates a subvolume, sets a quota on it, creates a dummy
> > subdirectory, and mounts it in the pod that needs the Persistent
> > Volume. As such, the pod has no access to the top directory of the
> > persistent volume and thus cannot increase the quota by changing the
> > xattr. However, if Ceph permitted snapshots at arbitrary points within
> > the volume, a malicious pod could have created a snapshot, deleted
> > everything (not for real, "thanks" to the snapshot), written new
> > files, and thus evaded the quota. Thus, the only point where snapshots
> > are allowed is the top directory of the subvolume, where the CSI
> > driver can do it.
> >
> > Therefore, the answer to your original question is: if you want
> > client-managed snapshots, do not use subvolumes; they are the wrong
> > abstraction for you. Just create plain old directories outside of the
> > /volumes path and mount them on the client.
> >
> > And yes, the documentation you mention does need to be corrected.
> >
> > On Sun, Mar 2, 2025 at 5:44 AM Gürkan G <ceph@xxxxxxxxx> wrote:
> >> Hi,
> >>
> >>> This is deliberate, as otherwise they would become a mechanism for quota evasion.
> >> This... does not make much sense. If I run the setfattr command, everything works fine. Plus, the documentation says the following:
> >>
> >> Arbitrary subtrees. Snapshots are created within any directory you choose, and cover all data in the file system under that directory.
> >> Ref: https://docs.ceph.com/en/squid/dev/cephfs-snapshots/
> >>
> >>> In any case, please also try asking in Kubernetes forums. On Ceph side, unfortunately, everything works as intended.
> >> I am also not using Kubernetes. This is a deployment over Debian bookworm VMs. Never mentioned a pod, the client is another Debian VM.
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
> >
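
For illustration, here is a minimal Python sketch of client-managed
snapshots on a plain directory, as suggested in my earlier message.
CephFS creates a snapshot when a directory is made inside the hidden
.snap directory (that is its default name; it can be changed on the
client). The mount point /mnt/cephfs/projects/data and the helper names
are hypothetical, and the sketch assumes that allow_new_snaps is enabled
and that the client key carries the "s" flag.

```python
import os

# Default name of CephFS's virtual snapshot directory (assumed unchanged).
SNAP_DIR = ".snap"


def snapshot_path(directory: str, name: str) -> str:
    """Return the path whose creation would snapshot `directory` as `name`."""
    if not name or "/" in name or name in (".", ".."):
        raise ValueError(f"invalid snapshot name: {name!r}")
    return os.path.join(directory, SNAP_DIR, name)


def create_snapshot(directory: str, name: str) -> str:
    """Create a snapshot by making a directory inside .snap (CephFS only)."""
    path = snapshot_path(directory, name)
    os.mkdir(path)
    return path


def remove_snapshot(directory: str, name: str) -> None:
    """Remove a snapshot by removing its directory inside .snap."""
    os.rmdir(snapshot_path(directory, name))
```

On a CephFS mount, create_snapshot("/mnt/cephfs/projects/data",
"before-upgrade") should make the snapshot visible under
/mnt/cephfs/projects/data/.snap/before-upgrade. On other filesystems the
mkdir normally fails, because no .snap directory exists there.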

-- 
Alexander Patrakov