Re: why does cephfs have dentry leases at all?

On Mon, 2019-03-11 at 14:44 -0700, Gregory Farnum wrote:
> On Mon, Mar 11, 2019 at 2:11 PM Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
> > This may, on its face, sound like a stupid question, but I'm going over
> > the rules to ensure that our cap handling rules will be able to properly
> > support buffered creates/unlinks.
> 
> It's not stupid. I had trouble with these too; they don't show up in
> many places; and I've forgotten them again.
> 

Thanks for the responses, Greg. 

FWIW, I'm particularly interested in the "whys" in these questions.
Knowing the rationale behind the design makes it easier to ensure that
we don't regress in the future.

> > Dentry leases seem to be really poorly documented. The "rules" for them
> > are unclear, but also the rationale. What was the supposed benefit of
> > the dentry leases in ceph over just relying on appropriate caps on the
> > parent directory?
> > 
> > In principle I suppose it would allow you to continue caching most of
> > your dentries when only some small subset changes. Was that it or was
> > there some other reason to add them?
> 
> Yeah. I *think* it's that leases allow us to satisfy an "ls" (but not
> an "ls -l" or anything requiring any details!) — and skimming through
> the client code maybe also to map from dentries to inode numbers?

Makes sense. It's a revocable lease on a dentry, so it covers enough to
do a lookup and not much else.

> But
> that should be governed by Ls — quickly and cheaply without requiring
> any caps that hold up other operations.

My understanding is that link caps cover only hardlinks to that inode.
So with Ls on a directory inode, you know that no one can rmdir the
directory itself, but that says nothing about the directory contents.

> The caps on the parent
> directory are pretty expensive in comparison, right?

Right, that's pretty coarse-grained. Any link, unlink or rename inside
that directory would make you lose all of its cached dentries. It makes
sense that ceph would want that to be more granular.
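
To make the granularity point concrete, here is a toy Python model. It is
not Ceph code, and every class and method name in it is made up for
illustration. It contrasts a cache validated by a single directory-wide
cap with one validated by per-dentry leases:

```python
class DirCapCache:
    """Toy model: dentries are valid only while one directory-wide cap is held."""

    def __init__(self, names):
        self.entries = set(names)

    def on_dir_change(self, changed_name):
        # Any link/unlink/rename in the directory costs us the cap,
        # so every cached dentry has to be dropped.
        self.entries.clear()


class DentryLeaseCache:
    """Toy model: each dentry carries its own revocable lease."""

    def __init__(self, names):
        self.entries = set(names)

    def on_dir_change(self, changed_name):
        # Only the lease on the affected dentry is recalled.
        self.entries.discard(changed_name)


names = [f"file{i}" for i in range(1000)]
coarse = DirCapCache(names)
fine = DentryLeaseCache(names)

# Another client unlinks a single entry in the directory.
for cache in (coarse, fine):
    cache.on_dir_change("file7")

print(len(coarse.entries), len(fine.entries))
```

In this sketch a single unlink empties the coarse cache, while the
lease-based cache loses only the one affected entry.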

All of that, however, seems to indicate that FILE caps on
directories are pretty meaningless currently. I guess they might cover
rstats (in theory), but I'm not sure that works as expected today.

> Holding shared
> caps requires notification on any timestamp updates; holding exclusive
> caps requires that whole revoke dance before anybody else can do
> something; with leases you only need to get notified on actual
> directory content changes and we don't have a good mapping for that in
> the cap system.
> 

So to be clear, leases are somewhat orthogonal to parent directory caps,
and even if you lose shared caps on a directory you get to keep any
leases that weren't recalled?

> If that sounds like a weird optimization that doesn't do much good (I
> actually think it might?), keep in mind that there are some silly
> behaviors in the HPC world that CephFS was originally targeted at.
> 

I think it might too. I buy that this helps a lot of workloads,
actually. Huge directories are absolutely a pessimal case on NFS, which
depends heavily on parent attributes for dentry revalidation.

Being able to change a directory while letting clients keep their
cached entries for it could be a huge win for some workloads.

> Or maybe I'm misremembering this and Sage or Zheng will respond
> quickly to my terrible, terrible lies. ;)
> 
> 
> > If that was the reason, what was the rationale for making them time-
> > based (they all have a lease_ttl)?


Now that I've thought about it, I assume the ttl is all about ensuring
you don't build up huge numbers of dentry leases that need to be revoked
(use 'em or lose 'em...). Is that the main rationale?
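
If that assumption is right, the effect would look roughly like the toy
Python sketch below. It is only an illustration of the "use 'em or lose
'em" idea, not how the MDS or the clients actually track leases; the
class, its methods, and the 30-second ttl are all hypothetical.

```python
class LeasedDentryCache:
    """Toy model of time-limited dentry leases (all names are hypothetical)."""

    def __init__(self, ttl, clock):
        self.ttl = ttl        # lease lifetime in seconds
        self.clock = clock    # callable returning the current time
        self.expiry = {}      # dentry name -> time at which its lease lapses

    def grant(self, name):
        # Granting (or renewing) a lease pushes its expiry out by one ttl.
        self.expiry[name] = self.clock() + self.ttl

    def is_valid(self, name):
        exp = self.expiry.get(name)
        return exp is not None and self.clock() < exp

    def leases_needing_revoke(self):
        # Expired leases lapse on their own; only live ones would need
        # an explicit revoke message.
        now = self.clock()
        return [n for n, exp in self.expiry.items() if now < exp]


now = [0.0]                      # fake clock so the example is deterministic
cache = LeasedDentryCache(ttl=30.0, clock=lambda: now[0])

cache.grant("a")                 # lease on "a" lapses at t=30
now[0] = 20.0
cache.grant("b")                 # lease on "b" lapses at t=50
now[0] = 35.0

print(cache.is_valid("a"), cache.is_valid("b"))
print(cache.leases_needing_revoke())
```

At t=35 in this sketch, the unused lease on "a" has already lapsed and
only "b" would still need an explicit revoke.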
-- 
Jeff Layton <jlayton@xxxxxxxxxx>



