Re: directory delegations

On Wed, Apr 3, 2019 at 9:06 PM bfields@xxxxxxxxxxxx
<bfields@xxxxxxxxxxxx> wrote:
>
> On Wed, Apr 03, 2019 at 12:56:24PM -0400, Bradley C. Kuszmaul wrote:
> > This proposal does look like it would be helpful.   How does this
> > kind of proposal play out in terms of actually seeing the light of
> > day in deployed systems?
>
> We need some people to commit to implementing it.
>
> We have 2-3 testing events a year, so ideally we'd agree to show up with
> implementations at one of those to test and hash out any issues.
>
> We revise the draft based on any experience or feedback we get.  If
> nothing else, it looks like it needs some updates for v4.2.
>
> The on-the-wire protocol change seems small, and my feeling is that if
> there's running code then documenting the protocol and getting it
> through the IETF process shouldn't be a big deal.
>
> --b.
>
> > On 4/2/19 10:07 PM, bfields@xxxxxxxxxxxx wrote:
> > >On Wed, Apr 03, 2019 at 02:02:54AM +0000, Trond Myklebust wrote:
> > >>The create itself needs to be sync, but the attribute delegations mean
> > >>that the client, not the server, is authoritative for the timestamps.
> > >>So the client now owns the atime and mtime, and just sets them as part
> > >>of the (asynchronous) delegreturn some time after you are done writing.
> > >>
> > >>Were you perhaps thinking about this earlier proposal?
> > >>https://urldefense.proofpoint.com/v2/url?u=https-3A__tools.ietf.org_html_draft-2Dmyklebust-2Dnfsv4-2Dunstable-2Dfile-2Dcreation-2D01&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=YIKOmJLMLfe5wQR3VJI7jGjCnepZlMwumApzvaKItrY&m=qlAJ6dZPGjbcTzNIpkTyk-RTii6lWw1CLIjF6jp3P2Y&s=aTTFNJlRH-dXrQmE4cSYEUd8Kv3ij5cqTJtvgIixMa8&e=
> > >That's it, thanks!
> > >
> > >Bradley is concerned about performance of something like untar on a
> > >backend filesystem with particularly high-latency metadata operations,
> > >so something like your unstable file creation proposal (or actual write
> > >delegations) seems like it should help.
> > >
> > >--b.

The serialized create is a performance-killer for something like an
untar, though.

FWIW, I'm working on something similar right now for Ceph. If a ceph
client has adequate caps [1] for a directory and the dentry inode,
then we should (in principle) be able to buffer up directory morphing
operations and flush them out to the server asynchronously.

I'm starting with unlink (mostly because it's simpler), and am mainly
just returning early when we do have the right caps -- after issuing
the call but before the reply comes in. We should be able to do the
same for link, rename and create too. Create will require the Ceph MDS
to delegate out a range of inode numbers (and that bit hasn't been
implemented yet).
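
To make the create case a bit more concrete, here is a rough sketch of
the decision the client would make. This is not the actual Ceph code:
InoRange, create() and the have_caps flag are made-up names for
illustration, and the real MDS interface for handing out inode numbers
doesn't exist yet.

```python
class InoRange:
    """A hypothetical range of inode numbers delegated by the MDS."""

    def __init__(self, start, count):
        self.next = start
        self.end = start + count  # exclusive upper bound

    def alloc(self):
        """Return a preallocated inode number, or None if exhausted."""
        if self.next >= self.end:
            return None
        ino = self.next
        self.next += 1
        return ino


def create(name, have_caps, ino_range):
    """Decide how a create of `name` would be issued.

    Returns ("async", ino) when the client could return before the MDS
    reply arrives, or ("sync", None) when it has to wait for the reply.
    """
    if have_caps:
        ino = ino_range.alloc()
        if ino is not None:
            return ("async", ino)
    # Missing caps or no delegated inode numbers left: fall back to a
    # normal synchronous create.
    return ("sync", None)
```

The point of the delegated range is that the client can assign the new
inode number itself, so it doesn't need the reply before it can return
to the caller.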

My thinking with all of this is that the buffering of directory
morphing operations is not as helpful as something like a pagecache
write is, as we aren't that interested in merging operations that
change the same dentry. However, being able to do them asynchronously
should work really well. That should allow us to better parallelize
create/link/unlink/rename on different dentries even when they are
issued serially by a single task.
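
A toy model of the difference, assuming a fake server with a fixed
per-request latency (FakeMDS, run_sync and run_async are hypothetical
names, and asyncio just stands in for the client's request machinery):

```python
import asyncio


class FakeMDS:
    """Stand-in for a metadata server with a fixed per-request latency."""

    def __init__(self, latency=0.01):
        self.latency = latency
        self.in_flight = 0
        self.max_in_flight = 0
        self.done = []

    async def request(self, op, dentry):
        self.in_flight += 1
        self.max_in_flight = max(self.max_in_flight, self.in_flight)
        await asyncio.sleep(self.latency)  # simulated round trip
        self.in_flight -= 1
        self.done.append((op, dentry))


async def run_sync(mds, ops):
    """One task, waiting for each reply before issuing the next call."""
    for op, dentry in ops:
        await mds.request(op, dentry)


async def run_async(mds, ops):
    """One task, issuing every call up front and collecting replies later."""
    pending = [asyncio.create_task(mds.request(op, d)) for op, d in ops]
    await asyncio.gather(*pending)
```

With five unlinks on different dentries, run_sync never has more than
one request outstanding, while run_async has all five in flight at
once even though a single task issued them in order.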

RFC 5661 doesn't currently provide for writable directory delegations,
AFAICT, but they could eventually be implemented in a similar way.

[1]: CephFS capabilities (aka caps) are like a delegation for a subset
of inode metadata
--
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>



