Re: directory delegations

> On Apr 4, 2019, at 11:09 AM, Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
> 
> On Wed, Apr 3, 2019 at 9:06 PM bfields@xxxxxxxxxxxx
> <bfields@xxxxxxxxxxxx> wrote:
>> 
>> On Wed, Apr 03, 2019 at 12:56:24PM -0400, Bradley C. Kuszmaul wrote:
>>> This proposal does look like it would be helpful.   How does this
>>> kind of proposal play out in terms of actually seeing the light of
>>> day in deployed systems?
>> 
>> We need some people to commit to implementing it.
>> 
>> We have 2-3 testing events a year, so ideally we'd agree to show up with
>> implementations at one of those to test and hash out any issues.
>> 
>> We revise the draft based on any experience or feedback we get.  If
>> nothing else, it looks like it needs some updates for v4.2.
>> 
>> The on-the-wire protocol change seems small, and my feeling is that if
>> there's running code then documenting the protocol and getting it
>> through the IETF process shouldn't be a big deal.
>> 
>> --b.
>> 
>>> On 4/2/19 10:07 PM, bfields@xxxxxxxxxxxx wrote:
>>>> On Wed, Apr 03, 2019 at 02:02:54AM +0000, Trond Myklebust wrote:
>>>>> The create itself needs to be sync, but the attribute delegations mean
>>>>> that the client, not the server, is authoritative for the timestamps.
>>>>> So the client now owns the atime and mtime, and just sets them as part
>>>>> of the (asynchronous) delegreturn some time after you are done writing.
>>>>> 
>>>>> Were you perhaps thinking about this earlier proposal?
>>>>> https://tools.ietf.org/html/draft-myklebust-nfsv4-unstable-file-creation-01
>>>> That's it, thanks!
>>>> 
>>>> Bradley is concerned about performance of something like untar on a
>>>> backend filesystem with particularly high-latency metadata operations,
>>>> so something like your unstable file creation proposal (or actual write
>>>> delegations) seems like it should help.
>>>> 
>>>> --b.
> 
> The serialized create with something like an untar is a
> performance-killer though.
> 
> FWIW, I'm working on something similar right now for Ceph. If a ceph
> client has adequate caps [1] for a directory and the dentry inode,
> then we should (in principle) be able to buffer up directory morphing
> operations and flush them out to the server asynchronously.
> 
> I'm starting with unlink (mostly because it's simpler), and am mainly
> just returning early when we do have the right caps -- after issuing
> the call but before the reply comes in. We should be able to do the
> same for link, rename and create too. Create will require the Ceph MDS
> to delegate out a range of inode numbers (and that bit hasn't been
> implemented yet).
> 
> My thinking with all of this is that the buffering of directory
> morphing operations is not as helpful as something like a pagecache
> write is, as we aren't that interested in merging operations that
> change the same dentry. However, being able to do them asynchronously
> should work really well. That should allow us to better parallelize
> create/link/unlink/rename on different dentries even when they are
> issued serially by a single task.

What happens if an asynchronous directory change fails (e.g., with ENOSPC)?
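
To make the question concrete, here is a minimal sketch of the pattern as I
understand it. Everything in it is an assumption for illustration: the
AsyncDirOps class, its submit() and sync() methods, and the fake_server()
stand-in are hypothetical names, not Ceph or NFS client code. The caller gets
control back as soon as an operation is queued, so a failure can only be
reported at some later point. The sketch picks an explicit sync step, much
like writeback errors showing up at fsync().

```python
import errno
from concurrent.futures import ThreadPoolExecutor


class AsyncDirOps:
    """Hypothetical client-side queue of directory-morphing operations."""

    def __init__(self, backend, workers=4):
        self._backend = backend  # callable(op, name); raises OSError on failure
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = []

    def submit(self, op, name):
        # Return to the caller immediately; the server's reply arrives later.
        future = self._pool.submit(self._backend, op, name)
        self._pending.append((op, name, future))

    def sync(self):
        # Wait for every outstanding reply and collect the failures.
        # This is the first point at which the caller can learn of an error.
        pending, self._pending = self._pending, []
        errors = []
        for op, name, future in pending:
            try:
                future.result()
            except OSError as exc:
                errors.append((op, name, exc.errno))
        return errors


def fake_server(op, name):
    # Stand-in for the server (assumption): creating "big" runs out of space.
    if op == "create" and name == "big":
        raise OSError(errno.ENOSPC, "No space left on device")


ops = AsyncDirOps(fake_server)
ops.submit("create", "a")    # succeeds
ops.submit("create", "big")  # fails, but the caller has already moved on
ops.submit("unlink", "old")  # succeeds
failed = ops.sync()          # the ENOSPC only becomes visible here
```

In this sketch the task that issued create("big") has long since been told
the call succeeded, which is the case I am asking about.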


> RFC5661 doesn't currently provide for writeable directory delegations,
> AFAICT, but they could eventually be implemented in a similar way.
> 
> [1]: cephfs capabilities (aka caps) are like a delegation for a subset
> of inode metadata
> --
> Jeff Layton <jlayton@xxxxxxxxxxxxxxx>

--
Chuck Lever






