Re: reply: rename PR

Hi,

thanks for the more detailed info, inline

----- Original Message -----
> From: "陈敏" <chenmin@xxxxxxxx>
> To: "Matt Benjamin" <mbenjamin@xxxxxxxxxx>
> Cc: "The Sacred Order of the Squid Cybernetic" <ceph-devel@xxxxxxxxxxxxxxx>
> Sent: Friday, September 9, 2016 12:05:37 PM
> Subject: reply: rename PR
> 
> Hi Matt
> 
> 	I have noticed that rgw file rename is not strictly POSIX, because the src
> 	file and the dest file have different inode numbers (hash of bucket +
> 	object). I have not tested on nfs-ganesha upstream for some reason, and
> 	will check out master to test FSAL_RGW later.
> 	For NFS scaling, the main problem is that each NFS-ganesha server keeps
> 	its inode cache in local memory, and there is no global view of the inode
> 	cache (directory tree) across the NFS-ganesha cluster connected to the
> 	same rgw bucket.

the idea is that there should be no need for a global view with the intended/current update strategy; the file id strategy actually helps with this. broadly, we see the RGW NFS interface as intentionally divergent from POSIX, with some room for flexibility as regards just how; certainly, in the namespace, we don't badly want to chase Unix semantics
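
To illustrate the file id point, here is a minimal sketch, assuming the file id is nothing more than a hash of the bucket and object names. The function name `file_id` and the choice of BLAKE2 are illustrative assumptions, not what librgw actually uses.

```python
import hashlib

def file_id(bucket: str, obj: str) -> int:
    """Hypothetical name-derived 64-bit file id: hash of bucket + object name."""
    h = hashlib.blake2b(digest_size=8)
    h.update(bucket.encode("utf-8"))
    h.update(b"\0")  # separator so ("ab", "c") and ("a", "bc") hash differently
    h.update(obj.encode("utf-8"))
    return int.from_bytes(h.digest(), "big")

# Two independent servers compute the same id for the same name,
# without sharing any inode cache.
server_a = file_id("mybucket", "dir/old.txt")
server_b = file_id("mybucket", "dir/old.txt")

# A rename changes the name, so the id (and hence the file handle) changes too.
renamed = file_id("mybucket", "dir/new.txt")
```

Because the id depends only on the name, any server attached to the same bucket derives the same handle, which is why no cluster-wide cache is needed; the cost is that a rename produces a new id.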

> 	I have tested NFS-ganesha HA watched by pacemaker+corosync, and found that
> 	the inode cache cannot be shared between primary and backup.

I think the invalidate changes substantially address this, but HA is what we'll be validating/working on next; we don't have specifics yet, but we'll be able to dig into it more then
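
The invalidate mechanism is described in the earlier message quoted below as relying on the parent directory's change attribute. A minimal sketch of that idea follows, assuming a simple integer change counter; the class names `Directory` and `ClientCache` are hypothetical and do not correspond to nfs-ganesha or librgw types.

```python
from dataclasses import dataclass, field

@dataclass
class Directory:
    """Server-side directory with a monotonically increasing change attribute."""
    entries: dict = field(default_factory=dict)  # name -> file id
    change: int = 0

    def create(self, name, fid):
        self.entries[name] = fid
        self.change += 1

    def rename(self, old, new, new_fid):
        # The renamed object gets a new file id; bumping `change` signals clients.
        del self.entries[old]
        self.entries[new] = new_fid
        self.change += 1

class ClientCache:
    """Client-side name cache, trusted only while the parent's change attribute is unchanged."""
    def __init__(self):
        self.seen_change = None
        self.names = {}

    def lookup(self, directory, name):
        if self.seen_change != directory.change:
            # Parent changed: drop everything cached under it and start over.
            self.names.clear()
            self.seen_change = directory.change
        if name not in self.names:
            self.names[name] = directory.entries.get(name)
        return self.names[name]

d = Directory()
d.create("old.txt", 101)
cache = ClientCache()
first = cache.lookup(d, "old.txt")   # cached under the current change value
d.rename("old.txt", "new.txt", 202)  # bumps the change attribute
stale = cache.lookup(d, "old.txt")   # cache was invalidated; old name is gone
fresh = cache.lookup(d, "new.txt")   # new name resolves to the new file id
```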

> 	For NFSv4, session state and lock state should be persisted to storage,
> 	so the NFS-ganesha cluster can share them.

there are a bunch of choices there; currently, we don't support lock operations, and the primary reason is that, again currently, the only update strategy RGW supports is atomic overwrite; we have speculated on opening up other options and even pNFS, but that's pretty blue sky

the current ganesha HA options keep track of most protocol state (e.g., sessions) and don't expose it to the FSALs (whereas lock state can be exposed); it might be helpful if you joined the nfs-ganesha-devel mailing list to discuss further?

as regards file locking...

> 	In addition, what is the plan for flock in rgw file? It is important for
> 	an NFS cluster.

it's pretty simple to implement (and materialize) locks the way we do other attrs; the question is whether they are useful in the current atomic update model, and if so, which ones (whole-file?) and with what semantics (e.g., would such locks be mandatory [permitted in NFSv4.1+], and would they block renames?); btw, on that point, I have little interest in implementing the messy edges of NFS vs. POSIX semantics in general

xattrs ARE coming too, as there is an IETF draft and a prototype implementation of protocol xattrs which would use them; on the xattr topic, while we're on it, at least one nfs-s3 implementation I'm aware of handles attributes with an extra-protocol mechanism; we haven't really thought about that at all, have you folks?

> 
> Chen Min
> 
> -----Original Message-----
> From: Matt Benjamin [mailto:mbenjamin@xxxxxxxxxx]
> Sent: September 9, 2016 22:17
> To: 陈敏 <chenmin@xxxxxxxx>
> Cc: The Sacred Order of the Squid Cybernetic <ceph-devel@xxxxxxxxxxxxxxx>
> Subject: rename PR
> 
> Hi Chen,
> 
> I wanted to let you know, I merged your exact-match PR.  Now, I suspect that
> you're also not running a recent-enough version of nfs-ganesha, because I
> think that the rename issue you fixed wouldn't easily reproduce if you were.
> 
> A key point I wanted to highlight is that it's part of the scaling (and ha,
> and...) strategy that our nfs file handles are name-stable, rather than
> arbitrary values.  One implication of that is that when a file is renamed,
> the renamed object has a different file id and hence NFS file handle value
> than it did before the rename.  We use the parent directory's change
> attribute to ensure that clients that had the vnode cached see an
> invalidate.  (Nothing in your PR contradicts that, of course.)
> 
> Another strategic decision we made is, we don't rename directories (just
> stating it for posterity). :)
> 
> Cheers,
> 
> Matt
> 
> --
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
> 
> http://www.redhat.com/en/technologies/storage
> 
> tel.  734-707-0660
> fax.  734-769-8938
> cel.  734-216-5309
> 

-- 
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-707-0660
fax.  734-769-8938
cel.  734-216-5309