Re: NFSv4/pNFS possible POSIX I/O API standards

At 05:35 AM 11/29/2006, Matthew Wilcox wrote:
On Wed, Nov 29, 2006 at 05:23:13AM -0700, Matthew Wilcox wrote:
> On Wed, Nov 29, 2006 at 09:04:50AM +0000, Christoph Hellwig wrote:
> >  - openg/sutoc
> >
> >     No way.  We already have a very nice file descriptor abstraction.
> >     You can pass file descriptors over unix sockets just fine.
>
> Yes, but it behaves like dup().  Gary replied to me off-list (which I
> didn't notice and continued replying to him off-list).  I wrote:
>
> Is this for people who don't know about dup(), or do they need
> independent file offsets?  If the latter, I think an xdup() would be
> preferable (would there be a security issue for OSes with revoke()?)
> Either that, or make the key be useful for something else.

There is a business case at the Open Group Web site.  It is not a full use case
document though.

For a very small amount of background:

It seems from the discussion that others (at least those working on clustered file systems) have seen the need for statlite- and readdir+-type functions, whatever they might be called and however they might be implemented.

As for openg, the gains have been seen in clustered file systems where you have tens of thousands of processes spread out over thousands of machines. All 100k processes may open the same file and seek to different offsets, sometimes strided through the file and sometimes not. The opens all fire within a few milliseconds or less. This is a problem for large clustered file systems; open times have been seen in the minutes or worse. Quite often the writes all come at once as well. Often they are complicated scatter/gather operations spread out across the entire distributed memory of thousands of machines, and not even in a completely uniform manner.

A little knowledge about the intent of the application goes a long way when you are dealing with 100k-way parallelism. Additionally, having some notion of groups of processes collaborating at the file system level is useful for making informed decisions about the determinism and quality of service you might want to provide, how strictly you want to enforce rules on collaborating processes, etc.
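
To make the strided pattern above concrete, here is a minimal sketch of how one process among many might write its interleaved blocks into a shared file. The helper names (strided_offset, write_strided) and the fixed-size block layout are assumptions for illustration only; they are not part of the proposed extensions. Note that this only addresses where each process writes. It does nothing about the cost of 100k near-simultaneous opens, which is the problem openg is aimed at.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: byte offset of block `i` for process `rank`
 * when `nprocs` processes interleave fixed-size blocks in one file. */
off_t strided_offset(int rank, int nprocs, long i, size_t block_size)
{
    return ((off_t)i * nprocs + rank) * (off_t)block_size;
}

/* Write `nblocks` blocks from `buf` at this rank's strided positions.
 * pwrite() takes an explicit offset, so the descriptor's own file
 * offset is neither consulted nor modified.
 * Returns 0 on success, -1 on error (EINTR is not retried here). */
int write_strided(int fd, int rank, int nprocs,
                  const char *buf, long nblocks, size_t block_size)
{
    for (long i = 0; i < nblocks; i++) {
        const char *p = buf + (size_t)i * block_size;
        size_t left = block_size;
        off_t off = strided_offset(rank, nprocs, i, block_size);

        /* pwrite() may write fewer bytes than requested; loop until done. */
        while (left > 0) {
            ssize_t n = pwrite(fd, p, left, off);
            if (n <= 0)
                return -1;
            p += n;
            left -= (size_t)n;
            off += n;
        }
    }
    return 0;
}
```

With 4 processes and 1024-byte blocks, for example, rank 3 would place its third block (i = 2) at byte offset (2 * 4 + 3) * 1024.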

As for NFS ACLs:
This was going to be a separate extension volume, not associated with the performance portion. It comes up because many users of high-end/clustered file system technology also often work in secure environments and have need-to-know requirements. We were trying to be helpful to the NFSv4 community, which has been kind enough to include these security features in their product.

Additionally, this entire effort is being proposed as an extension, not as a change to the base POSIX I/O API.

We certainly have no religion about how we make progress in helping the
cluster file system people and the NFSv4 people better serve their communities, so all these
comments are very welcome.

Thanks
Gary

I further wonder if these people would see appreciable gains from doing
sutoc rather than doing openat(dirfd, "basename", flags, mode);

If they did, I could also see openat being extended to allow dirfd to
be a file fd, as long as pathname were NULL or a pointer to NUL.

But with all the readx stuff being proposed, I bet they don't really
need independent file offsets.  That's, like, so *1970*s.
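
As a small illustration of the "behaves like dup()" point made earlier in the thread, the sketch below contrasts a dup()ed descriptor, which shares its file offset with the original, against a second open() of the same path, which gets its own offset. The function names are hypothetical and the code assumes `path` names a seekable regular file.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek the original descriptor to `pos`, then report the offset seen
 * through a dup() of it. Returns -1 on error. */
off_t offset_seen_by_dup(const char *path, off_t pos)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t seen = -1;
    int copy = dup(fd);               /* shares the open file description */
    if (copy >= 0) {
        if (lseek(fd, pos, SEEK_SET) == pos)
            seen = lseek(copy, 0, SEEK_CUR);
        close(copy);
    }
    close(fd);
    return seen;
}

/* Same experiment, but the second descriptor comes from a separate
 * open() and therefore has an independent offset. Returns -1 on error. */
off_t offset_seen_by_reopen(const char *path, off_t pos)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t seen = -1;
    int other = open(path, O_RDONLY); /* new open file description */
    if (other >= 0) {
        if (lseek(fd, pos, SEEK_SET) == pos)
            seen = lseek(other, 0, SEEK_CUR);
        close(other);
    }
    close(fd);
    return seen;
}
```

Calls that take an explicit offset, such as pread() and pwrite(), never consult the descriptor's offset at all, which is one reason shared offsets may matter less than they first appear.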

