Re: NFSv4/pNFS possible POSIX I/O API standards

Christoph Hellwig wrote:
Ulrich, this in reply to these API proposals:

I know the documents. The HECWG was actually supposed to submit an actual draft to the OpenGroup-internal working group but I haven't seen anything yet. I'm not opposed to getting real-world experience first.


So other than this "lite" version of the readdirplus() call, and this idea of making the flags indicate validity rather than accuracy, are there other comments on the directory-related calls? I understand that they might or might not ever make it in, but assuming they did, what other changes would you like to see?

I don't think an accuracy flag is useful at all. Programs don't want to use fuzzy information. If you want a fast 'ls -l' then add a mode which doesn't print the fields which are not provided. Don't provide outdated information. Similarly for other programs.


statlite needs to separate the flag for valid fields from the actual
stat structure and reuse the existing stat(64) structure.  stat lite
needs to at least get a better name, even better be folded into *statat*,
either by having a new AT_VALID_MASK flag that enables a new
unsigned int valid argument or by folding the valid flags into the AT_
flags.

Yes, this is also my pet peeve with this interface. I don't want to have another data structure. Especially since programs might want to store the value in places where normal stat results are returned.

And also yes on 'statat'. I strongly suggest to define only a statat variant. In the standards group I'll vehemently oppose the introduction of yet another superfluous non-*at interface.

As for reusing the existing statat interface and magically adding another parameter through ellipsis: no. We need to become more type-safe. The user-level interface needs to be a new one. For the system call there is no such restriction. We can indeed extend the existing syscall. We have appropriate checks for the validity of the flags parameter in place which make such calls backward compatible.



I think having a stat lite variant is pretty much consensus, we just need
to fine tune the actual API - and of course get a reference implementation.
So if you want to get this going try to implement it based on
http://marc.theaimsgroup.com/?l=linux-fsdevel&m=115487991724607&w=2.
Bonus points for actually making use of the flags in some filesystems.

I don't like that approach. The flag parameter should be exclusively an output parameter. By default the kernel should fill in all the fields it has access to. If access is not easily possible then set the bit and clear the field. There are of course certain fields which always should be added. In the proposed man page these are already identified (i.e., those before the st_litemask member).
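
To make the output-only idea concrete, here is a minimal sketch in C. Everything named *_sketch or SKETCH_* is hypothetical and not part of any proposal or existing API; the function simply wraps the real fstatat() call, which always fills in every field, so the mask it reports is always empty. The point is only the shape of the interface: an ordinary struct stat plus a separate, typed output mask in which a set bit means "field not provided, value cleared".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical mask bits: a set bit means the field was NOT filled in. */
#define SKETCH_NO_SIZE   0x01u
#define SKETCH_NO_BLOCKS 0x02u
#define SKETCH_NO_ATIME  0x04u
#define SKETCH_NO_MTIME  0x08u
#define SKETCH_NO_CTIME  0x10u

/*
 * Hypothetical user-level entry point (an assumption for illustration).
 * It reuses the normal struct stat and reports missing fields through a
 * separate output parameter instead of a new structure or an ellipsis.
 */
static int statat_sketch(int dirfd, const char *path, struct stat *st,
                         unsigned int *missing, int flags)
{
    assert(st != NULL && missing != NULL);

    memset(st, 0, sizeof *st);  /* fields that are not provided stay zero */
    *missing = 0;               /* pure output: the input value is ignored */

    if (fstatat(dirfd, path, st, flags) != 0)
        return -1;

    /*
     * fstatat() provides everything, so nothing is marked missing here.
     * A filesystem that could not cheaply produce the size would instead
     * clear st->st_size and set SKETCH_NO_SIZE in *missing.
     */
    return 0;
}

/* A caller such as a fast 'ls -l' skips fields that were not provided. */
static long long size_or_minus_one(const struct stat *st, unsigned int missing)
{
    return (missing & SKETCH_NO_SIZE) ? -1LL : (long long)st->st_size;
}
```

A caller would pass AT_FDCWD and a path, then test the mask before printing each optional column; the struct stat itself can be stored anywhere a normal stat result is stored.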


At the actual
C prototype level I would rename d_stat_err to d_stat_errno for consistency
and maybe drop the readdirplus() entry point in favour of readdirplus_r
only - there is no point in introducing new non-reentrant APIs today.

No, readdirplus should be kept (and yes, readdirplus_r must be added). The reason is that the readdir_r interface is only needed if multiple threads use the _same_ DIR stream. This is hardly ever the case. Forcing everybody to use the _r variant means that we unconditionally have to copy the data into the user-provided buffer. With readdir there is the possibility to just pass back a pointer into the internal buffer read into by getdents. This is how readdir works for most kernel/arch combinations.
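
The copy-versus-pointer difference can be shown with a toy directory stream. This is only an illustration under stated assumptions: toy_dir, toy_entry, toy_read and toy_read_r are made-up names, and the fixed array stands in for the buffer that getdents fills. It is not how any libc is actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins (hypothetical): one entry and a stream with its own buffer. */
struct toy_entry {
    unsigned long ino;
    char name[32];
};

struct toy_dir {
    struct toy_entry buf[4];  /* plays the role of the getdents buffer */
    size_t count;             /* number of valid entries in buf */
    size_t pos;               /* next entry to hand out */
};

/* readdir style: return a pointer into the stream's buffer, no copy. */
static const struct toy_entry *toy_read(struct toy_dir *d)
{
    assert(d != NULL);
    return d->pos < d->count ? &d->buf[d->pos++] : NULL;
}

/*
 * readdir_r style: the entry must be copied into caller-provided storage
 * on every call. Returns 0; *result is NULL at the end of the stream.
 */
static int toy_read_r(struct toy_dir *d, struct toy_entry *out,
                      struct toy_entry **result)
{
    const struct toy_entry *e = toy_read(d);

    if (e == NULL) {
        *result = NULL;
        return 0;
    }
    memcpy(out, e, sizeof *out);  /* the unconditional copy */
    *result = out;
    return 0;
}
```

With a larger dirent_plus-style entry the memcpy grows accordingly, which is why the layout handed back by the kernel has to match the user-visible structure if the pointer-returning variant is to stay cheap.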

This requires that the dirent_plus structure matches so it's important to get it right. I'm not comfortable with the current proposal. Yes, having ordinary dirent and stat structure in there is a plus. But we have overlap:

-  d_ino and st_ino

-  d_type and parts of st_mode

And we have superfluous information

- st_dev, the same for all entries, at least this is what readdir
  assumes
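
The overlap can be sketched in C as follows. The structure below is an assumption modelled on the description in this thread (a dirent, a stat and an error field); the name dirent_plus_sketch and the member names d_dirent and d_stat are chosen for illustration, and d_type is a Linux/BSD extension rather than a POSIX member. The helper names are hypothetical as well.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <sys/stat.h>

/* Assumed layout for illustration only; not the proposal's definition. */
struct dirent_plus_sketch {
    struct dirent d_dirent;  /* carries d_ino and d_type */
    struct stat   d_stat;    /* carries st_ino, st_mode, st_dev */
    int           d_stat_err;
};

/* d_type is derivable from the file-type bits of st_mode. */
static unsigned char type_from_mode(mode_t mode)
{
    if (S_ISREG(mode))  return DT_REG;
    if (S_ISDIR(mode))  return DT_DIR;
    if (S_ISLNK(mode))  return DT_LNK;
    if (S_ISCHR(mode))  return DT_CHR;
    if (S_ISBLK(mode))  return DT_BLK;
    if (S_ISFIFO(mode)) return DT_FIFO;
    if (S_ISSOCK(mode)) return DT_SOCK;
    return DT_UNKNOWN;
}

/* Nonzero if the dirent half only repeats what the stat half already says. */
static int dirent_duplicates_stat(const struct dirent_plus_sketch *e)
{
    assert(e != NULL);
    if (e->d_stat_err != 0)
        return 0;  /* stat half is not usable, nothing to compare */
    return e->d_dirent.d_ino == e->d_stat.st_ino &&
           e->d_dirent.d_type == type_from_mode(e->d_stat.st_mode);
}
```

For an entry whose stat succeeded, the check above is expected to hold in the common case, which is the redundancy a purpose-built structure could avoid.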

I haven't made up my mind yet whether this is enough reason to introduce a new type which isn't made up of the two structures.


And one last point: I haven't seen any discussion of why readdirplus should do the equivalent of stat while there is no 'statlite' variant of it. Are all the places where readdir is used either non-critical for performance or dependent on accurate information?

--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖