Re: RFC: return d_type for non-plus READDIR

Hi Geert -

> On Mar 23, 2021, at 9:47 PM, Geert Jansen <gerardu@xxxxxxxxxx> wrote:
> 
> On Tue, Mar 23, 2021 at 03:26:02PM +0000, Chuck Lever III wrote:
> 
>>> Since not all file servers may be able to produce the directory entry type
>>> efficiently, this could be implemented as a mount option that defaults off.
>> 
>> Can you say more about the impact of requesting this attribute
>> from servers that cannot efficiently provide it? Which servers
>> and filesystems find it a problem, and how much of a problem is
>> it?
> 
> The ability to satisfy a non-plus READDIR by reading just the directory
> pages, instead of having to read all dirent inodes as well, can be worth it
> for certain use cases (especially those with large directories). If a file
> system does not store d_type in the directory, and the client would always
> request the type attribute even for non-plus READDIR, then you lose the
> ability to make this optimization.
> 
> From a review of the man pages, most local file systems appear to be able to
> store d_type within the directory, including ext4, xfs and zfs. Both ext4
> and xfs have options to turn this behavior off. If you'd export such a file
> system using nfsd, then this would cause additional IO on the file system if
> we would always request the type attribute.
> 
> I do not know how other commercial servers handle this.

"How much of a problem is it" -- I guess what I really want to
see is some quantification of the problem, in numbers.

- Exactly which workloads benefit from having the DT information?
- How much do they improve?
- Which workloads are negatively impacted, and how much?
- How are workloads impacted if the client requests DT
  information from servers that cannot support it efficiently?

Seems to me there will be some caching effects -- there are at
least two caches between the server's persistent storage and the
application. So I expect this will be a complex situation, at
best.

I totally agree that directory operations are a performance
and scalability sore spot for NFS, so I personally am interested
in hearing any and all suggestions in this area. In this case,
the proposed mechanism is intriguing and sensible, but I would
suggest that without measurement data, the proposal seems
incomplete so far.
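
As a starting point for collecting such numbers, here is a minimal
user-space sketch, not a proper benchmark. It times three ways of walking
one directory: names only, classification via os.scandir() (which can use
d_type when the file system or NFS client supplies it, and otherwise falls
back to a per-entry lstat), and classification that always forces an lstat
per entry. The function names are mine and purely illustrative. Because of
the caching effects mentioned above, each variant would need to be run
against both cold and warm caches.

```python
import os
import stat
import time


def list_names(path):
    """Names only: no type information is requested or used."""
    return os.listdir(path)


def classify_with_dtype(path):
    """Classify entries with os.scandir().

    is_dir(follow_symlinks=False) is answered from d_type when it is
    known; when d_type is DT_UNKNOWN, Python falls back to an lstat().
    """
    with os.scandir(path) as entries:
        return {e.name: e.is_dir(follow_symlinks=False) for e in entries}


def classify_with_lstat(path):
    """Classify entries with one lstat() each: the cost paid without d_type."""
    result = {}
    for name in os.listdir(path):
        mode = os.lstat(os.path.join(path, name)).st_mode
        result[name] = stat.S_ISDIR(mode)
    return result


def timed(fn, path):
    """Return (elapsed_seconds, result) for a single call of fn(path)."""
    start = time.perf_counter()
    result = fn(path)
    return time.perf_counter() - start, result
```

Calling timed(classify_with_dtype, some_dir) and
timed(classify_with_lstat, some_dir) on the same NFS-mounted directory, with
and without the proposed change, would give a first rough comparison.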


>> I'd rather avoid adding another administrative knob unless it is
>> absolutely necessary... are there other options for controlling
>> whether the client requests this attribute?
>> 
>> For example, is there a way for a server to decide not to provide
>> it if it would be burdensome to do so? ie, the client always asks,
>> but it would be up to the server to provide it if it can do so.
> 
> I looked in the RFCs but I am not sure if there is a way today? Both 4.0 and
> 4.1 define "type" as a required attribute that needs to be returned if the
> client asks for it. There also does not appear to be an enum value
> corresponding to DT_UNKNOWN. Were you thinking about something specifically?

I wasn't thinking of a particular protocol mechanism, though that
is certainly a possibility. I'm more interested in seeing if there
are ways to enable the proposed improvement without adding more
administrative complexity. A mount option is yet one more thing
that can be set incorrectly and has to be maintained in perpetuity.

So, alternatives might be:
- Always requesting the DT information
- Leveraging an existing mount option, like lookupcache=
- A sysfs setting or a module parameter
- A heuristic to guess when requesting the information is harmful
- Enabling the request based on directory size or some other static
  feature of the directory
- If this information is of truly great benefit, approaching server
  vendors to support it efficiently, and then have it always enabled
  on clients

Adding an administrative knob suggests we don't yet have a good
understanding of how this setting is going to work. As an experimental
feature, this
is a great way to go, but for a permanent, long-term thing, let's keep
in mind that client administration is a resource that has to scale
well to cohorts of 100s of thousands of systems. The simpler and more
automatic we can make it, the better off it will be for everyone.


> If there's no way to do this today, then I guess a per-file system attribute
> that indicates support for "can produce file type efficiently when reading a
> directory" would be a relatively clean solution. I presume it would
> require an RFC to define this attribute. Would you have a recommendation given
> your experience with the RFC process?

My recommendation is to look for other alternatives first ;-)

It can't hurt to ask for advice from the nfsv4 working group, but
I would go in armed with some performance numbers.

--
Chuck Lever






