Hi Jan,

Thanks for the review.

On Tue, Jan 14, 2025 at 6:09 PM Jan Kara <jack@xxxxxxx> wrote:
>
> Hello!
>
> On Tue 14-01-25 09:08:38, Shyam Prasad N wrote:
> > The Linux kernel does buffered reads and writes using the page cache
> > layer, where the filesystem reads and writes are offloaded to the
> > VM/MM layer. The VM layer does a predictive readahead of data by
> > optionally asking the filesystem to read more data asynchronously than
> > what was requested.
> >
> > The VFS layer maintains a dentry cache which gets populated during
> > access of dentries (either during readdir/getdents or during lookup).
> > This dentries within a directory actually forms the address space for
> > the directory, which is read sequentially during getdents. For network
> > filesystems, the dentries are also looked up during revalidate.
> >
> > During sequential getdents, it makes sense to perform a readahead
> > similar to file reads. Even for revalidations and dentry lookups,
> > there can be some heuristics that can be maintained to know if the
> > lookups within the directory are sequential in nature. With this, the
> > dentry cache can be pre-populated for a directory, even before the
> > dentries are accessed, thereby boosting the performance. This could
> > give even more benefits for network filesystems by avoiding costly
> > round trips to the server.
> >
> > NFS client already does a simplistic form of this readahead by
> > maintaining an address space for the directory inode and storing the
> > dentry records returned by the server in this space. However, this
> > dentry access mechanism is so generic that I feel that this can be a
> > part of the VFS/VM layer, similar to buffered reads of a file. Also,
> > VFS layer is better equipped to store heuristics about dentry access
> > patterns.
>
> Interesting idea. Note that individual filesystems actually do directory
> readahead on their own. They just don't readahead 'struct dentry' but
> rather issue readahead for metadata blocks to get into cache which is what
> takes most time. Readahead makes the most sense for readdir() (or
> getdents() as you call it) calls where the filesystem driver has all the
> information it needs (unlike VFS) for performing efficient readahead. So
> here I'm not sure there's much need for a change.

I agree that the filesystem driver can do this. But the logic for
"advising" how many dentries to read ahead may depend on the workload
rather than on the filesystem itself. Most practical use cases would
readdir the entire directory, but there could be use cases where only
part of a directory is read.

>
> I'm not against some form of readahead for ->lookup calls but we'd have to
> very carefully design the heuristics for detecting some kind of pattern of
> ->lookup calls so that we know which entry is going to be the next one
> looked up and evaluate whether it is actually an overall win or not. So
> for this the discussion would need a more concrete proposal to be useful I
> think.

Acked.
Simplistically, the whole directory could be read once the number of
dentry revalidations or lookups that missed the cache, but were
successfully loaded from the backend, exceeds a certain threshold (I can
see how this threshold could be filesystem specific). There could be
other, more sophisticated implementations.
Let me think this through (and read the other comments) and see if I
can refine the idea further.

>
>                                                                 Honza
> --
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR

-- 
Regards,
Shyam
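
P.S. To make the simplistic heuristic above a bit more concrete, here is
a rough userspace sketch in C. It is only an illustration under my own
assumptions: struct dir_ra_state, dir_ra_init() and dir_ra_note_lookup()
are hypothetical names, not existing kernel interfaces, and the
threshold is an arbitrary tunable that a filesystem could override.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical per-directory readahead state (illustration only,
 * not an existing kernel structure).
 */
struct dir_ra_state {
	unsigned int backend_misses;	/* cache misses that the backend resolved */
	unsigned int threshold;		/* assumed tunable, possibly fs specific */
	bool readahead_done;		/* whole-directory read already triggered */
};

static void dir_ra_init(struct dir_ra_state *s, unsigned int threshold)
{
	assert(s != NULL);
	s->backend_misses = 0;
	s->threshold = threshold;
	s->readahead_done = false;
}

/*
 * Record one lookup or revalidation in this directory.
 *
 * Only lookups that missed the dentry cache but were found on the
 * backend are counted. Returns true exactly once, at the point where
 * the count exceeds the threshold, i.e. when the caller would kick off
 * a read of the whole directory.
 */
static bool dir_ra_note_lookup(struct dir_ra_state *s, bool cache_hit,
			       bool found_on_backend)
{
	if (cache_hit || !found_on_backend || s->readahead_done)
		return false;

	if (++s->backend_misses > s->threshold) {
		s->readahead_done = true;
		return true;
	}
	return false;
}
```

With a threshold of 2, for example, the third qualifying miss returns
true and later misses return false again. A real implementation would
also need to age or reset the counter, and to measure whether the extra
directory read is an overall win, which is the open question raised
above.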