Re: readahead on directories

On 4/21/2010 12:12 PM, Jamie Lokier wrote:
> Asynchronous is available: Use clone or pthreads.

Synchronous in another process is not the same as async.  It seems I'm
going to have to do this for now as a workaround, but one of the reasons
that aio was created was to avoid the inefficiencies this introduces.
Why create a new thread context, switch to it, put a request in the
queue, then sleep, when you could just drop the request in the queue in
the original thread and move on?
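
For illustration, a minimal sketch of that workaround, assuming one
helper per directory.  fork() stands in here for clone or pthreads, and
the names walk_dir() and prefetch_dirs() are made up for this example.
Each helper still does an ordinary blocking readdir() walk; the only
gain is that the walks overlap, and every one of them pays for a new
context.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Helper side: a plain, blocking walk of one directory.  The entries are
 * discarded; the only goal is to pull the directory blocks into the cache.
 * Never returns to the caller. */
static void walk_dir(const char *path)
{
    DIR *d = opendir(path);

    if (d == NULL)
        _exit(1);
    while (readdir(d) != NULL)
        ;                       /* discard entries */
    closedir(d);
    _exit(0);
}

/* Fork one helper per directory so the blocking reads overlap, then reap
 * the helpers.  Returns how many helpers walked their directory
 * successfully.  Sketch only: assumes the caller has no other children. */
int prefetch_dirs(const char *const *paths, size_t n)
{
    size_t started = 0;
    int ok = 0;

    assert(paths != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid == 0)
            walk_dir(paths[i]); /* child: does not return */
        if (pid > 0)
            started++;
    }

    for (size_t i = 0; i < started; i++) {
        int status;

        if (wait(&status) > 0 && WIFEXITED(status) &&
            WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

The caller here waits for the helpers only to keep the sketch short; a
real prefetcher would carry on with other work and reap them later.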

> A quick skim of fs/{ext3,ext4}/dir.c finds a call to
> page_cache_sync_readahead.  Doesn't that do any reading ahead? :-)

Unfortunately that does not help, because it is synchronous: the process
still sleeps until the blocks it needs have been fetched.  I believe that
code ends up doing a single 4 KB read if the directory is no larger
than that; if it is larger, it reads up to readahead_size.  It puts the
request in the queue and then sleeps until all of the data has been read,
even if only the first 4 KB was required before readdir() could return.

This means that a single thread calling readdir() is still going to
block reading the directory before it can move on to trying to read
other directories that are also needed.

> I/O is probably the biggest cost, so it's more important to get
> the I/O pattern you want than worrying about return values you'll discard.

True, but it would be nice not to waste cpu cycles copying unneeded data
around.

> If not, fs/ext4/namei.c:ext4_dir_inode_operations points to
> ext4_fiemap.  So you may have luck calling FIEMAP or FIBMAP on the
> directory, and then reading blocks using the block device.  I'm not
> sure if the cache loaded via the block device (when mounted) will then
> be used for directory lookups.

Yes, I had considered that.  ureadahead already uses libext2fs to open
the block device and read the inode tables, so that they are already
in the cache for later use.  It seems a bit silly to do that though,
when that is exactly what readahead() SHOULD do for you.