[RFC PATCH] implement orangefs_readahead

A few weeks ago Matthew Wilcox helped me see how
read_pages() in mm/readahead.c was dropping down into
some code designed to take over for filesystems
that don't implement ->readahead, and how this "failover"
code was getting in the way of the readahead-like code I'd put
into orangefs_readpage.

I studied a bunch of readahead code in fs and mm and other
filesystems and came up with this patch that seems to work
in the tests I've done so far. Sometimes code I like is
instead irritating to Linus or Al Viro :-), so hopefully some
of y'all will look over what I've got here. There are a
couple of printks I've left in orangefs_readpage that don't
belong upstream, but they help me now for testing...
Besides the diff at the end of this message, the code is
at: https://github.com/hubcapsc/linux/tree/readahead

I wish I knew how to specify _nr_pages in the readahead_control
structure so that all the extra pages I need could be obtained
in readahead_page instead of part there and the rest in my
open-coded stuff in orangefs_readpage. But it looks to me as
if values in the readahead_control structure are set heuristically
outside of my control over in ondemand_readahead?
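
To put a rough number on "all the extra pages I need", here is a minimal userspace sketch (not kernel code) that mirrors the fill loop condition from orangefs_readpage in the diff below. It assumes 4096-byte pages and that "remaining" starts out equal to read_size; both are assumptions, as are the names extra_pages_needed, pages_to_allocate and the SKETCH_* macros, which exist only for illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for values used in fs/orangefs/inode.c. */
#define SKETCH_PAGE_SIZE 4096   /* assumption: 4 KiB pages */
#define SKETCH_READ_SIZE 524288 /* read_size as set in the patch */

/*
 * Count how many times the fill loop would run, i.e. how many pages
 * beyond the first one a single read of "remaining" bytes covers.
 * Mirrors: while ((remaining - PAGE_SIZE) >= PAGE_SIZE)
 */
static int extra_pages_needed(int remaining, int page_size)
{
	int extra = 0;

	assert(page_size > 0);
	while ((remaining - page_size) >= page_size) {
		remaining -= page_size;
		extra++;
	}
	return extra;
}

/*
 * Of those extra pages, how many would be left for the open-coded
 * path to allocate if readahead had already supplied ra_pages of them?
 */
static int pages_to_allocate(int extra, int ra_pages)
{
	return ra_pages >= extra ? 0 : extra - ra_pages;
}
```

Under these assumptions, extra_pages_needed(SKETCH_READ_SIZE, SKETCH_PAGE_SIZE) comes to 127, so a full read covers 128 pages including the one readpage was called for. If readahead had supplied, say, 32 of them (an arbitrary figure), pages_to_allocate(127, 32) leaves 95 for the open-coded path.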

[root@vm3 linux]# git diff master..readahead
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index 48f0547d4850..682a968cb82a 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -244,6 +244,25 @@ static int orangefs_writepages(struct address_space *mapping,

 static int orangefs_launder_page(struct page *);

+/*
+ * Prefill the page cache with some pages that we're probably
+ * about to need...
+ */
+static void orangefs_readahead(struct readahead_control *rac)
+{
+       pgoff_t index = readahead_index(rac);
+       struct page *page;
+
+       while ((page = readahead_page(rac))) {
+               prefetchw(&page->flags);
+               unlock_page(page);
+               put_page(page);
+               index++;
+       }
+
+       return;
+}
+
 static int orangefs_readpage(struct file *file, struct page *page)
 {
        struct inode *inode = page->mapping->host;
@@ -260,11 +279,16 @@ static int orangefs_readpage(struct file *file, struct page *page)
        int remaining;

        /*
-        * Get up to this many bytes from Orangefs at a time and try
-        * to fill them into the page cache at once. Tests with dd made
-        * this seem like a reasonable static number, if there was
-        * interest perhaps this number could be made setable through
-        * sysfs...
+        * Orangefs isn't a good fit for reading files one page at
+        * a time. Get up to "read_size" bytes from Orangefs at a time and
+        * try to fill them into the page cache at once. Readahead code in
+        * mm already got us some extra pages by calling orangefs_readahead,
+        * but it doesn't know how many we actually wanted, so we'll
+        * get some more after we use up the extra ones we got from
+        * orangefs_readahead. Tests with dd made "read_size" seem
+        * like a reasonable static number of bytes to get from orangefs,
+        * if there was interest perhaps "read_size" could be made
+        * settable through sysfs or something...
         */
        read_size = 524288;

@@ -302,31 +326,19 @@ static int orangefs_readpage(struct file *file, struct page *page)
                slot_index = 0;
                while ((remaining - PAGE_SIZE) >= PAGE_SIZE) {
                        remaining -= PAGE_SIZE;
-                       /*
-                        * It is an optimization to try and fill more than one
-                        * page... by now we've already gotten the single
-                        * page we were after, if stuff doesn't seem to
-                        * be going our way at this point just return
-                        * and hope for the best.
-                        *
-                        * If we look for pages and they're already there is
-                        * one reason to give up, and if they're not there
-                        * and we can't create them is another reason.
-                        */

                        index++;
                        slot_index++;
-                       next_page = find_get_page(inode->i_mapping, index);
+                       next_page = find_lock_page(inode->i_mapping, index);
                        if (next_page) {
-                               gossip_debug(GOSSIP_FILE_DEBUG,
-                                       "%s: found next page, quitting\n",
-                                       __func__);
-                               put_page(next_page);
-                               goto out;
+                               printk("%s: found readahead page\n", __func__);
+                       } else {
+                               next_page =
+                                       find_or_create_page(inode->i_mapping,
+                                                               index,
+                                                               GFP_KERNEL);
+                               printk("%s: alloced my own page\n", __func__);
                        }
-                       next_page = find_or_create_page(inode->i_mapping,
-                                                       index,
-                                                       GFP_KERNEL);
                        /*
                         * I've never hit this, leave it as a printk for
                         * now so it will be obvious.
@@ -659,6 +671,7 @@ static ssize_t orangefs_direct_IO(struct kiocb *iocb,
 /** ORANGEFS2 implementation of address space operations */
 static const struct address_space_operations orangefs_address_operations = {
        .writepage = orangefs_writepage,
+       .readahead = orangefs_readahead,
        .readpage = orangefs_readpage,
        .writepages = orangefs_writepages,
        .set_page_dirty = __set_page_dirty_nobuffers,
diff --git a/fs/orangefs/orangefs-mod.c b/fs/orangefs/orangefs-mod.c
index 74a3d6337ef4..cd7297815f91 100644
--- a/fs/orangefs/orangefs-mod.c
+++ b/fs/orangefs/orangefs-mod.c
@@ -31,7 +31,7 @@ static ulong module_parm_debug_mask;
 __u64 orangefs_gossip_debug_mask;
 int op_timeout_secs = ORANGEFS_DEFAULT_OP_TIMEOUT_SECS;
 int slot_timeout_secs = ORANGEFS_DEFAULT_SLOT_TIMEOUT_SECS;
-int orangefs_cache_timeout_msecs = 50;
+int orangefs_cache_timeout_msecs = 500;
 int orangefs_dcache_timeout_msecs = 50;
 int orangefs_getattr_timeout_msecs = 50;


