Greetings... I hope some of you will suffer through reading this
long message :-) ...

Orangefs isn't built to do small IO. Reading a big file in page
cache sized chunks is slow and painful. I tried to write
orangefs_readpage so that it would do a reasonably sized hard IO,
fill the page that was being called for, and then go ahead and fill
a whole bunch of the following pages into the page cache with the
extra data in the IO buffer.

Anywho... I thought this code was working pretty much like I
designed it to work, but on closer inspection I see that it is not,
and I thought I'd ask for some help or suggestions.

Here's the core of the loop in orangefs_readpage that tries to fill
extra pages, and what follows is a description of how it is not
working the way I designed it to work:

    while there's still data in the IO buffer {
            index++;
            slot_index++;
            next_page = find_get_page(inode->i_mapping, index);
            if (next_page) {
                    gossip_debug(GOSSIP_FILE_DEBUG,
                            "%s: found next page, quitting\n",
                            __func__);
                    put_page(next_page);
                    goto out;
            }
            next_page = find_or_create_page(inode->i_mapping,
                                            index,
                                            GFP_KERNEL);
            /*
             * I've never hit this, leave it as a printk for
             * now so it will be obvious.
             */
            if (!next_page) {
                    printk("%s: can't create next page, quitting\n",
                            __func__);
                    goto out;
            }
            kaddr = kmap_atomic(next_page);
            orangefs_bufmap_page_fill(kaddr,
                                      buffer_index,
                                      slot_index);
            kunmap_atomic(kaddr);
            SetPageUptodate(next_page);
            unlock_page(next_page);
            put_page(next_page);
    }

So... my design was that orangefs_readpage would get called, the
needed page would be supplied and a bunch of the following pages
would get filled as well. That way if more pages were needed, they
would be in the page cache already.

My plan "kind of" works when a file is read a page at a time:

/pvfsmnt/nine is nine pages long.

    -rwxr-xr-x. 1 root root 36864 Dec 29 11:09 /pvfsmnt/nine

    dd if=/pvfsmnt/nine of=/tmp/nine bs=4096 count=9

orangefs_readpage gets called for the first four pages and then my
prefill kicks in and fills the next pages and the right data ends up
in /tmp/nine. I, of course, wished and planned for orangefs_readpage
to only get called once. I don't understand why it gets called four
times, which results in three extraneous expensive hard IOs.

A nine page file is just an example. In general, when files are read
a page at a time, orangefs_readpage gets called four times and the
rest of the pages (up to the design limit) are pre-filled.

When a file gets read all at once, though, my design fails in a
different way...

    dd if=/pvfsmnt/nine of=/tmp/nine bs=36864 count=1

In the above, orangefs_readpage gets called nine times, with eight
extraneous expensive hard IOs. Further investigation into larger and
larger block sizes shows a pattern.

I hope it is apparent to some of you why my page-at-a-time reads
don't start pre-filling until after four calls to orangefs_readpage.

Below are some more examples that show what happens with larger and
larger block sizes. Hopefully the pattern there will be suggestive
as well.

/pvfsmnt/N is a file exactly N pages long.

Key:
    orangefs_readpage->X times
      foo, bar, baz, ..., qux

    X   = number of calls to orangefs_readpage.
    foo = number of bytes fetched from Orangefs on the first read.
    bar = number of bytes fetched from Orangefs on the extraneous
          2nd read.
    baz = number of bytes fetched from Orangefs on the extraneous
          3rd read.
    qux = number of bytes fetched from Orangefs on the extraneous
          last read.

dd if=/pvfsmnt/32 of=/tmp/32 bs=131072 count=1
  orangefs_readpage->32 times
    131072, 126976, 122880, ..., 4096
  orangefs_bufmap_page_fill->0 times

dd if=/pvfsmnt/33 of=/tmp/33 bs=135168 count=1
  orangefs_readpage->32 times
    135168, 131072, 126976, ..., 8192
  orangefs_bufmap_page_fill->1 time

dd if=/pvfsmnt/34 of=/tmp/34 bs=139264 count=1
  orangefs_readpage->32 times
    139264, 135168, 131072, ..., 12288
  orangefs_bufmap_page_fill->2 times

dd if=/pvfsmnt/35 of=/tmp/35 bs=143360 count=1
  orangefs_readpage->32 times
    143360, 139264, 135168, ..., 16384
  orangefs_bufmap_page_fill->3 times

dd if=/pvfsmnt/36 of=/tmp/36 bs=147456 count=1
  orangefs_readpage->32 times
    147456, 143360, 139264, ..., 20480
  orangefs_bufmap_page_fill->4 times

dd if=/pvfsmnt/37 of=/tmp/37 bs=151552 count=1
  orangefs_readpage->32 times
    151552, 147456, 143360, ..., 24576
  orangefs_bufmap_page_fill->5 times

dd if=/pvfsmnt/38 of=/tmp/38 bs=155648 count=1
  orangefs_readpage->32 times
    155648, 151552, 147456, ..., 28672
  orangefs_bufmap_page_fill->6 times

dd if=/pvfsmnt/39 of=/tmp/39 bs=159744 count=1
  orangefs_readpage->32 times
    159744, 155648, 151552, ..., 32768
  orangefs_bufmap_page_fill->7 times

dd if=/pvfsmnt/40 of=/tmp/40 bs=163840 count=1
  orangefs_readpage->32 times
    163840, 159744, 155648, ..., 36864
  orangefs_bufmap_page_fill->8 times

dd if=/pvfsmnt/41 of=/tmp/41 bs=167936 count=1
  orangefs_readpage->32 times
    167936, 163840, 159744, ..., 40960
  orangefs_bufmap_page_fill->9 times

  .
  .
  .

dd if=/pvfsmnt/47 of=/tmp/47 bs=192512 count=1
  orangefs_readpage->32 times
    192512, 188416, 184320, ..., 65536
  orangefs_bufmap_page_fill->15 times

dd if=/pvfsmnt/48 of=/tmp/48 bs=196608 count=1
  orangefs_readpage->32 times
    196608, 192512, 188416, ..., 69632
  orangefs_bufmap_page_fill->16 times

dd if=/pvfsmnt/49 of=/tmp/49 bs=200704 count=1
  orangefs_readpage->32 times
    200704, 196608, 192512, ..., 73728
  orangefs_bufmap_page_fill->17 times

  .
  .
  .

dd if=/pvfsmnt/63 of=/tmp/63 bs=258048 count=1
  orangefs_readpage->32 times
    258048, 253952, 249856, ..., 131072
  orangefs_bufmap_page_fill->31 times

dd if=/pvfsmnt/64 of=/tmp/64 bs=262144 count=1
  orangefs_readpage->32 times
    262144, 258048, 253952, ..., 135168
  orangefs_bufmap_page_fill->32 times

dd if=/pvfsmnt/65 of=/tmp/65 bs=266240 count=1
  orangefs_readpage->32 times
    266240, 262144, 258048, ..., 139264
  orangefs_bufmap_page_fill->33 times

  .
  .
  .

dd if=/pvfsmnt/127 of=/tmp/127 bs=520192 count=1
  orangefs_readpage->32 times
    520192, 516096, 512000, ..., 393216
  orangefs_bufmap_page_fill->95 times

dd if=/pvfsmnt/128 of=/tmp/128 bs=524288 count=1
  orangefs_readpage->32 times
    524288, 520192, 516096, ..., 397312
  orangefs_bufmap_page_fill->96 times

It kind of starts over here, since the hard IOs are all 524288 bytes.

    # grep 524288 fs/orangefs/inode.c
            read_size = 524288;

Thanks for any help y'all can give, I'll of course keep on trying
to understand what is going on.

-Mike