Fwd: problem with orangefs readpage...

Oops... my non-work email doesn't default to text only, so this
bounced to the list...

---------- Forwarded message ---------
From: Mike Marshall <hubcapsc@xxxxxxxxx>
Date: Fri, Jan 1, 2021 at 5:15 PM
Subject: Re: problem with orangefs readpage...
To: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Mike Marshall <hubcap@xxxxxxxxxxxx>, linux-fsdevel
<linux-fsdevel@xxxxxxxxxxxxxxx>



Hi Matthew... Thanks so much for the suggestions!

> This is some new version of orangefs_readpage(), right?

No, that code has been upstream for a while... that readahead_control
thing looks very interesting :-) ...

-Mike

On Thu, Dec 31, 2020 at 11:08 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Thu, Dec 31, 2020 at 04:51:53PM -0500, Mike Marshall wrote:
> > Greetings...
> >
> > I hope some of you will suffer through reading this long message :-) ...
>
> Hi Mike!  Happy New Year!
>
> > Orangefs isn't built to do small IO. Reading a
> > big file in page cache sized chunks is slow and painful.
> > I tried to write orangefs_readpage so that it would do a reasonable
> > sized hard IO, fill the page that was being called for, and then
> > go ahead and fill a whole bunch of the following pages into the
> > page cache with the extra data in the IO buffer.
>
> This is some new version of orangefs_readpage(), right?  I don't see
> anything resembling this in the current codebase.  Did you disable
> orangefs_readpages() as part of this work?  Because the behaviour you're
> describing sounds very much like what the readahead code might do to a
> filesystem which implements readpage and neither readahead nor readpages.
>
> > orangefs_readpage gets called for the first four pages and then my
> > prefill kicks in and fills the next pages and the right data ends
> > up in /tmp/nine. I, of course, wished and planned for orangefs_readpage
> > to only get called once, I don't understand why it gets called four
> > times, which results in three extraneous expensive hard IOs.
>
> I might suggest some judicious calling of dump_stack() to understand
> exactly what's calling you.  My suspicion is that it's this loop in
> read_pages():
>
>                 while ((page = readahead_page(rac))) {
>                         aops->readpage(rac->file, page);
>                         put_page(page);
>                 }
>
> which doesn't test for PageUptodate before calling you.
>
> It'd probably be best if you implemented ->readahead, which has its own
> ideas about which pages would be the right ones to read.  It's not always
> correct, but generally better to have that logic in the VFS than in each
> filesystem.
>
> You probably want to have a look at Dave Howells' work to allow
> the filesystem to expand the ractl:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
>
> specifically this patch:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/commit/?h=fscache-iter&id=f582790b32d5d1d8b937df95a8b2b5fdb8380e46


