Re: problem with orangefs readpage...

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 01, 2021 at 05:15:32PM -0500, Mike Marshall wrote:
> Hi Matthew... Thanks so much for the suggestions!
> > This is some new version of orangefs_readpage(), right?
> No, that code has been upstream for a while... that readahead_control
> thing looks very interesting :-) ...

Oh, my, I was looking at a tree from before 2018 that still had
orangefs_readpages.  So, yes, I think what's happening is that
orangefs_readpage() is being called from the readahead code.
You'll hit this path:

                        next_page = find_get_page(inode->i_mapping, index);
                        if (next_page) {
                                gossip_debug(GOSSIP_FILE_DEBUG,
                                        "%s: found next page, quitting\n",
                                        __func__);
                                put_page(next_page);
                                goto out;

because readahead already allocated those pages for you and is trying
to fill them one-at-a-time.

Implementing ->readahead, even without dhowells' patch to expand
the ractl will definitely help you!

> -Mike
> 
> On Thu, Dec 31, 2020 at 11:08 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> 
> > On Thu, Dec 31, 2020 at 04:51:53PM -0500, Mike Marshall wrote:
> > > Greetings...
> > >
> > > I hope some of you will suffer through reading this long message :-) ...
> >
> > Hi Mike!  Happy New Year!
> >
> > > Orangefs isn't built to do small IO. Reading a
> > > big file in page cache sized chunks is slow and painful.
> > > I tried to write orangefs_readpage so that it would do a reasonable
> > > sized hard IO, fill the page that was being called for, and then
> > > go ahead and fill a whole bunch of the following pages into the
> > > page cache with the extra data in the IO buffer.
> >
> > This is some new version of orangefs_readpage(), right?  I don't see
> > anything resembling this in the current codebase.  Did you disable
> > orangefs_readpages() as part of this work?  Because the behaviour you're
> > describing sounds very much like what the readahead code might do to a
> > filesystem which implements readpage and neither readahead nor readpages.
> >
> > > orangefs_readpage gets called for the first four pages and then my
> > > prefill kicks in and fills the next pages and the right data ends
> > > up in /tmp/nine. I, of course, wished and planned for orangefs_readpage
> > > to only get called once, I don't understand why it gets called four
> > > times, which results in three extraneous expensive hard IOs.
> >
> > I might suggest some judicious calling of dump_stack() to understand
> > exactly what's calling you.  My suspicion is that it's this loop in
> > read_pages():
> >
> >                 while ((page = readahead_page(rac))) {
> >                         aops->readpage(rac->file, page);
> >                         put_page(page);
> >                 }
> >
> > which doesn't test for PageUptodate before calling you.
> >
> > It'd probably be best if you implemented ->readahead, which has its own
> > ideas about which pages would be the right ones to read.  It's not always
> > correct, but generally better to have that logic in the VFS than in each
> > filesystem.
> >
> > You probably want to have a look at Dave Howells' work to allow
> > the filesystem to expand the ractl:
> >
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter
> >
> > specifically this patch:
> >
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/commit/?h=fscache-iter&id=f582790b32d5d1d8b937df95a8b2b5fdb8380e46
> >



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux