Re: [RFC PATCH v2] implement orangefs_readahead

OK... I used David's suggestion, and also put it right in
orangefs_readahead, so orangefs_readahead_cleanup is gone.

It seems to me to work great; I ran it with some printks in it
and watched it do what I think it ought to.

Here's an example of what's upstream in
5.11.8-200.fc33.x86_64:

# dd if=/pvfsmnt/z1 of=/dev/null bs=4194304 count=30
30+0 records in
30+0 records out
125829120 bytes (126 MB, 120 MiB) copied, 5.77943 s, 21.8 MB/s

And here's this version of orangefs_readahead on top of
5.12.0-rc4:

# dd if=/pvfsmnt/z1 of=/dev/null bs=4194304 count=30
30+0 records in
30+0 records out
125829120 bytes (126 MB, 120 MiB) copied, 0.325919 s, 386 MB/s

So now we're getting somewhere :-). I hope readahead_expand
will be upstream soon.

I plan to use inode->i_size and the offset to decide how much
expansion is needed on each call to orangefs_readahead.
I hope looking at i_size isn't one of those race condition
things I'm always screwing up on.

If y'all think the orangefs_readahead below is an OK starting
point, I'll add in the i_size/offset logic so I can get
full-sized orangefs gulps of readahead all the way up to
the last whatever-sized fragment of the file, and then run
xfstests on it to see if it still seems to be doing the right thing.
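
Something like this is what I have in mind for the i_size/offset
expansion logic. It's just a sketch: the helper name, the clamp, and
the hard-coded 524288 gulp size are placeholders, and I haven't
convinced myself yet that reading i_size here is race-free.

static void orangefs_readahead_expand_sketch(struct readahead_control *rac)
{
	struct inode *inode = rac->file->f_mapping->host;
	loff_t isize = i_size_read(inode);
	loff_t new_start = readahead_pos(rac);
	size_t new_len = 524288;	/* one full orangefs gulp */

	if (new_start >= isize)
		return;

	/* Don't expand past EOF; trim to the last fragment of the file. */
	if (new_len > isize - new_start)
		new_len = isize - new_start;

	readahead_expand(rac, new_start, new_len);
}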

One day, when it is possible, I wish I could figure out how to use
huge pages or something; copying 1024 pages at a time out of the
orangefs internal buffer into the page cache is probably slower than
if I could figure out a way to copy 4194304 bytes out of our buffer
into the page cache at once...

Matthew>> but given that we're talking about doing I/O, probably
Matthew>> not enough to care about.

With orangefs that's almost ALL we care about.

Thanks for your help!

-Mike

static void orangefs_readahead(struct readahead_control *rac)
{
	unsigned int npages;
	loff_t offset;
	struct iov_iter iter;
	struct file *file = rac->file;
	struct inode *inode = file->f_mapping->host;

	struct xarray *i_pages;
	pgoff_t index;
	struct page *page;

	int ret;

	loff_t new_start = readahead_index(rac) * PAGE_SIZE;
	size_t new_len = 524288;
	readahead_expand(rac, new_start, new_len);

	npages = readahead_count(rac);
	offset = readahead_pos(rac);
	i_pages = &file->f_mapping->i_pages;

	iov_iter_xarray(&iter, READ, i_pages, offset, npages * PAGE_SIZE);

	/* read in the pages. */
	ret = wait_for_direct_io(ORANGEFS_IO_READ, inode, &offset, &iter,
				 npages * PAGE_SIZE, inode->i_size, NULL,
				 NULL, file);

	/* clean up. */
	while ((page = readahead_page(rac))) {
		page_endio(page, false, 0);
		put_page(page);
	}
}
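
And for completeness, hooking it in should just be a matter of adding
the .readahead op to our address_space_operations, roughly like this
sketch (only readpage shown alongside it, the rest of the ops elided):

static const struct address_space_operations orangefs_address_operations = {
	.readpage	= orangefs_readpage,
	.readahead	= orangefs_readahead,
	/* ...the rest of the ops stay as they are... */
};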

On Sat, Mar 27, 2021 at 9:57 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Sat, Mar 27, 2021 at 08:31:38AM +0000, David Howells wrote:
> > However, in Mike's orangefs_readahead_cleanup(), he could replace:
> >
> >       rcu_read_lock();
> >       xas_for_each(&xas, page, last) {
> >               page_endio(page, false, 0);
> >               put_page(page);
> >       }
> >       rcu_read_unlock();
> >
> > with:
> >
> >       while ((page = readahead_page(ractl))) {
> >               page_endio(page, false, 0);
> >               put_page(page);
> >       }
> >
> > maybe?
>
> I'd rather see that than open-coded use of the XArray.  It's mildly
> slower, but given that we're talking about doing I/O, probably not enough
> to care about.


