Re: [PATCH 3/4] libceph: use kvmalloc to allocate page vector

"Yan, Zheng" <ukernel@xxxxxxxxx> · Fri, 28 Sep 2018 20:58:07 +0800

On Fri, Sep 28, 2018 at 7:50 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>
> On Fri, Sep 28, 2018 at 11:45 AM Yan, Zheng <zyan@xxxxxxxxxx> wrote:
> >
> > large read may require allocating large page vector
> >
> > Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx>
> > ---
> >  net/ceph/pagevec.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c
> > index d3736f5bffec..b1c47ba0c38a 100644
> > --- a/net/ceph/pagevec.c
> > +++ b/net/ceph/pagevec.c
> > @@ -62,7 +62,7 @@ void ceph_release_page_vector(struct page **pages, int num_pages)
> >
> >         for (i = 0; i < num_pages; i++)
> >                 __free_pages(pages[i], 0);
> > -       kfree(pages);
> > +       kvfree(pages);
> >  }
> >  EXPORT_SYMBOL(ceph_release_page_vector);
> >
> > @@ -74,7 +74,7 @@ struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags)
> >         struct page **pages;
> >         int i;
> >
> > -       pages = kmalloc_array(num_pages, sizeof(*pages), flags);
> > +       pages = kvmalloc_array(num_pages, sizeof(*pages), flags);
> >         if (!pages)
> >                 return ERR_PTR(-ENOMEM);
> >         for (i = 0; i < num_pages; i++) {
>
> Have you considered rewriting ceph_sync_read()?  In the end, it does
> a synchronous OSD request per object, it's not atomic in any way, so in
> theory it shouldn't need more than object_size worth of pages.
>

Yes, it's a better. But I'm a little busy recently. No idea when I
will have time.

Regards
Yan, Zheng

> Fengguang's application is doing ~100M reads, and the one he reported
> was likely a ~2G read.  The page pointer array itself is only part of
> problem, allocating 2G worth of pages when we don't actually need all
> of them at once is wrong.
>
> Thanks,
>
>                 Ilya