On Fri, Sep 28, 2018 at 7:50 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote: > > On Fri, Sep 28, 2018 at 11:45 AM Yan, Zheng <zyan@xxxxxxxxxx> wrote: > > > > large read may require allocating large page vector > > > > Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx> > > --- > > net/ceph/pagevec.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c > > index d3736f5bffec..b1c47ba0c38a 100644 > > --- a/net/ceph/pagevec.c > > +++ b/net/ceph/pagevec.c > > @@ -62,7 +62,7 @@ void ceph_release_page_vector(struct page **pages, int num_pages) > > > > for (i = 0; i < num_pages; i++) > > __free_pages(pages[i], 0); > > - kfree(pages); > > + kvfree(pages); > > } > > EXPORT_SYMBOL(ceph_release_page_vector); > > > > @@ -74,7 +74,7 @@ struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags) > > struct page **pages; > > int i; > > > > - pages = kmalloc_array(num_pages, sizeof(*pages), flags); > > + pages = kvmalloc_array(num_pages, sizeof(*pages), flags); > > if (!pages) > > return ERR_PTR(-ENOMEM); > > for (i = 0; i < num_pages; i++) { > > Have you considered rewriting ceph_sync_read()? In the end, it does > a synchronous OSD request per object, it's not atomic in any way, so in > theory it shouldn't need more than object_size worth of pages. > Yes, it's a better. But I'm a little busy recently. No idea when I will have time. Regards Yan, Zheng > Fengguang's application is doing ~100M reads, and the one he reported > was likely a ~2G read. The page pointer array itself is only part of > problem, allocating 2G worth of pages when we don't actually need all > of them at once is wrong. > > Thanks, > > Ilya