On Thu, Aug 8, 2024 at 2:02 PM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
>
> On Thu, Aug 08, 2024 at 01:15:52PM GMT, Andrii Nakryiko wrote:
> > On Thu, Aug 8, 2024 at 11:40 AM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
> > >
> > > On Wed, Aug 07, 2024 at 04:40:25PM GMT, Andrii Nakryiko wrote:
> > > > Extend freader with a flag specifying whether it's OK to cause page
> > > > fault to fetch file data that is not already physically present in
> > > > memory. With this, it's now easy to wait for data if the caller is
> > > > running in sleepable (faultable) context.
> > > >
> > > > We utilize read_cache_folio() to bring the desired folio into page
> > > > cache, after which the rest of the logic works just the same at folio level.
> > > >
> > > > Suggested-by: Omar Sandoval <osandov@xxxxxx>
> > > > Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx>
> > > > Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> > > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > > ---
> > > >  lib/buildid.c | 44 ++++++++++++++++++++++++++++----------------
> > > >  1 file changed, 28 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/lib/buildid.c b/lib/buildid.c
> > > > index 5e6f842f56f0..e1c01b23efd8 100644
> > > > --- a/lib/buildid.c
> > > > +++ b/lib/buildid.c
> > > > @@ -20,6 +20,7 @@ struct freader {
> > > >  			struct folio *folio;
> > > >  			void *addr;
> > > >  			loff_t folio_off;
> > > > +			bool may_fault;
> > > >  		};
> > > >  		struct {
> > > >  			const char *data;
> > > > @@ -29,12 +30,13 @@ struct freader {
> > > >  };
> > > >
> > > >  static void freader_init_from_file(struct freader *r, void *buf, u32 buf_sz,
> > > > -				   struct address_space *mapping)
> > > > +				   struct address_space *mapping, bool may_fault)
> > > >  {
> > > >  	memset(r, 0, sizeof(*r));
> > > >  	r->buf = buf;
> > > >  	r->buf_sz = buf_sz;
> > > >  	r->mapping = mapping;
> > > > +	r->may_fault = may_fault;
> > > >  }
> > > >
> > > >  static void freader_init_from_mem(struct freader *r, const char *data, u64 data_sz)
> > > > @@ -63,6 +65,11 @@ static int freader_get_folio(struct freader *r, loff_t file_off)
> > > >  	freader_put_folio(r);
> > > >
> > > >  	r->folio = filemap_get_folio(r->mapping, file_off >> PAGE_SHIFT);
> > > > +
> > > > +	/* if sleeping is allowed, wait for the page, if necessary */
> > > > +	if (r->may_fault && (IS_ERR(r->folio) || !folio_test_uptodate(r->folio)))
> > > > +		r->folio = read_cache_folio(r->mapping, file_off >> PAGE_SHIFT, NULL, NULL);
> > >
> > > Willy's network fs comment is bugging me. If we pass NULL for filler,
> > > the kernel will going to use fs's read_folio() callback. I have checked
> > > read_folio() for fuse and nfs and it seems like for at least these two
> > > filesystems the callback is accessing file->private_data. So, if the elf
> > > file is on these filesystems, we might see null accesses.
> > >
> >
> > Isn't that just a huge problem with the read_cache_folio() interface
> > then? That file is optional, in general, but for some specific FS
> > types it's not. How generic code is supposed to know this?
> >
> > Or maybe it's a bug with the nfs_read_folio() and fuse_read_folio()
> > implementation that they can't handle NULL file argument?
> > netfs_read_folio(), for example, seems to be working with file == NULL
> > just fine.
>
> If you go a bit down in netfs_alloc_request() there is the following
> code:
>
> 	if (rreq->netfs_ops->init_request) {
> 		ret = rreq->netfs_ops->init_request(rreq, file);
> 		...
> 	...
>
> I think this init_request is pointing to nfs_netfs_init_request which
> calls nfs_file_open_context(file) and access filp->private_data.

That's "nfs", which we know requires a file. For netfs implementations
(cifs_init_request() and v9fs_init_request()), they both treat file as
optional consistently.

But regardless, that's just pointless code archeology, I'll just pass
the file reference unconditionally.

>
> >
> > Matthew, can you please advise what's the right approach here? I can,
> > of course, always get file refcount, but most of the time it will be
> > just an unnecessary overhead, so ideally I'd like to avoid that. But
> > if I have to check each read_folio callback implementation to know
> > whether it's required or not, then that's not great...
>
> I don't think we will need file refcnt. We have mmap lock in read mode
> in this context because we are accessing vma and this vma has reference
> to the file. So, this file can not go away under us here.

Yep, good point, then it's not a problem, thanks! Will update.
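
For readers following along, here is a minimal userspace C model of the
issue discussed in this thread. It is not kernel code and not the actual
fix: every name in it (model_file, model_reader, model_get_folio() and the
two model_read_folio_*() callbacks) is hypothetical and only mimics the
shape of the problem. One callback treats the file as optional, the other
needs file->private_data, and the reader cannot tell which one it has, so
the safe choice is to always pass the file. Where kernel code would
dereference a NULL pointer, the model returns -EINVAL so the case can be
exercised safely.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct file; not the kernel type. */
struct model_file {
	void *private_data;	/* per-open state, e.g. an open context */
};

/* Models a read_folio()-style callback: sets *uptodate, returns 0 or -errno. */
typedef int (*model_read_folio_fn)(struct model_file *file, bool *uptodate);

/* Models a filesystem whose callback treats the file as optional. */
static int model_read_folio_file_optional(struct model_file *file, bool *uptodate)
{
	(void)file;
	*uptodate = true;
	return 0;
}

/*
 * Models a filesystem whose callback needs file->private_data.
 * Kernel code would dereference the pointer here; this model reports
 * -EINVAL instead (an assumption of the model, chosen for testability).
 */
static int model_read_folio_file_required(struct model_file *file, bool *uptodate)
{
	if (!file || !file->private_data)
		return -EINVAL;
	*uptodate = true;
	return 0;
}

/* Hypothetical reader state, loosely shaped after the freader fields above. */
struct model_reader {
	struct model_file *file;
	model_read_folio_fn read_folio;
	bool may_fault;		/* caller is allowed to sleep */
	bool uptodate;		/* data already present, nothing to read */
};

/*
 * Fetch the data if it is not already present. pass_file selects between
 * handing the callback a NULL file and handing it the reader's file.
 */
static int model_get_folio(struct model_reader *r, bool pass_file)
{
	if (r->uptodate)
		return 0;
	if (!r->may_fault)
		return -EFAULT;	/* not allowed to wait for the data */
	return r->read_folio(pass_file ? r->file : NULL, &r->uptodate);
}
```

In this model, calling model_get_folio() with pass_file == false succeeds
only for the file-optional callback, while pass_file == true works for
both, which is why passing the file unconditionally sidesteps the need to
audit each callback.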