On Thu, Aug 8, 2024 at 1:58 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
>
> On Thu, Aug 8, 2024 at 10:16 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> > On Thu, Aug 8, 2024 at 11:40 AM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
> > >
> > > On Wed, Aug 07, 2024 at 04:40:25PM GMT, Andrii Nakryiko wrote:
> > > > Extend freader with a flag specifying whether it's OK to cause page
> > > > fault to fetch file data that is not already physically present in
> > > > memory. With this, it's now easy to wait for data if the caller is
> > > > running in sleepable (faultable) context.
> > > >
> > > > We utilize read_cache_folio() to bring the desired folio into page
> > > > cache, after which the rest of the logic works just the same at folio level.
> > > >
> > > > Suggested-by: Omar Sandoval <osandov@xxxxxx>
> > > > Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx>
> > > > Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> > > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > > ---
> > > >  lib/buildid.c | 44 ++++++++++++++++++++++++++++----------------
> > > >  1 file changed, 28 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/lib/buildid.c b/lib/buildid.c
> > > > index 5e6f842f56f0..e1c01b23efd8 100644
> > > > --- a/lib/buildid.c
> > > > +++ b/lib/buildid.c
> > > > @@ -20,6 +20,7 @@ struct freader {
> > > >                         struct folio *folio;
> > > >                         void *addr;
> > > >                         loff_t folio_off;
> > > > +                       bool may_fault;
> > > >                 };
> > > >                 struct {
> > > >                         const char *data;
> > > > @@ -29,12 +30,13 @@ struct freader {
> > > >  };
> > > >
> > > >  static void freader_init_from_file(struct freader *r, void *buf, u32 buf_sz,
> > > > -                                   struct address_space *mapping)
> > > > +                                   struct address_space *mapping, bool may_fault)
> > > >  {
> > > >         memset(r, 0, sizeof(*r));
> > > >         r->buf = buf;
> > > >         r->buf_sz = buf_sz;
> > > >         r->mapping = mapping;
> > > > +       r->may_fault = may_fault;
> > > >  }
> > > >
> > > >  static void freader_init_from_mem(struct freader *r, const char *data, u64 data_sz)
> > > > @@ -63,6 +65,11 @@ static int freader_get_folio(struct freader *r, loff_t file_off)
> > > >         freader_put_folio(r);
> > > >
> > > >         r->folio = filemap_get_folio(r->mapping, file_off >> PAGE_SHIFT);
> > > > +
> > > > +       /* if sleeping is allowed, wait for the page, if necessary */
> > > > +       if (r->may_fault && (IS_ERR(r->folio) || !folio_test_uptodate(r->folio)))
> > > > +               r->folio = read_cache_folio(r->mapping, file_off >> PAGE_SHIFT, NULL, NULL);
> > >
> > > Willy's network fs comment is bugging me. If we pass NULL for filler,
> > > the kernel will going to use fs's read_folio() callback. I have checked
> > > read_folio() for fuse and nfs and it seems like for at least these two
> > > filesystems the callback is accessing file->private_data. So, if the elf
> > > file is on these filesystems, we might see null accesses.
> > >
> >
> > Isn't that just a huge problem with the read_cache_folio() interface
> > then? That file is optional, in general, but for some specific FS
> > types it's not. How generic code is supposed to know this?
>
> I think you have to think about it the other way around. The file is

Fair enough:

> @file: Passed to filler function, may be NULL if not required.

But then you look at mapping_read_folio_gfp() which *always*
unconditionally passes NULL for filler and file, and that makes you
think that file is some special *extra* parameter.

But regardless, as you pointed out, I won't have to take extra ref, so
my concerns about performance are wrong. I'll pass the file.

> required, unless you know the filler function that will be used
> doesn't use the file. Which you don't know when you're coming from
> generic code, so generic code has to pass in a file.
>
> As far as I can tell, most of the callers of read_cache_folio() (via
> read_mapping_folio()) are inside filesystem implementations, not
> generic code, so they know what the filler function will do. You're
> generic code, so I think you have to pass in a file.
>

Yep, I guess this is a bit of trailblazing use case. I was confused by
some other helpers passing NULL for file unconditionally, which made
me think that NULL is a supported default use case. Clearly I was
wrong.

> > Or maybe it's a bug with the nfs_read_folio() and fuse_read_folio()
> > implementation that they can't handle NULL file argument?
> > netfs_read_folio(), for example, seems to be working with file == NULL
> > just fine.
> >
> > Matthew, can you please advise what's the right approach here? I can,
> > of course, always get file refcount, but most of the time it will be
> > just an unnecessary overhead, so ideally I'd like to avoid that. But
> > if I have to check each read_folio callback implementation to know
> > whether it's required or not, then that's not great...
>
> Why would you need to increment the file refcount? As far as I can
> tell, all your accesses to the file would happen under
> __build_id_parse(), which is borrowing the refcounted reference from
> vma->vm_file; the file can't go away as long as your caller is holding
> the mmap lock.

Yep, agreed.
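
For readers following along, here is a minimal userspace sketch of the
NULL-file problem discussed above. It is not kernel code: struct file,
struct folio, struct address_space and filler_t are simplified hypothetical
stand-ins, and needs_file_read_folio(), ignores_file_read_folio() and
model_read_cache_folio() are made-up names. It models only the dispatch
rule from the thread (a NULL filler falls back to the filesystem's
read_folio callback, and file is forwarded as-is). Where the real fuse/nfs
callbacks were reported to access file->private_data, the model returns
-EINVAL so the failure can be observed without crashing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the real kernel definitions. */
struct file { void *private_data; };
struct folio { int uptodate; };

typedef int (*filler_t)(struct file *file, struct folio *folio);

struct address_space { filler_t read_folio; };

/*
 * Models a callback that needs per-open state, as fuse and nfs were
 * reported to in this thread. The real callbacks would dereference
 * file->private_data; the model returns -EINVAL instead so the failure
 * is observable.
 */
int needs_file_read_folio(struct file *file, struct folio *folio)
{
	if (!file || !file->private_data)
		return -EINVAL;
	folio->uptodate = 1;
	return 0;
}

/* Models a callback that never looks at the file argument. */
int ignores_file_read_folio(struct file *file, struct folio *folio)
{
	(void)file;
	folio->uptodate = 1;
	return 0;
}

/*
 * Models only the dispatch rule: a NULL filler falls back to the
 * mapping's read_folio callback, and file is forwarded untouched.
 */
int model_read_cache_folio(struct address_space *mapping, filler_t filler,
			   struct file *file, struct folio *folio)
{
	if (!filler)
		filler = mapping->read_folio;
	return filler(file, folio);
}
```

With a mapping whose callback is needs_file_read_folio(), passing NULL for
file fails and passing a real file succeeds, while a mapping using
ignores_file_read_folio() works either way. Generic code cannot tell which
kind of callback it will reach, which is why the conclusion in this thread
is to pass the file as the last argument of read_cache_folio() rather than
NULL.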