Hi Joanne,

On 12/2/24 21:40, Joanne Koong wrote:
> On Sat, Nov 30, 2024 at 12:22 AM Bernd Schubert
> <bernd.schubert@xxxxxxxxxxx> wrote:
>>
>> On 11/30/24 07:51, Nihar Chaithanya wrote:
>
> Hi Nihar and Bernd,
>
>>> The bug KASAN: null-ptr-deref is triggered due to *val being
>>> dereferenced when it is null in fuse_copy_do() when performing
>>> memcpy().
>
> It's not clear to me that syzbot's "null-ptr-deref" complaint is about
> *val being dereferenced when val is NULL.
>
> The stack trace [1] points to the 2nd memcpy in fuse_copy_do():
>
> /* Do as much copy to/from userspace buffer as we can */
> static int fuse_copy_do(struct fuse_copy_state *cs, void **val, unsigned *size)
> {
>         unsigned ncpy = min(*size, cs->len);
>         if (val) {
>                 void *pgaddr = kmap_local_page(cs->pg);
>                 void *buf = pgaddr + cs->offset;
>
>                 if (cs->write)
>                         memcpy(buf, *val, ncpy);
>                 else
>                         memcpy(*val, buf, ncpy);
>
>                 kunmap_local(pgaddr);
>                 *val += ncpy;
>         }
>         ...
> }
>
> but AFAICT, if val is NULL then we never try to deref val since it's
> guarded by the "if (val)" check.

The function takes &val in fuse_copy_one(). The NULL check is there more
for the case where fuse_copy_page() passes NULL.

>
> It seems like syzbot is either complaining about buf being NULL / *val
> being NULL and then trying to deference those inside the memcpy call,
> or maybe it actually is (mistakenly) complaining about val being NULL.

I don't think it is 'buf', because of

==> Write of size 5 at addr 0000000000000000

If it were buf, it would be a read. And we know the line number is
correct, as the trace goes through fuse_dev_write().
Although I have to admit that cs->write is really confusing - its sense
is just the opposite of fuse_dev_do_write / fuse_dev_do_read.

>
> It's not clear to me either how the "fuse: convert direct io to use
> folios" patch (on the fuse tree, it's commit 3b97c36) [2] directly
> causes this.
>
> If I'm remembering correctly, it's possible to add debug printks to a
> patch and syzbot will print out the debug messages as it triggers the
> issue? It'd be interesting to see which request opcode triggers this,
> and what exactly is being deref-ed here that is NULL. I need to look
> at this more deeply but so far, nothing stands out as to what could be
> the culprit.

Yeah, I was just thinking the same and was just reading through the
syzbot docs. I had tried to reproduce it in my local VM on master/6.13 -
no luck.


Thanks,
Bernd
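
For reference, the val / *val / buf distinction discussed above can be
sketched in plain userspace C. This is only a simplified model of the
copy step, not the kernel code: model_copy() and enum copy_fault are
hypothetical names made up for illustration, buf is passed in directly
instead of being derived from kmap_local_page(), and the function
reports which pointer would fault instead of actually faulting.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification, for illustration only. */
enum copy_fault {
	COPY_OK,	/* copy was performed */
	COPY_SKIPPED,	/* val == NULL: the "if (val)" guard skips the copy */
	COPY_NULL_DST,	/* memcpy destination is NULL: a WRITE to address 0 */
	COPY_NULL_SRC,	/* memcpy source is NULL: a READ from address 0 */
};

/*
 * Userspace model of the copy step quoted from fuse_copy_do().
 * 'write' mirrors cs->write: nonzero copies *val -> buf,
 * zero copies buf -> *val (the 2nd memcpy).
 */
static enum copy_fault model_copy(int write, void *buf, void **val,
				  size_t ncpy)
{
	void *dst, *src;

	if (!val)
		return COPY_SKIPPED;

	dst = write ? buf : *val;
	src = write ? *val : buf;

	if (!dst)
		return COPY_NULL_DST;
	if (!src)
		return COPY_NULL_SRC;

	memcpy(dst, src, ncpy);
	/* The kernel code does "*val += ncpy" on a void pointer (GNU C). */
	*val = (char *)*val + ncpy;
	return COPY_OK;
}
```

In this model, with write == 0, a NULL *val gives COPY_NULL_DST (a
write to address 0, like the report), a NULL buf gives COPY_NULL_SRC (a
read), and a NULL val gives COPY_SKIPPED. It says nothing about why
*val would be NULL in the first place.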