On Tue, Dec 06, 2016 at 10:43:57AM +0100, Dmitry Vyukov wrote:
> On Tue, Dec 6, 2016 at 10:32 AM, Johannes Thumshirn <jthumshirn@xxxxxxx> wrote:
> > On Mon, Dec 05, 2016 at 07:03:39PM +0000, Al Viro wrote:
> >> On Mon, Dec 05, 2016 at 04:17:53PM +0100, Johannes Thumshirn wrote:
> >> > 633         hp = &srp->header;
> >> > [...]
> >> > 646         hp->dxferp = (char __user *)buf + cmd_size;
> >> >
> >> > So the memory for hp->dxferp comes from:
> >> > 633         hp = &srp->header;
> >>
> >> ????
> >>
> >> > From my debug instrumentation I see that the dxferp ends up in the
> >> > iovec_iter's kvec->iov_base and the faulting address is always dxferp + n *
> >> > 4k with n in [1, 16] (and we're copying 16 4k pages from the iovec into the
> >> > bio).
> >>
> >> _Address_ of hp->dxferp comes from that assignment; the value is 'buf'
> >> argument of sg_write() + small offset. In this case, it should point
> >> inside a pipe buffer, which is, indeed, at a kernel address. Who'd
> >> allocated srp is irrelevant.
> >
> > Yes I realized that as well when I had enough distance between me and the
> > code...
> >
> >>
> >> And if you end up dereferencing more than one page worth there, you do have
> >> a problem - pipe buffers are not going to be that large. Could you slap
> >>         WARN_ON((size_t)input_size > count);
> >> right after the calculation of input_size in sg_write() and see if it triggers
> >> on your reproducer?
> >
> > I did and it didn't trigger. What triggers is (as expected) a
> >         WARN_ON((size_t)mxsize > count);
> > We have count at 80 and mxsize (which ends in hp->dxfer_len) at 65499. But the
> > 65499 bytes are the len of the data we're supposed to be copying in via the
> > iov. I'm still rather confused what's happening here, sorry.
>
>
> I think the critical piece here is some kind of race or timing
> condition. Note that the test program executes all of
> memfd_create/write/open/sendfile twice.
> Second time the calls race
> with each other, but they also can race with the first execution of
> the calls.

FWIW I've just run the reproducer once, instead of looping it, to check how it
would normally behave, and it bails out at:

604         if (count < (SZ_SG_HEADER + 6))
605                 return -EIO;    /* The minimum scsi command length is 6 bytes. */

That means we usually aren't going down the copy_from_iter() road at all, but
sometimes we do. And then we try to copy 16 pages from the pipe buffer (is
this correct?).

The reproducer does:
sendfile("/dev/sg0", memfd, offset_in_memfd, 0x10000);

I don't see how we get there. Could it be random data from the mmap() we
point the memfd to?

This bug is confusing, to be honest.

-- 
Johannes Thumshirn                                          Storage
jthumshirn@xxxxxxx                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850