Re: fs: out of bounds on stack in iov_iter_advance

Al Viro <viro@xxxxxxxxxxxxxxxxxx> · Fri, 6 Nov 2015 01:34:02 +0000

On Wed, Sep 30, 2015 at 05:30:17PM -0400, Sasha Levin wrote:

> > So I've traced this all the way back to dax_io(). I can trigger this with:
> > 
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 93bf2f9..2cdb8a5 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -178,6 +178,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
> >         if (need_wmb)
> >                 wmb_pmem();
> > 
> > +       WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
> >         return (pos == start) ? retval : pos - start;
> >  }
> > 
> > So it seems that iter gets moved twice here: once in dax_io(), and once again
> > back at generic_file_read_iter().
> > 
> > I don't see how it ever worked. Am I missing something?

This:
                        struct iov_iter data = *iter;
                        retval = mapping->a_ops->direct_IO(iocb, &data, pos);
                }

                if (retval > 0) {
                        *ppos = pos + retval;
                        iov_iter_advance(iter, retval);

The iterator advanced in ->direct_IO() is a _copy_, not the original.
The contents of *iter as seen by generic_file_read_iter() is not
modifiable by ->direct_IO(), simply because its address is nowhere to
be found.  And checking iov_iter_count(iter) at the end of dax_io() is
pointless - from the POV of generic_file_read_iter() it's data.count,
and while it used to be equal to iter->count, it's already been modified.
By the time we call iov_iter_advance() in generic_file_read_iter() that
value will be already discarded, along with rest of struct iov_iter data.

Wait a minute - you are triggering _what_???
> > +       WARN_ON((pos == start) && (pos - start > iov_iter_count(iter)));
With '&&'?  iov_iter_count() is size_t, while pos and start are loff_t,
so you are seeing equal values in pos and start (as integers) *and*
(loff_t)0 > (size_t)something.  loff_t is a signed type, size_t - unsigned.
6.3.1.8[1] says that
	* if rank of size_t is greater or equal to rank of loff_t, the
latter gets converted to size_t.  And conversion of zero should be zero,
i.e. (size_t) 0 > (size_t)something, which is impossible (we compare them
as non-negative integers).
	* if loff_t can represent all values of size_t, size_t value gets
converted to loff_t.  Result of conversion should have the same (in particular,
non-negative) value.  Again, comparison can't be true.
	* otherwise both values are converted to unsigned counterpart of
loff_t.  Again, zero converts to 0 and in any unsigned type 0 > x is
impossible.

I don't see any way for that condition to evaluate true.

Assuming that it's a misquoted ||...  I don't see any way for pos to
get greater than start + original iov_iter_count().  However, I *do*
see a way for bad things to happen in a different way.  Look:
	// first pass through the loop, pos == start (and so's max)
                                retval = dax_get_addr(bh, &addr, blkbits);
	// got a positive value
                                if (retval < 0)
                                        break;
	// nope, keep going
                                if (buffer_unwritten(bh) || buffer_new(bh)) {
                                        dax_new_buf(addr, retval, first, pos,
                                                                        end);
                                        need_wmb = true;
                                }
                                addr += first;
                                size = retval - first;
	// OK...
                        }
                        max = min(pos + size, end);
	// OK...
                }

                if (iov_iter_rw(iter) == WRITE) {
                        len = copy_from_iter_pmem(addr, max - pos, iter);
                        need_wmb = true;
                } else if (!hole)
                        len = copy_to_iter((void __force *)addr, max - pos,
                                        iter);
                else
                        len = iov_iter_zero(max - pos, iter);
	// too bad - we'd hit an unmapped memory area.  len is 0...
	// and retval is fucking positive.
                if (!len)
                        break;

	return (pos == start) ? retval : pos - start;
	// will return a bloody big positive value

Could you try to reproduce it with this:

dax_io(): don't let non-error value escape via retval instead of EFAULT

Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
---

diff --git a/fs/dax.c b/fs/dax.c
index a86d3cc..7b653e9 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -169,8 +169,10 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
 		else
 			len = iov_iter_zero(max - pos, iter);
 
-		if (!len)
+		if (!len) {
+			retval = -EFAULT;
 			break;
+		}
 
 		pos += len;
 		addr += len;

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html