[bug report] dead loop in generic_perform_write() //Re: [PATCH v7 07/12] iov_iter: Convert iterate*() to inline funcs

Hi David, Jens:

Kindly ping...

Thanks.
Tong.


On 2024/2/18 11:13, Tong Tiangen wrote:
Hi David, Jens:

Recently, I tested the x86 coredump function of a user process on
mainline (6.8-rc1) and found a dead loop issue related to this patch.

Let's discuss it.

1. Test steps:
----------------------------
   a. Start a user process.
   b. Use EINJ to inject a hardware memory error into a page of
      this user process.
   c. Send SIGBUS to this user process.
   d. On receiving the signal, the process dumps core; the coredump
      file is configured to be written to tmpfs.

2. Root cause:
----------------------------
The dead loop occurs in generic_perform_write(). The call path is:

elf_core_dump()
   -> dump_user_range()
     -> dump_emit_page()
       -> iov_iter_bvec()  //iter type set to BVEC
         -> iov_iter_set_copy_mc(&iter);  //support copy mc
           -> __kernel_write_iter()
             -> shmem_file_write_iter()
               -> generic_perform_write()

ssize_t generic_perform_write(...)
{
     [...]
     do {
         [...]
     again:
         //[4]
         if (unlikely(fault_in_iov_iter_readable(i, bytes) ==
                              bytes)) {
             status = -EFAULT;
             break;
         }
         //[5]
         if (fatal_signal_pending(current)) {
             status = -EINTR;
             break;
         }

             [...]

         //[1]
         copied = copy_page_from_iter_atomic(page, offset, bytes,
                          i);
         [...]

         //[2]
         status = a_ops->write_end(...);
         if (unlikely(status != copied)) {
             iov_iter_revert(i, copied - max(status, 0L));
             if (unlikely(status < 0))
                 break;
         }
         cond_resched();

         if (unlikely(status == 0)) {
             /*
             * A short copy made ->write_end() reject the
             * thing entirely.  Might be memory poisoning
             * halfway through, might be a race with munmap,
             * might be severe memory pressure.
             */
             if (copied)
                 bytes = copied;
             //----[3]
             goto again;
         }
         [...]
     } while (iov_iter_count(i));
     [...]
}

[1]Before this patch:
   copy_page_from_iter_atomic()
     -> iterate_and_advance()
        -> __iterate_and_advance(..., ((void)(K),0))
          ->iterate_bvec macro
            -> left = ((void)(K),0)

With CONFIG_ARCH_HAS_COPY_MC, K() is copy_mc_to_kernel(), which
returns "bytes not copied".

Even when a memory error occurs during K(), the value of "left" is always
0, because ((void)(K),0) discards the return value of K(). Therefore, the
value of "copied" returned by copy_page_from_iter_atomic() is not 0, and
the loop in generic_perform_write() can end normally.
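
The pre-patch behaviour described above can be sketched in userspace.
This is only a model under stated assumptions: fake_copy_mc() is a
hypothetical stand-in for copy_mc_to_kernel() (it "faults" at a chosen
offset), and old_macro_copied() imitates how the old iterate_bvec macro
computed its progress. Neither name exists in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_mc_to_kernel(): copies up to the
 * "poisoned" offset and returns the number of bytes NOT copied.
 */
static size_t fake_copy_mc(void *dst, const void *src, size_t len,
			   size_t poison_at)
{
	size_t ok = poison_at < len ? poison_at : len;

	assert(dst && src);
	memcpy(dst, src, ok);
	return len - ok;
}

/*
 * Model of the old macro: the step expression K is wrapped as
 * ((void)(K), 0), so its return value is thrown away and "left" is
 * always 0. The whole length is reported as copied even on a fault.
 */
static size_t old_macro_copied(void *dst, const void *src, size_t len,
			       size_t poison_at)
{
	size_t left = ((void)fake_copy_mc(dst, src, len, poison_at), 0);

	return len - left;
}
```

In this model, a copy that faults on the very first byte still reports
the full length as copied, which is why the caller always saw progress
before the patch.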


After this patch:
   copy_page_from_iter_atomic()
     -> iterate_and_advance2()
       -> iterate_bvec()
         -> remain = step()

With CONFIG_ARCH_HAS_COPY_MC, step() is copy_mc_to_kernel(), which
returns "bytes not copied".

When a memory error occurs during step(), the value of "remain" is equal
to the value of "part" (not a single byte is copied successfully). In this
case, iterate_bvec() returns 0, and copy_page_from_iter_atomic() also
returns 0. The callback shmem_write_end()[2] also returns 0. Finally,
generic_perform_write() reaches "goto again"[3], and the loop restarts.
Neither check [4] nor check [5] triggers, so the loop cannot exit and a
dead loop occurs.
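
The retry logic can be modelled in userspace to show why no progress is
ever made. This is a rough sketch, not kernel code: model_copy() and
model_perform_write() are hypothetical names, the copy is assumed to
fault on the first byte every time, write_end is assumed to return
whatever was copied, and a pass limit is added so the model itself
terminates.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of copy_page_from_iter_atomic() on a poisoned
 * page. Returns the number of bytes reported as copied.
 */
static size_t model_copy(size_t bytes, int honour_remain)
{
	size_t remain = bytes;	/* machine check on the first byte */

	if (!honour_remain)
		remain = 0;	/* pre-patch: ((void)(K), 0) */
	return bytes - remain;
}

/*
 * Model of the loop in generic_perform_write(). Returns the number of
 * passes needed to finish, or -1 if max_passes is exceeded without the
 * loop exiting.
 */
static int model_perform_write(size_t count, int honour_remain,
			       int max_passes)
{
	size_t bytes = count;
	int passes = 0;

	assert(max_passes > 0);
	while (count) {
		size_t copied, status;

		if (++passes > max_passes)
			return -1;
		/* [4] and [5] are assumed not to trigger, as reported. */
		copied = model_copy(bytes, honour_remain);	/* [1] */
		status = copied;				/* [2] */
		if (status == 0) {
			if (copied)
				bytes = copied;
			continue;				/* [3] */
		}
		count -= status;
		bytes = count;
	}
	return passes;
}
```

With honour_remain set to 0 the model finishes in a single pass; with
honour_remain set to 1 it only stops because of the artificial pass
limit.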

Thanks.
Tong


On 2023/9/25 20:03, David Howells wrote:
Convert the iov_iter iteration macros to inline functions to make the code
easier to follow.

The functions are marked __always_inline as we don't want to end up with
indirect calls in the code.  This, however, leaves dealing with ->copy_mc
in an awkward situation since the step function (memcpy_from_iter_mc())
needs to test the flag in the iterator, but isn't passed the iterator.
This will be dealt with in a follow-up patch.

The variable names in the per-type iterator functions have been harmonised
as much as possible and made clearer as to the variable purpose.

The iterator functions are also moved to a header file so that other
operations that need to scan over an iterator can be added.  For instance,
the rbd driver could use this to scan a buffer to see if it is all zeros
and libceph could use this to generate a crc.
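
As an illustration of the kind of scan mentioned above, here is a
userspace sketch of a step function that checks whether a buffer is all
zeros. It only mirrors the shape of iov_step_f; struct seg,
iterate_segs(), zero_scan_step() and segs_all_zero() are hypothetical
names invented for this example and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same shape as iov_step_f: returns the number of bytes NOT processed. */
typedef size_t (*step_fn)(void *base, size_t progress, size_t len,
			  void *priv, void *priv2);

/* Hypothetical stand-in for one kernel-address segment. */
struct seg {
	void *base;
	size_t len;
};

/* Simplified walk over segments; stops at the first short step. */
static size_t iterate_segs(const struct seg *segs, size_t nsegs, size_t len,
			   void *priv, void *priv2, step_fn step)
{
	size_t progress = 0;

	assert(step != NULL);
	for (size_t i = 0; i < nsegs && len; i++) {
		size_t part = segs[i].len < len ? segs[i].len : len;
		size_t remain;

		if (!part)
			continue;
		remain = step(segs[i].base, progress, part, priv, priv2);
		progress += part - remain;
		len -= part - remain;
		if (remain)
			break;
	}
	return progress;
}

/* Step function: stop at the first non-zero byte. */
static size_t zero_scan_step(void *base, size_t progress, size_t len,
			     void *priv, void *priv2)
{
	const unsigned char *p = base;

	(void)progress;
	(void)priv;
	(void)priv2;
	for (size_t i = 0; i < len; i++)
		if (p[i])
			return len - i;
	return 0;
}

/* The buffer is all zeros if the scan consumed every byte. */
static bool segs_all_zero(const struct seg *segs, size_t nsegs, size_t total)
{
	return iterate_segs(segs, nsegs, total, NULL, NULL,
			    zero_scan_step) == total;
}
```

The design point is that the step function reports how much it did not
process, so an early stop propagates back as a short progress count.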

Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
cc: Jens Axboe <axboe@xxxxxxxxx>
cc: Christoph Hellwig <hch@xxxxxx>
cc: Christian Brauner <christian@xxxxxxxxxx>
cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
cc: David Laight <David.Laight@xxxxxxxxxx>
cc: linux-block@xxxxxxxxxxxxxxx
cc: linux-fsdevel@xxxxxxxxxxxxxxx
cc: linux-mm@xxxxxxxxx
Link: https://lore.kernel.org/r/3710261.1691764329@xxxxxxxxxxxxxxxxxxxxxx/ # v1
Link: https://lore.kernel.org/r/855.1692047347@xxxxxxxxxxxxxxxxxxxxxx/ # v2
Link: https://lore.kernel.org/r/20230816120741.534415-1-dhowells@xxxxxxxxxx/ # v3
---

Notes:
     Changes
     =======
     ver #5)
      - Merge in patch to move iteration framework to a header file.
      - Move "iter->count - progress" into individual iteration subfunctions.

  include/linux/iov_iter.h | 274 ++++++++++++++++++++++++++
  lib/iov_iter.c           | 416 ++++++++++++++++-----------------------
  2 files changed, 449 insertions(+), 241 deletions(-)
  create mode 100644 include/linux/iov_iter.h

diff --git a/include/linux/iov_iter.h b/include/linux/iov_iter.h
new file mode 100644
index 000000000000..270454a6703d
--- /dev/null
+++ b/include/linux/iov_iter.h
@@ -0,0 +1,274 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* I/O iterator iteration building functions.
+ *
+ * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@xxxxxxxxxx)
+ */
+
+#ifndef _LINUX_IOV_ITER_H
+#define _LINUX_IOV_ITER_H
+
+#include <linux/uio.h>
+#include <linux/bvec.h>
+
+typedef size_t (*iov_step_f)(void *iter_base, size_t progress, size_t len,
+                 void *priv, void *priv2);
+typedef size_t (*iov_ustep_f)(void __user *iter_base, size_t progress, size_t len,
+                  void *priv, void *priv2);
+
+/*
+ * Handle ITER_UBUF.
+ */
+static __always_inline
+size_t iterate_ubuf(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+            iov_ustep_f step)
+{
+    void __user *base = iter->ubuf;
+    size_t progress = 0, remain;
+
+    remain = step(base + iter->iov_offset, 0, len, priv, priv2);
+    progress = len - remain;
+    iter->iov_offset += progress;
+    iter->count -= progress;
+    return progress;
+}
+
+/*
+ * Handle ITER_IOVEC.
+ */
+static __always_inline
+size_t iterate_iovec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+             iov_ustep_f step)
+{
+    const struct iovec *p = iter->__iov;
+    size_t progress = 0, skip = iter->iov_offset;
+
+    do {
+        size_t remain, consumed;
+        size_t part = min(len, p->iov_len - skip);
+
+        if (likely(part)) {
+            remain = step(p->iov_base + skip, progress, part, priv, priv2);
+            consumed = part - remain;
+            progress += consumed;
+            skip += consumed;
+            len -= consumed;
+            if (skip < p->iov_len)
+                break;
+        }
+        p++;
+        skip = 0;
+    } while (len);
+
+    iter->nr_segs -= p - iter->__iov;
+    iter->__iov = p;
+    iter->iov_offset = skip;
+    iter->count -= progress;
+    return progress;
+}
+
+/*
+ * Handle ITER_KVEC.
+ */
+static __always_inline
+size_t iterate_kvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+            iov_step_f step)
+{
+    const struct kvec *p = iter->kvec;
+    size_t progress = 0, skip = iter->iov_offset;
+
+    do {
+        size_t remain, consumed;
+        size_t part = min(len, p->iov_len - skip);
+
+        if (likely(part)) {
+            remain = step(p->iov_base + skip, progress, part, priv, priv2);
+            consumed = part - remain;
+            progress += consumed;
+            skip += consumed;
+            len -= consumed;
+            if (skip < p->iov_len)
+                break;
+        }
+        p++;
+        skip = 0;
+    } while (len);
+
+    iter->nr_segs -= p - iter->kvec;
+    iter->kvec = p;
+    iter->iov_offset = skip;
+    iter->count -= progress;
+    return progress;
+}
+
+/*
+ * Handle ITER_BVEC.
+ */
+static __always_inline
+size_t iterate_bvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+            iov_step_f step)
+{
+    const struct bio_vec *p = iter->bvec;
+    size_t progress = 0, skip = iter->iov_offset;
+
+    do {
+        size_t remain, consumed;
+        size_t offset = p->bv_offset + skip, part;
+        void *kaddr = kmap_local_page(p->bv_page + offset / PAGE_SIZE);
+
+        part = min3(len,
+               (size_t)(p->bv_len - skip),
+               (size_t)(PAGE_SIZE - offset % PAGE_SIZE));
+        remain = step(kaddr + offset % PAGE_SIZE, progress, part, priv, priv2);
+        kunmap_local(kaddr);
+        consumed = part - remain;
+        len -= consumed;
+        progress += consumed;
+        skip += consumed;
+        if (skip >= p->bv_len) {
+            skip = 0;
+            p++;
+        }
+        if (remain)
+            break;
+    } while (len);
+
+    iter->nr_segs -= p - iter->bvec;
+    iter->bvec = p;
+    iter->iov_offset = skip;
+    iter->count -= progress;
+    return progress;
+}
+
+/*
+ * Handle ITER_XARRAY.
+ */
+static __always_inline
+size_t iterate_xarray(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+              iov_step_f step)
+{
+    struct folio *folio;
+    size_t progress = 0;
+    loff_t start = iter->xarray_start + iter->iov_offset;
+    pgoff_t index = start / PAGE_SIZE;
+    XA_STATE(xas, iter->xarray, index);
+
+    rcu_read_lock();
+    xas_for_each(&xas, folio, ULONG_MAX) {
+        size_t remain, consumed, offset, part, flen;
+
+        if (xas_retry(&xas, folio))
+            continue;
+        if (WARN_ON(xa_is_value(folio)))
+            break;
+        if (WARN_ON(folio_test_hugetlb(folio)))
+            break;
+
+        offset = offset_in_folio(folio, start + progress);
+        flen = min(folio_size(folio) - offset, len);
+
+        while (flen) {
+            void *base = kmap_local_folio(folio, offset);
+
+            part = min_t(size_t, flen,
+                     PAGE_SIZE - offset_in_page(offset));
+            remain = step(base, progress, part, priv, priv2);
+            kunmap_local(base);
+
+            consumed = part - remain;
+            progress += consumed;
+            len -= consumed;
+
+            if (remain || len == 0)
+                goto out;
+            flen -= consumed;
+            offset += consumed;
+        }
+    }
+
+out:
+    rcu_read_unlock();
+    iter->iov_offset += progress;
+    iter->count -= progress;
+    return progress;
+}
+
+/*
+ * Handle ITER_DISCARD.
+ */
+static __always_inline
+size_t iterate_discard(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+              iov_step_f step)
+{
+    size_t progress = len;
+
+    iter->count -= progress;
+    return progress;
+}
+
+/**
+ * iterate_and_advance2 - Iterate over an iterator
+ * @iter: The iterator to iterate over.
+ * @len: The amount to iterate over.
+ * @priv: Data for the step functions.
+ * @priv2: More data for the step functions.
+ * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
+ * @step: Function for other iterators; given kernel addresses.
+ *
+ * Iterate over the next part of an iterator, up to the specified length.  The
+ * buffer is presented in segments, which for kernel iteration are broken up by
+ * physical pages and mapped, with the mapped address being presented.
+ *
+ * Two step functions, @step and @ustep, must be provided, one for handling
+ * mapped kernel addresses and the other is given user addresses which have the
+ * potential to fault since no pinning is performed.
+ *
+ * The step functions are passed the address and length of the segment, @priv,
+ * @priv2 and the amount of data so far iterated over (which can, for example,
+ * be added to @priv to point to the right part of a second buffer). The step
+ * functions should return the amount of the segment they didn't process (ie. 0
+ * indicates complete processing).
+ *
+ * This function returns the amount of data processed (ie. 0 means nothing was
+ * processed and the value of @len means processes to completion).
+ */
+static __always_inline
+size_t iterate_and_advance2(struct iov_iter *iter, size_t len, void *priv,
+                void *priv2, iov_ustep_f ustep, iov_step_f step)
+{
+    if (unlikely(iter->count < len))
+        len = iter->count;
+    if (unlikely(!len))
+        return 0;
+
+    if (likely(iter_is_ubuf(iter)))
+        return iterate_ubuf(iter, len, priv, priv2, ustep);
+    if (likely(iter_is_iovec(iter)))
+        return iterate_iovec(iter, len, priv, priv2, ustep);
+    if (iov_iter_is_bvec(iter))
+        return iterate_bvec(iter, len, priv, priv2, step);
+    if (iov_iter_is_kvec(iter))
+        return iterate_kvec(iter, len, priv, priv2, step);
+    if (iov_iter_is_xarray(iter))
+        return iterate_xarray(iter, len, priv, priv2, step);
+    return iterate_discard(iter, len, priv, priv2, step);
+}
+
+/**
+ * iterate_and_advance - Iterate over an iterator
+ * @iter: The iterator to iterate over.
+ * @len: The amount to iterate over.
+ * @priv: Data for the step functions.
+ * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
+ * @step: Function for other iterators; given kernel addresses.
+ *
+ * As iterate_and_advance2(), but priv2 is always NULL.
+ */
+static __always_inline
+size_t iterate_and_advance(struct iov_iter *iter, size_t len, void *priv,
+               iov_ustep_f ustep, iov_step_f step)
+{
+    return iterate_and_advance2(iter, len, priv, NULL, ustep, step);
+}
+
+#endif /* _LINUX_IOV_ITER_H */
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 227c9f536b94..65374ee91ecd 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -13,189 +13,69 @@
  #include <net/checksum.h>
  #include <linux/scatterlist.h>
  #include <linux/instrumented.h>
+#include <linux/iov_iter.h>
-/* covers ubuf and kbuf alike */
-#define iterate_buf(i, n, base, len, off, __p, STEP) {        \
-    size_t __maybe_unused off = 0;                \
-    len = n;                        \
-    base = __p + i->iov_offset;                \
-    len -= (STEP);                        \
-    i->iov_offset += len;                    \
-    n = len;                        \
-}
-
-/* covers iovec and kvec alike */
-#define iterate_iovec(i, n, base, len, off, __p, STEP) {    \
-    size_t off = 0;                        \
-    size_t skip = i->iov_offset;                \
-    do {                            \
-        len = min(n, __p->iov_len - skip);        \
-        if (likely(len)) {                \
-            base = __p->iov_base + skip;        \
-            len -= (STEP);                \
-            off += len;                \
-            skip += len;                \
-            n -= len;                \
-            if (skip < __p->iov_len)        \
-                break;                \
-        }                        \
-        __p++;                        \
-        skip = 0;                    \
-    } while (n);                        \
-    i->iov_offset = skip;                    \
-    n = off;                        \
-}
-
-#define iterate_bvec(i, n, base, len, off, p, STEP) {        \
-    size_t off = 0;                        \
-    unsigned skip = i->iov_offset;                \
-    while (n) {                        \
-        unsigned offset = p->bv_offset + skip;        \
-        unsigned left;                    \
-        void *kaddr = kmap_local_page(p->bv_page +    \
-                    offset / PAGE_SIZE);    \
-        base = kaddr + offset % PAGE_SIZE;        \
-        len = min(min(n, (size_t)(p->bv_len - skip)),    \
-             (size_t)(PAGE_SIZE - offset % PAGE_SIZE));    \
-        left = (STEP);                    \
-        kunmap_local(kaddr);                \
-        len -= left;                    \
-        off += len;                    \
-        skip += len;                    \
-        if (skip == p->bv_len) {            \
-            skip = 0;                \
-            p++;                    \
-        }                        \
-        n -= len;                    \
-        if (left)                    \
-            break;                    \
-    }                            \
-    i->iov_offset = skip;                    \
-    n = off;                        \
-}
-
-#define iterate_xarray(i, n, base, len, __off, STEP) {        \
-    __label__ __out;                    \
-    size_t __off = 0;                    \
-    struct folio *folio;                    \
-    loff_t start = i->xarray_start + i->iov_offset;        \
-    pgoff_t index = start / PAGE_SIZE;            \
-    XA_STATE(xas, i->xarray, index);            \
-                                \
-    len = PAGE_SIZE - offset_in_page(start);        \
-    rcu_read_lock();                    \
-    xas_for_each(&xas, folio, ULONG_MAX) {            \
-        unsigned left;                    \
-        size_t offset;                    \
-        if (xas_retry(&xas, folio))            \
-            continue;                \
-        if (WARN_ON(xa_is_value(folio)))        \
-            break;                    \
-        if (WARN_ON(folio_test_hugetlb(folio)))        \
-            break;                    \
-        offset = offset_in_folio(folio, start + __off);    \
-        while (offset < folio_size(folio)) {        \
-            base = kmap_local_folio(folio, offset);    \
-            len = min(n, len);            \
-            left = (STEP);                \
-            kunmap_local(base);            \
-            len -= left;                \
-            __off += len;                \
-            n -= len;                \
-            if (left || n == 0)            \
-                goto __out;            \
-            offset += len;                \
-            len = PAGE_SIZE;            \
-        }                        \
-    }                            \
-__out:                                \
-    rcu_read_unlock();                    \
-    i->iov_offset += __off;                    \
-    n = __off;                        \
-}
-
-#define __iterate_and_advance(i, n, base, len, off, I, K) {    \
-    if (unlikely(i->count < n))                \
-        n = i->count;                    \
-    if (likely(n)) {                    \
-        if (likely(iter_is_ubuf(i))) {            \
-            void __user *base;            \
-            size_t len;                \
-            iterate_buf(i, n, base, len, off,    \
-                        i->ubuf, (I))     \
-        } else if (likely(iter_is_iovec(i))) {        \
-            const struct iovec *iov = iter_iov(i);    \
-            void __user *base;            \
-            size_t len;                \
-            iterate_iovec(i, n, base, len, off,    \
-                        iov, (I))    \
-            i->nr_segs -= iov - iter_iov(i);    \
-            i->__iov = iov;                \
-        } else if (iov_iter_is_bvec(i)) {        \
-            const struct bio_vec *bvec = i->bvec;    \
-            void *base;                \
-            size_t len;                \
-            iterate_bvec(i, n, base, len, off,    \
-                        bvec, (K))    \
-            i->nr_segs -= bvec - i->bvec;        \
-            i->bvec = bvec;                \
-        } else if (iov_iter_is_kvec(i)) {        \
-            const struct kvec *kvec = i->kvec;    \
-            void *base;                \
-            size_t len;                \
-            iterate_iovec(i, n, base, len, off,    \
-                        kvec, (K))    \
-            i->nr_segs -= kvec - i->kvec;        \
-            i->kvec = kvec;                \
-        } else if (iov_iter_is_xarray(i)) {        \
-            void *base;                \
-            size_t len;                \
-            iterate_xarray(i, n, base, len, off,    \
-                            (K))    \
-        }                        \
-        i->count -= n;                    \
-    }                            \
-}
-#define iterate_and_advance(i, n, base, len, off, I, K) \
-    __iterate_and_advance(i, n, base, len, off, I, ((void)(K),0))
-
-static int copyout(void __user *to, const void *from, size_t n)
+static __always_inline
+size_t copy_to_user_iter(void __user *iter_to, size_t progress,
+             size_t len, void *from, void *priv2)
  {
      if (should_fail_usercopy())
-        return n;
-    if (access_ok(to, n)) {
-        instrument_copy_to_user(to, from, n);
-        n = raw_copy_to_user(to, from, n);
+        return len;
+    if (access_ok(iter_to, len)) {
+        from += progress;
+        instrument_copy_to_user(iter_to, from, len);
+        len = raw_copy_to_user(iter_to, from, len);
      }
-    return n;
+    return len;
  }
-static int copyout_nofault(void __user *to, const void *from, size_t n)
+static __always_inline
+size_t copy_to_user_iter_nofault(void __user *iter_to, size_t progress,
+                 size_t len, void *from, void *priv2)
  {
-    long res;
+    ssize_t res;
      if (should_fail_usercopy())
-        return n;
-
-    res = copy_to_user_nofault(to, from, n);
+        return len;
-    return res < 0 ? n : res;
+    from += progress;
+    res = copy_to_user_nofault(iter_to, from, len);
+    return res < 0 ? len : res;
  }
-static int copyin(void *to, const void __user *from, size_t n)
+static __always_inline
+size_t copy_from_user_iter(void __user *iter_from, size_t progress,
+               size_t len, void *to, void *priv2)
  {
-    size_t res = n;
+    size_t res = len;
      if (should_fail_usercopy())
-        return n;
-    if (access_ok(from, n)) {
-        instrument_copy_from_user_before(to, from, n);
-        res = raw_copy_from_user(to, from, n);
-        instrument_copy_from_user_after(to, from, n, res);
+        return len;
+    if (access_ok(iter_from, len)) {
+        to += progress;
+        instrument_copy_from_user_before(to, iter_from, len);
+        res = raw_copy_from_user(to, iter_from, len);
+        instrument_copy_from_user_after(to, iter_from, len, res);
      }
      return res;
  }
+static __always_inline
+size_t memcpy_to_iter(void *iter_to, size_t progress,
+              size_t len, void *from, void *priv2)
+{
+    memcpy(iter_to, from + progress, len);
+    return 0;
+}
+
+static __always_inline
+size_t memcpy_from_iter(void *iter_from, size_t progress,
+            size_t len, void *to, void *priv2)
+{
+    memcpy(to + progress, iter_from, len);
+    return 0;
+}
+
  /*
   * fault_in_iov_iter_readable - fault in iov iterator for reading
   * @i: iterator
@@ -312,23 +192,29 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
          return 0;
      if (user_backed_iter(i))
          might_fault();
-    iterate_and_advance(i, bytes, base, len, off,
-        copyout(base, addr + off, len),
-        memcpy(base, addr + off, len)
-    )
-
-    return bytes;
+    return iterate_and_advance(i, bytes, (void *)addr,
+                   copy_to_user_iter, memcpy_to_iter);
  }
  EXPORT_SYMBOL(_copy_to_iter);
  #ifdef CONFIG_ARCH_HAS_COPY_MC
-static int copyout_mc(void __user *to, const void *from, size_t n)
-{
-    if (access_ok(to, n)) {
-        instrument_copy_to_user(to, from, n);
-        n = copy_mc_to_user((__force void *) to, from, n);
+static __always_inline
+size_t copy_to_user_iter_mc(void __user *iter_to, size_t progress,
+                size_t len, void *from, void *priv2)
+{
+    if (access_ok(iter_to, len)) {
+        from += progress;
+        instrument_copy_to_user(iter_to, from, len);
+        len = copy_mc_to_user(iter_to, from, len);
      }
-    return n;
+    return len;
+}
+
+static __always_inline
+size_t memcpy_to_iter_mc(void *iter_to, size_t progress,
+             size_t len, void *from, void *priv2)
+{
+    return copy_mc_to_kernel(iter_to, from + progress, len);
  }
  /**
@@ -361,22 +247,20 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
          return 0;
      if (user_backed_iter(i))
          might_fault();
-    __iterate_and_advance(i, bytes, base, len, off,
-        copyout_mc(base, addr + off, len),
-        copy_mc_to_kernel(base, addr + off, len)
-    )
-
-    return bytes;
+    return iterate_and_advance(i, bytes, (void *)addr,
+                   copy_to_user_iter_mc, memcpy_to_iter_mc);
  }
  EXPORT_SYMBOL_GPL(_copy_mc_to_iter);
  #endif /* CONFIG_ARCH_HAS_COPY_MC */
-static void *memcpy_from_iter(struct iov_iter *i, void *to, const void *from,
-                 size_t size)
+static size_t memcpy_from_iter_mc(void *iter_from, size_t progress,
+                  size_t len, void *to, void *priv2)
  {
-    if (iov_iter_is_copy_mc(i))
-        return (void *)copy_mc_to_kernel(to, from, size);
-    return memcpy(to, from, size);
+    struct iov_iter *iter = priv2;
+
+    if (iov_iter_is_copy_mc(iter))
+        return copy_mc_to_kernel(to + progress, iter_from, len);
+    return memcpy_from_iter(iter_from, progress, len, to, priv2);
  }
  size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
@@ -386,30 +270,46 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
      if (user_backed_iter(i))
          might_fault();
-    iterate_and_advance(i, bytes, base, len, off,
-        copyin(addr + off, base, len),
-        memcpy_from_iter(i, addr + off, base, len)
-    )
-
-    return bytes;
+    return iterate_and_advance2(i, bytes, addr, i,
+                    copy_from_user_iter,
+                    memcpy_from_iter_mc);
  }
  EXPORT_SYMBOL(_copy_from_iter);
+static __always_inline
+size_t copy_from_user_iter_nocache(void __user *iter_from, size_t progress,
+                   size_t len, void *to, void *priv2)
+{
+    return __copy_from_user_inatomic_nocache(to + progress, iter_from, len);
+}
+
  size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
  {
      if (WARN_ON_ONCE(!i->data_source))
          return 0;
-    iterate_and_advance(i, bytes, base, len, off,
-        __copy_from_user_inatomic_nocache(addr + off, base, len),
-        memcpy(addr + off, base, len)
-    )
-
-    return bytes;
+    return iterate_and_advance(i, bytes, addr,
+                   copy_from_user_iter_nocache,
+                   memcpy_from_iter);
  }
  EXPORT_SYMBOL(_copy_from_iter_nocache);
  #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+static __always_inline
+size_t copy_from_user_iter_flushcache(void __user *iter_from, size_t progress,
+                      size_t len, void *to, void *priv2)
+{
+    return __copy_from_user_flushcache(to + progress, iter_from, len);
+}
+
+static __always_inline
+size_t memcpy_from_iter_flushcache(void *iter_from, size_t progress,
+                   size_t len, void *to, void *priv2)
+{
+    memcpy_flushcache(to + progress, iter_from, len);
+    return 0;
+}
+
  /**
   * _copy_from_iter_flushcache - write destination through cpu cache
   * @addr: destination kernel address
@@ -431,12 +331,9 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
      if (WARN_ON_ONCE(!i->data_source))
          return 0;
-    iterate_and_advance(i, bytes, base, len, off,
-        __copy_from_user_flushcache(addr + off, base, len),
-        memcpy_flushcache(addr + off, base, len)
-    )
-
-    return bytes;
+    return iterate_and_advance(i, bytes, addr,
+                   copy_from_user_iter_flushcache,
+                   memcpy_from_iter_flushcache);
  }
  EXPORT_SYMBOL_GPL(_copy_from_iter_flushcache);
  #endif
@@ -508,10 +405,9 @@ size_t copy_page_to_iter_nofault(struct page *page, unsigned offset, size_t byte
          void *kaddr = kmap_local_page(page);
          size_t n = min(bytes, (size_t)PAGE_SIZE - offset);
-        iterate_and_advance(i, n, base, len, off,
-            copyout_nofault(base, kaddr + offset + off, len),
-            memcpy(base, kaddr + offset + off, len)
-        )
+        n = iterate_and_advance(i, bytes, kaddr,
+                    copy_to_user_iter_nofault,
+                    memcpy_to_iter);
          kunmap_local(kaddr);
          res += n;
          bytes -= n;
@@ -554,14 +450,25 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
  }
  EXPORT_SYMBOL(copy_page_from_iter);
-size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
+static __always_inline
+size_t zero_to_user_iter(void __user *iter_to, size_t progress,
+             size_t len, void *priv, void *priv2)
  {
-    iterate_and_advance(i, bytes, base, len, count,
-        clear_user(base, len),
-        memset(base, 0, len)
-    )
+    return clear_user(iter_to, len);
+}
-    return bytes;
+static __always_inline
+size_t zero_to_iter(void *iter_to, size_t progress,
+            size_t len, void *priv, void *priv2)
+{
+    memset(iter_to, 0, len);
+    return 0;
+}
+
+size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
+{
+    return iterate_and_advance(i, bytes, NULL,
+                   zero_to_user_iter, zero_to_iter);
  }
  EXPORT_SYMBOL(iov_iter_zero);
@@ -586,10 +493,9 @@ size_t copy_page_from_iter_atomic(struct page *page, size_t offset,
          }
          p = kmap_atomic(page) + offset;
-        iterate_and_advance(i, n, base, len, off,
-            copyin(p + off, base, len),
-            memcpy_from_iter(i, p + off, base, len)
-        )
+        n = iterate_and_advance2(i, n, p, i,
+                     copy_from_user_iter,
+                     memcpy_from_iter_mc);
          kunmap_atomic(p);
          copied += n;
          offset += n;
@@ -1180,32 +1086,64 @@ ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i,
  }
  EXPORT_SYMBOL(iov_iter_get_pages_alloc2);
+static __always_inline
+size_t copy_from_user_iter_csum(void __user *iter_from, size_t progress,
+                size_t len, void *to, void *priv2)
+{
+    __wsum next, *csum = priv2;
+
+    next = csum_and_copy_from_user(iter_from, to + progress, len);
+    *csum = csum_block_add(*csum, next, progress);
+    return next ? 0 : len;
+}
+
+static __always_inline
+size_t memcpy_from_iter_csum(void *iter_from, size_t progress,
+                 size_t len, void *to, void *priv2)
+{
+    __wsum *csum = priv2;
+
+    *csum = csum_and_memcpy(to + progress, iter_from, len, *csum, progress);
+    return 0;
+}
+
  size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
                     struct iov_iter *i)
  {
-    __wsum sum, next;
-    sum = *csum;
      if (WARN_ON_ONCE(!i->data_source))
          return 0;
-
-    iterate_and_advance(i, bytes, base, len, off, ({
-        next = csum_and_copy_from_user(base, addr + off, len);
-        sum = csum_block_add(sum, next, off);
-        next ? 0 : len;
-    }), ({
-        sum = csum_and_memcpy(addr + off, base, len, sum, off);
-    })
-    )
-    *csum = sum;
-    return bytes;
+    return iterate_and_advance2(i, bytes, addr, csum,
+                    copy_from_user_iter_csum,
+                    memcpy_from_iter_csum);
  }
  EXPORT_SYMBOL(csum_and_copy_from_iter);
+static __always_inline
+size_t copy_to_user_iter_csum(void __user *iter_to, size_t progress,
+                  size_t len, void *from, void *priv2)
+{
+    __wsum next, *csum = priv2;
+
+    next = csum_and_copy_to_user(from + progress, iter_to, len);
+    *csum = csum_block_add(*csum, next, progress);
+    return next ? 0 : len;
+}
+
+static __always_inline
+size_t memcpy_to_iter_csum(void *iter_to, size_t progress,
+               size_t len, void *from, void *priv2)
+{
+    __wsum *csum = priv2;
+
+    *csum = csum_and_memcpy(iter_to, from + progress, len, *csum, progress);
+    return 0;
+}
+
  size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
                   struct iov_iter *i)
  {
      struct csum_state *csstate = _csstate;
-    __wsum sum, next;
+    __wsum sum;
      if (WARN_ON_ONCE(i->data_source))
          return 0;
@@ -1219,14 +1157,10 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
      }
      sum = csum_shift(csstate->csum, csstate->off);
-    iterate_and_advance(i, bytes, base, len, off, ({
-        next = csum_and_copy_to_user(addr + off, base, len);
-        sum = csum_block_add(sum, next, off);
-        next ? 0 : len;
-    }), ({
-        sum = csum_and_memcpy(base, addr + off, len, sum, off);
-    })
-    )
+
+    bytes = iterate_and_advance2(i, bytes, (void *)addr, &sum,
+                     copy_to_user_iter_csum,
+                     memcpy_to_iter_csum);
      csstate->csum = csum_shift(sum, csstate->off);
      csstate->off += bytes;
      return bytes;





