On Fri, 24 Feb 2012 14:34:31 +0100 Daniel Vetter <daniel at ffwll.ch> wrote:

> > > --- a/include/linux/pagemap.h
> > > +++ b/include/linux/pagemap.h
> > > @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter);
> > >  static inline int fault_in_pages_writeable(char __user *uaddr, int size)
> > >  {
> > >  	int ret;
> > > +	char __user *end = uaddr + size - 1;
> > >
> > >  	if (unlikely(size == 0))
> > >  		return 0;
> > > @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size)
> > >  	 * Writing zeroes into userspace here is OK, because we know that if
> > >  	 * the zero gets there, we'll be overwriting it.
> > >  	 */
> > > -	ret = __put_user(0, uaddr);
> > > +	while (uaddr <= end) {
> > > +		ret = __put_user(0, uaddr);
> > > +		if (ret != 0)
> > > +			return ret;
> > > +		uaddr += PAGE_SIZE;
> > > +	}
> >
> > The callsites in filemap.c are pretty hot paths, which is why this
> > thing remains explicitly inlined.  I think it would be worth adding a
> > bit of code here to avoid adding a pointless test-n-branch and larger
> > cache footprint to read() and write().
> >
> > A way of doing that is to add another argument to these functions, say
> > "bool multipage".  Change the code to do
> >
> > 	if (multipage) {
> > 		while (uaddr <= end) {
> > 			...
> > 		}
> > 	}
> >
> > and change the callsites to pass in constant "true" or "false".  Then
> > compile it up and manually check that the compiler completely removed
> > the offending code from the filemap.c callsites.
> >
> > Wanna have a think about that?  If it all looks OK then please be sure
> > to add code comments explaining why we did this.
>
> I wasn't really happy with the added branch either, but failed to come up
> with a trick to avoid it. Imho adding new _multipage variants of these
> functions instead of adding a constant argument is simpler because the
> functions don't really share much thanks to the block below.
> I'll see what it looks like (and obviously add a comment explaining
> what's going on).

well... that's just syntactic sugar:

static inline int __fault_in_pages_writeable(char __user *uaddr, int size, bool multipage)
{
	...
}

static inline int fault_in_pages_writeable(char __user *uaddr, int size)
{
	return __fault_in_pages_writeable(uaddr, size, false);
}

static inline int fault_in_multipages_writeable(char __user *uaddr, int size)
{
	return __fault_in_pages_writeable(uaddr, size, true);
}

which I don't think is worth bothering with given the very small number
of callsites.
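
For illustration only, here is a rough userspace sketch of the
constant-argument pattern discussed above.  It is not the kernel code:
a plain store stands in for __put_user(), and SKETCH_PAGE_SIZE and all
of the sketch_* names are made up for this example.  The idea is that
when an inlined caller passes a compile-time constant for "multipage",
the compiler can be expected to discard the branch that is not taken,
though that should still be checked in the generated code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size, for illustration only (hypothetical constant). */
#define SKETCH_PAGE_SIZE 4096

/*
 * Hypothetical userspace stand-in for the shared helper.  A plain store
 * replaces __put_user(), so no fault can be reported here; the return
 * value is simply the number of bytes written.
 *
 * With multipage == false only the first byte is written.  With
 * multipage == true one byte is written every SKETCH_PAGE_SIZE bytes,
 * starting at buf.
 */
static inline int sketch_touch_pages(char *buf, size_t size, bool multipage)
{
	int touched = 0;
	size_t off;

	assert(buf != NULL || size == 0);

	if (size == 0)
		return 0;

	if (multipage) {
		/* Index by offset so no out-of-range pointer is ever formed. */
		for (off = 0; off < size; off += SKETCH_PAGE_SIZE) {
			buf[off] = 0;
			touched++;
		}
	} else {
		buf[0] = 0;
		touched++;
	}

	return touched;
}

/* Thin wrappers that pass a constant, as in the mail above. */
static inline int sketch_touch_first_page(char *buf, size_t size)
{
	return sketch_touch_pages(buf, size, false);
}

static inline int sketch_touch_all_pages(char *buf, size_t size)
{
	return sketch_touch_pages(buf, size, true);
}
```

Whether the unused branch really disappears at a given callsite depends
on the compiler and optimisation level, so inspecting the disassembly
of the callers is still the way to confirm it.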