On Thu, Sep 5, 2013 at 1:46 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> static int prepend_name(char **buffer, int *buflen, struct qstr *name)
> {
>         const char *s = ACCESS_ONCE(name->name);
>         unsigned len = ACCESS_ONCE(name->len);
>         char *p;
>
>         *buflen -= len;
>         if (*buflen < 0)
>                 return -ENAMETOOLONG;
>         p = *buffer -= len;
>         while (len--) {
>                 char c = *s++;
>                 if (!c)
>                         break;
>                 *p++ = c;
>         }
>         return 0;
> }

.. and appended is a *totally* untested "dentry_string_copy()" that does
things a word at a time (it doesn't have your "buffer/buflen -= len"
parts; this is purely the copying code).

Note that the return value is only optimistic - it may be copying NUL
bytes *without* returning an error. We don't care. We really only care
that it not access past the end of the string, under the same rules the
RCU lookup uses: it *can* access aligned words past the NUL. The error
return is just meant as a "hey, if we had to go to the bother of checking
for a NUL byte and we found it, we might as well let the caller know so
that it can choose to stop the optimistic RCU thing early".

Also note that this thing depends on CONFIG_DCACHE_WORD_ACCESS (which in
turn depends on little-endian), _and_ it only works if the architecture
supports fast unaligned stores (which is true on x86, but may not be
true on ARM).

And it does require that it can overshoot the destination string - it
won't *change* the destination string past the end, but it will do an
unaligned store past it, so the destination needs some padding there.
That might be the biggest problem with this routine and the thing that
scuttles it.

We could do that final store as a series of byte writes instead, I
guess. So maybe it should do

        unsigned char *p = (unsigned char *)ct;
        do { *p++ = a; a >>= 8; } while (--tcount);

at the end instead of doing the final word store.

I haven't actually tested it, but it looks right (within the above
constraints), and it generates asm that looks good enough.

It is not entirely obvious that it's really worth it over the "stupid"
character-at-a-time model, though: the loop counts are likely not all
that high, and the unaligned store together with loading the big
constants for the "test word for zero" check might add enough overhead
that it's not a big win. But the word-at-a-time approach was a noticeable
win for the RCU dentry lookup, so it's possible that it's noticeable
here too.

                Linus

----

/*
 * The source comes from a dentry, which means that it is guaranteed to
 * be aligned. However, the destination is not likely to be aligned.
 *
 * The only rule is: we must *not* overstep the source by more than one
 * aligned word access past a NUL character.
 *
 * NOTE NOTE NOTE! This also requires that we can overshoot the
 * destination string by up to one unaligned word access!
 */
int dentry_string_copy(const unsigned char *_cs, unsigned char *_ct, unsigned tcount)
{
        const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
        const unsigned long *cs = (const void *)_cs;
        unsigned long *ct = (void *)_ct;
        unsigned long a, mask;

        /* Copy whole aligned words, bailing out early if we copied a NUL */
        for (;;) {
                unsigned long adata;
                a = *cs;
                if (tcount < sizeof(unsigned long))
                        break;
                *ct = a;
                if (unlikely(has_zero(a, &adata, &constants)))
                        return -EINVAL;
                cs++;
                ct++;
                tcount -= sizeof(unsigned long);
                if (!tcount)
                        return 0;
        }

        /* We don't care if there's a NUL in the last word */
        mask = ~(~0ul << tcount*8);
        *ct = (a & mask) | (~mask & *ct);
        return 0;
}
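
A minimal stand-alone sketch of the tail-mask merge that the last two
lines of dentry_string_copy() rely on - not part of the patch above, just
hypothetical userspace code assuming a little-endian machine and a tcount
between 1 and sizeof(unsigned long)-1:

#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch only: the final-word merge keeps the first 'tcount' bytes of
 * the source word 'a' and leaves the remaining destination bytes alone.
 * Assumes little-endian and 1 <= tcount < sizeof(unsigned long); the
 * word-at-a-time loop has already handled everything else.
 */
static void merge_tail(unsigned long *ct, unsigned long a, unsigned tcount)
{
        unsigned long mask = ~(~0ul << tcount * 8);

        *ct = (a & mask) | (~mask & *ct);
}

int main(void)
{
        unsigned char dst[sizeof(unsigned long)];
        unsigned long a, ct;

        /* Pretend the destination word already holds 0xff padding bytes */
        memset(dst, 0xff, sizeof(dst));
        memcpy(&ct, dst, sizeof(ct));

        /* Source word: "abc" plus whatever happens to follow it in memory */
        memcpy(&a, "abc\0zzzzzz", sizeof(a));

        merge_tail(&ct, a, 3);
        memcpy(dst, &ct, sizeof(dst));

        /* Only the first three bytes changed; the padding survived */
        assert(memcmp(dst, "abc", 3) == 0);
        assert(dst[3] == 0xff);
        printf("tail merge ok\n");
        return 0;
}

The point of the merge is that only the low tcount bytes of the
destination word change; the bytes past the end of the name keep whatever
value they had, which is why the routine only needs the destination to
tolerate the one unaligned word store itself.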