On Wed, Sep 2, 2020 at 8:17 AM Christophe Leroy <christophe.leroy@xxxxxxxxxx> wrote:
>
> With this fix, I get
>
> root@vgoippro:~# time dd if=/dev/zero of=/dev/null count=1M
> 536870912 bytes (512.0MB) copied, 6.776327 seconds, 75.6MB/s
>
> That's still far from the 91.7MB/s I get with 5.9-rc2, but better than
> the 65.8MB/s I got yesterday with your series. Still some way to go though.

I don't see why this change would make any difference.

And btw, why do the 32-bit and 64-bit checks even differ? It's not
like the extra (single) instruction should even matter.

I think the main reason is that the simpler 64-bit case could stay as
a macro (because it only uses "addr" and "size" once), but honestly,
that "simplification" doesn't help when you then need to have that
#ifdef for the 32-bit case and an inline function anyway.

So why isn't it just

    static inline int __access_ok(unsigned long addr, unsigned long size)
    {
        return addr <= TASK_SIZE_MAX && size <= TASK_SIZE_MAX-addr;
    }

for both and be done with it?

The "size=0" check is only relevant for the "addr == TASK_SIZE_MAX"
case, and existed in the old code because it had that "-1" thing
because "seg.seg" was actually TASK_SIZE-1.

Now that we don't have any TASK_SIZE-1, zero isn't special any more.

However, I suspect a bigger reason for the actual performance
degradation would be the patch that makes things use "write_iter()"
for writing, even when a simpler "write()" exists.

For writing to /dev/null, the cost of setting up iterators and all the
pointless indirection is all kinds of stupid.

So I think "write()" should just go back to default to using
"->write()" rather than "->write_iter()" if the simpler case exists.

             Linus
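
For anyone who wants to poke at the boundary cases of the single check proposed above, here is a minimal userspace sketch. It is not kernel code: the TASK_SIZE_MAX value is an illustrative assumption (the real one is arch-specific), and access_ok_sketch() is a hypothetical name standing in for __access_ok(), since double-underscore names are reserved outside the kernel.

```c
#include <assert.h>

/* Assumption: illustrative limit only, not the value of any real arch. */
#define TASK_SIZE_MAX 0xc0000000UL

/*
 * Hypothetical userspace stand-in for the proposed __access_ok().
 * The subtraction cannot underflow because the first test already
 * guarantees addr <= TASK_SIZE_MAX, and addr + size is never computed,
 * so a huge size cannot wrap around.
 */
static inline int access_ok_sketch(unsigned long addr, unsigned long size)
{
	return addr <= TASK_SIZE_MAX && size <= TASK_SIZE_MAX - addr;
}
```

With the limit itself (rather than limit-1) in the comparison, a zero-length access at addr == TASK_SIZE_MAX passes without a special case, a one-byte access there fails, and a size of ~0UL is rejected for any nonzero addr.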