Re: [PATCH 1/1] x86: fix text_poke

* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> > performance i dont think we should be too worried about at this 
> > moment - this code is so rarely used that it should be driven by 
> > robustness i think.
> 
> That really isn't true. This isn't done just once. It's done many 
> thousands of times.
> 
> I agree that it has to be robust, but if we want to make 
> suspend/resume be instantaneous (and we do), performance does actually 
> matter. Yes, this is probably much less of a problem than waiting for 
> devices, and no, I haven't timed it, but if I counted right, we'll 
> literally be going almost ten thousand of these calls over a 
> suspend/resume cycle.
> 
> That's not "rarely used".

yeah, it's done 2800 times on my box with a distro .config.

no strong feeling either way - but i don't think there's any cross-CPU
TLB flush done in this case within vmap()/vunmap(). Why? Because when
alternative_instructions() runs, we have just a single CPU in
cpu_online_map.

So i think it's only direct vmap()/vunmap() overhead, on a single CPU.
We do a kmalloc/kfree which is rather fast - sub-microsecond. We install
the pages in the ptes - this is rather fast as well - sub-microsecond.
Even assuming cache-cold lines (which they are most of the time) and
repeating this thousands of times, that's at most a few milliseconds IMO.
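
The estimate above can be checked with a quick back-of-envelope sketch.
It assumes each of the two steps costs a full microsecond, which is a
deliberately pessimistic reading of "sub-microsecond"; the constant and
function names are made up for illustration, and only the 2800 call
count comes from this mail.

```python
# Back-of-envelope bound on the vmap()/vunmap() overhead.
# Assumption: round each "sub-microsecond" step up to 1 us as an upper bound.
CALLS = 2800            # call count reported for a distro .config
KMALLOC_US = 1.0        # assumed upper bound for kmalloc/kfree
PTE_INSTALL_US = 1.0    # assumed upper bound for installing the ptes


def total_overhead_ms(calls: int, per_call_us: float) -> float:
    """Total overhead in milliseconds for `calls` calls at `per_call_us` each."""
    return calls * per_call_us / 1000.0


per_call_us = KMALLOC_US + PTE_INSTALL_US
estimate_ms = total_overhead_ms(CALLS, per_call_us)
print(f"{CALLS} calls x {per_call_us} us = {estimate_ms:.1f} ms")
```

With these assumed bounds the total comes to 5.6 ms, which is consistent
with "at most a few milliseconds".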

In fact, most of the actual vmap() related overhead should be
well-cached (the kmalloc bits) - the main cost should come from thrashing
through all the instruction sites and modifying them.

i just measured the actual costs, and the UP/SMP offline/online 
transition time (with Jiri's patch applied) is:

  # time echo 0 > /sys/devices/system/cpu/cpu1/online

  real    0m0.116s
  user    0m0.000s
  sys     0m0.008s

  # time echo 1 > /sys/devices/system/cpu/cpu1/online

  real    0m0.095s
  user    0m0.000s
  sys     0m0.069s

with your fixmap patch:

  # time echo 0 > /sys/devices/system/cpu/cpu1/online

  real    0m0.110s
  user    0m0.001s
  sys     0m0.003s

  # time echo 1 > /sys/devices/system/cpu/cpu1/online

  real    0m0.099s
  user    0m0.000s
  sys     0m0.072s

(i ran it multiple times and picked a representative run)

i also did a third control run with a kernel that had 
alternative_instructions() disabled. The offline/online cost is:

  # time echo 0 > /sys/devices/system/cpu/cpu1/online

  real    0m0.108s
  user    0m0.000s
  sys     0m0.000s

  # time echo 1 > /sys/devices/system/cpu/cpu1/online

  real    0m0.096s
  user    0m0.000s
  sys     0m0.068s

_perhaps_ there's a decrease in time but i couldn't say for sure,
because in the 'go online' case the numbers are so similar.

In the go-offline case there seems to be a gradual decrease but that 
could be statistical noise. (The user/sys times are not reliable because 
most of this happens with irqs off, but the 'real' portion should be 
reliable.)
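
To make the comparison easier to read, here is a small sketch that
subtracts the control run from the two patched runs. The 'real' values
are copied from the three runs above; the dictionary keys and the helper
name are made up for illustration. Each number is a single
representative run, so differences of a few milliseconds may well be
noise.

```python
# 'real' times in seconds, copied from the three runs in this mail.
# Keys are illustrative labels: "jiri" = Jiri's patch, "fixmap" = the
# fixmap patch, "control" = alternative_instructions() disabled.
real = {
    "jiri":    {"offline": 0.116, "online": 0.095},
    "fixmap":  {"offline": 0.110, "online": 0.099},
    "control": {"offline": 0.108, "online": 0.096},
}


def delta_ms(variant: str, transition: str) -> int:
    """Difference from the control run, rounded to whole milliseconds."""
    diff_s = real[variant][transition] - real["control"][transition]
    return round(diff_s * 1000)


for variant in ("jiri", "fixmap"):
    for transition in ("offline", "online"):
        print(f"{variant:7s} {transition:8s} {delta_ms(variant, transition):+d} ms")
```

Against the control run this gives +8 ms and -1 ms (offline, online)
for Jiri's patch, and +2 ms and +3 ms for the fixmap patch.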

	Ingo