Re: linux-next: add utrace tree

Ananth N Mavinakayanahalli <ananth@xxxxxxxxxx> · Wed, 27 Jan 2010 16:35:55 +0530

On Wed, Jan 27, 2010 at 11:55:16AM +0100, Peter Zijlstra wrote:
> On Wed, 2010-01-27 at 02:43 -0800, Linus Torvalds wrote:
> > 
> > On Wed, 27 Jan 2010, Peter Zijlstra wrote:
> > > 
> > > Right, so you're going to love uprobes, which does exactly that. The
> > > current proposal is overwriting the target instruction with an INT3 and
> > > injecting an extra vma into the target process's address space
> > > containing the original instruction(s) and possible jumps back to the
> > > old code stream.
> > 
> > Just out of interest, how does it handle the threading issue?
> > 
> > Last I saw, at least some CPU people were _very_ nervous about overwriting 
> > instructions if another CPU might be just about to execute them.
> > 
> > Even the "overwrite only the first byte with 'int3'" made them go "umm, I 
> > need to talk to some core CPU people to see if that's ok". They mumble 
> > about possible CPU errata, I$ coherency, instruction retry etc.
> > 
> > I realize kprobes does this very thing, but kprobes is esoteric stuff and 
> > doesn't have much choice. In user space, you _could_ do the modification 
> > on a different physical page and then just switch the page table entry 
> > instead, and not get into the whole D$/I$ coherency thing at all.
> 
> Right, so there's two aspects:
> 
>  1) concurrency when inserting the probe
>  2) concurrency when hitting the probe
> 
> 1) used to be dealt with by using utrace to stop all threads in the
> process and then writing the instruction. I suggested to CoW the page,
> modify the instruction, set the pagetable and flush tlbs at full speed
> -- the very thing you suggest here.
> 
> 2) the traditional approach (and the Intel arch manual describes this) is to
> replace the instruction, single step it, and write the probe back. This
> is racy for multi-threading. The current uprobes stuff solves this by
> doing single-step-out-of-line (XOL).
> 
> XOL injects a new vma into the target process and puts the old
> instruction there, then it single steps on the new location, leaving the
> original site with INT3.
> 
> This doesn't work for things like RIP relative instructions, so uprobes
> considers them un-probable.

Probing RIP-relative instructions works just fine; there are fixups that
take care of it.
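
To illustrate why a fixup is needed at all, here is a minimal sketch of one
possible approach: recomputing the 32-bit displacement when the instruction
is copied to the XOL slot, so that it still refers to the same target. This
is only an illustration and not necessarily what the uprobes code does;
riprel_fixup_disp() and its parameters are hypothetical names, not kernel
APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): compute the disp32 that a
 * RIP-relative instruction needs after being copied from orig_ip to
 * xol_ip, so that it still references the same target address.
 *
 * RIP-relative addressing is relative to the address of the *next*
 * instruction, hence the insn_len terms.
 *
 * Returns false if the adjusted displacement does not fit in 32 bits.
 */
static bool riprel_fixup_disp(uint64_t orig_ip, uint64_t xol_ip,
                              uint8_t insn_len, int32_t old_disp,
                              int32_t *new_disp)
{
    /* Address the original instruction refers to. */
    uint64_t target = orig_ip + insn_len + (uint64_t)(int64_t)old_disp;
    /* Displacement needed to reach the same target from the XOL copy. */
    int64_t disp = (int64_t)(target - (xol_ip + insn_len));

    assert(new_disp != NULL);

    if (disp < INT32_MIN || disp > INT32_MAX)
        return false;

    *new_disp = (int32_t)disp;
    return true;
}
```

The failure case matters: if the XOL area sits more than 2GB away from the
probed text, the displacement cannot simply be rewritten, and a different
strategy (for example, going through a scratch register) would be needed.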

> Also, I myself really object to inserting a vma in a running process,
> it's like a landlord: sure, he has the key, but he won't come in and poke
> through your things.
> 
> The alternative is to place the instruction in TLS or stack space, since
> each thread can only have a single trap at a time, you only need space
> for 1 instruction (plus a possible jump out to the original site). There
> is the 'problem' of marking the TLS/stack executable when being probed.
> 
> Then there is the whole emulation angle, the uprobes people basically
> say it's too much effort to write an x86 emulator.

We don't need to write one. I don't know how easy it is to make the kvm
emulator less kvm-centric (vcpus, kvm_context, etc). Avi?

Ananth 