On Fri, May 17, 2013 at 12:00 PM, richard -rw- weinberger <richard.weinberger@xxxxxxxxx> wrote:
> On Wed, May 15, 2013 at 9:35 PM, richard -rw- weinberger
> <richard.weinberger@xxxxxxxxx> wrote:
>> On Wed, May 15, 2013 at 9:30 PM, Toralf Förster <toralf.foerster@xxxxxx> wrote:
>>> On 05/15/2013 09:11 PM, richard -rw- weinberger wrote:
>>>> On Wed, May 15, 2013 at 9:06 PM, Toralf Förster <toralf.foerster@xxxxxx> wrote:
>>>>> On 05/13/2013 09:12 AM, richard -rw- weinberger wrote:
>>>>>> This looks like another issue.
>>>>>> Are you testing process_vm_writev() with trinity?
>>>>>> Looks like it managed to overwrite the stub page of a process, which
>>>>>> is not good.
>>>>> nope, it is the mremap syscall.
>>>>>
>>>>> A command like
>>>>>
>>>>> $>trinity -c mremap -N 10
>>>>>
>>>>> immediately after starting a 32 bit Gentoo linux guest with current kernel 3.10-rc1-... +
>>>>> strnlen + stub4 patch works, but later a
>>>>>
>>>>> $>trinity -c mremap -N 1000
>>>>>
>>>>> yields
>>>>>
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: Stub registers -
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: 0 - 100000
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: 1 - 300000
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: 2 - 0
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: 3 - 0
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: 4 - 0
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: 5 - 0
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: 6 - 0
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: 7 - 7b
>>>>> 2013-05-15T21:02:04.061+02:00 trinity kernel: 8 - 7b
>>>>> 2013-05-15T21:02:04.065+02:00 trinity kernel: 9 - 0
>>>>> 2013-05-15T21:02:04.065+02:00 trinity kernel: 10 - 33
>>>>> 2013-05-15T21:02:04.065+02:00 trinity kernel: 11 - ffffffff
>>>>> 2013-05-15T21:02:04.065+02:00 trinity kernel: 12 - 1000c3
>>>>> 2013-05-15T21:02:04.065+02:00 trinity kernel: 13 - 73
>>>>> 2013-05-15T21:02:04.065+02:00 trinity kernel: 14 - 10206
>>>>> 2013-05-15T21:02:04.065+02:00 trinity kernel: 15 - 101028
>>>>> 2013-05-15T21:02:04.065+02:00 trinity kernel: 16 - 7b
>>>>> 2013-05-15T21:02:04.065+02:00 trinity kernel: wait_stub_done : failed to wait for SIGTRAP, pid = 15692, n = 15692, errno = 0, status = 0xb7f
>>>>>
>>>>> and now that process can't be killed - I had to stop the UML guest.
>>>>
>>>> Hmm, you've remapped the stub page and therefore the process broke.
>>>> I think it would make sense to kill the process instead of writing
>>>> the "wait_stub_done ..." message.
>>>> Changing the stub page is as destructive as overwriting the stack.
>>>
>>> Unfortunately no trinity process can be killed as soon as that happens.
>>> Neither pgrep, pkill, nor "ps -efla" return any result.
>>> Killing any of those processes by its pid does not work either.
>>
>> Hmm, not good.
>> I need to create a reproducer for that.
>> I'm unsure what exactly is going on.
>
> Good news, I have a reproducer for the problem and found out what the
> root cause is.
> UML is unable to terminate the task with the broken skas page.
> A fix is on the way...

Toralf, can you please apply the attached patch too?
It makes processes that corrupted their stub pages killable.

--
Thanks,
//richard
Attachment:
wait_stub_done_fix.diff
Description: Binary data
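
For readers following the log: the "status = 0xb7f" value in the quoted wait_stub_done message is a raw wait status, and it can be decoded with the standard wait-status helpers. The following is a minimal sketch in Python, assuming the usual Linux encoding of wait statuses. The helper name decode_wait_status is hypothetical and is not part of UML, trinity, or the attached patch.

```python
import os
import signal


def decode_wait_status(status):
    """Classify a raw wait status (as returned by waitpid) into (kind, number).

    Hypothetical helper for illustration only.
    """
    if os.WIFSTOPPED(status):
        # Child is stopped; the number is the signal that stopped it.
        return ("stopped", os.WSTOPSIG(status))
    if os.WIFSIGNALED(status):
        # Child was terminated by a signal.
        return ("killed", os.WTERMSIG(status))
    if os.WIFEXITED(status):
        # Child exited normally; the number is its exit code.
        return ("exited", os.WEXITSTATUS(status))
    return ("unknown", status)


# Status value taken from the wait_stub_done log line in the thread.
kind, number = decode_wait_status(0xb7f)
print(kind, number, signal.Signals(number).name)  # on Linux: stopped 11 SIGSEGV
```

Under that assumption the stub was stopped by signal 11 (SIGSEGV) rather than the SIGTRAP (5) that wait_stub_done was waiting for, which fits the explanation in the thread that the stub page had been remapped.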