Indan Zupancic wrote:
> On Thu, January 19, 2012 09:16, Chris Evans wrote:
> > On Wed, Jan 18, 2012 at 4:14 PM, Indan Zupancic <indan@xxxxxx> wrote:
> >> On Wed, January 18, 2012 22:13, Chris Evans wrote:
> >>> On Wed, Jan 18, 2012 at 4:12 AM, Indan Zupancic <indan@xxxxxx> wrote:
> >>>> On Wed, January 18, 2012 06:43, Chris Evans wrote:
> >>>>> 2) Tracee traps
> >>>>> 2b) Tracee could take a SIGKILL here
> >>>>> 3) Tracer looks at registers; bad syscall
> >>>>> 3b) Or tracee could take a SIGKILL here
> >>>>> 4) The only way to stop the bad syscall from executing is to rewrite
> >>>>> orig_eax (PTRACE_CONT + SIGKILL only kills the process after the
> >>>>> syscall has finished)
> >>>>
> >>>> Yes, we rewrite it to -1.
> >>>>
> >>>>> 5) Disaster: the tracee took a SIGKILL so any attempt to address it by
> >>>>> pid (such as PTRACE_SETREGS) fails.
> >>>>
> >>>> I assume that if a task can execute system calls and we get ptrace events
> >>>> for that, that we can do other ptrace operations too. Are you saying that
> >>>> the kernel has this ptrace gap between SIGKILL and task exit where ptrace
> >>>> doesn't work but the task continues executing system calls? That would be
> >>>> a huge bug, but it seems very unlikely too, as the task is stopped and
> >>>> shouldn't be able to disappear till it is continued by the tracer.
> >>>>
> >>>> I mean, really? That would be stupid.
> >>
> >> Okay, I tested this scenario and you're right, we're screwed.
> >>
> >> What the hell guys?
> >
> > Steady on :) ptrace() has never been sold as a technology upon which
> > its safe to build security solutions.
>
> Well, that can be said of pretty much all kernel functionality.
> That is no excuse for crazy behaviour.
>
> I more or less fixed it by turning all SIGKILLs into SIGTERMs.
> Perhaps I should use a more obscure signal instead.
>
> >> What about other PID checks in the kernel, are they still
> >> safe if the process looks dead but is still active? Or is it a ptrace-only
> >> problem?
> >>
> >>>> If true we have to work around it by disallowing SIGKILL and just sending
> >>>> them ourselves within the jail. Meh.
> >>
> >> I guess this helps a bit. It doesn't prevent external signals, but prisoners
> >> don't have control over that.
> >
> > Well.... a prisoner may be able to play other tricks:
> > - Allocate lots of memory... kernel may start spraying around SIGKILLs
> > - Sending SIGKILL via prctl()
>
> prctl is disallowed within our jail. Did you had PR_SET_PDEATHSIG in mind?
> But doesn't the tracer become the parent when ptracing or not for this?
> Or were you thinking about enabling SECCOMP and counting on the SIGKILL
> being process-wide instead of thread-specific?
>
> > - Sending SIGKILL via fcntl()
>
> I haven't written the fcntl demultiplexor yet, but I missed fcntl could
> be used for sending signals. I knew there was whacky stuff in there, but
> didn't expect it to be that bad. Thanks.
>
> > - Sending SIGKILL via clone()
>
> How? And can you send it to another process than yourself?
> >
> >>
> >> Is this SIGKILL specific or is it true for all task ending signals?
> >
> > Can't remember - try it?
>
> Tried: It's safe with SIGTERM, so I assume the others are fine too.
> I'll double check though...
>
> >>
> >>>> How will you avoid file path races with BPF?
> >>>
> >>> There is typically no need for file-path based access control in an FTP server.
> >>> Take for example anonymous FTP, which will typically be inside a
> >>> chroot() to /var/ftp. Inside that filesystem tree -- if you can open()
> >>> it, you can have it.
> >>
> >> Ah, you count on having root access. We don't.
> >>
> >> Do you know any more crazy security destroying holes?
> >
> > Try spraying SIGCONT and / or SIGSTOP at tracees. It may be possible
> > to confuse the tracer about whether a SIGTRAP event is syscall entry
> > or exit.
>
> Yes, heard about that weirdness before, but it's all ignored. We're
> using PTRACE_O_TRACESYSGOOD.
>
> > Try doing an execve() that fails. May cause similar state confusion in
> > the tracer.
>
> Our jailer pretty much ignores all signals and only handles syscalls
> and task exits. We actually check execve's return value to know if we
> have to do our stuff or not.

Take a look at the file README-linux-ptrace in recent strace Git.
(Thanks Denys!) It describes some *really* ugly things Linux does to
ptrace on execve when there are threads: The most exciting being the
return value is sent to a different tid than called execve(), and other
tids magically disappear without notification.

You can use PTRACE_O_TRACEEXEC to see if the execve() succeeds, btw.
It has the useful side-effect of preventing the legacy behaviour of
SIGTRAP being sent as a normal queued signal after successful execve().

-- Jamie
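
For what it's worth, here is roughly how PTRACE_O_TRACESYSGOOD and
PTRACE_O_TRACEEXEC show up in a tracer's wait loop. This is only a
sketch in C, not code from either jailer: set_jail_options(),
classify_stop() and the stop_kind enum are names made up for
illustration, and the status tests follow what ptrace(2) documents for
the two options. Note that TRACESYSGOOD only marks a stop as a syscall
stop; the tracer still has to track entry versus exit itself.

```c
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical classification of a waitpid() status for one tracee. */
enum stop_kind {
    STOP_EXITED,   /* tracee is gone: normal exit or fatal signal */
    STOP_SYSCALL,  /* syscall entry or exit stop (TRACESYSGOOD) */
    STOP_EXEC,     /* execve() succeeded (TRACEEXEC) */
    STOP_SIGNAL,   /* ordinary signal-delivery stop */
    STOP_OTHER     /* anything this sketch does not handle */
};

/* Ask for both options on an already attached and stopped tracee.
 * Returns whatever ptrace() returns: 0 on success, -1 on error. */
long set_jail_options(pid_t pid)
{
    assert(pid > 0);
    return ptrace(PTRACE_SETOPTIONS, pid, (void *)0,
                  (void *)(long)(PTRACE_O_TRACESYSGOOD | PTRACE_O_TRACEEXEC));
}

/* Decode a status word returned by waitpid(pid, &status, __WALL). */
enum stop_kind classify_stop(int status)
{
    if (WIFEXITED(status) || WIFSIGNALED(status))
        return STOP_EXITED;
    if (!WIFSTOPPED(status))
        return STOP_OTHER;
    /* With TRACESYSGOOD, syscall stops report SIGTRAP with bit 7 set,
     * so they cannot be confused with a real SIGTRAP. */
    if (WSTOPSIG(status) == (SIGTRAP | 0x80))
        return STOP_SYSCALL;
    /* With TRACEEXEC, a successful execve() is reported as an event
     * stop instead of a queued SIGTRAP. */
    if ((status >> 8) == (SIGTRAP | (PTRACE_EVENT_EXEC << 8)))
        return STOP_EXEC;
    /* Other PTRACE_EVENT_* stops are left to the caller. */
    if ((status >> 16) != 0)
        return STOP_OTHER;
    return STOP_SIGNAL;
}
```

A tracer built this way would treat a STOP_EXEC as the sign that
execve() worked, and a syscall-exit stop for execve without a preceding
STOP_EXEC as a failed one. It does nothing about the SIGKILL race
discussed earlier in the thread.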