On Fri, 2013-05-17 at 12:33 +0200, Tomas Mraz wrote: > On Thu, 2013-05-16 at 22:29 -0700, Adam Williamson wrote: > > This is a weird bug I've seen 3 or 4 times since upgrading to F19, and > > am having trouble pinning down. > > > > Occasionally, after my session has been up for some time, runs of 'su' > > start behaving oddly. After I enter the root password, it takes a long > > time - longer than the delay that's always happened when you fat-finger > > your password, even - before it succeeds. Usually, of course, it's > > instant. But when this bug happens, I just get to sit there while it > > thinks about it for like 10-15 seconds before I eventually get a root > > prompt. > > > > This only applies to my desktop session - but it applies to any terminal > > running in the desktop. At least, it applies to gnome-terminal (even if > > closed and opened again) and xterm. But if I go to ctrl-alt-f2, login, > > and run an 'su', that still returns instantly. > > > > I've tried strace'ing su, but interestingly, I can't get it to work: > > running su via strace always seems to result in "Authentication > > failure" (which doesn't display this bug; it's only _normally_ slow, the > > same slowness that has always been the case when you fail the password). > > > > So I'm kinda stuck, really. Has anyone else seen this? Any bright ideas > > for debugging it? Thanks! > > Try attaching the strace (as root) to the su process when it waits for > the password. Wow, that's a good (and obvious) idea. *smacks self on forehead* So that works indeed, and now I can see where the pause is. Relevant snippet: = 0 stat("/root/.xauth/import", 0x7fff660f8b70) = -1 ENOENT (No such file or directory) getgid() = 1001 getuid() = 1001 pipe([4, 6]) = 0 pipe([7, 8]) = 0 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f2a174b9a10) = 15192 close(4) = 0 close(8) = 0 close(6) = 0 read(7, ** PAUSE HERE ** "", 2048) = 0 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=15192, si_status=1, si_utime=0, si_stime=0} --- close(7) = 0 So it forks off a child, and pauses while that child runs. So I did it again, and straced the child process, and here's what I get: [root@adam Beta-RC2]# strace -p 22182 Process 22182 attached restart_syscall(<... resuming interrupted call ...>) = 0 open("/run/gdm/auth-for-adamw-szT11D/database-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EACCES (Permission denied) rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 nanosleep({2, 0}, 0x7fff8b56d5c0) = 0 open("/run/gdm/auth-for-adamw-szT11D/database-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EACCES (Permission denied) rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 nanosleep({2, 0}, 0x7fff8b56d5c0) = 0 open("/run/gdm/auth-for-adamw-szT11D/database-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EACCES (Permission denied) rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 nanosleep({2, 0}, 0x7fff8b56d5c0) = 0 open("/run/gdm/auth-for-adamw-szT11D/database-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EACCES (Permission denied) rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 nanosleep({2, 0}, 0x7fff8b56d5c0) = 0 open("/run/gdm/auth-for-adamw-szT11D/database-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EACCES (Permission denied) rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 nanosleep({2, 0}, 0x7fff8b56d5c0) = 0 open("/run/gdm/auth-for-adamw-szT11D/database-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EACCES (Permission denied) rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 nanosleep({2, 0}, 0x7fff8b56d5c0) = 0 open("/run/gdm/auth-for-adamw-szT11D/database-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = -1 EACCES (Permission denied) rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0 rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 nanosleep({2, 0}, 0x7fff8b56d5c0) = 0 write(2, "/usr/bin/xauth: timeout in lock"..., 91) = -1 EBADF (Bad file descriptor) write(2, "\0", 1) = -1 EBADF (Bad file descriptor) exit_group(1) = ? +++ exited with 1 +++ The actual child process that is run is: /usr/bin/xauth -f /run/gdm/auth-for-adamw-szT11D/database nlist :0 If we look at the permissions: [root@adam Beta-RC2]# ls -lZ /run/gdm/auth-for-adamw-szT11D/database -rw-------. adamw adamw system_u:object_r:xdm_var_run_t:s0 /run/gdm/auth-for-adamw-szT11D/database [root@adam Beta-RC2]# ls -dlZ /run/gdm/auth-for-adamw-szT11D/ drwx--x--x. adamw adamw system_u:object_r:xdm_var_run_t:s0 /run/gdm/auth-for-adamw-szT11D/ note, the file is just 'database', while the strace shows 'database-c'; I don't know if that's significant. SELinux doesn't seem to be involved here, or at least, 'setenforce Permissive' doesn't make the problem go away. I note in the 'broken' state some other stuff doesn't work either, though some of these may be other bugs. One that's very likely related is that I have a dumb hack set up to isolate facebook from the rest of my web use (for privacy reasons); a user account called 'facebook' and a little script called /usr/local/bin/facebook which just does this: #!/bin/sh su facebook -c "firefox http://www.facebook.com" my main user account never goes to facebook.com and has it blocked seven ways from Sunday using noscript and adblock. Anyhow, when I'm seeing this problem with a plain 'su', that dumb facebook trick doesn't work any more: after entering the 'facebook' user's password, I get a similar long pause, then: (process:22887): GLib-CRITICAL **: g_slice_set_config: assertion `sys_page_size == 0' failed No protocol specified No protocol specified Error: cannot open display: :0 that's almost certainly the same bug. One that I'm not sure is the same bug is that an attempt to loopback mount an ISO file to /mnt/temp when running as root after an 'su' command that hit this bug seems to hang and be unkillable. -- Adam Williamson Fedora QA Community Monkey IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora http://www.happyassassin.net -- devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxxx https://admin.fedoraproject.org/mailman/listinfo/devel