On Thu, Jul 23, 2015 at 6:28 PM, Vedran Furač <vedran.furac@xxxxxxxxx> wrote:
> On 07/23/2015 05:25 PM, Ilya Dryomov wrote:
>> On Thu, Jul 23, 2015 at 6:02 PM, Vedran Furač <vedran.furac@xxxxxxxxx> wrote:
>>> 4118 writev(377, [{"\5\356\307l\361"..., 4096}, {"\337\261\17<\257"...,
>>> 4096}, {"\211&;s\310"..., 4096}, {"\370N\372:\252"..., 4096},
>>> {"\202\311/\347\260"..., 4096}, ...], 33) = ? ERESTARTSYS (To be restarted)
>>> 4118 --- SIGALRM (Alarm clock) @ 0 (0) ---
>>> 4118 rt_sigreturn(0xe) = -1 EINTR (Interrupted system call)
>>> 4118 gettid() = 4118
>>> 4118 write(4, "2015/"..., 520) = 520
>>> 4118 close(1206) = 0
>>> 4118 unlink("/home/ceph/temp/45/45/5/0000154545") = 0
>>
>> Sorry, I misread your original email and missed the nginx part
>> entirely. Looks like Zheng, who commented on IRC, was right:
>>
>> "the ERESTARTSYS is likely caused by some timeout mechanism in nginx
>> signal handler for SIGALARM does not want to restart the write syscall"
>
> Knowing that this might be an nginx issues as well, I've asked the same
> thing on their mailing list in parallel, their response was:
>
> "It more looks like a bug in cephfs. writev() should never return
> ERESTARTSYS."

To me this looks like a writev() interrupted by a SIGALRM. I think the
nginx guys read your original email the same way I did, which is "write
syscall *returned* ERESTARTSYS", but I'm pretty sure that is not the
case here. ERESTARTSYS shows up in strace output, but it is handled by
the kernel and userspace never sees it: the call is either restarted or
fails with EINTR, as the rt_sigreturn() line in your trace shows.
strace has to be able to see ERESTARTSYS, otherwise you wouldn't know
whether your system call has been restarted or not.

You cut the output short. I asked for the entire output for a reason,
so please paste it somewhere.

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
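
The distinction between ERESTARTSYS and EINTR can be reproduced outside of nginx and cephfs. What follows is only a minimal sketch, not code from this thread. It assumes Linux and CPython, uses a blocking read(2) on an empty pipe as a stand-in for the writev() in the trace, and calls libc through ctypes so that Python's own EINTR retry logic stays out of the way. The name read_with_alarm and the timer values are made up for illustration.

```python
import ctypes
import errno
import os
import signal
import threading

# Handle to the C library of the running process (assumes Linux/glibc or similar).
libc = ctypes.CDLL(None, use_errno=True)
libc.read.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t]
libc.read.restype = ctypes.c_ssize_t


def read_with_alarm(restart):
    """Block in read(2) on an empty pipe, let SIGALRM fire, and return
    (return value, errno) exactly as user space sees them.

    restart=False: handler installed without SA_RESTART.
    restart=True:  handler installed with SA_RESTART.
    """
    r, w = os.pipe()
    signal.signal(signal.SIGALRM, lambda signum, frame: None)
    # siginterrupt(sig, False) sets SA_RESTART; True clears it.
    signal.siginterrupt(signal.SIGALRM, not restart)

    writer = None
    if restart:
        # Data shows up only after the alarm has fired, so the read can
        # complete only if the kernel restarted it after the handler ran.
        writer = threading.Timer(0.3, os.write, args=(w, b"x"))
        writer.start()

    buf = ctypes.create_string_buffer(1)
    ctypes.set_errno(0)
    signal.setitimer(signal.ITIMER_REAL, 0.1)  # SIGALRM in 100 ms
    n = libc.read(r, buf, 1)                   # blocks until signal or data
    err = ctypes.get_errno()
    signal.setitimer(signal.ITIMER_REAL, 0)    # disarm the timer

    if writer is not None:
        writer.join()
    os.close(r)
    os.close(w)
    return n, err


if __name__ == "__main__":
    n, err = read_with_alarm(restart=False)
    print("no SA_RESTART:", n, errno.errorcode.get(err, err))
    n, err = read_with_alarm(restart=True)
    print("SA_RESTART:   ", n, errno.errorcode.get(err, err))
```

Running either case under strace -f should show the interrupted call annotated with ERESTARTSYS, while the values returned to the program are only ever EINTR or a normal byte count.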