On Thu, Jul 23, 2015 at 9:34 PM, Vedran Furač <vedran.furac@xxxxxxxxx> wrote:
> On 07/23/2015 06:47 PM, Ilya Dryomov wrote:
>>
>> To me this looks like a writev() interrupted by a SIGALRM.  I think
>> nginx guys read your original email the same way I did, which is "write
>> syscall *returned* ERESTARTSYS", but I'm pretty sure that is not the
>> case here.
>>
>> ERESTARTSYS shows up in strace output but it is handled by the kernel,
>> userspace doesn't see it (but strace has to be able to see it, otherwise
>> you wouldn't know if your system call has been restarted or not).
>>
>> You cut the output short - I asked for the entire output for a reason,
>> please paste it somewhere.
>
> Might be, however I don't know why nginx would be interrupting it, all
> writes are done pretty fast and timeouts are set to 10 minutes. Here are
> 2 examples on 2 servers with slightly different configs (timestamps
> included):
>
> http://pastebin.com/wUAAcdT7
>
> http://pastebin.com/wHyWc9U5

I don't know - looks like nginx isn't setting SA_RESTART, so it should
be repeating the write()/writev() itself.  That said, if it happens
only on cephfs, we need to track it down.

Try enabling nginx debug logging?  I seem to have a vague recollection
that it records return values.  That should give us something to start
with.

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com