Re: 10.2.4 Jewel released


 



Hi Gregory,

On Thu, Dec 8, 2016 at 12:10 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> In slightly more detail: you are clearly seeing a problem with the
> messenger, as indicated by the sock_recvmsg at the top of the CPU
> usage list. We've seen this elsewhere very rarely, which is why
> there's already a backport queued up which we didn't block on.
> The 15-minute period you're seeing is the default timeout we set on
> sockets before we start marking them closed if there's no activity.
>
> We're not quite sure why it's causing trouble now, although we have
> one or two patches we are speculating about and looking into.
>
> This didn't turn up in testing because as best we can tell it's only a
> situation you can expect to encounter when you have idle TCP
> connections between systems (or in fairly artificial failed
> networking).

For the OSDs running at 100% CPU, strace does indeed show a lot of
EAGAIN on some of the sockets.
I'll try to get some packet captures if I can.
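
For anyone following along, here is a minimal sketch of what EAGAIN on a socket means. It is not Ceph's messenger code; the function name count_eagain and the use of a local socket pair are my own assumptions for illustration. A read on a non-blocking socket that has no data pending fails immediately with EAGAIN instead of sleeping, so a caller that keeps retrying without first waiting for readability will burn CPU:

```python
import errno
import socket


def count_eagain(attempts=5):
    """Call recv() on an idle non-blocking socket and count EAGAIN results.

    Hypothetical helper for illustration only; assumes a Unix-like system.
    """
    # A connected pair of local sockets; nothing is ever written to either end.
    a, b = socket.socketpair()
    a.setblocking(False)
    eagain = 0
    try:
        for _ in range(attempts):
            try:
                a.recv(4096)
            except BlockingIOError as exc:
                # No data pending: the kernel returns at once instead of blocking.
                if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
                    eagain += 1
    finally:
        a.close()
        b.close()
    return eagain


if __name__ == "__main__":
    print(count_eagain())
```

Since the socket stays idle, every attempt in the loop fails the same way. That is roughly the pattern strace would show if something keeps reading from a socket that has nothing to deliver.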

Kind regards,

Ruben
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


