Re: Multiple OSDs suicide because of client issues?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

This is one of our production clusters which is dual 40 Gb Ethernet
using VLANs for cluster and public networks. I don't think this is
unusual, not like my dev cluster which runs Infiniband and IPoIB. The
client nodes are connected at 10 Gb Ethernet.

I wonder if you are talking about the system logs, not the Ceph OSD
logs. I'm attaching a snippet that includes the hour before and after.
- ----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Mon, Nov 23, 2015 at 10:33 AM, Gregory Farnum wrote:
> On Mon, Nov 23, 2015 at 11:27 AM, Robert LeBlanc wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA256
>>
>> I checked the SAR data and the disks for all the OSDs showed usual
>> performance until 20:57:32 when over the next few minutes the IOPS,
>> bandwidth and latency all decreased. The only thing that I can think
>> of is that some replies to the client got hung up and backed up the
>> OSD process or something.
>
> That shouldn't really be possible but I seem to recall you've got a
> weird network? So maybe.
>
>> There are a couple of other backtraces in
>> the log file, but I could not trace any of them to something useful.
>>
>> Since we took the VMs off that client, we haven't had the problem show up again.
>
> Yeah, we'd really need the actual log output that gets dumped to logs
> on crash — it specifies precisely which thing failed.
> -Greg

-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v1.2.3
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJWU1UGCRDmVDuy+mK58QAAOa0P/RIAO06Fd3myuzyyqlYo
N2VA9bWGaq06iwTLF1mufiEmbVaIPAIAQk+GaODgv/PKSJj6ecqS1/au832d
oO2LocnreeOTLJPL/n+mdeglos63ocwyvM4LP/XpvWJJ1C694mUWjvIxlWKR
4zFXH9V5DMTmCwm3kkY4qXqNUS/FJZyd5fwOg7NnqSzuy2UHIxEOzjGaKUwf
ipgVgy8iIn5tprx/rCawrYvuY141z4nOu1jIzEkXEa+F7pxfpKsXeKFQvEnw
aax/RNuikhLKu6rbCJKCQWL3uUZzrshp6EE3T/uXDP8rMX1ojOcmL1L1bJhh
4XqNdgXYuUXlP2cJtJSfxy7RFayZIw4Htn3YnWCrg7uqzrfwf2Hh2DGAE+06
ggH7qo9Z99hg7ENTDSzpFOyE5eM+oA8OQgpn+/8X7OyNG/eNwJnBlHTT0C+f
LunPV8I4HjRAuCNpkz16ZO/+pLnMAbk/Vp1wGJ3Qcdmxwk1UQ3L+UKASrwWd
S861pU4GOGoRymcse20DDRaChbhQRmK0nxjFq4/YXIo36lbMH2gcXyuAza5z
oFvmEkGwDoYneL0JZHJdHhRqkapMMMRqODC/2YU2EXa3fYatamKCwaHqPSdp
c0BN/yRFlB74RA7szvItUHORyiROxo/MnmGKlCBUNud0cVbBoyzSwfSBwCN1
zA7x
=g7l3
-----END PGP SIGNATURE-----

Attachment: messages-20151122.snip.log.gz
Description: GNU Zip compressed data

