As someone suggested, I installed the linux-generic-hwe-16.04 package on Ubuntu 16.04 to get the 17.10 kernel, then rebooted all VMs. Here is what I observed:
- the ceph monitor node froze upon reboot in one case, and after a few minutes in another
- the ceph OSD hosts froze easily
- the ceph admin node (which runs no ceph service, only ceph-deploy) never froze
- the ceph rgw and ceph mgr nodes have been fine so far
Here are two images I captured:
Thanks.
On Sat, Jan 20, 2018 at 7:03 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
On Fri, Jan 19, 2018 at 11:54 PM, Youzhong Yang <youzhong@xxxxxxxxx> wrote:
> I don't think it's hardware issue. All the hosts are VMs. By the way, using
> the same set of VMWare hypervisors, I switched back to Ubuntu 16.04 last
> night, so far so good, no freeze.
Too little information to make any sort of assessment, I'm afraid, but
at this stage this doesn't sound like a ceph issue.
--
>
> On Fri, Jan 19, 2018 at 8:50 AM, Daniel Baumann <daniel.baumann@xxxxxx>
> wrote:
>>
>> Hi,
>>
>> On 01/19/18 14:46, Youzhong Yang wrote:
>> > Just wondering if anyone has seen the same issue, or it's just me.
>>
>> we're using debian with our own backported kernels and ceph, works rock
>> solid.
>>
>> what you're describing sounds more like hardware issues to me. if you
>> don't fully "trust"/have confidence in your hardware (and your logs
>> don't reveal anything), I'd recommend running some burn-in tests
>> (memtest, cpuburn, etc.) on them for 24 hours/machine to rule out
>> cpu/ram/etc. issues.
>>
>> Regards,
>> Daniel
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Cheers,
Brad