Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

Hi Ilya,

 >
 >
 >ISTR there were some anti-spam measures put in place.  Is your account
 >waiting for manual approval?  If so, David should be able to help.
 
Yes, if I remember correctly, I get a "waiting for approval" message when 
I try to log in.
  
 >>
 >>
 >>
 >> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9287
 >> ffff911a9a26bd00 fail -12
 >> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9283
 >
 >
 >It is failing to allocate memory.  "low load" isn't very specific,
 >can you describe the setup and the workload in more detail?

4 nodes (OSD and mon combined). The 4th node has a local cephfs mount and 
is rsync'ing some files from VMs. By 'low load' I mean that this is a sort 
of test setup, going to production. Mostly the nodes are below a load of 1 
(except when the concurrent rsync starts).
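
For reference, the "fail -12" in the log lines quoted above is a negated 
errno value, which matches the remark about failing to allocate memory. 
A minimal Python sketch for decoding it (the helper name decode_fail is 
made up for illustration; the sample line is the first quoted log line, 
joined back onto one line):

```python
import errno
import os
import re

# First log line from the quoted output, unwrapped onto a single line.
LINE = ("Dec 1 03:14:36 c04 kernel: ceph: build_snap_context "
        "100020c9287 ffff911a9a26bd00 fail -12")


def decode_fail(line):
    """Return (errno name, description) for a 'fail -N' kernel log line.

    Hypothetical helper: it only looks for the 'fail -N' suffix and
    maps N through the local errno table.
    """
    match = re.search(r"fail -(\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    return errno.errorcode.get(code, "unknown"), os.strerror(code)


if __name__ == "__main__":
    # On Linux, errno 12 is ENOMEM.
    print(decode_fail(LINE))
```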

 >How many snapshots do you have?

I don't know how to count them. I have a script running over 2000 dirs. 
If one of these dirs is not empty, it creates a snapshot. So in theory I 
could have 2000 x 7 days = 14000 snapshots.
(Btw, the cephfs snapshots are in a different tree than the one rsync is 
using.)
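
One possible way to count them, as a rough sketch: cephfs exposes the 
snapshots of a directory as entries in its hidden ".snap" subdirectory, 
so listing that subdirectory for each of the 2000 dirs should give a 
number. The function name and the example path below are made up, and 
the sketch assumes the default snapshot directory name. Entries starting 
with "_" are skipped on the assumption that they are snapshots inherited 
from a parent directory, which would otherwise be counted once per child.

```python
import os


def count_snapshots(directories, snapdir=".snap"):
    """Count snapshots taken directly on each directory.

    Returns a dict mapping directory -> number of entries in its
    snapshot directory. Directories without a readable snapshot
    directory count as 0.
    """
    counts = {}
    for directory in directories:
        try:
            names = os.listdir(os.path.join(directory, snapdir))
        except OSError:
            counts[directory] = 0
            continue
        # Assumption: names starting with "_" are inherited from an
        # ancestor directory, so they are not counted here.
        counts[directory] = sum(1 for n in names if not n.startswith("_"))
    return counts


if __name__ == "__main__":
    # Hypothetical mount point; replace with the tree the script uses.
    root = "/mnt/cephfs/example"
    dirs = []
    if os.path.isdir(root):
        dirs = [os.path.join(root, d) for d in sorted(os.listdir(root))]
    per_dir = count_snapshots(dirs)
    print("total snapshots:", sum(per_dir.values()))
```

This only does one listdir per directory, so it should not add much load 
on top of the rsync.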
 
 >Do you keep track of memory consumption on the node?

A bit; attached is a Nagios graph. I have 100GB in this node. Since then, 
I have disabled all the hugepages (2MB, 1GB) I had created there, to free 
up more memory.
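
To double-check that no hugepages are still reserved, something like the 
following sketch could be used. It assumes the usual Linux sysfs layout 
(/sys/kernel/mm/hugepages/hugepages-<size>kB/nr_hugepages); the function 
name is made up.

```python
import os
import re


def reserved_hugepages_kb(base="/sys/kernel/mm/hugepages"):
    """Return {page size in kB: reserved memory in kB} per hugepage pool.

    Assumes the sysfs layout hugepages-<size>kB/nr_hugepages under base.
    Returns an empty dict if base does not exist.
    """
    reserved = {}
    try:
        entries = os.listdir(base)
    except OSError:
        return reserved
    for entry in entries:
        match = re.fullmatch(r"hugepages-(\d+)kB", entry)
        if match is None:
            continue
        size_kb = int(match.group(1))
        with open(os.path.join(base, entry, "nr_hugepages")) as fh:
            count = int(fh.read().strip())
        reserved[size_kb] = count * size_kb
    return reserved


if __name__ == "__main__":
    for size_kb, total_kb in sorted(reserved_hugepages_kb().items()):
        print(f"{size_kb} kB pages: {total_kb} kB reserved")
```

If every pool reports 0 kB reserved, the hugepages are really released.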

 >Finally, you say "crash" in the subject.  Does the kernel actually
 >crash or perhaps it locks up?  If it actually crashes, do you have the
 >panic message?
 >
 
The whole server was gone. The logs are from the remote syslog server.

The new situation: with more memory and the kernel updated to 
3.10.0-1062.4.3.el7.x86_64, rsync is very slow and I have a kworker at 
100% load.

Attachment: c04-memory.png
Description: Binary data
