Re: Ceph 12.2.0 on 32bit?


 



Have you tried running a Luminous OSD with filestore instead of BlueStore?

As BlueStore is all new code and uses a lot of optimizations and tricks for fast and efficient use of memory, some 64-bit assumptions may have snuck in there. I'm not sure how much interest there is in making sure that works on 32-bit systems at this point, but narrowing it down to a specific component would certainly help.

On Fri, Sep 22, 2017 at 8:57 PM Dyweni - Ceph-Users <6EXbab4FYk8H@xxxxxxxxxx> wrote:

It crashes with SimpleMessenger as well (ms_type = simple).


I've also tried with and without these two settings, but it still crashes either way.
bluestore cache size = 536870912
bluestore cache kv max = 268435456
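
For scale, the two byte values above can be converted to MiB. This is only an arithmetic sketch; the helper name `to_mib` is made up for illustration and is not part of Ceph.

```python
# Convert the two BlueStore cache settings quoted above from bytes to MiB.
MIB = 1024 * 1024  # bytes per MiB


def to_mib(n_bytes):
    """Return a byte count expressed in MiB (hypothetical helper)."""
    return n_bytes / MIB


# Values copied from the settings listed in this message.
settings = {
    "bluestore cache size": 536870912,
    "bluestore cache kv max": 268435456,
}

for name, value in settings.items():
    print(f"{name} = {value} bytes = {to_mib(value):.0f} MiB")
```

So the cache was capped at 512 MiB in total, with at most 256 MiB of that for the key/value store.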


When using SimpleMessenger, it tells me it is crashing (Segmentation Fault) in 'thread_name:ms_pipe_write'.  This is common in all crashes under SimpleMessenger, just like 'msgr-worker-<n>' was common under AsyncMessenger.


The node I'm testing this on is running a 32-bit kernel (4.12.5) and has 8 GB of RAM (per 'free -m').


Per 'ps aux', VSZ and RSS never get much above 1196392 and 544024 respectively.  (One time they didn't get past 999536 and 329712 respectively.)
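
To put those numbers against the 32-bit address space, here is a rough sketch. It assumes 'ps aux' reports VSZ and RSS in KiB (as procps documents) and that the kernel uses the common 3 GiB user / 1 GiB kernel split; the split on this particular node has not been checked, so treat that figure as an assumption.

```python
# Compare the observed VSZ/RSS figures with the 32-bit address space.
KIB_PER_GIB = 1024 * 1024

ADDRESS_SPACE_GIB = 2**32 / 2**30  # 4 GiB total for a 32-bit process
USER_SPACE_GIB = 3.0               # ASSUMPTION: typical 3G/1G user/kernel split


def kib_to_gib(kib):
    """Convert a KiB count (as printed by 'ps aux') to GiB."""
    return kib / KIB_PER_GIB


# Figures quoted in this message, in KiB.
samples = {
    "VSZ (usual peak)": 1196392,
    "RSS (usual peak)": 544024,
    "VSZ (lowest seen)": 999536,
    "RSS (lowest seen)": 329712,
}

for label, kib in samples.items():
    gib = kib_to_gib(kib)
    share = 100 * gib / USER_SPACE_GIB
    print(f"{label}: {gib:.2f} GiB ({share:.0f}% of assumed user space)")
```

Under those assumptions the process peaks at a little over 1.1 GiB of virtual size, well short of the assumed 3 GiB of user address space, which suggests the crashes are not simply the process running out of address space.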


Also, under SimpleMessenger, gdb is reporting stack corruption in the back traces.


What other memory tuning options should I try?


 


On 2017-09-11 08:05, Gregory Farnum wrote:

You could try setting it to run with SimpleMessenger instead of AsyncMessenger -- the default changed across those releases.
I imagine the root of the problem though is that with BlueStore the OSD is using a lot more memory than it used to and so we're overflowing the 32-bit address space...which means a more permanent solution might require turning down the memory tuning options. Sage has discussed those in various places.

On Sun, Sep 10, 2017 at 11:52 PM Dyweni - Ceph-Users <6EXbab4FYk8H@xxxxxxxxxx> wrote:
Hi,

Is anyone running Ceph Luminous (12.2.0) on 32bit Linux?  Have you seen
any problems?



My setup has been 1 MON and 7 OSDs (no MDS, RGW, etc), all running Jewel
(10.2.1), on 32bit, with no issues at all.

I've upgraded everything to latest version of Jewel (10.2.9) and still
no issues.

Next I upgraded my MON to Luminous (12.2.0) and added MGR to it.  Still
no issues.

Next I removed one node from the cluster, wiped it clean, upgraded it to
Luminous (12.2.), and created a new BlueStore data area.  Now this node
crashes with segmentation fault usually within a few minutes of starting
up.  I've loaded symbols and used GDB to examine back traces.  From what
I can tell, the seg faults are happening randomly, and the stack is
corrupted, so traces from GDB are unusable (even with all symbols
installed for all packages on the system). However, in all cases, the
seg fault is occurring in the 'msgr-worker-<n>' thread.




My data is fine; I would just like to get Ceph 12.2.0 running stably on
this node, so I can upgrade the remaining nodes and switch everything
over to BlueStore.



Thanks,
Dyweni
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
