Dear Hugh/Christoph/All,

I have done further testing to isolate the issue and found the following.

1. At the moment the issue only occurs with IBM hardware (x3550/x3650).
   It did not occur on HP Nehalem or Sun X4600. I only have IBM/HP/Sun
   boxes to test with.

2. The issue is not visible with vanilla kernel 2.6.32 or 2.6.36.
   SLES 11 is running 2.6.27-45.

I think I should turn to IBM/Novell for further help. I still wonder why
this happens only with IBM hardware plus the SLES 11 kernel, when the same
hardware works with later kernels.

__
Tharindu R Bamunuarachchi.

On Thu, Oct 28, 2010 at 1:38 AM, Christoph Lameter <cl@xxxxxxxxx> wrote:
>
> On Tue, 26 Oct 2010, Tharindu Rukshan Bamunuarachchi wrote:
>
> > I have a two node NUMA system and a 100G TMPFS mount.
> >
> > 1. When "dd" was running freely (without CPU affinity), all memory pages
> > were allocated from NODE 0 and then from NODE 1.
> >
> > 2. When "dd" was running bound (using taskset) to a CPU core in NODE 1:
> >    All memory pages were allocated from NODE 1,
> >    BUT the machine stopped responding after exhausting NODE 1.
> >    No memory pages were allocated from NODE 0.
>
> Hmmm... Strange, it should fall back like under #1. Can you tell us where
> it hung?
>
> > Do you have any comments / suggestions to try out?
> > Why can't "dd" allocate memory from NODE 0 when it is running bound
> > to a NODE 1 CPU core?
>
> Definitely looks like a bug somewhere. TMPFS policies are not correctly
> falling over to more distant zones?
>
> > Core was generated by `DataWareHouseEngine Surv:1:1:DataWareHouseEngine:1'.
> > Program terminated with signal 11, Segmentation fault.
> > #0  0x00007fd924b0cf7c in write () from /lib64/libc.so.6
>
> Hmmm... Kernel oops? Or a segfault because of an invalid reference by your
> app?
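
P.S. For anyone wanting to reproduce the second scenario, a minimal sketch of
the kind of test described above could look like the commands below. The mount
point, file size and CPU number are assumptions for illustration, not values
from the original report; taskset and numactl are assumed to be installed.

    # Assumed setup: two NUMA nodes, tmpfs mounted at /mnt/tmpfs (hypothetical path).
    mount -t tmpfs -o size=100g tmpfs /mnt/tmpfs

    # Show per-node memory before the test.
    numactl --hardware

    # Bind dd to a CPU on node 1 (CPU 8 is an assumption; check
    # /sys/devices/system/node/node1/cpulist for the real mapping).
    taskset -c 8 dd if=/dev/zero of=/mnt/tmpfs/bigfile bs=1M count=60000

    # From another shell, watch per-node free memory while dd runs.
    grep MemFree /sys/devices/system/node/node*/meminfo

On an affected box, node 1 MemFree drops towards zero with no corresponding
drop on node 0, at which point the hang described above is triggered; with the
unbound run, allocations spill over from node 0 to node 1 as expected.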