Re: TMPFS Maximum File Size

On Tue, 26 Oct 2010, Tharindu Rukshan Bamunuarachchi wrote:

> Dear Hugh/Christoph/All,
> 
> After investigating the issue further, I observed abnormal memory
> allocation behavior.
> I do not know whether this is the expected behavior or due to misconfiguration.
> 
> I have a two-node NUMA system and a 100G TMPFS mount.
> 
> 1. When "dd" ran freely (without CPU affinity), all memory pages
> were allocated from NODE 0 and then from NODE 1.
> 
> 2. When "dd" ran bound (using taskset) to a CPU core in NODE 1:
>     All memory pages were allocated from NODE 1.
>     BUT the machine stopped responding after exhausting NODE 1.
>     No memory pages were allocated from NODE 0.
> 
> Do you have any comments / suggestions to try out?
> Why can "dd" not allocate memory from NODE 0 when it is running bound
> to a NODE 1 CPU core?

Please take a look at Documentation/filesystems/tmpfs.txt in the
kernel source tree: the section beginning "tmpfs has a mount option to
set the NUMA memory allocation policy" explains the mpol= option.

I hope that mounting the tmpfs with mpol=interleave, or mpol=local,
will give you the behaviour you want; but perhaps not, since I notice
it does say that the policy applied will be modified by the calling
task's cpuset constraints, and it sounds like your dd is constrained to
use memory only from its own node.  Documentation/cgroups/cpusets.txt
may be worth reading too.
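
As a rough sketch of the suggestion above, untested on your machine: the
Python below only builds the mount command line for a tmpfs with an mpol=
option, and reads the Mems_allowed_list field from /proc/<pid>/status to
show which nodes a task may allocate from.  The mount point
/mnt/tmpfs-test and the helper names are made up for illustration, and the
mpol check is deliberately loose (it looks only at the leading policy
name).

```python
"""Sketch: build a tmpfs mount command with an mpol= option and read a
task's allowed memory nodes. Nothing here mounts anything or needs root."""

# Policy names listed in Documentation/filesystems/tmpfs.txt.
KNOWN_MPOL_MODES = {"default", "prefer", "bind", "interleave", "local"}


def tmpfs_mount_argv(target, size, mpol):
    """Return the argv for mounting a tmpfs at `target` (hypothetical helper)."""
    # Accept forms such as "interleave", "interleave:0-1" or "bind=static:1";
    # only the leading policy name is checked here.
    mode = mpol.split(":", 1)[0].split("=", 1)[0]
    if mode not in KNOWN_MPOL_MODES:
        raise ValueError("unknown mpol mode: %r" % mode)
    options = "size=%s,mpol=%s" % (size, mpol)
    return ["mount", "-t", "tmpfs", "-o", options, "tmpfs", target]


def mems_allowed_list(status_text):
    """Extract Mems_allowed_list from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Mems_allowed_list:"):
            return line.split(":", 1)[1].strip()
    return None


if __name__ == "__main__":
    # Print the command rather than running it; the mount point is an assumption.
    print(" ".join(tmpfs_mount_argv("/mnt/tmpfs-test", "100G", "interleave")))
    try:
        with open("/proc/self/status") as f:
            print("Mems_allowed_list:", mems_allowed_list(f.read()))
    except OSError:
        print("no /proc/self/status available here")
```

Running the printed command as root, then repeating the bound dd test
while watching per-node usage, should show whether the policy or a cpuset
constraint is what keeps allocations on one node.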


(I am not the right person to advise on managing NUMA and cpusets!)

> 
> This is the back trace of core generated from our application process.
> 
> Core was generated by `DataWareHouseEngine Surv:1:1:DataWareHouseEngine:1'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00007fd924b0cf7c in write () from /lib64/libc.so.6

Nor do I understand why you should be getting a SIGSEGV in libc's write(),
sorry.

> (gdb) bt
> #0  0x00007fd924b0cf7c in write () from /lib64/libc.so.6
> #1  0x000000000053ed02 in NBasicFile::Write (this=0x7fd9100030c8,
> pBuf=0x7fd91c0ce050, iBufLen=29)
>     at /home/surv_3/0/src/app/SURV/libs/SurvNoraLite/29/NBasicFile.cpp:420
> #2  0x00000000005454d4 in NIndex::GenarateHeader (this=0x7fd9100030c0,
> rErr=@0x7fd91c8ccdd0)
>     at /home/surv_3/0/src/app/SURV/libs/SurvNoraLite/29/NIndex.cpp:2350
> #3  0x0000000000545a13 in NIndex::Sync (this=0x7fd9100030c0,
> oNHandle=26832031833, rErr=@0x7fd91c8ccdd0)
>     at /home/surv_3/0/src/app/SURV/libs/SurvNoraLite/29/NIndex.cpp:2440
> #4  0x0000000000486538 in MIndex::Sync (this=0x7fd9100027b0,
> roNHandleTableEnd=@0x2697f470) at
> /home/surv_3/0/src/app/SURV/libs/SSDWI/67/MIndex.C:1562
> #5  0x0000000000483ebf in MDataStore::Fix (this=0x2697f0e8) at
> /home/surv_3/0/src/app/SURV/libs/SSDWI/67/MDataStore.C:762
> #6  0x000000000047971b in SSPage::Connect (this=0x7fd90c01c320,
> iPage=0, bIsRecover=true) at
> /home/surv_3/0/src/app/SURV/libs/SSDWI/67/SSPage.cpp:1548
> #7  0x000000000046a911 in DWHEWriter::Init (this=0x9640f0) at
> /home/surv_3/0/src/app/SURV/components/DataWareHouseEngine/62/DWHEWriter.C:170
> #8  0x000000000046ae8a in DWHEWriter::Run (pPT=0x9640f0) at
> /home/surv_3/0/src/app/SURV/components/DataWareHouseEngine/62/DWHEWriter.C:97
> #9  0x00007fd924832070 in start_thread () from /lib64/libpthread.so.0
> #10 0x00007fd924b1a10d in clone () from /lib64/libc.so.6
> #11 0x0000000000000000 in ?? ()
