notes on building fds in etch and a failed build question

rmeggins at redhat.com (Rich Megginson) · Mon, 10 Mar 2008 10:53:30 -0600



Tamas Bagyal wrote:
> Tamas Bagyal wrote:
>> Rich Megginson wrote:
>>> Tamas Bagyal wrote:
>>>> Rich Megginson wrote:
>>>>> Bagyal Tamas wrote:
>>>>>> Rich Megginson wrote:
>>>>>>> Tamas Bagyal wrote:
>>>>>>>> hello Ryan,
>>>>>>>>
>>>>>>>> Have you tried this version? I have two fedora-ds 1.0.4 servers
>>>>>>>> in an MMR configuration. I migrated one of them to 1.1 (built
>>>>>>>> following your and Rich's instructions), but I have a problem
>>>>>>>> with the memory usage of the ns-slapd process. Initially memory
>>>>>>>> usage is 18.5%, but after 2 hours it had grown to 23.1%, and it
>>>>>>>> kept growing until the process was killed by the kernel (I
>>>>>>>> think...).
>>>>>>>>
>>>>>>>> The load is mostly read transactions (DNS) with a few writes
>>>>>>>> (CUPS). This is Debian etch with 512 MB of memory (I know this
>>>>>>>> is too low, but it is a test environment). The slapd cache size
>>>>>>>> is 67108864.
>>>>>>> Are you using SSL?  Anything interesting in your server error log?
>>>>>>
>>>>>> I ran setupssl2.sh but do not use any SSL connections. The error
>>>>>> log shows nothing except the server start.
>>>>> The reason I ask is that older versions of the NSS crypto/SSL 
>>>>> libraries had a memory leak.  NSS 3.11.7 does not have this 
>>>>> problem.  But you would only see the problem if you were using SSL 
>>>>> connections.
>>>>
>>>> OK. I tried again from the beginning: fresh install, no SSL, no
>>>> migration. I used setup-ds-admin.pl and set up MMR with a
>>>> fedora-ds 1.0.4 server, but nothing changed: memory usage keeps
>>>> growing...
>>>> All settings are default except the MMR/changelog configuration,
>>>> and the access log is turned off.
>>>>
>>>> errors:
>>>>
>>>>  Fedora-Directory/1.1.0 B2008.059.1017
>>>>         tower.fmintra.hu:389 (/opt/dirsrv/etc/dirsrv/slapd-tower)
>>>>
>>>>
>>>> [05/Mar/2008:10:19:20 +0100] - dblayer_instance_start: pagesize: 
>>>> 4096, pages: 128798, procpages: 5983
>>>> [05/Mar/2008:10:19:20 +0100] - cache autosizing: import cache: 204800k
>>>> [05/Mar/2008:10:19:21 +0100] - li_import_cache_autosize: 50, 
>>>> import_pages: 51200, pagesize: 4096
>>>> [05/Mar/2008:10:19:21 +0100] - WARNING: Import is running with 
>>>> nsslapd-db-private-import-mem on; No other process is allowed to 
>>>> access the database
>>>> [05/Mar/2008:10:19:21 +0100] - dblayer_instance_start: pagesize: 
>>>> 4096, pages: 128798, procpages: 5983
>>>> [05/Mar/2008:10:19:21 +0100] - cache autosizing: import cache: 204800k
>>>> [05/Mar/2008:10:19:21 +0100] - li_import_cache_autosize: 50, 
>>>> import_pages: 51200, pagesize: 4096
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Beginning import 
>>>> job...
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Index buffering 
>>>> enabled with bucket size 100
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Processing file 
>>>> "/tmp/ldifZHth0D.ldif"
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Finished scanning 
>>>> file "/tmp/ldifZHth0D.ldif" (9 entries)
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Workers finished; 
>>>> cleaning up...
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Workers cleaned up.
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Cleaning up 
>>>> producer thread...
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Indexing complete. 
>>>> Post-processing...
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Flushing caches...
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Closing files...
>>>> [05/Mar/2008:10:19:21 +0100] - All database threads now stopped
>>>> [05/Mar/2008:10:19:21 +0100] - import userRoot: Import complete.  
>>>> Processed 9 entries in 0 seconds. (inf entries/sec)
>>>> [05/Mar/2008:10:19:22 +0100] - Fedora-Directory/1.1.0 
>>>> B2008.059.1017 starting up
>>>> [05/Mar/2008:10:19:22 +0100] - I'm resizing my cache now...cache 
>>>> was 209715200 and is now 8000000
>>>> [05/Mar/2008:10:19:22 +0100] - slapd started.  Listening on All 
>>>> Interfaces port 389 for LDAP requests
>>>> [05/Mar/2008:10:22:23 +0100] NSMMReplicationPlugin - changelog 
>>>> program - cl5Open: failed to open changelog
>>>> [05/Mar/2008:10:22:24 +0100] NSMMReplicationPlugin - changelog 
>>>> program - changelog5_config_add: failed to start changelog
>>>> [05/Mar/2008:10:26:49 +0100] NSMMReplicationPlugin - 
>>>> agmt="cn=replica to backup" (backup:389): Replica has a different 
>>>> generation ID than the local data.
>>>> [05/Mar/2008:10:32:00 +0100] NSMMReplicationPlugin - 
>>>> repl_set_mtn_referrals: could not set referrals for replica 
>>>> dc=fmintra,dc=hu: 32
>>>> [05/Mar/2008:10:32:00 +0100] NSMMReplicationPlugin - 
>>>> multimaster_be_state_change: replica dc=fmintra,dc=hu is going 
>>>> offline; disabling replication
>>>> [05/Mar/2008:10:32:00 +0100] - WARNING: Import is running with 
>>>> nsslapd-db-private-import-mem on; No other process is allowed to 
>>>> access the database
>>>> [05/Mar/2008:10:32:13 +0100] - import userRoot: Workers finished; 
>>>> cleaning up...
>>>> [05/Mar/2008:10:32:13 +0100] - import userRoot: Workers cleaned up.
>>>> [05/Mar/2008:10:32:13 +0100] - import userRoot: Indexing complete. 
>>>> Post-processing...
>>>> [05/Mar/2008:10:32:13 +0100] - import userRoot: Flushing caches...
>>>> [05/Mar/2008:10:32:13 +0100] - import userRoot: Closing files...
>>>> [05/Mar/2008:10:32:14 +0100] - import userRoot: Import complete.  
>>>> Processed 12242 entries in 13 seconds. (941.69 entries/sec)
>>>> [05/Mar/2008:10:32:14 +0100] NSMMReplicationPlugin - 
>>>> multimaster_be_state_change: replica dc=fmintra,dc=hu is coming 
>>>> online; enabling replication
>>>>
>>>> memory usage by top:
>>>>
>>>> top - 10:58:21 up 25 days, 22:36,  2 users,  load average: 0.01, 
>>>> 0.13, 0.22
>>>> Tasks:  61 total,   2 running,  59 sleeping,   0 stopped,   0 zombie
>>>> Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  
>>>> 0.0%si,  0.0%st
>>>> Mem:    515192k total,   189600k used,   325592k free,    36472k 
>>>> buffers
>>>> Swap:   489848k total,    18292k used,   471556k free,   106188k 
>>>> cached
>>>>
>>>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>> 27647 fds       15   0  464m  47m  25m S  0.0  9.4   1:34.57 ns-slapd
>>>>
>>>>
>>>> top - 11:23:12 up 25 days, 23:01,  2 users,  load average: 0.36, 
>>>> 0.27, 0.20
>>>> Tasks:  61 total,   2 running,  59 sleeping,   0 stopped,   0 zombie
>>>> Cpu(s):  3.0%us,  0.0%sy,  0.0%ni, 96.0%id,  1.0%wa,  0.0%hi,  
>>>> 0.0%si,  0.0%st
>>>> Mem:    515192k total,   210700k used,   304492k free,    36488k 
>>>> buffers
>>>> Swap:   489848k total,    18288k used,   471560k free,   117204k 
>>>> cached
>>>>
>>>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>> 27647 fds       15   0  473m  59m  28m S  3.0 11.9   2:52.77 ns-slapd
>>>>
>>>>
>>>> top - 11:48:26 up 25 days, 23:26,  2 users,  load average: 0.02, 
>>>> 0.08, 0.10
>>>> Tasks:  61 total,   1 running,  60 sleeping,   0 stopped,   0 zombie
>>>> Cpu(s):  3.0%us,  0.0%sy,  0.0%ni, 97.0%id,  0.0%wa,  0.0%hi,  
>>>> 0.0%si,  0.0%st
>>>> Mem:    515192k total,   222756k used,   292436k free,    36520k 
>>>> buffers
>>>> Swap:   489848k total,    18288k used,   471560k free,   118932k 
>>>> cached
>>>>
>>>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>> 27647 fds       15   0  483m  72m  30m S  0.0 14.4   4:12.04 ns-slapd
>>>>
>>>>
>>>> top - 13:31:42 up 26 days,  1:09,  2 users,  load average: 0.28, 
>>>> 0.17, 0.15
>>>> Tasks:  61 total,   2 running,  59 sleeping,   0 stopped,   0 zombie
>>>> Cpu(s):  1.1%us,  0.0%sy,  0.0%ni, 98.9%id,  0.0%wa,  0.0%hi,  
>>>> 0.0%si,  0.0%st
>>>> Mem:    515192k total,   285572k used,   229620k free,    36540k 
>>>> buffers
>>>> Swap:   489848k total,    18288k used,   471560k free,   140412k 
>>>> cached
>>>>
>>>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>> 27647 fds       15   0  523m 116m  34m S  0.0 23.3   9:35.65 ns-slapd
>>
>>> Can you post your dse.ldif to pastebin.com?  Be sure to omit or 
>>> obscure any sensitive data first.  I'd like to see what all of your 
>>> cache settings are.  Normally the server will increase in memory 
>>> usage until the caches are full, then memory usage should level 
>>> off.  The speed at which this occurs depends on usage.
>>>
>> http://www.pastebin.org/22477
>>
>> I forgot one thing: I use some custom schema (ldapdns, ibm... etc.), in
>> case that changes anything (but I think it is not relevant).
>>
>>> When the kernel kills your server, how much memory is it using?  Is 
>>> there anything in the server error log at around the time the kernel 
>>> kills it?
>>>
>> I'm not sure, but at that point it was using as much as possible (512 MB
>> RAM + 512 MB swap available), I think around 940 MB. The kernel first
>> killed some other processes, such as mc, and after those ns-slapd. I
>> can't see anything in the log file, just the server start.
>>
>>> Finally, if you are convinced that there is a real memory leak in 
>>> the server, would it be possible for you to run it under valgrind?  
>>> Just running it under valgrind for 30 minutes or so should reveal 
>>> any memory leaks in normal usage.
>>
>> http://www.pastebin.org/22484
>>
>> I can't understand this output; I have never used valgrind before. I
>> hope I used the right options for valgrind.
>>
>
> Can you tell me what the valgrind output means?
I'm not sure.  The output is truncated, and valgrind is producing a lot 
of spurious errors, or at least errors not in directory server code.  I 
guess pastebin is not going to like a several hundred thousand byte 
output file - is there somewhere else you can post the entire output?
>
> thanks,
>
> KEeF
