On 04/23/2012 12:20 PM, Russell Beall wrote:
On Apr 23, 2012, at 10:28 AM, Rich Megginson wrote:
That's very
interesting. Does Sun DS have some sort of tuning parameter
for number of values? That is, they may have some threshold
for number of values in an attribute - once the number hits
that threshold, they may switch to using some sort of ADT to
store the values, such as an AVL tree or hash table, rather than
the simple linked list used by default.
I've compared the dse.ldif of both servers looking
specifically for attributes that I should transfer from our
production environment to 389. The configurations for major
components are virtually identical and I have seen no
attribute that relates to the number of values in a
multi-valued attribute. I expect that the optimization is a
behind-the-scenes code improvement.
Also during this testing I have noticed a memory leak
when running large quantities of ldapmodify operations.
When I set up a loop to delete and then re-add the
eduPersonEntitlement attribute across 100K entries, I
found that memory consumption continuously increased and
the server crashed after the fifth iteration of this
loop. (And this one really is with ldapmodify; it is not related to my earlier issue of creating excessive tombstones by deleting and re-adding entire entries.) Before digging into this too deeply and filing another ticket, I wanted to ask whether this has been noticed and fixed in the 1.2.10 release. I am running the stock 1.2.9.16 release; I'm guessing it hasn't been, since I didn't see it in the release notes.
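For reference, each iteration runs modify operations of this shape against every entry (the DN and entitlement value below are made-up placeholders):

    # delete pass: one block like this per entry, all in one large file
    dn: uid=user00001,ou=people,dc=example,dc=edu
    changetype: modify
    delete: eduPersonEntitlement

    # add pass: one block like this per entry, in a second large file
    dn: uid=user00001,ou=people,dc=example,dc=edu
    changetype: modify
    add: eduPersonEntitlement
    eduPersonEntitlement: urn:mace:example.edu:entitlement:test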
Try increasing your nsslapd-cachememsize and monitoring it
closely. Using the size of id2entry.db4 is a good place to
start, but that will not be enough.
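For example, to set a 12GB entry cache on the backend (assuming the default backend name of userRoot; the value is in bytes, and older releases may need a restart for the change to take effect):

    ldapmodify -x -D "cn=directory manager" -W <<EOF
    dn: cn=userRoot,cn=ldbm database,cn=plugins,cn=config
    changetype: modify
    replace: nsslapd-cachememsize
    nsslapd-cachememsize: 12884901888
    EOF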
Early on in the process of setting up 389 I optimized the
cachememsize. I configured a 12G cache, and the cache usage
after loading all 600K entries is just under 10G. While the
I am fairly sure the cacheentryusage monitor attribute under cn=config did not increase while the ldapmodify operations were in progress, but I'd have to re-check to be sure.
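The check I run is roughly the following, assuming a backend named userRoot (the standard entry cache monitor attributes are currententrycachesize and maxentrycachesize):

    ldapsearch -x -D "cn=directory manager" -W \
        -b "cn=monitor,cn=userRoot,cn=ldbm database,cn=plugins,cn=config" \
        currententrycachesize maxentrycachesize entrycachehitratio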
You will see an increase due to replication metadata and possibly
other factors.
Unfortunately, with valgrind attached, the server uses much more memory on startup and does not complete startup before running out of memory on my 32GB machine. I have had to reduce the cachememsize so that it will start. It has been starting up for two hours and finally stopped allocating more memory at 24G (with only a 3G cachememsize configured). I'll probably have to delete a large number of entries to run the test within the bounds of the cachememsize.
Ok - so valgrind is probably not an option.
This bug appears very different from what I am looking at. The ldapmodify I run makes a single connection and transmits one large file of operations performing value deletions on 100K entries, then opens a new connection and transmits a second large file of value additions to the same 100K entries, and then loops around to do the same thing again, as sketched below. This emulates the behavior of our directory synchronization script, which calculates large quantities of necessary modifications and then submits them all in an LDIF file.
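Sketched with made-up file names, and with the bind password kept in an environment variable for brevity, the driver loop is essentially:

    # delete.ldif holds the 100K value deletions, add.ldif the 100K additions;
    # each ldapmodify call opens its own connection, as the sync script does
    for i in 1 2 3 4 5; do
        ldapmodify -x -D "cn=directory manager" -w "$BINDPW" -f delete.ldif
        ldapmodify -x -D "cn=directory manager" -w "$BINDPW" -f add.ldif
    done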
The thing in common is this - when the cache usage hits the cache
max size, you see unbounded memory growth.
I am starting up the server with the valgrind command you recommended a few messages back to see if I can spot the leak, though of course with valgrind in the mix the overhead is substantial and runtimes are much longer.
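The invocation is of roughly this shape (exact options may vary; the instance name and log path are placeholders):

    # run ns-slapd in the foreground under valgrind; -d 0 keeps it from detaching
    valgrind --leak-check=full --num-callers=40 \
        --log-file=/var/tmp/slapd-valgrind.log \
        /usr/sbin/ns-slapd -D /etc/dirsrv/slapd-INSTANCE \
        -i /var/run/dirsrv/slapd-INSTANCE.pid -d 0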
Yes, and valgrind will report many false positives that are
hard to weed through.
The issue you are seeing may not be a memory leak per se -
see the ticket/bug above.
Ok. I'll see if there is anything I can pull from the rough.
Regards,
Russ.
--
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users