[389-users] Re: Inconsistent Ldap connection issues

Thierry Bordaz via 389-users <389-users@xxxxxxxxxxxxxxxxxxxxxxx> · Wed, 16 Oct 2024 10:02:09 +0200



    On 10/16/24 2:26 AM, William Brown via 389-users wrote:

    
              These errors are only shown on the
                client, yes? Is there any evidence of a failed
                connection in the access log? 
              Correct, those are 2 different errors about contacting LDAP. I
              have searched the logs for various things, but I haven't
              read them line by line. I don't see "err=1", any fd errors, or
              "Not listening for new connections - too many fds open". 

            
        So, that means the error is happening *before* 389-ds gets
          a chance to accept the connection. 
        

        Are there any routers, middleware, firewalls, IdPs, etc.
          between the client and the LDAP server? A load balancer? 
        

              We encountered a similar issue
                recently with another load test, where the load tester
                wasn't spreading out its connections; it would launch
                10,000 connections at once and hope they all worked.
                With your load test, is it actually spreading its
                connections out, or is it bursting?
              It's a ramp-up of 500 users logging in and starting their
              searches. The initial ramp-up is 60 seconds, but the
              searches and logins/logouts run over 6 minutes. I just
              split up the logs to see what that first minute was
              like:
                Peak Concurrent Connections:   689
                Total Operations:              18770
                Total Results:                 18769
                Overall Performance:           100.0%

                Total Connections:             2603    (21.66/sec)   (1299.40/min)
                 - LDAP Connections:           2603    (21.66/sec)   (1299.40/min)
                 - LDAPI Connections:          0       (0.00/sec)    (0.00/min)
                 - LDAPS Connections:          0       (0.00/sec)    (0.00/min)
                 - StartTLS Extended Ops:      2571    (21.39/sec)   (1283.42/min)

                Searches:                      13596   (113.12/sec)  (6787.01/min)
                Modifications:                 0       (0.00/sec)    (0.00/min)
                Adds:                          0       (0.00/sec)    (0.00/min)
                Deletes:                       0       (0.00/sec)    (0.00/min)
                Mod RDNs:                      0       (0.00/sec)    (0.00/min)
                Compares:                      0       (0.00/sec)    (0.00/min)
                Binds:                         2603    (21.66/sec)   (1299.40/min)
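
The averaged rates above cannot show whether connections arrive evenly or in bursts. A minimal sketch of a per-second breakdown follows, assuming access-log lines of the form `[16/Oct/2024:10:02:09.123456789 +0200] conn=1 fd=64 slot=64 connection from ...`. The regular expression and the function names are illustrative assumptions, not part of 389-ds or its tooling, and may need adjusting to the actual log format in use.

```python
import re
from collections import Counter

# Assumed shape of a "new connection" line in the access log; adjust if your
# log format differs (the fractional-seconds part is treated as optional).
CONN_LINE = re.compile(
    r"^\[(?P<second>\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})(?:\.\d+)? [+-]\d{4}\] "
    r"conn=\d+ fd=\d+ slot=\d+ (?:SSL )?connection from "
)


def connections_per_second(lines):
    """Count newly accepted connections in each one-second bucket."""
    counts = Counter()
    for line in lines:
        match = CONN_LINE.match(line)
        if match:
            counts[match.group("second")] += 1
    return counts


def busiest_seconds(counts, top=5):
    """Return the `top` seconds with the most new connections."""
    return counts.most_common(top)
```

Feeding the lines of the access log into `connections_per_second` and printing `busiest_seconds(...)` would show whether the peak second is close to the average of roughly 22 connections per second or far above it.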

                
              With the settings below, the test results are in:
                they still get 1 LDAP error per test.

              
    Any chance you can capture a tcpdump over the 6 minutes and look
      for a SYN without a matching SYN-ACK around the time of the failure?
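
As a rough illustration of that check, here is a minimal sketch that assumes the capture has already been decoded into simple records (time, source `ip:port`, destination `ip:port`, TCP flags). The `Packet` type and the `unanswered_syns` function are hypothetical helpers for this example; they do not parse tcpdump output themselves.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical decoded packet record; not a tcpdump or library type."""
    time: float          # seconds since the start of the capture
    src: str             # "ip:port" of the sender
    dst: str             # "ip:port" of the receiver
    flags: frozenset     # e.g. frozenset({"SYN"}) or frozenset({"SYN", "ACK"})


def unanswered_syns(packets):
    """Return (client, server, first_syn_time, syn_count) for every SYN
    that never received a SYN-ACK from the server in this capture."""
    pending = {}  # (client, server) -> [first_syn_time, syn_count]
    for pkt in sorted(packets, key=lambda p: p.time):
        if pkt.flags == {"SYN"}:
            # First SYN or a retransmission of the same connection attempt.
            entry = pending.setdefault((pkt.src, pkt.dst), [pkt.time, 0])
            entry[1] += 1
        elif pkt.flags == {"SYN", "ACK"}:
            # The server answered: the attempt is no longer pending.
            pending.pop((pkt.dst, pkt.src), None)
    return [(c, s, t, n) for (c, s), (t, n) in pending.items()]
```

Comparing the result from a capture taken on the client with one taken on the server would indicate whether the SYN was lost before reaching the server or was received but never answered.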
    

              net.ipv4.tcp_max_syn_backlog = 8192
              net.core.somaxconn = 8192

              Suggestions? Should I bump these up more?

              
        We still don't know what the cause *is* so just tweaking values
        won't help. We need to know what layer is triggering the error
        before we make changes. 
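
One way to check whether the server's own listen queue is the layer at fault, sketched here under the assumption of a Linux server, is to compare the kernel's `ListenOverflows` and `ListenDrops` counters (from the `TcpExt` section of `/proc/net/netstat`) before and after a test run. The function names are illustrative only.

```python
def parse_netstat(text):
    """Parse /proc/net/netstat-style text (alternating header and value
    lines) into a dict such as {"TcpExt.ListenDrops": 0, ...}."""
    counters = {}
    lines = text.strip().splitlines()
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        for name, number in zip(names.split(), numbers.split()):
            counters[f"{prefix}.{name}"] = int(number)
    return counters


def listen_queue_drops(before, after):
    """Return how much the listen-queue counters grew between two snapshots."""
    keys = ("TcpExt.ListenOverflows", "TcpExt.ListenDrops")
    return {key: after.get(key, 0) - before.get(key, 0) for key in keys}
```

Reading `/proc/net/netstat` on the server before and after the 6-minute run and passing both parsed snapshots to `listen_queue_drops` gives the deltas. Non-zero values would suggest the accept queue on the server overflowed; zeros would point toward something outside the server's TCP stack.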
      

      Reading these numbers, the server doesn't look like it
        should be under any stress at all. I have tested with 2 CPUs /
        4 GB RAM and can easily get 10,000 simultaneous connections
        launched and accepted by 389-ds.  
      

      My thinking at this point is there is something in between
        the client and 389 that is not coping. 
      

            -- 

              Sincerely,

              
              William Brown

              
              Senior Software Engineer,

              Identity and Access Management

              SUSE Labs, Australia
          
        
-- 
_______________________________________________
389-users mailing list -- 389-users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to 389-users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/389-users@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue