Well, I made another attempt to move to the 2.5 branch. I did some preparation this time:

1. took 2.5.10 and applied the twoskip & skiplist patches from master:
   [PATCH] cyrusdb: add "CYRUSDB_NOCOMPACT" open flag to avoid
   [PATCH] twoskip: release the readlock in foreach every 256 misses
2. put mailboxes.db on /dev/shm
3. booted with the lazytime mount option and did some disk tuning
4. used the latest kernel and gcc to compile cyrus

I managed to update cyrus on Friday evening (using twoskip for mailboxes.db and skiplist for the other tables, improved_mboxlist_sort:1) and it ran until Monday morning.

It did not run well: the LIST command took up to 3 sec, whereas it takes ~0.5 sec with skiplist on 2.4. At least there were no long locks. CPU usage was strange: a few cores were loaded up to 100% from time to time while the others sat at around 25-30%, with ~40 simultaneous imap connections, mostly short ones from the webmail client. Overall load was 2-3 on an 8-core CPU.

Here is strace output from one of the imapd processes:
https://justpaste.it/10waf

It looks like there is an issue with finding the last record in mailboxes.db: that took a lot of time and many locking attempts.

Things went really bad on Monday at 10am, when the number of connections increased to 120 and kept growing. imap processes started to lock up with 100% CPU usage. I tried limiting the number of imapd processes to 100; that did not help. I tried to convert to skiplist, without success:

Nov 28 12:04:29 srv1 imap[15926]: skiplist: longlock /var/imap/mailboxes.db for 259.4 seconds
Nov 28 12:04:29 srv imap[15779]: skiplist: longlock /var/imap/mailboxes.db for 263.1 seconds

output of atop: http://pastebin.com/raw/QVm4hkK8
vmstat: http://pastebin.com/raw/geP3NnqL

So I went back to 2.4 once again. Load dropped to ~2 on the 8-core CPU within half an hour.
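For anyone comparing runs, the longlock lines quoted above can be tallied with a small script. This is only a minimal sketch: it assumes the syslog layout seen in those two lines, and the names `LONGLOCK_RE` and `longlock_durations` are illustrative, not part of Cyrus.

```python
import re

# Assumed layout, taken only from the two log lines quoted above:
#   "<date> <host> imap[<pid>]: <backend>: longlock <path> for <secs> seconds"
LONGLOCK_RE = re.compile(
    r"imap\[(?P<pid>\d+)\]: \w+: longlock (?P<path>\S+) "
    r"for (?P<secs>\d+(?:\.\d+)?) seconds"
)

def longlock_durations(lines):
    """Return (pid, path, seconds) tuples for every longlock line found."""
    found = []
    for line in lines:
        m = LONGLOCK_RE.search(line)
        if m:
            found.append((int(m.group("pid")), m.group("path"), float(m.group("secs"))))
    return found

sample = [
    "Nov 28 12:04:29 srv1 imap[15926]: skiplist: longlock /var/imap/mailboxes.db for 259.4 seconds",
    "Nov 28 12:04:29 srv imap[15779]: skiplist: longlock /var/imap/mailboxes.db for 263.1 seconds",
]

for pid, path, secs in longlock_durations(sample):
    print(pid, path, secs)
```

Feeding it the full syslog (for example `longlock_durations(open(path))`, with `path` pointing at wherever imapd logs on a given system) would show whether the stalls cluster around the time the connection count started to climb.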
I ran a test on the same hardware with the same mailboxes.db file, running the following commands in a loop, in parallel, for 400 concurrent sessions for random users, for a few days without any performance degradation:

0 login
0 list "" "*"
0 CREATE $fldr
0 SELECT $fldr
0 logout

LIST mostly completed in 0.001 sec.

load average: 49.70, 41.49, 38.72

# netstat -anp | grep EST | grep 143 | wc -l
248

It looks like the problem is in the locking of mailboxes.db somewhere in the code, not in the LIST command. Maybe the new mailboxes.db traversal code has some pitfalls?

Bron, what are the major differences in mailboxes.db usage since 2.4? I would like to do more tests; can you point me in the right direction?

Deniss

On 2016.11.18. 2:07, Bron Gondwana wrote:
> On Fri, 18 Nov 2016, at 10:51, Wolfgang Breyha via Info-cyrus wrote:
>> On 17/11/16 14:00, Deniss via Info-cyrus wrote:
>>> Any ideas or suggestion for investigation ?
>>
>> I already filed a bug
>> https://github.com/cyrusimap/cyrus-imapd/issues/43
>> but no response so far. I directly asked Bron, but no response as well.
>
> Sorry, I really don't have a clue. 2.5 does have a different mailboxes.db format, so it's a bit more CPU intensive. The real massive win for CPU usage is going to come with reverse ACLs:
>
> https://blog.fastmail.com/2015/12/05/reverse-acls-making-imap-list-fast/
>
> But to get there, we need to solve reverse ACLs for groups. I did ask about it here:
>
> https://lists.andrew.cmu.edu/pipermail/info-cyrus/2015-November/038628.html
>
> But then didn't follow up to add group reverse ACL support in Cyrus, so reverse ACLs are broken if you're using groups.
>
> Bron.
>
----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus