Re: Cyrus 2.5, xapian, Sphinx and index sizes

Hi Bron,

thanks a lot for this detailed description of your setup! I have added a few questions inline below ...

--On 23. September 2014 21:32:04 +1000 Bron Gondwana <brong@xxxxxxxxxxx> wrote:

On Tue, Sep 23, 2014, at 06:58 PM, Sebastian Hagedorn wrote:
Hi,

as I mentioned a few days ago, we're considering metapartitions on SSD drives in order to optimize IMAP search performance. We have yet to run a full analysis on how much storage that would require, but a first guesstimate points towards about 20% of the net mail data for all the cyrus.* files when using SQUAT.

Bron mentioned support for xapian in 2.5, so I took a look at the branch and noticed that there isn't only support for xapian, but actually a choice of SQUAT, xapian and Sphinx. Eventually I'd like to learn the pros and cons of the various choices, but right now I have mainly one concern:

Will index files be larger with xapian or Sphinx? Will they also be stored on the metapartitions? My concern is that we might run out of space on those metapartitions if we choose a different indexer ... what's the operational experience regarding that at Fastmail?

So Sphinx was just too IO intensive; we had to ditch it entirely, but we didn't kill the code.  It's probably stale though - I wouldn't use it without doing a ton of testing.

OK.

10% is a reasonable estimate for search.  We run a 3TB search partition for 20TB of email storage, and it's nowhere near full.  Here's one with 20 slots, 18 of which are in use:

/dev/mapper/md2    2.7T  988G  1.8T  36% /mnt/i32d2search
/dev/mapper/sdb1   917G  573G  298G  66% /mnt/i32d2t01
/dev/mapper/sdb2   917G  576G  295G  67% /mnt/i32d2t02
/dev/mapper/sdb3   917G  571G  300G  66% /mnt/i32d2t03
/dev/mapper/sdb4   917G  573G  298G  66% /mnt/i32d2t04
/dev/mapper/sdb5   917G  702G  169G  81% /mnt/i32d2t05
/dev/mapper/sdb6   917G  743G  128G  86% /mnt/i32d2t06
/dev/mapper/sdb7   917G  697G  174G  81% /mnt/i32d2t07
/dev/mapper/sdb8   917G  760G  111G  88% /mnt/i32d2t08
/dev/mapper/sdb9   917G  763G  108G  88% /mnt/i32d2t09
/dev/mapper/sdb10  917G  727G  144G  84% /mnt/i32d2t10
/dev/mapper/sdb11  917G  754G  117G  87% /mnt/i32d2t11
/dev/mapper/sdb12  917G  757G  114G  87% /mnt/i32d2t12
/dev/mapper/sdb13  917G  706G  165G  82% /mnt/i32d2t13
/dev/mapper/sdb14  917G  746G  125G  86% /mnt/i32d2t14
/dev/mapper/sdb15  917G   72M  870G   1% /mnt/i32d2t15
/dev/mapper/sdb16  917G   72M  870G   1% /mnt/i32d2t16
/dev/mapper/sdb17  917G  704G  167G  81% /mnt/i32d2t17
/dev/mapper/sdb18  917G  774G   97G  89% /mnt/i32d2t18
/dev/mapper/sdb19  917G  722G  149G  83% /mnt/i32d2t19
/dev/mapper/sdb20  917G  741G  130G  86% /mnt/i32d2t20
/dev/md1           367G  249G  118G  68% /mnt/ssd32d2

sdb1-20 are LUKS encrypted partitions on a single hardware RAID6 volume
with 12 x 2TB WD RE4 drives. md2 is also LUKS encrypted, but it's a
software RAID1e with 3 x 2TB WD RE4 drives. md1 is 400GB Intel DC3700
drives in software RAID1.  It's not using LUKS because the drives support
encryption on-disk, so we're using that.

What do you use LUKS for? My best guess is that it makes it easier to toss out broken drives without having to worry about personal data remaining on them?

So how do we structure our search?  It's complicated.  There are four
"tiers" of storage.  The first tier is tmpfs, the second is SSD (it's not
used much though), the third is on the search partition, and the fourth is
ALSO on the search partition, but it's there for archive purposes, so we
can compact most of the long-term search down to a single index without
having to rewrite it every week.

So you only use fast storage for writing? Isn't there a big performance hit for searches on the data and archive partitions? I wonder why you don't use SSDs for those.

Xapian supports reading from multiple databases.
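
In the Python bindings that looks roughly like the sketch below - the paths are made up, but the point is that a single search transparently covers every tier at once:

# Sketch: one logical search over several physical Xapian databases.
# The paths are placeholders for the per-tier index directories.
import xapian

db = xapian.Database("/var/run/cyrus/search-example/xapian.226")          # temp tier (tiny, newest mail)
db.add_database(xapian.Database("/mnt/search-example/xapian.220"))        # data tier
db.add_database(xapian.Database("/mnt/search-example/xapian-archive.2"))  # archive tier (big, compacted)

qp = xapian.QueryParser()
qp.set_database(db)
enquire = xapian.Enquire(db)
enquire.set_query(qp.parse_query("quarterly report"))

# Matches come back merged across all three databases.
for match in enquire.get_mset(0, 10):
    print(match.docid, match.percent)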

So the config on my server (we're moving to another machine here) is:

search_engine: xapian
search_index_headers: no
search_batchsize: 8192
defaultpartition: default
defaultsearchtier: temp
tempsearchpartition-default: /var/run/cyrus/search-sloti30t01
metasearchpartition-default: /mnt/ssd30/sloti30t01/store23/search
datasearchpartition-default: /mnt/i30search/sloti30t01/store23/search
archivesearchpartition-default: /mnt/i30search/sloti30t01/store23/search-archive

(layout is similar, but imap30 is a smaller machine, with just a single
set)

So by default it always indexes to temp, which gets us close-to-realtime
indexing with a squatter that watches the sync_log directory for changes,
and without causing too much random IO.
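
(The wiring for that, roughly: a dedicated sync_log channel in imapd.conf plus a squatter kept running in rolling mode. The option and flag names below are how later Cyrus docs spell it - treat them as a sketch and check them against whichever branch you deploy.)

# imapd.conf
sync_log: 1
sync_log_channels: squatter

# cyrus.conf (START or DAEMON section): keep one squatter in rolling mode
squatter cmd="squatter -R"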

Compress is run from cron:

# Any time the disk gets over 50%, compress -o single down to data
13 *  * * * /home/mod_perl/hm/scripts/xapian_compact.pl -a -o -d 50 temp data
# Copy the temporary search databases down to data during the week
43 1  * * 1,2,3,4,5,6 /home/mod_perl/hm/scripts/xapian_compact.pl -a temp,meta data
# Sundays repack the entire data directory with filtering of deleted messages
43 1  * * 0 /home/mod_perl/hm/scripts/xapian_compact.pl -a -F temp,meta,data data

I'll attach the xapian_compact.pl script to this email.

Why is there no job for archiving? You don't really do that manually, I suppose?
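
Trying to make sure I understand the mechanics before digging into the Perl: I assume the wrapper essentially picks source tiers and a destination tier and then runs squatter once per user, along the lines of this rough Python sketch (the config path, user list and flag meanings are my guesses; the flags themselves are copied from the invocation you show further down):

# Rough sketch only - guessing at what xapian_compact.pl boils down to per user.
import subprocess

CONF = "/etc/cyrus/imapd-sloti30t01.conf"   # per-slot config (placeholder)
USERS = ["brong"]                           # the real script presumably walks every user on the slot

def compact(user, source_tiers, dest_tier):
    subprocess.run([
        "/usr/cyrus/bin/squatter", "-C", CONF, "-v", "-i",   # flags as in the quoted invocation
        "-t", ",".join(source_tiers),   # tiers to read from
        "-z", dest_tier,                # tier the compacted index is written to
        "-u", user,
    ], check=True)

# e.g. the weekday cron job: fold temp and meta down into data
for user in USERS:
    compact(user, ["temp", "meta"], "data")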

($Slot->RunCommand is pretty much system with a ton of magic around it)

With this layout, we get a few different search indexes throughout the
week, we check every hour that we don't waste too much memory on tmpfs,
and we get IO efficiency by running the compacts in the quieter times.

The xapian compact code in Cyrus does clever locking to allow it to
compact all the existing databases while creating a brand new temp
database to index new messages.
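
Schematically the flow is something like the sketch below - just an illustration of the sequence (here using the stock xapian-compact tool), not the actual Cyrus code:

# Illustration of the compact-while-indexing sequence, not the real code.
import os
import subprocess

def compact_tier(active_dirs, new_temp_dir, dest_dir):
    # 1. Snapshot the set of databases that exist right now.
    snapshot = list(active_dirs)

    # 2. Immediately create a fresh, empty temp database directory; new
    #    messages get indexed there, so indexing never has to stop.
    os.makedirs(new_temp_dir, exist_ok=True)

    # 3. Compact the snapshot into "<dest>.NEW" (sources first,
    #    destination last).
    newdir = dest_dir + ".NEW"
    subprocess.run(["xapian-compact"] + snapshot + [newdir], check=True)

    # 4. Rename the tempdir into place; the active set is now just the
    #    new temp database plus the single compacted one.
    os.rename(newdir, dest_dir)
    return [new_temp_dir, dest_dir]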

[brong@imap30 hm]$ du -s /var/run/cyrus/search-sloti30t01/b/user/brong/*
79944	/var/run/cyrus/search-sloti30t01/b/user/brong/xapian.225

[brong@imap30 hm]$ du -s /mnt/i30search/sloti30t01/store23/search*/b/user/brong/*
1739980	/mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/xapian
21516	/mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/xapian.1
1392840	/mnt/i30search/sloti30t01/store23/search/b/user/brong/xapian.218
63676	/mnt/i30search/sloti30t01/store23/search/b/user/brong/xapian.219
385936	/mnt/i30search/sloti30t01/store23/search/b/user/brong/xapian.220

Wow, it looks like I'm due for an archiving!

[brong@imap30 hm]$ sudo -u cyrus /usr/cyrus/bin/squatter -C /etc/cyrus/imapd-sloti30t01.conf -v -i -z archive -t temp,meta,data,archive -u brong
compressing temp:225,archive:0,archive:1,data:218,data:219,data:220 to archive:2 for user.brong (active temp:225,archive:0,archive:1,data:218,data:219,data:220)
adding new initial search location temp:226
compacting databases
sloti30t01/squatter[2365398]: twoskip: checkpointed /mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/xapian.2.NEW/cyrus.indexed.db (107 records, 17240 => 10600 bytes) in 0.003 seconds
Compressing messages for brong
done /mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/xapian.2.NEW
renaming tempdir into place
finished compact of user.brong (active temp:226,archive:2)

That took a few minutes, and now:

[brong@imap30 hm]$ du -s /mnt/i30search/sloti30t01/store23/search*/b/user/brong/*
3365336	/mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/xapian.2
[brong@imap30 hm]$ du -s /var/run/cyrus/search-sloti30t01/b/user/brong/*
168	/var/run/cyrus/search-sloti30t01/b/user/brong/xapian.226

I just have the one search index, nicely and efficiently compacted - plus
a tiny new one with new messages being indexed.

Thanks
Sebastian
--
   .:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
                .:.Regionales Rechenzentrum (RRZK).:.
  .:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.


