On Mon, Feb 4, 2019, at 22:00, Michael Menge wrote:
Quoting Bron Gondwana <brong@xxxxxxxxxxx>:

> On Mon, Feb 4, 2019, at 20:21, Michael Menge wrote:
>> Hi,
>>
>> Quoting Bron Gondwana <brong@xxxxxxxxxxxxxxxx>:
>>
>> > Hi Michael,
>> >
>> > Sorry about the delay in looking at this - I was mad crazy busy
>> > getting ready to go overseas. At Fosdem now, about to give a talk
>> > about JMAP!
>> >
>> > OK, let's start with the things that give me a little bit of hives...
>> >
>> > configdirectory: /srv/cyrus-be
>> > partition-default: /srv/cyrus-be
>> > partition-ssd: /srv/cyrus-be/ssd-part
>> >
>> > Ouch. There's a couple of things I wouldn't do there - having the
>> > partition be the same as the config directory, and having a separate
>> > partition be a subdirectory of a partition. They're both asking for
>> > trouble. I would probably lay my system out like:
>> >
>> > configdirectory: /srv/cyrus-be/conf
>> > partition-default: /srv/cyrus-be/default-part
>> > partition-ssd: /srv/cyrus-be/ssd-part
>> >
>>
>> partition-default isn't used any more. To use the metapartition we moved
>> all accounts from the default partition to the ssd partition which is
>> the new defaultpartition ("defaultpartition: ssd")
>
> Right - that makes sense.
>
>> > And then each tree would only have one type of thing in it.
>> >
>> > Anyway, I don't think that would break anything.
>> >
>> > metapartition-ssd: /srv/cyrus-ssd-be/meta/ssd-part
>> > metapartition_files: header index cache expunge squat annotations
>> > lock dav archivecache
>> >
>> > Ooh, I haven't tested having cache and archivecache on the same
>> > location. That's really interesting. Again, I'd be in favour of
>> > separation here, give them different paths. That might be tricky
>> > with ssd though, the way this is laid out. I assume you have some
>> > kind of symlink farm going on?
>> >
>>
>> I didn't know that there could be a problem with cache and archivecache.
>> At the time we decided on the configuration for cyrus 3.0 I looked at the
>> imapd.conf man page and for metapartition_files decided that I want all
>> meta files on the ssd storage. There was no indication in the man page
>> that there could be a problem.
>
> Fair. I'd have to test that to see if it works correctly. I would
> hope so, but I haven't tested that configuration. This is the
> downside with having lots of different ways to do things!
>
>> How do I separate location of archivecache from the other
>> metapartition path?
>> And fix the cache and archivecache files?
>
> This I don't know a good answer for. I will test if having the same
> path for cache and archivecache could fail. I THINK that I made the
> code safe for it, but I'm not sure that it's been tested.
>
>> No, there is no symlink farm. We have mounted different iSCSI volumes to
>> /srv/cyrus-ssd-be, /srv/cyrus-hdd-be and /srv/cyrus-be
>
> Right. That makes sense.
>
>> > Otherwise it all looks OK. Are you getting other IOERRORs in your
>> > log files which could show things aborting? It really looks like
>> > your conversations DB is getting out of sync due to other failures.
>>
>> I found a few instances of 3 related errors.
>>
>> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR: opening
>> /srv/cyrus-be/ssd-part/L/user/XXXX/2185.: No such file or directory
>> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR: opening
>> /srv/cyrus-be/ssd-part/L/user/XXXX/2185.: No such file or directory
>> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR archive
>> user.XXXX 2185 failed to copyfile
>> (/srv/cyrus-be/ssd-part/L/user/XXXX/2185. =>
>> /srv/cyrus-hdd-be/archive/ssd-part/L/user/XXXX/2185.): Unknown code
>> ____ 255
>
>
> Ouch. Yeah, that could have been caused by a bug in delivery, and
> would definitely cause conversations DB corruption if the index file
> was updated but the conversations DB wasn't or vice versa.
>
>> The file was already at /srv/cyrus-hdd-be/archive/ssd-part/L/user/XXXX/2185.
>
> Right! I do wonder if there are some bugs in 3.0.x which are fixed
> on master around delivery to archive partition. We definitely had
> bugs on master, but I thought they were newly introduced on master
> as well, which is why the fixes weren't backported. But if you're
> having files be in the wrong location, maybe there are bugs on 3.0.x
> as well.
>
> Do you have the syslog lines at the time that email was delivered?

I don't have the log for that message, but I will search for a
more recent example.
Great.
From the mail headers I know that it was not delivered to the archive partition but moved by cyr_expire. The conversations DB was not used at that time.
OK - that shouldn't matter then - because the conversations rebuild should have found it.
PS: the timestamp of the file is not the internal date but the time the mail was moved to the archive partition. I was wondering about the reason.
Hmm, yeah:
r = cyrus_copyfile(srcname, destname, COPYFILE_MKDIR);
That's how the file is moved. It only does a hardlink if it's the same filesystem. Interestingly, it does NOT set the timestamp correctly. This is clearly a bug.
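
As a rough illustration of what a fix could look like (this is a sketch, not the actual cyrus_copyfile code): the helper name copy_preserving_mtime and its allow_link flag are made up for this example, and it assumes a POSIX.1-2008 system with futimens() and the st_atim/st_mtim fields. A hard link shares the inode and so keeps the timestamps for free; a byte copy creates a new file, so the source timestamps have to be applied to it explicitly.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, NOT part of Cyrus: place a copy of src at dest and
 * keep src's access and modification times.
 * allow_link is an assumption of this sketch: non-zero means "try a hard
 * link first". Returns 0 on success, -1 on failure. */
int copy_preserving_mtime(const char *src, const char *dest, int allow_link)
{
    struct stat sb;
    struct timespec times[2];
    char buf[65536];
    ssize_t n;
    int in, out, r = -1;

    /* Same filesystem: the link shares the inode, so timestamps carry over. */
    if (allow_link && link(src, dest) == 0) return 0;

    /* Otherwise (for example across filesystems) fall back to a byte copy. */
    in = open(src, O_RDONLY);
    if (in < 0) return -1;
    if (fstat(in, &sb) < 0) { close(in); return -1; }

    out = open(dest, O_WRONLY | O_CREAT | O_TRUNC, sb.st_mode & 0777);
    if (out < 0) { close(in); return -1; }

    while ((n = read(in, buf, sizeof(buf))) > 0) {
        char *p = buf;
        while (n > 0) {                 /* handle short writes */
            ssize_t w = write(out, p, (size_t)n);
            if (w < 0) goto done;
            p += w;
            n -= w;
        }
    }
    if (n < 0) goto done;               /* read error */

    /* The step the bug is about: copy the source timestamps to the new file. */
    times[0] = sb.st_atim;
    times[1] = sb.st_mtim;
    if (futimens(out, times) < 0) goto done;
    r = 0;

done:
    close(in);
    if (close(out) < 0) r = -1;
    if (r < 0) unlink(dest);            /* do not leave a partial copy behind */
    return r;
}
```

Calling copy_preserving_mtime(srcname, destname, 1) would link when possible and copy otherwise; in both cases the destination ends up with the source's modification time rather than the time of the move.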
Bron.
--
Bron Gondwana, CEO, FastMail Pty Ltd
brong@xxxxxxxxxxxxxxxx
----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus