Re: Migration from cyrus 3.4.3 to 3.6.0~beta2-1 failed

Hi,

(Replying to both messages in one...)

On Sun, 20 Mar 2022, at 6:53 PM, fr.hamelin+cyrus@xxxxxxxxx wrote:
Hello,
Using debian bookworm, the cyrus-imapd was updated from 3.4.3 to 3.6.0~beta2-1 this Friday. There was no warning about any migration or post check to perform before applying the updated packages.
The update procedure didn't go well, but I didn't realize it.
process type:START name:recover path:/usr/sbin/cyrus age:0.000s pid:678146 signaled to death by signal 11 (Segmentation fault)

I see you've reported this on the Debian bug tracker too, great!  Here's a link for anyone else reading this, so the two conversations don't occur in isolation: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1007965

You may wish to also update the bug report to include a link back to this discussion.  There's a "Permalink" link at the bottom of the message that will work for this.  For this thread, the link is: https://cyrus.topicbox.com/groups/info/T3e85440ddbb44ec6-Maf769decd0dd45d4572145b8

I don't suppose you got a core dump from that segmentation fault?  It would be very helpful, but I guess probably not, otherwise it would have said something like "(Segmentation fault - core dumped)".
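
For anyone wanting to check whether a core dump could have been written at all, here is a minimal sketch that reads the core-size limit of a running process from /proc/<pid>/limits on Linux. The function names are hypothetical, and it assumes you can read that file for the Cyrus master process; it says nothing about where your system would actually store a dump.

```python
from pathlib import Path

LABEL = "Max core file size"


def core_limits(limits_text):
    """Return (soft, hard) core-size limits parsed from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith(LABEL):
            # The remaining columns are: soft limit, hard limit, units.
            fields = line[len(LABEL):].split()
            return fields[0], fields[1]
    return None


def core_dumps_possible(pid="self"):
    """True if the soft core-size limit of the given process is non-zero."""
    limits = core_limits(Path(f"/proc/{pid}/limits").read_text())
    return limits is not None and limits[0] != "0"
```

A soft limit of 0 would explain a crash that leaves no core file behind. Pass the pid of the running master process rather than "self" to check the limits Cyrus itself runs with.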

I restarted cyrus via systemctl and saw that I couldn't access my mail,
despite the fact that all the mails are still in the usual (legacy?) /var/spool/cyrus/mail/f/fuser.
I also saw that a new user was created under uuid folder and this user is receiving my new mails.

The first start should have upgraded mailboxes.db to support storing mailboxes by uuid or by name.  This part is supposed to be seamless, but it sounds like something in your mailboxes.db confused it and it crashed out, and now mailboxes.db is in an unknown or partially-converted state... Hmm.  And then on the second start, because of the now-weird mailboxes.db it didn't know where to find your existing user, mistook it for a new user instead, and created it in the new storage style.

I would really like to get to the bottom of that first crash, because everything you describe since sounds like a consequence of that.

I tried to downgrade to previous version but cyrus couldn't find my legacy user's mailbox.

Off the top of my head, I'm not sure if the older version can read a post-upgrade mailboxes.db.  Though even if it could, if your mailboxes.db is broken then that goes out the window...

I reinstalled the beta2 and checked on your website. I tried to perform the relocate_by_id command but I could not find it in any Debian packages.

Oof, yeah that seems like a packaging bug.  I see you've already reported it.  The other new tools in 3.6 are cyr_ls and cyr_cd.sh.  Do you have those, or are they missing too?  If they're also missing, please report that on the Debian tracker too.

relocate_by_id is for converting your existing mailboxes to the new uuid storage.  Theoretically you shouldn't need to use it until sometime after you've upgraded, when you're ready to switch your existing mailboxes to the new storage.

I would refer you to the upgrading docs, but since you've discovered relocate_by_id (and discovered that it's missing) then I think that means you already found them.  Though, for other readers' benefit, I'll link them here anyway: https://www.cyrusimap.org/3.6/imap/download/upgrade.html

I copied all mails from my /var/spool/cyrus/mail/f/fuser to the uuid folder and performed several reconstruct commands.
I managed to get my mails back under the user with uuid, but all the subfolders failed to reconstruct (segmentation fault)

cyrus/reconstruct[54080]: IOERROR: lock failed: mailbox=<user.fuser.myfolder> error=<Invalid mailbox name> syserror=<No such file or directory> func=<mailbox_open_advanced>

and, yes, fuser.myfolder exists under /var/spool/cyrus/mail/uuid/

Do you mean:
* there is a uuid directory on disk for the "myfolder" mailbox?
* or, the uuid directory on disk for the "fuser" mailbox contains a "myfolder" subdirectory?

The former is correct, the latter won't work.  I assume you mean the latter, since you wouldn't be able to generate correct uuids for the subfolders by copying files around.

In that case, at this point, there are a few things wrong:

* mailboxes.db is in an unknown state, from the initial crash
* mailboxes on disk are in the wrong places, from the file system copying

So further crashes and errors from here wouldn't surprise me...
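
To see which of the two layouts described above is actually on disk, something like the following rough sketch could help. It walks a directory tree and reports any mailbox directory that sits inside another mailbox directory, which is the arrangement that won't work under uuid storage. It assumes a directory counts as a mailbox if it contains a cyrus.header file; the function name is hypothetical, and the root path is whatever your uuid partition directory is.

```python
import os


def find_nested_mailboxes(root):
    """Return (parent, child) pairs where one mailbox directory contains another.

    A directory is treated as a mailbox if it holds a cyrus.header file
    (an assumption of this sketch).
    """
    nested = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "cyrus.header" not in filenames:
            continue
        for name in sorted(dirnames):
            child = os.path.join(dirpath, name)
            if os.path.isfile(os.path.join(child, "cyrus.header")):
                nested.append((dirpath, child))
    return sorted(nested)
```

Only point this at the uuid tree. In the legacy by-name storage, subfolders living under their parent's directory is normal, so every subfolder would be reported there.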

Could you help me recover my subfolders and, I suppose, my sieve, address books and calendars?
I still have the backup of /var/spool/cyrus and /var/spool/sieve I have made after the restart due to initial segmentation fault (which means update process was started)

Do I understand correctly that you took this backup _after_ the failed upgrade, and therefore it is also corrupted?  I'm not sure that's useful for anything, but maybe it's better than whatever you have now after tinkering.

Do you have a backup from before the upgrade?  If so, you could remove 3.6, reinstall 3.4, and restore your data from that...

Thank you so much in advance for your help.

Sorry this isn't more helpful...

On Tue, 22 Mar 2022, at 1:05 AM, Andy Dorman wrote:
I think this may have happened to us also.

We are running Debian/Postfix/Cyrus on 14 servers and just updated to
3.6.0~beta2-1+b1.

During the update process, 3 of the 14 servers reported a segfault
during the update. After the apt process was finished I was able to
restart cyrus without any problem.

Well, I guess it's reassuring that 11 of the servers succeeded!  This suggests a missed edge case rather than something just being fundamentally broken.  Silver linings...

This morning we are beginning to see postfix LMTP bounces from the 3
servers that had the segfault to addresses that we have confirmed
do exist.
Postfix LMTP error
--------------------
550-Mailbox unknown.  Either there is no mailbox associated with this
550-name or you do not have authorization to see it. 550 5.1.1 User
unknown (in reply to RCPT TO command))

Confirmation mailbox exists
------------------------------
cyradm -user cyrus localhost
Password:
localhost.ironicdesign.com> lm
user/grandkids@xxxxxxxxxxxx (\HasNoChildren)
...

Do the LMTP bounces happen for all users on those 3 servers, or only some?

If it's only some that are bouncing, I guess the mailboxes.db records for the bouncing mailboxes must have been missed by the conversion process when it crashed out.  I'm not sure what it means if it's all mailboxes bouncing.

Not sure how to confirm that the grandkids mailbox is in uuid except I
would assume that cyradm is searching the uuid storage space.

The mbpath(8) tool will tell you where it thinks they are.
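
If you have a lot of mailboxes to check, a small wrapper along these lines could run mbpath for each one and collect the answers. This is only a sketch: it assumes mbpath is on your PATH, that you run it as a user allowed to read the Cyrus configuration and databases, and that it prints the path on standard output and exits non-zero when it can't resolve a name. The function name and the mailbox name in the usage note are made up for illustration.

```python
import subprocess


def locate_mailboxes(names, mbpath_bin="mbpath"):
    """Map each mailbox name to the path reported by mbpath, or None on failure."""
    paths = {}
    for name in names:
        result = subprocess.run(
            [mbpath_bin, name], capture_output=True, text=True
        )
        # A non-zero exit status is taken to mean mbpath could not resolve the name.
        paths[name] = result.stdout.strip() if result.returncode == 0 else None
    return paths
```

For example, locate_mailboxes(["user/example"]) would return a dict with one entry. Comparing the reported paths for a bouncing mailbox and a working one on the same server might show whether the bouncing ones were left unconverted.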

Any thoughts on how to try recovering the mail for these addresses?

I've asked some others to chime in on this thread who might have experience/insight into this, so we will see what they say when they do.  Sorry I do not!

Cheers,

ellie
