Hi Ellie,
Thanks for helping me look at this.
On 2019-10-09 16:17, ellie timoney wrote:
> Does the same problem occur if you use sync_client (on the master server, as the cyrus user) to replicate the shared mailbox to the backup server (rather than using XBACKUP over IMAP)? Something like "sync_client -n rsync -m support@xxxxxxxxxxxxxxx" I think?
With -m I get this familiar output on the master:
# /usr/lib/cyrus/bin/sync_client -v -n rsync -m support@xxxxxxxxxxxxxxx
MAILBOXES polyfoam.com.au!support
Error from sync_do_mailboxes(): bailing out!
and this is seen in the log on the backup server:
Oct 11 17:39:58 rsync cyrus/backupd[3969]: login: mail-3175-1.polyfoam.com.au [10.3.244.125] rsync-mail-3175-1 DIGEST-MD5 User logged in
Oct 11 17:39:58 rsync cyrus/backupd[3969]: created decompress buffer of 4102 bytes
Oct 11 17:39:58 rsync cyrus/backupd[3969]: created compress buffer of 4073 bytes
Oct 11 17:39:58 rsync cyrus/backupd[3969]: decompressed 47 -> 41 bytes
Oct 11 17:39:58 rsync cyrus/master[458]: process type:SERVICE name:backupd path:/usr/lib/cyrus/bin/backupd age:201.603s pid:3969 signaled to death by signal 11 (Segmentation fault, core dumped)
Oct 11 17:39:58 rsync cyrus/master[458]: service backupd/ipv4 pid 3969 in BUSY state: terminated abnormally
Oct 11 17:39:58 rsync cyrus/master[458]: service backupd/ipv4 now has 0 ready workers
So backupd segfaults and dumps core. Here is the backtrace from that core dump:
# coredumpctl gdb -1
PID: 3969 (backupd)
UID: 103 (cyrus)
GID: 8 (mail)
Signal: 11 (SEGV)
Timestamp: Fri 2019-10-11 17:39:58 AEDT (2min 16s ago)
Command Line: /usr/lib/cyrus/bin/backupd
Executable: /usr/lib/cyrus/bin/backupd
Control Group: /system.slice/cyrus-imapd.service
Unit: cyrus-imapd.service
Slice: system.slice
Boot ID: c887b7eb1d734962b8bddb745df21e8f
Machine ID: facebc4e2dcd47a68a097acc9077814e
Hostname: rsync
Storage: /var/lib/systemd/coredump/core.backupd.103.c887b7eb1d734962b8bddb745df21e8f.3969.1570775998000000.lz4
Message: Process 3969 (backupd) of user 103 dumped core.
Stack trace of thread 3969:
#0 0x00007f95c17e3206 __GI___strlen_sse2 (libc.so.6)
#1 0x00007f95c2080119 xstrdup (libcyrus_min.so.0)
#2 0x00005557e89fe440 is_mailboxes_single_user (backupd)
#3 0x00005557e89eec88 main (backupd)
#4 0x00007f95c176f09b __libc_start_main (libc.so.6)
#5 0x00005557e89ef34a _start (backupd)
GNU gdb (Debian 8.2.1-2+b1) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/cyrus/bin/backupd...Reading symbols from /usr/lib/debug/.build-id/e3/b2619440ce57c6ae7db282266976a826059cf2.debug...done.
done.
[New LWP 3969]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/cyrus/bin/backupd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
120 ../sysdeps/x86_64/multiarch/../strlen.S: No such file or directory.
(gdb) bt
#0 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
#1 0x00007f95c2080119 in xstrdup (str=0x0) at lib/xmalloc.c:95
#2 0x00005557e89fe440 in is_mailboxes_single_user (dl=0x5557e9472900) at backup/backupd.c:1438
#3 cmd_get (dl=0x5557e9472900) at backup/backupd.c:1489
#4 cmdloop () at backup/backupd.c:688
#5 service_main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at backup/backupd.c:282
#6 0x00005557e89eec88 in main (argc=<optimized out>, argv=<optimized out>, envp=0x7ffc01307b38) at master/service.c:638
(gdb) up
#1 0x00007f95c2080119 in xstrdup (str=0x0) at lib/xmalloc.c:95
95 lib/xmalloc.c: No such file or directory.
(gdb) up
#2 0x00005557e89fe440 in is_mailboxes_single_user (dl=0x5557e9472900) at backup/backupd.c:1438
1438 backup/backupd.c: No such file or directory.
(gdb) p *dl
$2 = {name = 0x5557e9415590 "MAILBOXES", head = 0x5557e947c0b0, tail = 0x5557e947c0b0, next = 0x0, type = 10, sval = 0x0, nval = 0, gval = 0x0, part = 0x0}
(gdb)
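To spell out what frame #1 shows: xstrdup() was handed str=0x0 and the fault happens inside strlen(). The following is only a minimal sketch of that failure mode and of the two usual ways to guard against it. It is not the Cyrus source, and dup_or_null and dup_required are hypothetical helpers named here purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Cyrus: duplicate a string, but pass
 * NULL through instead of handing it to strlen(), which is where the
 * fault in frame #0 occurs. */
static char *dup_or_null(const char *str)
{
    if (str == NULL)
        return NULL;              /* nothing to copy; the caller must cope */

    size_t len = strlen(str);     /* safe: str is known to be non-NULL */
    char *copy = malloc(len + 1);
    if (copy == NULL)
        abort();                  /* out of memory: give up in this sketch */
    memcpy(copy, str, len + 1);   /* copies the terminating NUL as well */
    return copy;
}

/* Hypothetical variant for call sites where NULL is a programming error:
 * fail loudly at the call site rather than deep inside strlen(). */
static char *dup_required(const char *str)
{
    assert(str != NULL);
    return dup_or_null(str);
}
```

Which of the two behaviours is appropriate at backupd.c:1438 is a question for someone who knows that code; the sketch only shows that the NULL has to be handled before the string is measured.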
> What about if you use "sync_client -n rsync -u support@xxxxxxxxxxxxxxx" instead (i.e. with -u treating the shared mailbox as a USER rather than as a -m MAILBOX)?
This doesn't crash. Instead, it transfers a since-deleted full user, support@xxxxxxxxxxxxxxx (dross left in my database from before I made support@xxxxxxxxxxxxxxx a public mailbox), to the backup server:
# /usr/lib/cyrus/bin/sync_client -v -v -n rsync -u support@xxxxxxxxxxxxxxx
cyrus/sync_client[201108]: couldn't authenticate to backend server: no mechanism available
>1570776887>COMPRESS DEFLATE
<1570776887<OK DEFLATE active
USER support@xxxxxxxxxxxxxxx
>1570776887>GET USER support@xxxxxxxxxxxxxxx
<1570776887<* MAILBOX %(UNIQUEID jkt2bnhgvdgxbotfkmo7jz9j
MBOXNAME polyfoam.com.au!user.support MBOXTYPE NIL LAST_UID 3
HIGHESTMODSEQ 11 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE
1570530398 POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY
1570528039 PARTITION default ACL "support@xxxxxxxxxxxxxxx
lrswipkxtecdan debbiep@xxxxxxxxxxxxxxx lrswipkxtecda "
OPTIONS P SYNC_CRC 0 SYNC_CRC_ANNOT 0 QUOTAROOT NIL XCONVMODSEQ 0
ANNOTATIONS (%(ENTRY /comment USERID "" VALUE "Support (Polyfoam
group)") %(ENTRY /comment USERID admdebbiep VALUE Support)))
* MAILBOX %(UNIQUEID 33oidc614u2i31hqutvkywu8 MBOXNAME
polyfoam.com.au!user.support.Archive MBOXTYPE NIL LAST_UID 0
HIGHESTMODSEQ 1 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0
POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1570528039
PARTITION default ACL "support@xxxxxxxxxxxxxxx
lrswipkxtecdan " OPTIONS P SYNC_CRC 0 SYNC_CRC_ANNOT 0
QUOTAROOT NIL XCONVMODSEQ 0 ANNOTATIONS (%(ENTRY /specialuse
USERID support@xxxxxxxxxxxxxxx VALUE {8+}
\archive)))
* MAILBOX %(UNIQUEID 5wwsybv1ftur0vwaeeuvb0pk MBOXNAME
polyfoam.com.au!user.support.Drafts MBOXTYPE NIL LAST_UID 0
HIGHESTMODSEQ 2 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0
POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1570528039
PARTITION default ACL "support@xxxxxxxxxxxxxxx
lrswipkxtecdan " OPTIONS P SYNC_CRC 0 SYNC_CRC_ANNOT 0
QUOTAROOT NIL XCONVMODSEQ 0 ANNOTATIONS (%(ENTRY /specialuse
USERID support@xxxxxxxxxxxxxxx VALUE {5+}
\sent)))
* MAILBOX %(UNIQUEID 7k6a536yhh9zc3t3j5f9z93u MBOXNAME
polyfoam.com.au!user.support.Important MBOXTYPE NIL LAST_UID 0
HIGHESTMODSEQ 1 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0
POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1570528039
PARTITION default ACL "support@xxxxxxxxxxxxxxx
lrswipkxtecdan " OPTIONS P SYNC_CRC 0 SYNC_CRC_ANNOT 0
QUOTAROOT NIL XCONVMODSEQ 0 ANNOTATIONS (%(ENTRY /specialuse
USERID support@xxxxxxxxxxxxxxx VALUE {10+}
\important)))
* MAILBOX %(UNIQUEID zdl5xlyfx1wtcpp9gd3ce50l MBOXNAME
polyfoam.com.au!user.support.Junk MBOXTYPE NIL LAST_UID 0
HIGHESTMODSEQ 1 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0
POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1570528039
PARTITION default ACL "support@xxxxxxxxxxxxxxx
lrswipkxtecdan " OPTIONS P SYNC_CRC 0 SYNC_CRC_ANNOT 0
QUOTAROOT NIL XCONVMODSEQ 0 ANNOTATIONS (%(ENTRY /specialuse
USERID support@xxxxxxxxxxxxxxx VALUE {5+}
\junk)))
* MAILBOX %(UNIQUEID pi3ppdusz8pgio8lyrkj7tiw MBOXNAME
polyfoam.com.au!user.support.Sent MBOXTYPE NIL LAST_UID 0
HIGHESTMODSEQ 1 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0
POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1570528039
PARTITION default ACL "support@xxxxxxxxxxxxxxx
lrswipkxtecdan " OPTIONS P SYNC_CRC 0 SYNC_CRC_ANNOT 0
QUOTAROOT NIL XCONVMODSEQ 0)
* MAILBOX %(UNIQUEID a888coldtyz7dwop51uuqbot MBOXNAME
polyfoam.com.au!user.support.Trash MBOXTYPE NIL LAST_UID 0
HIGHESTMODSEQ 1 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0
POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1570528039
PARTITION default ACL "support@xxxxxxxxxxxxxxx
lrswipkxtecdan " OPTIONS P SYNC_CRC 0 SYNC_CRC_ANNOT 0
QUOTAROOT NIL XCONVMODSEQ 0 ANNOTATIONS (%(ENTRY /specialuse
USERID support@xxxxxxxxxxxxxxx VALUE {6+}
\trash)))
* LSUB (polyfoam.com.au!user.support
polyfoam.com.au!user.support.Archive
polyfoam.com.au!user.support.Drafts
polyfoam.com.au!user.support.Junk
polyfoam.com.au!user.support.Sent
polyfoam.com.au!user.support.Trash)
OK Success
cyrus/sync_client[201108]: Inbox missing on master for support@xxxxxxxxxxxxxxx
UNUSER support@xxxxxxxxxxxxxxx
>1570776887>APPLY UNUSER support@xxxxxxxxxxxxxxx
<1570776887<OK Success
>1570776887>EXIT
<1570776887<OK Finished
But I think that's a red herring. If I run the same -u command for a public mailbox that has never had a matching user, I get this:
# /usr/lib/cyrus/bin/sync_client -v -v -n rsync -u info2@xxxxxxxxxxxxxxx
cyrus/sync_client[201121]: couldn't authenticate to backend server: no mechanism available
>1570777011>COMPRESS DEFLATE
<1570777011<OK DEFLATE active
USER info2@xxxxxxxxxxxxxxx
>1570777011>GET USER info2@xxxxxxxxxxxxxxx
<1570777011<OK Success
Running with the -m option on info2 produces the same segfault as above.
> On the backup server, what does the "ctl_backups verify -vvv -m polyfoam.com.au!support" command say about the shared mailbox?
It segfaults too, though not in the same place as above.
# /usr/lib/cyrus/bin/ctl_backups verify -vvv -m 'polyfoam.com.au!support'
Segmentation fault (core dumped)
# coredumpctl gdb -1
PID: 3927 (ctl_backups)
UID: 103 (cyrus)
GID: 8 (mail)
Signal: 11 (SEGV)
Timestamp: Fri 2019-10-11 17:26:10 AEDT (50s ago)
Command Line: /usr/lib/cyrus/bin/ctl_backups verify -vvv -m polyfoam.com.au!support
Executable: /usr/lib/cyrus/bin/ctl_backups
Control Group: /user.slice/user-1000.slice/session-1.scope
Unit: session-1.scope
Slice: user-1000.slice
Session: 1
Owner UID: 1000 (localadmin)
Boot ID: c887b7eb1d734962b8bddb745df21e8f
Machine ID: facebc4e2dcd47a68a097acc9077814e
Hostname: rsync
Storage: /var/lib/systemd/coredump/core.ctl_backups.103.c887b7eb1d734962b8bddb745df21e8f.3927.1570775170000000.lz4
Message: Process 3927 (ctl_backups) of user 103 dumped core.
Stack trace of thread 3927:
#0 0x00007fc24cfdb206 __GI___strlen_sse2 (libc.so.6)
#1 0x0000557f0137c495 backup_get_paths (ctl_backups)
#2 0x0000557f0136d678 main (ctl_backups)
#3 0x00007fc24cf6709b __libc_start_main (libc.so.6)
#4 0x0000557f0136db6a _start (ctl_backups)
Reading symbols from /usr/lib/cyrus/bin/ctl_backups...Reading symbols from /usr/lib/debug/.build-id/11/23ce6d2f413e1384c144165f996813ad4924c0.debug...done.
done.
[New LWP 3927]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/cyrus/bin/ctl_backups verify -vvv -m polyfoam.com.au!support'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
120 ../sysdeps/x86_64/multiarch/../strlen.S: No such file or directory.
(gdb) bt
#0 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
#1 0x0000557f0137c495 in backup_get_paths (mbname=0x557f0300bfd0, data_fname=0x7ffcbe2daf90, index_fname=0x0, create=BACKUP_OPEN_NOCREATE) at backup/lcb.c:373
#2 0x0000557f0136d678 in main (argc=5, argv=0x7ffcbe2db108) at backup/ctl_backups.c:525
(gdb) up
#1 0x0000557f0137c495 in backup_get_paths (mbname=0x557f0300bfd0, data_fname=0x7ffcbe2daf90, index_fname=0x0, create=BACKUP_OPEN_NOCREATE) at backup/lcb.c:373
373 backup/lcb.c: No such file or directory.
(gdb) p *mbname
$2 = {boxes = 0x557f0300c090, is_deleted = 0, localpart = 0x0, domain = 0x557f0300c050 "polyfoam.com.au", extns = 0x0, extuserid = 0x0, userid = 0x0, intname = 0x557f0300c030 "polyfoam.com.au!support", extname = 0x0, recipient = 0x0}
(gdb) p userid
$4 = 0x0
(gdb)
Calling strlen() on the null userid looks like the immediate cause of this crash. I'm not familiar enough with the code to know why userid is null here, or whether the same thing is behind the backupd crash, where xstrdup() is called with str=0x0.
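To illustrate what I think I'm seeing in the mbname dump (localpart and userid are both 0x0 for "polyfoam.com.au!support", whereas the old user's mailboxes are named "polyfoam.com.au!user.support..."), here is a rough sketch in C. It is not the Cyrus code: userid_from_intname and userid_length are names I made up, and the "domain!user.localpart" parsing is a simplified assumption based only on the mailbox names in the output above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not Cyrus code: derive "localpart@domain" from an
 * internal name of the assumed form "domain!user.localpart[.folder...]".
 * Returns a malloc'd string, or NULL for a shared mailbox such as
 * "domain!support", which has no "user." prefix and therefore no owner. */
static char *userid_from_intname(const char *intname)
{
    assert(intname != NULL);

    const char *bang = strchr(intname, '!');
    if (bang == NULL)
        return NULL;                      /* no domain part: not handled here */

    const char *box = bang + 1;
    if (strncmp(box, "user.", 5) != 0)
        return NULL;                      /* shared mailbox: no userid */

    const char *local = box + 5;
    size_t local_len = strcspn(local, "."); /* stop at the first subfolder */
    if (local_len == 0)
        return NULL;

    size_t domain_len = (size_t)(bang - intname);
    size_t total = local_len + 1 + domain_len + 1; /* local + '@' + domain + NUL */
    char *userid = malloc(total);
    if (userid == NULL)
        abort();
    snprintf(userid, total, "%.*s@%.*s",
             (int)local_len, local, (int)domain_len, intname);
    return userid;
}

/* Length of the owner's userid, or 0 when the mailbox has no owner.
 * The NULL check before strlen() is the guard that appears to be
 * missing on the path that crashes. */
static size_t userid_length(const char *intname)
{
    char *userid = userid_from_intname(intname);
    size_t len = (userid != NULL) ? strlen(userid) : 0;
    free(userid);                         /* free(NULL) is a no-op */
    return len;
}
```

With this sketch, "polyfoam.com.au!user.support.Archive" yields support@polyfoam.com.au, while "polyfoam.com.au!support" yields NULL, which matches the null userid gdb shows for the shared mailbox.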