Subject: Bounced Mail Due to Postfix/Cyrus conflict INTRO This message describes a problem between Postfix and Cyrus, and describes a workaround which worked for us. The problem is caused by a mis-communication between Postfix's lmtp client and Cyrus's lmtpd daemon, and is triggered when Cyrus's lmtpd daemon has too many 'lockers' locking its database. The result is that Postfix fails to deliver valid email, instead returning it to the sender. I'm posting this to (1) help users who are experiencing similar problems, and (2) alert Postfix and Cyrus developers to this problem. FIRST SYMPTOM For each returned email, our log showed lines like this, Feb 13 13:05:48 HOST postfix/lmtp[11370]: DBC8F384FC: to=<...>, orig_to=<...>, relay=public/lmtp[public/lmtp], delay=1, status=bounced (host public/lmtp[public/lmtp] said: 250 2.1.5 ok (in reply to DATA command)) The mystery was, why is postfix/lmtp bouncing mail? Didn't it receive an OK message ("250 2.1.5 ok")? THE ACTUAL PROBLEM Following the on-line advice, we turned on verbose messages for lmtp, replacing "lmtp" with "lmtp -v" in the appropriate column in /etc/postfix/master.cf, and then re-examined our logs. The cause of our failed emails was then narrowed to the conversion between Postfix's lmtp client, and Cyrus's lmtpd daemon. The problem happened in two stages. In the first stage, Cyrus's lmtpd would report a problem with too many 'lockers', Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.0.0 ok Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: MAIL FROM:<...> SIZE=26455 Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RCPT TO:<...> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.0 ok Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 ok Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 354 go ahead --C1>> Feb 13 13:05:48 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: . Feb 13 13:05:48 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 350 lockers As a result, Postfix defers the message for later delivery (note that below the status says 'bounced', but later log entries (not shown) show that this messages was retried and eventually delivered). Here's the bounce message give in this stage, Feb 13 13:05:51 HOST postfix/lmtp[11340]: 45ACD380F0: to=<...>, relay=public/lmtp[public/lmtp], delay=5, status=bounced (host public/lmtp[public/lmtp] said: 350 lockers (in reply to end of DATA command)) In the next stage, the conversation continues, and an attempt is made to deliver a second email. Below is the rest of the conversation between lmtp and lmtpd. --C2>> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RSET <<R1-- Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 Ok --C3>> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: MAIL FROM:<...> --C4>> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: RCPT TO:<...> --C5>> Feb 13 13:05:53 HOST postfix/lmtp[11340]: > public/lmtp[public/lmtp]: DATA <<R2-- Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.0.0 ok <<R3-- Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.0 ok <<R4-- Feb 13 13:05:53 HOST postfix/lmtp[11340]: < public/lmtp[public/lmtp]: 250 2.1.5 ok Feb 13 13:05:48 HOST postfix/lmtp[11370]: DBC8F384FC: to=<...>, orig_to=<...>, relay=public/lmtp[public/lmtp], delay=1, status=bounced (host public/lmtp[public/lmtp] said: 250 2.1.5 ok (in reply to DATA command)) The lines following the marks like --C1>> and <<R1-- are the commands given by lmtp, and the responses returned by lmtpd. Here, Postfix/lmtp bounces the email because in response to its DATA command (marked --C5>> above), it receives the reply "250 2.1.5 ok" (marked <<R4-- above), but it rightfully expects instead "354 go ahead". Postfix's lmtp gets the wrong response because the responses are out of sync. You can see this by following the sequence of replies. The first command from lmtp is the '.' command (--C1>> above) signaling the end of input to the DATA command. The response to this <<R1--, although it *appears* to be the repines to the RSET command. By following the sequence you can see that lmtp thinks that "250 2.1.5 ok" is an invalid response to the "DATA" command, but it's really the valid response to the "RCPT TO:" command. Note that last line above. It shows lmtp bouncing the email. While this bounce looks similar to the first bounce, these types of bounces were always returned to sender. OUR WORKAROUND Our workaround for this problem is relatively simple. We got Cyrus's lmtpd client to stop issuing the 'lockers' message by turning off 'duplicatesuppression' (it is on by default), by setting duplicatesuppression: 0 Duplicate suppression compares every incoming email to a list of previous emails in the duplicate database, and rejects those that are duplicates for any particular user. Apparently this process is what was generating the 'lockers' messages. Since we have no need of this feature, turning it off is an easy workaround for us. WHAT IS THE RIGHT FIX? Should Postfix's lmtp client re-queue the messages that it currently returns? Or, should Cyrus's lmtpd daemon do a better job with its error messages back to Postfix's lmtp? ---- Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html