Mail being delayed for 5 minutes in active queue before being relayed

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Mail being delayed for 5 minutes in active queue before being relayed

jmct
This post was updated on .
Hi there,

I've set up a mail server that should be relaying messages to a different
cluster of Postfix boxes. When I attempt to send a message to the first box,
mail sits in the active queue for 5 minutes before being (successfully)
relayed to the cluster of Postfix boxes. which is then delivered correctly.
I'm trying to figure out why this is and have it relay immediately (as
expected).

My box I'm attempting to relay FROM is on:

mail_version = 3.3.0

Here's my (sanitized) postconf -n:

alias_database = hash:/etc/aliases
alias_maps = hash:/etc/aliases
append_dot_mydomain = no
biff = no
broken_sasl_auth_clients = yes
command_directory = /usr/sbin
compatibility_level = 2
data_directory = /var/lib/postfix
inet_interfaces = all
inet_protocols = ipv4
mail_owner = postfix
mailbox_size_limit = 0
mailq_path = /usr/bin/mailq
manpage_directory = /usr/share/man
mydestination = $mydomain, localhost, localhost.localdomain, localhost
mydomain = mail.domain.com
myhostname = mail.domain.com
mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 10.0.0.0/8 192.168.0.0/16
mynetworks_style = subnet
myorigin = domain.com
newaliases_path = /usr/bin/newaliases
queue_directory = /var/spool/postfix
recipient_delimiter = +
relayhost = elbmail.internal.lb.com:25587
sendmail_path = /usr/sbin/sendmail
setgid_group = postdrop
smtp_sasl_auth_enable = yes
smtp_sasl_mechanism_filter = plain, login
smtp_sasl_password_maps = hash:/etc/postfix/sasl-passwords
smtp_sasl_security_options =
smtp_tls_CAfile = /etc/pki/tls/certs/ca-bundle.crt
smtp_tls_note_starttls_offer = yes
smtp_tls_security_level = may
smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache
smtp_use_tls = yes
smtpd_banner = $myhostname ESMTP $mail_name
smtpd_recipient_restrictions = permit_sasl_authenticated,permit_mynetworks,reject_unauth_destination
smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination
smtpd_sasl_auth_enable = yes
smtpd_sasl_local_domain =
smtpd_sasl_security_options = noanonymous
smtpd_tls_CAfile = /etc/pki/tls/cacert.pem
smtpd_tls_auth_only = no
smtpd_tls_cert_file = /etc/pki/tls/certs/mail.domain.crt
smtpd_tls_key_file = /etc/pki/tls/private/mail.domain.key
smtpd_tls_loglevel = 3
smtpd_tls_received_header = yes
smtpd_tls_security_level = may
smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache
smtpd_tls_session_cache_timeout = 3600s
smtpd_use_tls = yes
tls_random_exchange_name = /var/lib/postfix/prng_exch
tls_random_source = dev:/dev/urandom

Thanks for your help.

edit: Didn't seem to like formatting in the text editor. I've removed the formatting.

--
Sent from: http://postfix.1071664.n5.nabble.com/Postfix-Users-f2.html
Reply | Threaded
Open this post in threaded view
|

Re: Mail being delayed for 5 minutes in active queue before being relayed

Matus UHLAR - fantomas
On 20.05.18 15:08, jmct wrote:
>I've set up a mail server that should be relaying messages to a different
>cluster of Postfix boxes. When I attempt to send a message to the first box,
>mail sits in the active queue for 5 minutes before being (successfully)
>relayed to the cluster of Postfix boxes. which is then delivered correctly.
>I'm trying to figure out why this is and have it relay immediately (as
>expected).

is DNS working properly on sending and receiving side?
Is there any firewall between them?

note that you did not send the configuration.
better don't paste it here, maybe pastebin.
--
Matus UHLAR - fantomas, [hidden email] ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
REALITY.SYS corrupted. Press any key to reboot Universe.
Reply | Threaded
Open this post in threaded view
|

Re: Mail being delayed for 5 minutes in active queue before being relayed

Wietse Venema
Matus UHLAR - fantomas:

> On 20.05.18 15:08, jmct wrote:
> >I've set up a mail server that should be relaying messages to a different
> >cluster of Postfix boxes. When I attempt to send a message to the first box,
> >mail sits in the active queue for 5 minutes before being (successfully)
> >relayed to the cluster of Postfix boxes. which is then delivered correctly.
> >I'm trying to figure out why this is and have it relay immediately (as
> >expected).
>
> is DNS working properly on sending and receiving side?
> Is there any firewall between them?
>
> note that you did not send the configuration.
> better don't paste it here, maybe pastebin.

Logging would help as well, so that we can see where the delay
is happening.

If your mail queue is on a file server, then you need to synchronize
clocks. Postfix won't deliver files with a modification time in the
future.

        Wietse
Reply | Threaded
Open this post in threaded view
|

Re: Mail being delayed for 5 minutes in active queue before being relayed

Postfix User-2
In reply to this post by jmct
On Sun, 20 May 2018 15:08:11 -0700 (MST), jmct stated:

>Hi there,
>
>I've set up a mail server that should be relaying messages to a different
>cluster of Postfix boxes. When I attempt to send a message to the first box,
>mail sits in the active queue for 5 minutes before being (successfully)
>relayed to the cluster of Postfix boxes. which is then delivered correctly.
>I'm trying to figure out why this is and have it relay immediately (as
>expected).
>
>*My box I'm attempting to relay FROM is on:
>*
>
>
>*Here's my (sanitized) postconf -n:*
>
>
>
>Thanks for your help.

I would suggest that you read this: http://www.postfix.com/DEBUG_README.html,
or better yet, this: http://www.postfix.com/DEBUG_README.html#mail

The "postfinger" tool mentioned in the previous URL is excellent.

--
Jerry
Reply | Threaded
Open this post in threaded view
|

Re: Mail being delayed for 5 minutes in active queue before being relayed

Stephen Satchell
On 05/21/2018 07:06 AM, Postfix User wrote:
> I would suggest that you read this:http://www.postfix.com/DEBUG_README.html


Suggest adding a section "Using a remote filesystem", asking if the
clocks between the PostFix server and the remote file system are
synchronized.
Reply | Threaded
Open this post in threaded view
|

Re: Mail being delayed for 5 minutes in active queue before being relayed

Wietse Venema
Stephen Satchell:
> On 05/21/2018 07:06 AM, Postfix User wrote:
> > I would suggest that you read this:http://www.postfix.com/DEBUG_README.html
>
>
> Suggest adding a section "Using a remote filesystem", asking if the
> clocks between the PostFix server and the remote file system are
> synchronized.

Upon closer investigation, Postfix will not only log a warning

    warning: file system clock is %d seconds ahead of local clock

but it will also attempt to 'fix' the queue file modification time.
Source: mail_stream.c.

Thus, this form of clock drift should not delay the first email
delivery attempt.

        Wietse
Reply | Threaded
Open this post in threaded view
|

Re: Mail being delayed for 5 minutes in active queue before being relayed

Stephen Satchell
In reply to this post by Stephen Satchell
On 05/21/2018 07:31 AM, Stephen Satchell wrote:
> On 05/21/2018 07:06 AM, Postfix User wrote:
>> I would suggest that you read
>> this:http://www.postfix.com/DEBUG_README.html
>
>
> Suggest adding a section "Using a remote filesystem", asking if the
> clocks between the PostFix server and the remote file system are
> synchronized.

Add a test in Postfix start-up to check if filesystem timestamps
reasonably match the Postfix host system's time?

Create a file within the Postfix folder tree, check the m-time, and then
delete the file.  Log an error if the clocks differ by more than a few
seconds.

Just a thought.  I know that the use of remote file systems is not a
common occurrence, but it's happening enough that the benefits of adding
a clock sanity check outweigh the burden of work.
Reply | Threaded
Open this post in threaded view
|

Re: Mail being delayed for 5 minutes in active queue before being relayed

jmct
In reply to this post by jmct
Thanks for all the replies.

Yes, DNS is working as expected. I can resolve the relay host's address no
problem.

There is a firewall between them. Traffic is open between both sides. My
postfix server is running on 25587/tcp and the relay hosts are both
listening on the same. I can successfully telnet into both relay hosts boxes
individually, as well as through the load-balanced address.

I am not hosting this on a fileserver. Hosting on an Ubuntu VM with local
storage. All postfix servers have their times set up correctly though.

Here's a pastebin of postfinger:

https://pastebin.com/t8wa3kjH

Here's a pastebin of my postconf -n:

https://pastebin.com/x7vQkjMP

Here's a copy of a problematic message's log:

May 21 15:14:43 mail postfix/pickup[13108]: 3AA2E3F9AD: uid=0
from=<[hidden email]>
May 21 15:14:43 mail postfix/cleanup[13125]: 3AA2E3F9AD:
message-id=<[hidden email]>
May 21 15:14:43 mail postfix/qmgr[13109]: 3AA2E3F9AD:
from=<[hidden email]>, size=338, nrcpt=1 (queue active)
May 21 15:19:44 mail postfix/smtp[13170]: 3AA2E3F9AD: conversation with
load-balanced-dns-name[10.55.128.92] timed out while receiving the initial
server greeting
May 21 15:19:44 mail postfix/smtp[13170]: 3AA2E3F9AD:
to=<[hidden email]>,
relay=internal-load-balancer[10.55.128.125]:25587, delay=301,
delays=0.01/0.02/301/0.11, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as
6A10E61A76)
May 21 15:19:44 mail postfix/qmgr[13109]: 3AA2E3F9AD: removed

I do see the "conversation with load-balanced-dns-name[10.55.128.92] timed
out while receiving the initial server greeting" message now, but I don't
know where to look here. I can successfully hit both servers behind the load
balanced endpoint, as well as telnet into the correct port I've set up
(25587/tcp).

The message attempts the first time, fails, then succeeds the second time --
every time.

I did attempt to change tcp_window_scaling to "0" in sysctl (found on
another forum), but this unfortunately did not do the trick for me
(net.ipv4.tcp_window_scaling = 0 for reference).

Just to be completely transparent on how my setup is here, mail is flowing
as follows:

mail -> postfix -> openVPN connection -> load balancer -> relay host 1 and 2

Traffic is flowing between this connection stream.

Thank you.



--
Sent from: http://postfix.1071664.n5.nabble.com/Postfix-Users-f2.html
Reply | Threaded
Open this post in threaded view
|

Re: Mail being delayed for 5 minutes in active queue before being relayed

jmct
I seem to have fixed this.

It appears that DNS WAS the cause. Resolution was not happening to me when
doing requests, but when Postfix was attempting to make the same request, it
appears to have failed from time to time.

I saw this in the logs:

May 21 15:28:53 mail postfix/scache[13448]: statistics: start interval May
21 15:20:33
May 21 15:28:53 mail postfix/scache[13448]: statistics: domain lookup hits=0
miss=3 success=0%

and went ahead and stuck my load balancer's endpoint into /etc/hosts. Mail
is now sending instantly as expected. I imagine this has something to do
with passing the request through the load balancer.

Thanks for all of the help. :)



--
Sent from: http://postfix.1071664.n5.nabble.com/Postfix-Users-f2.html
Reply | Threaded
Open this post in threaded view
|

Re: Mail being delayed for 5 minutes in active queue before being relayed

Igor Golubkov


On May 21, 2018 7:28:00 PM jmct <[hidden email]> wrote:

> I seem to have fixed this.
>
> It appears that DNS WAS the cause. Resolution was not happening to me when
> doing requests, but when Postfix was attempting to make the same request, it
> appears to have failed from time to time.
>
> I saw this in the logs:
>
> May 21 15:28:53 mail postfix/scache[13448]: statistics: start interval May
> 21 15:20:33
> May 21 15:28:53 mail postfix/scache[13448]: statistics: domain lookup hits=0
> miss=3 success=0%
>
> and went ahead and stuck my load balancer's endpoint into /etc/hosts. Mail
> is now sending instantly as expected. I imagine this has something to do
> with passing the request through the load balancer.
>
> Thanks for all of the help. :)
>
>
>
> --
> Sent from: http://postfix.1071664.n5.nabble.com/Postfix-Users-f2.html
Shouldn't your relayhost be [elbmail.internal.lb.com]:25587 (with brackets)
so postfix doesn't try to find mx records for that name?