*Some* messages getting stuck in outbound queue...

Charles Marcus
Hello,

(postconf -n at bottom of message)

I've been experiencing a weird problem over the last two weeks or so,
where certain messages are getting stuck in the outbound queue, and
eventually being returned to the sender (one of our internal employees)
as undeliverable. I have checked everything I can think of, and have had
a ticket open with Nuvox (our ISP, and who we use for relaying all
outbound mail), but so far they haven't been able to figure out why this
is happening. I don't *think* this is a problem on my end, but I think a
sanity check is in order...

One weird thing is it seems to be limited to just a few domains, mainly
vzw/sprint.blackberry.net, and hearst.com, with an occasional problem
with a gmail or other random domain/recipient, but the ones that always
end up being returned as undeliverable are the ones for blackberry.net
and hearst.com...

Also, we communicate a LOT with hearst.com, and this only happens
occasionally, for certain messages...

Output of mailq currently looks like this:

myhost ~ # mailq
-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
3A603359C3F*   35280 Tue Jun 10 10:32:52  prvs=user=[hidden email]
                                          [hidden email]
                                          [hidden email]

12DB017C413*   26505 Wed Jun 11 17:51:23  prvs=user=[hidden email]
                                          [hidden email]
                                          [hidden email]
                                          [hidden email]

00CBD1964E0    29835 Wed Jun 11 11:51:00  [hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
end of data -- message may be sent more than once)
                                          [hidden email]
                                          [hidden email]

0366137DF24  2961207 Wed Jun 11 14:40:52  [hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          [hidden email]

1E7F838797F    25389 Mon Jun  9 12:18:31  prvs=user=[hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          [hidden email]

3BFE618DE86    27179 Wed Jun 11 16:37:06  MAILER-DAEMON
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          prvs=user=[hidden email]

40C3329F8FA    25022 Wed Jun 11 10:37:11  [hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          [hidden email]

5F46B380825    24395 Tue Jun 10 11:39:31  prvs=user=[hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          [hidden email]

6B6192A80CB    25560 Wed Jun 11 10:09:36  prvs=user=[hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          [hidden email]

766885FADF  2719349 Wed Jun 11 16:49:36  [hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          [hidden email]

9010B2A80E7    21725 Wed Jun 11 17:09:06  MAILER-DAEMON
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
end of data -- message may be sent more than once)
                                          prvs=user=[hidden email]

AE0003856FD   540971 Thu Jun 12 09:56:32  [hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          [hidden email]

A45E62E4EE1    24331 Tue Jun 10 12:19:10  prvs=user=[hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          [hidden email]

B04EAA2719    24632 Tue Jun 10 12:07:32  prvs=user=[hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          [hidden email]

C4CFE3880A5    15866 Tue Jun 10 14:22:47  [hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
end of data -- message may be sent more than once)
                                          [hidden email]

D2BD82950AE    29506 Wed Jun 11 11:50:57  [hidden email]
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
end of data -- message may be sent more than once)
                                          [hidden email]

-- 6422 Kbytes in 16 Requests.
myhost ~ #
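
A listing like the one above is easier to reason about once it is tallied by the SMTP phase in which the timeout occurred. The following is a minimal Python sketch, not part of the original post, assuming the classic `mailq` text layout shown here; the addresses in `SAMPLE` are placeholders (the archive hides the real ones), and the names `parse_mailq` and `tally_phases` are made up for illustration.

```python
import re
from collections import Counter

# Queue ID, optional flag (* = active, ! = hold), size in bytes,
# arrival time, sender. Recipient and reason lines do not match.
ENTRY = re.compile(
    r"^([0-9A-F]+)([*!]?)\s+(\d+)\s+"
    r"(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d)\s+(\S+)"
)
PHASE = re.compile(r"timed out while sending (message body|end of data)")

# Three entries in the same shape as the listing; addresses are placeholders.
SAMPLE = """\
-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
3A603359C3F*   35280 Tue Jun 10 10:32:52  sender@example.com
                                          rcpt@example.net

00CBD1964E0    29835 Wed Jun 11 11:51:00  sender@example.com
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
end of data -- message may be sent more than once)
                                          rcpt@example.net

0366137DF24  2961207 Wed Jun 11 14:40:52  sender@example.com
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
message body)
                                          rcpt@example.net
"""


def parse_mailq(text):
    """Return one dict per queue entry: id, flag, size, reason."""
    entries, current, in_reason = [], None, False
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:
            current = {"id": m.group(1), "flag": m.group(2),
                       "size": int(m.group(3)), "reason": ""}
            entries.append(current)
            in_reason = False
        elif current is not None and (in_reason or line.startswith("(")):
            # Deferral reasons are parenthesised and may wrap over lines.
            current["reason"] = (current["reason"] + " " + line.strip()).strip()
            in_reason = not line.rstrip().endswith(")")
    return entries


def tally_phases(entries):
    """Count entries by the SMTP phase in which the timeout occurred."""
    counts = Counter()
    for e in entries:
        m = PHASE.search(e["reason"])
        counts[m.group(1) if m else "no timeout recorded"] += 1
    return counts


if __name__ == "__main__":
    for phase, n in tally_phases(parse_mailq(SAMPLE)).items():
        print(f"{n:3d}  {phase}")
```

Feeding the real `mailq` output to `parse_mailq` instead of `SAMPLE` also gives the sizes per entry, which helps when checking whether only messages above a certain size are affected.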

All of the messages destined for blackberry.net and gmail.com are
forwards for 3 users here, but these have been working fine for months
or a year or more, and nothing has been changed related to these.

Also, I noticed this in the logs, but don't know whether it is
related/significant:

Jun 12 11:45:26 myhost postfix/scache[861]: statistics: start interval
Jun 12 11:35:35
Jun 12 11:45:26 myhost postfix/scache[861]: statistics: domain lookup
hits=0 miss=16 success=0%
Jun 12 11:45:26 myhost postfix/scache[861]: statistics: address lookup
hits=0 miss=16 success=0%
Jun 12 11:45:26 myhost postfix/scache[861]: statistics: max simultaneous
domains=1 addresses=1 connection=1

I'm still trying to capture a log of a message when it gets sent the
first time and gets stuck, but haven't been able to do so yet, but here
is a snippet of when a couple of messages are retried:

Jun 12 12:09:07 moria postfix/smtp[1245]: 9010B2A80E7:
to=<prvs=user=[hidden email]>,
relay=smtp.nuvox.net[70.43.63.17]:25, delay=68401,
delays=67800/0/0.13/600, dsn=4.4.2, status=deferred (conversation with
smtp.nuvox.net[70.43.63.17] timed out while sending end of data --
message may be sent more than once)
Jun 12 12:09:07 moria postfix/smtp[1263]: C4CFE3880A5:
to=<[hidden email]>, relay=smtp.nuvox.net[70.43.63.17]:25,
delay=164779, delays=164179/0.01/0.12/600, dsn=4.4.2, status=deferred
(conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
end of data -- message may be sent more than once)
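
The `delays=a/b/c/d` field in these lines breaks the total delay down into: time before the queue manager (a), time in the queue manager (b), connection setup (c), and message transmission (d). The Python sketch below, not part of the original post, splits that field for one of the lines above (shortened, with the hidden recipient omitted); the function name `split_delays` and the dictionary keys are made up for illustration.

```python
import re

DELAYS = re.compile(r"delays=([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)")


def split_delays(log_line):
    """Split Postfix's delays=a/b/c/d field into named parts (seconds)."""
    m = DELAYS.search(log_line)
    if m is None:
        return None
    a, b, c, d = (float(x) for x in m.groups())
    return {
        "before_qmgr": a,    # time before the queue manager picked it up
        "in_qmgr": b,        # time spent in the queue manager
        "conn_setup": c,     # DNS, connect, HELO (and TLS if used)
        "transmission": d,   # sending the message and waiting for the reply
    }


# Shortened copy of the first retry line quoted above.
LINE = (
    "Jun 12 12:09:07 moria postfix/smtp[1245]: 9010B2A80E7: "
    "relay=smtp.nuvox.net[70.43.63.17]:25, delay=68401, "
    "delays=67800/0/0.13/600, dsn=4.4.2, status=deferred"
)

if __name__ == "__main__":
    print(split_delays(LINE))
```

In both retries the connection is set up in about 0.1 seconds and the whole 600 seconds is spent in the transmission phase. That figure would be consistent with the stock `smtp_data_done_timeout` (600s, assuming the default has not been overridden), i.e. the relay never answers the final "." within the time limit.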

Postconf -n:

myhost ~ # postconf -n
alias_database = hash:/etc/mail/aliases
alias_maps = hash:/etc/mail/aliases, hash:/var/lib/mailman/data/aliases
anvil_rate_time_unit = 360s
anvil_status_update_time = 3600s
broken_sasl_auth_clients = yes
command_directory = /usr/sbin
config_directory = /etc/postfix
daemon_directory = /usr/lib64/postfix
data_directory = /var/lib/postfix
debug_peer_level = 2
default_destination_concurrency_limit = 20
delay_warning_time = 2h
home_mailbox = .maildir/
html_directory = /usr/share/doc/postfix-2.5.1/html
local_destination_concurrency_limit = 2
mail_owner = postfix
mailq_path = /usr/bin/mailq
manpage_directory = /usr/share/man
message_size_limit = 51200000
mydomain = my-domain.com
myhostname = smtp.my-domain.com
mynetworks = 127.0.0.1
newaliases_path = /usr/bin/newaliases
owner_request_special = no
queue_directory = /var/spool/postfix
readme_directory = /usr/share/doc/postfix-2.5.1/readme
recipient_delimiter = +
relayhost = [smtp.nuvox.net]
sample_directory = /etc/postfix
sendmail_path = /usr/sbin/sendmail
setgid_group = postdrop
smtpd_client_restrictions =
smtpd_hard_error_limit = 3
smtpd_helo_restrictions =
smtpd_recipient_limit = 100
smtpd_recipient_restrictions = permit_mynetworks,
permit_sasl_authenticated,  reject_unauth_destination,
check_client_access cidr:/etc/postfix/allowed_clients.cidr,
check_recipient_access hash:/etc/postfix/gone,
smtpd_sasl_auth_enable = yes
smtpd_sasl_local_domain = $myhostname
smtpd_sasl_security_options = noanonymous
smtpd_sender_restrictions =
smtpd_tls_auth_only = yes
smtpd_tls_cert_file = /etc/ssl/wildcard.crt
smtpd_tls_key_file = /etc/ssl/wildcard.key
smtpd_tls_loglevel = 1
smtpd_use_tls = yes
transport_maps = hash:/etc/postfix/transport
unknown_local_recipient_reject_code = 550
virtual_alias_maps = mysql:/etc/postfix/mysql_virtual_alias_maps.cf,
hash:/var/lib/mailman/data/virtual-mailman
virtual_gid_maps = static:207
virtual_mailbox_base = /var/virtual
virtual_mailbox_domains = mysql:/etc/postfix/mysql_virtual_domain_maps.cf
virtual_mailbox_limit = 51200000
virtual_mailbox_maps = mysql:/etc/postfix/mysql_virtual_mailbox_maps.cf
virtual_minimum_uid = 207
virtual_transport = virtual
virtual_uid_maps = static:207
myhost ~ #

allowed_clients.cidr contains Webroot's netblocks (they are our
outsourced anti-spam provider, which filters all inbound mail).

gone is a file for custom REJECT messages for ex-employees...

Thanks to anyone for taking a few minutes to look this over...

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

Brent Bice
Charles Marcus wrote:
> -Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
> 0366137DF24  2961207 Wed Jun 11 14:40:52  [hidden email]
> (conversation with smtp.nuvox.net[70.43.63.17] timed out while sending
> message body)
>                                          [hidden email]

    This means just what it says -- that postfix was sending the message
body (the part of the SMTP transaction after the DATA statement) and
never got an answer from smtp.nuvox.net saying it had accepted delivery.

    One thing seems a bit odd to me though.  If the to address was
[hidden email] why was the message being delivered to
smtp.nuvox.net? The MX records for hearst.com don't point there from
what I can see.

    Is smtp.nuvox.net a smart-relay that your server forwards all of
its outbound mail to?  Ah, I see below that it is.

    If it were me, I'd run a sniffer on the port 25 traffic to
smtp.nuvox.net, then tell postfix to flush the queue and look at the raw
smtp transaction.  When (I doubt "if" applies here - grin) you find that
smtp.nuvox.net isn't answering after the DATA section is ended with a
single "." and postfix times out waiting for the "message accepted for
delivery" answer, you can contact the owners of smtp.nuvox.net and
ask them why they're not acknowledging receipt of the message.
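
One way to set up such a capture is sketched below in Python, not part of the original post. It only assembles a `tcpdump` command line in the style of the sniffer example in the Postfix DEBUG_README; the output path is a placeholder, the relay address is the one shown in the mailq output, and the helper name `tcpdump_argv` is made up for illustration.

```python
import shlex


def tcpdump_argv(relay_ip, out_file, port=25):
    """Build a tcpdump command that records full packets to/from the relay."""
    return [
        "tcpdump",
        "-s", "0",        # capture whole packets, not just the headers
        "-w", out_file,   # write raw packets to a file for later analysis
        "host", relay_ip, "and", "port", str(port),
    ]


if __name__ == "__main__":
    # Placeholder output path; run the printed command as root.
    print(shlex.join(tcpdump_argv("70.43.63.17", "/tmp/relay-smtp.pcap")))
```

With the capture running, `postqueue -f` in a second terminal asks Postfix to retry the deferred messages, so the stalled DATA phase ends up in the capture file.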

Brent

Re: *Some* messages getting stuck in outbound queue...

mouss-2
In reply to this post by Charles Marcus
Charles Marcus wrote:

> Hello,
>
> (postconf -n at bottom of message)
>
> I've been experiencing a weird problem over the last two weeks or so,
> where certain messages are getting stuck in the outbound queue, and
> eventually being returned to the sender (one of our internal
> employees) as undeliverable. I have checked everything I can think of,
> and have had a ticket open with Nuvox (our ISP, and who we use for
> relaying all outbound mail), but so far they haven't been able to
> figure out why this is happening. I don't *think* this is a problem on
> my end, but I think a sanity check is in order...
>
> One weird thing is it seems to be limited to just a few domains,
> mainly vzw/sprint.blackberry.net, and hearst.com, with an occasional
> problem with a gmail or other random domain/recipient, but the ones
> that always end up being returned as undeliverable are the ones for
> blackberry.net and hearst.com...
>
> Also, we communicate a LOT with hearst.com, and this only happens
> occasionally, for certain messages...

in absence of other info, it looks like a timeout between your postfix
and your ISP relay. if you changed your timeouts (for smtp), then put
them back to their default. if you did not, see if you should increase...


Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
On 6/12/2008, mouss ([hidden email]) wrote:
> in absence of other info, it looks like a timeout between your
> postfix and your ISP relay. if you changed your timeouts (for smtp),
> then put them back to their default. if you did not, see if you
> should increase...

Nope - but wouldn't that show in postconf -n output if I had?

I haven't made any config changes except for adding in the
check_recipient_access for the custom reject messages for ex employees,
but that was after this problem had already shown up...
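
On the question above: `postconf -n` lists only parameters that are set explicitly in main.cf, so an overridden SMTP client timeout would appear there (an override passed with `-o` in master.cf would not). A minimal Python sketch of that check follows; it is not part of the original post, the function name `overridden_smtp_timeouts` is made up for illustration, and `EXCERPT` is a few lines copied from the configuration posted earlier.

```python
import re

# Matches client-side parameters such as smtp_data_xfer_timeout;
# smtpd_* (server-side) names do not start with "smtp_" and are skipped.
TIMEOUT_PARAM = re.compile(r"^(smtp_\w*timeout)\s*=\s*(.*)$")


def overridden_smtp_timeouts(postconf_n_output):
    """Return the SMTP client timeout settings found in postconf -n output."""
    found = {}
    for line in postconf_n_output.splitlines():
        m = TIMEOUT_PARAM.match(line.strip())
        if m:
            found[m.group(1)] = m.group(2).strip()
    return found


# A few lines from the postconf -n output posted at the top of the thread.
EXCERPT = """\
message_size_limit = 51200000
relayhost = [smtp.nuvox.net]
smtpd_hard_error_limit = 3
smtpd_recipient_limit = 100
transport_maps = hash:/etc/postfix/transport
"""

if __name__ == "__main__":
    print(overridden_smtp_timeouts(EXCERPT) or "no smtp_*_timeout overrides")
```

Run against the full listing, this finds no `smtp_*_timeout` entries, which matches the statement that the timeouts were left at their defaults.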

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
In reply to this post by Brent Bice
On 6/12/2008, Brent Bice ([hidden email]) wrote:
> If it were me, I'd run a sniffer on the port 25 traffic to
> smtp.nuvox.net, then tell postfix to flush the queue and look at the
> raw smtp transaction.  When (I doubt "if" applies here - grin) you
> find that smtp.nuvox.net isn't answering after the DATA section is
> ended with a single "." and postfix times out waiting for the
> "message accepted for delivery" answer, you can contact the owners of
> smtp.nuvox.net and ask them why they're not acknowledging
> receipt of the message.

I'm now leaning toward something weird in the body (or headers?) of
these messages... and I just noticed two more oddities...

Virtually every one - actually, EVERY one - is either From (received,
then forwarded to the user's blackberry.net or gmail.com account) or To
(originating with us) someone at hearst.com...

Secondly... what is the significance of the weird characters in the
user's localpart:

prvs=user=[hidden email]

?

Every message that is FROM someone at hearst.com has these weird
characters in their username localpart. The messages that are being sent
TO someone at hearst do not...

???

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

mouss-2
Charles Marcus wrote:

> On 6/12/2008, Brent Bice ([hidden email]) wrote:
>> If it were me, I'd run a sniffer on the port 25 traffic to
>> smtp.nuvox.net, then tell postfix to flush the queue and look at the
>> raw smtp transaction.  When (I doubt "if" applies here - grin) you
>> find that smtp.nuvox.net isn't answering after the DATA section is
>> ended with a single "." and postfix times out waiting for the
>> "message accepted for delivery" answer, you can contact the owners of
>> smtp.nuvox.net and ask them why they're not acknowledging
>> receipt of the message.
>
> I'm now leaning toward something weird in the body (or headers?) of
> these messages... and I just noticed two more oddities...
>
> Virtually every one - actually, EVERY one - is either From (received,
> then forwarded to the user's blackberry.net or gmail.com account) or To
> (originating with us) someone at hearst.com...

I don't understand this part. can you be more precise?
>
> Secondly... what is the significance of the weird characters in the
> users localpart:
>
> prvs=user=[hidden email]
>
> ?

BATV (Bounce Address Tag Validation). the address is [hidden email], but
they tag the MAIL FROM (return-path) so that they can block backscatter
sent to [hidden email]. google for more.
>
> Every message that is FROM someone at hearst.com has these weird
> characters in their username localpart.

it should only appear in the envelope from (MAIL FROM command), not in
header (From:, Reply-To:). If From was tagged that way, they would have
problems with major mailing lists (well, they may currently have
problems with lists that check the envelope sender...)

> The messages that are being sent TO someone at hearst do not...
>
> ???
>



Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
On 6/12/2008, mouss ([hidden email]) wrote:
>> Virtually every one - actually, EVERY one - is either From
>> (received, then forwarded to the user's blackberry.net or gmail.com
>> account) or To (originating with us) someone at hearst.com...

> I don't understand this part. can you be more precise?

Yeah, the way I worded that was confusing... how about...

Every message is either ORIGINALLY *From* someone at hearst.com (the
message was sent to one of our mailman lists, then, when delivered to the
users, forwarded through the user's alias to their blackberry or gmail
account), or To (originating with one of our users) someone at hearst.com...

>> Secondly... what is the significance of the weird characters in the
>> users localpart:
>>
>> prvs=user=[hidden email]

> BATV. the address is [hidden email], but they tag the MAIL.FROM
> (return-path) so that they can block backscatter sent to
> [hidden email]. google for more.

Ahh... ok, makes sense... thx...

Now... is it possible that this is somehow the problem? I don't believe
in coincidences... so could this cause some messages to have trouble
being relayed by nuvox?

>> Every message that is FROM someone at hearst.com has these weird
>> characters in their username localpart.

> it should only appear in the envelope from (MAIL FROM command), not
> in header (From:, Reply-To:). If From was tagged that way, they would
> have problems with major mailing lists (well, they may currently have
> problems with lists that check the envelope sender...)

Well, what I pasted was the output of mailq, so I'm assuming that was
the 'envelope from'...

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
Well, have some new information on this problem...

I changed my outbound relay from our ISP (Nuvox) to our outsourced
anti-spam service (webroot), did 'postsuper -r ALL', and 13 of the stuck
messages were delivered immediately (these were NDR's for the other
messages that weren't being delivered), but 7 of them are still being
deferred.

But, I noticed a warning that I don't remember seeing before - and it
hopefully is significant, since it immediately precedes the 7 individual
postfix/error deferred messages (it is the first line below):

Jun 14 13:13:28 myhost postfix/error[29218]: warning: open active
D69B7190A32: No such file or directory
Jun 14 13:13:28 myhost postfix/error[29218]: 89C7583E86:
to=<[hidden email]>, relay=none, delay=780,
delays=780/0.72/0/0.05, dsn=4.4.2, status=deferred (delivery temporarily
suspended: conversation with post18.emailfiltering.com[194.116.199.83]
timed out while sending message body)
Jun 14 13:13:28 myhost postfix/error[29220]: D49983BC497:
to=<[hidden email]>, orig_to=<[hidden email]>,
relay=none, delay=268577, delays=268577/0.48/0/0.04, dsn=4.4.2,
status=deferred (delivery temporarily suspended: conversation with
post18.emailfiltering.com[194.116.199.83] timed out while sending
message body)
Jun 14 13:13:28 myhost postfix/error[29219]: 9AC353BC160:
to=<[hidden email]>, relay=none, delay=253955,
delays=253955/0.48/0/0.04, dsn=4.4.2, status=deferred (delivery
temporarily suspended: conversation with
post18.emailfiltering.com[194.116.199.83] timed out while sending
message body)
Jun 14 13:13:28 myhost postfix/error[29221]: D73CB3BEB8A:
to=<[hidden email]>, relay=none, delay=341441,
delays=341440/0.48/0/0.03, dsn=4.4.2, status=deferred (delivery
temporarily suspended: conversation with
post18.emailfiltering.com[194.116.199.83] timed out while sending
message body)
Jun 14 13:13:28 myhost postfix/error[29223]: 0A9FC3BF3CD:
to=<[hidden email]>, relay=none, delay=155379,
delays=155379/0.21/0/0.02, dsn=4.4.2, status=deferred (delivery
temporarily suspended: conversation with
post18.emailfiltering.com[194.116.199.83] timed out while sending
message body)
Jun 14 13:13:28 myhost postfix/error[29218]: 2A6E23BF411:
to=<[hidden email]>, relay=none, delay=165235,
delays=165235/0.2/0/0.01, dsn=4.4.2, status=deferred (delivery
temporarily suspended: conversation with
post18.emailfiltering.com[194.116.199.83] timed out while sending
message body)
Jun 14 13:13:28 myhost postfix/error[29225]: D921B2A80E7:
to=<[hidden email]>, relay=none, delay=246232,
delays=246232/0.34/0/0.03, dsn=4.4.2, status=deferred (delivery
temporarily suspended: conversation with
post18.emailfiltering.com[194.116.199.83] timed out while sending
message body)

Hopefully this will mean something to someone...

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

Noel Jones-2
Charles Marcus wrote:

> Well, have some new information on this problem...
>
> I changed my outbound relay from our ISP (Nuvox) to our outsourced
> anti-spam service (webroot), did 'postsuper -r ALL', and 13 of the stuck
> messages were delivered immediately (these were NDR's for the other
> messages that weren't being delivered), but 7 of them are still being
> deferred.
>
> But, I noticed a warning that I don't remember seeing before - and it
> hopefully is significant, since it immediately precedes the 7 individual
> postfix/error deferred messages (it is the first line below):
>
> Jun 14 13:13:28 myhost postfix/error[29218]: warning: open active
> D69B7190A32: No such file or directory
> Jun 14 13:13:28 myhost postfix/error[29218]: 89C7583E86:
> to=<[hidden email]>, relay=none, delay=780,
> delays=780/0.72/0/0.05, dsn=4.4.2, status=deferred (delivery temporarily
> suspended: conversation with post18.emailfiltering.com[194.116.199.83]
> timed out while sending message body)
> Jun 14 13:13:28 myhost postfix/error[29220]: D49983BC497:
> to=<[hidden email]>, orig_to=<[hidden email]>,
> relay=none, delay=268577, delays=268577/0.48/0/0.04, dsn=4.4.2,
> status=deferred (delivery temporarily suspended: conversation with
> post18.emailfiltering.com[194.116.199.83] timed out while sending
> message body)
> Jun 14 13:13:28 myhost postfix/error[29219]: 9AC353BC160:
> to=<[hidden email]>, relay=none, delay=253955,
> delays=253955/0.48/0/0.04, dsn=4.4.2, status=deferred (delivery
> temporarily suspended: conversation with
> post18.emailfiltering.com[194.116.199.83] timed out while sending
> message body)
> Jun 14 13:13:28 myhost postfix/error[29221]: D73CB3BEB8A:
> to=<[hidden email]>, relay=none, delay=341441,
> delays=341440/0.48/0/0.03, dsn=4.4.2, status=deferred (delivery
> temporarily suspended: conversation with
> post18.emailfiltering.com[194.116.199.83] timed out while sending
> message body)
> Jun 14 13:13:28 myhost postfix/error[29223]: 0A9FC3BF3CD:
> to=<[hidden email]>, relay=none, delay=155379,
> delays=155379/0.21/0/0.02, dsn=4.4.2, status=deferred (delivery
> temporarily suspended: conversation with
> post18.emailfiltering.com[194.116.199.83] timed out while sending
> message body)
> Jun 14 13:13:28 myhost postfix/error[29218]: 2A6E23BF411:
> to=<[hidden email]>, relay=none, delay=165235,
> delays=165235/0.2/0/0.01, dsn=4.4.2, status=deferred (delivery
> temporarily suspended: conversation with
> post18.emailfiltering.com[194.116.199.83] timed out while sending
> message body)
> Jun 14 13:13:28 myhost postfix/error[29225]: D921B2A80E7:
> to=<[hidden email]>, relay=none, delay=246232,
> delays=246232/0.34/0/0.03, dsn=4.4.2, status=deferred (delivery
> temporarily suspended: conversation with
> post18.emailfiltering.com[194.116.199.83] timed out while sending
> message body)
>
> Hopefully this will mean something to someone...
>

I don't think that's related.

We really need to see a packet capture from your server.  If
your server is behind a firewall or other device, we may also
need a capture between that device and the internet to compare
with what your server originally sends.
http://www.postfix.org/DEBUG_README.html#sniffer

If just one destination was behaving this way, I would blame
that destination.  Since you're having the same trouble with
two different destinations, it sounds as if something on your
end is interfering with the network.  Default guess is a
broken firewall on your end.

Further analysis requires a packet capture.

--
Noel Jones

Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
On 6/14/2008, Noel Jones ([hidden email]) wrote:
> If just one destination was behaving this way, I would blame that
> destination.

Hi Noel,

Thanks for taking a peek...

Maybe you missed this, but every message that is getting stuck
ORIGINATED (meaning, the stuck message *is* either a Reply TO or a Fwd
OF) with a single domain: hearst.com.

So, yes, every message getting stuck has this one single external domain
in common.

I'm also fairly certain that they ALL have attachments - the smallest is
14K, but none is larger than 2.9MB...

I have an example message (2.7MB) that I can successfully send using my
Gmail account (and using Gmail's SMTP server) - this message also fails
to send

> Since you're having the same trouble with two different destinations,
> it sounds as if something on your end is interfering with the
> network. Default guess is a broken firewall on your end.

Well, I might agree, but - there have been no changes in the firewall
rules on the MTA for a long time (I'm the only one with *any* access,
much less root access), and, I have an example message that will fail to
send with my Dreamhost-hosted personal account (same kind of time-out
problem), but will SUCCESSFULLY send using my Gmail account (and using
Gmail's SMTP server), that I have sent to both Nuvox's and Webroot's tech
support...

I'm still wondering if this is related to the problem ORIGIN domain's use
of BATV...

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

Noel Jones-2
Charles Marcus wrote:

> On 6/14/2008, Noel Jones ([hidden email]) wrote:
>> If just one destination was behaving this way, I would blame that
>> destination.
>
> Hi Noel,
>
> Thanks for taking a peek...
>
> Maybe you missed this, but every message that is getting stuck
> ORIGINATED (meaning, the stuck message *is* either a Reply TO or a Fwd
> OF) with a single domain: hearst.com.

This isn't related unless there is some processing ON YOUR
END of the outgoing message.

>
> So, yes, every message getting stuck has this one single external domain
> in common.

We're concerned about the other end of the TCP connection, not
the address on the mail.

>
> I'm also fairly certain that they ALL have attachments - the smallest is
> 14K, but none is larger than 2.9MB...

Do all "big" messages fail?  Size shouldn't matter.

>
> I have an example message (2.7MB) that I can successfully send using my
> Gmail account (and using Gmail's SMTP server) - this message also fails
> to send

Because it's going to a different TCP end point?  Without
context this doesn't mean much.

>
>> Since you're having the same trouble with two different destinations,
>> it sounds as if something on your end is interfering with the
>> network. Default guess is a broken firewall on your end.
>
> Well, I might agree, but - there have been no changes in the firewall
> rules on the MTA for a long time (I'm the only one with *any* access,
> much less root access), and, I have an example message that will fail to
> send with my Dreamhost-hosted personal account (same kind of time-out
> problem), but will SUCCESSFULLY send using my Gmail account (and using
> Gmail's SMTP server), that I have sent to both Nuvox's and Webroot's tech
> support...
>
> I'm still wondering if this is related to the problem ORIGIN domain's use
> of BATV...
>

Either two unrelated providers have the same bug in their
system or something is broken on your end.  And this is a
network issue, not a postfix issue.

--
Noel Jones

Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
On 6/14/2008 2:45 PM, Noel Jones wrote:
>>> If just one destination was behaving this way, I would blame that
>>> destination.

>> Maybe you missed this, but every message that is getting stuck
>> ORIGINATED (meaning, the stuck message *is* either a Reply TO or a Fwd
>> OF) with a single domain: hearst.com.

> This isn't related unless there is some processing ON YOUR END of
> the outgoing message.

I'm not trying to be difficult... honest... ;) but...

No processing beyond simply sending the message.

Do I understand you to be saying that it is impossible (or unlikely)
that hearst.com's implementation of BATV - or something else on their
end - is mangling CERTAIN messages with attachments in such a way that
it causes this problem?

It just seems to me that since it is only *certain* messages (far more
are delivered) that originate from hearst.com that get stuck, that it
must be related *somehow*.

> We're concerned about the other end of the TCP connection, not the
> address on the mail.

I'm not talking about who the mail is 'addressed to' - I'm talking about
the CONTENTS of the message itself. It seems to me that:

1. the fact that I get the same KIND of error (times out while trying to
send) when using my Dreamhost account - which means my postfix is not
involved - *and*

2. the fact that doing the same thing with Gmail's SMTP server WORKS

together mean that it is unlikely to be a firewall issue - otherwise
it wouldn't work when using Gmail's SMTP server, right?

The only thing left is that the message itself is somehow malformed in
such a way that some MTAs choke on it and some don't.

Sadly, I don't know what to look for when looking at the contents of the
message.

>> I'm also fairly certain that they ALL have attachments - the smallest
>> is 14K, but none is larger than 2.9MB...
>
> Do all "big" messages fail?  Size shouldn't matter.

No, max size has always been set to 50MB, and we send a LOT of messages
all day long, many with large attachments, with no problems at all.

>> I have an example message (2.7MB) that I can successfully send using
>> my Gmail account (and using Gmail's SMTP server) - this message also
>> fails to send

> Because it's going to a different TCP end point?  Without context this
> doesn't mean much.

Doesn't it at least mean that it isn't a firewall issue? And it is the
same recipient, but a different sending MTA.

> Either two unrelated providers have the same bug in their system or
> something is broken on your end.

Or, messages coming from hearst.com sometimes are malformed in such a
way that I can receive them, but not reply to or forward them?

Messages composed from scratch to hearst.com are sent without problem -
both with and without attachments - and a large majority of Replies to
and Forwards of their messages are delivered without a problem.

Again, it is ONLY messages that ORIGINATED FROM someone at hearst.com
AND have attachments, AND that are then replied to OR FORWARDED...

> and this is a network issue, not a postfix issue.

I don't think so, but I guess I could be wrong. It just seems to me that
if it were a network issue:

1. it would be affecting FAR more than just a few occasional messages
that fit very narrow criteria - one of which being that they originated
from one single domain (hearst.com), and

2. it would not successfully send when using smtp.gmail.com?

--

Best regards,

Charles


Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
In reply to this post by Noel Jones-2
On 6/14/2008, Noel Jones ([hidden email]) wrote:
> and this is a network issue, not a postfix issue.

Also... maybe I'm just dense, but...

These messages that get stuck in my queue will send/receive fine on my
LOCAL postfix, as long as they aren't going outside our network.

I have a message in my Inbox that was (re)sent from one of the users'
Sent folder to me, and I received it fine.

If I compose a brand new message, and attach this message to the fresh
message, and send it to one of my personal accounts outside our network
- AGAIN, it gets stuck in the queue.

Doesn't that say plain and simple that there is something malformed
about that message?

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

Noel Jones-2
Charles Marcus wrote:

> On 6/14/2008, Noel Jones ([hidden email]) wrote:
>> and this is a network issue, not a postfix issue.
>
> Also... maybe I'm just dense, but...
>
> These messages that get stuck in my queue will send/receive fine on my
> LOCAL postfix, as long as they aren't going outside our network.
>
> I have a message in my Inbox that was (re)sent from one of the users'
> Sent folder to me, and I received it fine.
>
> If I compose a brand new message, and attach this message to the fresh
> message, and send it to one of my personal accounts outside our network
> - AGAIN, it gets stuck in the queue.
>
> Doesn't that say plain and simple that there is something malformed
> about that message?
>

malformed messages won't cause network timeouts unless
something is getting stuck doing some sort of content inspection.

--
Noel Jones

Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
On 6/14/2008 6:38 PM, Noel Jones wrote:
> malformed messages won't cause network timeouts unless something is
> getting stuck doing some sort of content inspection.

Ok, then, how do you explain this:

We have a test account set up for this company at gmail, so, I just
reconfigured my postfix to use smtp.gmail.com as its outbound relay,
requeued one of the problem messages, and presto, it went straight out,
no problem.

I'm not the brightest bulb on the shelf, but doesn't this PROVE that
there are:

1. no hardware problems,

2. no network/firewall problems, and

3. no postfix problems?

So what else could it be but a problem with something in the CONTENT of
the message itself, as I've been saying all along?

I can't use gmail for my permanent relay, so I've got to get this
problem fixed.
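
The relay swap described above can also be approximated outside postfix, which takes the local MTA out of the picture entirely. Here is a rough Python sketch, assuming the problem message has been saved to a file; the try_relay name, the file name, the relay hosts, ports, and credentials are all placeholders, not anything taken from this setup.

```python
import smtplib
from typing import Optional


def try_relay(raw: bytes, sender: str, rcpt: str, host: str,
              port: int = 25, timeout: float = 60.0,
              user: Optional[str] = None,
              password: Optional[str] = None) -> str:
    """Hand one raw message to one relay and report what happened.

    Returns "sent" if the relay accepted the message, otherwise a
    "failed: ..." string naming the exception (timeouts included).
    """
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as smtp:
            smtp.ehlo()
            # Use STARTTLS only if the relay advertises it.
            if smtp.has_extn("starttls"):
                smtp.starttls()
                smtp.ehlo()
            if user is not None and password is not None:
                smtp.login(user, password)
            # sendmail() accepts the message as raw bytes.
            smtp.sendmail(sender, [rcpt], raw)
        return "sent"
    except (smtplib.SMTPException, OSError) as exc:
        # Socket timeouts and refused connections are OSError subclasses.
        return f"failed: {type(exc).__name__}: {exc}"
```

Reading the saved message with open("problem.eml", "rb").read() and calling try_relay once per relay (for example once with the ISP relay on port 25 and once with a Gmail account on port 587) gives a side-by-side result for the exact same bytes. A timeout against one relay and "sent" against the other points at the relay path rather than at the sending host, though a packet capture would still be needed to see where the conversation stalls.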

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
On 6/14/2008 6:38 PM, Noel Jones wrote:
> malformed messages won't cause network timeouts unless something is
> getting stuck doing some sort of content inspection.

And to repeat - we do ZERO content filtering, either inbound (webroot
does inbound for us) or outbound.

I posted postconf -n at the beginning of this thread. I can provide
master.cf content if necessary...

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

mouss-2
Charles Marcus wrote:
> On 6/14/2008 6:38 PM, Noel Jones wrote:
>> malformed messages won't cause network timeouts unless something is
>> getting stuck doing some sort of content inspection.
>
> And to repeat - we do ZERO content filtering, either inbound (webroot
> does inbound for us) or outbound.

Does Webroot implement "real time" proxying (in contrast to "store and
forward")? Can you send directly to hearst and see if you get the problem?

>
> I posted postconf -n at the beginning of this thread. I can provide
> master.cf content if necessary...
>


Re: *Some* messages getting stuck in outbound queue...

Noel Jones-2
In reply to this post by Charles Marcus
Charles Marcus wrote:

> On 6/14/2008 6:38 PM, Noel Jones wrote:
>> malformed messages won't cause network timeouts unless something is
>> getting stuck doing some sort of content inspection.
>
> And to repeat - we do ZERO content filtering, either inbound (webroot
> does inbound for us) or outbound.
>
> I posted postconf -n at the beginning of this thread. I can provide
> master.cf content if necessary...
>

So what happens if you hand-craft a message with suspect
content?  Is the problem reproducible?

If messages with certain content are causing network timeouts,
then *someone* is doing broken content inspection.

This broken content inspection could be at either your end or
at your provider's end.  You've tried two different providers
that exhibit the same symptom.

One thing I can say for sure is this isn't a postfix problem.

A packet capture is the next thing to look at.  Also, if you
can post one of the problem messages somewhere that might be
useful too.
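
One way to hand-craft such test messages is to split a saved problem message into one small message per MIME part and send each separately, to see which part (if any) reproduces the timeout. This is only a sketch using Python's standard email package; the function name and addresses are hypothetical. Note that each part is decoded and re-encoded as base64, so this tests the decoded content only. If the trigger lies in the original headers or raw encoding, it will not reproduce this way.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List


def one_message_per_part(raw: bytes, sender: str,
                         rcpt: str) -> List[EmailMessage]:
    """Build one test message per leaf MIME part of a raw message."""
    original = message_from_bytes(raw, policy=policy.default)
    tests: List[EmailMessage] = []
    for index, part in enumerate(original.walk()):
        # Containers (multipart/*, message/rfc822) are skipped; walk()
        # visits their children on its own.
        if part.is_multipart():
            continue
        payload = part.get_payload(decode=True) or b""
        content_type = part.get_content_type()
        maintype, subtype = content_type.split("/", 1)

        test = EmailMessage()
        test["From"] = sender
        test["To"] = rcpt
        test["Subject"] = f"timeout test: part {index} ({content_type})"
        test.set_content("Test message carrying one part of the problem mail.")
        test.add_attachment(
            payload,
            maintype=maintype,
            subtype=subtype,
            filename=part.get_filename() or f"part{index}.bin",
        )
        tests.append(test)
    return tests
```

If every single-part message goes through but the original still times out, the trigger is more likely the combination, the headers, or the raw encoding, and posting the untouched message plus a packet capture becomes the more useful next step.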

--
Noel Jones

Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
On 6/14/2008, Noel Jones ([hidden email]) wrote:
> So what happens if you hand-craft a message with suspect content?  Is
> the problem reproducible?

Depends on precisely what you mean by 'hand-craft a message with suspect
content'.

I have already said I created a new message from scratch, and then
ATTACHED in its entirety one of the problem messages, and it still gets
stuck.

> If messages with certain content are causing network timeouts, then
> *someone* is doing broken content inspection.

I agree - and both Nuvox and Webroot do content inspection - did I not
already say so? If not, my apologies... I was more focused on how to
prove this problem was on both Nuvox's and Webroot's end.

I did get thrown initially by the fact that they both had problems...

> This broken content inspection could be at either your end or at your
> provider's end.  You've tried two different providers that exhibit
> the same symptom.
>
> One thing I can say for sure is this isn't a postfix problem.

I agree... in my initial message outlining this problem I said as much,
I was just asking for another pair of eyes.

> A packet capture is the next thing to look at.  Also, if you can post
> one of the problem messages somewhere that might be useful too.

I don't think it is necessary at this point... now that I know for sure
that it is not my postfix, and can prove it to both Nuvox and Webroot,
I'm confident I'll be able to push them for a resolution.

Sorry to drag this out so long, and I do appreciate the responses...
made me think enough to finally try setting up gmail as a relay for
postfix, which confirmed that the problem is NOT postfix...

--

Best regards,

Charles

Re: *Some* messages getting stuck in outbound queue...

Charles Marcus
In reply to this post by mouss-2
On 6/14/2008 7:00 PM, mouss wrote:
>>> malformed messages won't cause network timeouts unless something
>>> is getting stuck doing some sort of content inspection.

>> And to repeat - we do ZERO content filtering, either inbound
>> (webroot does inbound for us) or outbound.

> Does Webroot implement "real time" proxying (in contrast to "store and
> forward")? Can you send directly to hearst and see if you get the
> problem?

Well... I could, but I'm confident that it would send just fine, since:

1. this problem just started occurring over the last two weeks or so,

2. this server has been working just fine like this for over 3 YEARS,

3. there have been no s/w upgrades in the last 2 months,

4. the problem originally started when I was still using our ISP (Nuvox)
as our relay

5. only a very few messages, fitting very specific criteria, are affected

6. it is limited to messages ORIGINATING from this one domain

Again (and I said this from the beginning) - I do not and never did
believe this was a postfix problem - I was just asking for a sanity
check because I had tried everything else I could think of.

I'll ask again... since it WORKS using gmail, both directly, AND
relaying via postfix, can we at least agree that there MUST be
*something* about the CONTENT of the messages *themselves* that both
Nuvox's and Webroot's servers don't like?

Lastly, and again as I said in my initial post, the problem domain -
hearst.com - is using BATV. I don't know when they started using it, but
I have already made a formal inquiry. If the time they started using it
was within the last few weeks, then...

--

Best regards,

Charles