Increasing logging on queue messages

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

Increasing logging on queue messages

Alex Regan
Hi,

I have a postfix mail server that successfully delivers thousands of
messages per day, primarily to the same smarthost connected to it.
However, I have a number of messages that are stuck in the queue with
timeout errors:

 said: 421 smtp01.example.com Error: timeout exceeded (in reply to end
of DATA command))

I've read through the debugging and logging pages at postfix.org, and
have increased the logging, but once you increase it to a certain
point, it becomes impossible to separate the signal from the noise.
How can I increase (or capture) the logging for either specific
messages, or increase it only for messages stuck in the queue?

Can anyone give me an idea why this problem would occur in the first
place, only on specific messages? It doesn't seem to be related to
size, load on the server or receiving system, or network issue,
because it delivers tens of thousands of other messages without
incident. There are no errors on the ethernet interfaces.

There are also errors that indicate the message may be delivered more than once:

timed out while sending end of data -- message may be sent more than once)

It's difficult to track these down, though, so I can't tell if they
actually have been delivered.

Thanks,
Alex
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Wietse Venema
MySQL Student:
> Hi,
>
> I have a postfix mail server that successfully delivers thousands of
> messages per day, primarily to the same smarthost connected to it.
> However, I have a number of messages that are stuck in the queue with
> timeout errors:
>
>  said: 421 smtp01.example.com Error: timeout exceeded (in reply to end
> of DATA command))

Unfortunately, you deleted lots of useful information from the
logging, including the break-down of handshake delays and of
transmission delays.

When the receiver times out receiving data, the problem is more
likely to be at the TCP level, so no amount of verbose Postfix
logging is going to help.

Instead, consider diagnosing this with tcpdump. Common TCP-level
problems are:

- Broken TCP window scaling support in firewall or end-host. Postfix
  2.6 and later have a main.cf:tcp_windowsize parameter. Setting
  this to some value under 65536 will work around this, otherwise
  you'd have to poke the kernel with sysctl or whatever.

- Broken IP path MTU discovery. The symptom is that TCP packets
  larger than some size never get acknowledged.

- Other TCP-level breakage that only shows up in packet-level
  recordings.

> I've read through the debugging and logging pages at postfix.org, and
> have increased the logging, but once you increase it to a certain
> point, it becomes impossible to separate the signal from the noise.
> How can I increase (or capture) the logging for either specific
> messages, or increase it only for messages stuck in the queue?

debug_peer_list can turn on logging for specific sites, not messages.

        Wietse
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Alex Regan
Hi,

>>  said: 421 smtp01.example.com Error: timeout exceeded (in reply to end
>> of DATA command))
>
> Unfortunately, you deleted lots of useful information from the
> logging, including the break-down of handshake delays and of
> transmission delays.

I wasn't sure that I should post the whole queued message here.

> When the receiver times out receiving data, the problem is more
> likely to be at the TCP level, so no amount of verbose Postfix
> logging is going to help.

Ah, makes sense.

> - Broken TCP window scaling support in firewall or end-host. Postfix
>  2.6 and later have a main.cf:tcp_windowsize parameter. Setting
>  this to some value under 65536 will work around this, otherwise
>  you'd have to poke the kernel with sysctl or whatever.

Many of the messages themselves are less than 65k. Does this make a difference?

Why would it only be broken handling some messages and not others? Or
is that what only tcpdump would be able to tell us?

> debug_peer_list can turn on logging for specific sites, not messages.

It's really more of a mail router, so only handles mail for a single domain :-(

Thanks again,
Alex
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Barney Desmond
2009/10/10 MySQL Student <[hidden email]>:
>> Unfortunately, you deleted lots of useful information from the
>> logging, including the break-down of handshake delays and of
>> transmission delays.
>
> I wasn't sure that I should post the whole queued message here.

The logs for a single message are usually about half a dozen lines or
less. Ingress, cleanup, queue manager, egress, removal; not big at
all, and very useful. Mask "private" information if you must, but keep
in mind that it can make it harder for others to help diagnose
problems.

>> - Broken TCP window scaling support in firewall or end-host. Postfix
>>  2.6 and later have a main.cf:tcp_windowsize parameter. Setting
>>  this to some value under 65536 will work around this, otherwise
>>  you'd have to poke the kernel with sysctl or whatever.
>
> Many of the messages themselves are less than 65k. Does this make a difference?

I can't comment with authority, but I'd say "not directly". The TCP
window generally counts bytes on the wire, which means there's other
overheads in play, not just the message data.

> Why would it only be broken handling some messages and not others? Or
> is that what only tcpdump would be able to tell us?

My first thought is, "we need to know more about these messages". You
say they're mostly for one domain, but you don't really specify
whether it's in or out, etc. Are the problematic messages perhaps:
- all from the same domain, to other domains?
- all to the same domain, from other domains?
- all to/from one user/address?

>> debug_peer_list can turn on logging for specific sites, not messages.
>
> It's really more of a mail router, so only handles mail for a single domain :-(

You mentioned a relayhost (smarthost) earlier. Do all mails go through
it? Do the messages going via the relayhost have problems? Maybe the
ones that don't go via the relayhost have problems? If you can narrow
that down a little, you could use debug_peer_list on that host
specifically, which may help.
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Stan Hoeppner
Barney Desmond put forth on 10/10/2009 12:19 AM:

> Mask "private" information if you must, but keep
> in mind that it can make it harder for others to help diagnose
> problems.

Absolutely correct.  Obfuscate left-hand-side only to prevent valid
addresses being scraped from this public mailing list by spammers.
Please do _not_ obfuscate domain names, IP addresses of servers, and
host FQDNs.  Many problems reported here are DNS or network related.
Without real domain names, real IPs, and real host names we can't help
people until prying that info from them in successive emails.  It's
better to post the real info up front.  There's not much a criminal can
do with just a domain, host name, or IP address.  And no one is going to
sue you for listing this information in an attempt to solve a problem.
You're not really protecting anything by obfuscating those.

--
Stan
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Wietse Venema
In reply to this post by Alex Regan
MySQL Student:
> Hi,
>
> >> ?said: 421 smtp01.example.com Error: timeout exceeded (in reply to end
> >> of DATA command))
> >
> > Unfortunately, you deleted lots of useful information from the
----vvvvvvv
> > logging, including the break-down of handshake delays and of
----^^^^^^^
> > transmission delays.
>
> I wasn't sure that I should post the whole queued message here.

I think I was asking for LOGGING, in my reply above.

Please make a tcpdump recording. It will show why your KERNEL TCP
stack stops receiving ackowledgments from the remote side, or why
your KERNEL TCP stack stops sending to the remote side.

The TCP window scale factor is chosen by your KERNEL. Your KERNEL
has no clue about how much mail will be sent over the connection.

Finally, it is possible that hardware mis-behaves with certain
bit patterns. One should never exclude the unlikely.

        Wietse
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Alex Regan
In reply to this post by Barney Desmond
Hi,

>>> Unfortunately, you deleted lots of useful information from the
>>> logging, including the break-down of handshake delays and of
>>> transmission delays.
>>
>> I wasn't sure that I should post the whole queued message here.
>
> The logs for a single message are usually about half a dozen lines or
> less. Ingress, cleanup, queue manager, egress, removal; not big at
> all, and very useful. Mask "private" information if you must, but keep
> in mind that it can make it harder for others to help diagnose
> problems.

I appreciate your comments about obfuscation. Hopefully I haven't
mangled the info too much. My log is more than just a few lines.
Perhaps I have the logging level turned up too high. Isn't there a way
to include more detailed logging directly in the deferred/ queue file
itself?

There are currently 19 messages in the second queue, waiting to be
delivered. How can I be sure that when I manually flush the queue to
collect the log information, that I'm following the log information
for a specific message?

I've done my best to include the relevant information, and have pasted
it at the end of this message.

>> Why would it only be broken handling some messages and not others? Or
>> is that what only tcpdump would be able to tell us?
>
> My first thought is, "we need to know more about these messages". You
> say they're mostly for one domain, but you don't really specify
> whether it's in or out, etc. Are the problematic messages perhaps:

All the messages are outbound. It receives mail from only one server,
and that's the domain primary which does content scanning and accepts
it for the domain from the Internet. This server basically forwards it
from the DMZ to an internal Exchange server.

> - all from the same domain, to other domains?

The mail is all destined for the same internal domain on the same
internal server, but may come from any Internet email address.

> - all to the same domain, from other domains?

The mail could originate from any domain.

> - all to/from one user/address?

The mail could be destined for any user in the same domain.

>>> debug_peer_list can turn on logging for specific sites, not messages.
>>
>> It's really more of a mail router, so only handles mail for a single domain :-(
>
> You mentioned a relayhost (smarthost) earlier. Do all mails go through
> it? Do the messages going via the relayhost have problems? Maybe the
> ones that don't go via the relayhost have problems? If you can narrow
> that down a little, you could use debug_peer_list on that host
> specifically, which may help.

All the email is forwarded to just one server.

Below is the log output.

Oct 11 17:06:06 mailrelay postfix/smtpd[4531]: A4B16118006:
client=smtp01.example.com[192.168.100.10]
Oct 11 17:06:06 mailrelay postfix/cleanup[4643]: A4B16118006:
message-id=<4665460.1255305961508.JavaMail.mms@atcac007>
Oct 11 17:06:06 mailrelay postfix/smtpd[4531]: disconnect from
smtp01.example.com[192.168.100.10]
Oct 11 17:06:06 mailrelay postfix/qmgr[13436]: A4B16118006:
from=<[hidden email]>, size=3924, nrcpt=1 (queue active)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 220 localhost ESMTP Postfix
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: connect from
localhost.localdomain[127.0.0.1]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: watchdog_pat: 0x808eff8
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: <
localhost.localdomain[127.0.0.1]: EHLO localhost
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 250-localhost
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 250-PIPELINING
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 250-SIZE 13844000
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 250-ETRN
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 250-XVERP
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 250 8BITMIME
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: watchdog_pat: 0x808eff8
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: <
localhost.localdomain[127.0.0.1]: MAIL FROM:<[hidden email]>
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: extract_addr: input:
<[hidden email]>
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: extract_addr: result:
[hidden email]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: connect to subsystem
public/cleanup2
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: public/cleanup2
socket: wanted attribute: queue_id
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: queue_id
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute
value: D98301A8005
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: public/cleanup2
socket: wanted attribute: (list terminator)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: (end)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: send attr flags = 2
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: D98301A8005:
client=localhost.localdomain[127.0.0.1]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 250 Ok
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: watchdog_pat: 0x808eff8
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: <
localhost.localdomain[127.0.0.1]: RCPT TO:<[hidden email]>
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: extract_addr: input:
<[hidden email]>
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: extract_addr: result:
[hidden email]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: fsspace: .: block
size 4096, blocks free 13737093
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: smtpd_check_size:
blocks 4096 avail 13737093 min_free 0 size 0
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: generic_checks: START
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: generic_checks:
name=permit_mynetworks
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: permit_mynetworks:
localhost.localdomain 127.0.0.1
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: match_hostname:
localhost.localdomain ~? 127.0.0.0/8
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: match_hostaddr:
127.0.0.1 ~? 127.0.0.0/8
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: generic_checks:
name=permit_mynetworks status=1
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: smtpd_check_rcptmap:
[hidden email]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: send attr request = rewrite
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: send attr rule = canonicalize
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: send attr address =
[hidden email]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: private/rewrite
socket: wanted attribute: address
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: address
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute
value: [hidden email]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: private/rewrite
socket: wanted attribute: (list terminator)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: (end)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: rewrite_clnt:
canonicalize: [hidden email] -> [hidden email]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: send attr request = resolve
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: send attr address =
[hidden email]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: private/rewrite
socket: wanted attribute: transport
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: transport
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute value: smtp
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: private/rewrite
socket: wanted attribute: nexthop
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: nexthop
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute
value: mail.bestwestern.com
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: private/rewrite
socket: wanted attribute: recipient
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: recipient
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute
value: [hidden email]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: private/rewrite
socket: wanted attribute: flags
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: flags
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute value: 0
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: private/rewrite
socket: wanted attribute: (list terminator)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: (end)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: resolve_clnt_query:
`[hidden email]' -> t=`smtp' h=`mail.bestwestern.com'
r=`[hidden email]'
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: ctable_locate:
install entry key [hidden email]
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: maps_find:
bestwestern.com: not found
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 250 Ok
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: watchdog_pat: 0x808eff8
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: <
localhost.localdomain[127.0.0.1]: DATA
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 354 End data with <CR><LF>.<CR><LF>
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: public/cleanup2
socket: wanted attribute: status
Oct 11 17:06:06 mailrelay postfix_2/cleanup[4450]: D98301A8005:
message-id=<4665460.1255305961508.JavaMail.mms@atcac007>
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: status
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute value: 0
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: public/cleanup2
socket: wanted attribute: reason
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: reason
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute value: (end)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: public/cleanup2
socket: wanted attribute: (list terminator)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: input attribute name: (end)
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 250 Ok: queued as D98301A8005
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: watchdog_pat: 0x808eff8
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: <
localhost.localdomain[127.0.0.1]: QUIT
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: >
localhost.localdomain[127.0.0.1]: 221 Bye
Oct 11 17:06:06 mailrelay postfix_2/smtpd[4449]: disconnect from
localhost.localdomain[127.0.0.1]
Oct 11 17:06:06 mailrelay amavis[29496]: (29496-338) Passed,
<[hidden email]> -> <[hidden email]>, Message-ID:
<4665460.1255305961508.JavaMail.mms@atcac007> [A4B16118006]
Oct 11 17:06:06 mailrelay postfix_2/qmgr[13492]: D98301A8005:
from=<[hidden email]>, size=4362, nrcpt=1 (queue active)
Oct 11 17:06:06 mailrelay postfix/smtp[4594]: A4B16118006:
to=<[hidden email]>, relay=127.0.0.1[127.0.0.1], delay=0,
status=sent (250 2.6.0 Ok, id=29496-338, from MTA: 250 Ok: queued as
D98301A8005)

Thanks,
Alex
Reply | Threaded
Open this post in threaded view
|

Increasing logging on queue messages

Stan Hoeppner
MySQL Student put forth on 10/11/2009 11:53 PM:

> I appreciate your comments about obfuscation. Hopefully I haven't
> mangled the info too much. My log is more than just a few lines.
> Perhaps I have the logging level turned up too high. Isn't there a way
> to include more detailed logging directly in the deferred/ queue file
> itself?

Ah, found it.  The problem is you've got the Postfix host relaying mail
to itself.  It's right in your log entries:

to=<[hidden email]>, relay=127.0.0.1[127.0.0.1]

--
Stan
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Sahil Tandon
In reply to this post by Alex Regan
On Oct 12, 2009, at 1:14 AM, Stan Hoeppner <[hidden email]>  
wrote:

> MySQL Student put forth on 10/11/2009 11:53 PM:
>
>> I appreciate your comments about obfuscation. Hopefully I haven't
>> mangled the info too much. My log is more than just a few lines.
>> Perhaps I have the logging level turned up too high. Isn't there a  
>> way
>> to include more detailed logging directly in the deferred/ queue file
>> itself?
>
> Ah, found it.  The problem is you've got the Postfix host relaying  
> mail
> to itself.  It's right in your log entries:
>
> to=<[hidden email]>, relay=127.0.0.1[127.0.0.1]

That is because the OP is using amavisd-new running on localhost, so  
Postfix (presumably because it has so been configured) is passing mail  
to amavisd-new on 127.0.0.1.  Not so problematic in and of itself.
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Alex Regan
In reply to this post by Stan Hoeppner
Hi,

> Ah, found it.  The problem is you've got the Postfix host relaying mail
> to itself.  It's right in your log entries:
>
> to=<[hidden email]>, relay=127.0.0.1[127.0.0.1]

Yes, my apologies; as Sahil said, I'm using amavisd-new, although there is
no longer spam and virus scanning happening on this host.

The message is just getting requeued back to the second queue.

Thanks,
Alex
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Wietse Venema
MySQL Student:
[ Charset ISO-8859-1 unsupported, converting... ]

> Hi,
>
> > Ah, found it. ?The problem is you've got the Postfix host relaying mail
> > to itself. ?It's right in your log entries:
> >
> > to=<[hidden email]>, relay=127.0.0.1[127.0.0.1]
>
> Yes, my apologies; as Sahil said, I'm using amavisd-new, although there is
> no longer spam and virus scanning happening on this host.
>
> The message is just getting requeued back to the second queue.

Have you already looked at a tcpdump recording as requested?

        Wietse
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Alex Regan
Hi,

>> > Ah, found it. ?The problem is you've got the Postfix host relaying mail
>> > to itself. ?It's right in your log entries:
>> >
>> > to=<[hidden email]>, relay=127.0.0.1[127.0.0.1]
>>
>> Yes, my apologies; as Sahil said, I'm using amavisd-new, although there is
>> no longer spam and virus scanning happening on this host.
>>
>> The message is just getting requeued back to the second queue.
>
> Have you already looked at a tcpdump recording as requested?

I will try and investigate that today. Can anyone recommend a
front-end to make browsing
through the output easier? Is it ethereal/ettercap/wireshark that I
should be using?

Can you recommend how I can isolate the output to a specific message? Perhaps by
process ID or the ability to somehow track the queue ID?

Thanks,
Alex
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Corey Chandler
MySQL Student wrote:

> Hi,
>
>  
>>>> Ah, found it. ?The problem is you've got the Postfix host relaying mail
>>>> to itself. ?It's right in your log entries:
>>>>
>>>> to=<[hidden email]>, relay=127.0.0.1[127.0.0.1]
>>>>        
>>> Yes, my apologies; as Sahil said, I'm using amavisd-new, although there is
>>> no longer spam and virus scanning happening on this host.
>>>
>>> The message is just getting requeued back to the second queue.
>>>      
>> Have you already looked at a tcpdump recording as requested?
>>    
>
> I will try and investigate that today. Can anyone recommend a
> front-end to make browsing
> through the output easier? Is it ethereal/ettercap/wireshark that I
> should be using?
>
>  
Yes, wireshark is fine, and can read pcap files.
> Can you recommend how I can isolate the output to a specific message? Perhaps by
> process ID or the ability to somehow track the queue ID?
>  

The "Follow TCP stream" feature is valuable; past that you'll have to
either grep through the output, or correlate timestamps accordingly.

-- Corey
Reply | Threaded
Open this post in threaded view
|

Re: Increasing logging on queue messages

Wietse Venema
In reply to this post by Alex Regan
MySQL Student:

> Hi,
>
> >> > Ah, found it. ?The problem is you've got the Postfix host relaying mail
> >> > to itself. ?It's right in your log entries:
> >> >
> >> > to=<[hidden email]>, relay=127.0.0.1[127.0.0.1]
> >>
> >> Yes, my apologies; as Sahil said, I'm using amavisd-new, although there is
> >> no longer spam and virus scanning happening on this host.
> >>
> >> The message is just getting requeued back to the second queue.
> >
> > Have you already looked at a tcpdump recording as requested?
>
> I will try and investigate that today. Can anyone recommend a
> front-end to make browsing
> through the output easier? Is it ethereal/ettercap/wireshark that I
> should be using?
>
> Can you recommend how I can isolate the output to a specific message? Perhaps by
> process ID or the ability to somehow track the queue ID?

Postfix logs the approximate time of failed transactions. Use
that information to narrow down the search.

You need to look at the TCP-level features (stuff negotiated during
the initial handshake, and the behavior of acks and windows during
message delivery).

Broken implementations of SACK and WSCALE have resulted in timeouts
during message delivery, but we have also seen "security" appliances
that simply drop packets with flags that they did not understand.

        Wietse