Mail server in loopback network (fairly common?)

RE: Delivery delay problems

Marco TCHI HONG
>One possible explanation is that the filter queries a
>broken DNS (blocklist) server
When the problem occurred, I had already considered a broken DNSBL query.
The content filter isn't doing any DNSBL checks, and the problem persists.

>Another possibility is that the
>Postfix SMTP server behind the content filter has problems when it
>tries to resolve the 127.0.0.1 client IP address to a hostname.
How would resolving 127.0.0.1 be a problem when my /etc/host.conf has
"order hosts,bind" and the right entry is in /etc/hosts?

How does the qmgr determine when it can send the message to the content
filter?

Marco
-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Wietse Venema
Sent: Friday, 26 September 2008 15:40
To: Postfix users
Subject: Re: Delivery delay problems

Marco TCHI HONG:
> Our problem is that mail stays a long time in the active queue before the
> content filter, but when it's sent to the server where mailboxes are
> stored there's no problem:
>
> Sep 26 10:11:57 mx postfix/smtp[1387]: 1906C718411:
> to=<[hidden email]>, relay=127.0.0.1[127.0.0.1]:9026, conn_use=5,
> delay=230, delays=0.93/210/0/19, dsn=2.0.0, status=sent (250 2.0.0 Ok
> (2.0.0 Ok: queued as 5A0B6718ACF ))

Wietse:
> The mail spends 19 seconds in the content filter.  This means that
> your content filter performance sucks.

Marco TCHI HONG:
> OK, so the problem is definitely a content filter problem?
> If I understand correctly, this has nothing to do with our Postfix
> configuration.

You need to find out why the mail spends 19 seconds in the content
filter.  One possible explanation is that the filter queries a
broken DNS (blocklist) server. Another possibility is that the
Postfix SMTP server behind the content filter has problems when it
tries to resolve the 127.0.0.1 client IP address to a hostname.
And then there are a billion other possibilities.

        Wietse
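
For reference when reading the log line quoted above: in Postfix 2.3 and
later the delays= field is documented as a/b/c/d, where a is the time
before the queue manager, b the time in the queue manager, c the
connection setup time, and d the message transmission time. A minimal
Python sketch that splits the field follows; the function name and the
dictionary keys are made up for illustration and are not part of Postfix.

```python
import re

# Matches the four slash-separated numbers of a Postfix "delays=" field.
DELAYS_RE = re.compile(r"\bdelays=([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)")

# Illustrative labels for the a/b/c/d components (not Postfix names).
STAGES = ("before_qmgr", "in_qmgr", "conn_setup", "transmission")


def parse_delays(log_line):
    """Return the delays= breakdown as a dict of floats, or None if absent."""
    match = DELAYS_RE.search(log_line)
    if match is None:
        return None
    return dict(zip(STAGES, (float(value) for value in match.groups())))
```

For the quoted line this gives 0.93 s before the queue manager, 210 s in
the queue manager, 0 s of connection setup, and 19 s of transmission to
the filter on port 9026.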

Re: Delivery delay problems

Wietse Venema
Wietse:
> >Another possibility is that the
> >Postfix SMTP server behind the content filter has problems when it
> >tries to resolve the 127.0.0.1 client IP address to a hostname.

Marco TCHI HONG:
> How would resolving 127.0.0.1 be a problem when my /etc/host.conf has
> "order hosts,bind" and the right entry is in /etc/hosts?

Just telnet to the "after-filter" SMTP port and see if there is a
delay before Postfix responds.

> How does the qmgr determine when it can send the message to the content
> filter?

This is controlled by the concurrency limit for the content filter.

http://www.postfix.org/postconf.5.html#default_destination_concurrency_limit
http://www.postfix.org/postconf.5.html#default_destination_recipient_limit

        Wietse

Re: Delivery delay problems

Victor Duchovni
On Fri, Sep 26, 2008 at 09:19:50AM -0400, Wietse Venema wrote:

> Wietse:
> > >Another possibility is that the
> > >Postfix SMTP server behind the content filter has problems when it
> > >tries to resolve the 127.0.0.1 client IP address to a hostname.
>
> Marco TCHI HONG:
> > How would resolving 127.0.0.1 be a problem when my /etc/host.conf has
> > "order hosts,bind" and the right entry is in /etc/hosts?
>
> Just telnet to the "after-filter" SMTP port and see if there is a
> delay before Postfix responds.
>

KAV versions I've seen are not fully transparent proxies, they respond
with banner 220 and EHLO 250 before making a downstream connection. The
connection to the downstream server may happen as late as "." (after
the content is scanned).  It is certainly important to make sure that
the configured concurrency into the filter is not too high and that
the downstream re-injection port concurrency is at least that high.

It is also a good idea to configure the scanner to use RAM-disk for
capturing message content for scanning (on a modern machine with
multiple GB of RAM).

The OP should measure process concurrency, CPU utilization, disk
utilization, ... Possibly tcpdump the Postfix -> KAV and
KAV->postfix traffic and look for delays.

--
        Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the "Reply-To" header.

To unsubscribe from the postfix-users list, visit
http://www.postfix.org/lists.html or click the link below:
<mailto:[hidden email]?body=unsubscribe%20postfix-users>

If my response solves your problem, the best way to thank me is to not
send an "it worked, thanks" follow-up. If you must respond, please put
"It worked, thanks" in the "Subject" so I can delete these quickly.

RE: Delivery delay problems

Marco TCHI HONG
>KAV versions I've seen are not fully transparent proxies, they respond
>with banner 220 and EHLO 250 before making a downstream connection. The
>connection to the downstream server may happen as late as "." (after
>the content is scanned).  It is certainly important to make sure that
>the configured concurrency into the filter is not too high and that
>the downstream re-injection port concurrency is at least that high.

The downstream re-injection port concurrency is a bit higher than the
concurrency into the filter.

Below is the chain a message follows (port, concurrency):

smtpd (port 25, 100) -> spawn/kas-pipe (port 9026, 50) ->
spawn/smtpscanner (port 10025, 60) -> spawn (port 9025, 70).

spawn/kas-pipe (port 9026, 50): to antispam
spawn/smtpscanner (port 10025, 60): to antivirus
spawn (port 9025, 70): back to Postfix

[root@mx marco]# telnet 127.0.0.1 9026
Trying 127.0.0.1...
Connected to mx.dts.mg (127.0.0.1).
Escape character is '^]'.
220 kas30pipe.dts.mg ESMTP Service ready -> I get this message
instantaneously

[root@mx marco]# telnet 127.0.0.1 9025
Trying 127.0.0.1...
Connected to mx.dts.mg (127.0.0.1).
Escape character is '^]'.
220 mx.dts.mg ESMTP Postfix (DATA TELECOM SERVICE) -> I get this message
instantaneously

[root@mx marco]# telnet 127.0.0.1 10025
Trying 127.0.0.1...
Connected to mx.dts.mg (127.0.0.1).
Escape character is '^]'.
220 mx.dts.mg ESMTP Kaspersky Lab. -> It takes forever to get this one ...

So I guess my problem is with the antivirus...

>The OP should measure process concurrency, CPU utilization, disk
>utilization, ... Possibly tcpdump the Postfix -> KAV and
>KAV->postfix traffic and look for delays.

However, I have no issue with CPU or RAM utilisation, or with disk I/O wait.
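
The three telnet checks above can also be timed rather than eyeballed. The
following is a rough Python sketch, not a Postfix or Kaspersky tool: it
connects to a port, waits for the first greeting line, and reports how
long that took. The function names and the injectable connect parameter
are illustrative choices.

```python
import socket
import time


def time_banner(host, port, timeout=60.0, connect=socket.create_connection):
    """Return (seconds_until_banner, banner_line) for an SMTP listener."""
    start = time.monotonic()
    with connect((host, port), timeout=timeout) as sock:
        # An SMTP server speaks first: read until the greeting line ends.
        data = b""
        while not data.endswith(b"\n"):
            chunk = sock.recv(1024)
            if not chunk:
                break  # peer closed the connection before finishing the line
            data += chunk
    elapsed = time.monotonic() - start
    return elapsed, data.decode("ascii", errors="replace").strip()


def report(host, ports, timeout=60.0):
    """Print the banner delay for each port, or the error if it fails."""
    for port in ports:
        try:
            elapsed, banner = time_banner(host, port, timeout)
            print(f"{host}:{port}  {elapsed:6.2f}s  {banner}")
        except OSError as exc:  # includes refused connections and timeouts
            print(f"{host}:{port}  failed: {exc}")
```

Calling report("127.0.0.1", [9026, 10025, 9025]) would cover the three
listeners in the chain described in this message. Running it repeatedly
under load shows whether the delay on port 10025 is constant or grows
with traffic.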



Re: Delivery delay problems

Victor Duchovni
On Fri, Sep 26, 2008 at 05:44:28PM +0300, Marco TCHI HONG wrote:

> >KAV versions I've seen are not fully transparent proxies, they respond
> >with banner 220 and EHLO 250 before making a downstream connection. The
> >connection to the downstream server may happen as late as "." (after
> >the content is scanned).  It is certainly important to make sure that
> >the configured concurrency into the filter is not too high and that
> >the downstream re-injection port concurrency is at least that high.
>
> The downstream re-injection port concurrency is a bit higher than the
> concurrency into the filter.
>
> Below is the chain a message follows :
>
> smtpd (port 25, 100) -> spawn/kas-pipe (port 9026, 50) ->
> spawn/smtpscanner (port 10025, 60) -> spawn (port 9025, 70).

These concurrency numbers are very high. Running A/V scanning
at concurrency substantially higher than ~20 (on Dual CPU boxes) is
generally counter-productive.

What is the destination concurrency limit for mail heading to the
"kas-pipe" process?

How many concurrent threads is "aveserver" configured for?

> Escape character is '^]'.
> 220 mx.dts.mg ESMTP Kaspersky Lab. -> It takes forever to get this one ...
>
> So I guess my problem is with the antivirus...

The smtpscanner may be taking a long time to connect to aveserver...

--
        Viktor.


RE: Delivery delay problems

Marco TCHI HONG
>These concurrency numbers are very high. Running A/V scanning
>at concurrency substantially higher than ~20 (on Dual CPU boxes) is
>generally counter-productive.

I have two dual-core Xeon 5160 3.0 GHz CPUs in my box and 4 GB of RAM.
About 500k mails go through this MX (50 GB of traffic).

>What is the destination concurrency limit for mail heading to the
>"kas-pipe" process?

Right now it is set to 200. Well it was set to 40 before the problem
appeared, but I increased it ... because I thought it was the problem.

I don't know if lowering my concurrency limits would help.


Re: Delivery delay problems

Victor Duchovni
On Fri, Sep 26, 2008 at 06:30:03PM +0300, Marco TCHI HONG wrote:

> >These concurrency numbers are very high. Running A/V scanning
> >at concurrency substantially higher than ~20 (on Dual CPU boxes) is
> >generally counter-productive.
>
> I have two Xeon 5160 Dual-Core 3,0 GHz on my box and 4Gb RAM.
> About 500k mail go through this MX (50Gb traffic).
>
> >What is the destination concurrency limit for mail heading to the
> >"kas-pipe" process?
>
> Right now it is set to 200. Well it was set to 40 before the problem
> appeared, but I increased it ... because I thought it was the problem.

The client concurrency must not exceed the service concurrency, and
virus scanning is CPU intensive, and in my experience this is too much.

Of course in a multi-filter chain, if any of the filters are high
latency, the combined filter latency can throttle the CPU demand
of any CPU intensive stage, and in that case, higher concurrency
may be appropriate, but if filters in a chain have vastly different
characteristics (impedance mismatch), it may be appropriate to
insert a Postfix queue in between:


        Postfix -> filter1 -> Postfix -> filter2 -> Postfix ...

Each Postfix stage (except the last) sets a suitable content filter, has
appropriate concurrency settings, ...
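
The rule above (a client must not be able to open more connections than
the service it talks to will accept) can be checked mechanically. The
Python sketch below is a simplification: it assumes each stage is
described only by a name and a maximum concurrency, and that one upstream
process holds at most one downstream connection. The stage labels are
illustrative; the numbers are the ones reported in this thread.

```python
def concurrency_violations(chain):
    """Find adjacent stages where the upstream limit exceeds the downstream one.

    chain is a list of (name, max_concurrency) tuples in delivery order.
    Returns a list of (upstream, up_limit, downstream, down_limit) tuples.
    """
    violations = []
    for (up_name, up_limit), (down_name, down_limit) in zip(chain, chain[1:]):
        if up_limit > down_limit:
            violations.append((up_name, up_limit, down_name, down_limit))
    return violations


# Limits as reported in the thread; the first entry is the destination
# concurrency limit for mail heading into the filter.
reported_chain = [
    ("smtp client into filter", 200),
    ("kas-pipe", 50),
    ("smtpscanner", 60),
    ("re-injection smtpd", 70),
]
```

With the reported numbers the only mismatch is the first hop: up to 200
client connections aimed at a service that runs 50 processes. With the
earlier limit of 40 the check comes back empty.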

--
        Viktor.


Re: Mail server in loopback network (fairly common?)

Bill Cole-3
In reply to this post by Juan Miscaro-2
Juan Miscaro wrote:
> 2008/9/25 Brian Evans - Postfix List <[hidden email]>:
[...]
>> The Problem the OP appears to fall into is that mail coming from outside
>> the mynetworks is being trapped to do a "local" DNS MX/A record.
>> It is probably pointing mail to the "example.com" as 127.0.0.1 (not
>> uncommon).
>
> It points mail for the domain to the local server's FQDN.  And that
> translates to localhost because of entries in /etc/hosts.

Don't map your FQDN to 127.0.0.1 in your hosts file. Your FQDN should
resolve to your primary non-loopback IP address.
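
One way to check for this, as a rough Python sketch: resolve the name and
see whether any result is a loopback address. The helper name is
hypothetical. Note that the local resolver is consulted the same way most
programs consult it, so an /etc/hosts entry that maps the FQDN to
127.0.0.1 will show up here.

```python
import ipaddress
import socket


def loopback_addresses(hostname):
    """Return the sorted loopback addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    # Drop any IPv6 scope suffix ("%eth0") before parsing the address.
    addresses = {info[4][0].split("%")[0] for info in infos}
    return sorted(a for a in addresses if ipaddress.ip_address(a).is_loopback)
```

A non-empty result from loopback_addresses(socket.getfqdn()) suggests the
FQDN is mapped to loopback. socket.getfqdn() may not match what Postfix
uses as myhostname, so pass that name explicitly if in doubt.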



RE: Delivery delay problems

Marco TCHI HONG
In reply to this post by Victor Duchovni
>The client concurrency must not exceed the service concurrency, and
>virus scanning is CPU intensive, and in my experience this is too much.
>
>Of course in a multi-filter chain, if any of the filters are high
>latency, the combined filter latency can throttle the CPU demand
>of any CPU intensive stage, and in that case, higher concurrency
>may be appropriate, but if filters in a chain have vastly different
>characteristics (impedance mismatch), it may be appropriate to
>insert a Postfix queue in between:
>
>
> Postfix -> filter1 -> Postfix -> filter2 -> Postfix ...
>
>Each Postfix stage (except the last) sets a suitable content filter, has
>appropriate concurrency settings, ...

I will update to the latest Kaspersky AntiSpam, lower the concurrency
limits, and make some changes so that client concurrency does not exceed
service concurrency.
If the problem remains, I'll try the chain you suggest and maybe move the
antivirus to another box.