Timeout when delivering to large group of aliases


Timeout when delivering to large group of aliases

list@airstreamcomm.net
I wanted to confirm the behavior we are experiencing at the moment when
delivering messages to addresses aliased to thousands of local users.  
For example we have the address [hidden email] which is an alias
to 3000 local users.  When our inbound spam filter connects to the
Postfix server to relay a message to this user we are seeing a timeout
after 60 seconds and the message gets deferred on the filter, but the
message has actually been delivered to the alias and subsequently all
the recipients.  The filter then retries the deferred message and we
start having duplicate messages to the users.

Is it true that Postfix is waiting to send 250 OK back to the filter
until all the recipients have had a copy of the message delivered to
their inbox?  If so is there a more efficient way to go about delivering
to many thousands of aliases?


Re: Timeout when delivering to large group of aliases

Wietse Venema
List:
> I wanted to confirm the behavior we are experiencing at the moment when
> delivering messages to addresses aliased to thousands of local users.  
> For example we have the address [hidden email] which is an alias
> to 3000 local users.  When our inbound spam filter connects to the
> Postfix server to relay a message to this user we are seeing a timeout
> after 60 seconds and the message gets deferred on the filter, but the
> message has actually been delivered to the alias and subsequently all
> the recipients.  The filter then retries the deferred message and we
> start having duplicate messages to the users.

This problem is described in RFC 1047 (Duplicate Messages and
SMTP).  The document was published in 1988, but apparently not
everyone who needs to know has gotten it.

> Is it true that Postfix is waiting to send 250 OK back to the filter
> until all the recipients have had a copy of the message delivered to
> their inbox?  If so is there a more efficient way to go about delivering
> to many thousands of aliases?

As required by SMTP, the SMTP client MUST wait until the server
replies to the end-of-data indication ("." on a line by itself).

You can work around a slow SMTP server by reducing the number
of recipients per MAIL transaction. Postfix by default sends no
more than 50.

Or you can fix the SMTP server so it responds in a reasonable time.

        Wietse

Re: Timeout when delivering to large group of aliases

Robert Sander
In reply to this post by list@airstreamcomm.net
On 18.10.2013 17:56, List wrote:
> If so is there a more efficient way to go about delivering
> to many thousands of aliases?

By using mailing list software for that task?

Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Mandatory disclosures per §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Managing Director: Peer Heinlein -- Registered office: Berlin



Re: Timeout when delivering to large group of aliases

Viktor Dukhovni
In reply to this post by list@airstreamcomm.net
On Fri, Oct 18, 2013 at 10:56:59AM -0500, List wrote:

> For example we have the address [hidden email] which
> is an alias to 3000 local users.  

What kind of "alias"?  Are you using virtual(5) aliases via
virtual_alias_maps?  If so, and they are backed by a database, then
the database schema, the query used, and information about the
available indexes may be pertinent.

Or are you using local aliases(5)?

> When our inbound spam filter
> connects to the Postfix server to relay a message to this user we
> are seeing a timeout after 60 seconds and the message gets deferred
> on the filter, but the message has actually been delivered to the
> alias and subsequently all the recipients.  

Therefore (as Wietse points out) your timeout is at the "." command,
since earlier timeouts would not see the message delivered.  The
RFC recommended minimum timeout for "." is 600s, not 60s.  For
clients feeding MTAs that expand large recipient lists, I've
sometimes set timeouts of 1200s (or more as required).

> Is it true that Postfix is waiting to send 250 OK back to the filter
> until all the recipients have had a copy of the message delivered to
> their inbox?

No.  Delivery happens asynchronously.  However, virtual alias
expansion (which is recursive) happens synchronously during cleanup(8)
processing.  Large lists can take time to expand, especially if your
database is poorly indexed.
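
As a back-of-the-envelope illustration of why recursion can dominate, the
sketch below multiplies the number of lookups by an assumed per-lookup cost.
The 3000 recipients come from the original post; the 20 ms per lookup is a
made-up, illustrative figure, and the model of exactly one extra lookup per
expanded address is a simplification (Postfix may try more than one key per
address).

```shell
# Rough estimate of synchronous expansion time for one large virtual alias.
# ASSUMPTION: per_lookup_ms is an illustrative figure, not a measured value.
recipients=3000      # size of the alias, from the original post
per_lookup_ms=20     # hypothetical round-trip time of one map lookup

# One initial lookup for the alias itself, plus (at least) one recursive
# lookup for each address it expands to.
lookups=$((recipients + 1))
total_ms=$((lookups * per_lookup_ms))

echo "lookups: $lookups"
echo "estimated expansion time: $((total_ms / 1000)) s"
```

With these assumed numbers the estimate comes to about 60 s even though each
individual query is fast; measured values on a real system will differ.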

> If so is there a more efficient way to go about
> delivering to many thousands of aliases?

Index the alias database properly and use queries that can use the
index and don't force table scans.  Query databases with short
network round-trip times that are not overloaded (network, disk,
CPU, ...).

Do not use aggressive timeouts, they are counter-productive.

--
        Viktor.

Re: Timeout when delivering to large group of aliases

list@airstreamcomm.net
On 10/19/13 3:24 PM, Viktor Dukhovni wrote:

> On Fri, Oct 18, 2013 at 10:56:59AM -0500, List wrote:
>
>> For example we have the address [hidden email] which
>> is an alias to 3000 local users.
> What kind of "alias"?  Are you using virtual(5) aliases via
> virtual_alias_maps, and with backend database, the database schema
> and query used as well as information about available indexes may
> be pertinent?
>
> Or are you using local aliases(5)?
I am using a virtual alias via virtual_alias_maps.  The query is very
efficient and runs sub-second.

>> When our inbound spam filter
>> connects to the Postfix server to relay a message to this user we
>> are seeing a timeout after 60 seconds and the message gets deferred
>> on the filter, but the message has actually been delivered to the
>> alias and subsequently all the recipients.
> Therefore (as Wietse points out) your timeout is at the "." command,
> since earlier timeouts would not see the message delivered.  The
> RFC recommended minimum timeout for "." is 600s, not 60s.  For
> clients feeding MTAs that expand large recipient lists, I've
> sometimes set timeouts of 1200s (or more as required).
Which timeout setting would need to be set higher?
>
>> Is it true that Postfix is waiting to send 250 OK back to the filter
>> until all the recipients have had a copy of the message delivered to
>> their inbox?
> No.  Delivery happens asynchronously.  However, virtual alias
> expansion (which is recursive) happens synchronously during cleanup(8)
> processing.  Large lists can take time to expand, especially if your
> database is poorly indexed.
>
The database query replies with a comma delimited list of aliases in
less than a second.
>> If so is there a more efficient way to go about
>> delivering to many thousands of aliases?
> Index the alias database properly and use queries that can use the
> index and don't force table scans.  Query databases with short
> network round-trip times that are not overloaded (network, disk,
> CPU, ...).
>
> Do not use aggressive timeouts, they are counter-productive.
>
Thanks for the detailed reply, I appreciate the response.



Re: Timeout when delivering to large group of aliases

Viktor Dukhovni
On Mon, Oct 21, 2013 at 01:20:25PM -0500, List wrote:

> >What kind of "alias"?  Are you using virtual(5) aliases via
> >virtual_alias_maps, and with backend database, the database schema
> >and query used as well as information about available indexes may
> >be pertinent?
> >
> >Or are you using local aliases(5)?
>
> I am using a virtual alias via virtual_alias_maps.  The query is
> very efficient and runs sub-second.

You're forgetting that virtual expansion is *recursive*: each
resulting user is then also expanded.  So the initial query time
is not material.  Instead, split that query's result into lines on
commas, and then time "postmap -q - type:table < listfile".
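
A minimal sketch of that measurement, under stated assumptions: the three
addresses and the map name (mysql:/etc/postfix/virtual_alias.cf) are
placeholders, not values from this thread. Substitute the real comma-delimited
query result and the table listed in virtual_alias_maps.

```shell
# ASSUMPTION: placeholder data standing in for the comma-delimited result
# of the first alias query.
result='user1@example.com,user2@example.com, user3@example.com'

# Split on commas into one address per line and trim surrounding blanks.
listfile=$(mktemp)
printf '%s\n' "$result" | tr ',' '\n' |
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' > "$listfile"

# ASSUMPTION: hypothetical lookup table; use the one from virtual_alias_maps.
table='mysql:/etc/postfix/virtual_alias.cf'

# Time the per-address lookups that recursive expansion would perform.
# "postmap -q -" reads one lookup key per line from standard input.
if command -v postmap >/dev/null 2>&1; then
    time postmap -q - "$table" < "$listfile"
else
    echo "postmap not found; run this on the Postfix host" >&2
fi
```

The elapsed time reported for the full list, rather than for the single
initial query, is the figure to compare against the filter's timeout.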

> >For clients feeding MTAs that expand large recipient lists, I've
> >sometimes set timeouts of 1200s (or more as required).
>
> Which timeout setting would need to be set higher?

    $ postconf -d | egrep -v '_tls_' | grep '^smtp_.*_timeout = '
    smtp_connect_timeout = 30s
    smtp_data_done_timeout = 600s
    smtp_data_init_timeout = 120s
    smtp_data_xfer_timeout = 180s
    smtp_helo_timeout = 300s
    smtp_mail_timeout = 300s
    smtp_quit_timeout = 300s
    smtp_rcpt_timeout = 300s
    smtp_rset_timeout = 20s
    smtp_starttls_timeout = 300s
    smtp_xforward_timeout = 300s

The timeout in question is:

    smtp_data_done_timeout = 600s
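
If the sending side (the filter) happens to be a Postfix instance, raising
this value could look like the sketch below. That is an assumption: the thread
does not say what software the filter runs, and other products have their own
setting for the same wait. The helper name raise_data_done_timeout is
hypothetical, and 1200s is the figure mentioned earlier in the thread.

```shell
# ASSUMPTION: the sending host (the spam filter) is itself running Postfix.
# Hypothetical helper: raise the client-side wait for the server's reply
# to end-of-data ("."), then reload and print the resulting setting.
# Must be run as root on the sending host.
raise_data_done_timeout() {
    new_value=${1:-1200s}
    postconf -e "smtp_data_done_timeout = $new_value" &&
    postfix reload &&
    postconf smtp_data_done_timeout
}
```

It would be invoked on the filter host as "raise_data_done_timeout 1200s".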

> >No.  Delivery happens asynchronously.  However, virtual alias
> >expansion (which is recursive) happens synchronously during cleanup(8)
> >processing.  Large lists can take time to expand, especially if your
> >database is poorly indexed.
>
> The database query replies with a comma delimited list of aliases in
> less than a second.

You're forgetting recursion.

--
        Viktor.