We've set up two servers (SSD RAID, 64 GB RAM, fast dual CPU) with postmulti to deliver the mail for our newsletter service (legitimate mail). Both servers have 12 Postfix instances running. Mail is injected from PHP scripts running on a different server.
Lately we're seeing timeouts from the PHP scripts at busy moments. I've been trying to debug this, but can't find the cause. Some information:
- Mail is injected through smtp, port 25
- The active queue is almost empty, even at peak moments (qshape)
- Number of deferred mails is low
- I've set default_process_limit to 10000 and smtpd_client_connection_count_limit to 3000
- Load on the servers is never higher than 2
- Number of smtp processes peaks at about 2000
- We're running local DNS
- We're using opendkim to sign every message
What can I do to find the reason for the timeouts?
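One way to narrow down where the PHP clients stall is to time each stage of a single SMTP transaction against one instance at a busy moment. The sketch below is a minimal probe built on Python's standard smtplib. The function names `timed` and `probe`, the example.com addresses, and the hostname in the usage comment are placeholders, not part of the setup described above. It aborts with RSET before DATA, so it does not measure anything that only runs on the message body.

```python
import smtplib
import time


def timed(timings, label, func, *args):
    """Call func(*args), record the elapsed seconds under label, return the result."""
    start = time.monotonic()
    try:
        return func(*args)
    finally:
        timings.append((label, time.monotonic() - start))


def probe(host, port=25, sender="probe@example.com",
          rcpt="probe@example.com", timeout=30):
    """Walk through one SMTP transaction and time each stage separately.

    Returns a list of (label, seconds) tuples. If a stage fails or times
    out, the stages completed so far are kept and a final "aborted" entry
    describes the error.
    """
    timings = []
    conn = smtplib.SMTP(timeout=timeout)  # no host given, so no connection yet
    try:
        timed(timings, "connect+banner", conn.connect, host, port)
        timed(timings, "EHLO", conn.ehlo)
        timed(timings, "MAIL FROM", conn.mail, sender)
        timed(timings, "RCPT TO", conn.rcpt, rcpt)
        timed(timings, "RSET", conn.rset)  # abort before DATA, nothing is queued
    except (OSError, smtplib.SMTPException) as exc:
        timings.append((f"aborted: {exc!r}", 0.0))
    finally:
        conn.close()
    return timings


# Usage (hypothetical hostname):
#   for label, seconds in probe("mail1.example.com"):
#       print(f"{label:16s} {seconds:.3f}s")
```

If one stage is consistently slow while the others are fast, that points at whatever Postfix does at that stage (lookups, policy services, milters) rather than at queue or delivery capacity.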
> Thanks for answering. The timeouts happened because Postfix was waiting for
> opendkim. Changing the socket from TCP to a Unix domain socket solved this,
> almost: at busy moments Postfix now logs:
> telemann postfix-13/smtpd: warning: connect to Milter service
> unix:/var/run/opendkim/opendkim.sock: Resource temporarily unavailable
> And opendkim says:
> opendkim: OpenDKIM Filter: accept() returned invalid socket (Numerical
> result out of range), try again
> Is postfix requesting something invalid, or is this a problem with opendkim?
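Both messages are standard system error strings, so a first step is to map them back to errno names and to see how many descriptors the opendkim process holds when the warning appears. The sketch below assumes Linux with /proc mounted. The helper names `errno_names_for` and `open_fd_count` are placeholders, and reading another user's /proc/<pid>/fd normally requires root. Whether a descriptor limit is actually involved here is an assumption to verify, not a confirmed diagnosis.

```python
import errno
import os


def errno_names_for(message):
    """Return the errno names whose system error text equals message."""
    return sorted(
        name for code, name in errno.errorcode.items()
        if os.strerror(code) == message
    )


def open_fd_count(pid):
    """Count the open file descriptors of a process (Linux /proc only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


# Usage (run as root, with the real opendkim PID in place of 1234):
#   print(errno_names_for("Numerical result out of range"))
#   print(errno_names_for("Resource temporarily unavailable"))
#   print(open_fd_count(1234))
```

If the count sits close to a round limit such as 1024 at busy moments, that would suggest opendkim is running into a descriptor ceiling (for example a select()-based accept loop or a ulimit) rather than Postfix sending anything invalid.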