Want to be sure i am not throttling user.

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Want to be sure i am not throttling user.

Fazzina, Angelo

Hi, i am troubleshooting a client complaint.

This user “wellness”

 

Aug 28 10:22:27 mail5 postfix/smtpd[7534]: EE46E2FB: client=unknown[137.99.149.148], sasl_method=LOGIN, sasl_username=wellness

 

Some user feedback :

                                On Friday I sent a batch of 436 and it took 11 minutes to send

                                This morning I sent a batch of 725 and it took 1 hour and 21 minutes

 

 

Do any of my settings throttle their ability to send to my postfix server ?

 

I think it is the client they use. MS Outlook.

I think i accept as fast as they send, and Outlook is sending slow and dictating the rate.

 

Thanks for looking.

 

 

smtp      inet  n       -       n       -       -       smtpd

submission inet n       -       n       -       -       smtpd

  -o smtpd_tls_security_level=encrypt

  -o smtpd_recipient_limit=5000

  -o smtpd_sasl_auth_enable=yes

  -o smtpd_client_restrictions=permit_sasl_authenticated,reject

  -o milter_macro_daemon_name=ORIGINATING

smtps     inet  n       -       n       -       -       smtpd

  -o smtpd_tls_wrappermode=yes

  -o smtpd_recipient_limit=5000

  -o smtpd_sasl_auth_enable=yes

  -o smtpd_client_restrictions=permit_sasl_authenticated,reject

  -o milter_macro_daemon_name=ORIGINATING

 

[root@mail4 postfix]# postconf -n

alias_database = hash:/etc/aliases

alias_maps = hash:/etc/aliases

command_directory = /usr/sbin

config_directory = /etc/postfix

daemon_directory = /usr/libexec/postfix

data_directory = /var/lib/postfix

debug_peer_level = 2

header_checks = regexp:/etc/postfix/header_checks

html_directory = no

inet_interfaces = all

inet_protocols = all

mail_owner = postfix

mailq_path = /usr/bin/mailq.postfix

manpage_directory = /usr/share/man

message_size_limit = 31457280

mydestination = $myhostname, localhost.$mydomain, localhost, appmail2.uconn.edu

mynetworks = 137.99.28.110 137.99.28.117 137.99.28.90 137.99.28.116 137.99.188.202 10.3.28.90 10.3.28.116 10.4.25.183 10.4.40.212 10.4.40.194 10.4.40.189  66.29.212.37

newaliases_path = /usr/bin/newaliases.postfix

queue_directory = /var/spool/postfix

readme_directory = /usr/share/doc/postfix-2.6.6/README_FILES

sample_directory = /usr/share/doc/postfix-2.6.6/samples

sendmail_path = /usr/sbin/sendmail.postfix

setgid_group = postdrop

smtpd_recipient_restrictions = check_recipient_access hash:/etc/postfix/static, permit_mynetworks, permit_sasl_authenticated, reject_unauth_destination

smtpd_tls_cert_file = /etc/pki/tls/certs/massmail_uconn_edu_cert_interm.cer

smtpd_tls_exclude_ciphers = IDEA-CBC-SHA, DES-CBC3-SHA, EDH-RSA-DES-CBC3-SHA, RC4, aNULL

smtpd_tls_key_file = /etc/pki/tls/private/massmail_key.key

smtpd_tls_mandatory_protocols = !SSLv2,!SSLv3

unknown_local_recipient_reject_code = 550

virtual_alias_maps = hash:/etc/postfix/virtual

 

-ANGELO FAZZINA

 

ITS Service Manager:

Spam and Virus Prevention

Mass Mailing

G Suite/Gmail

 

[hidden email]

University of Connecticut,  ITS, SSG, Server Systems

860-486-9075

 

Reply | Threaded
Open this post in threaded view
|

Re: Want to be sure i am not throttling user.

Viktor Dukhovni


> On Aug 28, 2018, at 1:47 PM, Fazzina, Angelo <[hidden email]> wrote:
>
> Hi, i am troubleshooting a client complaint.
> This user “wellness”
>  
> Aug 28 10:22:27 mail5 postfix/smtpd[7534]: EE46E2FB: client=unknown[137.99.149.148], sasl_method=LOGIN, sasl_username=wellness
>  
> Some user feedback :
>                                 On Friday I sent a batch of 436 and it took 11 minutes to send
>                                 This morning I sent a batch of 725 and it took 1 hour and 21 minutes

The answer is in your logs.

--
        Viktor.

Reply | Threaded
Open this post in threaded view
|

RE: Want to be sure i am not throttling user.

Fazzina, Angelo
Hi, Thank you for the bread crumbs. After looking at my logs I do see differences in the "delay" from 0.72 to 2.2 seconds to send an email.
Then I googled this little tid-bit:
Postfix logs additional delay information as "delays=a/b/c/d" where a=time before queue manager, including message transmission; b=time in queue manager; c=connection setup time including DNS, HELO and TLS; d=message transmission time.

If most of the time is in "d", what can I do to speed that up ?

I also went here http://www.postfix.org/TUNING_README.html and the only offense I seem to be making is using the header_checks variable so I can get the email subjects in my logs.

Should my first test be to comment out  "header_checks = regexp:/etc/postfix/header_checks" from my main.cf ?

Cheers.

Below is a sample of 4 emails:


9:22:43 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:22:43 mail4 postfix/smtpd[16278]: 42829300: client=unknown[137.99.149.148], sasl_method=LOGIN, sasl_username=wellness
Aug 28 09:22:43 mail4 postfix/cleanup[15320]: 42829300: warning: header Subject:  ALCOHOLEDU INCOMPLETE from unknown[137.99.149.148]; from=<[hidden email]> to=<[hidden email]> proto=ESMTP helo=<SAIT0007739>
Aug 28 09:22:43 mail4 postfix/cleanup[15320]: 42829300: message-id=<00c201d43ed2$33b792f0$9b26b8d0$@uconn.edu>
Aug 28 09:22:43 mail4 postfix/qmgr[3142]: 42829300: from=<[hidden email]>, size=3714, nrcpt=1 (queue active)
Aug 28 09:22:44 mail4 postfix/smtp[15268]: 42829300: to=<[hidden email]>, relay=uconn-edu.mail.protection.outlook.com[216.32.181.170]:25, delay=1.2, delays=0.02/0/0.3/0.88, dsn=2.6.0, status=sent (250 2.6.0 <00c201d43ed2$33b792f0$9b26b8d0$@uconn.edu> [InternalId=4677219391064, Hostname=SN6PR05MB4558.namprd05.prod.outlook.com] 11262 bytes in 0.219, 50.159 KB/sec Queued mail for delivery)
Aug 28 09:22:44 mail4 postfix/qmgr[3142]: 42829300: removed
Aug 28 09:22:45 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]

Aug 28 09:23:06 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:23:06 mail4 postfix/smtpd[16278]: 6A9C5300: client=unknown[137.99.149.148], sasl_method=LOGIN, sasl_username=wellness
Aug 28 09:23:06 mail4 postfix/cleanup[15320]: 6A9C5300: warning: header Subject:  ALCOHOLEDU INCOMPLETE from unknown[137.99.149.148]; from=<[hidden email]> to=<[hidden email]> proto=ESMTP helo=<SAIT0007739>
Aug 28 09:23:06 mail4 postfix/cleanup[15320]: 6A9C5300: message-id=<01cc01d43ed2$418645c0$c492d140$@uconn.edu>
Aug 28 09:23:06 mail4 postfix/qmgr[3142]: 6A9C5300: from=<[hidden email]>, size=3712, nrcpt=1 (queue active)
Aug 28 09:23:06 mail4 postfix/smtpd[15840]: connect from msb-p-zbxprx.uits.uconn.edu[10.4.18.26]
Aug 28 09:23:06 mail4 postfix/smtpd[15840]: disconnect from msb-p-zbxprx.uits.uconn.edu[10.4.18.26]
Aug 28 09:23:08 mail4 postfix/smtp[15268]: 6A9C5300: to=<[hidden email]>, relay=uconn-edu.mail.protection.outlook.com[216.32.181.170]:25, delay=2.2, delays=0.02/0/0.25/1.9, dsn=2.6.0, status=sent (250 2.6.0 <01cc01d43ed2$418645c0$c492d140$@uconn.edu> [InternalId=4728758996924, Hostname=BYAPR05MB4792.namprd05.prod.outlook.com] 11298 bytes in 0.468, 23.558 KB/sec Queued mail for delivery)
Aug 28 09:23:08 mail4 postfix/qmgr[3142]: 6A9C5300: removed
Aug 28 09:23:08 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]

Aug 28 09:23:12 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:23:12 mail4 postfix/smtpd[16278]: 8F244300: client=unknown[137.99.149.148], sasl_method=LOGIN, sasl_username=wellness
Aug 28 09:23:12 mail4 postfix/cleanup[15320]: 8F244300: warning: header Subject:  ALCOHOLEDU INCOMPLETE from unknown[137.99.149.148]; from=<[hidden email]> to=<[hidden email]> proto=ESMTP helo=<SAIT0007739>
Aug 28 09:23:12 mail4 postfix/cleanup[15320]: 8F244300: message-id=<01e101d43ed2$452f5090$cf8df1b0$@uconn.edu>
Aug 28 09:23:12 mail4 postfix/qmgr[3142]: 8F244300: from=<[hidden email]>, size=3722, nrcpt=1 (queue active)
Aug 28 09:23:13 mail4 postfix/smtp[15268]: 8F244300: to=<[hidden email]>, relay=uconn-edu.mail.protection.outlook.com[216.32.180.170]:25, delay=0.72, delays=0.02/0/0.06/0.64, dsn=2.6.0, status=sent (250 2.6.0 <01e101d43ed2$452f5090$cf8df1b0$@uconn.edu> [InternalId=4423816319248, Hostname=BN6PR05MB2898.namprd05.prod.outlook.com] 11326 bytes in 0.215, 51.358 KB/sec Queued mail for delivery)36737 Aug 28 09:23:13 mail4 postfix/qmgr[3142]: 8F244300: removed
Aug 28 09:23:15 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]

Aug 28 09:23:17 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:23:17 mail4 postfix/smtpd[16278]: 98698300: client=unknown[137.99.149.148], sasl_method=LOGIN, sasl_username=wellness
Aug 28 09:23:17 mail4 postfix/cleanup[15320]: 98698300: warning: header Subject:  ALCOHOLEDU INCOMPLETE from unknown[137.99.149.148]; from=<[hidden email]> to=<[hidden email]> proto=ESMTP helo=<SAIT0007739>
Aug 28 09:23:17 mail4 postfix/cleanup[15320]: 98698300: message-id=<01e601d43ed2$482f7130$d88e5390$@uconn.edu>
Aug 28 09:23:17 mail4 postfix/qmgr[3142]: 98698300: from=<[hidden email]>, size=3712, nrcpt=1 (queue active)36744 Aug 28 09:23:18 mail4 postfix/smtp[15268]: 98698300: to=<[hidden email]>, relay=uconn-edu.mail.protection.outlook.com[216.32.180.170]:25, delay=0.61, delays=0.01/0/0.05/0.54, dsn=2.6.0, status=sent (250 2.6.0 <01e601d43ed2$482f7130$d88e5390$@uconn.edu> [InternalId=1511828492319, Hostname=SN2PR05MB2494.namprd05.prod.outlook      .com] 11312 bytes in 0.195, 56.403 KB/sec Queued mail for delivery)36745 Aug 28 09:23:18 mail4 postfix/qmgr[3142]: 98698300: removed36746 Aug 28 09:23:20 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]36747 Aug 28 09:23:28 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:23:28 mail4 postfix/smtpd[16278]: B9F70300: client=unknown[137.99.149.148], sasl_method=LOGIN, sasl_username=wellness


-ANGELO FAZZINA

ITS Service Manager:
Spam and Virus Prevention
Mass Mailing
G Suite/Gmail

[hidden email]
University of Connecticut,  ITS, SSG, Server Systems
860-486-9075

-----Original Message-----
From: [hidden email] <[hidden email]> On Behalf Of Viktor Dukhovni
Sent: Tuesday, August 28, 2018 2:39 PM
To: Postfix users <[hidden email]>
Subject: Re: Want to be sure i am not throttling user.



> On Aug 28, 2018, at 1:47 PM, Fazzina, Angelo <[hidden email]> wrote:
>
> Hi, i am troubleshooting a client complaint.
> This user “wellness”
>  
> Aug 28 10:22:27 mail5 postfix/smtpd[7534]: EE46E2FB: client=unknown[137.99.149.148], sasl_method=LOGIN, sasl_username=wellness
>  
> Some user feedback :
>                                 On Friday I sent a batch of 436 and it took 11 minutes to send
>                                 This morning I sent a batch of 725 and it took 1 hour and 21 minutes

The answer is in your logs.

--
        Viktor.

Reply | Threaded
Open this post in threaded view
|

Re: Want to be sure i am not throttling user.

Viktor Dukhovni


> On Aug 29, 2018, at 10:55 AM, Fazzina, Angelo <[hidden email]> wrote:
>
> Hi, Thank you for the bread crumbs. After looking at my logs I do see differences in the "delay"
> from 0.72 to 2.2 seconds to send an email.

The time that it takes your server to deliver each message to its destination
nexthop is not visible to and does not slow down the sender.  Senders deliver
to your queue, your queue delivers to the destination nexthop.

> Then I googled this little tid-bit:
> Postfix logs additional delay information as "delays=a/b/c/d" where a=time before queue manager, including message transmission; b=time in queue manager; c=connection setup time including DNS, HELO and TLS; d=message transmission time.

You don't generally need to be concerned with the "c" and "d" values.  They
matter only under extreme stress, when the "active" queue is full, and the
queue manager stops reading the "incoming" queue until some mail is drained.
When that happens, messages may encounter an "inflow_delay", but that is
extremely unlikely in your case.

> If most of the time is in "d", what can I do to speed that up ?

Ignore the red-herring.

> I also went here http://www.postfix.org/TUNING_README.html and the only offense I seem to be making is using the header_checks variable so I can get the email subjects in my logs.
>
> Should my first test be to comment out  "header_checks = regexp:/etc/postfix/header_checks" from my main.cf ?

Should not be a problem.  More plausible is a DNS timeout issue with:

        connect from unknown[137.99.149.148]

that could lead to connection setup delays, IIRC Postfix measures the
"a" time from the "MAIL FROM" command, not from the TCP connection
setup.  Any latency doing DNS lookups for the client name is not
included.  When I query for:

        148.149.99.137.in-addr.arpa. IN PTR ?

I get a quick NXDOMAIN.  Is that also true for your mail server?

> Below is a sample of 4 emails:

You're not seeing the forest for the trees, a sample of 4 is not that interesting.
How many messages were sent by that user during a sustained transmission window.
What was the arrival rate?  Did it change over that window?

> 09:22:43 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> 09:22:45 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
>
> 09:23:06 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> 09:23:08 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
>
> 09:23:12 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> 09:23:15 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
>
> 09:23:17 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> 09:23:20 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]

If the client is doing one delivery at a time with a new connection for each
message, with no concurrency, what's interesting to see here is the spacing
*between* connections, which is considerably longer than the duration of
connections, which again hints at a possible DNS issue, but you have to
look more closely.

--
        Viktor.

Reply | Threaded
Open this post in threaded view
|

Re: Want to be sure i am not throttling user.

Wietse Venema
Viktor Dukhovni:

> > 09:22:43 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> > 09:22:45 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
> >
> > 09:23:06 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> > 09:23:08 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
> >
> > 09:23:12 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> > 09:23:15 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
> >
> > 09:23:17 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> > 09:23:20 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
>
> If the client is doing one delivery at a time with a new connection for each
> message, with no concurrency, what's interesting to see here is the spacing
> *between* connections, which is considerably longer than the duration of
> connections, which again hints at a possible DNS issue, but you have to
> look more closely.

The time from 'TCP connect' to the time that Postfix logs 'connect
from' includes the time to look up the client hostname (and if
available, IP address for that hostname). This -should- be quick,
but may be slow because of a problem in your local DNS.

        Wietse
Reply | Threaded
Open this post in threaded view
|

RE: Want to be sure i am not throttling user.

Fazzina, Angelo
Hi, the client/[sender] ip 137.99.149.148 is a users desktop running Outlook, likely with a DHCP address.

In answer to: "I get a quick NXDOMAIN.  Is that also true for your mail server?"
        Yes i get the same results when i do a "dig -x 137.99.149.148" or "nslookup 137.99.149.148"

My response to the user has always been it is the client that is sending slow, i am just learning how to prove it with my logs.
I also noticed the repeated new connections, but always blamed the client for doing that and not holding onto the connection, and send multiple emails.
I take this literally "disconnect from unknown[137.99.149.148]" and not that Postfix disconnected from the client, but the client disconnected from Postfix server.

In answer to : "How many messages were sent by that user during a sustained transmission window."
                "What was the arrival rate?  Did it change over that window?"

        My claim that i am trying to prove is there is no "sustained transmission window" hence the constant  connect and disconnect seen in the logs.
        Unless i don't know what you mean by a "sustained transmission window" ?

Client connects to massmail.uconn.edu and Load balancer sends email for massmail.uconn.edu to 3 servers
If the "arrival rate" is calculated from all the "connect from unknown[137.99.149.148]" lines,
        then i will have to crunch the numbers. across 3 servers for 9:40AM 10 emails were processed.
                                                across 3 servers for 9:30AM 36 emails were processed.
This is what i saw in the logs,
start = 2018-08-28-09:22:43
166 emails sent on mail4
end = 2018-08-28-10:22:20

start = 2018-08-28-09:21:55
231 emails sent on mail5
end = 2018-08-28-10:22:27

start = 2018-08-28-08:36:42
257 emails send on mail6
end = 2018-08-28-10:22:06

I am going to recommend user requests a static IP with an A record in our DNS servers.
I don't see any down side to asking for that.
Thank you.


-ANGELO FAZZINA

ITS Service Manager:
Spam and Virus Prevention
Mass Mailing
G Suite/Gmail

[hidden email]
University of Connecticut,  ITS, SSG, Server Systems
860-486-9075


-----Original Message-----
From: [hidden email] <[hidden email]> On Behalf Of Wietse Venema
Sent: Wednesday, August 29, 2018 12:03 PM
To: Postfix users <[hidden email]>
Subject: Re: Want to be sure i am not throttling user.

Viktor Dukhovni:

> > 09:22:43 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> > 09:22:45 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
> >
> > 09:23:06 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> > 09:23:08 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
> >
> > 09:23:12 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> > 09:23:15 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
> >
> > 09:23:17 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
> > 09:23:20 mail4 postfix/smtpd[16278]: disconnect from unknown[137.99.149.148]
>
> If the client is doing one delivery at a time with a new connection for each
> message, with no concurrency, what's interesting to see here is the spacing
> *between* connections, which is considerably longer than the duration of
> connections, which again hints at a possible DNS issue, but you have to
> look more closely.

The time from 'TCP connect' to the time that Postfix logs 'connect
from' includes the time to look up the client hostname (and if
available, IP address for that hostname). This -should- be quick,
but may be slow because of a problem in your local DNS.

        Wietse
Reply | Threaded
Open this post in threaded view
|

Re: Want to be sure i am not throttling user.

Viktor Dukhovni


> On Aug 29, 2018, at 12:19 PM, Fazzina, Angelo <[hidden email]> wrote:
>
> In answer to: "I get a quick NXDOMAIN.  Is that also true for your mail server?"
> Yes i get the same results when i do a "dig -x 137.99.149.148" or "nslookup 137.99.149.148"

Are you doing the test on the MTA, or a nearby machine? As "root", or as the "postfix" user?
Is the Postfix smtpd(8) service the user is connecting to chrooted?  Look carefully at the
relevant master.cf entries.  If chrooted, check for a working etc/resolv.conf in the chroot
jail (queue_directory).

> My response to the user has always been it is the client that is sending slow, i am just learning how to prove it with my logs.
> I also noticed the repeated new connections, but always blamed the client for doing that and not holding onto the connection, and send multiple emails.

Though a new connection for each message is less efficient, it should not be prohibitively so,
the user should still be able to send O(10) messages per second.  Not O(10s) per message.

> I take this literally "disconnect from unknown[137.99.149.148]" and not that Postfix disconnected from the client, but the client disconnected from Postfix server.

Yes, the client sends "QUIT" and disconnects.

> In answer to : "How many messages were sent by that user during a sustained transmission window."
> "What was the arrival rate?  Did it change over that window?"
>
> My claim that i am trying to prove is there is no "sustained transmission window" hence the constant  connect and disconnect seen in the logs.

A sustained transmission window is a period of time during which the client is
actively sending a batch of mail.  

> This is what i saw in the logs,
> start = 2018-08-28-09:22:43
> 166 emails sent on mail4
> end = 2018-08-28-10:22:20

166 messages per hour is rather slow.  Was this a sustained batch,
or did you arbitrarily choose an hour.  Perhaps most of the 166
arrived during the first few minutes???  You need to aggregate
the deliveries by the arrival minute and look at a histogram
of messages per minute.

This is a data analysis problem, you should be able to figure it out,
by rolling up your sleeves and looking carefully at the data.  You
may also need PCAP files for the next file this user sends a batch
of mail, so you can see what happens after TCP connection setup.

--
        Viktor.

Reply | Threaded
Open this post in threaded view
|

RE: Want to be sure i am not throttling user.

Fazzina, Angelo
Thanks I'll try to figure it out,
In case you have more time to help,

No chroot
[root@mail4 log]# head -n12 /etc/postfix/master.cf
#
# Postfix master process configuration file.  For details on the format
# of the file, see the master(5) manual page (command: "man 5 master").
#
# Do not forget to execute "postfix reload" after editing this file.
#
# ==========================================================================
# service type  private unpriv  chroot  wakeup  maxproc command + args
#               (yes)   (yes)   (yes)   (never) (100)
# ==========================================================================
smtp      inet  n       -       n       -       -       smtpd
submission inet n       -       n       -       -       smtpd


I didn't pick an hour that's all that was in the file before log rotation of the maillog file.

I'm confused what i said that led you to think this?
          Perhaps most of the 166 arrived during the first few minutes???  

The user is simply connecting and disconnecting repeatedly. I think the question is why ?
I want to make sure Postfix is no way involved with why client is connecting and disconnecting constantly.
I appreciate your patience.

[root@mail4 log]# cat maillog-20180829 |grep 137.99.149.148 |grep -v disconnect |grep -v submission|grep connect
Aug 28 09:22:43 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:23:06 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:23:12 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:23:17 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:23:28 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:23:39 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:24:02 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:24:43 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:24:55 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:25:06 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:25:29 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:25:51 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:26:02 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:26:38 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:27:10 mail4 postfix/smtpd[16278]: connect from unknown[137.99.149.148]
Aug 28 09:31:37 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:31:54 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:32:16 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:32:31 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:32:42 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:32:53 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:33:06 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:33:21 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:33:43 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:33:58 mail4 postfix/smtpd[16828]: connect from unknown[137.99.149.148]
Aug 28 09:34:24 mail4 postfix/smtpd[17119]: connect from unknown[137.99.149.148]
Aug 28 09:34:35 mail4 postfix/smtpd[17119]: connect from unknown[137.99.149.148]
Aug 28 09:34:56 mail4 postfix/smtpd[17172]: connect from unknown[137.99.149.148]
Aug 28 09:35:02 mail4 postfix/smtpd[17191]: connect from unknown[137.99.149.148]
Aug 28 09:35:25 mail4 postfix/smtpd[17218]: connect from unknown[137.99.149.148]
Aug 28 09:35:42 mail4 postfix/smtpd[17218]: connect from unknown[137.99.149.148]
Aug 28 09:35:53 mail4 postfix/smtpd[17218]: connect from unknown[137.99.149.148]
Aug 28 09:36:05 mail4 postfix/smtpd[17218]: connect from unknown[137.99.149.148]
Aug 28 09:36:21 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:37:01 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:37:36 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:37:47 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:38:15 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:38:42 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:38:52 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:39:03 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:39:43 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:39:59 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:40:03 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:40:09 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:40:15 mail4 postfix/smtpd[17564]: connect from unknown[137.99.149.148]
Aug 28 09:40:36 mail4 postfix/smtpd[17564]: connect from unknown[137.99.149.148]
Aug 28 09:40:55 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:41:06 mail4 postfix/smtpd[17564]: connect from unknown[137.99.149.148]
Aug 28 09:41:22 mail4 postfix/smtpd[17275]: connect from unknown[137.99.149.148]
Aug 28 09:41:50 mail4 postfix/smtpd[17645]: connect from unknown[137.99.149.148]
Aug 28 09:42:00 mail4 postfix/smtpd[17659]: connect from unknown[137.99.149.148]
Aug 28 09:42:40 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:43:10 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:43:27 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:43:45 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:44:01 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:44:43 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:45:07 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:45:39 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:45:54 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:46:04 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:46:20 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:46:38 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:47:11 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:47:28 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:47:45 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:47:57 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:48:37 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:49:06 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:49:18 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:49:30 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:49:46 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:50:13 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:50:27 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:50:59 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:51:14 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:51:26 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:51:41 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:51:59 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:52:24 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:52:36 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:52:53 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:53:04 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:53:15 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:53:32 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:53:50 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:54:02 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:54:21 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:54:39 mail4 postfix/smtpd[17743]: connect from unknown[137.99.149.148]
Aug 28 09:54:44 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:59:42 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 09:59:57 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:00:03 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:00:14 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:00:30 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:00:51 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:01:07 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:01:43 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:02:29 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:02:55 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:03:29 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:03:40 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:03:51 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:04:10 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:04:27 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:04:43 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:05:04 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:05:19 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:05:41 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:05:53 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:06:04 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:06:22 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:06:32 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:06:50 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:06:57 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:07:08 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:07:29 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:07:56 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:08:13 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:09:39 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:10:07 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:10:32 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:10:47 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:10:59 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:11:12 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:11:33 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:11:48 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:12:06 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:13:03 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:13:08 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:13:29 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:13:41 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:13:56 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:14:02 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:14:17 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:14:23 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:14:33 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:14:51 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:15:23 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:15:44 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:15:59 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:16:10 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:16:37 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:16:59 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:17:17 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:17:44 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:18:06 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:18:17 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:18:38 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:19:15 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:19:35 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:19:46 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:19:56 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:20:13 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:20:26 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:20:41 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:20:58 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:21:14 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:21:25 mail4 postfix/smtpd[17739]: connect from unknown[137.99.149.148]
Aug 28 10:21:36 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:21:45 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:22:11 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]
Aug 28 10:22:20 mail4 postfix/smtpd[19099]: connect from unknown[137.99.149.148]

-ANGELO FAZZINA

ITS Service Manager:
Spam and Virus Prevention
Mass Mailing
G Suite/Gmail

[hidden email]
University of Connecticut,  ITS, SSG, Server Systems
860-486-9075


-----Original Message-----
From: [hidden email] <[hidden email]> On Behalf Of Viktor Dukhovni
Sent: Wednesday, August 29, 2018 1:00 PM
To: Postfix users <[hidden email]>
Subject: Re: Want to be sure i am not throttling user.



> On Aug 29, 2018, at 12:19 PM, Fazzina, Angelo <[hidden email]> wrote:
>
> In answer to: "I get a quick NXDOMAIN.  Is that also true for your mail server?"
> Yes i get the same results when i do a "dig -x 137.99.149.148" or "nslookup 137.99.149.148"

Are you doing the test on the MTA, or a nearby machine? As "root", or as the "postfix" user?
Is the Postfix smtpd(8) service the user is connecting to chrooted?  Look carefully at the
relevant master.cf entries.  If chrooted, check for a working etc/resolv.conf in the chroot
jail (queue_directory).

> My response to the user has always been it is the client that is sending slow, i am just learning how to prove it with my logs.
> I also noticed the repeated new connections, but always blamed the client for doing that and not holding onto the connection, and send multiple emails.

Though a new connection for each message is less efficient, it should not be prohibitively so,
the user should still be able to send O(10) messages per second.  Not O(10s) per message.

> I take this literally "disconnect from unknown[137.99.149.148]" and not that Postfix disconnected from the client, but the client disconnected from Postfix server.

Yes, the client sends "QUIT" and disconnects.

> In answer to : "How many messages were sent by that user during a sustained transmission window."
> "What was the arrival rate?  Did it change over that window?"
>
> My claim that i am trying to prove is there is no "sustained transmission window" hence the constant  connect and disconnect seen in the logs.

A sustained transmission window is a period of time during which the client is
actively sending a batch of mail.  

> This is what i saw in the logs,
> start = 2018-08-28-09:22:43
> 166 emails sent on mail4
> end = 2018-08-28-10:22:20

166 messages per hour is rather slow.  Was this a sustained batch,
or did you arbitrarily choose an hour.  Perhaps most of the 166
arrived during the first few minutes???  You need to aggregate
the deliveries by the arrival minute and look at a histogram
of messages per minute.

This is a data analysis problem, you should be able to figure it out,
by rolling up your sleeves and looking carefully at the data.  You
may also need PCAP files for the next file this user sends a batch
of mail, so you can see what happens after TCP connection setup.

--
        Viktor.

Reply | Threaded
Open this post in threaded view
|

Re: Want to be sure i am not throttling user.

Viktor Dukhovni


> On Aug 29, 2018, at 1:53 PM, Fazzina, Angelo <[hidden email]> wrote:
>
> [root@mail4 log]# cat maillog-20180829 |grep 137.99.149.148 |grep -v disconnect |grep -v submission|grep connect

You forgot to aggregate:

  $ ... | awk '{print $3}' | sed -e 's/.:..$/0/' | uniq -c
  15 09:20
  28 09:30
  30 09:40
  20 09:50
  28 10:00
  33 10:10
  10 10:20

So that's a peak rate of of 33 messages per 10 minutes, or 3
messages a minute!  With mail transactions lasting just a couple
of seconds, and average connection spacing at ~20 seconds,
the client is spending most of its time disconnected or waiting
for the SMTP banner.

One possibility is that you don't have enough smtpd(8) process
concurrency.  If your server is busy handling lots of concurrent
traffic, perhaps it reaches the $default_process_limit often, and
the client's connection are stuck waiting for a free process to
accept a new connection.

You should try connecting to the relevant port on your server
a few times (spaced minutes apart) and see how long you typically
need to wait before you see a 220 banner?  (The same port the
user is using to submit mail, not clear whether it is 25 or 587).

The default process limits in Postfix date back to Y2K hardware,
and servers have a lot more RAM, CPU and network bandwidth these
days.

--
        Viktor.

Reply | Threaded
Open this post in threaded view
|

Re: Want to be sure i am not throttling user.

Matus UHLAR - fantomas
In reply to this post by Fazzina, Angelo
On 28.08.18 17:47, Fazzina, Angelo wrote:
>Aug 28 10:22:27 mail5 postfix/smtpd[7534]: EE46E2FB: client=unknown[137.99.149.148], sasl_method=LOGIN, sasl_username=wellness
>
>Some user feedback :
>                                On Friday I sent a batch of 436 and it took 11 minutes to send
>                                This morning I sent a batch of 725 and it took 1 hour and 21 minutes

>header_checks = regexp:/etc/postfix/header_checks

what's in /etc/postfix/header_checks?



--
Matus UHLAR - fantomas, [hidden email] ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
On the other hand, you have different fingers.
Reply | Threaded
Open this post in threaded view
|

Re: Want to be sure i am not throttling user.

Viktor Dukhovni


> On Aug 30, 2018, at 6:46 AM, Matus UHLAR - fantomas <[hidden email]> wrote:
>
> what's in /etc/postfix/header_checks?

Nothing relevant.

--
        Viktor.

Reply | Threaded
Open this post in threaded view
|

RE: Want to be sure i am not throttling user.

Fazzina, Angelo
In reply to this post by Viktor Dukhovni
Hi, I was able to run a packet capture with tcpdump on the 3 load balanced servers that handle massmail.uconn.edu during the users mail merge today.
It was looking like one email every 12 seconds from me doing [tail -f /var/log/maillog |grep 137.99.31.52] on each server, during the capture as well.
I am using this link to get up to speed on wireshark  
https://blogs.technet.microsoft.com/eopfieldnotes/2015/08/27/useful-wireshark-filters-for-mail-flow-troubleshooting/

I will try to merge the captures of the 3 servers so I can see the real picture as the LB does round robin.
Hopefully I will have news to report this issue is not related to Postfix.

1. They use sasl on port 587

2. this time they used different IP that had a DNS entry.
        Aug 31 13:32:38 mail4 postfix/smtpd[24239]: 99867627: client=d31h52.public.uconn.edu[137.99.31.52], sasl_method=LOGIN, sasl_username=wellness

3.  This is instantaneous every time I test
[root@mail4 ~]# telnet massmail.uconn.edu  587
Trying 137.99.26.55...
Connected to massmail.uconn.edu.
Escape character is '^]'.
220 mail4.uits.uconn.edu ESMTP Postfix
quit
221 2.0.0 Bye
Connection closed by foreign host.
[root@mail4 ~]# telnet massmail.uconn.edu  587
Trying 137.99.26.55...
Connected to massmail.uconn.edu.
Escape character is '^]'.
220 mail5.uits.uconn.edu ESMTP Postfix
quit
221 2.0.0 Bye
Connection closed by foreign host.
[root@mail4 ~]# telnet massmail.uconn.edu  587
Trying 137.99.26.55...
Connected to massmail.uconn.edu.
Escape character is '^]'.
220 mail6.uits.uconn.edu ESMTP Postfix
quit
221 2.0.0 Bye
Connection closed by foreign host.

4.  Servers are never that busy IMHO but I have 7 days of graphs that would show that here [mail4,mail5,mail6] are the servers.
        http://ssgunix.uits.uconn.edu/ssgunix.php

thanks again.

P.S. header_checks file has one line to dump email subjects in my logs.
P.P.S.  I take it this is not the place to ask someone to look at my capture file, to help speed up forensics ?




-ANGELO FAZZINA

ITS Service Manager:
Spam and Virus Prevention
Mass Mailing
G Suite/Gmail

[hidden email]
University of Connecticut,  ITS, SSG, Server Systems
860-486-9075

-----Original Message-----
From: [hidden email] <[hidden email]> On Behalf Of Viktor Dukhovni
Sent: Wednesday, August 29, 2018 2:09 PM
To: Postfix users <[hidden email]>
Subject: Re: Want to be sure i am not throttling user.



> On Aug 29, 2018, at 1:53 PM, Fazzina, Angelo <[hidden email]> wrote:
>
> [root@mail4 log]# cat maillog-20180829 |grep 137.99.149.148 |grep -v disconnect |grep -v submission|grep connect

You forgot to aggregate:

  $ ... | awk '{print $3}' | sed -e 's/.:..$/0/' | uniq -c
  15 09:20
  28 09:30
  30 09:40
  20 09:50
  28 10:00
  33 10:10
  10 10:20

So that's a peak rate of of 33 messages per 10 minutes, or 3
messages a minute!  With mail transactions lasting just a couple
of seconds, and average connection spacing at ~20 seconds,
the client is spending most of its time disconnected or waiting
for the SMTP banner.

One possibility is that you don't have enough smtpd(8) process
concurrency.  If your server is busy handling lots of concurrent
traffic, perhaps it reaches the $default_process_limit often, and
the client's connection are stuck waiting for a free process to
accept a new connection.

You should try connecting to the relevant port on your server
a few times (spaced minutes apart) and see how long you typically
need to wait before you see a 220 banner?  (The same port the
user is using to submit mail, not clear whether it is 25 or 587).

The default process limits in Postfix date back to Y2K hardware,
and servers have a lot more RAM, CPU and network bandwidth these
days.

--
        Viktor.

Reply | Threaded
Open this post in threaded view
|

Re: Want to be sure i am not throttling user.

Viktor Dukhovni
On Fri, Aug 31, 2018 at 06:16:01PM +0000, Fazzina, Angelo wrote:

> Hi, I was able to run a packet capture with tcpdump on the 3 load balanced
> servers that handle massmail.uconn.edu during the users mail merge today.
> It was looking like one email every 12 seconds

The main thing you're looking for is any substantial gaps between
the client's TCP SYN packet and the server's 220 response, that
might account for the low connection rate.  If on the other hand
you see the client disconnect at the completion of a message, and
not reconnect for O(10s), then of course the message delivery will
be low.  Postfix can't accept mail faster than the client is willing
to send it.  That feature remains to be developed...

> 3.  This is instantaneous every time I test
>
> [root@mail4 ~]# telnet massmail.uconn.edu  587

That's encouraging.

> 4.  Servers are never that busy IMHO but I have 7 days of graphs that would show that here [mail4,mail5,mail6] are the servers.
> http://ssgunix.uits.uconn.edu/ssgunix.php

Those are not the right metrics, what's relevant is the number of
active TCP connections to port 587, and how close this number is
to your process limit for that service (or default process limit).
If you're getting close to capacity raise the process limit from
the default (100) to 200 or even 500, assuming you have a server
with fast disks (battery backed RAID conroller that batches and
speeds up writes) and enough RAM and network capacity.

If you never get close to 100, this is not your problem, but it may
be prudent anyway.

So far, it looks most likely that your user is throttled by his own
client software.

--
        Viktor.